Find all unique occurrences of the first 3 characters, count the number of occurrences, and write the output to a file


ChatGPT

To find all unique occurrences of the first 3 characters in a text file, count the number of occurrences, and write the output to a file using PowerShell, you can use the following script:

powershell
# Path to the input text file
$inputFile = "C:\path\to\input.txt"

# Path to the output file
$outputFile = "C:\path\to\output.txt"

# Read the input file
$text = Get-Content -Path $inputFile

# Create a hashtable to store the unique occurrences and their counts
$occurrences = @{}

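# Process each line of the text file
foreach ($line in $text) {
    # Skip empty lines, which have no leading characters to count
    if ($line.Length -eq 0) { continue }

    # Get the first 3 characters, or the whole line if it is shorter
    # (Substring(0, 3) throws on a line with fewer than 3 characters)
    $firstThreeChars = $line.Substring(0, [Math]::Min(3, $line.Length))

    # Check if the first 3 characters are already in the hashtable
    if ($occurrences.ContainsKey($firstThreeChars)) {
        # Increment the count if already present
        $occurrences[$firstThreeChars]++
    }
    else {
        # Add the first 3 characters with a count of 1
        $occurrences[$firstThreeChars] = 1
    }
}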

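# Remove output left over from a previous run, since -Append would add to it
if (Test-Path -Path $outputFile) { Remove-Item -Path $outputFile }

# Write the occurrences to the output file
$occurrences.GetEnumerator() | ForEach-Object {
    $line = "{0} - {1}" -f $_.Key, $_.Value
    $line | Out-File -FilePath $outputFile -Append
}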

Set $inputFile to the path of your input text file and $outputFile to the path where the output should be written.

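The script reads the content of the input file using Get-Content, which returns one string per line, and then processes each line. For each line, it extracts the first 3 characters using Substring and checks if they already exist in the hashtable ($occurrences). If they do, the count is incremented; otherwise, a new entry is added with a count of 1. Note that a hashtable created with @{} compares string keys case-insensitively, so "abc" and "ABC" are counted under the same key.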
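The same counting step can also be expressed with the built-in Group-Object cmdlet instead of a hand-maintained hashtable. This is a minimal sketch of that alternative, not the script above: the $sampleLines array is hypothetical data standing in for the output of Get-Content, and the handling of short and empty lines (keep short lines whole, skip empty ones) is an assumption.

```powershell
# Hypothetical sample lines standing in for the contents of the input file
$sampleLines = @('apple', 'apricot', 'banana', 'band', 'ab', '')

# Drop empty lines, group the rest by their first 3 characters
# (or by the whole line when it is shorter), and keep only name and count
$prefixCounts = $sampleLines |
    Where-Object { $_.Length -gt 0 } |
    Group-Object -Property { $_.Substring(0, [Math]::Min(3, $_.Length)) } |
    Select-Object -Property Name, Count

$prefixCounts
```

Like the hashtable, Group-Object compares strings case-insensitively by default; adding its -CaseSensitive switch keeps "abc" and "ABC" as separate groups.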

Finally, the script writes the occurrences to the output file using Out-File. GetEnumerator exposes the hashtable's entries as a sequence of key/value pairs, and each entry is formatted as "key - count" and appended to the output file. A hashtable does not preserve insertion order, so the lines may appear in any order.
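If a predictable order matters, the entries can be sorted before they are written, and the whole result can be written in one call rather than one append per line. The following is a sketch under stated assumptions: the $occurrences values and the temporary output path are hypothetical, and Set-Content is swapped in for Out-File because it replaces the file contents on each run.

```powershell
# Hypothetical counts standing in for the $occurrences hashtable built by the script
$occurrences = @{ 'app' = 1; 'ban' = 2; 'apr' = 1 }

# Hypothetical output location; a temporary file keeps the sketch self-contained
$outputFile = Join-Path ([System.IO.Path]::GetTempPath()) 'prefix-counts.txt'

# Sort by count (highest first), then by prefix, and format each entry
$sortKeys = @(
    @{ Expression = 'Value'; Descending = $true },
    @{ Expression = 'Key'; Descending = $false }
)
$lines = $occurrences.GetEnumerator() |
    Sort-Object -Property $sortKeys |
    ForEach-Object { "{0} - {1}" -f $_.Key, $_.Value }

# Write all lines in a single call; Set-Content overwrites any existing file
$lines | Set-Content -Path $outputFile
```

Writing once also avoids reopening the file for every entry, which matters little here but adds up when there are many distinct prefixes.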

When you run the script, the output file will contain the unique occurrences of the first 3 characters along with their corresponding counts.