Read a Text File and Count Total Words and Each Word's Frequency Using C#

I have written code that reads a text file from the local machine. It reads all of the text in the file and reports the total word count and the number of occurrences of each unique word.

The code first uses File.ReadAllText(inFileName) to copy the file’s text into a string. Next, it uses a regular expression to replace every character that is not a letter or a digit with a space. The pattern is [^a-zA-Z0-9].

The ^ at the start of the brackets means “not any of the following characters.”

The a-zA-Z0-9 part means any lowercase or uppercase letter or a digit.

The code uses the Regex object’s Replace method to replace the characters that match the pattern with a space character. 
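The replacement step described above can be sketched in isolation. This is a minimal illustration rather than the article's exact code: the class name RegexCleanupSketch and the method name ReplaceNonAlphanumerics are hypothetical, and the static Regex.Replace overload is used instead of a Regex instance.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, introduced only for illustration.
static class RegexCleanupSketch
{
    // Replaces every character that is not an ASCII letter or digit
    // with a single space. Existing spaces also match the pattern and
    // are replaced by a space, so they are effectively left unchanged.
    public static string ReplaceNonAlphanumerics(string text)
    {
        return Regex.Replace(text, "[^a-zA-Z0-9]", " ");
    }
}
```

For example, calling RegexCleanupSketch.ReplaceNonAlphanumerics("Hello, world!") yields "Hello  world " (two spaces in the middle, one at the end), which is why the later split step has to discard empty entries.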

The code then uses Split to break the text into an array of words, passing StringSplitOptions.RemoveEmptyEntries to discard the empty strings produced by consecutive spaces.
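A small sketch of the splitting step, assuming the text has already had its punctuation replaced by spaces. The names SplitSketch and SplitIntoWords are hypothetical. Note that RemoveEmptyEntries drops only empty strings; repeated words stay in the array.

```csharp
using System;

// Hypothetical helper class, introduced only for illustration.
static class SplitSketch
{
    // Splits on single spaces and drops the empty strings that
    // appear between consecutive spaces or at the ends of the text.
    public static string[] SplitIntoWords(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With the input "Hello  world " this returns an array of two elements, "Hello" and "world". With "a b a" it returns three elements, because duplicates are kept at this stage.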

The code uses LINQ to select all of the words from the array and sort them. It uses the Distinct method to remove duplicates.
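The same idea can be written with LINQ method syntax. This is a sketch under stated assumptions: the names UniqueWordsSketch and SortedUniqueWords are hypothetical, and the sort uses StringComparer.Ordinal (so uppercase letters sort before lowercase ones), which may differ from the culture-sensitive default ordering of the query in the listing.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, introduced only for illustration.
static class UniqueWordsSketch
{
    // Removes duplicate words first, then sorts what is left.
    // Comparison is case-sensitive, so "The" and "the" are distinct.
    public static string[] SortedUniqueWords(string[] words)
    {
        return words
            .Distinct()
            .OrderBy(w => w, StringComparer.Ordinal)
            .ToArray();
    }
}
```

Calling Distinct before OrderBy means only the unique words are sorted, which is slightly less work on inputs with many repeats.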

CODE

    using System;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;

    class WordCounter {
        static void Main() {
            string inFileName = @"C:\Files\MyFile.txt";
            // Read the whole file into one string.
            string text = File.ReadAllText(inFileName);
            // Replace every character that is not a letter or digit with a space.
            Regex reg_exp = new Regex("[^a-zA-Z0-9]");
            text = reg_exp.Replace(text, " ");
            // Split on spaces, dropping the empty entries left by runs of spaces.
            string[] words = text.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            // Build a sorted list of the unique words.
            var word_query = (from string word in words orderby word select word).Distinct();
            string[] result = word_query.ToArray();
            // Print how often each unique word occurs (once per word, not once per line).
            foreach (string word in result) {
                CountStringOccurrences(words, word);
            }
            Console.WriteLine("The total word count is {0}", words.Length);
            Console.ReadLine();
        }

        // Count the frequency of one unique word by comparing whole words,
        // so that "the" is not also counted inside "other".
        public static void CountStringOccurrences(string[] words, string word) {
            int count = words.Count(w => w == word);
            Console.WriteLine("{0} {1}", count, word);
        }
    }
When run, the program prints one line per unique word, showing the count followed by the word, and then prints the total word count.
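Counting each unique word with a separate scan works, but it rescans the input once per word. As a hedged alternative, the sketch below tallies all frequencies in a single pass with a dictionary. The names FrequencySketch and CountWords are hypothetical, and the use of SortedDictionary with ordinal, case-sensitive comparison is an assumption chosen so the results come back in sorted order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, introduced only for illustration.
static class FrequencySketch
{
    // Walks the words once and keeps a running count per word.
    // Keys are kept in ordinal sorted order by SortedDictionary.
    public static SortedDictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (string word in words)
        {
            int current;
            // TryGetValue leaves current at 0 when the word is not present yet.
            counts.TryGetValue(word, out current);
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

Because whole words are used as keys, a short word such as "the" is never counted inside a longer word such as "other". Iterating over the returned dictionary and writing each value and key reproduces the per-word lines of the output.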

 