Find the count of occurrences of all the words in a given sentence

Suppose we are given a sentence in which some words repeat. We need to print every distinct word in the sentence along with the number of times it occurs.

If the input sentence is "the cat and the dog had great food in the dog house for lunch" the output should be:

the : 3
cat : 1  
and : 1  
dog : 2  
had : 1  
great : 1
food : 1 
in : 1   
house : 1
for : 1  
lunch : 1

What questions can I ask here?

  • Are all the characters in the sentence in the same case?
  • Will the sentence contain punctuation marks (such as commas or periods) or other special characters?

How to solve it?

Let’s write down the solution step by step:

  1. Split the input sentence into a String Array of Words
  2. Maintain a Dictionary for all the Words and their count
  3. Loop through each word in the Array
  4. Check if the word is already in the Dictionary
  5. If the word exists in the Dictionary – increment its counter by 1
  6. Else add a new entry for the word to the Dictionary with a counter of 1
  7. Finally, once all the elements in the array have been processed, print the Dictionary

Code in C#:

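   // Requires: using System; using System.Collections.Generic;
   public void DoWordCount(string sentence)
   {
       // Split on spaces; skip the empty entries produced by repeated spaces.
       string[] arrayOfWords = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
       Dictionary<string, int> wordTable = new Dictionary<string, int>();

       foreach (string currentWord in arrayOfWords)
       {
           if (wordTable.ContainsKey(currentWord))
           {
               // Word seen before: increment its counter.
               wordTable[currentWord] += 1;
           }
           else
           {
               // First occurrence: add the word with a counter of 1.
               wordTable.Add(currentWord, 1);
           }
       }

       // Print each distinct word with its count.
       foreach (var entry in wordTable)
       {
           Console.WriteLine($"{entry.Key} : {entry.Value}");
       }
   }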
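
For reuse or unit testing, it can help to return the counts rather than print them. The sketch below is one possible variant, not part of the solution above: the class name WordCounter and the method name CountWords are hypothetical, and it uses TryGetValue so that each word needs only one dictionary lookup.

```csharp
using System;
using System.Collections.Generic;

public static class WordCounter
{
    // Hypothetical helper: returns the word counts instead of printing them.
    public static Dictionary<string, int> CountWords(string sentence)
    {
        var counts = new Dictionary<string, int>();

        // Split on spaces and skip empty entries caused by repeated spaces.
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // TryGetValue leaves current at 0 when the word is not present yet.
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }

        return counts;
    }
}
```

Calling WordCounter.CountWords("the cat and the dog") should give a count of 2 for "the" and 1 for each of the other words.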

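Why Dictionary and not any other data structure?
Because a Dictionary is a hash table, looking up a key takes constant time on average, so the whole count runs in time proportional to the number of words. With a plain list of word/count pairs, we would have to scan the list for every word in the sentence.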

What if I need to exclude the special characters?
We can use a regular expression to first strip all the special characters from the string (keeping the spaces) and then continue with the split.
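
A minimal sketch of that idea, assuming "special character" means anything that is not a letter, a digit, or a space. The class name SentenceCleaner and the method name SplitIntoWords are hypothetical. Note that this version replaces each special character with a space rather than deleting it, so "dog,cat" becomes two words; the trade-off is that an apostrophe also splits a word such as "don't" in two.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceCleaner
{
    // Hypothetical helper: strips special characters, then splits into words.
    public static string[] SplitIntoWords(string sentence)
    {
        // Replace every character that is not a letter (\p{L}), a digit (\p{N}),
        // or a space with a single space.
        string cleaned = Regex.Replace(sentence, @"[^\p{L}\p{N} ]", " ");

        // Split on spaces and drop the empty entries left by the replacement.
        return cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

The resulting array can then be counted exactly as before.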

What if I need to ignore the case?
We can convert each word to lower case before looking it up in the Dictionary and before adding it, so that "The" and "the" share one entry.
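
One way this could look is sketched below. The class name CaseInsensitiveCounter is hypothetical, and ToLowerInvariant is used so the result does not depend on the machine's culture settings.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveCounter
{
    // Hypothetical helper: counts words while ignoring case.
    public static Dictionary<string, int> CountWords(string sentence)
    {
        var counts = new Dictionary<string, int>();

        foreach (string word in sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Normalize the word once and use the same key for lookup and insert.
            string key = word.ToLowerInvariant();

            counts.TryGetValue(key, out int current);
            counts[key] = current + 1;
        }

        return counts;
    }
}
```

An alternative is to leave the words untouched and pass StringComparer.OrdinalIgnoreCase to the Dictionary constructor; the keys then keep the casing of their first occurrence.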

Ram

I'm a full-stack developer and a software enthusiast who likes to play around with cloud and tech stack out of curiosity. You can connect with me on Medium, Twitter or LinkedIn.