If the input sentence is "the cat and the dog had great food in the dog house for lunch", the output should be:
the : 3
cat : 1
and : 1
dog : 2
had : 1
great : 1
food : 1
in : 1
house : 1
for : 1
lunch : 1
What questions can I ask here?
- Are all the characters in the sentence in the same case?
- Will the sentence contain any special characters or punctuation marks (such as commas or periods)?
How to solve it?
Let’s write down the solution step by step:
- Split the input sentence into a string array of words
- Maintain a Dictionary that maps each word to its count
- Loop through each word in the array
- Check whether the word is already in the Dictionary
- If it is, increment its count by 1
- Otherwise, add a new entry for the word to the Dictionary with a count of 1
- Finally, once every element in the array has been processed, print the Dictionary
Code in C#:
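using System;
using System.Collections.Generic;

public void DoWordCount(string sentence)
{
    // Split on spaces; skip the empty entries that consecutive spaces would produce
    string[] arrayOfWords = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    // Maps each word to the number of times it has been seen so far
    Dictionary<string, int> wordTable = new Dictionary<string, int>();
    for (int i = 0; i < arrayOfWords.Length; i++)
    {
        string currentWord = arrayOfWords[i];
        if (wordTable.ContainsKey(currentWord))
        {
            // Word seen before: increment its count
            wordTable[currentWord] += 1;
        }
        else
        {
            // First occurrence: start the count at 1
            wordTable.Add(currentWord, 1);
        }
    }
    // Print each word with its count
    foreach (var entry in wordTable)
    {
        Console.WriteLine($"{entry.Key} : {entry.Value}");
    }
}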
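Why a Dictionary and not another data structure?
Because a Dictionary is backed by a hash table, looking up or inserting a key takes constant time on average, so the whole sentence can be counted in a single pass, in time proportional to the number of words. A plain list of word/count pairs would need a linear search for every word. Memory use is proportional to the number of distinct words.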
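To make the lookup cost concrete, here is a minimal sketch of the same counting loop written so that each word needs only one read and one write on the Dictionary. The class name WordCounter and the method CountWords are hypothetical, and returning the Dictionary instead of printing it is an assumption made so the result can be inspected.

```csharp
using System;
using System.Collections.Generic;

public static class WordCounter
{
    // Hypothetical helper: returns the word counts instead of printing them.
    public static Dictionary<string, int> CountWords(string sentence)
    {
        var counts = new Dictionary<string, int>();
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            // TryGetValue sets 'current' to 0 when the word is not in the Dictionary yet.
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

Calling WordCounter.CountWords with the example sentence should give a count of 3 for "the" and 2 for "dog", matching the expected output at the top.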
What if I need to exclude the special characters?
We can use a regular expression to first remove all the special characters from the string (except the spaces) and then continue with the split.
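A minimal sketch of that clean-up step, assuming "special character" means anything that is neither a letter, digit, underscore, nor whitespace. The class name SentenceCleaner and the method SplitIntoWords are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceCleaner
{
    // Hypothetical helper: strips punctuation, then splits the sentence into words.
    public static string[] SplitIntoWords(string sentence)
    {
        // [^\w\s] matches any character that is neither a word character nor whitespace.
        string cleaned = Regex.Replace(sentence, @"[^\w\s]", "");
        return cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With this pattern, "the cat, and the dog." splits into the, cat, and, the, dog. Note that apostrophes are removed too, so "don't" becomes "dont"; whether that is acceptable is worth confirming with the interviewer.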
What if I need to ignore the case?
Before adding a word to the Dictionary, convert it to lower case, and do the same before each lookup.
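One way this could look, as a sketch rather than the only option: every word is normalised with ToLowerInvariant before it touches the Dictionary. The class name CaseInsensitiveWordCounter and the method CountWordsIgnoringCase are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveWordCounter
{
    // Hypothetical helper: counts words while treating "The" and "the" as the same word.
    public static Dictionary<string, int> CountWordsIgnoringCase(string sentence)
    {
        var counts = new Dictionary<string, int>();
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            // Normalise once, then use the same key for both the lookup and the insert.
            string key = word.ToLowerInvariant();
            if (counts.ContainsKey(key))
            {
                counts[key] += 1;
            }
            else
            {
                counts.Add(key, 1);
            }
        }
        return counts;
    }
}
```

An alternative is to construct the Dictionary with StringComparer.OrdinalIgnoreCase, which makes the lookups themselves case-insensitive; in that case the stored key keeps the casing of the first occurrence rather than being lower-cased.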