Compute and generate a compressed string for a given string containing repetitions

For a given string return a compressed string which replaces all the repetitions of the characters with their counts. If the compressed string is not smaller than the initial string return the original string as it is. For example, if the input string is "aabbbbbcccdddeef", then the resultant string should be "a2b5c3d3e2f1", whereas if the input string is "abcdef" then the output should be "abcdef".

How do we solve this? or let’s think about, what are we doing here? we’re picking each character from the string and checking if there is a repetition for it. if there is a repetition we’re replacing that entire repetition with a count of the number. if none of the characters are repeated, then the length of the output shall be longer than the input (since we would have 1 adjoining each character which increases the length) so we simply print the number as is.

How do we memorize the repetition counts for every character in the string. By using a HashTable (Dictionary in C#) that helps us do the math as well as seek the respective character which shall be the key, faster.

public string PrintCompressed(string input)
{
    var characterCounts = new Dictionary<char, int>();
    for (int i = 0; i < input.Length; i++)
    {
      if (characterCounts.ContainsKey(input[i]))
      {
          characterCounts[input[i]]++;
      }
      else
      {
          characterCounts[input[i]] = 1;
      }
    }
    
    StringBuilder outputBuilder = new StringBuilder();
    foreach (var pair in characterCounts)
    {
        outputBuilder.Append(pair.Key);
        outputBuilder.Append(pair.Value);
    }
    
    if (outputBuilder.Length > input.Length)
        return input;
    else
        return outputBuilder.ToString();
}

Finally when the Memorized table is ready, we’d just loop through each character and concatenate the character with its count. Remember, we’ve to use StringBuilder here instead of normal String concatenation. Why? Because, String concatenation creates a new output string from the input strings causing a time and memory issue. StringBuilder avoids this by managing it in the form of an ArrayList.

Finally when the String is created, we’d just compare to the original string for length. If the length is higher than the input (meaning there was no repetition and instead 1s were appended), we’d return the original string, else we return the "compressed" string.

*Question from the book Cracking the Coding Interviews


Buy Me A Coffee

Found this article helpful? Please consider supporting!

Ram
Ram

I'm a full-stack developer and a software enthusiast who likes to play around with cloud and tech stack out of curiosity. You can connect with me on Medium, Twitter or LinkedIn.

Leave a Reply

Your email address will not be published. Required fields are marked *