Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

String Compression

Problem Statement

Write a Java program that implements a method to compress a string by replacing sequences of repeated characters with the character followed by the count of its consecutive occurrences (e.g., "aabbb" becomes "a2b3"). Use StringBuilder for efficiency in building the compressed string. If the compressed string is not shorter than the original, return the original string. Test the implementation with various inputs, including empty strings, single characters, and strings with no repeated characters. You can visualize this as shrinking a word by summarizing repeated letters, like condensing a repetitive chant into a shorter code, ensuring the result is as compact as possible.

Input: A string (e.g., "aabbb", "abcd", ""). Output: The compressed string (e.g., "a2b3") or the original string if compression does not reduce the length. Constraints:

  • String length is between 0 and 10^5.
  • The string contains only lowercase letters (a-z).
  • The input may be empty or null. Example:
  • Input: "aabbb"
  • Output: "a2b3"
  • Explanation: 'a' appears twice, 'b' appears three times, so the compressed string is "a2b3".
  • Input: "abcd"
  • Output: "abcd"
  • Explanation: No repeated characters, and "a1b1c1d1" is longer, so return "abcd".
  • Input: ""
  • Output: ""
  • Explanation: Empty string returns empty string.

Pseudocode

FUNCTION compressString(input)
    IF input is null THEN
        RETURN null
    ENDIF
    IF input is empty THEN
        RETURN empty string
    ENDIF
    CREATE stringBuilder for result
    SET count to 1
    SET currentChar to input[0]
    FOR i from 1 to length of input - 1
        IF input[i] equals currentChar THEN
            INCREMENT count
        ELSE
            APPEND currentChar to stringBuilder
            IF count > 1 THEN
                APPEND count to stringBuilder
            ENDIF
            SET currentChar to input[i]
            SET count to 1
        ENDFOR
    APPEND currentChar to stringBuilder
    IF count > 1 THEN
        APPEND count to stringBuilder
    ENDIF
    SET compressed to stringBuilder as string
    IF length of compressed >= length of input THEN
        RETURN input
    ENDIF
    RETURN compressed
ENDFUNCTION

FUNCTION main()
    SET testStrings to array of strings including various cases
    FOR each string in testStrings
        PRINT input string
        CALL compressString(string)
        PRINT compressed string
    ENDFOR
ENDFUNCTION

Algorithm Steps

  1. Check if the input string is null; if so, return null.
  2. Check if the input string is empty; if so, return an empty string.
  3. Initialize a StringBuilder for the result, a count for consecutive characters, and the first character as currentChar.
  4. Iterate through the string from index 1: a. If the current character equals currentChar, increment the count. b. Otherwise, append currentChar to StringBuilder; if count > 1, append the count. c. Update currentChar to the current character and reset count to 1.
  5. After the loop, append the final currentChar and its count (if > 1).
  6. If the compressed string’s length is not less than the original, return the original string.
  7. Return the compressed string.
  8. In the main method, test with various strings, including empty, single-character, repeated characters, and no repeats.

Java Implementation

public class StringCompression {
    // Compresses a string using StringBuilder
    public String compressString(String input) {
        if (input == null) {
            return null;
        }
        if (input.isEmpty()) {
            return "";
        }
        StringBuilder result = new StringBuilder();
        int count = 1;
        char currentChar = input.charAt(0);
        // Iterate through string starting from index 1
        for (int i = 1; i < input.length(); i++) {
            if (input.charAt(i) == currentChar) {
                count++;
            } else {
                result.append(currentChar);
                if (count > 1) {
                    result.append(count);
                }
                currentChar = input.charAt(i);
                count = 1;
            }
        }
        // Append the last character and its count
        result.append(currentChar);
        if (count > 1) {
            result.append(count);
        }
        // Return original if compressed length is not shorter
        String compressed = result.toString();
        return compressed.length() >= input.length() ? input : compressed;
    }

    // Main method to test compressString with various inputs
    public static void main(String[] args) {
        StringCompression compressor = new StringCompression();

        // Test cases
        String[] testStrings = {
            "aabbb",           // Repeated characters
            "abcd",           // No repeats
            "",               // Empty string
            "a",              // Single character
            "aabbcc",         // Multiple repeated characters
            "aaaa",           // All same character
            null              // Null input
        };

        for (int i = 0; i < testStrings.length; i++) {
            System.out.println("Test case " + (i + 1) + ":");
            System.out.println("Input: \"" + testStrings[i] + "\"");
            String result = compressor.compressString(testStrings[i]);
            System.out.println("Compressed: \"" + result + "\"\n");
        }
    }
}

Output

Running the main method produces:

Test case 1:
Input: "aabbb"
Compressed: "a2b3"

Test case 2:
Input: "abcd"
Compressed: "abcd"

Test case 3:
Input: ""
Compressed: ""

Test case 4:
Input: "a"
Compressed: "a"

Test case 5:
Input: "aabbcc"
Compressed: "a2b2c2"

Test case 6:
Input: "aaaa"
Compressed: "a4"

Test case 7:
Input: "null"
Compressed: "null"

Explanation:

  • Test case 1: "aabbb""a2b3" (2 'a's, 3 'b's).
  • Test case 2: "abcd""abcd" (no repeats, "a1b1c1d1" is longer).
  • Test case 3: Empty string """".
  • Test case 4: "a""a" (single character, "a1" is longer).
  • Test case 5: "aabbcc""a2b2c2" (each character repeated twice).
  • Test case 6: "aaaa""a4" (4 'a's).
  • Test case 7: nullnull.

How It Works

  • Step 1: Check for null or empty input; return null or empty string as needed.
  • Step 2: Initialize StringBuilder, set count = 1, and currentChar to the first character.
  • Step 3: Iterate from index 1:
    • If current character matches currentChar, increment count.
    • Else, append currentChar and count (if > 1), reset currentChar and count.
  • Step 4: Append final currentChar and count (if > 1).
  • Step 5: Compare lengths; return original if compressed length is not shorter.
  • Example Trace (Test case 1):
    • Input: "aabbb".
    • Initialize: result = "", currentChar = 'a', count = 1.
    • i=1: 'a' matches 'a', count = 2.
    • i=2: 'b' differs, append "a2", set currentChar = 'b', count = 1.
    • i=3: 'b' matches, count = 2.
    • i=4: 'b' matches, count = 3.
    • End: Append "b3", result = "a2b3".
    • Length check: "a2b3" (4) < "aabbb" (5) → return "a2b3".
  • Main Method: Tests with repeated characters, no repeats, empty, single character, and null inputs.

Complexity Analysis Table

OperationTime ComplexitySpace Complexity
CompressionO(n)O(n)
Full AlgorithmO(n)O(n)

Note:

  • n is the length of the input string.
  • Time complexity: O(n) for iterating through the string once.
  • Space complexity: O(n) for the StringBuilder to store the compressed string.
  • Best, average, and worst cases are O(n) time and O(n) space.

✅ Tip: Use StringBuilder to efficiently build the compressed string. Test with strings that have no repeats or all repeats to verify the length comparison logic.

⚠ Warning: Ensure the compressed string is compared with the original length to return the shorter one. Check for null input to avoid NullPointerException.