Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Substring Frequency

Problem Statement

Write a Java program that implements two methods to count the occurrences of a substring within a string using the indexOf method: one for non-overlapping occurrences and one for overlapping occurrences. The program should return the number of times the substring appears in the string and test the implementation with various inputs, including cases where the substring has overlapping and non-overlapping occurrences, as well as edge cases like empty strings or substrings. You can visualize this as searching for a specific word in a book, counting how many times it appears, either by skipping the word’s length to avoid overlaps or checking every possible position for potential overlaps.

Input: A string and a substring (e.g., string = "aaa", substring = "aa"). Output: Two integers representing the number of non-overlapping and overlapping occurrences of the substring (e.g., non-overlapping: 1, overlapping: 2). Constraints:

  • String and substring lengths are between 0 and 10^5.
  • The string and substring contain any ASCII characters.
  • The input may be empty or null. Example:
  • Input: string = "aaa", substring = "aa"
  • Output: Non-overlapping: 1, Overlapping: 2
  • Explanation: Non-overlapping counts "aa" once (positions 0-1), skipping to position 2; overlapping counts "aa" at positions 0-1 and 1-2.
  • Input: string = "abcabc", substring = "abc"
  • Output: Non-overlapping: 2, Overlapping: 2
  • Explanation: "abc" appears at positions 0-2 and 3-5, with no overlap possible.
  • Input: string = "", substring = "a"
  • Output: Non-overlapping: 0, Overlapping: 0
  • Explanation: Empty string has no occurrences.

Pseudocode

FUNCTION countNonOverlapping(input, sub)
    IF input is null OR sub is null OR input is empty OR sub is empty THEN
        RETURN 0
    ENDIF
    SET count to 0
    SET index to 0
    WHILE indexOf(sub in input starting at index) is not -1
        INCREMENT count
        SET index to indexOf(sub) + length of sub
    ENDWHILE
    RETURN count
ENDFUNCTION

FUNCTION countOverlapping(input, sub)
    IF input is null OR sub is null OR input is empty OR sub is empty THEN
        RETURN 0
    ENDIF
    SET count to 0
    SET index to 0
    WHILE indexOf(sub in input starting at index) is not -1
        INCREMENT count
        INCREMENT index
    ENDWHILE
    RETURN count
ENDFUNCTION

FUNCTION main()
    SET testCases to array of (string, substring) pairs
    FOR each (string, sub) in testCases
        PRINT string and substring
        CALL countNonOverlapping(string, sub)
        PRINT non-overlapping count
        CALL countOverlapping(string, sub)
        PRINT overlapping count
    ENDFOR
ENDFUNCTION

Algorithm Steps

  1. Non-Overlapping Count: a. Check if the input string or substring is null or empty; if so, return 0. b. Initialize count to 0 and index to 0. c. While indexOf(substring, index) returns a valid index:
    • Increment count.
    • Update index to the found index plus the substring length (to skip overlaps). d. Return count.
  2. Overlapping Count: a. Check if the input string or substring is null or empty; if so, return 0. b. Initialize count to 0 and index to 0. c. While indexOf(substring, index) returns a valid index:
    • Increment count.
    • Increment index by 1 (to check for overlaps). d. Return count.
  3. In the main method, test both methods with various string-substring pairs, including overlapping cases (e.g., "aaa", "aa"), non-overlapping cases (e.g., "abcabc", "abc"), and edge cases (e.g., empty or null inputs).

Java Implementation

public class SubstringFrequency {
    // Counts non-overlapping occurrences of substring in string
    public int countNonOverlapping(String input, String sub) {
        if (input == null || sub == null || input.isEmpty() || sub.isEmpty()) {
            return 0;
        }
        int count = 0;
        int index = 0;
        while ((index = input.indexOf(sub, index)) != -1) {
            count++;
            index += sub.length();
        }
        return count;
    }

    // Counts overlapping occurrences of substring in string
    public int countOverlapping(String input, String sub) {
        if (input == null || sub == null || input.isEmpty() || sub.isEmpty()) {
            return 0;
        }
        int count = 0;
        int index = 0;
        while ((index = input.indexOf(sub, index)) != -1) {
            count++;
            index++;
        }
        return count;
    }

    // Main method to test both count methods
    public static void main(String[] args) {
        SubstringFrequency counter = new SubstringFrequency();

        // Test cases
        String[][] testCases = {
            {"aaa", "aa"},           // Overlapping possible
            {"abcabc", "abc"},       // No overlapping
            {"", "a"},               // Empty string
            {"hello", "l"},          // Multiple occurrences
            {"aaaaaa", "aaa"},       // Multiple overlapping
            {null, "a"},             // Null input
            {"abc", ""}              // Empty substring
        };

        for (int i = 0; i < testCases.length; i++) {
            String input = testCases[i][0];
            String sub = testCases[i][1];
            System.out.println("Test case " + (i + 1) + ":");
            System.out.println("String: \"" + (input != null && input.length() > 20 ? input.substring(0, 20) + "..." : input) + "\"");
            System.out.println("Substring: \"" + (sub != null && sub.length() > 20 ? sub.substring(0, 20) + "..." : sub) + "\"");
            int nonOverlapping = counter.countNonOverlapping(input, sub);
            int overlapping = counter.countOverlapping(input, sub);
            System.out.println("Non-overlapping count: " + nonOverlapping);
            System.out.println("Overlapping count: " + overlapping + "\n");
        }
    }
}

Output

Running the main method produces:

Test case 1:
String: "aaa"
Substring: "aa"
Non-overlapping count: 1
Overlapping count: 2

Test case 2:
String: "abcabc"
Substring: "abc"
Non-overlapping count: 2
Overlapping count: 2

Test case 3:
String: ""
Substring: "a"
Non-overlapping count: 0
Overlapping count: 0

Test case 4:
String: "hello"
Substring: "l"
Non-overlapping count: 2
Overlapping count: 2

Test case 5:
String: "aaaaaa"
Substring: "aaa"
Non-overlapping count: 2
Overlapping count: 4

Test case 6:
String: "null"
Substring: "a"
Non-overlapping count: 0
Overlapping count: 0

Test case 7:
String: "abc"
Substring: ""
Non-overlapping count: 0
Overlapping count: 0

Explanation:

  • Test case 1: "aaa", "aa": Non-overlapping finds "aa" at 0-1 (skips to 2) → 1; overlapping finds at 0-1, 1-2 → 2.
  • Test case 2: "abcabc", "abc": Both find "abc" at 0-2, 3-5 → 2 (no overlap possible).
  • Test case 3: Empty string, "a": No occurrences → 0.
  • Test case 4: "hello", "l": Finds "l" at positions 2, 3 → 2 (no overlap for single character).
  • Test case 5: "aaaaaa", "aaa": Non-overlapping finds at 0-2, 3-5 → 2; overlapping finds at 0-2, 1-3, 2-4, 3-5 → 4.
  • Test case 6: Null string, "a": No occurrences → 0.
  • Test case 7: "abc", empty substring: No valid occurrences → 0.

How It Works

  • Non-Overlapping:
    • Uses indexOf(sub, index) to find the next occurrence.
    • After each match, skips sub.length() positions to avoid counting overlaps.
  • Overlapping:
    • Uses indexOf(sub, index) but increments index by 1 to check every starting position.
  • Example Trace (Test case 1):
    • Input: "aaa", sub: "aa".
    • Non-overlapping: index = 0, finds at 0, index = 0 + 2 = 2, no more matches → count = 1.
    • Overlapping: index = 0, finds at 0, index = 1, finds at 1, index = 2, no more matches → count = 2.
  • Main Method: Tests with overlapping cases (e.g., "aaa", "aa"), non-overlapping cases (e.g., "abcabc", "abc"), and edge cases (empty, null).
  • IndexOf Property: Efficiently finds substrings, with overlapping/non-overlapping handled by index increment.

Complexity Analysis Table

OperationTime ComplexitySpace Complexity
Non-OverlappingO(n * m)O(1)
OverlappingO(n * m)O(1)
Full AlgorithmO(n * m)O(1)

Note:

  • n is the length of the input string, m is the length of the substring.
  • Time complexity: O(n * m) for indexOf, which may scan up to m characters per call, with up to n/m calls (non-overlapping) or n calls (overlapping).
  • Space complexity: O(1), as only a few variables are used.
  • Worst case: O(n * m) when substring is short and string is long.

✅ Tip: Use indexOf for efficient substring searching. Test with overlapping cases (e.g., "aaa", "aa") and non-overlapping cases to verify both counts.

⚠ Warning: Check for null or empty inputs to avoid NullPointerException or unexpected behavior. Ensure the substring is not longer than the input string.