Substring Frequency
Problem Statement
Write a Java program that implements two methods to count the occurrences of a substring within a string using the indexOf method: one for non-overlapping occurrences and one for overlapping occurrences. The program should return the number of times the substring appears in the string and test the implementation with various inputs, including cases where the substring has overlapping and non-overlapping occurrences, as well as edge cases like empty strings or substrings. You can visualize this as searching for a specific word in a book, counting how many times it appears, either by skipping the word’s length to avoid overlaps or checking every possible position for potential overlaps.
Input: A string and a substring (e.g., string = "aaa", substring = "aa").
Output: Two integers representing the number of non-overlapping and overlapping occurrences of the substring (e.g., non-overlapping: 1, overlapping: 2).
Constraints:
- String and substring lengths are between 0 and 10^5.
- The string and substring contain any ASCII characters.
- The input may be empty or null. Example:
- Input: string =
"aaa", substring ="aa" - Output: Non-overlapping: 1, Overlapping: 2
- Explanation: Non-overlapping counts
"aa"once (positions 0-1), skipping to position 2; overlapping counts"aa"at positions 0-1 and 1-2. - Input: string =
"abcabc", substring ="abc" - Output: Non-overlapping: 2, Overlapping: 2
- Explanation:
"abc"appears at positions 0-2 and 3-5, with no overlap possible. - Input: string =
"", substring ="a" - Output: Non-overlapping: 0, Overlapping: 0
- Explanation: Empty string has no occurrences.
Pseudocode
FUNCTION countNonOverlapping(input, sub)
IF input is null OR sub is null OR input is empty OR sub is empty THEN
RETURN 0
ENDIF
SET count to 0
SET index to 0
WHILE indexOf(sub in input starting at index) is not -1
INCREMENT count
SET index to indexOf(sub) + length of sub
ENDWHILE
RETURN count
ENDFUNCTION
FUNCTION countOverlapping(input, sub)
IF input is null OR sub is null OR input is empty OR sub is empty THEN
RETURN 0
ENDIF
SET count to 0
SET index to 0
WHILE indexOf(sub in input starting at index) is not -1
INCREMENT count
INCREMENT index
ENDWHILE
RETURN count
ENDFUNCTION
FUNCTION main()
SET testCases to array of (string, substring) pairs
FOR each (string, sub) in testCases
PRINT string and substring
CALL countNonOverlapping(string, sub)
PRINT non-overlapping count
CALL countOverlapping(string, sub)
PRINT overlapping count
ENDFOR
ENDFUNCTION
Algorithm Steps
- Non-Overlapping Count:
a. Check if the input string or substring is null or empty; if so, return 0.
b. Initialize
countto 0 andindexto 0. c. WhileindexOf(substring, index)returns a valid index:- Increment
count. - Update
indexto the found index plus the substring length (to skip overlaps). d. Returncount.
- Increment
- Overlapping Count:
a. Check if the input string or substring is null or empty; if so, return 0.
b. Initialize
countto 0 andindexto 0. c. WhileindexOf(substring, index)returns a valid index:- Increment
count. - Increment
indexby 1 (to check for overlaps). d. Returncount.
- Increment
- In the
mainmethod, test both methods with various string-substring pairs, including overlapping cases (e.g.,"aaa","aa"), non-overlapping cases (e.g.,"abcabc","abc"), and edge cases (e.g., empty or null inputs).
Java Implementation
public class SubstringFrequency {
// Counts non-overlapping occurrences of substring in string
public int countNonOverlapping(String input, String sub) {
if (input == null || sub == null || input.isEmpty() || sub.isEmpty()) {
return 0;
}
int count = 0;
int index = 0;
while ((index = input.indexOf(sub, index)) != -1) {
count++;
index += sub.length();
}
return count;
}
// Counts overlapping occurrences of substring in string
public int countOverlapping(String input, String sub) {
if (input == null || sub == null || input.isEmpty() || sub.isEmpty()) {
return 0;
}
int count = 0;
int index = 0;
while ((index = input.indexOf(sub, index)) != -1) {
count++;
index++;
}
return count;
}
// Main method to test both count methods
public static void main(String[] args) {
SubstringFrequency counter = new SubstringFrequency();
// Test cases
String[][] testCases = {
{"aaa", "aa"}, // Overlapping possible
{"abcabc", "abc"}, // No overlapping
{"", "a"}, // Empty string
{"hello", "l"}, // Multiple occurrences
{"aaaaaa", "aaa"}, // Multiple overlapping
{null, "a"}, // Null input
{"abc", ""} // Empty substring
};
for (int i = 0; i < testCases.length; i++) {
String input = testCases[i][0];
String sub = testCases[i][1];
System.out.println("Test case " + (i + 1) + ":");
System.out.println("String: \"" + (input != null && input.length() > 20 ? input.substring(0, 20) + "..." : input) + "\"");
System.out.println("Substring: \"" + (sub != null && sub.length() > 20 ? sub.substring(0, 20) + "..." : sub) + "\"");
int nonOverlapping = counter.countNonOverlapping(input, sub);
int overlapping = counter.countOverlapping(input, sub);
System.out.println("Non-overlapping count: " + nonOverlapping);
System.out.println("Overlapping count: " + overlapping + "\n");
}
}
}
Output
Running the main method produces:
Test case 1:
String: "aaa"
Substring: "aa"
Non-overlapping count: 1
Overlapping count: 2
Test case 2:
String: "abcabc"
Substring: "abc"
Non-overlapping count: 2
Overlapping count: 2
Test case 3:
String: ""
Substring: "a"
Non-overlapping count: 0
Overlapping count: 0
Test case 4:
String: "hello"
Substring: "l"
Non-overlapping count: 2
Overlapping count: 2
Test case 5:
String: "aaaaaa"
Substring: "aaa"
Non-overlapping count: 2
Overlapping count: 4
Test case 6:
String: "null"
Substring: "a"
Non-overlapping count: 0
Overlapping count: 0
Test case 7:
String: "abc"
Substring: ""
Non-overlapping count: 0
Overlapping count: 0
Explanation:
- Test case 1:
"aaa","aa": Non-overlapping finds"aa"at 0-1 (skips to 2) → 1; overlapping finds at 0-1, 1-2 → 2. - Test case 2:
"abcabc","abc": Both find"abc"at 0-2, 3-5 → 2 (no overlap possible). - Test case 3: Empty string,
"a": No occurrences → 0. - Test case 4:
"hello","l": Finds"l"at positions 2, 3 → 2 (no overlap for single character). - Test case 5:
"aaaaaa","aaa": Non-overlapping finds at 0-2, 3-5 → 2; overlapping finds at 0-2, 1-3, 2-4, 3-5 → 4. - Test case 6: Null string,
"a": No occurrences → 0. - Test case 7:
"abc", empty substring: No valid occurrences → 0.
How It Works
- Non-Overlapping:
- Uses
indexOf(sub, index)to find the next occurrence. - After each match, skips
sub.length()positions to avoid counting overlaps.
- Uses
- Overlapping:
- Uses
indexOf(sub, index)but incrementsindexby 1 to check every starting position.
- Uses
- Example Trace (Test case 1):
- Input:
"aaa", sub:"aa". - Non-overlapping:
index = 0, finds at 0,index = 0 + 2 = 2, no more matches → count = 1. - Overlapping:
index = 0, finds at 0,index = 1, finds at 1,index = 2, no more matches → count = 2.
- Input:
- Main Method: Tests with overlapping cases (e.g.,
"aaa","aa"), non-overlapping cases (e.g.,"abcabc","abc"), and edge cases (empty, null). - IndexOf Property: Efficiently finds substrings, with overlapping/non-overlapping handled by index increment.
Complexity Analysis Table
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Non-Overlapping | O(n * m) | O(1) |
| Overlapping | O(n * m) | O(1) |
| Full Algorithm | O(n * m) | O(1) |
Note:
- n is the length of the input string, m is the length of the substring.
- Time complexity: O(n * m) for
indexOf, which may scan up to m characters per call, with up to n/m calls (non-overlapping) or n calls (overlapping). - Space complexity: O(1), as only a few variables are used.
- Worst case: O(n * m) when substring is short and string is long.
✅ Tip: Use
indexOffor efficient substring searching. Test with overlapping cases (e.g.,"aaa","aa") and non-overlapping cases to verify both counts.
⚠ Warning: Check for null or empty inputs to avoid
NullPointerExceptionor unexpected behavior. Ensure the substring is not longer than the input string.