Collision Analysis
Problem Statement
Write a Java program that implements a custom hash table to track the number of collisions, where multiple keys map to the same array index. The hash table should use an array with chaining (linked lists) to handle collisions, with strings as keys and integers as values. A collision occurs when a key hashes to an index that already contains one or more keys. The program should provide methods to insert keys and report the total number of collisions, then test the implementation with different sets of keys, including cases with high collision rates, few collisions, and edge cases like empty sets or single keys. You can visualize this as organizing items into labeled bins, counting how often multiple items land in the same bin due to their labels hashing to the same position.
Input:
- A set of string keys to insert into the hash table. Output: The number of collisions and the hash table contents (key-value pairs per index). Constraints:
- The hash table size (array length) is fixed at 10 for simplicity.
- Keys are non-empty strings of up to 100 characters.
- Values are integers (e.g., 1 for each key inserted).
- The number of keys is between 0 and 10^5. Example:
- Input: Keys = ["cat", "act", "dog", "god"]
- Output:
Collisions: 2 Hash Table: Index 0: [] Index 1: [] Index 2: [cat: 1, act: 1] Index 3: [] Index 4: [] Index 5: [dog: 1, god: 1] Index 6: [] Index 7: [] Index 8: [] Index 9: [] - Explanation: "cat" and "act" collide at index 2, "dog" and "god" collide at index 5, resulting in 2 collisions.
- Input: Keys = []
- Output:
Collisions: 0 Hash Table: Index 0: [] ... Index 9: []
Pseudocode
CLASS Node
SET key to string
SET value to integer
SET next to Node (null by default)
ENDCLASS
CLASS HashTable
SET table to array of Node pointers (size 10)
SET collisionCount to 0
FUNCTION hash(key)
SET sum to 0
FOR each character in key
ADD character ASCII value to sum
ENDFOR
RETURN sum mod 10
ENDFUNCTION
FUNCTION insert(key, value)
SET index to hash(key)
IF table[index] is null THEN
SET table[index] to new Node(key, value)
ELSE
INCREMENT collisionCount
SET current to table[index]
WHILE current is not null
IF current.key equals key THEN
SET current.value to value
RETURN
ENDIF
SET current to current.next
ENDWHILE
CREATE newNode as new Node(key, value)
SET newNode.next to table[index]
SET table[index] to newNode
ENDIF
ENDFUNCTION
FUNCTION getCollisionCount()
RETURN collisionCount
ENDFUNCTION
FUNCTION toString()
CREATE result as new StringBuilder
FOR i from 0 to table size - 1
APPEND "Index " and i and ": [" to result
SET current to table[i]
WHILE current is not null
APPEND current.key and ": " and current.value to result
IF current.next is not null THEN
APPEND ", " to result
ENDIF
SET current to current.next
ENDWHILE
APPEND "]" to result
IF i is not last index THEN
APPEND newline to result
ENDIF
ENDFOR
RETURN result as string
ENDFUNCTION
ENDCLASS
FUNCTION main()
SET testCases to array of key sets
FOR each testCase in testCases
PRINT test case details
CREATE hashTable as new HashTable
FOR each key in testCase
CALL hashTable.insert(key, 1)
ENDFOR
PRINT collision count
PRINT hash table contents using toString
ENDFOR
ENDFUNCTION
Algorithm Steps
- Define a
Nodeclass with: a. A stringkey. b. An integervalue. c. Anextpointer for chaining. - Define a
HashTableclass with: a. A fixed-size array (table) of size 10. b. AcollisionCountto track collisions. c.hash: Compute index by summing ASCII values of key characters modulo 10. d.insert: Insert key-value pair, incrementcollisionCountif index is occupied, handle duplicates. e.getCollisionCount: Return collision count. f.toString: Print key-value pairs per index. - In
main, test with key sets: a. Keys with collisions (e.g., anagrams). b. Keys with no collisions. c. Empty set. d. Single key. e. Large set with high collision rate.
Java Implementation
import java.util.*;
public class CollisionAnalysis {
// Node class for chaining
static class Node {
String key;
int value;
Node next;
Node(String key, int value) {
this.key = key;
this.value = value;
this.next = null;
}
}
// Custom hash table with collision tracking
static class HashTable {
private Node[] table;
private int collisionCount;
public HashTable() {
table = new Node[10];
collisionCount = 0;
}
private int hash(String key) {
int sum = 0;
for (char c : key.toCharArray()) {
sum += c;
}
return sum % 10;
}
public void insert(String key, int value) {
int index = hash(key);
if (table[index] == null) {
table[index] = new Node(key, value);
} else {
collisionCount++;
Node current = table[index];
while (current != null) {
if (current.key.equals(key)) {
current.value = value;
return;
}
current = current.next;
}
Node newNode = new Node(key, value);
newNode.next = table[index];
table[index] = newNode;
}
}
public int getCollisionCount() {
return collisionCount;
}
public String toString() {
StringBuilder result = new StringBuilder();
for (int i = 0; i < table.length; i++) {
result.append("Index ").append(i).append(": [");
Node current = table[i];
while (current != null) {
result.append(current.key).append(": ").append(current.value);
if (current.next != null) {
result.append(", ");
}
current = current.next;
}
result.append("]");
if (i < table.length - 1) {
result.append("\n");
}
}
return result.toString();
}
}
// Main method to test collision analysis
public static void main(String[] args) {
CollisionAnalysis analyzer = new CollisionAnalysis();
// Test cases
String[][] testCases = {
{"cat", "act", "dog", "god"}, // High collisions (anagrams)
{"apple", "banana", "cherry"}, // No collisions
{}, // Empty set
{"single"}, // Single key
{"key1", "key2", "key3", "key4", "key5", "yek1", "yek2"} // High collisions
};
// Run test cases
for (int i = 0; i < testCases.length; i++) {
System.out.println("Test case " + (i + 1) + ":");
System.out.println("Input keys: " + Arrays.toString(testCases[i]));
HashTable hashTable = new HashTable();
for (String key : testCases[i]) {
hashTable.insert(key, 1);
}
System.out.println("Collisions: " + hashTable.getCollisionCount());
System.out.println("Hash Table:\n" + hashTable.toString() + "\n");
}
}
}
Output
Running the main method produces:
Test case 1:
Input keys: [cat, act, dog, god]
Collisions: 2
Hash Table:
Index 0: []
Index 1: []
Index 2: [act: 1, cat: 1]
Index 3: []
Index 4: []
Index 5: [god: 1, dog: 1]
Index 6: []
Index 7: []
Index 8: []
Index 9: []
Test case 2:
Input keys: [apple, banana, cherry]
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: [cherry: 1]
Index 5: []
Index 6: []
Index 7: [apple: 1]
Index 8: []
Index 9: [banana: 1]
Test case 3:
Input keys: []
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: []
Index 5: []
Index 6: []
Index 7: []
Index 8: []
Index 9: []
Test case 4:
Input keys: [single]
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: []
Index 5: []
Index 6: [single: 1]
Index 7: []
Index 8: []
Index 9: []
Test case 5:
Input keys: [key1, key2, key3, key4, key5, yek1, yek2]
Collisions: 4
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: [key5: 1, key4: 1, key3: 1]
Index 5: []
Index 6: []
Index 7: []
Index 8: []
Index 9: [yek2: 1, yek1: 1, key2: 1, key1: 1]
Explanation:
- Test case 1: "cat" and "act" collide at index 2, "dog" and "god" at index 5 (2 collisions).
- Test case 2: No collisions, keys map to distinct indices.
- Test case 3: Empty set, no collisions.
- Test case 4: Single key, no collisions.
- Test case 5: Multiple keys collide at indices 4 and 9 (4 collisions).
How It Works
- Node: Stores a string key, integer value, and
nextpointer for chaining. - HashTable:
- Uses a fixed-size array (size 10) with linked lists for chaining.
hash: Sums ASCII values of key characters modulo 10.insert: Inserts key-value pair, incrementscollisionCountif index occupied, updates value for duplicates.getCollisionCount: Returns collision count.toString: Prints key-value pairs per index.
- Example Trace (Test case 1):
- Insert "cat": index=2, table[2]=[cat:1], collisions=0.
- Insert "act": index=2, table[2]=[act:1,cat:1], collisions=1.
- Insert "dog": index=5, table[5]=[dog:1], collisions=1.
- Insert "god": index=5, table[5]=[god:1,dog:1], collisions=2.
- Output: Collisions=2, table shows lists at indices 2 and 5.
- Main Method: Tests high collisions, no collisions, empty set, single key, and large set.
Complexity Analysis Table
| Operation | Time Complexity | Space Complexity |
|---|---|---|
| Insert | O(1) average | O(1) |
| Get Collision Count | O(1) | O(1) |
| To String | O(n) | O(n) |
Note:
- n is the number of keys in the hash table.
- Time complexity: O(1) average for insert (hashing and list insertion); O(1) for getCollisionCount; O(n) for toString (iterate all keys).
- Space complexity: O(1) for insert and getCollisionCount; O(n) for toString (StringBuilder output).
- Worst case: O(n) time for insert (many keys in one index), O(n) space for toString.
✅ Tip: Use a simple hash function for educational purposes, but ensure it distributes keys evenly to minimize collisions. Track collisions by incrementing a counter when an index is already occupied.
⚠ Warning: A poor hash function (e.g., summing ASCII values) may cause excessive collisions. Test with diverse key sets to verify collision behavior, and handle duplicates to avoid redundant entries.