Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Collision Analysis

Problem Statement

Write a Java program that implements a custom hash table to track the number of collisions, where multiple keys map to the same array index. The hash table should use an array with chaining (linked lists) to handle collisions, with strings as keys and integers as values. A collision occurs when a key hashes to an index that already contains one or more keys. The program should provide methods to insert keys and report the total number of collisions, then test the implementation with different sets of keys, including cases with high collision rates, few collisions, and edge cases like empty sets or single keys. You can visualize this as organizing items into labeled bins, counting how often multiple items land in the same bin due to their labels hashing to the same position.

Input:

  • A set of string keys to insert into the hash table. Output: The number of collisions and the hash table contents (key-value pairs per index). Constraints:
  • The hash table size (array length) is fixed at 10 for simplicity.
  • Keys are non-empty strings of up to 100 characters.
  • Values are integers (e.g., 1 for each key inserted).
  • The number of keys is between 0 and 10^5. Example:
  • Input: Keys = ["cat", "act", "dog", "god"]
  • Output:
    Collisions: 2
    Hash Table:
    Index 0: []
    Index 1: []
    Index 2: [cat: 1, act: 1]
    Index 3: []
    Index 4: []
    Index 5: [dog: 1, god: 1]
    Index 6: []
    Index 7: []
    Index 8: []
    Index 9: []
    
  • Explanation: "cat" and "act" collide at index 2, "dog" and "god" collide at index 5, resulting in 2 collisions.
  • Input: Keys = []
  • Output:
    Collisions: 0
    Hash Table:
    Index 0: []
    ...
    Index 9: []
    

Pseudocode

CLASS Node
    SET key to string
    SET value to integer
    SET next to Node (null by default)
ENDCLASS

CLASS HashTable
    SET table to array of Node pointers (size 10)
    SET collisionCount to 0

    FUNCTION hash(key)
        SET sum to 0
        FOR each character in key
            ADD character ASCII value to sum
        ENDFOR
        RETURN sum mod 10
    ENDFUNCTION

    FUNCTION insert(key, value)
        SET index to hash(key)
        IF table[index] is null THEN
            SET table[index] to new Node(key, value)
        ELSE
            INCREMENT collisionCount
            SET current to table[index]
            WHILE current is not null
                IF current.key equals key THEN
                    SET current.value to value
                    RETURN
                ENDIF
                SET current to current.next
            ENDWHILE
            CREATE newNode as new Node(key, value)
            SET newNode.next to table[index]
            SET table[index] to newNode
        ENDIF
    ENDFUNCTION

    FUNCTION getCollisionCount()
        RETURN collisionCount
    ENDFUNCTION

    FUNCTION toString()
        CREATE result as new StringBuilder
        FOR i from 0 to table size - 1
            APPEND "Index " and i and ": [" to result
            SET current to table[i]
            WHILE current is not null
                APPEND current.key and ": " and current.value to result
                IF current.next is not null THEN
                    APPEND ", " to result
                ENDIF
                SET current to current.next
            ENDWHILE
            APPEND "]" to result
            IF i is not last index THEN
                APPEND newline to result
            ENDIF
        ENDFOR
        RETURN result as string
    ENDFUNCTION
ENDCLASS

FUNCTION main()
    SET testCases to array of key sets
    FOR each testCase in testCases
        PRINT test case details
        CREATE hashTable as new HashTable
        FOR each key in testCase
            CALL hashTable.insert(key, 1)
        ENDFOR
        PRINT collision count
        PRINT hash table contents using toString
    ENDFOR
ENDFUNCTION

Algorithm Steps

  1. Define a Node class with: a. A string key. b. An integer value. c. A next pointer for chaining.
  2. Define a HashTable class with: a. A fixed-size array (table) of size 10. b. A collisionCount to track collisions. c. hash: Compute index by summing ASCII values of key characters modulo 10. d. insert: Insert key-value pair, increment collisionCount if index is occupied, handle duplicates. e. getCollisionCount: Return collision count. f. toString: Print key-value pairs per index.
  3. In main, test with key sets: a. Keys with collisions (e.g., anagrams). b. Keys with no collisions. c. Empty set. d. Single key. e. Large set with high collision rate.

Java Implementation

import java.util.*;

public class CollisionAnalysis {
    // Node class for chaining
    static class Node {
        String key;
        int value;
        Node next;

        Node(String key, int value) {
            this.key = key;
            this.value = value;
            this.next = null;
        }
    }

    // Custom hash table with collision tracking
    static class HashTable {
        private Node[] table;
        private int collisionCount;

        public HashTable() {
            table = new Node[10];
            collisionCount = 0;
        }

        private int hash(String key) {
            int sum = 0;
            for (char c : key.toCharArray()) {
                sum += c;
            }
            return sum % 10;
        }

        public void insert(String key, int value) {
            int index = hash(key);
            if (table[index] == null) {
                table[index] = new Node(key, value);
            } else {
                collisionCount++;
                Node current = table[index];
                while (current != null) {
                    if (current.key.equals(key)) {
                        current.value = value;
                        return;
                    }
                    current = current.next;
                }
                Node newNode = new Node(key, value);
                newNode.next = table[index];
                table[index] = newNode;
            }
        }

        public int getCollisionCount() {
            return collisionCount;
        }

        public String toString() {
            StringBuilder result = new StringBuilder();
            for (int i = 0; i < table.length; i++) {
                result.append("Index ").append(i).append(": [");
                Node current = table[i];
                while (current != null) {
                    result.append(current.key).append(": ").append(current.value);
                    if (current.next != null) {
                        result.append(", ");
                    }
                    current = current.next;
                }
                result.append("]");
                if (i < table.length - 1) {
                    result.append("\n");
                }
            }
            return result.toString();
        }
    }

    // Main method to test collision analysis
    public static void main(String[] args) {
        CollisionAnalysis analyzer = new CollisionAnalysis();

        // Test cases
        String[][] testCases = {
            {"cat", "act", "dog", "god"},           // High collisions (anagrams)
            {"apple", "banana", "cherry"},          // No collisions
            {},                                     // Empty set
            {"single"},                             // Single key
            {"key1", "key2", "key3", "key4", "key5", "yek1", "yek2"} // High collisions
        };

        // Run test cases
        for (int i = 0; i < testCases.length; i++) {
            System.out.println("Test case " + (i + 1) + ":");
            System.out.println("Input keys: " + Arrays.toString(testCases[i]));
            HashTable hashTable = new HashTable();
            for (String key : testCases[i]) {
                hashTable.insert(key, 1);
            }
            System.out.println("Collisions: " + hashTable.getCollisionCount());
            System.out.println("Hash Table:\n" + hashTable.toString() + "\n");
        }
    }
}

Output

Running the main method produces:

Test case 1:
Input keys: [cat, act, dog, god]
Collisions: 2
Hash Table:
Index 0: []
Index 1: []
Index 2: [act: 1, cat: 1]
Index 3: []
Index 4: []
Index 5: [god: 1, dog: 1]
Index 6: []
Index 7: []
Index 8: []
Index 9: []

Test case 2:
Input keys: [apple, banana, cherry]
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: [cherry: 1]
Index 5: []
Index 6: []
Index 7: [apple: 1]
Index 8: []
Index 9: [banana: 1]

Test case 3:
Input keys: []
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: []
Index 5: []
Index 6: []
Index 7: []
Index 8: []
Index 9: []

Test case 4:
Input keys: [single]
Collisions: 0
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: []
Index 5: []
Index 6: [single: 1]
Index 7: []
Index 8: []
Index 9: []

Test case 5:
Input keys: [key1, key2, key3, key4, key5, yek1, yek2]
Collisions: 4
Hash Table:
Index 0: []
Index 1: []
Index 2: []
Index 3: []
Index 4: [key5: 1, key4: 1, key3: 1]
Index 5: []
Index 6: []
Index 7: []
Index 8: []
Index 9: [yek2: 1, yek1: 1, key2: 1, key1: 1]

Explanation:

  • Test case 1: "cat" and "act" collide at index 2, "dog" and "god" at index 5 (2 collisions).
  • Test case 2: No collisions, keys map to distinct indices.
  • Test case 3: Empty set, no collisions.
  • Test case 4: Single key, no collisions.
  • Test case 5: Multiple keys collide at indices 4 and 9 (4 collisions).

How It Works

  • Node: Stores a string key, integer value, and next pointer for chaining.
  • HashTable:
    • Uses a fixed-size array (size 10) with linked lists for chaining.
    • hash: Sums ASCII values of key characters modulo 10.
    • insert: Inserts key-value pair, increments collisionCount if index occupied, updates value for duplicates.
    • getCollisionCount: Returns collision count.
    • toString: Prints key-value pairs per index.
  • Example Trace (Test case 1):
    • Insert "cat": index=2, table[2]=[cat:1], collisions=0.
    • Insert "act": index=2, table[2]=[act:1,cat:1], collisions=1.
    • Insert "dog": index=5, table[5]=[dog:1], collisions=1.
    • Insert "god": index=5, table[5]=[god:1,dog:1], collisions=2.
    • Output: Collisions=2, table shows lists at indices 2 and 5.
  • Main Method: Tests high collisions, no collisions, empty set, single key, and large set.

Complexity Analysis Table

OperationTime ComplexitySpace Complexity
InsertO(1) averageO(1)
Get Collision CountO(1)O(1)
To StringO(n)O(n)

Note:

  • n is the number of keys in the hash table.
  • Time complexity: O(1) average for insert (hashing and list insertion); O(1) for getCollisionCount; O(n) for toString (iterate all keys).
  • Space complexity: O(1) for insert and getCollisionCount; O(n) for toString (StringBuilder output).
  • Worst case: O(n) time for insert (many keys in one index), O(n) space for toString.

✅ Tip: Use a simple hash function for educational purposes, but ensure it distributes keys evenly to minimize collisions. Track collisions by incrementing a counter when an index is already occupied.

⚠ Warning: A poor hash function (e.g., summing ASCII values) may cause excessive collisions. Test with diverse key sets to verify collision behavior, and handle duplicates to avoid redundant entries.