Hashing
Hashing is a technique used to uniquely identify a specific object from a group of similar objects. In computer science, hashing is widely used in data retrieval operations where efficiency and constant-time access are critical, such as in databases, caches, and compiler symbol tables.
Unlike searching algorithms that rely on sequential or sorted access (like Linear or Binary Search), hashing uses a hash function to compute an index (called a hash code or hash value) into a hash table, where the desired data is stored or retrieved.
Hashing Logic
The core idea of hashing revolves around mapping keys to indices in a fixed-size table. Here's the general process:
A hash function takes an input (the key) and returns an integer — the hash code.
The hash code is modulo-divided by the table size to find the index:
index=hash(key)mod  tableSize\text{index} = \text{hash(key)} \mod \text{tableSize}
Store the value at that computed index in the hash table.
For retrieval, apply the same hash function to the key and check the corresponding index.
Handling Collisions
A collision occurs when two keys hash to the same index. Common collision resolution strategies include:
Chaining: Use a linked list at each index to store multiple values.
Open Addressing: Find another free slot using techniques like linear probing, quadratic probing, or double hashing.
Hashing in Java
Java provides built-in support for hashing via HashMap
:
Hashing in C
C does not have a built-in hash map, so you need to implement it manually. Here's a basic example using separate chaining:
Time Complexity
Let n
be the number of elements and m
be the size of the hash table.
Insert
O(1)
O(n) (in worst collisions)
Search
O(1)
O(n)
Delete
O(1)
O(n)
Note: With a good hash function and proper load factor (typically ≤ 0.75), the average-case time complexity stays O(1).
Space Complexity
Space complexity is O(m + n) where:
m
is the number of slots in the hash table.n
is the number of stored elements.
Use Cases of Hashing
HashMaps / HashTables in programming languages (Java, Python, C++).
Database indexing for fast lookup.
Symbol tables in compilers.
Caching with keys representing data identifiers.
Password storage (with cryptographic hash functions).
Deduplication and detecting data integrity.
Summary
Array Requirement
No sorting required
Search Time
O(1) average, O(n) worst
Insertion Time
O(1) average, O(n) worst
Data Structure Used
Hash Table
Space Efficiency
Depends on load factor
Key Benefit
Fast constant-time access
Limitation
Collisions, need a good hash fn
Hashing is a cornerstone of modern computing. When designed properly, it outperforms most searching algorithms in both time and efficiency. However, it demands careful design of the hash function and collision resolution strategy to maintain performance.
Last updated
Was this helpful?