Hashing

Hashing is a technique used to uniquely identify a specific object from a group of similar objects. In computer science, hashing is widely used in data retrieval operations where efficiency and constant-time access are critical, such as in databases, caches, and compiler symbol tables.

Unlike searching algorithms that rely on sequential or sorted access (like Linear or Binary Search), hashing uses a hash function to compute an index (called a hash code or hash value) into a hash table, where the desired data is stored or retrieved.

Hashing Logic

The core idea of hashing revolves around mapping keys to indices in a fixed-size table. Here's the general process:

A hash function takes an input (the key) and returns an integer — the hash code.
The hash code is modulo-divided by the table size to find the index:
index=hash(key)mod tableSize\text{index} = \text{hash(key)} \mod \text{tableSize}
Store the value at that computed index in the hash table.
For retrieval, apply the same hash function to the key and check the corresponding index.

Handling Collisions

A collision occurs when two keys hash to the same index. Common collision resolution strategies include:

Chaining: Use a linked list at each index to store multiple values.
Open Addressing: Find another free slot using techniques like linear probing, quadratic probing, or double hashing.

Hashing in Java

Java provides built-in support for hashing via HashMap:

import java.util.HashMap;

public class HashingExample {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();

        // Inserting values
        map.put("apple", 5);
        map.put("banana", 10);
        map.put("grape", 7);

        // Accessing values
        String key = "banana";
        if (map.containsKey(key)) {
            System.out.println("Value for '" + key + "': " + map.get(key));
        } else {
            System.out.println("Key not found.");
        }
    }
}

Hashing in C

C does not have a built-in hash map, so you need to implement it manually. Here's a basic example using separate chaining:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 10

typedef struct Node {
    char* key;
    int value;
    struct Node* next;
} Node;

Node* hashTable[TABLE_SIZE];

// Simple hash function
int hash(char* key) {
    int sum = 0;
    for (int i = 0; key[i]; i++) {
        sum += key[i];
    }
    return sum % TABLE_SIZE;
}

void insert(char* key, int value) {
    int idx = hash(key);
    Node* newNode = malloc(sizeof(Node));
    newNode->key = key;
    newNode->value = value;
    newNode->next = hashTable[idx];
    hashTable[idx] = newNode;
}

int search(char* key) {
    int idx = hash(key);
    Node* temp = hashTable[idx];
    while (temp) {
        if (strcmp(temp->key, key) == 0) {
            return temp->value;
        }
        temp = temp->next;
    }
    return -1; // Not found
}

int main() {
    insert("apple", 5);
    insert("banana", 10);
    insert("grape", 7);

    char* key = "banana";
    int result = search(key);
    if (result != -1) {
        printf("Value for '%s': %d\n", key, result);
    } else {
        printf("Key not found.\n");
    }

    return 0;
}

Time Complexity

Let n be the number of elements and m be the size of the hash table.

Operation

Average Case

Worst Case

Insert

O(1)

O(n) (in worst collisions)

O(1)

O(n)

Delete

O(1)

O(n)

Note: With a good hash function and proper load factor (typically ≤ 0.75), the average-case time complexity stays O(1).

Space Complexity

Space complexity is O(m + n) where:
- m is the number of slots in the hash table.
- n is the number of stored elements.

Use Cases of Hashing

HashMaps / HashTables in programming languages (Java, Python, C++).
Database indexing for fast lookup.
Symbol tables in compilers.
Caching with keys representing data identifiers.
Password storage (with cryptographic hash functions).
Deduplication and detecting data integrity.

Summary

Feature

Hashing

Array Requirement

No sorting required

Search Time

O(1) average, O(n) worst

Insertion Time

O(1) average, O(n) worst

Data Structure Used

Hash Table

Space Efficiency

Depends on load factor

Key Benefit

Fast constant-time access

Limitation

Collisions, need a good hash fn

Hashing is a cornerstone of modern computing. When designed properly, it outperforms most searching algorithms in both time and efficiency. However, it demands careful design of the hash function and collision resolution strategy to maintain performance.

PreviousBinary Search NextInterpolation Search

Last updated 2 months ago

Was this helpful?