Basics of Hashing for Arrays
Introduction
Hashing is a technique used to store and retrieve data efficiently using a special function called a hash function.
In array-based problems, hashing allows us to track the frequency, presence, or mapping of elements in constant average time.
Hashing is one of the most frequently used concepts in:
- Data Structures
- Competitive programming
- Interview problem solving
- Real-time systems
Why Hashing is Needed
Many array problems require:
- Checking if an element exists
- Counting occurrences
- Finding duplicates or missing elements
- Tracking pairs or subarrays efficiently
A naive approach often involves nested loops, resulting in O(n²) time complexity.
Hashing typically reduces this to O(n) average time at the cost of O(n) extra space.
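To make the trade-off concrete, here is a minimal sketch of duplicate detection written both ways. The function names are hypothetical and chosen only for illustration; the hashed version relies on Python's built-in set.

```python
def has_duplicate_naive(arr):
    """Compare every pair of elements: O(n^2) time, O(1) extra space."""
    n = len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] == arr[j]:
                return True
    return False


def has_duplicate_hashed(arr):
    """Remember seen values in a set: O(n) average time, O(n) extra space."""
    seen = set()
    for x in arr:
        if x in seen:  # average O(1) membership test
            return True
        seen.add(x)
    return False


print(has_duplicate_naive([4, 7, 1, 7]))   # True
print(has_duplicate_hashed([4, 7, 1, 7]))  # True
print(has_duplicate_hashed([4, 7, 1]))     # False
```

Both functions return the same answers; only the amount of work and memory differs.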
What is a Hash Function?
A hash function maps a key to an index in a hash table.
index = hash(key)
Properties of a Good Hash Function
- Fast computation
- Uniform distribution
- Minimal collisions
- Deterministic output
Hash Table – Core Concept
A hash table is a data structure that stores key–value pairs.
| Key | Hash Function | Index |
|---|---|---|
| 15 | 15 % 10 | 5 |
| 27 | 27 % 10 | 7 |
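The mapping in the table above can be sketched with a modulo-based hash function. The names TABLE_SIZE and simple_hash are assumptions made for this illustration, and the key 25 is added only to show a collision.

```python
TABLE_SIZE = 10  # assumed table size, matching the "% 10" used in the table


def simple_hash(key):
    """Map a non-negative integer key to an index in the range 0..TABLE_SIZE-1."""
    return key % TABLE_SIZE


for key in (15, 27, 25):
    print(key, "->", simple_hash(key))
# 15 -> 5
# 27 -> 7
# 25 -> 5   (same index as 15: a collision)
```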
Common implementations:
- Arrays (limited range)
- HashMap / Dictionary (dynamic range)
Types of Hashing Used in Arrays
Direct Addressing (Frequency Array)
Used when values are small and non-negative.
freq[x] = number of times x appears
Example:
arr = [1, 3, 2, 1, 3]
freq = [0,2,1,2]
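A minimal sketch of how such a frequency array could be built is shown below. The function name frequency_array is hypothetical, and the code assumes every value is a small non-negative integer.

```python
def frequency_array(arr):
    """Count occurrences with direct addressing.

    Assumes every value in arr is a small non-negative integer.
    """
    if not arr:
        return []
    freq = [0] * (max(arr) + 1)  # one slot for each value in 0..max(arr)
    for x in arr:
        freq[x] += 1             # the value itself is the index
    return freq


print(frequency_array([1, 3, 2, 1, 3]))  # [0, 2, 1, 2]
```

The array needs max(arr) + 1 slots, which is why this approach only suits a small value range.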
Hash Map Based Hashing
Used when:
- Values are large
- Values are negative
- Range is unknown
Examples:
- unordered_map in C++
- HashMap in Java
- Dictionary in Python and C#
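As a rough illustration of why a hash map suits these cases, the sketch below counts values that are negative or very large, where a frequency array would be impractical. The function name count_with_dict and the sample data are made up for this example.

```python
def count_with_dict(arr):
    """Count occurrences using a dict, which accepts any hashable key."""
    counts = {}
    for x in arr:
        counts[x] = counts.get(x, 0) + 1
    return counts


data = [-5, 10**9, -5, 42, 10**9, -5]
print(count_with_dict(data))  # {-5: 3, 1000000000: 2, 42: 1}
```

Memory use grows with the number of distinct keys, not with the size of the value range.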
Common Use Cases of Hashing in Arrays
- Frequency counting
- Duplicate detection
- Pair sum problems
- Subarray sum problems
- Finding missing elements
- Intersection of arrays
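Two of the use cases above, finding missing elements and intersecting arrays, can be sketched with sets. The function names are hypothetical, and find_missing assumes the expected values are the integers 1 through n.

```python
def find_missing(arr, n):
    """Return the values in 1..n that never appear in arr (assumed range)."""
    present = set(arr)
    return [v for v in range(1, n + 1) if v not in present]


def intersection(a, b):
    """Return the distinct values found in both arrays, in order of first appearance in a."""
    in_b = set(b)
    seen = set()
    result = []
    for x in a:
        if x in in_b and x not in seen:
            seen.add(x)
            result.append(x)
    return result


print(find_missing([3, 1, 5, 1], 5))          # [2, 4]
print(intersection([4, 9, 5, 4], [9, 4, 8]))  # [4, 9]
```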
Example Problem: Frequency of Elements
Problem Statement
Given an array of integers, print the frequency of each element.
Approach Using Hashing
- Initialize an empty hash table
- Traverse the array
- For each element, increment its count in the hash table
- Output the frequencies
Implementation in All Languages
C++ Implementation
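#include <iostream>
#include <unordered_map>
using namespace std;

int main() {
    int arr[] = {1, 2, 2, 3, 1, 4};
    int n = sizeof(arr) / sizeof(arr[0]);  // number of elements
    unordered_map<int, int> freq;          // value -> count

    // operator[] inserts a missing key with count 0 before the increment
    for (int i = 0; i < n; i++) {
        freq[arr[i]]++;
    }
    for (const auto& it : freq) {
        cout << it.first << " -> " << it.second << endl;
    }
    return 0;
}
Output (order may vary, since unordered_map does not guarantee iteration order)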
1 -> 2
2 -> 2
3 -> 1
4 -> 1
C Implementation (Frequency Array)
#include <stdio.h>

int main(void) {
    int arr[] = {1, 2, 2, 3, 1, 4};
    int n = 6;
    int freq[10] = {0};  /* direct addressing: valid only for values 0..9 */

    for (int i = 0; i < n; i++) {
        freq[arr[i]]++;
    }
    for (int i = 0; i < 10; i++) {
        if (freq[i] > 0)
            printf("%d -> %d\n", i, freq[i]);
    }
    return 0;
}
Output
1 -> 2
2 -> 2
3 -> 1
4 -> 1
Java Implementation
import java.util.HashMap;

class HashingArray {
    public static void main(String[] args) {
        int[] arr = {1, 2, 2, 3, 1, 4};
        HashMap<Integer, Integer> freq = new HashMap<>();  // value -> count
        for (int x : arr) {
            freq.put(x, freq.getOrDefault(x, 0) + 1);
        }
        for (int key : freq.keySet()) {
            System.out.println(key + " -> " + freq.get(key));
        }
    }
}
Output (HashMap does not guarantee iteration order)
1 -> 2
2 -> 2
3 -> 1
4 -> 1
Python Implementation
arr = [1, 2, 2, 3, 1, 4]
freq = {}  # value -> count
for x in arr:
    freq[x] = freq.get(x, 0) + 1
for key, value in freq.items():
    print(key, "->", value)
Output
1 -> 2
2 -> 2
3 -> 1
4 -> 1
C# Implementation
using System;
using System.Collections.Generic;

class Program {
    static void Main() {
        int[] arr = {1, 2, 2, 3, 1, 4};
        Dictionary<int, int> freq = new Dictionary<int, int>();  // value -> count
        foreach (int x in arr) {
            if (freq.ContainsKey(x))
                freq[x]++;
            else
                freq[x] = 1;
        }
        foreach (var pair in freq) {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
Output (Dictionary enumeration order is not guaranteed)
1 -> 2
2 -> 2
3 -> 1
4 -> 1
JavaScript Implementation
let arr = [1, 2, 2, 3, 1, 4];
let freq = {};
for (let x of arr) {
    freq[x] = (freq[x] || 0) + 1;
}
for (let key in freq) {
    console.log(key + " -> " + freq[key]);
}
Output
1 -> 2
2 -> 2
3 -> 1
4 -> 1
Time and Space Complexity
| Operation | Complexity |
|---|---|
| Insertion | O(1) average, O(n) worst case |
| Search | O(1) average, O(n) worst case |
| Overall (n elements) | O(n) average |
| Extra Space | O(n) |
Collision Handling
Types of Collisions
- A collision occurs when two different keys map to the same index (for example, 15 % 10 and 25 % 10 both give 5)
Common Techniques
- Chaining
- Open Addressing
  - Linear Probing
  - Quadratic Probing
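As an illustration of chaining, here is a toy hash table in which each slot holds a list of (key, value) pairs. The class ChainedHashTable is hypothetical and written only for this sketch; real library hash maps are considerably more sophisticated. The bucket index in the usage lines assumes CPython, where a small non-negative integer hashes to itself.

```python
class ChainedHashTable:
    """Toy hash table that resolves collisions by separate chaining."""

    def __init__(self, size=10):
        # One list (chain) per slot; colliding keys share a chain.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already stored: overwrite its value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))   # new key: append to the chain

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ChainedHashTable(size=10)
table.put(15, "a")
table.put(25, "b")  # 25 % 10 == 5, so it lands in the same bucket as 15
print(table.get(15), table.get(25), table.get(35))  # a b None
print(len(table.buckets[5]))                        # 2
```

Open addressing takes the opposite approach: instead of growing a chain, it probes other slots in the same array until it finds a free one.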
Hashing vs Sorting
| Aspect | Hashing | Sorting |
|---|---|---|
| Time | O(n) average | O(n log n) |
| Extra space | O(n) | O(1) for in-place sorts such as heapsort; up to O(n) otherwise |
| Result in sorted order | No | Yes |
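The comparison can be seen on a small frequency-counting example, sketched here with the standard library's Counter and itertools.groupby. The function names and sample data are made up for illustration.

```python
from collections import Counter
from itertools import groupby


def counts_by_sorting(arr):
    """Sort first, then count runs of equal values: O(n log n) time."""
    return [(value, len(list(run))) for value, run in groupby(sorted(arr))]


def counts_by_hashing(arr):
    """Count with a hash map: O(n) average time, keys in first-seen order."""
    return list(Counter(arr).items())


data = [3, 1, 2, 3, 1, 3]
print(counts_by_sorting(data))  # [(1, 2), (2, 1), (3, 3)]
print(counts_by_hashing(data))  # [(3, 3), (1, 2), (2, 1)]
```

The counts are identical; only the sorting version returns them in ascending key order.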
Common Interview Problems Using Hashing
- Two Sum
- Majority Element
- First repeating element
- Longest subarray with zero sum
- Intersection of two arrays
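Two of these problems are sketched below using their common hashing-based approaches: a value-to-index map for Two Sum, and a map from prefix sum to first index for the longest zero-sum subarray. The function names and test data are chosen for illustration and are not tied to any particular problem site.

```python
def two_sum(nums, target):
    """Return indices (i, j), i < j, with nums[i] + nums[j] == target, or None."""
    seen = {}  # value -> index where it was first seen
    for j, x in enumerate(nums):
        complement = target - x
        if complement in seen:
            return seen[complement], j
        seen[x] = j
    return None


def longest_zero_sum_subarray(arr):
    """Return the length of the longest contiguous subarray whose sum is 0."""
    first_index = {0: -1}  # prefix sum 0 occurs before the array starts
    prefix = 0
    best = 0
    for i, x in enumerate(arr):
        prefix += x
        if prefix in first_index:
            # Same prefix sum seen earlier: the elements in between sum to 0.
            best = max(best, i - first_index[prefix])
        else:
            first_index[prefix] = i
    return best


print(two_sum([2, 7, 11, 15], 9))                                # (0, 1)
print(longest_zero_sum_subarray([15, -2, 2, -8, 1, 7, 10, 23]))  # 5
```

Both run in O(n) average time, compared with O(n²) for checking every pair or every subarray.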
Common Mistakes
- Forgetting about collisions
- Using frequency array with large values
- Ignoring worst-case complexity
- Not handling negative keys properly
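The last mistake deserves a concrete sketch. When the value range is known, negative keys can still use a frequency array by shifting each value by an offset. In Python specifically, a negative index such as freq[-1] does not raise an error but counts from the end of the list, so this bug can go unnoticed. The function name and the range parameters low and high are assumptions for this example.

```python
def frequency_with_offset(arr, low, high):
    """Count values known to lie in [low, high] using a shifted frequency array."""
    freq = [0] * (high - low + 1)
    for x in arr:
        if not low <= x <= high:
            raise ValueError(f"{x} is outside [{low}, {high}]")
        freq[x - low] += 1  # shift so that the value low maps to index 0
    # Shift indices back to the original values and keep only non-zero counts.
    return {i + low: count for i, count in enumerate(freq) if count}


print(frequency_with_offset([-2, 0, -2, 3], low=-3, high=3))  # {-2: 2, 0: 1, 3: 1}
```

If the range is unknown or very wide, a hash map is the safer choice.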
Summary
Hashing is one of the most powerful techniques for solving array-based problems efficiently. By using hash tables, we can often reduce time complexity from quadratic to linear on average, at the cost of extra space, for many common tasks such as frequency counting, duplicate detection, and pair searching.
Understanding the fundamentals of hashing, collision handling, and appropriate use cases enables developers to design highly optimized solutions. Mastery of hashing is essential for technical interviews, competitive programming, and real-world software development.
