How ConcurrentHashMap Works in Java
Deep Dive into Thread-Safe Hashing
Introduction
In Java, a HashMap is not thread-safe, meaning it cannot be safely used by multiple threads without external synchronization.
To address this, Java provides the ConcurrentHashMap, a high-performance, thread-safe variant of HashMap that allows concurrent read and write operations without locking the entire map.
This blog will cover:
- Why ConcurrentHashMap is needed.
- Internal working and structure.
- Key features and concurrency mechanisms.
- How to use it effectively.
- A custom simplified implementation concept.
1. Why Not Use HashMap in Multi-threading?
When multiple threads access and modify a HashMap concurrently:
- Data inconsistency can occur.
- Infinite loops can happen (due to structural modification during rehashing).
- Race conditions and corruption of data.
Example:
Map map = new HashMap<>();
Runnable task = () -> map.put(1, "Java");
new Thread(task).start();
new Thread(task).start();
This may result in inconsistent results because HashMap is not synchronized.
2. Introduction to ConcurrentHashMap
ConcurrentHashMap solves this problem by:
- Allowing multiple threads to read without locking.
- Allowing a limited number of threads to write concurrently using fine-grained locking.
Key benefits:
- High concurrency and performance.
- No need for external synchronization.
- Avoids ConcurrentModificationException in most cases.
3. Internal Working
Java 8 and Later
In Java 8+, ConcurrentHashMap uses a **combination of:
- Hash table (array of bins).
- Lock striping and CAS (Compare-And-Swap) for concurrency.
- Tree bins (red-black trees) for handling high-collision scenarios.
Structure:
Table → Array of Node
Each Node contains:
- hash
- key
- value
- next (for collision handling)
Segmentation Removed
Earlier Java versions (before Java 8) used Segment locks.
Java 8 replaced segments with a synchronized block at the bin level to improve performance and reduce memory overhead.
4. Concurrency Mechanisms
4.1 Lock Striping
Instead of locking the entire table, only a portion of the table (a bin) is locked during updates.
This allows multiple threads to operate on different bins without interference.
4.2 CAS (Compare-And-Swap)
A lock-free mechanism for reads and updates when possible.
It minimizes locking and improves performance for read-heavy workloads.
4.3 Tree Bins
When collisions increase in a single bin, the linked list of entries converts into a balanced tree for faster lookup (O(log n) instead of O(n)).
5. Key Methods and Their Behavior
Method | Thread-Safety | Notes |
---|---|---|
put() | Thread-safe | Uses bin-level locking |
get() | Lock-free read | Uses volatile reads and CAS |
remove() | Thread-safe | Bin-level locking |
size() | Approximate in concurrent context | Expensive due to locking |
computeIfAbsent() | Thread-safe | Computes value atomically if absent |
Example:
import java.util.concurrent.*;
public class ConcurrentHashMapExample {
public static void main(String[] args) {
ConcurrentHashMap map = new ConcurrentHashMap<>();
map.put(1, "Java");
map.put(2, "Python");
map.put(3, "C++");
map.forEach((k, v) -> System.out.println(k + " → " + v));
map.computeIfAbsent(4, key -> "Go");
System.out.println("After computeIfAbsent: " + map);
}
}
6. Iteration in ConcurrentHashMap
ConcurrentHashMap supports weakly consistent iterators, meaning:
- They do not throw ConcurrentModificationException.
- They may or may not reflect updates after the iterator was created.
- They guarantee no element is returned twice.
Example:
ConcurrentHashMap map = new ConcurrentHashMap<>();
map.put(1, "Java");
map.put(2, "Python");
for (Map.Entry entry : map.entrySet()) {
System.out.println(entry);
}
7. Performance Characteristics
- High performance in read-heavy workloads due to lock-free reads.
- Moderate contention in write-heavy workloads depending on bin locking.
- Better scalability than synchronized HashMap or Hashtable.
8. Custom Implementation Concept
A full custom ConcurrentHashMap is complex, but here’s a conceptual design:
Step 1 — Define the Bucket Structure
We need buckets to store entries and locks for thread safety.
import java.util.concurrent.locks.ReentrantLock;
static class Bucket {
private Node head;
private final ReentrantLock lock = new ReentrantLock();
void put(K key, V value) {
lock.lock();
try {
if (head == null) {
head = new Node<>(key, value);
} else {
Node current = head;
while (current != null) {
if (current.key.equals(key)) {
current.value = value; // update value
return;
}
if (current.next == null) {
current.next = new Node<>(key, value);
return;
}
current = current.next;
}
}
} finally {
lock.unlock();
}
}
V get(K key) {
lock.lock();
try {
Node current = head;
while (current != null) {
if (current.key.equals(key)) {
return current.value;
}
current = current.next;
}
return null;
} finally {
lock.unlock();
}
}
}
Step 2 — Define the Node Structure
Each bucket will store its entries as linked nodes.
static class Node {
K key;
V value;
Node next;
Node(K key, V value) {
this.key = key;
this.value = value;
}
}
Step 3 — Create the Main ConcurrentHashMap Class
This class will store an array of buckets and manage hashing.
public class CustomConcurrentHashMap {
private final int capacity = 16;
private final Bucket[] table;
public CustomConcurrentHashMap() {
table = new Bucket[capacity];
for (int i = 0; i < capacity; i++) {
table[i] = new Bucket<>();
}
}
private int getBucketIndex(K key) {
return Math.abs(key.hashCode() % capacity);
}
}
Step 4 — Implement put() Method
This method hashes the key and inserts it into the correct bucket.
public void put(K key, V value) {
int index = getBucketIndex(key);
table[index].put(key, value);
}
Step 5 — Implement get() Method
This method retrieves the value for a given key safely.
public V get(K key) {
int index = getBucketIndex(key);
return table[index].get(key);
}
Step 6 — Full CustomConcurrentHashMap Code
Here’s the complete working code combining all the steps.
import java.util.concurrent.locks.ReentrantLock;
public class CustomConcurrentHashMap {
private final int capacity = 16;
private final Bucket[] table;
public CustomConcurrentHashMap() {
table = new Bucket[capacity];
for (int i = 0; i < capacity; i++) {
table[i] = new Bucket<>();
}
}
private int getBucketIndex(K key) {
return Math.abs(key.hashCode() % capacity);
}
public void put(K key, V value) {
int index = getBucketIndex(key);
table[index].put(key, value);
}
public V get(K key) {
int index = getBucketIndex(key);
return table[index].get(key);
}
static class Bucket {
private Node head;
private final ReentrantLock lock = new ReentrantLock();
void put(K key, V value) {
lock.lock();
try {
if (head == null) {
head = new Node<>(key, value);
} else {
Node current = head;
while (current != null) {
if (current.key.equals(key)) {
current.value = value;
return;
}
if (current.next == null) {
current.next = new Node<>(key, value);
return;
}
current = current.next;
}
}
} finally {
lock.unlock();
}
}
V get(K key) {
lock.lock();
try {
Node current = head;
while (current != null) {
if (current.key.equals(key)) {
return current.value;
}
current = current.next;
}
return null;
} finally {
lock.unlock();
}
}
}
static class Node {
K key;
V value;
Node next;
Node(K key, V value) {
this.key = key;
this.value = value;
}
}
}
Step 7 — Testing the Custom ConcurrentHashMap
public class TestCustomConcurrentHashMap {
public static void main(String[] args) {
CustomConcurrentHashMap map = new CustomConcurrentHashMap<>();
map.put("Java", "Platform");
map.put("Python", "Language");
map.put("C++", "Programming");
System.out.println("Java: " + map.get("Java"));
System.out.println("Python: " + map.get("Python"));
System.out.println("C++: " + map.get("C++"));
}
}
Expected Output:
Java: Platform
Python: Language
C++: Programming
9. Advantages of ConcurrentHashMap
- High throughput in multi-threaded environments.
- Lock-free reads improve performance.
- Safe concurrent modification.
- Weakly consistent iterators avoid exceptions.
10. Disadvantages of ConcurrentHashMap
- Slightly higher memory consumption due to locks and internal structure.
- Slightly more complex iteration behavior.
- Writes are slower compared to HashMap because of locking.
11. Real-World Applications
- Caching systems (where high concurrency is needed).
- Shared state in multi-threaded applications.
- Real-time analytics processing.
- Thread-safe configuration maps.
Conclusion
In this blog, we have explored:
- Why ConcurrentHashMap is necessary for thread safety.
- How it works internally.
- Its concurrency mechanisms.
- Key advantages and disadvantages.
- A conceptual design of a custom concurrent hash map.
Understanding ConcurrentHashMap is crucial for designing highly concurrent applications in Java that require thread-safe access to shared resources.
Next Blog- Comparable vs Comparator in Java