Advanced Java October 06 ,2025

Custom Implementation of HashSet in Java

Understanding Internal Working and Step-by-Step Construction

Introduction

HashSet is one of Java’s core Collection classes. It is part of the Java Collections Framework and implements the Set interface. A Set is a collection of unique elements — meaning it does not allow duplicate values.

The interesting part is that internally HashSet uses HashMap for storage. This design decision ensures that HashSet provides constant-time performance for basic operations like add, remove, and contains.

In this blog, we will:

Understand how HashSet works internally.
Learn about its relationship with HashMap.
Build a custom HashSet from scratch.
Analyze its advantages, disadvantages, and real-world use cases.

Understanding the Relationship Between HashSet and HashMap

A HashSet in Java is essentially backed by a HashMap.
When you add an element to a HashSet, the element is stored as a key in the internal HashMap.
The value is a constant dummy object (a placeholder).

Example:

HashSet set = new HashSet<>();
set.add("Java");
set.add("Python");
set.add("C++");

Internally:

HashMap:
[Java=null, Python=null, C++=null]

This means:

The uniqueness guarantee in HashSet comes from the uniqueness of HashMap keys.
HashSet relies entirely on HashMap’s hashing mechanism.

Core Features of HashSet

Stores unique elements only.
Allows null values (only one null element allowed).
Unordered collection — insertion order is not maintained.
Backed by a HashMap internally.

Designing a Custom HashSet

Since HashSet internally uses HashMap, we will follow a similar approach but keep it simpler.

Step 1: Define a CustomHashSet Class

public class CustomHashSet {
    private static final Object DUMMY = new Object();
    private CustomHashMap map;

    public CustomHashSet() {
        map = new CustomHashMap<>();
    }
}

Here:

We use our previously implemented CustomHashMap to store elements.
A single dummy object is used as the value for all keys.

Step 2: Implement the add() Method

This will add elements to the set.

public boolean add(K key) {
    if (map.get(key) == null) {
        map.put(key, DUMMY);
        return true;
    }
    return false; // element already exists
}

Step 3: Implement the contains() Method

Checks whether the element exists.

public boolean contains(K key) {
    return map.get(key) != null;
}

Step 4: Implement the remove() Method

Removes the element from the set.

public boolean remove(K key) {
    return map.remove(key);
}

Step 5: Implement the display() Method

Displays all elements in the set.

public void display() {
    map.display();
}

Step 6: Testing the CustomHashSet

public class TestCustomHashSet {
    public static void main(String[] args) {
        CustomHashSet set = new CustomHashSet<>();

        set.add("Java");
        set.add("Python");
        set.add("C++");
        set.add("Java"); // duplicate element

        set.display();

        System.out.println("Contains Python? " + set.contains("Python"));
        System.out.println("Contains Ruby? " + set.contains("Ruby"));

        set.remove("Python");
        set.display();
    }
}

Expected Output

Bucket 1: [Java]
Bucket 2: [Python]
Bucket 3: [C++]
Contains Python? true
Contains Ruby? false
Bucket 1: [Java]
Bucket 3: [C++]

Got it — you only want the CustomHashSet complete implementation (self-contained, no external CustomHashMap dependency).

CustomHashSet – Complete Code

import java.util.LinkedList;

public class CustomHashSet<K> {

    // Node class for each element in the set
    private class Node {
        K key;
        Node(K key) {
            this.key = key;
        }
    }

    // Default initial capacity
    private int capacity = 10;
    private LinkedList<Node>[] buckets;
    private int size = 0;

    @SuppressWarnings("unchecked")
    public CustomHashSet() {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Hash function to calculate bucket index
    private int hash(K key) {
        return Math.abs(key.hashCode() % capacity);
    }

    // Add element to set
    public boolean add(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                return false; // already exists
            }
        }

        bucket.add(new Node(key));
        size++;

        // Optional: rehash if load factor exceeds threshold
        if ((1.0 * size) / capacity > 0.7) {
            rehash();
        }

        return true;
    }

    // Check if element exists
    public boolean contains(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                return true;
            }
        }
        return false;
    }

    // Remove element
    public boolean remove(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                bucket.remove(node);
                size--;
                return true;
            }
        }
        return false;
    }

    // Display all elements
    public void display() {
        for (int i = 0; i < capacity; i++) {
            if (!buckets[i].isEmpty()) {
                System.out.print("Bucket " + i + ": ");
                for (Node node : buckets[i]) {
                    System.out.print("[" + node.key + "] ");
                }
                System.out.println();
            }
        }
    }

    // Returns number of elements
    public int size() {
        return size;
    }

    // Internal method to rehash (resize and redistribute keys)
    @SuppressWarnings("unchecked")
    private void rehash() {
        LinkedList<Node>[] oldBuckets = buckets;
        capacity *= 2;
        buckets = new LinkedList[capacity];
        size = 0;

        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }

        for (LinkedList<Node> bucket : oldBuckets) {
            for (Node node : bucket) {
                add(node.key);
            }
        }
    }

    // Test main method
    public static void main(String[] args) {
        CustomHashSet<String> set = new CustomHashSet<>();

        set.add("Java");
        set.add("Python");
        set.add("C++");
        set.add("Java"); // duplicate

        System.out.println("Initial Set:");
        set.display();

        System.out.println("Contains Python? " + set.contains("Python"));
        System.out.println("Contains Ruby? " + set.contains("Ruby"));

        set.remove("Python");

        System.out.println("\nAfter Removing Python:");
        set.display();
    }
}

Expected Output

Initial Set:
Bucket 1: [Java]
Bucket 2: [Python]
Bucket 3: [C++]
Contains Python? true
Contains Ruby? false

After Removing Python:
Bucket 1: [Java]
Bucket 3: [C++]

Explanation

Buckets: Array of linked lists used to store elements based on hash codes.
Hash Function: Uses hashCode() of the key to determine its bucket index.
Add: Checks for duplicates, then adds new elements.
Contains: Returns true if the key exists.
Remove: Deletes an element if found.
Rehashing: Automatically doubles bucket size when load factor exceeds 0.7 to maintain efficiency

Internal Working of HashSet

Here’s what happens internally when we perform operations:

add(element)
- Calls map.put(element, DUMMY)
- HashMap computes the hash index and stores the element as a key.
contains(element)
- Calls map.get(element)
- Returns true if element exists.
remove(element)
- Calls map.remove(element)
- Deletes the key from the internal map.

Time Complexity of HashSet Operations

Operation	Average Case	Worst Case
add()	O(1)	O(n)
contains()	O(1)	O(n)
remove()	O(1)	O(n)

The worst-case scenario happens when all elements hash to the same bucket, forming a linked list (or tree in Java 8+).

Differences Between HashSet and HashMap

Feature	HashSet	HashMap
Data Structure	Uses HashMap internally	Uses array of buckets + linked list/tree
Storage	Only keys	Key-value pairs
Null Values	Allows one null value	Allows one null key and multiple null values
Order	Unordered	Unordered
Purpose	Store unique elements	Store key-value pairs

Advantages of HashSet

Provides O(1) average time complexity for add, remove, and contains.
Automatically handles uniqueness of elements.
Easy to implement using HashMap.
Flexible and widely used in applications.

Disadvantages of HashSet

No ordering guarantees (use LinkedHashSet for order preservation).
Not synchronized — requires synchronization for multi-threading (use ConcurrentHashSet).
Memory overhead because of internal HashMap storage.
Poor hashCode implementation can degrade performance.

Real-World Applications

Removing duplicates from a collection.
Maintaining a list of unique items (e.g., usernames, email IDs).
Storing distinct elements for mathematical operations.
Membership checking (fast lookup).

Conclusion

In this blog, we’ve dissected the HashSet implementation and learned that it is basically a wrapper around HashMap.
By building a CustomHashSet, we understood:

How elements are stored as keys in a HashMap.
How collisions are handled through hashing and chaining.
How core set operations map to HashMap operations.

Understanding this relationship not only clarifies HashSet’s internal mechanism but also strengthens your grasp on Java’s Collections Framework.

Next Blog- Custom Implementation of LinkedHashMap in Java

Sanjiv

You must logged in to post comments.

Java Stream API

Java

Java

Custom Implementation of HashSet in Java

Introduction

Understanding the Relationship Between HashSet and HashMap

Core Features of HashSet

Designing a Custom HashSet

Step 1: Define a CustomHashSet Class

Step 2: Implement the add() Method

Step 3: Implement the contains() Method

Step 4: Implement the remove() Method

Step 5: Implement the display() Method

Step 6: Testing the CustomHashSet

Expected Output

CustomHashSet – Complete Code

Expected Output

Explanation

Internal Working of HashSet

Time Complexity of HashSet Operations

Differences Between HashSet and HashMap

Advantages of HashSet

Disadvantages of HashSet

Real-World Applications

Conclusion

Related Blogs

Generics Part- 2

Collections Framewor...

Mastering Java Multi...

Annotations

Java Memory Manageme...

Java Lambda Expressi...

Java Functional Inte...

Java Stream API

JDBC (Java Database...

JDBC (Java Database...

Annotations

Generics

Java I/O (NIO)

Introduction to Desi...

Design Patterns in J...

Other Principles Beh...

Creational Design Pa...

In Creational Design...

In Creational Design...

Creational Design Pa...

Structural Design Pa...

In Creational Design...

Structural Design Pa...

Builder Design Patte...

Structural Design Pa...

Structural Design Pa...

Structural Design Pa...

Structural Design Pa...

Structural Design Pa...

Structural Design Pa...

Design Patterns in J...

Chain of Responsibil...

Command Design Patte...

Interpreter Design P...

Iterator Design Patt...

Mediator Design Patt...

Memento Design Patte...

Observer Design Patt...

State Design Pattern...

Strategy Design Patt...

Template Method Desi...

Visitor Design Patte...

Prototype Design Pat...

Java 8+ Features

SOLID Principles in...

Custom Implementatio...

Custom Implementatio...

Custom Implementatio...

How Iterators Work i...

How ConcurrentHashMa...

Comparable vs Compar...

Get In Touch

Categories