Advanced Java October 06 ,2025

Custom Implementation of HashSet in Java

Understanding Internal Working and Step-by-Step Construction

Introduction

HashSet is one of Java’s core Collection classes. It is part of the Java Collections Framework and implements the Set interface. A Set is a collection of unique elements — meaning it does not allow duplicate values.

The interesting part is that internally HashSet uses HashMap for storage. This design decision ensures that HashSet provides constant-time performance for basic operations like add, remove, and contains.

In this blog, we will:

  • Understand how HashSet works internally.
  • Learn about its relationship with HashMap.
  • Build a custom HashSet from scratch.
  • Analyze its advantages, disadvantages, and real-world use cases.

Understanding the Relationship Between HashSet and HashMap

A HashSet in Java is essentially backed by a HashMap.
When you add an element to a HashSet, the element is stored as a key in the internal HashMap.
The value is a constant dummy object (a placeholder).

Example:

HashSet set = new HashSet<>();
set.add("Java");
set.add("Python");
set.add("C++");

Internally:

HashMap:
[Java=null, Python=null, C++=null]

This means:

  • The uniqueness guarantee in HashSet comes from the uniqueness of HashMap keys.
  • HashSet relies entirely on HashMap’s hashing mechanism.

Core Features of HashSet

  • Stores unique elements only.
  • Allows null values (only one null element allowed).
  • Unordered collection — insertion order is not maintained.
  • Backed by a HashMap internally.

Designing a Custom HashSet

Since HashSet internally uses HashMap, we will follow a similar approach but keep it simpler.

Step 1: Define a CustomHashSet Class

public class CustomHashSet {
    private static final Object DUMMY = new Object();
    private CustomHashMap map;

    public CustomHashSet() {
        map = new CustomHashMap<>();
    }
}

Here:

  • We use our previously implemented CustomHashMap to store elements.
  • A single dummy object is used as the value for all keys.

Step 2: Implement the add() Method

This will add elements to the set.

public boolean add(K key) {
    if (map.get(key) == null) {
        map.put(key, DUMMY);
        return true;
    }
    return false; // element already exists
}

Step 3: Implement the contains() Method

Checks whether the element exists.

public boolean contains(K key) {
    return map.get(key) != null;
}

Step 4: Implement the remove() Method

Removes the element from the set.

public boolean remove(K key) {
    return map.remove(key);
}

Step 5: Implement the display() Method

Displays all elements in the set.

public void display() {
    map.display();
}

Step 6: Testing the CustomHashSet

public class TestCustomHashSet {
    public static void main(String[] args) {
        CustomHashSet set = new CustomHashSet<>();

        set.add("Java");
        set.add("Python");
        set.add("C++");
        set.add("Java"); // duplicate element

        set.display();

        System.out.println("Contains Python? " + set.contains("Python"));
        System.out.println("Contains Ruby? " + set.contains("Ruby"));

        set.remove("Python");
        set.display();
    }
}

Expected Output

Bucket 1: [Java]
Bucket 2: [Python]
Bucket 3: [C++]
Contains Python? true
Contains Ruby? false
Bucket 1: [Java]
Bucket 3: [C++]

Got it — you only want the CustomHashSet complete implementation (self-contained, no external CustomHashMap dependency).

CustomHashSet – Complete Code

import java.util.LinkedList;

public class CustomHashSet<K> {

    // Node class for each element in the set
    private class Node {
        K key;
        Node(K key) {
            this.key = key;
        }
    }

    // Default initial capacity
    private int capacity = 10;
    private LinkedList<Node>[] buckets;
    private int size = 0;

    @SuppressWarnings("unchecked")
    public CustomHashSet() {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Hash function to calculate bucket index
    private int hash(K key) {
        return Math.abs(key.hashCode() % capacity);
    }

    // Add element to set
    public boolean add(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                return false; // already exists
            }
        }

        bucket.add(new Node(key));
        size++;

        // Optional: rehash if load factor exceeds threshold
        if ((1.0 * size) / capacity > 0.7) {
            rehash();
        }

        return true;
    }

    // Check if element exists
    public boolean contains(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                return true;
            }
        }
        return false;
    }

    // Remove element
    public boolean remove(K key) {
        int index = hash(key);
        LinkedList<Node> bucket = buckets[index];

        for (Node node : bucket) {
            if (node.key.equals(key)) {
                bucket.remove(node);
                size--;
                return true;
            }
        }
        return false;
    }

    // Display all elements
    public void display() {
        for (int i = 0; i < capacity; i++) {
            if (!buckets[i].isEmpty()) {
                System.out.print("Bucket " + i + ": ");
                for (Node node : buckets[i]) {
                    System.out.print("[" + node.key + "] ");
                }
                System.out.println();
            }
        }
    }

    // Returns number of elements
    public int size() {
        return size;
    }

    // Internal method to rehash (resize and redistribute keys)
    @SuppressWarnings("unchecked")
    private void rehash() {
        LinkedList<Node>[] oldBuckets = buckets;
        capacity *= 2;
        buckets = new LinkedList[capacity];
        size = 0;

        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }

        for (LinkedList<Node> bucket : oldBuckets) {
            for (Node node : bucket) {
                add(node.key);
            }
        }
    }

    // Test main method
    public static void main(String[] args) {
        CustomHashSet<String> set = new CustomHashSet<>();

        set.add("Java");
        set.add("Python");
        set.add("C++");
        set.add("Java"); // duplicate

        System.out.println("Initial Set:");
        set.display();

        System.out.println("Contains Python? " + set.contains("Python"));
        System.out.println("Contains Ruby? " + set.contains("Ruby"));

        set.remove("Python");

        System.out.println("\nAfter Removing Python:");
        set.display();
    }
}

Expected Output

Initial Set:
Bucket 1: [Java]
Bucket 2: [Python]
Bucket 3: [C++]
Contains Python? true
Contains Ruby? false

After Removing Python:
Bucket 1: [Java]
Bucket 3: [C++]

Explanation

  • Buckets: Array of linked lists used to store elements based on hash codes.
  • Hash Function: Uses hashCode() of the key to determine its bucket index.
  • Add: Checks for duplicates, then adds new elements.
  • Contains: Returns true if the key exists.
  • Remove: Deletes an element if found.
  • Rehashing: Automatically doubles bucket size when load factor exceeds 0.7 to maintain efficiency

 

Internal Working of HashSet

Here’s what happens internally when we perform operations:

  1. add(element)
    • Calls map.put(element, DUMMY)
    • HashMap computes the hash index and stores the element as a key.
  2. contains(element)
    • Calls map.get(element)
    • Returns true if element exists.
  3. remove(element)
    • Calls map.remove(element)
    • Deletes the key from the internal map.

Time Complexity of HashSet Operations

OperationAverage CaseWorst Case
add()O(1)O(n)
contains()O(1)O(n)
remove()O(1)O(n)

The worst-case scenario happens when all elements hash to the same bucket, forming a linked list (or tree in Java 8+).

Differences Between HashSet and HashMap

FeatureHashSetHashMap
Data StructureUses HashMap internallyUses array of buckets + linked list/tree
StorageOnly keysKey-value pairs
Null ValuesAllows one null valueAllows one null key and multiple null values
OrderUnorderedUnordered
PurposeStore unique elementsStore key-value pairs

Advantages of HashSet

  • Provides O(1) average time complexity for add, remove, and contains.
  • Automatically handles uniqueness of elements.
  • Easy to implement using HashMap.
  • Flexible and widely used in applications.

Disadvantages of HashSet

  • No ordering guarantees (use LinkedHashSet for order preservation).
  • Not synchronized — requires synchronization for multi-threading (use ConcurrentHashSet).
  • Memory overhead because of internal HashMap storage.
  • Poor hashCode implementation can degrade performance.

Real-World Applications

  • Removing duplicates from a collection.
  • Maintaining a list of unique items (e.g., usernames, email IDs).
  • Storing distinct elements for mathematical operations.
  • Membership checking (fast lookup).

Conclusion

In this blog, we’ve dissected the HashSet implementation and learned that it is basically a wrapper around HashMap.
By building a CustomHashSet, we understood:

  • How elements are stored as keys in a HashMap.
  • How collisions are handled through hashing and chaining.
  • How core set operations map to HashMap operations.

Understanding this relationship not only clarifies HashSet’s internal mechanism but also strengthens your grasp on Java’s Collections Framework.

 

Next Blog- Custom Implementation of LinkedHashMap in Java

 

 

Sanjiv
0

You must logged in to post comments.

Related Blogs

Generics P...
Advanced Java August 08 ,2025

Generics Part- 2

Collection...
Advanced Java July 07 ,2025

Collections Framewor...

Mastering...
Advanced Java August 08 ,2025

Mastering Java Multi...

Annotation...
Advanced Java August 08 ,2025

Annotations

Java Memor...
Advanced Java August 08 ,2025

Java Memory Manageme...

Java Lambd...
Advanced Java August 08 ,2025

Java Lambda Expressi...

Java Funct...
Advanced Java August 08 ,2025

Java Functional Inte...

Java Strea...
Advanced Java August 08 ,2025

Java Stream API

JDBC (Java...
Advanced Java August 08 ,2025

JDBC (Java Database...

JDBC (Java...
Advanced Java September 09 ,2025

JDBC (Java Database...

Annotation...
Advanced Java August 08 ,2025

Annotations

Generics
Advanced Java August 08 ,2025

Generics

Java I/O (...
Advanced Java August 08 ,2025

Java I/O (NIO)

Introducti...
Advanced Java September 09 ,2025

Introduction to Desi...

Design Pat...
Advanced Java September 09 ,2025

Design Patterns in J...

Other Prin...
Advanced Java September 09 ,2025

Other Principles Beh...

Creational...
Advanced Java September 09 ,2025

Creational Design Pa...

In Creatio...
Advanced Java September 09 ,2025

In Creational Design...

In Creatio...
Advanced Java September 09 ,2025

In Creational Design...

Creational...
Advanced Java September 09 ,2025

Creational Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

In Creatio...
Advanced Java September 09 ,2025

In Creational Design...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Builder De...
Advanced Java September 09 ,2025

Builder Design Patte...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Structural...
Advanced Java September 09 ,2025

Structural Design Pa...

Design Pat...
Advanced Java September 09 ,2025

Design Patterns in J...

Chain of R...
Advanced Java September 09 ,2025

Chain of Responsibil...

Command De...
Advanced Java September 09 ,2025

Command Design Patte...

Interprete...
Advanced Java September 09 ,2025

Interpreter Design P...

Iterator D...
Advanced Java September 09 ,2025

Iterator Design Patt...

Mediator D...
Advanced Java September 09 ,2025

Mediator Design Patt...

Memento De...
Advanced Java September 09 ,2025

Memento Design Patte...

Observer D...
Advanced Java September 09 ,2025

Observer Design Patt...

State Desi...
Advanced Java September 09 ,2025

State Design Pattern...

Strategy D...
Advanced Java September 09 ,2025

Strategy Design Patt...

Template M...
Advanced Java September 09 ,2025

Template Method Desi...

Visitor De...
Advanced Java September 09 ,2025

Visitor Design Patte...

Prototype...
Advanced Java September 09 ,2025

Prototype Design Pat...

Java 8+ Fe...
Advanced Java October 10 ,2025

Java 8+ Features

SOLID Prin...
Advanced Java October 10 ,2025

SOLID Principles in...

Custom Imp...
Advanced Java October 10 ,2025

Custom Implementatio...

Custom Imp...
Advanced Java October 10 ,2025

Custom Implementatio...

Custom Imp...
Advanced Java October 10 ,2025

Custom Implementatio...

How Iterat...
Advanced Java October 10 ,2025

How Iterators Work i...

How Concur...
Advanced Java October 10 ,2025

How ConcurrentHashMa...

Comparable...
Advanced Java October 10 ,2025

Comparable vs Compar...

Get In Touch

G06, Kristal Olivine Bellandur near Bangalore Central Mall, Bangalore Karnataka, 560103

+91-8076082435

techiefreak87@gmail.com