Implementation of Object Detection using SSD in OpenCV

Artificial intelligence March 23, 2025

Introduction

Object detection is a core computer vision task: identifying and locating objects in an image or video. Single Shot MultiBox Detector (SSD) is a popular object detection algorithm that offers a balance between speed and accuracy. In this guide, we will implement object detection using SSD in OpenCV.

What is SSD (Single Shot MultiBox Detector)?

SSD is a deep learning-based object detection model that:

Uses a single forward pass to detect objects.
Eliminates the need for region proposal networks (used in Faster R-CNN), making it faster.
Detects objects at multiple scales using different feature maps.

SSD consists of:

A base network (VGG16 or MobileNet) for feature extraction.
Additional convolutional layers to detect objects of different sizes.
Default (anchor) boxes of several scales and aspect ratios at each feature-map location, which the network adjusts to fit the detected objects.
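To make the multi-scale idea concrete, here is a minimal sketch of how the original SSD paper spaces default-box scales across feature maps: s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1), with width s_k * sqrt(a) and height s_k / sqrt(a) for aspect ratio a. The values s_min = 0.2 and s_max = 0.9 come from the paper; the six feature maps, the three aspect ratios, and the function names are illustrative assumptions, and the MobileNet SSD model used later in this guide may be configured differently.

```python
import math

def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Scale of the default boxes for each of `num_maps` feature maps."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """(width, height) pairs, relative to the image size, for one scale."""
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]

# Early feature maps get small boxes, later ones get large boxes.
scales = default_box_scales(6)
for k, s in enumerate(scales, start=1):
    shapes = ", ".join(f"{w:.2f}x{h:.2f}" for w, h in default_box_shapes(s))
    print(f"feature map {k}: scale={s:.2f} -> {shapes}")
```

Each aspect ratio keeps the box area at roughly the square of the scale, so a feature map handles objects of one size but several shapes.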

Why Use SSD with OpenCV?

OpenCV provides a deep learning module (DNN, exposed in Python as cv2.dnn) to load SSD models.
It supports pre-trained SSD models exported from Caffe, TensorFlow, ONNX, and other frameworks.
Efficient for real-time object detection on CPU and GPU.

Step-by-Step Implementation of SSD Object Detection using OpenCV

1. Install Required Libraries

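Make sure you have OpenCV and NumPy installed. Install only one OpenCV package: opencv-python includes the GUI functions (such as cv2.imshow) used in this guide, while opencv-python-headless does not, and the two should not be installed side by side.

pip install opencv-python numpy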

2. Download the Pre-trained SSD Model

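We will use SSD with a MobileNet backbone in Caffe format. The widely distributed MobileNetSSD_deploy weights predict the 20 PASCAL VOC classes plus background (the labels listed in step 3), not the 80 categories of the COCO (Common Objects in Context) dataset.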

Download Required Files:

SSD Model (Caffe Format)
- Network definition: MobileNetSSD_deploy.prototxt
- Weights: MobileNetSSD_deploy.caffemodel

Save the following files:

MobileNetSSD_deploy.prototxt
MobileNetSSD_deploy.caffemodel
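A missing or empty model file is a common cause of a confusing error when the network is loaded. The sketch below checks for both files before going further. The helper name missing_model_files and the assumption that the files sit in the current directory are ours, not part of OpenCV.

```python
from pathlib import Path

# File names taken from the list above; change model_dir to wherever you saved them.
MODEL_FILES = ("MobileNetSSD_deploy.prototxt", "MobileNetSSD_deploy.caffemodel")

def missing_model_files(model_dir="."):
    """Return the names of model files that are absent or empty in model_dir."""
    base = Path(model_dir)
    return [name for name in MODEL_FILES
            if not (base / name).is_file() or (base / name).stat().st_size == 0]

missing = missing_model_files(".")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found.")
```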

3. Load SSD Model in OpenCV

import cv2
import numpy as np

# Load the SSD model
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt", "MobileNetSSD_deploy.caffemodel")

# Class labels: the 20 PASCAL VOC classes plus "background"
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant",
           "sheep", "sofa", "train", "tvmonitor"]

# Assign random colors to each class
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

4. Perform Object Detection on an Image

def detect_objects(image_path):
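    # Load the image; cv2.imread returns None if the file cannot be read
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    (h, w) = image.shape[:2]

    # Convert the image into a blob: resize to 300x300, subtract the mean
    # from every channel, then scale the result to roughly [-1, 1]
    blob = cv2.dnn.blobFromImage(image, scalefactor=0.007843, size=(300, 300),
                                 mean=(127.5, 127.5, 127.5))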

    # Set the input to the SSD model
    net.setInput(blob)

    # Perform forward pass to get detections
    detections = net.forward()

    # Process the detections
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]  # Confidence score
        
        # Filter out weak detections
        if confidence > 0.5:
            idx = int(detections[0, 0, i, 1])  # Class index
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

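            # Draw the bounding box; move the label below the top edge of the
            # box when there is no room for it above
            label = f"{CLASSES[idx]}: {confidence * 100:.2f}%"
            labelY = startY - 10 if startY > 20 else startY + 15
            cv2.rectangle(image, (startX, startY), (endX, endY), COLORS[idx], 2)
            cv2.putText(image, label, (startX, labelY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)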

    # Display the output
    cv2.imshow("SSD Object Detection", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Test the function
detect_objects("test_image.jpg")

Explanation:

Reads an image.
Converts it into a blob for the SSD model.
Runs the model to detect objects.
Filters results based on confidence score.
Draws bounding boxes with class labels.
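The blob step is the least obvious of these. Its arithmetic is (pixel - mean) * scalefactor, and since 0.007843 is approximately 1 / 127.5, pixel values in [0, 255] end up in roughly [-1, 1]. The result is also rearranged from height x width x channels into the batch x channels x height x width layout the network expects. Here is a NumPy sketch of that arithmetic only; it skips the resize, it is not how OpenCV implements the function, and normalize_like_blob is a hypothetical name.

```python
import numpy as np

def normalize_like_blob(image, scalefactor=0.007843, mean=127.5):
    """Mimic the blob arithmetic on an already-resized HxWx3 uint8 image.

    Returns a float32 array of shape (1, 3, H, W). Resizing is deliberately
    left out of this sketch.
    """
    pixels = image.astype(np.float32)
    scaled = (pixels - mean) * scalefactor
    # HWC -> CHW, then add the batch dimension in front.
    return scaled.transpose(2, 0, 1)[np.newaxis, ...]

# A synthetic 300x300 image with bands of pixel value 0, 127 and 255.
demo = np.zeros((300, 300, 3), dtype=np.uint8)
demo[100:200] = 127
demo[200:] = 255
blob = normalize_like_blob(demo)
print(blob.shape, float(blob.min()), float(blob.max()))
```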

5. Perform Real-time Object Detection on a Webcam

def real_time_detection():
    cap = cv2.VideoCapture(0)  # Open the default webcam
    if not cap.isOpened():
        raise RuntimeError("Could not open webcam")

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        (h, w) = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, scalefactor=0.007843, size=(300, 300),
                                     mean=(127.5, 127.5, 127.5))
        net.setInput(blob)
        detections = net.forward()

        for i in range(detections.shape[2]):
            confidence = detections[0, 0, i, 2]
            if confidence > 0.5:
                idx = int(detections[0, 0, i, 1])
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                (startX, startY, endX, endY) = box.astype("int")

                label = f"{CLASSES[idx]}: {confidence * 100:.2f}%"
                labelY = startY - 10 if startY > 20 else startY + 15
                cv2.rectangle(frame, (startX, startY), (endX, endY), COLORS[idx], 2)
                cv2.putText(frame, label, (startX, labelY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)

        cv2.imshow("SSD Object Detection - Webcam", frame)
        
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()

# Run real-time detection
real_time_detection()

Explanation:

Opens a webcam and captures frames.
Processes frames in real-time using SSD.
Displays detected objects with bounding boxes.
Press q to exit.
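The image and webcam functions repeat the same parsing loop. The network output has shape (1, 1, N, 7), and each row holds [image_id, class_id, confidence, x1, y1, x2, y2] with corner coordinates normalised to 0..1. One possible way to factor that out is sketched below. The helper parse_detections is a hypothetical name, and clipping boxes to the frame and skipping out-of-range class indices are our additions rather than something the code above does. It runs on a synthetic array, so no model is needed to try it.

```python
import numpy as np

def parse_detections(detections, width, height, class_names, min_confidence=0.5):
    """Turn a (1, 1, N, 7) SSD output into a list of (label, confidence, box).

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2] with the
    corner coordinates normalised to the 0..1 range.
    """
    results = []
    scale = np.array([width, height, width, height], dtype=np.float32)
    limits = np.array([width - 1, height - 1, width - 1, height - 1])
    for row in detections[0, 0]:
        confidence = float(row[2])
        if confidence <= min_confidence:
            continue  # weak detection
        class_id = int(row[1])
        if not 0 <= class_id < len(class_names):
            continue  # class index outside the label list
        # Scale to pixels, then clip so the box stays inside the frame.
        box = np.clip((row[3:7] * scale).astype(int), 0, limits)
        results.append((class_names[class_id], confidence, tuple(int(v) for v in box)))
    return results

# Synthetic output: one confident "dog", one weak detection, and one box
# that spills past the image border.
names = ["background", "cat", "dog"]
fake = np.array([[[
    [0, 2, 0.90, 0.10, 0.20, 0.50, 0.60],
    [0, 1, 0.30, 0.00, 0.00, 0.10, 0.10],
    [0, 1, 0.80, -0.05, 0.50, 1.20, 0.90],
]]], dtype=np.float32)
for label, conf, box in parse_detections(fake, 640, 480, names):
    print(f"{label}: {conf:.2f} {box}")
```

With a real model, the call would be parse_detections(net.forward(), w, h, CLASSES), and the drawing code would loop over the returned list.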

Performance Optimization Tips

Use GPU for Acceleration

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

This enables GPU acceleration. It requires an OpenCV build compiled with CUDA support; the standard pip wheels are CPU-only.

Resize Input Image Before Detection

- Keep the network input small. The blob step already resizes every image to 300×300, so detection cost does not grow with the original image size.

Use a Faster Model

- SSD MobileNet is fast, but for even better performance, try YOLO or TensorFlow Lite SSD.
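To judge whether any of these tips actually helps on your machine, it is useful to measure frames per second before and after the change. Below is a minimal sketch of a rolling FPS counter. FpsMeter is a hypothetical helper, not part of OpenCV, and the sleep call merely stands in for the per-frame detection work.

```python
import time
from collections import deque

class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self._stamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one processed frame and return the current FPS estimate."""
        self._stamps.append(time.perf_counter() if now is None else now)
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0

meter = FpsMeter(window=10)
for _ in range(20):
    time.sleep(0.01)  # stand-in for net.forward() on one frame
    fps = meter.tick()
print(f"approx. {fps:.1f} FPS")
```

In the webcam loop, calling meter.tick() once per frame and drawing the value with cv2.putText would show the effect of a backend or model change directly.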

Conclusion

SSD (Single Shot MultiBox Detector) offers a good balance between speed and accuracy.
OpenCV's DNN module makes it easy to load and run SSD models.
The same code can be used for object detection in images and in real-time video.
It works well on CPU, and GPU acceleration improves performance further.

Purnima
