Implementation of Object Detection using SSD in OpenCV
Introduction
Object detection is a key component in computer vision that helps in identifying and locating objects in an image or video. Single Shot MultiBox Detector (SSD) is a popular object detection algorithm that offers a balance between speed and accuracy. In this guide, we will implement object detection using SSD in OpenCV.
What is SSD (Single Shot MultiBox Detector)?
SSD is a deep learning-based object detection model that:
- Uses a single forward pass to detect objects.
- Eliminates the need for region proposal networks (used in Faster R-CNN), making it faster.
- Detects objects at multiple scales using different feature maps.
SSD consists of:
- A base network (VGG16 or MobileNet) for feature extraction.
- Additional convolutional layers to detect objects of different sizes.
- Default (anchor) boxes of several aspect ratios at each feature-map cell, which the network adjusts to fit objects of different shapes and sizes.
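To make the multi-scale idea concrete, here is a minimal sketch of how default-box scales can be spread across feature maps and how many boxes result. It assumes the linear scale rule and the SSD300/VGG16 layout described in the SSD paper (six feature maps, s_min = 0.2, s_max = 0.9); a MobileNet backbone and real implementations use somewhat different values, and the function names are illustrative only.

```python
# Sketch of the default-box scale rule from the SSD paper (illustrative only).
# Feature-map sizes and boxes-per-cell assume the SSD300/VGG16 configuration.

def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map (fraction of image size)."""
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def count_default_boxes(sizes, boxes_per_cell):
    """Total number of default boxes over all feature maps."""
    return sum(size * size * boxes for size, boxes in zip(sizes, boxes_per_cell))

FEATURE_MAP_SIZES = [38, 19, 10, 5, 3, 1]  # cells per side, largest map first
BOXES_PER_CELL = [4, 6, 6, 6, 4, 4]        # default boxes predicted at each cell

scales = default_box_scales(len(FEATURE_MAP_SIZES))
total_boxes = count_default_boxes(FEATURE_MAP_SIZES, BOXES_PER_CELL)

for size, scale in zip(FEATURE_MAP_SIZES, scales):
    print(f"{size}x{size} feature map -> box scale {scale:.2f}")
print("total default boxes:", total_boxes)
```

Large feature maps get small boxes (for small objects) and small feature maps get large boxes, which is how one forward pass covers many object sizes.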
Why Use SSD with OpenCV?
- OpenCV provides a deep learning module (DNN) that can load SSD models.
- Supports pre-trained SSD models (Caffe, TensorFlow, ONNX, etc.).
- Efficient for real-time object detection on CPU and GPU.
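As a rough illustration of the multi-framework support, the sketch below picks a cv2.dnn reader based on the weights-file extension. The helper names (model_framework, load_network) and the extension table are assumptions made for this example; only readNetFromCaffe, readNetFromTensorflow, and readNetFromONNX are OpenCV functions.

```python
from pathlib import Path

# Weights-file extensions and the framework each one usually indicates.
FRAMEWORK_BY_EXTENSION = {
    ".caffemodel": "caffe",
    ".pb": "tensorflow",
    ".onnx": "onnx",
}

def model_framework(weights_path):
    """Guess the source framework of a model from its weights-file extension."""
    extension = Path(weights_path).suffix.lower()
    if extension not in FRAMEWORK_BY_EXTENSION:
        raise ValueError(f"Unrecognized model file type: {extension!r}")
    return FRAMEWORK_BY_EXTENSION[extension]

def load_network(weights_path, config_path=None):
    """Load a network with the cv2.dnn reader that matches its framework."""
    import cv2  # imported lazily so model_framework works without OpenCV

    framework = model_framework(weights_path)
    if framework == "caffe":
        if config_path is None:
            raise ValueError("Caffe models need a .prototxt config file")
        # Caffe reader takes the network definition first, then the weights.
        return cv2.dnn.readNetFromCaffe(config_path, weights_path)
    if framework == "tensorflow":
        if config_path is None:
            return cv2.dnn.readNetFromTensorflow(weights_path)
        return cv2.dnn.readNetFromTensorflow(weights_path, config_path)
    return cv2.dnn.readNetFromONNX(weights_path)
```

For the model used in this guide, the call would be load_network("MobileNetSSD_deploy.caffemodel", "MobileNetSSD_deploy.prototxt").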
Step-by-Step Implementation of SSD Object Detection using OpenCV
1. Install Required Libraries
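Make sure you have OpenCV and NumPy installed. Install only one OpenCV package: opencv-python includes the GUI functions (such as cv2.imshow) used below, while opencv-python-headless does not, and installing both can cause conflicts.
pip install opencv-python numpy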
2. Download the Pre-trained SSD Model
We will use SSD with a MobileNet backbone in Caffe format. This model (MobileNetSSD_deploy) predicts the 20 PASCAL VOC object classes plus a background class, not the 80 COCO classes.
Download Required Files:
- Model definition: MobileNetSSD_deploy.prototxt
- Weights: MobileNetSSD_deploy.caffemodel
Save both files in the directory you run the script from, since the code below loads them by relative path.
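Before loading the model, it can help to confirm that both files are actually in place, since a missing or empty file otherwise surfaces as a less obvious OpenCV error. A minimal sketch, where missing_model_files is a hypothetical helper name and the current directory is an assumed location:

```python
from pathlib import Path

REQUIRED_FILES = ("MobileNetSSD_deploy.prototxt", "MobileNetSSD_deploy.caffemodel")

def missing_model_files(model_dir=".", required=REQUIRED_FILES):
    """Return the required file names that are absent or empty in model_dir."""
    base = Path(model_dir)
    missing = []
    for name in required:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

missing = missing_model_files()
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found.")
```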
3. Load SSD Model in OpenCV
import cv2
import numpy as np
# Load the SSD model
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt", "MobileNetSSD_deploy.caffemodel")
# Class labels for the PASCAL VOC dataset (index 0 is the background class)
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant",
           "sheep", "sofa", "train", "tvmonitor"]
# Assign random colors to each class
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))
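Random colors change on every run and can occasionally come out nearly identical for two classes. As an alternative (not what the code above does), the sketch below builds a fixed palette by spacing hues evenly around the color wheel. The function name distinct_bgr_colors is an assumption for this example; it returns plain BGR tuples, which is the channel order OpenCV drawing functions expect.

```python
import colorsys

def distinct_bgr_colors(count):
    """Return `count` BGR tuples (0-255 integers) with evenly spaced hues."""
    colors = []
    for i in range(count):
        # Same saturation and brightness for every class; only the hue changes.
        r, g, b = colorsys.hsv_to_rgb(i / count, 0.9, 1.0)
        colors.append((int(b * 255), int(g * 255), int(r * 255)))
    return colors

# 21 entries: one per class label, including "background".
palette = distinct_bgr_colors(21)
print(palette[:3])
```

A palette entry such as palette[idx] could be passed directly as the color argument of cv2.rectangle and cv2.putText.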
4. Perform Object Detection on an Image
def detect_objects(image_path):
    # Load the image; cv2.imread returns None if the file cannot be read
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    (h, w) = image.shape[:2]
    # Convert the image into a 300x300 blob, with the mean given per channel
    blob = cv2.dnn.blobFromImage(image, scalefactor=0.007843, size=(300, 300),
                                 mean=(127.5, 127.5, 127.5))
    # Set the input to the SSD model
    net.setInput(blob)
    # Perform forward pass to get detections, shape (1, 1, N, 7)
    detections = net.forward()
    # Process the detections
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]  # Confidence score
        # Filter out weak detections
        if confidence > 0.5:
            idx = int(detections[0, 0, i, 1])  # Class index
            # Box corners are normalized, so scale them to pixel coordinates
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int").tolist()
            color = COLORS[idx].tolist()
            # Draw the bounding box and label (label moves inside the box near the top edge)
            label = f"{CLASSES[idx]}: {confidence * 100:.2f}%"
            labelY = startY - 10 if startY > 20 else startY + 20
            cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)
            cv2.putText(image, label, (startX, labelY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    # Display the output
    cv2.imshow("SSD Object Detection", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Test the function
detect_objects("test_image.jpg")
Explanation:
- Reads an image.
- Converts it into a blob for the SSD model.
- Runs the model to detect objects.
- Filters results based on confidence score.
- Draws bounding boxes with class labels.
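The blob step also normalizes pixel values: blobFromImage subtracts the mean and then multiplies by the scale factor. The short sketch below reproduces that arithmetic for single values to show that 0.007843 (about 1/127.5) with a mean of 127.5 maps the 0-255 range to roughly -1 to 1. The helper normalize_pixel is hypothetical and exists only for this illustration.

```python
SCALE_FACTOR = 0.007843  # approximately 1 / 127.5
MEAN = 127.5

def normalize_pixel(value, scale=SCALE_FACTOR, mean=MEAN):
    """Apply the blob preprocessing to one pixel value: (value - mean) * scale."""
    return (value - mean) * scale

for value in (0, 127.5, 255):
    print(value, "->", round(normalize_pixel(value), 4))
```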
5. Perform Real-time Object Detection on a Webcam
def real_time_detection():
    cap = cv2.VideoCapture(0)  # Open the default webcam
    if not cap.isOpened():
        raise RuntimeError("Could not open webcam")
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        (h, w) = frame.shape[:2]
        # Same preprocessing as for still images, with the mean given per channel
        blob = cv2.dnn.blobFromImage(frame, scalefactor=0.007843, size=(300, 300),
                                     mean=(127.5, 127.5, 127.5))
        net.setInput(blob)
        detections = net.forward()
        for i in range(detections.shape[2]):
            confidence = detections[0, 0, i, 2]
            if confidence > 0.5:
                idx = int(detections[0, 0, i, 1])
                box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
                (startX, startY, endX, endY) = box.astype("int").tolist()
                color = COLORS[idx].tolist()
                label = f"{CLASSES[idx]}: {confidence * 100:.2f}%"
                labelY = startY - 10 if startY > 20 else startY + 20
                cv2.rectangle(frame, (startX, startY), (endX, endY), color, 2)
                cv2.putText(frame, label, (startX, labelY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
        cv2.imshow("SSD Object Detection - Webcam", frame)
        # Stop when the user presses q
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    # Release the camera and close the window
    cap.release()
    cv2.destroyAllWindows()

# Run real-time detection
real_time_detection()
Explanation:
- Opens a webcam and captures frames.
- Processes frames in real-time using SSD.
- Displays detected objects with bounding boxes.
- Exits when you press q.
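The image and webcam functions repeat the same loop over the network output. One possible way to factor it out is sketched below: a helper that turns raw output rows into pixel-space boxes and clamps them to the frame. The name parse_detections and the clamping behavior are assumptions for this example; the row layout [image_id, class_id, confidence, x1, y1, x2, y2] matches the indices used in the code above.

```python
def parse_detections(rows, width, height, threshold=0.5):
    """Convert raw SSD output rows into (class_id, confidence, box) tuples.

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2], with the
    box corners normalized to the 0..1 range.
    """
    results = []
    for row in rows:
        confidence = float(row[2])
        if confidence <= threshold:
            continue  # skip weak detections
        # Scale to pixels, then clamp so boxes never leave the image.
        x1 = min(max(int(row[3] * width), 0), width - 1)
        y1 = min(max(int(row[4] * height), 0), height - 1)
        x2 = min(max(int(row[5] * width), 0), width - 1)
        y2 = min(max(int(row[6] * height), 0), height - 1)
        results.append((int(row[1]), confidence, (x1, y1, x2, y2)))
    return results

# Synthetic rows standing in for detections[0, 0] from net.forward().
sample_rows = [
    [0, 15, 0.92, 0.125, 0.25, 0.5, 0.75],  # confident, fully inside the image
    [0, 7, 0.30, 0.0, 0.0, 0.25, 0.25],     # below the threshold, dropped
    [0, 12, 0.75, -0.25, 0.5, 1.25, 0.75],  # spills past the edges, clamped
]
parsed = parse_detections(sample_rows, width=640, height=480)
for class_id, confidence, box in parsed:
    print(class_id, f"{confidence:.2f}", box)
```

With a real network, the call would be parse_detections(detections[0, 0], w, h), leaving only the drawing code inside each function.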
Performance Optimization Tips
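Before applying any of the tips below, it is worth measuring the current frame rate so that changes can be compared. A minimal sketch of an average-FPS counter follows; FpsMeter is a hypothetical helper, and the time.sleep call merely stands in for the per-frame work (net.forward() and drawing).

```python
import time

class FpsMeter:
    """Tracks the average frames per second since the first tick."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = None
        self._frames = 0

    def tick(self):
        """Call once per processed frame; returns the average FPS so far."""
        now = self._clock()
        if self._start is None:
            # The first call only starts the timer.
            self._start = now
            return 0.0
        self._frames += 1
        elapsed = now - self._start
        return self._frames / elapsed if elapsed > 0 else 0.0

meter = FpsMeter()
fps = 0.0
for _ in range(3):
    time.sleep(0.01)  # stand-in for detection and drawing on one frame
    fps = meter.tick()
print(f"average FPS: {fps:.1f}")
```

In the webcam loop, meter.tick() would be called once per iteration and the result drawn on the frame or printed.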
Use GPU for Acceleration
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
- This enables GPU acceleration. It requires an OpenCV build with CUDA support; the standard pip packages are CPU-only.
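Since not every machine has a CUDA-enabled build, one cautious approach is to request the CUDA backend only when OpenCV reports a usable device and to fall back to the CPU otherwise. The sketch below assumes that cv2.cuda.getCudaEnabledDeviceCount() returning a positive number is a good enough signal; a build can still lack the CUDA DNN backend, so treat this as a starting point. The helper names are hypothetical.

```python
def pick_backend_and_target(cuda_device_count):
    """Return the cv2.dnn constant names for the backend and target to request."""
    if cuda_device_count > 0:
        return "DNN_BACKEND_CUDA", "DNN_TARGET_CUDA"
    return "DNN_BACKEND_OPENCV", "DNN_TARGET_CPU"

def configure_network(net):
    """Apply the GPU settings when a CUDA device is visible, else use the CPU."""
    import cv2  # imported lazily so pick_backend_and_target works without OpenCV

    # Returns 0 when OpenCV was built without CUDA support.
    device_count = cv2.cuda.getCudaEnabledDeviceCount()
    backend, target = pick_backend_and_target(device_count)
    net.setPreferableBackend(getattr(cv2.dnn, backend))
    net.setPreferableTarget(getattr(cv2.dnn, target))
    return backend, target
```

Calling configure_network(net) right after loading the model would then print-friendly report which pair was selected through its return value.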
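Resize Input Image Before Detection
- The network input is 300×300, and blobFromImage already resizes each image to that size. For high-resolution images or video, downscaling frames beforehand can reduce the time spent on resizing, drawing, and display.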
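For high-resolution frames, one way to act on this tip is to shrink each frame before it is processed and displayed, keeping the aspect ratio so that boxes are not distorted. The sketch below only computes the target size; downscaled_size and the 640-pixel limit are assumptions chosen for illustration.

```python
def downscaled_size(width, height, max_side=640):
    """Return (width, height) shrunk so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))

print(downscaled_size(1920, 1080))
print(downscaled_size(320, 240))
```

The result can be passed to cv2.resize(frame, downscaled_size(w, h)), which takes the target size as (width, height).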
Use a Faster Model
- SSD MobileNet is already fast; if you need higher throughput, compare it against alternatives such as YOLO or a TensorFlow Lite SSD model.
Conclusion
- SSD (Single Shot MultiBox Detector) offers a good balance between speed and accuracy.
- OpenCV's DNN module makes it easy to load and run SSD models.
- It can be used for object detection in both still images and real-time video.
- It runs well on a CPU, and GPU acceleration improves performance further.