YOLO V3 Video Stream Object Detection
To perform YOLOv3 object detection on a video stream, you can use OpenCV's `dnn` module together with NumPy. Here's an example of how to achieve YOLOv3 video stream object detection using Python and OpenCV:
Install the necessary libraries:
- OpenCV: `pip install opencv-python`
- NumPy: `pip install numpy`
Download the YOLOv3 model files:
- YOLOv3 weights: https://pjreddie.com/media/files/yolov3.weights
- YOLOv3 configuration: https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg
- COCO class names: https://github.com/pjreddie/darknet/blob/master/data/coco.names
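If you'd rather fetch these files from a script, here is a minimal Python sketch. Note one assumption: the `.cfg` and `.names` links above point at GitHub HTML pages, so the `raw.githubusercontent.com` equivalents are used here instead; the helper names (`missing_files`, `download_missing`) are illustrative, not part of any library.

```python
import urllib.request
from pathlib import Path

# Model files needed by the detection script. The raw.githubusercontent.com
# URLs are an assumption: the GitHub links above serve HTML pages, not raw files.
MODEL_FILES = {
    "yolov3.weights": "https://pjreddie.com/media/files/yolov3.weights",
    "yolov3.cfg": "https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg",
    "coco.names": "https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names",
}

def missing_files(files=MODEL_FILES):
    """Return the names of model files not present in the current directory."""
    return [name for name in files if not Path(name).exists()]

def download_missing(files=MODEL_FILES):
    """Fetch any absent model files (the weights file alone is over 200 MB)."""
    for name in missing_files(files):
        urllib.request.urlretrieve(files[name], name)
```

Calling `download_missing()` once before the first run saves the manual download step.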
Create a Python script (e.g., `yolov3_video_stream.py`) and paste the following code into it:
```python
import cv2
import numpy as np

# Load the YOLOv3 network
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")

# Load class labels
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

# Generate a random color for each class
colors = np.random.uniform(0, 255, size=(len(classes), 3))

# Get output layer names (flatten() handles both older and newer OpenCV
# versions, which return differently shaped index arrays)
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1]
                 for i in np.array(net.getUnconnectedOutLayers()).flatten()]

# Open video stream
cap = cv2.VideoCapture(0)  # Change 0 to the path of your video file if not using a webcam

while True:
    # Read frame from video stream
    ret, frame = cap.read()
    if not ret:  # End of stream or camera error
        break

    # Perform object detection (0.00392 ≈ 1/255 scales pixels to [0, 1])
    height, width, channels = frame.shape
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Extract bounding box coordinates, confidences, and class labels
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates (top-left corner)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # Apply non-maximum suppression to remove redundant overlapping boxes
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    # Draw bounding boxes and labels on the frame
    for i in np.array(indexes).flatten():
        x, y, w, h = boxes[i]
        label = classes[class_ids[i]]
        confidence = confidences[i]
        color = colors[class_ids[i]]
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, f"{label} {confidence:.2f}", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    # Display the resulting frame
    cv2.imshow("YOLOv3 Object Detection", frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
```
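The detection loop above converts each YOLO output row from a normalized center-format box into a pixel-space corner-format box, keeping only detections above the confidence threshold. That step can be factored into a small standalone helper and unit-tested without a camera (`detection_to_box` is a hypothetical name, not part of OpenCV; the 0.5 threshold mirrors the script):

```python
def detection_to_box(detection, width, height, conf_threshold=0.5):
    """Convert one YOLO detection row (cx, cy, w, h, objectness, *class scores),
    all normalized to [0, 1], into a pixel-space [x, y, w, h] box plus its best
    class id and score; return None if the score is at or below the threshold."""
    scores = detection[5:]
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]
    if confidence <= conf_threshold:
        return None
    w = int(detection[2] * width)   # box width in pixels
    h = int(detection[3] * height)  # box height in pixels
    x = int(detection[0] * width - w / 2)   # top-left x from center x
    y = int(detection[1] * height - h / 2)  # top-left y from center y
    return [x, y, w, h], class_id, confidence
```

For example, a detection centered in a 640x480 frame with normalized size (0.2, 0.4) maps to a 128x192-pixel box with its top-left corner at (256, 144).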
Make sure the script, the YOLOv3 weights file (`yolov3.weights`), the YOLOv3 configuration file (`yolov3.cfg`), and the COCO class names file (`coco.names`) are all in the same directory.
Run the script. It will open a video stream window showing real-time object detection results using YOLOv3.
Note: This script assumes you're using a webcam as the video source. If you want to use a video file instead, change the `cap = cv2.VideoCapture(0)` line to specify the path of your video file (e.g., `cap = cv2.VideoCapture("path/to/video.mp4")`).
Make sure you have a compatible version of OpenCV installed, as the installation steps may vary depending on your system setup.
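For reference, the non-maximum suppression performed by `cv2.dnn.NMSBoxes` keeps only the highest-confidence box among heavily overlapping candidates. A pure-Python sketch of the greedy algorithm (an illustration of the idea, not OpenCV's exact implementation; `iou` and `nms` are hypothetical names):

```python
def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))  # overlap width
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, confidences, iou_threshold=0.4):
    """Greedy NMS: return indices of kept boxes, highest confidence first.
    A box is kept only if it overlaps no already-kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: confidences[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

With two nearly identical boxes and one far-away box, only the higher-confidence duplicate and the distant box survive, which is exactly the effect the script relies on to avoid drawing stacked rectangles.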