Learn how to build a real-time object detection system in Python using popular libraries like OpenCV and Ultralytics YOLO. Try the code yourself in Google Colab!
Hello! I'm **Madhusai**, and I led this real-time object detection project. This initiative was developed as part of my Project-Based Internship in Artificial Intelligence with Eduexpose. My passion for AI, Cloud, and Software Development drove this project, which focused on the practical application of AI and Machine Learning to build robust real-world systems. You can learn more about my professional journey on my LinkedIn profile.
My internship at Eduexpose ran from **June 1, 2025, to July 31, 2025**. It involved a structured work schedule and tasks designed to enhance skills in AI, reflecting the core role within the Artificial Intelligence project. The experience provided valuable insights into project development, execution, and adherence to professional standards, including maintaining confidentiality of proprietary information.
Real-time object detection involves identifying and localizing objects within video streams as they happen, with minimal latency. This technology is at the heart of many modern applications, including autonomous driving, video surveillance, traffic monitoring, and robotics.
This project specifically leverages **YOLOv11** for its balance of speed and accuracy, and **OpenCV** for camera interaction and display.
Follow these steps to set up and run the real-time object detection project using your laptop's webcam in Visual Studio Code.
It's crucial to use a virtual environment to manage project dependencies. Open VS Code, navigate to your desired project folder, and open a new terminal (Ctrl+Shift+`).
Create and activate a virtual environment:
```
# Create virtual environment
python -m venv venv

# Activate on Windows (PowerShell)
.\venv\Scripts\Activate.ps1

# Activate on Windows (Command Prompt)
venv\Scripts\activate.bat

# Activate on macOS/Linux
source venv/bin/activate
```
Once activated (you'll see (venv) in your terminal prompt), install the required libraries:
```
# Install PyTorch (CPU version - suitable for most laptops)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Install OpenCV and other utilities
pip install opencv-python numpy matplotlib seaborn

# Install Ultralytics (includes YOLO11 support)
pip install ultralytics
```
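Before moving on, it can help to confirm the packages imported correctly. A minimal check using only the standard library's `importlib` (note that some import names differ from the pip package names, e.g. `opencv-python` is imported as `cv2`; `check_packages` is just an illustrative helper, not part of any of these libraries):

```python
from importlib.util import find_spec

def check_packages(names):
    """Return a dict mapping each import name to True if it is installed."""
    return {name: find_spec(name) is not None for name in names}

# Import names, not pip package names (opencv-python -> cv2)
status = check_packages(["torch", "cv2", "numpy", "ultralytics"])
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If anything prints `MISSING`, re-run the corresponding `pip install` command with the virtual environment active.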
Note: If you have an NVIDIA GPU, visit PyTorch's website for the CUDA-enabled installation command for faster inference.
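To make the same script run on both CPU-only and GPU machines, you can select the device at runtime. A small sketch (`pick_device` is a hypothetical helper; it falls back to CPU when PyTorch or CUDA is unavailable):

```python
def pick_device():
    """Return 'cuda' if PyTorch sees a usable GPU, otherwise 'cpu'."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass  # PyTorch not installed; fall back to CPU
    return "cpu"

device = pick_device()
print(f"Running inference on: {device}")
# Later, pass it to Ultralytics, e.g.:
# model.predict(source=frame, device=device)
```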
3.1 In VS Code, create a new file named yolo_realtime_webcam.py and paste the following code for real-time object detection using your webcam:
```
import cv2
from ultralytics import YOLO

# Print OpenCV version
print(f"OpenCV version: {cv2.__version__}")

# Load a YOLO11 model (choose model size: n, s, m, l, x)
model = YOLO("yolo11s.pt")  # Alternatives: 'yolo11n.pt', 'yolo11m.pt', etc.

# Open the laptop webcam
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    print("Error: Could not access the webcam.")
    exit()

print("Webcam opened successfully. Press 'q' to quit the detection window.")

try:
    while True:
        ret, frame = cap.read()
        if not ret:
            print("Failed to grab frame.")
            break

        # Run detection; YOLO accepts BGR frames (OpenCV's default)
        results = model.predict(source=frame, show=False, stream=False)

        # Overlay boxes; results[0].plot() returns a BGR image
        annotated_frame = results[0].plot()

        # Display result
        cv2.imshow("YOLO11 Real-time Detection", annotated_frame)

        # Exit on 'q'
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
    print("Webcam released and windows closed.")
```
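The exit check `cv2.waitKey(1) & 0xFF == ord('q')` deserves a brief note: on some platforms `waitKey` returns a value wider than one byte, so the result is masked down to its low 8 bits before comparing against the ASCII code of `q`. The same logic in plain Python (`is_quit_key` is an illustrative helper, not an OpenCV function):

```python
def is_quit_key(raw_key, quit_char="q"):
    """Mirror OpenCV's common exit check: compare the low byte of the key code."""
    return (raw_key & 0xFF) == ord(quit_char)

print(ord("q"))                   # 113, the ASCII code for 'q'
print(is_quit_key(113))           # True: a plain 'q' press
print(is_quit_key(0x100 | 113))   # True: high bits are masked away
print(is_quit_key(ord("a")))      # False: a different key
```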
3.2 In VS Code, create a new file named traffic_analyzer.py and paste the following code for real-time object detection on a video file:

```
import cv2
from ultralytics import YOLO

def traffic_analysis_yolo_bytetrack(video_path, output_path="output_traffic.mp4"):
    model = YOLO("yolo11n.pt")

    # Open the video file
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Error: Could not open video file {video_path}")
        return

    # Get video properties
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = int(cap.get(cv2.CAP_PROP_FPS))

    # Define the codec and create a VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # You can also use 'XVID' or 'MJPG'
    out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

    print("Starting vehicle detection and tracking...")
    while True:
        ret, frame = cap.read()
        if not ret:
            print("End of video or error reading frame.")
            break

        # Track objects across frames with ByteTrack
        results = model.track(frame, persist=True, tracker="bytetrack.yaml",
                              conf=0.3, iou=0.5, show=False)

        # Annotate the frame with bounding boxes and track IDs (if any)
        annotated_frame = results[0].plot()

        # Display and save every frame, even ones with no tracked objects
        cv2.imshow("Traffic Analysis", annotated_frame)
        out.write(annotated_frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Release resources
    cap.release()
    out.release()
    cv2.destroyAllWindows()
    print(f"Traffic analysis completed. Output saved to {output_path}")

if __name__ == "__main__":
    # Replace with your video file path
    input_video = "road_traffic.mp4"
    traffic_analysis_yolo_bytetrack(input_video)
```
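When reviewing the saved traffic output, it is often handy to map a frame index back to a timestamp in the source video using the `fps` value read above. A stdlib-only sketch (`frame_to_timestamp` is a hypothetical helper, not part of OpenCV):

```python
def frame_to_timestamp(frame_idx, fps):
    """Convert a zero-based frame index to an 'MM:SS.mmm' timestamp string."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    total_seconds = frame_idx / fps
    minutes = int(total_seconds // 60)
    seconds = total_seconds - minutes * 60
    return f"{minutes:02d}:{seconds:06.3f}"

# Example: frame 900 of a 30 fps video is 30 seconds in
print(frame_to_timestamp(900, 30))   # 00:30.000
print(frame_to_timestamp(4500, 30))  # 02:30.000
```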
4.1 Save your yolo_realtime_webcam.py file. With your virtual environment still active in the VS Code terminal, run:
python yolo_realtime_webcam.py
4.2 Save your traffic_analyzer.py file. With your virtual environment still active in the VS Code terminal, run:
python traffic_analyzer.py
A new window will appear showing your webcam feed (or, for the traffic analyzer, the video file) with real-time object detections. To exit, click on the detection window and press the q key.
Click the badge below to open and run the object detection code in Google Colab!
(Replace your-notebook-url with your Colab notebook's shareable link if you create a new one.)
Here's the typical setup and inference code you'd run in a Google Colab notebook cell to perform object detection on a single image with YOLO11:
```
# Step 1: Install the Ultralytics package (YOLO11 requires a recent ultralytics release)
!pip install ultralytics --upgrade --quiet

# Step 2: Import the library
from ultralytics import YOLO

# Step 3: Load a YOLO11 pretrained model (sizes: 'yolo11n.pt', 'yolo11s.pt', 'yolo11m.pt', 'yolo11l.pt', 'yolo11x.pt')
model = YOLO('yolo11s.pt')

# Step 4: Run inference on an image (returns a list of Results objects)
results = model('path/to/image.jpg')

# Step 5: Work with the first (and only) result
results[0].show()   # Visualize detection output
results[0].save()   # Save the annotated image
print(results[0])   # Print result details to the console
```
Note: In Colab, the show() call typically displays the image directly within the notebook output. Webcam access in Colab, by contrast, requires extra plumbing (the cv2_imshow patch plus JavaScript integration) and is more involved than a simple local Python script.
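Because cv2.imshow does not work inside Colab, Colab ships a replacement, cv2_imshow, in google.colab.patches. A guarded import keeps the same code importable both inside and outside Colab (a sketch; outside Colab the helper is simply absent and you use cv2.imshow instead):

```python
try:
    # Colab-only display helper that renders frames inline in the notebook
    from google.colab.patches import cv2_imshow
    IN_COLAB = True
except ImportError:
    cv2_imshow = None
    IN_COLAB = False

print(f"Running inside Colab: {IN_COLAB}")
# In Colab you would call:        cv2_imshow(annotated_frame)
# Locally, fall back to:          cv2.imshow("window", annotated_frame)
```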