This article shows how to use YOLOv8 for object detection with a web camera.
Follow these steps.
Part 1 : Installation
1. install Python https://www.python.org/downloads/
2. install Anaconda https://conda.io/projects/conda/en/latest/user-guide/install/index.html
3. create & activate a virtual environment https://docs.ultralytics.com/guides/conda-quickstart/#prerequisites
conda create --name ultralytics-env python=3.8 -y
conda activate ultralytics-env
4. install Ultralytics https://docs.ultralytics.com/guides/conda-quickstart/#setting-up-a-conda-environment
conda install -c conda-forge ultralytics
Additionally,
a) if you have a problem with torch, run this → https://pytorch.org/get-started/locally/#windows-python
pip3 install torch torchvision torchaudio
b) if you have a problem with the ultralytics version, run this → Issue #2573
pip3 install --upgrade ultralytics
Part 2 : Download a model
We will use an existing model → YOLOv8n. You can download it from https://docs.ultralytics.com/models/yolov8/#supported-modes and save it on a local drive.
Part 3: Create a project
1. open the Anaconda Prompt (with ultralytics-env activated) — you can find it in the Start menu — then create and enter a folder “yolov8_webcam”
mkdir yolov8_webcam
cd yolov8_webcam
2. download the file yolov8n.pt to this folder
3. open VS Code
code .
Workshop 1 : detect everything from an image
- put an image in the folder “yolov8_webcam”
- coding
from ultralytics import YOLO

# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Run batched inference on a list of images
results = model(['image1.jpg', 'image2.jpg'], stream=True)  # returns a generator of Results objects

# Process results generator
for result in results:
    boxes = result.boxes          # Boxes object for bbox outputs
    masks = result.masks          # Masks object for segmentation mask outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs          # Probs object for classification outputs
Workshop 2 : detect everything from YouTube
- at Anaconda prompt (with ultralytics-env)
- use this command
yolo predict model=yolov8n.pt source='https://youtu.be/LNwODJXcvt4' imgsz=320
Workshop 3 : test a web camera
import cv2

# open the default web camera (device index 0)
cap = cv2.VideoCapture(0)
cap.set(3, 640)  # frame width
cap.set(4, 480)  # frame height

while True:
    ret, img = cap.read()
    if not ret:
        break
    cv2.imshow('Webcam', img)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Workshop 4 : detect everything from a web camera
- connect a web camera
- run this command
yolo predict model=yolov8n.pt source=0 imgsz=640
Workshop 5 : detect everything from a web camera + add annotations
# source: https://dipankarmedh1.medium.com/real-time-object-detection-with-yolo-and-webcam-enhancing-your-computer-vision-skills-861b97c78993
from ultralytics import YOLO
import cv2
import math

# start webcam
cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)

# model
model = YOLO("yolov8n.pt")

# object classes (COCO)
classNames = ["person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat",
              "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
              "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella",
              "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat",
              "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
              "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli",
              "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa", "pottedplant", "bed",
              "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone",
              "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors",
              "teddy bear", "hair drier", "toothbrush"
              ]

while True:
    success, img = cap.read()
    if not success:
        break
    results = model(img, stream=True)

    # coordinates
    for r in results:
        boxes = r.boxes
        for box in boxes:
            # bounding box
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)  # convert to int values

            # put box in cam
            cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)

            # confidence, rounded up to 2 decimal places
            confidence = math.ceil((box.conf[0] * 100)) / 100
            print("Confidence --->", confidence)

            # class name
            cls = int(box.cls[0])
            print("Class name -->", classNames[cls])

            # object details
            org = (x1, y1)
            font = cv2.FONT_HERSHEY_SIMPLEX
            fontScale = 1
            color = (255, 0, 0)
            thickness = 2
            cv2.putText(img, classNames[cls], org, font, fontScale, color, thickness)

    cv2.imshow('Webcam', img)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()