TABLE OF CONTENTS
2. About YOLO
3. Step-by-Step Guide on Custom Object Detection Model
5. About CloudThat
Object detection is a computer vision technique that localizes objects in an image and classifies them. Just as humans see what an object is and where it is, we want computers to understand what is in an image and where it is located.
As Fig 1 shows, object detection comprises classification and localization.
Computer Vision, a field of artificial intelligence that uses machine learning and deep learning, lets computers observe, recognize, and analyze objects in images and videos much as people do. The use of Computer Vision for automated AI visual inspection, remote monitoring, and automation is quickly gaining prominence.
In this blog, you will learn how to train a custom object detector and run detection with it using You Only Look Once (YOLO) v3. By the end, you should be able to implement your own custom object detection. I used Google Colab for training. For the demo, I used face mask detection, since it is a two-class problem (With Mask or Without Mask). I have also listed the requirements to get started.
YOLO (You Only Look Once) was written by Joseph Redmon using a framework called Darknet. It is an open-source, state-of-the-art algorithm for real-time object detection. There are multiple versions of YOLO (V2, V3, V4, and V5). We will use YOLO v3 because it is easy to train.
The initial version presented the overall architecture, the second iteration improved the design and used pre-defined anchor boxes to boost the bounding box proposal, and the third iteration further improved the model architecture and training procedure.
Step-by-Step Guide on Custom Object Detection Model
Here we will build a face mask detector using YOLO v3.
Step 0: Custom Dataset Creation and Labelling
You have to collect the data for custom training. After preparing the dataset, use the LabelImg tool to draw bounding boxes and assign class labels to the images.
pip install labelImg
For more reference: https://github.com/tzutalin/labelImg
- Create a new folder “Train” and, inside it, a “class.txt” file
- In the class.txt file, list the class labels
Example: 0 Mask Not Detected
1 Mask Detected
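Create an obj.data file with the content below and adjust it to your dataset:
classes = 2
train = data/train.txt
valid = data/train.txt
names = data/obj.names
backup = backup/
- classes: the number of classes (here 2: person with mask and person without mask)
- train: path to the file listing the training images
- valid: path to the file listing the test (validation) images
- names: path to the file with the class names
- backup: folder where the trained weights are saved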
- Create another folder “Images” under the “Train” Folder
- Move all images (of different classes) to the “Images” folder
Now, using the LabelImg tool, draw bounding boxes for the images in the dataset.
Make sure you save each annotation in YOLO format, in the same folder as the images (“/train/images”).
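To make the YOLO format concrete: each line of a label file holds a class index followed by the box centre and size, all normalised by the image width and height. The sketch below shows that conversion from a pixel box. The function name `to_yolo_line` and the sample numbers are illustrative assumptions; LabelImg does this conversion for you when you save in YOLO format.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (xmin, ymin, xmax, ymax) into one YOLO label line.

    Hypothetical helper for illustration only.
    """
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    # Centre of the box, normalised to the range 0..1
    x_center = (xmin + xmax) / 2 / width
    y_center = (ymin + ymax) / 2 / height
    # Box width and height, normalised to the range 0..1
    box_w = (xmax - xmin) / width
    box_h = (ymax - ymin) / height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"


# A 200x200 pixel box in a 400x400 image, labelled with class index 1
line = to_yolo_line(1, (100, 50, 300, 250), (400, 400))
print(line)
```

Each image gets one .txt file with the same base name, containing one such line per labelled object.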
Upload the folder to Google Drive or your GitHub account as a zip file.
Congrats, one big step has been completed.
Step 1: Cloning the Darknet repository for YOLO architecture
Here, we clone the Darknet repository, which provides the YOLOv3 architecture used for detection.
!git clone https://github.com/AlexeyAB/darknet.git
Step 2: Configuring the Makefile
Here, we make some changes to the Makefile so that the build uses the GPU and OpenCV.
2.1 Change the directory to Darknet Folder
2.2 Make sure a GPU is available
2.3 Change GPU and OPENCV from 0 to 1
- ‘1’ means the option is enabled
- !sed: the stream editor, used to edit the Makefile in place
- !cat: short for concatenate; it prints the contents of a file
- !cat Makefile
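The edit in 2.3 amounts to flipping two flags in the Makefile text. As a minimal sketch, assuming the Makefile contains lines of the form `GPU=0` and `OPENCV=0`, the same change can be expressed in Python. The helper name `enable_flags` and the sample text are illustrative, not part of Darknet.

```python
import re


def enable_flags(makefile_text, flags=("GPU", "OPENCV")):
    """Return the Makefile text with each `FLAG=0` line switched to `FLAG=1`.

    Hypothetical helper; assumes each flag sits at the start of its own line.
    """
    for flag in flags:
        # ^ with re.MULTILINE anchors at a line start, so only whole
        # `FLAG=0` assignments are rewritten.
        makefile_text = re.sub(
            rf"^{flag}=0\b", f"{flag}=1", makefile_text, flags=re.MULTILINE
        )
    return makefile_text


# Small stand-in for the top of the Makefile
sample = "GPU=0\nCUDNN=0\nOPENCV=0\n"
patched = enable_flags(sample)
print(patched)
```

To apply this to the real file, read it with `pathlib.Path("Makefile").read_text()`, pass the text through the function, and write the result back.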
Step 3: Download the pre-trained weights
- We download pre-trained weights so that we can initialize the model with them and then train it on our dataset.
- !wget https://pjreddie.com/media/files/darknet53.conv.74
- Download the weights that match your cfg file. For YOLOv3, I used yolov3.cfg with the darknet53.conv.74 weights. To use YOLOv4, you can refer to the AlexeyAB GitHub page (https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects)
Step 4: Upload the dataset you stored on GitHub or Google Drive
Make sure you have the files listed below:
- dataset ( images )
Unzip the data files we zipped earlier
!unzip data/custom.zip -d data/ # adjust the path
Along with the files above, you also need a train.txt file that lists the path of every training image. A corresponding file for validation is optional.
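One way to produce train.txt is to list the image folder and write one path per line. The sketch below assumes the images are .jpg, .jpeg, or .png files; the function name `write_image_list` and the folder path in the usage lines are assumptions to adapt to wherever your data was unzipped.

```python
from pathlib import Path

# Assumed image extensions; extend this set if your dataset uses others
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_path):
    """Write one image path per line to list_path and return the paths."""
    paths = sorted(
        str(p)
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("".join(p + "\n" for p in paths))
    return paths


# Hypothetical locations; only runs if the folder actually exists
if Path("data/Train/Images").is_dir():
    images = write_image_list("data/Train/Images", "data/train.txt")
    print(f"Listed {len(images)} images")
```

The label .txt files sit next to the images and are skipped by the extension filter, so only image paths end up in the list.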
Step 5: Configuring the YOLO cfg file
We are now going to make some changes to the yolov3.cfg file in the darknet/cfg folder:
- random: 0 or 1; the commands below switch it from 1 to 0
- max_batches = number of classes × 2000
- filters = (classes + 5) × 3
- batch should be 32 and subdivisions 8
- Set the network size to width=416 height=416, or any other multiple of 32
- Change the line classes=80 to your number of classes (e.g., 2)
- To configure the cfg file, run the commands below
!sed -i 's/batch=1/batch=32/g' cfg/yolov3.cfg
!sed -i 's/subdivisions=1/subdivisions=8/g' cfg/yolov3.cfg
!sed -i 's/random=1/random=0/g' cfg/yolov3.cfg
!sed -i 's/max_batches = 500200/max_batches = 4000/g' cfg/yolov3.cfg
!sed -i 's/steps=400000,450000/steps=3200,3600/g' cfg/yolov3.cfg
!sed -i 's/classes=80/classes=2/g' cfg/yolov3.cfg
!sed -i 's/filters=255/filters=21/g' cfg/yolov3.cfg
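The numbers in these commands follow from the number of classes. The sketch below recomputes them so you can adapt the commands to a different dataset. The function name `yolo_cfg_values` is illustrative, and the rule that steps are 80% and 90% of max_batches is inferred from the values 3200 and 3600 used above for max_batches = 4000.

```python
def yolo_cfg_values(num_classes):
    """Derive class-dependent yolov3.cfg values (hypothetical helper)."""
    max_batches = num_classes * 2000
    # Integer arithmetic keeps the steps exact: 80% and 90% of max_batches
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # Filters of the convolutional layer feeding each [yolo] layer
    filters = (num_classes + 5) * 3
    return {
        "classes": num_classes,
        "max_batches": max_batches,
        "steps": steps,
        "filters": filters,
    }


values = yolo_cfg_values(2)
print(values)
```

For the two mask classes this gives max_batches = 4000, steps = 3200,3600 and filters = 21, matching the commands above.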
Step 6: Train and Test model
On Linux, use the command below.
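The training command itself is not shown in this post, so the sketch below only assembles a plausible one from pieces that do appear here: the `detector train <data> <cfg> <weights>` form and the `-map` flag from Step 7, the cfg file from Step 5, the weights downloaded in Step 3, and the `-dont_show` flag from Step 8. The function name and the default paths are assumptions; adjust them to your folder layout.

```python
import shlex


def darknet_train_command(
    data="data/obj.data",
    cfg="cfg/yolov3.cfg",
    weights="darknet53.conv.74",
    show_map=False,
):
    """Build the training command line as a string (hypothetical helper)."""
    args = ["./darknet", "detector", "train", data, cfg, weights, "-dont_show"]
    if show_map:
        # Asks Darknet to report mAP while training
        args.append("-map")
    return shlex.join(args)


command = darknet_train_command()
print(command)
```

In a Colab cell, the printed string would be run with a leading `!`, in the same way as the other commands in this guide.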
Step 7: When should I stop the training?
- During training, the output shows the average loss, the IoU, and the current iteration number
- Keep an eye on the average loss. It should decrease steadily; once it starts to increase instead, stop the training
- Every 100 iterations the latest weights are saved to the darknet/backup folder, and every 1000 iterations a separate weights file is stored there as well
- Now let’s check the accuracy of our weights using the mAP metric
For example, suppose you have 3 different weights files (from the 7000th, 8000th, and 9000th iterations):
darknet.exe detector map data/obj.data yolo-obj.cfg backup\yolo-obj_7000.weights
Repeat the command with 8000 and 9000 in place of 7000.
Choose the weights file with the highest mAP (mean average precision) or IoU (intersection over union).
To have mAP reported during training, add the -map flag to the training command:
darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -map
On Windows, use darknet.exe instead of !./darknet.
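IoU, which appears in the training output and in the weight selection above, measures how well a predicted box overlaps a ground-truth box: the area of their intersection divided by the area of their union. A minimal sketch, assuming boxes are given as (xmin, ymin, xmax, ymax); the function name `iou` and the sample boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1)
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
print(score)
```

A value of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.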
Step 8: Testing with input Images / Videos
!./darknet detector test data/obj.data cfg/yolov3.cfg /content/weights/yolov3_1300.weights /content/darknet/data/image_test01.jpg -dont_show
Video Detection:
!./darknet detector demo data/obj.data cfg/yolov3.cfg /content/weights/yolov3_1300.weights -dont_show videoname -i 0 -out_filename me_06.avi -thresh 0.7
That’s it. Congratulations, you made it.
To carry out custom object detection with YOLO, we created a dataset of people with and without masks and labeled it carefully using the LabelImg tool. We chose YOLO v3 as the architecture for faster detection. Finally, we trained and tested the model successfully in Google Colab.
GitHub reference: https://github.com/Ganesh9100/Mask-Detection-YOLO_V3-
With more Training data and different classes, the model can be used for many Real-Time Applications.
CloudThat is also the official AWS (Amazon Web Services) Advanced Consulting Partner and Training Partner and a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.
Drop a query if you have any questions regarding YOLO or object detection, and I will get back to you quickly.
Q1. What is LabelImg?
A. It is a graphical image annotation tool written in Python.
pip install labelImg
Q2. What is the YOLO cfg file?
A. It is a configuration file that holds parameters such as batch, subdivisions, and decay.
Q3. What is Darknet?
A. It is an open-source neural network framework written in C and CUDA that supports both CPU and GPU computation.