Human Detection from Drone using You Only Look Once (YOLOv5) for Search and Rescue Operation
Drones are unmanned aerial vehicles that can be remotely operated to perform a variety of tasks. They have been used in search and rescue operations since the early 2000s and have proven invaluable for quickly locating missing persons in difficult terrain and environments. In certain case...
Published in: | Journal of Advanced Research in Applied Sciences and Engineering Technology |
---|---|
Main Author: | Zaman F.H.K. |
Format: | Article |
Language: | English |
Published: | Penerbit Akademia Baru, 2023 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163048742&doi=10.37934%2faraset.30.3.222235&partnerID=40&md5=82401b462c9b2e4a997c42ad56425efa |
id |
2-s2.0-85163048742 |
---|---|
author |
Zaman F.H.K.; Tahir N.M.; Yusoff Y.M.; Thamrin N.M.; Hasmi A.H. |
title |
Human Detection from Drone using You Only Look Once (YOLOv5) for Search and Rescue Operation |
publishDate |
2023 |
container_title |
Journal of Advanced Research in Applied Sciences and Engineering Technology |
container_volume |
30 |
container_issue |
3 |
doi_str_mv |
10.37934/araset.30.3.222235 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163048742&doi=10.37934%2faraset.30.3.222235&partnerID=40&md5=82401b462c9b2e4a997c42ad56425efa |
description |
Drones are unmanned aerial vehicles that can be remotely operated to perform a variety of tasks. They have been used in search and rescue operations since the early 2000s and have proven invaluable for quickly locating missing persons in difficult terrain and environments. In certain cases, automated human detection on the drone camera feed can help responders locate victims more effectively. In this work, we propose the use of a deep learning method called You Only Look Once version 5 (YOLOv5). The YOLOv5 model is trained on data collected during a simulated search and rescue operation in which mannequins represented human victims. Video was acquired with a DJI Matrice 300 drone carrying a Zenmuse H20T camera, flown at a height of 40 meters over an area of more than 15,000 m² with varied terrain such as farms, ravines, and a river. The drone flew grid, circular, and zigzag patterns at three camera zoom levels, and the data were captured on different days and at different times. The video, recorded at 1080p@30fps, totals 148 minutes 26 seconds. Five pretrained YOLOv5 models of different complexities were trained and tested on this dataset. The pretrained yolov5l6 model delivered the best precision, recall, and mAP50, at 0.668, 0.303, and 0.346 respectively. The experiment also showed that overall performance improves when using images acquired at 6x zoom, where precision, recall, and mAP50 increase to 0.846, 0.543, and 0.591 respectively. The yolov5l6 model also delivered an acceptable inference time of 43 ms per 1920x1080 image, which corresponds to about 23 fps. © 2023, Penerbit Akademia Baru. All rights reserved. |
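The figures in the description can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not taken from the article: the function names are hypothetical, the frame count assumes every recorded frame is counted (the description does not say how many frames were actually sampled for training), and the F1 values are derived here from the reported precision and recall rather than reported by the authors.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second implied by a per-image inference time in milliseconds."""
    return 1000.0 / latency_ms


def total_frames(minutes: int, seconds: int, fps: int) -> int:
    """Number of frames in a video of the given duration and frame rate."""
    return (minutes * 60 + seconds) * fps


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived metric, not in the record)."""
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # 43 ms per 1920x1080 image, as stated in the description.
    print(f"Throughput: {fps_from_latency_ms(43):.1f} fps")

    # 148 minutes 26 seconds of video at 30 fps.
    print(f"Raw frames recorded: {total_frames(148, 26, 30)}")  # 267180

    # Reported precision/recall for yolov5l6: all zoom levels vs. 6x zoom only.
    print(f"Derived F1 (all zoom levels): {f1_score(0.668, 0.303):.3f}")
    print(f"Derived F1 (6x zoom):         {f1_score(0.846, 0.543):.3f}")
```

The throughput check agrees with the roughly 23 fps quoted in the description, since 1000 ms divided by 43 ms per image is just over 23 images per second.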
publisher |
Penerbit Akademia Baru |
issn |
24621943 |
language |
English |
format |
Article |
accesstype |
All Open Access; Hybrid Gold Open Access |
record_format |
scopus |
collection |
Scopus |