Performance Evaluation of YOLOv5 on Edge Devices for Personal Protective Equipment Detection
Miglionico G. C.; Ducange P.; Marcelloni F.; Vallati C.; Di Rienzo F.
2024-01-01
Abstract
The use of personal protective equipment (PPE) is essential to strengthen the safety of workers in the workplace. Although specific regulations require the use of PPE, workers sometimes neglect to wear it due to carelessness, haste, or comfort. Thus, it is crucial to monitor the appropriate use of PPE, especially in dangerous processes. Computer vision technology can help perform this task automatically, exploiting appropriate deep neural models for recognizing PPE. YOLO (You Only Look Once) deep neural models ensure good accuracy with limited complexity, which allows running them on devices with limited computational capacity. In this paper, we evaluate the performance, in terms of accuracy and processing speed, of the most popular models implemented in version 5 of YOLO (YOLOv5) on the task of recognizing whether workers correctly wear PPE. To this aim, all models are trained on a PPE dataset on a dedicated server. Then, each model is deployed on low-cost hardware, namely a Raspberry Pi 4 Model B equipped with an Intel Neural Compute Stick 2 used as the processing unit. The outcomes show that YOLOv5n, with only 1.9 million parameters, is the fastest model, processing 7.9 frames per second, while YOLOv5l and YOLOv5x, with 46.5 and 86.7 million parameters respectively, are the most accurate but slowest models, processing 1.3 and 0.7 frames per second. We also compare the performance of the YOLOv5 models with that of version 4 of YOLO, showing that the version 5 models generally achieve higher accuracy, especially in the detection of small objects.
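As a minimal illustration of the kind of evaluation described in the abstract, the Python sketch below loads a YOLOv5 model through torch.hub and estimates its throughput on a dummy frame. The "ultralytics/yolov5" repository name and the torch.hub loading API are standard for YOLOv5, but the PPE-trained weights, the OpenVINO export for the Intel Neural Compute Stick 2, and the exact benchmarking procedure of the paper are assumptions not reproduced here; this is a generic sketch, not the authors' code.

```python
import time
import torch

# Sketch only: load a stock YOLOv5 nano model from the Ultralytics torch.hub repo.
# A model fine-tuned on a PPE dataset would instead be loaded with
# torch.hub.load("ultralytics/yolov5", "custom", path="best.pt") (hypothetical path).
model = torch.hub.load("ultralytics/yolov5", "yolov5n", pretrained=True)
model.eval()

# Dummy 640x640 RGB tensor standing in for a camera frame.
frame = torch.rand(1, 3, 640, 640)

# Rough frames-per-second estimate: average latency over repeated forward passes
# on whatever device runs this script (not the Raspberry Pi + NCS2 setup of the paper).
n_runs = 50
with torch.no_grad():
    start = time.time()
    for _ in range(n_runs):
        _ = model(frame)
    elapsed = time.time() - start

print(f"Approximate throughput: {n_runs / elapsed:.1f} FPS")
```

On the target edge hardware, the paper's pipeline would additionally involve exporting the trained network to a format supported by the Neural Compute Stick 2 (e.g. via OpenVINO), a step intentionally left out of this sketch.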