Image Detection Model for Construction Worker Safety Conditions using Faster R-CNN

Many accidents occur on construction sites, leading to injury and death. According to the Occupational Safety and Health Administration (OSHA), falls, electrocutions, being struck by objects, and being caught in or between objects were the four main causes of worker deaths on construction sites. Many factors contribute to the increase in accidents, and personal protective equipment (PPE) is one of the defense mechanisms used to mitigate them. Thus, this paper presents an image detection model of workers’ safety conditions based on PPE compliance, using the Faster Region-based Convolutional Neural Network (R-CNN) algorithm. The experiment was conducted using TensorFlow, involving 1,129 images from the MIT Places Database (from Scene Recognition) as a training dataset and 333 anonymous images from real construction sites for evaluation. The experimental results showed 276 of the images detected as safe, with an average accuracy rate of 70%. The strength of this paper lies in its detection of a combination of three PPE items worn by construction workers: hardhats, vests and boots. In future, the threshold and image sharpness (low resolution) will be the two main targets of further refinement to improve the accuracy rate.

Keywords—PPE; OSH; accident; construction site; image detection; faster R-CNN


I. INTRODUCTION
The construction industry is one of the most dangerous fields in which to work. The Occupational Safety and Health Administration (OSHA) has outlined safety measures and precautions in the form of a legislative framework for the construction industry. Based on works by [1][2][3][4], any worker, especially in the construction industry, is exposed and vulnerable to accidents that could lead to non-permanent disability (NPD), permanent disability (PD) or death. As stated in work by [5], ignorance of safety procedures and protection such as wearing PPE, failure to understand written safety rules, and a large migrant workforce are among the factors that lead to accidents. On top of that, the uniqueness of the industry and of construction site conditions also plays a big role in causing accidents and deaths. Moreover, workers on construction sites could help reduce the risk of accidents by informing their supervisor or employer of any risks they have spotted, so that appropriate control measures could be introduced to prevent such accidents. Generally, safety performance is measured using lagging indicators such as Incident Rate (IR), Accident Rate (AR) and Experience Modification Rate (EMR); works by [6][7][8][9][10] describe examples of safety measures based on different lagging indicators across the world. In addition, Abas and colleagues wrote a comprehensive paper on the factors that affect safety performance on construction projects [11]. These authors identified safety factors that are beneficial for reducing accident and compensation costs and for increasing productivity, employee awareness and attitudes, and on-time project completion. As proposed by [12,13], employers should evaluate their employees' knowledge and awareness regarding PPE and other equipment at the construction site, so that proper training and control measures can be implemented to reduce the risk of accident or death.
Consequently, this paper presents an image detection model of the safe and dangerous conditions facing workers on construction sites, in terms of their compliance with wearing PPE. Existing works on safety detection at construction sites have mostly focused on a single PPE item, such as hardhats, vests or boots. In this paper, we propose detecting safety conditions based on a combination of three PPE items: hardhats, vests and boots, the basic PPE that construction workers should wear at all times. To classify a worker as operating in a safe or dangerous manner, we used the Faster R-CNN algorithm.
This paper is organized as follows. Section II explains related work, Section III presents the method used in this research, Section IV consists of the findings and their evaluation, and Section V concludes the paper and makes suggestions for future work.

II. RELATED WORKS
According to work by [14], being struck by falling objects is ranked among the most frequent incidents occurring at construction sites. Hence, a solution is needed to lower the risk of death among such workers. A small number of existing works relate to construction systems and the quality of project management, such as [15,16,23]. Work by [15] focused on project progress monitoring and communication between employees, while work by [16] proposed a quality management project for construction projects. Work by [23] proposed optimisation modelling for repetitive works on construction sites using an Unmanned Aerial Vehicle (UAV), and work by [26] detected workers on construction sites using UAV and RFID concepts. Different scopes of work using UAVs were summarised by Ham and colleagues [27], including progress monitoring, building inspection, building measurement, surveying, safety inspection, structural damage assessment and geo-hazard investigation. As far as safety inspection is concerned, existing works tend to focus on only one piece of PPE, the hardhat. The few existing studies on image detection at construction sites are summarised in Table I. Based on Table I and the works we summarised, we identified the most common future challenges for researchers: detecting more PPE items on construction sites or at the workplace, obtaining more dataset resources, handling blockage of the target image by other objects, and coping with different image positions and backgrounds. From the existing works, the best-performing image detection algorithm recommended is Faster R-CNN. Hence, our research tackled these challenges and used the Faster R-CNN algorithm as our detection algorithm.

III. METHOD
For this research, the setup, software and hardware used for the experiment are displayed in Table II, and the research process is depicted in Fig. 1. The images were collected from the MIT Places Database [24]. From fifteen thousand construction images, 1,129 were selected as the training dataset based on the PPE components, which include hardhats, vests, and boots. These images were then labelled using LabelImg (see Fig. 2), a Python labelling tool with a Qt graphical interface, and further analysed, trained and classified using TensorFlow. The annotations were saved as XML files in PASCAL VOC format. During image analysis, we trained and classified the images using the Faster R-CNN Inception v2 COCO model. This model combines Fast R-CNN, with shared convolutional feature layers, and a region proposal network (RPN) in a unified model, as depicted in Fig. 3.
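The labelling step above produces one PASCAL VOC XML file per image. As a minimal, hypothetical sketch of how such an annotation can be read back (the class name "hardhat" and the coordinate values are illustrative, not taken from the paper's dataset):

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Parse a PASCAL VOC XML annotation into (class_name, box) pairs."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        # PASCAL VOC stores boxes as pixel corners: xmin, ymin, xmax, ymax
        coords = tuple(int(box.find(t).text)
                       for t in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

sample = """<annotation>
  <object><name>hardhat</name>
    <bndbox><xmin>48</xmin><ymin>12</ymin><xmax>96</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""
print(parse_voc_annotation(sample))  # [('hardhat', (48, 12, 96, 60))]
```

In practice these parsed boxes would be converted to TFRecord format before training with the TensorFlow Object Detection API.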
The strength of Faster R-CNN lies in its ability to reuse the CNN results for the region proposal process. Hence, only one CNN needs to be trained, and region proposals can be made almost cost-free computationally [25]. Once an image has been inserted, Faster R-CNN produces the classifications and bounding box co-ordinates of the specified classes in the image. In our research, this algorithm helps us to identify and assign the safety condition based on PPE compliance. The safety condition was decided as either safe or unsafe (dangerous) based on worker compliance in wearing a hardhat, vest and boots at the construction site, as depicted in Fig. 4. Once this safety classification was completed, the evaluation was carried out with the aid of 333 images. For safety conditions, the formulation was as follows:

S = f(PPE, T)

where PPE represents the PPE classification, T represents the target image, and S is the safety model. The accuracy was then computed as:

Accuracy = (TP + TN) / (TP + FP + TN + FN)

where:
TP = true positive (number of workers correctly classified as safe);
FP = false positive (number of workers incorrectly classified as safe);
TN = true negative (number of workers correctly classified as unsafe);
FN = false negative (number of workers incorrectly classified as unsafe).
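These formulations can be sketched in a few lines of code. This is our own minimal illustration, assuming (as the paper's classification rule implies) that a worker is counted as safe only when all three PPE items are detected; the function names are ours, not the paper's:

```python
def is_safe(detected_classes):
    """A worker is 'safe' only if all three PPE items are detected."""
    return {"hardhat", "vest", "boots"} <= set(detected_classes)

def accuracy(tp, fp, tn, fn):
    """Fraction of workers whose safety condition was classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

print(is_safe(["hardhat", "vest", "boots"]))  # True
print(is_safe(["hardhat", "vest"]))           # False (boots missing)
print(accuracy(100, 20, 30, 10))              # 0.8125
```

The counts passed to accuracy() here are illustrative, not the experimental figures reported in Section IV.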
The findings based on these formulations are explained in the next section.

IV. FINDINGS
Based on the experiment conducted, 1,129 images containing the PPE items (hardhats, vests and boots) were used to train the model to classify conditions as safe or unsafe. Then, 333 anonymous self-collected images from construction sites were used for evaluation.
During training, a total of 2,373 hardhats, 1,023 pairs of boots and 1,478 vests were detected in the 1,129 images. In the evaluation of 263 images, a total of 156 hardhats, 49 boots, 73 vests and 123 safe conditions were detected, with an overall accuracy of 70%. In the case of a further 70 images self-collected from construction sites, 53 cases were detected as safe. Table III summarises the experimental results, while Fig. 5 shows examples of the evaluation image results. For this experiment, image training took a total of 6 hours, and the total loss was 0.5 or less, as displayed in Fig. 6.
Many factors contributed to the accuracy rate. Apart from the model itself, the training dataset, input image resolution, and training configuration, including batch size, input image resizing, learning rate, and learning rate decay, also affected the accuracy rate [28]. In our case, the ability to detect safe conditions with an accuracy rate of 70% is considered a good result for real-time detection. Comparisons with other existing works are difficult and highly subjective, because each work uses different experimental settings. However, selecting the best detector algorithm and the best configuration is crucial in image detection, as these two factors contribute to the best balance of speed and accuracy. We chose the Faster R-CNN algorithm due to its better accuracy compared with other existing algorithms, as summarised in Table I. In addition, based on the experiment conducted, the formulations developed allow construction workers' compliance with wearing PPE to be easily identified and measured.
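One of the tunable factors discussed above is the detection score threshold: raising it suppresses low-confidence detections (fewer false positives, more misses), while lowering it does the reverse. A short sketch of this filtering step, in which the 0.5 default and the function name are our illustrative assumptions rather than values from the experiment:

```python
def filter_detections(detections, score_threshold=0.5):
    """Keep only detections whose confidence score meets the threshold.

    `detections` is a list of (class_name, score) pairs, as a detector
    such as Faster R-CNN would emit for one image.
    """
    return [d for d in detections if d[1] >= score_threshold]

raw = [("hardhat", 0.91), ("vest", 0.42), ("boots", 0.67)]
print(filter_detections(raw))  # [('hardhat', 0.91), ('boots', 0.67)]
```

With the 0.5 threshold above, the low-confidence vest detection is dropped, so this worker would be classified unsafe; a 0.4 threshold would keep it. This illustrates why the threshold is singled out for refinement in the conclusion.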

V. CONCLUSION
Based on the experiment conducted, a few considerations remain: the threshold value assigned in the data configuration settings, the momentum optimizer value, and the image resizing and image sharpness (low resolution) could all be further adjusted or improved for better accuracy. All these elements are components of the Faster R-CNN training configuration. Nonetheless, this paper has successfully developed formulations and an image detection model for construction workers' compliance with wearing PPE in the workplace, which will help create a safer and healthier environment on construction sites. For future work, the threshold, momentum optimizer, image resizing and image sharpness will be further refined to obtain an improved accuracy rate.