An Efficient Real-Time Weed Detection Technique using YOLOv7

Abstract: Farming is becoming increasingly expensive, so efficient farming means avoiding unnecessary losses. Weeds are a key issue in agriculture because they contribute significantly to crop losses. To control weeds, pesticides are currently applied uniformly across the entire field. This approach not only costs a lot of money but also harms the environment and people's health. Spot spraying therefore requires an automatic detection system. When a drone is operated by an embedded deep learning system, herbicides can be sprayed only at the desired locations. With the continuous advancement of object detection technology, the YOLO family of algorithms, which offers high precision and speed, has been applied in a variety of scene detection applications. We propose a YOLOv7-based object detection approach for building a weed detection system. We trained and tested the YOLOv7 model with different parameters on the early crop weed dataset and the 4weed dataset. Experimental results show that the YOLOv7 model achieved mAP@0.50, F1 score, precision, and recall values for the bounding boxes of 99.6, 97.6, 99.8, and 95.5, respectively, on the early crop weed dataset and 78.53, 79.83, 86.34, and 74.24 on the 4weed dataset. The agriculture industry can benefit from the high accuracy of the proposed YOLOv7 model in terms of productivity, efficiency, and time.


I. INTRODUCTION
Losses from pests, diseases, and weeds can currently account for up to 40% of annual crop yields worldwide, and this proportion is expected to rise sharply in the coming years. At present, the principal method of weeding in fields is to spray herbicides across a large area. This practice leaves pesticide residues in the soil, wasting resources and polluting the environment. Precision spraying [1] [2], by contrast, effectively controls the growth of weeds in a field while using less pesticide, improving utilization, and avoiding chemical residue.
Quick and accurate weed detection in crop fields is crucial because it can serve as a foundation for the development of precision spraying systems. Much research has been done so far on image-based techniques for the automated identification and classification of weeds. To improve weed detection accuracy in rice fields, the authors of [3] extracted 101-dimensional features, including color, shape, and texture, from weed images. They achieved a recognition rate of 91.13 percent using deep belief networks with fused features. Two different classification techniques are presented in [4] to identify weed density in photos. The first approach used a Support Vector Machine (SVM) on features from the grey level co-occurrence matrix (GLCM) and reached an accuracy of 73 percent, while the second combined a Random Forest classifier with scale- and rotation-invariant moment features to achieve an accuracy of 86 percent. These methods have the drawback of not being effective against sedges and broadleaf weeds. The artificial neural network (ANN) employed in [5] to identify various types of weeds was optimized using the bee algorithm (BA), and the ANN-BA attained an accuracy of 88.74 percent for the right channel and 87.96 percent for the left channel.
The techniques used in the aforementioned research were aimed at improving recognition with conventional machine vision. Because of their minimal hardware requirements, they are well suited for practical deployment. However, the majority of these techniques were only tested on low-density samples. The difficulties caused by occlusion, clumping, changing illumination, and other characteristics of natural environments remain challenging to overcome.
Deep learning has been used to address weed detection problems in agriculture, and researchers have had success with various deep learning models for this task. The authors of [6] employed tiny YOLO-v3 for real-time application in fields of strawberry and tomato plants and detected goosegrass with an accuracy of 82 percent. Using a pretrained Faster R-CNN, [7] achieved 65 percent accuracy, 68 percent recall, a 66 percent F1 score, and an inference time of 0.21 s in recognizing late-season weeds in soybean fields. The authors of [8] used Inception-ResNet-v2 as the backbone and achieved F1 scores of 72.7 percent (at IoU_all) and 96.9 percent (at IoU_0.5) for identifying agricultural plants and weeds. The study in [9] used Mask R-CNN to accurately extract weeds from the "cranesbill seedling dataset." In [10], the weed Rumex obtusifolius was classified with VGG-16 at an accuracy of 92.1 percent. In a comparison of Inception-v3, AlexNet, VGG-19, GoogLeNet, ResNet-50, and ResNet-101, the study in [11] found that VGG-19, modified to produce a binary output, had the highest classification accuracy, 98.7 percent, for detecting volunteer potatoes in sugar beet.
In recent years, some researchers have used convolutional neural networks to detect weeds in rice fields. Fully convolutional networks (FCN) were used in [12] to classify pixels in high-resolution unmanned aerial vehicle (UAV) imagery taken from a rice field. Their method had an accuracy of 91.96 percent and a mean intersection over union (mean IoU) of 84.73 percent. Using a semantic segmentation model called SegNet, the authors of [13] detected the pixels in the image that corresponded to rice seedlings, weeds, and the background. They addressed the class imbalance by computing class weight coefficients. Their approach achieved a higher accuracy, 92.7 percent, than FCN and U-Net.
www.ijacsa.thesai.org
However, most of the relevant research has only been able to recognize the leaves of certain plants, rather than real photographs with complex backgrounds taken in real settings. These techniques have poor stability and accuracy when applied to identifying weeds in rice fields [15].
Large-scale weed image collections must be carefully curated to create high-performing weed identification algorithms. Images of weeds may be captured on a variety of platforms [16], including field robots [18], portable camera sensors, and unmanned aerial vehicles (UAV) [17]. DeepWeeds [19], the Early crop weed dataset [21], the Open Plant Phenotype Database [22], and the Dataset of food crops and weeds [23] are only a few examples from a recent assessment of 19 publicly accessible datasets for weed identification and plant recognition, published in [20]. These datasets all consist of RGB (red-green-blue) images. A large number of studies have demonstrated the effectiveness of deep learning object detectors in weed identification, including those using the YOLO series, Faster R-CNN, Mask R-CNN, RetinaNet, and EfficientDet.
The YOLOv7 approach is used in this study to address this issue and to significantly improve weed detection performance on the early crop weed dataset [21]. It is also used to assess performance on the newly created 4weed dataset [14], to which no machine learning models have been applied so far. The experiments show that the YOLOv7-based approach proposed in this study can successfully handle the problems of weed identification in crops, achieving high accuracy and efficiency.

II. METHODOLOGY
To build a framework for weed identification, we must complete data collection, model training, and multi-class plant species classification. Two datasets were used in this study: the Early Crop Weed dataset and the 4Weed dataset. The datasets contain images of varying resolutions, which are resized to the same dimensions to match the input layer of the deep learning model. After a suitable dataset is prepared, the data are split into 90% training and 10% testing sets. YOLOv7 is then trained for agricultural weed detection on those data. The performance of the trained model is evaluated using multiple metrics. Fig. 1 depicts the proposed approach used for weed identification.
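The 90%/10% split described above can be sketched as follows. This is a minimal illustration only: the text does not say how the split was drawn, so the random shuffle, the fixed seed, and the file names are assumptions made for the example.

```python
import random


def split_dataset(image_paths, train_fraction=0.9, seed=0):
    """Shuffle image paths and split them into training and testing subsets."""
    paths = sorted(image_paths)      # sort first so the result does not depend on input order
    rng = random.Random(seed)        # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(paths)
    n_train = round(len(paths) * train_fraction)
    return paths[:n_train], paths[n_train:]


# Hypothetical file names standing in for the 308 early crop weed images.
images = [f"img_{i:03d}.jpg" for i in range(308)]
train, test = split_dataset(images)
print(len(train), len(test))  # 277 31
```

With class counts as uneven as those in these datasets, a per-class (stratified) split may be preferable so that each species appears in the test set; the sketch above does not attempt this.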

A. Dataset
The early crop weed detection dataset contains 308 RGB images of four species at early growth stages, taken from the early crop weed classification dataset [21], with the objects of interest annotated with bounding boxes. It includes 25 images of cotton, 67 of velvetleaf, 121 of tomato, and 95 of black nightshade. Fig. 2 shows sample images from the early crop weed detection dataset.
The 4Weed [14] collection includes 618 RGB images in total, which were collected at Purdue University under challenging field conditions. The collection includes images of four weed species that are often seen in corn and soybean production systems: Giant Ragweed, Foxtail, Cocklebur, and Redroot Pigweed (Fig. 3). The final dataset included 150 Giant Ragweed images, 170 Redroot Pigweed images, 35 Cocklebur images, and 73 Foxtail images. The dataset is available at https://osf.io/w9v3j/.
In this research, a computer vision-based technique is used for object recognition and detection, based on the most recent YOLOv7 model. YOLOv7 (You Only Look Once, version 7) is a single-stage object detector whose network structure is depicted in Fig. 4 [25]. Overall, YOLOv7 resizes the input image to 640x640 before feeding it into the backbone network, produces three layers of feature maps of different sizes via the head network, and then outputs the prediction result through RepConv [24]. RepConv is used to build a planned re-parameterized convolution architecture with greater gradient diversity for the various feature maps [24]. With the introduction of an auxiliary detection head, the soft labels generated by the optimization process are used in the learning of both the lead head and the auxiliary head. Because these soft labels represent the distribution and relationship between the source data and the target more faithfully, they lead to more accurate results [26].
The YOLOv7 backbone is made up of the SiLU activation function, ELAN structures, and MP structures. By controlling the gradient paths, the ELAN structure allows deeper networks to learn and converge effectively. Fig. 4 depicts the ELAN and E-ELAN network structures. Downsampling is performed using the MP structure, as shown in Fig. 4.
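Two of the building blocks mentioned above can be illustrated with a short sketch: the SiLU activation, defined as x multiplied by the sigmoid of x, and the sizes of the three feature maps for a 640x640 input. The downsampling strides of 8, 16, and 32 are an assumption commonly associated with YOLO-style detectors; the text itself only states that three feature maps of different sizes are produced.

```python
import math


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


def feature_map_sizes(input_size: int = 640, strides=(8, 16, 32)):
    """Side length of each square feature map for a square input image.

    The default strides are an assumption, not taken from the text.
    """
    return [input_size // s for s in strides]


print(silu(0.0))            # 0.0
print(feature_map_sizes())  # [80, 40, 20]
```

Under these assumed strides, the finest map (80x80) is the one best placed to respond to small objects, while the coarsest map (20x20) covers large ones.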

A. Performance Metrics
Mean average precision (mAP) is the most commonly used metric for evaluating object detection systems. The mAP score is determined by comparing each detected box with the corresponding ground truth box. The overlap between the predicted bounding box and the ground truth bounding box is measured by the intersection over union (IoU); higher IoU values indicate that the predicted box coordinates match the ground truth more closely.

$$\mathrm{IoU} = \frac{\text{Area of overlap}}{\text{Area of union}} \tag{1}$$

Precision is the proportion of true positives (TP) among all positive predictions, where FP denotes false positives. It measures how accurately a model classifies a sample as positive.

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$

Recall is the proportion of true positives among all actual positive samples, where FN denotes false negatives. It measures how well a model can find positive samples: the higher the recall, the more positive samples are found.

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$
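As a concrete illustration of IoU, precision, and recall as used in this subsection, the following minimal sketch computes them for axis-aligned boxes and for given detection counts. The (x_min, y_min, x_max, y_max) box format and the example numbers are assumptions chosen for the illustration, not values from the experiments.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision(tp: int, fp: int) -> float:
    """True positives divided by all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """True positives divided by all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Two 10x10 boxes offset by 5 pixels: overlap 25, union 175.
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))
print(precision(tp=8, fp=2), recall(tp=8, fn=8))  # 0.8 0.5
```

In a detector evaluation, a prediction is typically counted as a true positive only when its IoU with a ground truth box of the same class reaches the chosen threshold, for example 0.5 for mAP@0.5.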
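The area under the precision-recall curve at a given IoU threshold is known as the average precision at that threshold ($\mathrm{AP}_{\mathrm{IoU}}$). $\mathrm{AP}_{\mathrm{IoU}}$ is a performance indicator for a single class or category. To indicate the overall detection performance, the mean average precision at an IoU threshold ($\mathrm{mAP}_{\mathrm{IoU}}$) is calculated as the mean of $\mathrm{AP}_{\mathrm{IoU}}$ over all $N$ classes:

$$\mathrm{mAP}_{\mathrm{IoU}} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{AP}_{\mathrm{IoU},\,i} \tag{4}$$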
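The relationship between the precision-recall curve, average precision, and mAP can be sketched as follows. The text does not state which interpolation scheme was used to compute the area under the curve, so the all-point interpolation below is an assumption (it is one common convention), and the input values are made up for the illustration.

```python
def average_precision(recalls, precisions):
    """Area under a precision-recall curve using all-point interpolation.

    `recalls` must be sorted in increasing order, with `precisions` aligned to it.
    """
    # Sentinel points at recall 0 and recall 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values, i.e. (1/N) * sum of AP_i."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Made-up curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(ap)  # 0.75
# Made-up per-class AP values for two hypothetical classes.
print(mean_average_precision({"class_a": 1.0, "class_b": 0.5}))  # 0.75
```

mAP@0.5 uses a single IoU threshold of 0.5 when matching detections to ground truth, whereas mAP@0.5:0.95 averages the same quantity over several thresholds between 0.5 and 0.95.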
The F1 score is one of the most commonly used measures for assessing the effectiveness of machine learning algorithms. It is calculated as the harmonic mean of precision and recall, and it indicates how well a classification system predicts outcomes.
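As a worked check, the harmonic mean of precision and recall, F1 = 2PR / (P + R), can be applied to the precision and recall values reported in this paper. The function name is chosen for the illustration; the inputs are the reported values in percent.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall on the early crop weed dataset.
print(round(f1_score(99.8, 95.5), 1))    # 97.6
# Reported precision and recall on the 4weed dataset.
print(round(f1_score(86.34, 74.24), 2))  # 79.83
```

Both results agree with the F1 scores reported for the two datasets, which confirms that the reported F1 values follow from the reported precision and recall.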

B. Performance Evaluation
The model was trained and tested in the cloud-based Google Colab environment with an NVIDIA Tesla T4 GPU. Training of the YOLOv7 model was initialized with weights pre-trained on the COCO dataset rather than starting from scratch.
We evaluate the effectiveness of the YOLOv7 model trained on the early crop weed dataset and the 4Weed dataset. The model was trained for 50 epochs. YOLOv7 achieves a higher mAP than previously trained models, as shown in Table I. The YOLOv7 model was evaluated on random test images, and the results are displayed in Fig. 5 and Fig. 6. The results suggest that the model can successfully identify agricultural weeds. Three types of loss were tracked throughout the YOLOv7 training and validation process: bounding box loss, objectness loss, and classification loss. Precision, recall, mAP@0.5, and mAP@0.5:0.95 were tracked as well. Fig. 7 and Fig. 8 show that every loss value decreased throughout training and that the model did not exhibit any overfitting. The training loss converged early on, while the validation loss converged near the end of training. The minimum values of the training and validation loss curves were reached after 50 training epochs with a batch size of 16.
The normalized confusion matrices evaluated on the test images using the trained YOLOv7 model are plotted in Fig. 9 and Fig. 10.
The performance of the YOLOv7 model on the early crop weed dataset is evaluated using mAP, F1 score, precision, recall, and the precision-recall curve. After training for 50 epochs, the mAP, F1 score, precision, and recall of YOLOv7 on the early crop weed dataset are 99.6, 97.6, 99.8, and 95.5, respectively. The graphs of these metrics are shown in Fig. 11 to Fig. 14.
The YOLOv7 model has inherent limitations that affect its accuracy for weed detection, such as difficulty in detecting small or occluded weeds and misclassification of non-weed objects as weeds. This study also used a limited dataset for training and testing the YOLOv7 model, which could affect the accuracy and generalizability of the results.

IV. CONCLUSION
Weeds increase agricultural cultivation costs and lower crop yields. Machine vision plays a significant part in precision agriculture by helping to locate weeds on agricultural land. In this work, we use the early crop weed dataset and the 4weed dataset for weed detection with machine vision. The one-stage, deep learning-based object detector YOLOv7 was tested for weed detection on both datasets. The mAP@0.5 is 99.6 for the early crop weed dataset and 78.53 for the 4weed dataset. Because of its fast inference speed, YOLOv7 shows strong promise for real-time applications. Further research is still required to increase the accuracy of weed detection by improving model training and data augmentation methods, enlarging the dataset, and refining the model. Additionally, field experiments and demonstrations using trained models deployed on a machine vision system with onboard computer hardware in real-world field settings are required for further model assessment and updating.