A Pedestrian Detection and Tracking Method for Robot Equipped with Laser Radar

To detect and track pedestrians against complex indoor backgrounds, this paper proposes a pedestrian detection and tracking method for indoor robots equipped with a laser radar. First, SLAM (Simultaneous Localization and Mapping) is applied to build a 2D grid map of an unknown environment. Second, Monte Carlo localization is used to obtain the posterior pose of the robot in the map. Third, an improved likelihood field background subtraction algorithm extracts the foreground of interest in a changing environment. Fourth, a hierarchical clustering algorithm combined with an improved leg model detects the target pedestrian. Finally, an improved tracking intensity formula is designed to track and follow the target pedestrian. Experimental results in several complex environments show that the method effectively reduces the impact of confusing scenarios that challenge other algorithms, such as a moving chair, a person suddenly passing by, or the target pedestrian standing close to a wall, and that it can detect, track and follow pedestrians in real time with high accuracy.

Keywords: Laser radar; likelihood field model; pedestrian detection; pedestrian tracking; simultaneous localization and mapping


I. INTRODUCTION
With the rapid development of artificial intelligence, the research and application of intelligent service robots attract the attention of more and more scholars and researchers, and the application fields of robots cover many aspects of human life, such as restaurant service, home-based services, shopping guidance, dance accompaniment and so on [1]. The main work of these robots is to interact with people, so the detection, tracking and avoidance of pedestrians are particularly important.
At present, pedestrian detection and tracking methods mainly depend on ordinary cameras [2][3], infrared cameras [4][5][6][7], laser radar and so on. Because ordinary-camera-based methods are easily affected by lighting, field size, shooting angle and so on, they are only suitable for application scenes with stable lighting and a relatively fixed shooting position. Infrared-camera-based methods can be employed in large workplaces, are not affected by visible light, and work normally even in dark environments. However, due to technical limitations, they have many drawbacks, such as low signal-to-noise ratio, low contrast, low resolution and high cost. Because laser-radar-based methods have the advantages of high precision and low cost and are not affected by illumination changes, they are widely used. Considering both performance and price, 2D laser radar is usually a suitable choice. However, two-dimensional planar data does not include the depth information of an image and, moreover, the longer the detection distance, the lower the resolution. The authors of [8][9] proposed 14 features of the human leg and used an AdaBoost strong classifier to detect pedestrians. The authors of [10][11] proposed improved convolutional neural networks to classify wheelchairs and human legs. However, against a complex background, the limited information in two-dimensional data causes the classifier to output wrong results; for example, a chair leg near a corner may be misjudged as a human leg. To reduce the interference of complex backgrounds, [12] proposed a background subtraction method that reduces background interference for a fixed laser radar in a fixed scene, but it cannot be applied to a mobile robot. The author of [13] first generates a local grid map of the environment, then matches and aligns the grid maps of consecutive frames to obtain a candidate foreground. Then, assuming that the human leg corresponds to the minimum of the laser radar distance histogram, the final foreground is obtained by combining the distance histogram of the candidate foreground with the distance histogram of the human leg. This method works well in relatively open environments such as corridors, but it cannot handle the case where the pedestrian is stationary and the environment is complex.
To solve the problem of background interference in pedestrian detection and tracking, this paper proposes a new method. First, the environment map is constructed; then the likelihood field model is used to segment the foreground from the background; finally, an improved Kalman filter is used to track and follow the pedestrian against the complex background.

II. FLOWCHART OF OUR METHOD
As shown in Fig. 1, after SLAM is applied to construct the environment map, Monte Carlo localization is applied to determine the location of the robot in the map. Then the foreground is extracted with the likelihood field model, and data clustering is employed to generate a steady background, which enhances the accuracy of the foreground extraction. The foreground is then examined to determine whether the target pedestrian is present. If the target pedestrian is localized, the robot follows the pedestrian automatically; otherwise, the robot stands still or cruises randomly while waiting for the result of the next frame.
33 | Page www.ijacsa.thesai.org

A. The Map Construction
SLAM for mobile robots was first proposed by R. Smith, M. Self and P. Cheeseman [14]. It estimates the robot's pose as the robot moves through an unknown environment and builds a map according to that pose, thereby achieving self-localization and mapping at the same time. Grisetti et al. [15] improved traditional SLAM with a Rao-Blackwellized particle filter and proposed adaptive resampling to reduce particle depletion, forming the current GMapping algorithm. GMapping is used here to build the environment map because it places low demands on laser radar performance, has low computational cost and yields accurate maps. Fig. 2 shows the grid map of the experimental scene, built with our own device and GMapping. In the figure, the white areas are where the robot can move; the black areas are where the robot cannot move; the gray areas have not yet been explored; and the numbers mark the positions where the background difference experiments were carried out.

B. Monte Carlo Localization
Monte Carlo localization [16] is an algorithm that determines the position and orientation of a robot in the grid map using odometer information and laser radar data. The algorithm first initializes a particle set drawn from a normal distribution with a given mean and variance. It then updates the pose of every particle using the odometer data and the motion model, and computes the importance weight of each particle by evaluating how well the laser radar data matches the map under that particle's pose. Finally, the particle set is resampled according to the maximum likelihood rule, and the pose with the largest weight is taken as the posterior pose. In addition, random particles are usually added in the resampling step so that the robot can recover from global localization failure and from local optima.
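The loop described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: to stay short it assumes a one-dimensional corridor in which the robot measures its range to a wall at a known position, and the names `mcl_step`, `init_particles`, `wall_x` and all noise parameters are hypothetical choices made for this illustration only.

```python
import math
import random


def gaussian(x, mu, sigma):
    """Normal probability density, used here as the sensor likelihood."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def init_particles(mean, sigma, n, rng=random):
    """Draw the initial particle set from a normal distribution."""
    return [rng.gauss(mean, sigma) for _ in range(n)]


def mcl_step(particles, odom_delta, measured_range, wall_x=10.0,
             motion_sigma=0.05, sensor_sigma=0.1, random_fraction=0.05,
             x_min=0.0, x_max=10.0, rng=random):
    """One MCL iteration in a 1D corridor; returns (new_particles, best_pose)."""
    # 1. Motion update: apply the odometry increment plus motion noise.
    moved = [p + odom_delta + rng.gauss(0.0, motion_sigma) for p in particles]

    # 2. Importance weights: compare the measured range with the range
    #    each particle would expect to the wall at wall_x (toy sensor model).
    weights = [gaussian(measured_range, wall_x - p, sensor_sigma) for p in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # no particle explains the scan: uniform

    # The particle with the largest weight is reported as the pose estimate.
    best_pose = moved[max(range(len(moved)), key=weights.__getitem__)]

    # 3. Resample in proportion to the weights, and inject a few uniformly
    #    drawn particles so the filter can recover from localization failure.
    n_random = int(random_fraction * len(moved))
    resampled = rng.choices(moved, weights=weights, k=len(moved) - n_random)
    resampled += [rng.uniform(x_min, x_max) for _ in range(n_random)]
    return resampled, best_pose
```

In a real system each particle would carry a full pose (x, y, yaw) and the weight would come from matching the whole scan against the grid map; the structure of the three steps stays the same.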

C. Likelihood Field Model
The likelihood field model was first applied to reduce the uncertainty of signals obtained by various sensors: the likelihood of each measured signal value is used to determine the final value of the signal. In this way, the sensor output becomes more robust to noise and to fluctuations of environmental factors such as voltage, temperature, humidity and so on. In this paper, the likelihood field model is used to obtain a steady background; then, given the constructed grid map and the position and orientation of the robot, the foreground can be extracted accurately.
The likelihood field model can be represented by a conditional probability distribution with two components. $p_{hit}$ is the likelihood value affected by measurement noise; it is modeled as a Gaussian function of dist with mean equal to 0 and variance equal to 1. $p_{rand}$ is the likelihood value affected by objects randomly showing up. The likelihood of obstacles detected by the measurement points is considered to obey a uniform distribution, $p_{rand} = 1 / z_{max}$, in which $z_{max}$ is the maximum measurement distance. Based on the above two considerations, the likelihood of the objects detected in the map by the corresponding sensor is $p = z_{hit} \, p_{hit} + z_{rand} \, p_{rand}$, where $z_{hit}$ and $z_{rand}$ are the respective weights. Fig. 3 shows the relationship between likelihood and dist.
The likelihood value represents the possibility that a measurement point belongs to the background, and it is related to the shortest distance dist from the measurement point to an obstacle in the map. In this paper, a fixed threshold value $p_{k\_theta}$ is set. For each measurement point, when the calculated likelihood is larger than the threshold, the scanned region is regarded as background; when it is smaller than the threshold, the scanned region is considered to include foreground. In our experiment, $p_{k\_theta}$ equals 0.5 and dist equals 0.12.
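A minimal sketch of this thresholding step is given below, under stated assumptions. The hit term is taken as an unnormalized Gaussian of dist so that the likelihood lies between 0 and 1, and `sigma`, `z_hit`, `z_rand` and `z_max` are hypothetical values chosen so that the likelihood falls to about 0.5 at a dist of roughly 0.12; this is only one possible reading of the two values quoted above. The obstacle list stands in for the occupied cells of the grid map, and the function names are illustrative.

```python
import math


def nearest_obstacle_distance(point, obstacles):
    """dist: shortest distance from a measurement point to any map obstacle."""
    return min(math.dist(point, o) for o in obstacles)


def likelihood(dist, z_hit=0.95, z_rand=0.05, z_max=10.0, sigma=0.105):
    """Weighted sum of a Gaussian 'hit' term and a uniform 'random' term."""
    p_hit = math.exp(-0.5 * (dist / sigma) ** 2)  # unnormalized, peak value 1
    p_rand = 1.0 / z_max                          # uniform over [0, z_max]
    return z_hit * p_hit + z_rand * p_rand


def split_scan(points, obstacles, threshold=0.5):
    """Label each scan point as background or foreground by its likelihood."""
    background, foreground = [], []
    for point in points:
        p = likelihood(nearest_obstacle_distance(point, obstacles))
        (background if p > threshold else foreground).append(point)
    return background, foreground
```

For example, with a wall sampled along x = 2.0, a scan point at (1.97, 0.5) would be labeled background and a point at (1.2, 0.0) foreground. A practical implementation would precompute the distance to the nearest obstacle for every grid cell instead of searching the obstacle list for each point.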

A. Data Clustering
After the foreground information has been extracted from the laser radar data, it is clustered into several classes, which are then used for recognition. State-of-the-art algorithms can be divided into three classes: hierarchical clustering, K-means clustering and DBSCAN clustering [13]. K-means clustering needs to know the number of clusters in advance, which is not suitable when the number of pedestrians and the environment are unknown. DBSCAN is a density-based algorithm that counts the number of elements within a circle of a certain radius; as long as the number of elements exceeds a preset threshold, the elements in the circle are regarded as one class. The performance of DBSCAN depends on the circle radius and the element-count threshold, and its computational cost is high. The hierarchical clustering algorithm first treats each data point as its own class, calculates the distances between classes, then compares the distances between adjacent classes and merges those whose distance is less than a preset threshold into one class.
Hierarchical clustering is selected in this paper because it does not need the number of clusters to be specified and has few restrictions. In traditional hierarchical clustering, outliers negatively affect the clustering results. To solve this problem, the laser radar foreground data is preprocessed first: outlier points that are far from the laser radar are removed directly, and hierarchical clustering is then carried out.
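The two steps, outlier removal followed by threshold-based merging, can be sketched as follows. This is an illustrative sketch rather than the code used in the experiments: it implements single-linkage agglomerative clustering cut at a distance threshold (equivalent to merging any two points closer than the threshold), and `max_range` and `threshold` are hypothetical values.

```python
import math


def remove_far_points(points, sensor=(0.0, 0.0), max_range=5.0):
    """Preprocessing: drop foreground points that lie far from the laser radar."""
    return [p for p in points if math.dist(p, sensor) <= max_range]


def threshold_clustering(points, threshold=0.15):
    """Single-linkage hierarchical clustering stopped at `threshold`."""
    parent = list(range(len(points)))  # every point starts as its own class

    def find(i):
        # Follow parent links to the class representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the classes of any two points closer than the threshold.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())
```

Because the points inside each cluster keep their scan order, the first and last elements of a cluster remain meaningful for later shape features. The pairwise loop is quadratic in the number of foreground points, which is usually small after background subtraction.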

B. Pedestrian Detection
Although many features related to human legs can be extracted, [8] pointed out that only a few features carry a relatively large weight in the final trained AdaBoost detector, while most of the other features carry little weight. Adding these low-weight features may lead to overfitting.
On the basis of [8], the leg model shown in Fig. 4 is designed, in which the two side points represent the clusters of the two legs and the center point represents the location of the pedestrian. $D_k$ represents the distance between the first and last elements of the corresponding cluster, and $L_k$ represents the total length of the corresponding cluster. Finally, the detector outputs the middle position of the pedestrian leg model as the pedestrian coordinate. The specific process of the pedestrian detector is shown in Fig. 5.
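The following sketch shows how $D_k$ and $L_k$ could be computed for a cluster and how two leg clusters could be combined into one pedestrian position. The decision process of Fig. 5 is not reproduced here, so the gates on $D_k$, on the ratio $L_k / D_k$ and on the gap between legs are hypothetical placeholders, as are the function names.

```python
import math


def cluster_features(cluster):
    """Return (D_k, L_k) for a cluster of scan-ordered (x, y) points."""
    d_k = math.dist(cluster[0], cluster[-1])  # first-to-last distance
    l_k = sum(math.dist(a, b) for a, b in zip(cluster, cluster[1:]))  # path length
    return d_k, l_k


def is_leg(cluster, d_min=0.05, d_max=0.25, min_ratio=1.05, max_ratio=2.0):
    """Hypothetical leg test: leg-sized width and a moderately curved outline."""
    if len(cluster) < 3:
        return False
    d_k, l_k = cluster_features(cluster)
    return d_min <= d_k <= d_max and min_ratio <= l_k / d_k <= max_ratio


def centroid(cluster):
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def detect_pedestrians(clusters, max_leg_gap=0.5):
    """Pair nearby leg clusters and report the midpoint of each pair."""
    legs = [centroid(c) for c in clusters if is_leg(c)]
    used, people = set(), []
    for i in range(len(legs)):
        if i in used:
            continue
        for j in range(i + 1, len(legs)):
            if j not in used and math.dist(legs[i], legs[j]) <= max_leg_gap:
                people.append(((legs[i][0] + legs[j][0]) / 2.0,
                               (legs[i][1] + legs[j][1]) / 2.0))
                used.update((i, j))
                break
    return people
```

A long straight wall segment fails the width gate because its $D_k$ is too large, while a nearly flat short segment fails the ratio gate because $L_k$ is almost equal to $D_k$.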

C. Pedestrian Tracking
To handle the case where one leg is temporarily occluded, the pedestrian tracking algorithm of [17][18] is applied with a revised tracking intensity formula. When a pedestrian suddenly disappears from a scene, the tracking history is used to judge whether the loss was caused by the occlusion of one leg; if so, the pedestrian is treated as only temporarily occluded. In the actual experiments, the improved algorithm effectively tracks multiple pedestrians and is robust when a pedestrian suddenly enters, leaves or is occluded.
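The role of a tracking strength can be illustrated with the sketch below. The tracking intensity formula of this paper is not given in the text, so the update used here (a fixed gain when a detection is matched, a fixed decay when it is missed) is a hypothetical stand-in, and a plain constant-velocity prediction replaces the Kalman filter for brevity. The names `Track` and `step_tracks` and all parameter values are assumptions.

```python
import math


class Track:
    """One tracked pedestrian: position, velocity and tracking strength."""

    def __init__(self, x, y):
        self.x, self.y = x, y
        self.vx, self.vy = 0.0, 0.0
        self.strength = 1.0


def step_tracks(tracks, detections, dt=0.1, gate=0.5,
                gain=1.0, decay=0.5, max_strength=5.0):
    """Advance all tracks by one frame and return the surviving track list."""
    unused = list(detections)
    for t in tracks:
        # Constant-velocity prediction of where the pedestrian should be now.
        px, py = t.x + t.vx * dt, t.y + t.vy * dt
        best = min(unused, key=lambda d: math.dist(d, (px, py)), default=None)
        if best is not None and math.dist(best, (px, py)) <= gate:
            # Matched: update the state and raise the strength (capped).
            unused.remove(best)
            t.vx, t.vy = (best[0] - t.x) / dt, (best[1] - t.y) / dt
            t.x, t.y = best
            t.strength = min(max_strength, t.strength + gain)
        else:
            # Missed (for example, occluded): coast on the prediction and
            # lower the strength instead of deleting the track at once.
            t.x, t.y = px, py
            t.strength -= decay
    survivors = [t for t in tracks if t.strength > 0.0]
    # Detections that matched no track start new tracks (a person entering).
    survivors += [Track(x, y) for x, y in unused]
    return survivors
```

With these values a well-established track survives several missed frames, so a short occlusion does not change its identity, whereas a pedestrian who has really left is dropped once the strength reaches zero.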

D. Automatic Following
The pedestrian within 30 cm in front of the robot is taken as the following target and is followed until the tracking strength of the corresponding tracker drops below a certain level or the following is cancelled manually. After obtaining the global coordinate of the target pedestrian $(x_p, y_p)$ and the robot coordinate $(x_r, y_r)$, the next robot pose $(x_g, y_g, yaw_g)$ is determined from the direction of the line between the robot and the pedestrian. The pose is then sent to the ROS (Robot Operating System) platform, whose navigation function completes the automatic following.
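One way to compute such a goal pose is sketched below. The heading follows the line from the robot to the pedestrian, as described above; stopping a fixed distance short of the pedestrian is an assumption of this sketch, and the `standoff` parameter reuses the 30 cm figure only as a placeholder value. Sending the pose to the ROS navigation function is omitted.

```python
import math


def follow_goal(robot_xy, pedestrian_xy, standoff=0.3):
    """Return the goal pose (x_g, y_g, yaw_g) for following a pedestrian."""
    x_r, y_r = robot_xy
    x_p, y_p = pedestrian_xy
    # Heading of the line from the robot to the pedestrian.
    yaw_g = math.atan2(y_p - y_r, x_p - x_r)
    # Move along that line, but stop `standoff` metres before the pedestrian.
    distance = math.hypot(x_p - x_r, y_p - y_r)
    travel = max(0.0, distance - standoff)
    x_g = x_r + travel * math.cos(yaw_g)
    y_g = y_r + travel * math.sin(yaw_g)
    return x_g, y_g, yaw_g
```

For a robot at (0, 0) and a pedestrian at (2, 0), the goal is 1.7 m ahead along the x axis with zero yaw; if the pedestrian is already closer than the stand-off distance, the goal position stays at the robot's current position and only the heading changes.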

V. EXPERIMENTAL RESULTS AND ANALYSIS
The platform used in our experiments, shown in Fig. 6, is a wheeled robot equipped with a Flash Lidar F4 laser radar (scanning angle 360°, angular resolution 0.5°, frame rate 10 FPS), two wheels with distance encoders, and an industrial computer with an Intel i7-3610 processor. The algorithm is implemented in C++ on ROS.

A. The Experiment of Likelihood Field Background Difference
The experimental results for the robot at the three locations marked in Fig. 2 are shown in Fig. 7. The black outline in the figure is the original obstacle of the map, the fine white points on the obstacle are the laser radar background data separated by the likelihood field, the thick white points with black edges are the laser radar foreground data extracted by the likelihood field method, and the black circle in the middle is the location of the robot. The location of the target pedestrian can be identified in each image.
With the method in [19], a region detected by the laser radar that is close to a background obstacle is treated as background; only a region far away from the background is regarded as foreground. In Fig. 7, which applies our method, the foreground and the background are classified correctly.

B. The Anti-Interference Experiment
The method of [20] is used for the pedestrian detection experiment. Comparing this method combined with likelihood field background difference (ours) against the original method (original) shows that our method achieves better results in complex environments. Fig. 8 shows the experimental results, in which (a) is the actual scene, (b) is the result of our method (ours) and (c) is the result of the original method (original).
Three scenes are used to test the algorithm. Fig. 8(a1) and Fig. 8(a3) show situations in which the pedestrian is easily mistaken for background, and Fig. 8(a2) shows a situation in which a chair leg may be mistaken for a pedestrian. As can be seen in Fig. 8(b1), Fig. 8(b2) and Fig. 8(b3), the original method gives a wrong judgment in each of the three scenes. Our method gives the correct pedestrian detection results: in Fig. 8(c1) and Fig. 8(c3) the accurate location of the pedestrian is detected, and in Fig. 8(c2) the chair leg is not mistaken for a pedestrian.
Experimental results show that, with likelihood field background difference, background and foreground are classified correctly in complex environments, which reduces the interference of the complex environment with pedestrian detection.

C. The Experiment of Pedestrian Detection
The pedestrian detection experiments are carried out at two locations and with two detection distances (50 cm or 50-100 cm). The experimental results are shown in Fig. 9. The method detects the pedestrian at the tested locations and distances, which demonstrates its practicability.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 6, 2020

D. The Experiment of Pedestrian Tracking and Following
Our method is then applied to pedestrian tracking and following; the experimental result is shown in Fig. 10.
As shown in Fig. 10, the black pentagram and the black circle represent the start and end points of the paths of the robot and the target pedestrian. The thick black line between the start point and the end point is the trajectory of the robot, and the thick white line is the trajectory of the target pedestrian. To test the stability of the proposed detection and tracking algorithms, a non-target pedestrian appeared, first moved parallel to the target pedestrian, then overtook the target pedestrian and finally moved in the opposite direction. The track of the non-target pedestrian is drawn as a thick dashed gray line and marked by a white arrow.
It can be seen that the trajectories of the robot and the target pedestrian basically coincide, and the appearance of the non-target pedestrian does not affect the robot's following of the target pedestrian. Therefore, the following strategy in this paper enables the robot to reject interference and track the target pedestrian continuously and stably.

VI. CONCLUSION
This paper proposes a pedestrian detection, tracking and following algorithm for complex environments using a mobile robot with a laser radar. The algorithm first maps the environment and then extracts the foreground data with the likelihood field model to reduce the interference of the complex background. Hierarchical clustering is used to cluster the foreground data, and the improved Kalman tracking algorithm is then used to track multiple pedestrians. Finally, the automatic following strategy proposed in this paper is used to follow the target pedestrian in the known map environment. Experimental results show that the whole system runs in real time, is robust to interference from complex backgrounds, and has practical value. Future work will focus on combining laser radar with machine vision.