HOG-AdaBoost Implementation for Human Detection Employing FPGA ALTERA DE 2115

Human detection system using Histogram of Oriented Gradients (HOG) feature and AdaBoost classifier (HOG-AdaBoost) in FPGA ALTERA DE2-115 are presented in this paper. This work is expanded version from our previous study. This paper discusses 1) the HOG performance in detecting human from a passive images with other point-of-views (30 deg., 40 deg., 50 deg., 60 deg. and up to 70 deg.); 2) FPS test with various image sizes (320 x 240, 640 x 480, 800 x 600, and 1280 x 1024); 3) re-measurement the FPGA’s power consumption and 4) simulate the architecture in RTL. We used three databases as a parameter for test purpose, i.e. INRIA, MIT, and Daimler. Keywords—FPGA; human detection; adaboost classifier; ALTERA DE2-115; Histogram Oriented Gradients (HOG) feature


I. INTRODUCTION
The object detection is a critical task for many implementations in computer vision research area, like entertainment and autonomous system/robotics application [1].The Histogram of Oriented Gradients (HOG) has been widely employed as a feature for object detection [2].The HOG feature robust to the illumination changes and has a high accuracy in detecting objects with texture variations.So far, there is none proved feature to surpass the HOG existence [3].That is why HOG-based detectors serve as the fundamental feature for object detection systems.HOG is also commonly implemented for detecting the humans effectively [4][5].
To date, there are many papers on FPGA-based HOG that implemented in Xillix FPGA for high-speed and highaccuracy human detection systems [6][7][8][9][10].In previous work [11], we have applied the FPGA-based human recognition in an image, we employed ALTERA DE2-115.The results indicate that the humans are succesfully detected from a particular image with 1280 x 1024 resolutions and frame rate with 129 FPS.In that case, we used INRIA database.In [11], we also compared the other related works with our architecture and also investigated its power consumption of FPGA using Altera's Power Analyzer when performing as human detector system.However, the paper discussion is a lack of the results analysis when the image viewed from different angles using various databases, e.g.INRIA, MIT, and Daimler.Hence, we extend its study involving HOG performance for human detection in an image with other angles up to 70 o ; FPS test with different image resolutions; review about the power measurement of FPGA; and the simulation result of registertransfer level (RTL).To collect more data, various databases such as INRIA, MIT, and Daimler are used in this research.

A. System Description
Section II-A describes the HOG algorithm implementation on FPGA ALTERA DE2-115.The illustration of research workflow can be shown in Fig 1 .The following is brief description of research methods where the details are presented in Section II-B to Section II-F.

1)
Greyscaling and scaling of the image input's are conducted in the first step.Therefore, human on the image can be identified with various distance.
2) Afterwards, the process is continued by every pixel gradient calculation so the dx and dy component can be obtained for every pixel.Both of the components are used to calculate magnitude and angle that used for HOG Feature Extraction every one cell with the size of x*y.
3) HOG Feature Extraction used can minimize system throughput and logic gates utilization because there is not arctan calculation using CORDIC which use a lot of logic gates and add the system throughput.
4) After histogram of 2 x 2 cell is obtained which every cell contained 8 x 8 pixel, the normalization process is performed.In this process, the quadratic and root mathematical operation is approached by the value of the near power of two that segmented into 4 section between two near power of two values so the normalization process can be implemented with shifting operator 5) Because of the FPGA's limited size of memory, histogram binarization is performed so one histogram can be represented using 8 bit.Although the utilization of memory can be reduced to 1/64th, this histogram binarization process can affect the system performance.www.ijacsa.thesai.org6) After normalized histogram is obtained, detection process using the neural network is performed.Because this process is needed matrix multiplication, the systolic array is implemented on FPGA so the number of logic gates and system throughput can be minimized.Weight matrix and bias vector are obtained using Support Vector Machine (SVM) and AdaBoost which are done in offline.

B. Grayscaling & Scaling
As discussed before, the Grayscaling process is performed because of the HOG is an object detection method oriented on the shape of an image.The colored image has three matrices consists of R (Red), G (Green), dan B (blue) as visualized in  We can use shifters block to approximate constants as expressed in Eq (1).To simplify the approach, we take two numbers after a comma.The following is the justification:  0.29 is approximated by 0.0100101, needing 3 shifters. 0.58 is approximated by 0.100101, needing 3 shifters. 0.11 is approximated by 0.000111, needing 3 shifters The scaling process is needed so the system can detect humans from the various different distances.This process used sampling method of some specifics pixel correspond to the scaling rate.The architecture of preprocessing (grayscalling + scalling) can be illustrated in Fig 3.

C. Gradient Calculation
Gradient calculation can be done by calculating the difference of the neighboring pixel horizontally for dx and vertically for dy.The magnitude and angle value are needed to calculate histogram.
Magnitude calculation is explained on the Eq (2).The magnitude of the gradient m is included in histogram based on quantized difference orientation for every luminance value on a certain cell.The process can be illustrated in Fig 4 .In this work, the magnitude calculation uses own algorithm to approximate roots and squares.It uses 120 modules in parallel.The architecture is simple subtraction between neighboring pixels.Afterwards, the calculation result is then saved to the SDRAM.The architecture is shown in Fig. 5

D. HOG Feature Extraction
In this method, angle specific value doesn't need to be calculated, however just needed the range of angle values corresponding to the Eq (3).After the range of angle value is known, the magnitude of the pixel is added on the histogram block correspond to the angle value range.In this work, angle calculation is only compute the approximate angle, not the precise numbers so it requires two clock cycles.The architecture is shown in Fig 6.

E. The Normalization and Binarization
After histogram is obtained, normalization of the histogram collection is performed correspond with the Eq (4).Afterwards, the approached of the divisor in the Eq (4) is conducted to the power of two value that divided into 4 intervals with every near power of two values.
When normalized histogram is obtained, histogram binarization is performed with classified the histogram value based on a threshold value.When the histogram value is lower than the threshold and it will be represented by "1" binary value and vice versa as illustrated in

F. SVM and AdaBoost
Detection process using HOG used SVM that can classified histogram data into two classifiers, the human is detected or not.Training is done offline using two datasets, MIT pedestrian database, and INRIA database.Then boosting is performed to increase the accuracy of detection process with combined some classifier with a different algorithm.

G. Implemented Algorithm in Pseudocode
Table I shows the pseudocode for HOG-AdaBoost Implementation in FPGA.

A. RTL Simulation
FPGA implementation of human detection result is shown in Table II.Every steps of RTL simulation result (as depicted in   Therefore, simulation result on Fig 10 shows the magnitude calculation process which adds the two squares value and root process output if given the 16 and 9 as the input is 5.    Simulation process of histogram normalization can be seen in Fig 13 .The normalization calculation result for every inputs are 300 that approached.Therefore, the divisor value is 256 and the division process can be implemented using shifter.

B. Power Consumption
To measure the power consumption of FPGA DE2-115 implementation, we used The ALTERA's Power Analyzer tool and obtain 3.6 Watt total power.

C. Test with Other Angles
In this testing, human image from three datasets, INRIA, MIT [14], and Daimler with quite a high angle are applied to see the effect of those angle against the detection accuracy.In this testing, 5 angle categories are used as shown in Fig 15 .The result is accuracy detection compared to the dataset that used for training.From this data, can be seen for the angle 50 o and below, detection process still quite reliable.However, for high angle, the advance adjustment is needed.Fig 16 shows table and graph of testing results with three datasets, and the results obtained as detection speed in Frame per Second (FPS).This test is done by inserting test images from the three datasets as fast as possible.Next image will be inserted after the previous image detection process is done.Detection accuracy is maintained still at more than 0.85, if less than 0.85, the pause will be given before the next image is inserted.From that results, can be seen that Daimler consistently has the lowest FPS value.This is because Daimler dataset is designed to have a high level of complexity.However, this result can't be compared with the testing using video, because in video file there is a certain thing, like encoding that not taken into account.

IV. CONCLUSION
In this paper, we present the human detection system using HOG features and its classifiers using AdaBoost implemented in FPGA ALTERA DE2-115.The RTL simulation shows that the calculation of HOG-AdaBoost implementation in FPGA is approaching the theory.Furthermore, we also perform the power consumption measurement for HOG-AdaBoost in FPGA.Later is the high angle test and FPS test.Based on testing, it can be concluded that for the angle<50 degrees, the detection process still reliable.However, with angle>50, needed further system optimation.One of the methods can be done by using own specific dataset for human images taken from high-angle.With new dataset, the SVM is expected to recognize human with an image taken from a high-angle.
Fig 7. The normalization block requires a maximum of 19 shifters, 4 clock cycles, and using 61 stage RAM Buffer as shown in Fig 8.

Fig 1 )
are shown in Fig 9 to Fig 13.Simulation result in Fig 9 shows the quadratic value process.It can be seen that the quadratic value is obtained in 2 clock cycle.

Fig 11
Fig 11 simulation result shows the histogram calculation added input value to the array with 9 elements for every clock cycle.Fig 12 shows the simulation result of the grayscaling process performed the calculation based on an equation as explained before (Eq 1) and produce output every one clock cycle.

Fig. 13 .
Fig. 13.The Histogram Normalization Simulation Result.The systolic array implementation module simulation result can be seen in Fig 14.The output can be obtained after 6 clock cycles from the moment data is given, this result is 6 times faster than the usual matrix multiplication in which it is needed 3*3*3 = 27 multiplication iterations in 3 x 3 matrix multiplication.

Fig. 15 .
Fig. 15.Test with High Angles (For Interpretation of the Color in this Figure Legend, the Reader is Referred to the Web Version of this Article).

Fig. 16 .
Fig. 16.The FPS Tests (For Interpretation of the Color in this Figure Legend, the Reader is Referred to the Web Version of this Article).

TABLE I .
HOG-ADABOOST IMPLEMENTATION IN PSEUDOCODE % Real-time Human Detection with HOG-AdaBoost Pseudocode %Read an RGB value of each pixel from the image read_image; %Greyscaled an image based on the RGB value of each pixel image_greyscaling; %Saving the luminance value of each pixel to RAM save_greyscaled_image; for every_scaled_image %Scaled down an image size with the ratio scale 1.25 every iteration image_scaling; %image window iteration with 8 pixel shifting every iteration for every_image_window:8:image_size %save image window luminance map into RAM end www.ijacsa.thesai.org