A Hybrid Background Subtraction and Artificial Neural Networks for Movement Recognition in Memorizing Quran

Changes in movement over time and variations in object appearance are an active research topic in computer vision. Object behavior can be recognized from movement changes in a video. To recognize object behavior, the target and the trace of the object must be determined across the sequence of frames. To date, object detection in video has been widely used in areas such as supervision, robotics, agriculture, health, sports, education, and traffic. This research focuses on the field of education by recognizing the movements of Quantum Maki Quran memorization in a video. The purpose of this study is to enhance existing computer vision techniques for detecting Quantum Maki Quran memorization movements in a video. It combines the Background Subtraction method with Artificial Neural Networks and evaluates the combination to optimize system accuracy. Background Subtraction is used for object detection, and Back Propagation in Artificial Neural Networks is used for object classification. Nine videos were obtained from three different volunteers and divided into six training videos and three testing videos. The experimental results show a system accuracy of 91.67%. We conclude that several factors influence the accuracy, such as video capturing, video improvements, the models, feature extraction, and parameter definitions during Artificial Neural Network training.
Keywords—Movement recognition; computer vision; Quran memorization movement; background subtraction; back propagation; artificial neural networks


I. INTRODUCTION
Computer vision has become a prominent subject in research and in the information technology industry. Changes in movement over time and variations in the appearance of an object are an interesting topic of study in today's computer vision. An advanced system should be able to detect and recognize changes in the movement of objects. This recognition is useful for understanding object movement behavior through a video camera. To detect moving objects in a video, the target and trace of the objects must be determined across a sequence of video frames.
Several obstacles must be considered when detecting objects in a video, such as a background or another object that resembles the object of interest [1], shadows cast on the object of interest, and noise caused by highlights [2]. Despite these obstacles, object recognition in video has been widely used in several fields, such as supervision [3], robotics [4], [5], agriculture [6], health [7], sports [8], education [9], and traffic [10]. This research aims to contribute to the field of education.
Most techniques for detecting object movement in a video use the Background Subtraction method, a combination involving it, or a modified version of it. In total, we found at least 30 such references; one of them, [11], reports an optimal object detection result. Other methods found in the literature include Feature Extraction [7], [12], Histogram of Oriented Gradients [13], [14], Neural Networks [15], [16], Face Detection Module [17], Fuzzy Inference System [18], Naïve method [9], Probabilistic method [19], and Directed Acyclic Graph [20], with various object detection results.
Object analysis means classifying objects based on their behavior. Objects can be classified by various techniques, such as pattern matching [3], [21], system learning and Artificial Neural Networks [22], control comparison and fuzzy logic [23], and others. Artificial Neural Networks are considered the best technique, since they adapt and learn better when the environment changes. To the best of our knowledge, studies that use Background Subtraction for object detection and Artificial Neural Networks for object classification are rare. Only one related study [24] combines these two methods, and it reports a system accuracy of 84.6%. This result falls short of the textbook guideline [25], which sets the minimum accuracy for Artificial Neural Networks with the Back Propagation algorithm at 90%. We argue that increasing the accuracy fills this gap.
The Quran is the holy book of Muslims. It is the source of truth and guidance for human life (Chapter Al-Baqoroh: 2) [26]. For this reason, Muslims are obliged to practice the contents of the Quran. To ease the practice, it is recommended to memorize as much of its contents as possible. Huffadz, i.e., people who memorize the Quran, use several memorization techniques. Memorization is believed to date back to the creation of the Prophet Adam (Chapter Al-Baqoroh: 31) [26]. Nowadays, many methods are available for memorizing the Quran. One of them is the sign method, devised by Huffadz who emphasize memorizing in different ways. A well-known example of the sign method is Quantum Maki, which uses body movements to translate the meaning of the Quran verses word by word. Given the nature of the Quantum Maki method, we believe that it can be improved by computer vision.
This research aims to achieve a system accuracy above 90% by combining the two methods in order to better recognize Quantum Maki Quran memorization movements. To achieve this goal, the data are distributed using the hold-out method, with 70% training data and 30% test data. The method is applied with stratified random sampling, which randomizes the data to produce proportional training and test sets.
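As an illustration of the stratified hold-out idea, the following minimal sketch splits labeled samples so that each label keeps roughly the same 70/30 proportion. The function name `stratified_holdout`, the clip names, and the labels are hypothetical and are not taken from the system described in this paper, which works with nine videos split into six for training and three for testing.

```python
import random
from collections import defaultdict

def stratified_holdout(samples, labels, train_fraction=0.7, seed=0):
    """Split samples into train/test lists while keeping each label's proportion."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, test = [], []
    for label, group in sorted(by_label.items()):
        rng.shuffle(group)  # randomize within each label (stratum)
        n_train = round(train_fraction * len(group))
        if len(group) > 1:
            # Keep at least one sample on each side of the split.
            n_train = max(1, min(len(group) - 1, n_train))
        train += [(s, label) for s in group[:n_train]]
        test += [(s, label) for s in group[n_train:]]
    return train, test

# Hypothetical example: ten clips for each of two movement labels.
samples = [f"clip_{i:02d}" for i in range(20)]
labels = ["movement_a"] * 10 + ["movement_b"] * 10
train, test = stratified_holdout(samples, labels)
print(len(train), len(test))
```

Shuffling inside each label group, rather than over the whole data set, is what keeps the training and test sets proportional.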
To speed up the process, however, this research is limited to one chapter of the Quran, namely Al-Ikhlas, and the movements are detected in a standing position. Another limitation is that the image is automatically cropped to a predetermined size to standardize the process; as a consequence, it is difficult to obtain specific parts of the object for every person. Moreover, the reliability of the data sets is important for movement recognition in memorizing the Quran. Hence, this research uses a green background to accommodate color video recognition in mp4 format.

II. THEORETICAL FRAMEWORK
Background subtraction is a method for finding particular objects in an image by comparing the current image with a background model. The background model is sensitive to object motion. Background subtraction detects objects by separating the background from moving foreground objects, which is computed by using the formula in (1):

$$R = I - B \tag{1}$$

where R is the result of background separation, I is the object whose change in position is examined, and B is the object background.
For movement recognition, Background Subtraction is used for object detection and an Artificial Neural Network is used for object classification. Two features, metric and eccentricity, represent the movement of objects during memorization, which involves body and hand movements. Metric is a quantity that represents the roundness of an object, and eccentricity is the ratio computed between the two axes, i.e., the major and minor axes, of the object's shape. In addition, the movement is recorded as a video before being processed by the system.
Although background subtraction is considered suitable for separating the background from moving objects, another moving object or the object's shadow may also be detected as foreground. The basic idea of the method is |frame(n) - background| > threshold. Any pixel that satisfies this inequality, including a pixel of the object's shadow, is classified as an object pixel, whereas all other pixels are considered background.
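The thresholding rule can be sketched on small grayscale images as follows. This is a minimal illustration only: the function name `subtract_background`, the pixel values, and the threshold of 30 are assumptions and do not come from the system described in this paper.

```python
def subtract_background(frame, background, threshold):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

# Hypothetical 3x4 grayscale background and current frame.
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 12, 200, 10],
         [11, 180, 190, 10],
         [10, 10, 14, 9]]

mask = subtract_background(frame, background, threshold=30)
for row in mask:
    print(row)
```

Small intensity differences stay below the threshold and are treated as background, while a shadow dark enough to exceed the threshold would be marked as foreground, which is the weakness noted above.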
Artificial Neural Networks are a well-known artificial intelligence technique based on the mechanism of human neural networks. They are formulated by modeling human neural networks mathematically under three basic assumptions. First, information is processed by simple elements called neurons. Second, information signals flow in and out along the paths between neurons, called connectors. Third, each connection between two neurons carries a weight that strengthens or weakens the signal.
The output of a neuron is obtained by evaluating an activation function. This function, which is usually nonlinear, is computed on the sum of the received inputs. A threshold is used to compare the magnitude of the output. The structure of Artificial Neural Networks is conceptually represented in Fig. 1. Back propagation is one of the algorithms in Artificial Neural Networks. It is commonly used to change the weights connected to the neurons in the hidden layer. Back propagation uses the output error to change the weights in the backward direction [25].
A back propagation network has several units in one or more hidden layers. Fig. 2 illustrates the back propagation architecture with n input units (plus a bias), a hidden layer consisting of p units (plus a bias), and an output unit.
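The architecture described above (n inputs plus a bias, p hidden units plus a bias, one output) can be sketched as a small network trained by back propagation. This is a minimal sketch and not the implementation used in this research: the class name `TinyBackpropNet`, the sigmoid activation, the learning rate, the number of hidden units, and the four (metric, eccentricity) samples are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyBackpropNet:
    """n inputs (+bias) -> p hidden units (+bias) -> 1 output, all sigmoid."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]  # inputs plus bias
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]   # hidden activations plus bias
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return xb, hb, out

    def train_step(self, x, target, lr=0.5):
        xb, hb, out = self.forward(x)
        # Output error term, then propagated backward to the hidden layer.
        delta_out = (target - out) * out * (1.0 - out)
        delta_hidden = [delta_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                        for j in range(len(self.w_hidden))]
        self.w_out = [w + lr * delta_out * h for w, h in zip(self.w_out, hb)]
        for j, row in enumerate(self.w_hidden):
            self.w_hidden[j] = [w + lr * delta_hidden[j] * v
                                for w, v in zip(row, xb)]
        return 0.5 * (target - out) ** 2

# Hypothetical (metric, eccentricity) pairs with a binary movement label.
data = [((0.9, 0.2), 1.0), ((0.8, 0.3), 1.0),
        ((0.3, 0.8), 0.0), ((0.2, 0.9), 0.0)]
net = TinyBackpropNet(n_inputs=2, n_hidden=3)
errors = []
for epoch in range(2000):
    errors.append(sum(net.train_step(x, t) for x, t in data))
print(errors[0] > errors[-1])
```

The hidden-layer error terms are computed from the output weights before those weights are updated, which is the backward flow of the output error described above.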

A. The Sign Movement of Quantum Maki
Quantum Maki is a sign movement for memorizing the Quran that uses the human body, mainly hand movements. It was developed by a husband and wife who are both Huffadz and have memorized all chapters of the Quran, in accordance with Chapter An-Nahl verse 78 and Chapter Yaasiin verse 65 [26].
This movement is slightly different from BISINDO [27] and considerably different from SIBI [28]. The main difference is that Quantum Maki uses not only hand movements but also other limb movements in particular circumstances. For example, the word alaqdaam is illustrated by patting the foot with both hands. Basically, Quantum Maki is based on movements that broadly represent the meaning of the Quran verse. Several examples are captured in Fig. 3.
In general, each word in a Quran verse has its own meaning and movement. However, several verses are too long to memorize by using the movements. To facilitate memorization, a single movement is used in such cases. Additionally, to make the movements easy to understand and remember, general movements are reused for other verses. For example, the word robbi in the phrase birobbil falaq is indicated by a raised hand. The movement for the word robbi remains the same in every verse and chapter.

III. METHODOLOGY
To streamline the research, a conceptual methodology is required. Fig. 4 shows the methodology of the research, which is divided into two stages. The first stage is data collection, which starts with interviewing the creator of the movement to obtain sample data from the primary source and to validate the results of the experiments. The second stage is system development, which relies on the prototyping method to speed up coding. To minimize the error, the prototype is tested several times and its parameters are modified.
The training and testing diagrams are designed to detail the system development. Fig. 5 shows the training process diagram, while Fig. 6 shows the testing process diagram. Several processes are common to both diagrams. Each process is discussed together with the result of its implementation to ease understanding.
Both the training and testing diagrams start with video capturing, which is conducted with three volunteers in a standing position. The result is fed into the system together with its attributes, such as video dimensions, frames, and duration. In background subtraction, the object and background are separated through the following processes:
1) Frame separation: It is the process of separating the video into frames, second by second. Frame attributes are separated and adjusted based on the video duration.
2) Image sequence: It is the process of separating objects from the background. The selected object is used for segmentation, while the background is ignored. The RGB values of each frame are converted to HSV, and their binary digit values are combined with XOR to eliminate negative values in the truth table. The result is a grayscale image that is then transformed into binary digits by using the formula in (2).
3) Noise removal: It is the process of removing noise from the object that has already been separated from the background by using the Median Filter formula, as in (3).

4) Cropping: It is an automatic process of taking specific parts of a particular object. This research uses an arbitrary cropping size of 170x170 pixels to automate and speed up the whole process.
5) Feature extraction: It is the process of extracting features from a particular object. Two features are used in this paper, Metric and Eccentricity, which are also used in [29]. The Metric feature is computed as in (4):

$$M = \frac{4\pi A}{C^{2}} \tag{4}$$

where M is the metric value, A is the object area, and C is the object perimeter.

The Eccentricity feature is computed as in (5):

$$e = \sqrt{1 - \frac{b^{2}}{a^{2}}} \tag{5}$$

where e is the eccentricity value, a is the length of the major axis of the ellipse, and b is the length of the minor axis of the ellipse.
The sample result of Background Subtraction processing is illustrated in Fig. 7.
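The noise removal and feature extraction steps can be sketched on a small binary mask as follows. This is a minimal sketch under stated assumptions and not the implementation used in this research: the function names `median_filter_3x3` and `shape_features` are hypothetical, the perimeter is a crude count of exposed pixel edges, and the eccentricity is taken from the ellipse that has the same second central moments as the object.

```python
import math

def median_filter_3x3(mask):
    """Replace each pixel by the median of its 3x3 neighbourhood (edges clamped)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [mask[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out

def shape_features(mask):
    """Return (metric, eccentricity) of the foreground pixels in a binary mask."""
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    area = len(pixels)
    fg = set(pixels)
    # Crude perimeter: pixel edges that face background or the image border.
    perimeter = sum((r + dr, c + dc) not in fg
                    for r, c in pixels
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    metric = 4 * math.pi * area / perimeter ** 2

    # Second central moments; their eigenvalues scale with the squared axes.
    mr = sum(r for r, _ in pixels) / area
    mc = sum(c for _, c in pixels) / area
    srr = sum((r - mr) ** 2 for r, _ in pixels) / area
    scc = sum((c - mc) ** 2 for _, c in pixels) / area
    src = sum((r - mr) * (c - mc) for r, c in pixels) / area
    half_trace = (srr + scc) / 2
    root = math.sqrt(((srr - scc) / 2) ** 2 + src ** 2)
    major_sq, minor_sq = half_trace + root, half_trace - root
    if major_sq <= 0:
        return metric, 0.0
    ratio = min(max(minor_sq / major_sq, 0.0), 1.0)
    return metric, math.sqrt(1 - ratio)

# Hypothetical 8x8 mask: a 4x4 square object plus one isolated noise pixel.
noisy = [[0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        noisy[r][c] = 1
noisy[7][0] = 1

clean = median_filter_3x3(noisy)
metric, eccentricity = shape_features(clean)
print(round(metric, 3), round(eccentricity, 3))
```

The median filter removes the isolated pixel but also rounds the corners of the square, so the features are computed on a slightly smaller object. With the edge-counting perimeter, metric values are best compared relative to each other rather than read as absolute roundness.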
Normalization is the process of normalizing the values of the object features. The average value of each of the two features is calculated by using the formula in (6):

$$\bar{x} = \frac{\sum_{i=1}^{N} x(i)}{N} \tag{6}$$

where $\bar{x}$ is a feature (metric or eccentricity), $x(i)$ is the feature value in frame $i$, $\sum x(i)$ is the sum of the feature values over all frames, and N is the number of frames.

Fig. 3. The use of Sign Movement of Quantum Maki.

Fig. 7. Background Subtraction Output in One Frame.

IV. EVALUATION AND DISCUSSION
Artificial Neural Networks carry out the training and classification of the dataset so that the movements can be recognized by the system. Back Propagation is the algorithm used for training. Nine videos are available in this research: six are used as training data and three as testing data. The processed training data are shown in Table I. The training parameters of the Artificial Neural Network used in this research are shown in Table II.
The trial process follows the same steps as the training process. However, the normalized data must be simulated on the network. Fig. 8 shows the interface of the developed system for the trial process. The result is divided into the current frame, the background, the binary foreground, and the cropped image. This interface enables users to evaluate the trial result effectively. The trial results of the other six testing data are available in Table III.
All testing data for each movement are tested on the previously trained network. The trial was run 12 times; 11 movements were correctly detected and one movement was incorrectly detected. In other words, the accuracy of the system in detecting Quantum Maki Quran memorization movements is 91.67%. The accuracy percentage is calculated by using the equation in (7):

$$\text{Accuracy} = \frac{\text{number of correctly detected movements}}{\text{number of trials}} \times 100\% \tag{7}$$

This equation is generally used in other existing works. Hence, it is not necessary to replicate the experiments of other existing works for comparison. This is also because the implementation in this research differs from others in terms of its unique requirements and process.
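The averaging of per-frame feature values and the accuracy percentage reported in this paper reduce to two short computations, sketched below. The function names and the per-frame metric values are hypothetical; only the counts of 11 correct detections out of 12 trials come from the experiment.

```python
def mean_feature(values):
    """Average a per-frame feature (metric or eccentricity) over all N frames."""
    return sum(values) / len(values)

def accuracy_percent(n_correct, n_total):
    """Share of correctly detected movements, as a percentage."""
    return 100.0 * n_correct / n_total

# Hypothetical per-frame metric values for one video.
metric_per_frame = [0.52, 0.48, 0.50, 0.54]
print(round(mean_feature(metric_per_frame), 2))

# 11 of 12 trials were detected correctly in the experiment.
print(round(accuracy_percent(11, 12), 2))  # 91.67
```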
V. CONCLUSION
Detecting movements during Quantum Maki Quran memorization involves three main processes, namely object recognition (Background Subtraction), the training of Artificial Neural Networks (Back Propagation), and the trial process. The trial process is used to determine the accuracy of the system in recognizing Quran memorization movements on the data demonstrated by three different volunteers. The system achieves an accuracy of 91.67%, which fulfils the minimum requirement for Artificial Neural Networks. Several factors are considered to influence the accuracy, such as video capturing, the models' demonstration, video editing, feature extraction, and Artificial Neural Network parameters. Future work includes detecting more object movements in Quantum Maki Quran memorization. It is also suggested to use other methods, such as a Gaussian Filter, to detect the movements as a comparison; to use other extracted features adapted to the complexity of the Quantum Maki Quran memorization movements; to compare the accuracy of the Background Subtraction method with other methods; to use other training algorithms to classify the movements; and to group similar movements for better identification. More chapters of the Quran are also expected to be included to improve the reliability of the system.

TABLE III. PARAMETERS OF BACK PROPAGATION TESTING

TABLE IV. TESTING DATA TRIAL RESULT