Intelligent Locking System using Deep Learning for Autonomous Vehicle in Internet of Things

Nowadays, we use modern locking systems to lock and unlock our vehicles. The most common methods are using a key to unlock the car from outside, pressing the unlock button inside the car, and, in many vehicles, using a keyless entry remote control. However, none of these locking systems is user friendly in constrained situations, for example when the user's hands are full or the key is lost or left behind, nor are they well suited to special cases such as disabled drivers. Hence, we propose a new way to unlock the vehicle using face recognition. Face recognition is one of the key components of future intelligent vehicle applications in the Autonomous Vehicle (AV) and is crucial for promoting user convenience in the next generation of AV. This paper proposes a locking system for AV that uses a deep learning approach adapting face recognition techniques. It aims to design and implement the face recognition procedure using an image dataset that consists of training, validation and test folders. The methodology used is a Convolutional Neural Network (CNN), programmed in Python on Google Colab. We created two different folders to test whether the method is capable of recognizing different faces. Finally, after training on the dataset, testing was conducted, and the results show that the trained model was successfully implemented. The model predicts accurate outputs and gives significant performance. The dataset consists of face images taken from the front, the right (30-45 degrees) and the left (30-45 degrees).
Keywords: Face recognition; deep learning; internet of things; convolutional neural networks

everything else encountered in daily life, contains sensors. Sensors or devices in an IoT system collect and send data [3] from the surrounding environment on the state of the device's operation. As a data transmission channel, a gateway or link is necessary. The gateway's job is to make communication and data sharing between devices easier.
As a network of connected physical objects, IoT bridges devices by transferring their data in a common language shared by the various sensors and devices. Sensors act as the data suppliers to the IoT platform, and the data is gathered from multiple sources. The raw data must be analyzed [3] before useful information can be extracted. Finally, data handling, process automation and efficiency can be improved further by integrating the platform with other devices.
Furthermore, IoT platform capabilities such as storage, actuation, sensing, enhanced services, and communication technologies are very important for gathering and analyzing data [4] from smart infrastructure. IoT is also changing our way of life and transforming how we interact with technology [4], and it is driving the world to become a better place.
Human lifestyles have been affected in certain ways [4] by how people interact with gadgets (things) in step with the growing IoT revolution. This revolution, as described in [5], provides the capacity for seamless interaction when transmitting and sharing data across a network without requiring any human-to-computer communication.
The cloud, which functions as a platform for collecting, storing, managing, and analyzing real-time data, is a major component of IoT. Once the cloud handles the data, analytics transform the analogue data from billions of devices into meaningful information that can subsequently be used for thorough analysis. Data from devices and sensors is converted into a format that is simple to read and process.
Artificial intelligence (AI) is becoming more prevalent in IoT applications and deployments [6]. John McCarthy first presented AI in 1956, and he held that AI involved developing a machine that could truly mimic human intellect [6,7]. Simply put, AI was conceived so that a machine can replicate human intelligence and carry out tasks from the easiest to the hardest. www.ijacsa.thesai.org The idea of AI is to replicate human cognitive processes. Developers and researchers expect AI to go as far as imitating humans in processes including perception, reasoning and learning [6]. In a number of situations, AI systems outperform humans by a large margin [8]. AI has been shown to master numerous computer games, defeating a world champion chess program and the top professional poker players in the world [9].
There are two types of AI: weak and strong [10]. A system that always performs a single task is known as weak AI, while strong AI refers to a sophisticated and complex system. Examples of weak AI include video games and personal assistants, the most famous of which is Apple's Siri [11].
On the other hand, there are many examples of strong AI nowadays, such as operating rooms in hospitals and self-driving automobiles. These technologies are capable of solving problems without human intervention [12] because they have been trained beforehand to deal with the circumstances.
Previous AI standards are becoming obsolete as technology develops [12,13]. For example, machines that perform basic calculations or read text using optical character recognition are no longer regarded as embodying AI, because these operations are now considered standard computer functions. AI is continuously being enhanced [13] to benefit a wide range of businesses.
Mathematics, psychology, linguistics, and computer science are examples of the multidisciplinary methods used to wire machines [14]. Thanks to the advancement of AI models [15] that connect with IoT, we can even lock and unlock Autonomous Vehicles (AV) with our own face using a deep learning approach to facial recognition.
Furthermore, the AV idea is now at the forefront of the automotive industry's future security [16]. As technology progresses, AV have the potential to reduce accidents, increase accessibility to transportation, especially for elderly persons, provide stress-free parking, and provide high-end security, among other benefits [16,17]. However, technological advancements can also have downsides for AV users. Sensors, for example, may malfunction, giving a hacker the opportunity to steal personal data from an AV user [17]. Section II discusses AV in further detail. Fig. 1 shows how data is gathered from AV devices or sensors and then analyzed before being transferred to the cloud. The data received from the AV is first gathered at the edge, where it goes through pre-processing and decision-making in the edge node. After this local analysis, the edge node transfers the data to the cloud [18], which serves less time-sensitive decision-making and offline global processing.
As Fig. 1 illustrates, road accidents can be prevented by using obstacle recognition. Such time-sensitive decisions, made in the edge node, can avoid crashes within a shorter amount of time [18]. To improve the driving experience, the cloud provides a platform for analyzing data about traffic, roads and driving habits.
In the edge node illustrated in Fig. 1, the AI models are actively updated in response to consumer needs, regulations, policies and applicable laws. Moreover, the amount of data this node sends to the cloud is smaller than the amount generated by the AV, because the data is pre-processed, filtered and cleaned before proceeding to the cloud; in this way cost and bandwidth can be reduced.
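The edge-node flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system in Fig. 1: the `Reading` type, the 200 m validity range, the 5 m braking threshold and the summary fields are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    sensor_id: str
    distance_m: float  # hypothetical obstacle distance reported by a sensor


def preprocess(readings):
    """Filter and clean: drop physically impossible distances (assumed range)."""
    return [r for r in readings if 0.0 <= r.distance_m <= 200.0]


def edge_decision(readings, brake_threshold_m=5.0):
    """Time-sensitive decision taken locally on the edge node."""
    return any(r.distance_m < brake_threshold_m for r in readings)


def summarize_for_cloud(readings):
    """Reduce the cleaned readings to a compact summary before upload."""
    if not readings:
        return {"count": 0}
    distances = [r.distance_m for r in readings]
    return {"count": len(distances), "min_m": min(distances), "mean_m": mean(distances)}


raw = [Reading("lidar", 42.0), Reading("radar", 3.5),
       Reading("lidar", -1.0), Reading("radar", 18.5)]
clean = preprocess(raw)                 # the invalid -1.0 reading is dropped
must_brake = edge_decision(clean)       # decided at the edge, without the cloud
summary = summarize_for_cloud(clean)    # only this small record is uploaded
```

The point of the sketch is the split of responsibilities: the braking decision never waits for the cloud, and the cloud receives a summary that is smaller than the raw sensor stream.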
When it comes to locking security in AV, the most common methods are using a key to unlock the car from outside, pressing the unlock button inside the car, and, in many vehicles, using a keyless entry remote control. However, none of these locking systems is user friendly in constrained situations, for example when the user's hands are full or the key is lost or left behind, nor are they well suited to special cases such as disabled drivers. Hence, we propose a new way to unlock the vehicle using face recognition. Face recognition is one of the key components of future intelligent vehicle applications in the Autonomous Vehicle (AV) and is crucial for promoting user convenience in the next generation of AV. This paper proposes a locking system for AV that uses a deep learning approach adapting face recognition techniques.
In the remainder of this paper, Section II gives a brief history of AV, including the superiority of AV, the challenges of AV and the solution to those challenges; Section III discusses the components of machine learning, covering 10 sub-components; Section IV describes the methodology used in this research, from data collection to implementation; Section V presents the results and discussion; Section VI concludes the paper; Section VII discusses future work; and Section VIII gives the acknowledgements.
[19]. A study in [16] mentions that AV was introduced in the 1980s and that the research on AV was funded by the Defense Advanced Research Projects Agency (DARPA) [16]. Through innovation in transport systems and the combination of sensors and control software, AV can not only reduce time, money and environmental impact but also improve safety, increase capacity, and minimize traffic congestion [20].
With modern technological advances, the AV, also known as a driverless vehicle, has evolved the ability to sense its surroundings, perform significant functions, and operate by itself without human interference. Automation is generally divided into six levels, from level 0 to level 5, each representing the vehicle's operational control capabilities: level 0 has the least automated control while level 5 has the most. At level 0, the vehicle cannot control any operation and the whole driving process must be done by a human [16]. At level 1, steering and braking control of the vehicle is improved with the support of an Advanced Driver Assistance System (ADAS).
As the level of automation increases, the vehicle becomes more advanced. As stated in [16], level 2 vehicles are capable of controlling steering and braking using the ADAS. However, the driver needs to stay focused and pay attention to the environment throughout the journey. Level 3 is more advanced: the driver hands full control to the vehicle through the Automated Driving System (ADS), which is capable of controlling all parts of the driving task under certain conditions. However, the human driver must take control of the vehicle when requested by the ADS, and executes the necessary tasks in the remaining conditions. The ADS plays an important role in the AV: at level 4 the system is capable of controlling and performing all tasks without any human intervention or supervision. The last and most advanced level is level 5, at which the AV is not only capable of performing all driving tasks but can also communicate with other devices [16], including traffic lights, signage and the road environment; performing this function requires 5G.
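The six levels can be summarized as a small lookup, shown here as an illustrative Python sketch. The enum and the two helper functions are hypothetical names introduced for this example; the comments simply paraphrase the level descriptions given above from [16].

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    LEVEL_0 = 0  # human performs the whole driving task
    LEVEL_1 = 1  # ADAS supports steering and braking control
    LEVEL_2 = 2  # ADAS controls steering and braking; driver must stay attentive
    LEVEL_3 = 3  # ADS drives under certain conditions; human takes over on request
    LEVEL_4 = 4  # ADS performs all tasks without human intervention or supervision
    LEVEL_5 = 5  # as level 4, plus communication with lights, signage and roads


def driver_must_monitor(level):
    """Levels 0 to 2: the human has to watch the road at all times."""
    return level <= AutomationLevel.LEVEL_2


def driver_is_fallback(level):
    """Level 3: the human only steps in when the ADS requests it."""
    return level == AutomationLevel.LEVEL_3
```

For instance, `driver_must_monitor(AutomationLevel.LEVEL_2)` is true while `driver_must_monitor(AutomationLevel.LEVEL_4)` is false, which mirrors the distinction drawn in the text between assisted and fully automated driving.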
Vehicle speed is also one of the important elements of the AV. To keep the AV at a safe speed and distance, Adaptive Cruise Control (ACC) is used. This system uses sensors to obtain distance information and has the vehicle act on the sensor signals, for example braking when it senses or predicts an imminent hazard or a vehicle ahead. The sensors pass the information to the vehicle's actuators, which then carry out control actions such as braking, acceleration and steering [16]. Furthermore, highly automated AV can adapt their speed automatically in response to signals from traffic lights and non-vehicular activities.

A. Superiority of AV
Statistics indicate that vehicle crashes usually happen because of human error: 90% of fatal vehicle accidents are due to human failure [21]. AV technologies therefore have the potential to reduce deaths caused by human error, and driverless cars are a future technology needed to scale down the deaths and injuries from car collisions. Crashes typically stem from interruptions to the driver's focus [21]. In addition, the website of the House Energy and Commerce Committee claims that driverless vehicles, also known as self-driving cars, can reduce traffic deaths by up to 90% and save up to 30,000 lives yearly. Apart from that, a report from the American Society of Civil Engineers (ASCE) states that Americans cannot avoid wasting time in traffic every day [22], spending 6.9 billion hours this way.
Furthermore, AV bring many benefits to people, especially by letting senior citizens and drivers with disabilities use vehicles safely. Besides reducing the number of accidents, the idea of AV is to help people in these groups travel effortlessly. AV enable more people to travel independently without worrying about safety [23]. Moreover, a study in [24] states that adopting AV technologies will make it much easier to go to work, attend meetings with clients and visit the doctor, especially for senior citizens and people with disabilities.
Other than that, according to [23], AV will bring many benefits in terms of travel time, commuting and congestion time, and will also cut fuel consumption, which benefits citizens, especially those who live in cities, towns and other crowded places. A country can save up to a trillion dollars by using AV, and can also reduce the manpower needed for law enforcement, saving the country even more money [25].
Safety and security are an important part of everyday life and are needed in many areas, including modern transportation. Moreover, the AV concept is currently leading the future security of the vehicle industry [20]. AV carry a Light Detection and Ranging (LiDAR) sensor [16,26,27], which enables the vehicle to avoid obstacles in an unknown environment and to classify dynamic objects on urban roads into cars, pedestrians, bicyclists and background [22]. LiDAR is divided into two types: scanning and non-scanning. Scanning LiDAR comes in different variants: besides single scanning, there is also multi-line scanning LiDAR. Non-scanning LiDAR [23] uses 3D-flash LiDAR.
The next feature in AV is Radio Detection and Ranging (Radar) [16,26,27]. Radar has proven effective for use on AV in the presence of fog and dust [25]. Besides that, radar is also designed to aid Off-road Light Autonomous Vehicle (OLAV) platforms in classification, map reading, and detection [27]. Furthermore, although LiDAR and cameras are very popular sensors, radar has significant advantages over them in terms of speed measurement capability, cost and target range [27]. AV also have image sensors such as rear-view cameras [27]. Rear-view cameras are used to detect obstacles behind the vehicle with the aid of a fisheye lens [28]. When it comes to security, the most important feature of an AV is the locking system [29].
Generally, the key is the most important item for a vehicle, needed to start the engine and to unlock the steering [26]. Vehicle locks began with pin tumbler locks, then changed to transponder key locks, and later the locking system became more advanced, locking and unlocking the vehicle using a Passive Keyless Entry and Start (PKES) system [29]. The PKES system is widely used in modern vehicles: users can lock and unlock the vehicle whenever they are near it without taking the key out of their pocket [29]. This system is very convenient for many users and makes life easier. Now, many manufacturers want to move vehicle locking security to another level with face recognition [30]. This paper presents a novel prototype of a safety system for AV, specifically a locking system, built with Keras, TensorFlow and Deep Learning (DL).

B. Challenges of AV
The research in [12] claims that even though an AV has been successfully programmed, an unpredicted flaw may still emerge later. In addition, the rapid advancement of technology leaves older equipment facing faulty-code issues. Having many advanced technologies in the AV does not mean the vehicle cannot be hacked. Hackers can compromise AV systems because the systems still have many vulnerabilities [16], as the technology is new to the world, and they may steal personal data through the AV.
The next drawback of AV is dysfunctional sensors [31]. Sensor failures often occur in AV [31], for example in the locking system. The locking system is a crucial part of automated vehicle safety [32], and AV users are very concerned about it. AV provide users with a modern locking system that maximizes the security and safety of the vehicle. Safety is inseparable from vehicles and is a very important element of them [16]. A common safety element is that the lighting turns on when the doors are unlocked, with the AV notifying the driver through integrated control of the lighting.
However, AV locking systems also run into problems in modern life: when the user's hands are full, when the key is lost or left behind, or when the key is duplicated by others; nor are they well suited to special cases such as disabled drivers [33]. All of these factors demand a new locking system for AV users.

C. Solution
As explained in the previous section, AI is widely used around the world. Basically, AI is a technique that enables a machine to mimic human behavior, for example the ability to sense, reason, engage and learn. AI operates autonomously and uses a variety of methods in which machines learn from data [7]. Recent advances in AI have affected many areas, for example voice recognition, Natural Language Processing (NLP), computer vision, robotics and motion, planning and optimization, and knowledge capture [34,35].
Looking deeper into AI, we find that it is supported by a family of algorithms known as Machine Learning (ML), and within ML there is another family called Deep Learning (DL) [36], [37]. Fig. 2 shows AI and its subfields.
ML works from raw data: once the data is obtained, a model must be trained on it, which is achieved using specific algorithms. AI offers many methods that make machines operate autonomously from the data provided [7], [37].
ML is designed to yield an independent computer program that learns by itself from data. ML algorithms are divided into a few categories [37]: reinforcement, supervised, unsupervised and semi-supervised learning. These types of learning are illustrated in Fig. 3. Nowadays AI innovation is led by ML techniques, notably Deep Neural Networks (DNN) [32,33,34,37,38], and these models are widely used as black boxes.
Currently, Deep Learning (DL) is a very popular approach and is widely used by researchers and developers in various fields. The idea of DL is to imitate human brain function in machines. The most popular algorithms in DL are; a) Long Short-Term Memory Networks Deep Boltzmann Machine (DBM) and f) Stacked Auto-Encoders. Moreover, DL focuses on larger and more complex datasets [7,11,38] such as video, audio, text and image.
As mentioned above, the human brain played a major part in the evolution of DL. The design and frameworks of DL resemble the structure and function of the human brain, which is capable of differentiating patterns and classifying diverse types of data [38]. The evolution of DL has made CNN a popular method in this field, including Face Recognition (FR) technology based on CNN [39,40]. FR is widely used nowadays in many fields, including unlocking mobile devices, and the FR process includes detection, alignment, feature extraction and the recognition task [11]. DL methods are able to handle huge datasets of faces and learn rich and compact representations of faces. In 2012, the ImageNet Large Scale Visual Recognition Challenge became well known because of the CNN research initiated by Alex Krizhevsky [41], whose work gained wide recognition after the competition. By using multiple processing layers and levels of feature extraction, DL is capable of learning representations of data [42]. Fully connected layers, normalization layers, convolutional layers and pooling layers are a few examples of the hidden layers [38] in the CNN algorithm.
With DL, we can produce a new locking system for AV using the FR technique [43]. AI researchers have begun to use DL as a tool for training on facial expressions [44]. DL is known as a powerful tool in the automation industry, and FR is one of its applications. FR is widely used in the military, the finance industry, daily life and public security [45,46], and FR is divided into two classes: one-to-many augmentation and many-to-one normalization [46]. FR is explained in detail in Section III. With the advancement of AI, this method can solve the locking problem discussed in the previous section.
A survey in [47] finds that the FR technique will become useful for AV users in terms of security, especially for locking systems. The technique requires a dataset for training before it can be shown to meet the expected result. Moreover, with the help of IoT, the user is notified [48] of any system failure.
Traditional methods recognize the human face through only one or two layers of representation, such as filtering responses. With the emergence of DL, the landscape and framework of the FR technique have been changed [49] and reshaped in all aspects: algorithm design, evaluation protocols, application scenarios and the training datasets.
FR needs three modules to run the system. The first is a face detector, which is needed to localize faces in images or videos. The next is the face landmark detector, and the last is the FR module itself, which incorporates face anti-spoofing [50]. There are two categories of FR [51]: face verification and face identification.
In this study we use a convolutional neural network (CNN) trained against ground-truth labels. This method is used to unlock an AV using the user's face. The model needs to be trained and validated on the respective datasets. Section III elaborates further on the methodology.
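The difference between the two FR categories can be illustrated with a short Python sketch. It assumes that each face has already been reduced to a numeric feature vector (an embedding) by some trained model; the three-element vectors, the cosine-similarity measure and the 0.8 threshold are hypothetical choices for illustration and are not taken from the cited works.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(probe, enrolled, threshold=0.8):
    """Face verification (one-to-one): is the probe the claimed person?"""
    return cosine_similarity(probe, enrolled) >= threshold


def identify(probe, gallery, threshold=0.8):
    """Face identification (one-to-many): who in the gallery, if anyone?"""
    best_name, best_score = None, threshold
    for name, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical embeddings; a real system would obtain them from a trained CNN.
gallery = {"owner": [0.9, 0.1, 0.3], "guest": [0.1, 0.8, 0.5]}
probe = [0.85, 0.15, 0.35]
stranger = [-0.5, 0.2, -0.7]
```

Here `verify(probe, gallery["owner"])` answers a yes/no question about one enrolled identity, whereas `identify(probe, gallery)` searches all enrolled identities and returns `None` when nobody is similar enough, which is the behavior a door lock would need for an unknown face.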

III. MACHINE LEARNING
The advancement of ML is demonstrated by methods that, with minimal human interference, are capable of analyzing data before building an analytic model. ML is also adept at digesting information from raw data, analyzing the form of the data and reaching decisions with the least human supervision [52,53]. Current ML therefore differs from traditional ML [54]: it is designed to learn from pattern recognition, learning from data rather than being programmed for specific tasks.
For a model to be independent, the iterative aspect of ML is crucial because ML works with new data. The more the model learns from the data, the smarter it becomes without any assistance from humans [55], and it can produce reliable results and repeatable decisions.
Normally, to understand data without human intervention, we need four kinds of algorithms that fall under ML: a) semi-supervised learning, b) reinforcement learning, c) supervised learning, and d) unsupervised learning [36]. Fig. 3 shows the ML types and the model categories in ML.

A. Supervised Learning
Supervised learning in ML is divided into two kinds of outcome: regression and classification. The aim of regression is to forecast a value based on the given training samples, as in house pricing, weather forecasting and market forecasting. The goal of classification is to identify patterns [36], as in fraud detection, image classification and diagnostics.
In supervised learning, there are three models [36]: the Classic Neural Network, known as the multilayer perceptron (MLP); the Convolutional Neural Network, known as CNN; and the Recurrent Neural Network, known as RNN. Fig. 4 shows a typical workflow for classification with supervised learning algorithms. Three steps are needed to classify data before generating the expected output. The first step is to clean the raw data through an extraction process to obtain quality, useful data. Secondly, the useful data, including the labels, is sent to the training stage, where the ML algorithm fits a model. Finally, to improve the model's accuracy, the model is adjusted based on the evaluation step, which gives insight into the feature extraction and learning stages. The data goes through the training process [56] over and over again until the desired accuracy is achieved. Once this is done, new data can be predicted easily.
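The three-step workflow above (clean, train, evaluate) can be sketched with a deliberately simple classifier. The example below uses a nearest-centroid model in plain Python purely as a stand-in; it is not one of the three neural network models named above, and the toy samples, labels and function names are hypothetical.

```python
def clean(samples):
    """Step 1: drop samples with missing feature values."""
    return [(x, y) for x, y in samples if None not in x]


def train(samples):
    """Step 2: fit one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Assign x to the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def evaluate(model, samples):
    """Step 3: fraction of held-out samples predicted correctly."""
    correct = sum(predict(model, x) == y for x, y in samples)
    return correct / len(samples)


raw_train = [((1.0, 1.2), "a"), ((0.8, 1.0), "a"), ((None, 0.9), "a"),
             ((4.0, 4.2), "b"), ((4.2, 3.9), "b")]
held_out = [((1.1, 0.9), "a"), ((3.8, 4.1), "b")]

model = train(clean(raw_train))
accuracy = evaluate(model, held_out)
```

In a realistic setting the evaluation result would feed back into feature extraction and training, and the loop would repeat until the accuracy is acceptable; the sketch performs a single pass only.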

B. Multilayer Perceptron (MLP)
The perceptron is a simple algorithm for binary classification. Many real cases are handled by this algorithm, which assigns each input to one of two categories, for example cat or not cat, fraud or not fraud. The MLP, in turn, is composed of more than one perceptron [57].
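As an illustration of a single perceptron acting as a binary classifier, the following Python sketch trains one with the classic perceptron learning rule on the logical AND function. The dataset, learning rate and epoch count are assumptions chosen so that the example is small and converges quickly.

```python
def predict(weights, bias, x):
    """Fire (1) when the weighted sum plus bias is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(data, epochs=20, lr=1.0):
    """Perceptron rule: nudge weights by lr * error * input after each sample."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [predict(weights, bias, x) for x, _ in and_gate]
```

After training, `outputs` reproduces the AND truth table. A single perceptron can only draw one straight decision boundary, which is why several are combined into an MLP for harder problems.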
An MLP involves several layers with different uses, commonly known as the input, hidden and output layers. The input layer's task is to receive the signal. The hidden layer is the core of the MLP, because it is the MLP's computational engine. Lastly, the output layer makes the prediction from the input [58]. A supervised learning technique called back propagation is used to train every node of the MLP, and every node except the input nodes uses a nonlinear activation function. The design of these layers is illustrated in Fig. 5.
To minimize the error, MLP training adjusts parameters such as the weights and biases. The MLP thereby learns the correlation between the inputs and the outputs. Hence, it is not surprising that the MLP can approximate the XOR operator and other nonlinear functions very well [58]. These are the advantages of MLP.
However, the MLP becomes inefficient when the number of parameters grows very high, which causes redundancy in high dimensions. Moreover, it disregards spatial information because it takes flattened vectors as inputs [58].
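To make the XOR claim concrete, the sketch below evaluates a tiny MLP with one hidden layer of two nodes. The weights are chosen by hand for illustration rather than learned by back propagation, and a step activation is used for readability; trained networks use smooth nonlinear activations so that gradients exist.

```python
def step(z):
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then activation."""
    return [step(b + sum(w * x for w, x in zip(ws, inputs)))
            for ws, b in zip(weights, biases)]


# Hand-picked (not learned) parameters: hidden node 1 acts as OR,
# hidden node 2 acts as AND, and the output computes OR and not AND.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [-0.5, -1.5]
OUTPUT_W = [[1.0, -1.0]]
OUTPUT_B = [-0.5]


def xor_mlp(x1, x2):
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


table = [xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The hidden layer is what makes this possible: XOR is not linearly separable, so no single perceptron can represent it, but two hidden nodes followed by an output node can.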

C. CNN
As discussed in the previous section, CNN play a paramount role in identifying and classifying images. Many researchers use CNN because of its strength in classifying images, such as identifying objects, individuals, tumors, street signs, faces and many other kinds of visual data. These algorithms can also perform image classification tasks such as photo search [34,58]. A CNN comprises three types of layers, all well known among researchers and developers: convolutional, pooling and fully connected layers. The function of the first type, the convolutional layer, is to extract many diverse features from the input images. A mathematical operation slides a filter of a specific size over the input image to do so. This layer is followed by another layer [34,58] known as the pooling layer. The pooling layer decreases the size of the convolved feature map, reducing the connections between layers and thereby the computational cost.
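The two operations just described can be written out directly. The following pure-Python sketch applies a single 2 x 2 filter to a 5 x 5 image and then max-pools the result; the image, the filter values and the function names are hypothetical, and stride 1 with no padding is assumed.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(feature_map[0]) - size + 1, size)]
            for i in range(0, len(feature_map) - size + 1, size)]


# A bright vertical stripe on a dark background.
image = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
vertical_edge = [[-1, 1], [-1, 1]]  # responds to dark-to-bright transitions

features = convolve2d(image, vertical_edge)  # 4 x 4 feature map
pooled = max_pool2d(features)                # 2 x 2 after pooling
```

The feature map is strongest where the stripe begins, and pooling shrinks the 4 x 4 map to 2 x 2 while keeping that strongest response, which is the size reduction the text attributes to the pooling layer.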
Finally, the last layer is called the fully connected layer. It is located before the output layer. It comes with weights, biases and neurons, which are used to connect different layers [34]. Fig. 6 shows a CNN architecture consisting of an input layer, 2 convolutional layers, 2 max-pooling layers, a fully connected layer and an output layer. All ML algorithms have their own advantages. For CNN, the advantage is the advances it has brought to Computer Vision (CV). CV algorithms are used in diverse technologies nowadays, including treatments for the visually impaired, security, drones, medical diagnosis and driverless vehicles [59]. Other than that, CNN is widely used in business-oriented tasks [58] such as Optical Character Recognition (OCR), which digitizes text and makes natural-language processing available on analog and handwritten documents, where the images are symbols to be transcribed.
However, CNN are naturally slower because of operations such as max pooling, and training becomes slower still when the CNN has many layers and the computer lacks a good GPU. In addition, the author of [59] states that CNN require a large dataset to process and train the neural network.
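For an architecture of the kind shown in Fig. 6 (two convolutional and two max-pooling layers ahead of the fully connected layer), it can help to trace how the feature-map size shrinks from layer to layer. The Python sketch below does this arithmetic under assumed settings: a 64 x 64 RGB input, 3 x 3 filters with stride 1 and no padding, 32 and 64 filters, and 2 x 2 pooling. None of these numbers are given in the text; they are placeholders.

```python
def conv_output(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output(size, window=2):
    """Spatial size after non-overlapping pooling."""
    return size // window


side, channels = 64, 3  # hypothetical 64 x 64 RGB input
layers = [("conv", 3, 32), ("pool", 2, None), ("conv", 3, 64), ("pool", 2, None)]

shapes = []
for kind, k, filters in layers:
    if kind == "conv":
        side = conv_output(side, k)
        channels = filters
    else:
        side = pool_output(side, k)
    shapes.append((side, side, channels))

# Number of inputs the fully connected layer receives after flattening.
flattened = side * side * channels
```

With these assumed settings the spatial size goes 64, 62, 31, 29, 14, so the fully connected layer receives 14 x 14 x 64 = 12544 values. Changing the input size or filter counts changes these numbers but not the method.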

D. Recurrent Neural Network (RNN)
Artificial Neural Network (ANN) alongside internal loops is called RNN [60] and this is a powerful technique. The interesting part about RNN is this technique is being used every day such as image recognition that capable to tell the picture's content, speech recognition, language translation, stock prediction and also driverless vehicle RNN indeed a powerful algorithm for prediction purposes because this algorithm can divide text and words into sequences, especially on sequence data modelling and the sequence data appear in www.ijacsa.thesai.org many patterns like text and audio. RNN predicts the data by having a concept of sequential memory. For example, as a human we can easily mention the alphabet in sequence because we already memorized it. However, when we want to mention the alphabet backwards it's pretty hard for us because we have not memorized it. A human brain capable to recognize sequence of patterns [60] by using sequential memory mechanism.
RNN replicates the concept of sequential memory of the human brain by using 3 layers called as an output, hidden and an input layer. In RNN exists a looping mechanism that can pass the previous information forward. This looping performs as an expressway to flow information from one step to another [60]. The previous input information is kept in the hidden state. As an example, by using RNN we can build a chat box and the chat box capable to classify intentions from the user inputted text [60].
As mentioned above, the hidden state is a representation of the previous inputs, and it is modified at every step. The modified hidden state contains data from all the earlier steps; the loop continues until there are no more words, and the final hidden state is then passed to the feed-forward layer, which gives the prediction. The forward-pass control flow of an RNN can be implemented with a for loop [61].
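The forward pass described above can be sketched with a plain for loop. This is a minimal illustration under stated assumptions: a single scalar hidden unit, a tanh activation, and hypothetical weight names (`w_x`, `w_h`, `bias`) and values chosen only for the example.

```python
import math


def rnn_forward(inputs, w_x, w_h, bias, h0=0.0):
    """Run a scalar RNN over a sequence and return the hidden state at every step."""
    hidden = h0
    states = []
    for x in inputs:  # one loop iteration per time step
        # The new hidden state mixes the current input with the previous state.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# Hypothetical weights and a three-step input sequence.
states = rnn_forward([1.0, 0.5, -1.0], w_x=0.8, w_h=0.5, bias=0.0)
final_state = states[-1]  # carries information from all earlier steps
```

Because each hidden state is computed from the previous one, the final state depends on every input in the sequence, which is the looping mechanism referred to in the text.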
The RNN architecture workflow is quite simple to understand. In an RNN, the present cell takes the previous information before it produces its output. Referring to Fig. 7, t represents the processed word, which also becomes an input once the word has been processed. All the text can be processed once the sequences are brought to a specified length. A sequence that is shorter than the specified length is padded until it reaches that length, whereas a sequence that is longer is truncated and the excess is discarded [62].
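The padding and truncation rule above can be illustrated with a short helper. This is a sketch only: the function name `pad_or_truncate`, the padding value 0 and the choice to pad and cut at the end of the sequence are assumptions made for the example.

```python
def pad_or_truncate(sequence, length, pad_value=0):
    """Return a copy of `sequence` with exactly `length` items."""
    if len(sequence) >= length:
        # Too long (or already the right size): keep the first `length` items.
        return list(sequence[:length])
    # Too short: fill with the padding value up to the required length.
    return list(sequence) + [pad_value] * (length - len(sequence))


short_seq = pad_or_truncate([4, 7], 4)            # padded to [4, 7, 0, 0]
long_seq = pad_or_truncate([4, 7, 1, 9, 3], 4)    # truncated to [4, 7, 1, 9]
```

Whether padding is added at the start or the end of a sequence is a design choice; either way, every sequence ends up with the same length so that they can be processed together.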
Training a neural network consists of three crucial steps. First, the network makes a prediction through a forward pass. Second, a loss function compares the predicted output with the ground truth and returns an error value; a large error value means the network is performing badly. Finally, the network uses the error value to perform back propagation, calculating the gradient for every node [61]. Here, a gradient is a value used to adjust the network's internal weights so that the network can learn; the larger the gradient, the larger the adjustment.
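The three steps can be traced on the smallest possible model. The sketch below assumes a one-weight linear model, a squared-error loss and plain gradient descent with a hypothetical learning rate of 0.1; none of these choices comes from the experiments in this paper.

```python
def train_step(weight, x, target, learning_rate=0.1):
    """Perform one training step for the model prediction = weight * x."""
    prediction = weight * x                     # step 1: forward pass
    loss = (prediction - target) ** 2           # step 2: compare with ground truth
    gradient = 2 * (prediction - target) * x    # step 3: gradient of the loss
    new_weight = weight - learning_rate * gradient  # larger gradient, larger change
    return new_weight, loss


weight = 0.0
losses = []
for _ in range(20):
    weight, loss = train_step(weight, x=1.0, target=2.0)
    losses.append(loss)
```

In this example the loss shrinks at every step and the weight approaches 2.0, the value that reproduces the target exactly.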
However, the hidden state holds only a short-term memory. This limitation is shared with other neural network architectures and is caused by the well-known vanishing gradient problem. According to [61], the problem arises because the error carried back from each step shrinks as it passes through the earlier steps, so information from early steps contributes little to learning; this behaviour is inherent in the back-propagation algorithm used to train and optimize neural networks.
The computation of an RNN model is slow. This makes the model difficult to train, and depending on the activation function used, processing long sequences becomes a very laborious task [63]. Consequently, exploding or vanishing gradients may occur.
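A small numerical example shows why long sequences lead to vanishing or exploding gradients. The sketch assumes, purely for illustration, that back propagation multiplies the gradient by the same constant factor at every time step; the function name `gradient_scale` and the factors 0.5 and 1.5 are hypothetical.

```python
def gradient_scale(recurrent_factor, steps):
    """Magnitude of a unit gradient after passing back through `steps` time steps,
    assuming every step multiplies it by the same factor."""
    return abs(recurrent_factor) ** steps


vanishing = gradient_scale(0.5, 30)   # factor below 1: gradient shrinks towards 0
exploding = gradient_scale(1.5, 30)   # factor above 1: gradient grows very large
```

With a factor below 1 the gradient reaching the earliest steps is almost zero, so those steps barely learn; with a factor above 1 the gradient grows without bound and makes training unstable.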

E. Unsupervised Learning
Unsupervised learning differs from supervised learning in many ways. It trains on samples without using labels. Unsupervised learning has two main outcomes: clustering and dimensionality reduction. Clustering uses the algorithm to find consistent patterns in the data, so that similar data points are grouped together and dissimilar data points fall into different clusters; applications include recommender systems, targeted marketing and customer segmentation. Dimensionality reduction finds suitable structure and patterns in the data [58,61]; applications include big data visualization, structure discovery and feature elicitation.
There are three unsupervised models [58]: the Self-Organizing Map (SOM), the Boltzmann Machine and the AutoEncoder.

F. SOM
A SOM is a neural network based on a dimensionality reduction algorithm; it commonly uses a two-dimensional discretized map to represent a high-dimensional dataset [58]. The dimensionality is reduced while retaining the topology of the data in the original feature space.
This type of neural network produces a low-dimensional, discretized representation of the input space of the training samples, called a map [34]. In this way the technique performs dimensionality reduction.
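As a rough illustration of how such a map is fitted, the sketch below shows two ingredients of SOM training: the search for the best-matching neuron and the update that moves that neuron towards the input sample. It is deliberately simplified: the neighbourhood update of nearby neurons on the grid is omitted, and the function names, learning rate and sample values are assumptions made for this example.

```python
def best_matching_unit(weights, sample):
    """Index of the neuron whose weight vector is closest to `sample`."""
    def squared_distance(w):
        return sum((wi - si) ** 2 for wi, si in zip(w, sample))
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i]))


def som_update(weights, sample, learning_rate=0.5):
    """Move the best-matching neuron towards the sample (neighbourhood omitted)."""
    bmu = best_matching_unit(weights, sample)
    weights[bmu] = [w + learning_rate * (s - w) for w, s in zip(weights[bmu], sample)]
    return bmu


# Hypothetical map of three neurons with two-dimensional weights.
neurons = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
winner = som_update(neurons, [0.8, 1.2])
```

Repeating this step over many samples pulls each neuron towards a region of the input space, so that similar inputs end up mapped to the same or nearby neurons.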
Dimensionality reduction and grid clustering make similarities in the data easy to observe, which makes this type of neural network easy to understand and explain [64]. However, clustering the inputs requires sufficient neuron weights; if the weights are not sufficient, the map will produce inaccurate results. The SOM model is illustrated in Fig. 8.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 10, 2021, www.ijacsa.thesai.org

G. Boltzmann Machines Model
Unlike other neural networks, the Boltzmann model has only two kinds of nodes, visible and hidden, and no output nodes [63]. For this reason, the model is regarded as non-deterministic.
This model can address two kinds of computational problem, namely optimization problems and search problems: the weights on the connections are fixed and are used to represent a cost function [63]. The model is illustrated in Fig. 9. The main disadvantage is that Boltzmann learning is significantly slower than backpropagation [63]. The model also raises several practical problems, such as how to adjust the weights, how long to collect statistics in order to calculate probabilities, how many weights to change at a time, how to adjust the temperature during simulated annealing, and how to decide when the network has reached equilibrium [63].

H. AutoEncoders Model
An autoencoder learns an encoding of a data set through unsupervised training. To learn an encoding that focuses on dimensionality reduction, the network is trained to ignore signal noise [65].
Learning to encode a data set is the main purpose of an autoencoder. In general, training teaches the network to reduce the dimensionality while disregarding the noise in the signal. In addition, an autoencoder provides the user with a model derived from the data rather than from predefined filters [65].
As a result, autoencoders give users a filter that may fit their data better [65]. However, Generative Adversarial Networks are much more effective than autoencoders at recreating an image. In addition, the reconstructed images become blurry as the complexity of the images increases [65]. The model is illustrated in Fig. 10.

I. Semi-Supervised Learning
Semi-supervised learning falls between unsupervised and supervised learning. In practice, labelling data is expensive because it requires hiring skilled experts who must be paid for their expertise [66].
Hence, semi-supervised algorithms are well suited to building models when only a few labelled data are available. Put simply, the unlabelled data still carry crucial information about the group parameters, even though the group they belong to is unknown.

J. Reinforcement Learning
Reinforcement Learning is a form of Machine Learning in which an agent, such as a robot, learns how to behave in an environment by evaluating the results of its actions. The robot receives a reward, in the form of points, for giving the correct response in each situation, and these points encourage it to take further actions. This process is modelled as a Markov Decision Process (MDP). On the other hand, data classification is of no use in reinforcement learning [67], because the robot learns by trial and error, supported by the MDP concepts of reward and punishment.
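Learning from reward and punishment can be illustrated with a deliberately simplified case. The sketch below assumes a single situation with two possible actions, fixed rewards of -1 (punishment) and +1 (reward), and an epsilon-greedy rule for choosing between trying something new and repeating the best-known action. It is not a full MDP, and the function name, the reward values and the parameters are all hypothetical.

```python
import random


def learn_by_trial_and_error(rewards, episodes=200, epsilon=0.1, seed=0):
    """Estimate the value of each action from the rewards it returns."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)   # estimated value of each action
    counts = [0] * len(rewards)     # how often each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))                        # explore
        else:
            action = max(range(len(rewards)), key=lambda a: values[a])  # exploit
        reward = rewards[action]
        counts[action] += 1
        # Incremental average of the rewards received for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


values, counts = learn_by_trial_and_error(rewards=[-1.0, 1.0])
best_action = max(range(len(values)), key=lambda a: values[a])
```

No labels are supplied at any point: the agent discovers which action is correct only from the rewards it collects.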

IV. METHODOLOGY
Based on the algorithms described in the sections above, we chose the CNN algorithm. A study found that CNN is the most popular neural network technique for working with image data. In addition, CNN is very good at extracting image features, which is what has made this neural network popular; since our research is based on image data, CNN is the best algorithm for building our model. Moreover, our research involves image and pattern recognition, which CNN addresses with techniques that other neural networks lack.
The interesting part of this paper is our methodology. First, a pre-processing step loads the test images before their classes are predicted with the trained model. The model is constructed in seven steps. First, we set up Google Colab; second, we import the libraries; third, we load and pre-process the data, which took around three minutes; fourth, we create a validation set. The fifth step is defining the model structure, which took around one minute. The sixth step is training the model, which took around five minutes, and the last step is making a prediction, which took around one minute. After that, we predict with the model using model.predict() and analyze its performance.
In this study, we use our own dataset as the training set. To structure our image data, we prepared two folders: set A, named the known folder, and set B, named the unknown folder. In each set we created three folders: a train folder, a test folder and a validation folder. The test folders contain the test images without labels, because the model is trained on the training images and then predicts a label for each test image.
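The folder structure described above could be created as in the following sketch. The folder names `known`, `unknown`, `train`, `test` and `validation` follow the description in the text, but their exact spelling, the helper name `create_dataset_layout` and the use of a temporary directory are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def create_dataset_layout(root):
    """Create <root>/<set>/<split> folders and return their relative paths."""
    created = []
    for set_name in ("known", "unknown"):               # set A and set B
        for split in ("train", "test", "validation"):   # three folders per set
            folder = Path(root) / set_name / split
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder.relative_to(root).as_posix())
    return sorted(created)


# A temporary directory stands in for the real dataset location.
with tempfile.TemporaryDirectory() as tmp:
    layout = create_dataset_layout(tmp)
```

This yields six folders in total, three for each of the two sets.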
We then split the model-building process into four stages. The first stage, loading and pre-processing the data, takes 30% of the total time; the second, defining the model architecture, takes 10%. The third stage, training the model, takes half of the total time. The last stage, estimating the performance, takes 10%.
A validation set should be constructed before the model is exposed to the test set, so that the model can be checked on unseen data. Training and validation should each be carried out on their own subset of the data. All the parameters applied are listed in Table I. The model was trained several times using the parameters in the table, and the results are presented in the next section of this paper.

A. Stage 1: Loading and Pre-processing Data
To begin with, we need a dataset to train on. Data is the new gold: ML and DL are not magic, and they need data to train. As mentioned above, the first stage is loading and pre-processing the data, which is a very important step in any research. A sufficiently large training set leads to better model performance, and the architecture of the model determines the pattern of data needed to create the validation set.
In this study we use our own dataset. As shown in Table II, our data consist of 255 images in two datasets, the known-face folder and the unknown-face folder. Very little pre-processing is needed because the images are small, which also makes the dataset easy to upload. First, we import the necessary libraries: matplotlib, array utilities and various modules associated with TensorFlow and Keras. In this research, all images are resized to a single size of 150 x 150 pixels for training.
Then we prepare the data. In this step we load and import our data; the load_img() function lets us choose which data to load. However, loading a very large amount of data has a negative impact. We set our images to 150 pixels wide and 150 pixels high. In this stage we also need to normalize our input data.
Our research uses images as input, with pixel values between 0 and 255. We normalize the data by dividing the pixel values by 255. To determine the number of neurons needed in the last layer, we first declare the labels as integers in the dataset and fix the number of classes. In our case, we use ImageDataGenerator(rescale = 1./255) because the pixel values are currently integers.
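The effect of the rescaling factor 1./255 can be shown on a few pixel values. The sketch below is plain Python and only mimics the arithmetic; the function name `rescale_pixels` and the sample values are assumptions made for the example.

```python
def rescale_pixels(pixels, factor=1.0 / 255):
    """Scale integer pixel values in [0, 255] to floats in [0.0, 1.0]."""
    return [[value * factor for value in row] for row in pixels]


# Hypothetical 2 x 3 grayscale patch with integer pixel values.
raw = [[0, 128, 255], [64, 32, 16]]
scaled = rescale_pixels(raw)
```

After rescaling, the darkest pixel maps to 0.0 and the brightest to 1.0, so all inputs to the network share the same small range.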

B. Stage 2: Defining the Model's Architecture
Defining the model architecture is a crucial step. In this stage we design the CNN model and estimate the number of convolutional layers and hidden layers that we need. We then choose the format used to build the model. We chose Keras because it offers several formats for creating models; the Sequential format is very popular, so we import it from Keras.
The first layer of our model is a convolutional layer, which operates on the input nodes. When implementing it in Keras, we must specify the number of filters, the size of the filters, the input shape, and the activation and padding that we need; in our case we use 64 filters of dimension 3 x 3. The activation we use is ReLU since it is the most common activation in DL; the activation and pooling can also be chained together.
Dropout layers were added to prevent overfitting. They work by eliminating some of the connections between the layers. We use Dropout(0.5), which drops 50% of the existing connections. After that we apply batch normalization, which normalizes the inputs heading to the next layer and ensures that the network consistently produces activations with the same distribution.
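The behaviour of a dropout rate of 0.5 can be illustrated in plain Python. The sketch assumes the common "inverted dropout" formulation, in which each activation is zeroed with the given probability and the survivors are scaled up by 1 / (1 - rate) during training; the function name and the fixed random seed are assumptions made so that the example is reproducible.

```python
import random


def dropout(activations, rate=0.5, seed=None):
    """Zero each activation with probability `rate`; scale survivors by 1/(1-rate)."""
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * keep_scale for a in activations]


# Apply dropout to 1000 hypothetical activations that are all equal to 1.0.
dropped = dropout([1.0] * 1000, rate=0.5, seed=42)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
```

Roughly half of the activations are removed on each pass, so the network cannot rely on any single connection, which is what discourages overfitting.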
Then, to learn more complex representations, we increase the number of filters by adding another convolutional layer; with more filters, more complex image patterns can be learned. We also used pooling layers. As discussed in the previous section, a pooling layer makes the image classifier more robust so that it can learn relevant patterns. However, pooling layers discard some data, so we do not use many of them; because our images are already small, we pool only twice. We then repeated these layers to give the network more representations to work with. Finally, we used fully connected layers and the sigmoid activation function.
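The effect of two convolution and pooling stages on a 150 x 150 image can be worked out with simple arithmetic. The sketch below makes several assumptions that the text does not state: three colour channels in the input, "same" padding with stride 1, 2 x 2 pooling, and 128 filters in the second convolutional layer. Only the 64 filters of size 3 x 3 and the 150 x 150 input come from the description above.

```python
def conv_output(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def conv_params(in_channels, filters, kernel=3):
    """Trainable parameters: one kernel per input channel plus a bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def pool_output(size, pool=2):
    """Spatial size after non-overlapping pooling."""
    return size // pool


size, channels = 150, 3  # 150 x 150 input; 3 colour channels is an assumption
shapes = []
for filters in (64, 128):  # 64 is from the text; 128 is a hypothetical second layer
    params = conv_params(channels, filters)
    size = pool_output(conv_output(size))
    channels = filters
    shapes.append((size, size, channels, params))
```

Under these assumptions the feature maps shrink from 150 x 150 to 75 x 75 and then to 37 x 37, which shows why only two pooling layers are practical for images of this size.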

C. Stage 3: Training of the Model
After defining the model architecture, we move to the training stage. In this stage we compile the model and specify the number of epochs to train for and the optimizer to use. To reach the lowest point of loss, the right optimizer is needed to tune the weights of the network, which is why we chose the Adam optimizer: it gives good results on most problems. We also combine the model with our chosen parameters and determine the metric to be used. We use Adam as the optimizing function and binary_crossentropy as the loss function during training. The model summary can then be printed to inspect the architecture; it contains information such as the type of each layer, the output shape and the number of parameters. To train the model we call the fit() function on the model and pass in the chosen parameters. The validation set is different from the testing set; in this stage, we make sure the test data are set aside and not used for training.
Training the model requires two data sets: the training images with their true labels, and the validation images with their true labels. The true labels of the validation images are not used for training, but to validate the model.
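To show what the binary cross-entropy loss measures, the sketch below computes it by hand for a few predictions. It is an illustration only: the clipping constant, the sample labels and probabilities, and the convention that label 1 stands for one class and 0 for the other are assumptions made for the example.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid taking log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
confident = binary_crossentropy(labels, [0.9, 0.1, 0.8, 0.2])   # mostly right
uncertain = binary_crossentropy(labels, [0.5, 0.5, 0.5, 0.5])   # pure guessing
```

Predictions that are close to the true labels give a small loss, while guessing 0.5 for every image gives a loss of ln 2; the optimizer adjusts the weights to push the loss towards zero.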

D. Stage 4: Estimating the Model's Performance
The last stage is estimating the model's performance. In this stage we can see the accuracy and loss results, and plot the loss and accuracy for each training and validation epoch. Finally, we test the model on a random train image from both sets; the flow of a train image in our work is shown in Fig. 11. To support the proposed approach, we compared our results with other face recognition methods from the literature, including ANN, support vector machine (SVM) and principal component analysis (PCA).

V. RESULT AND DISCUSSION
A few parameters were adjusted during the training and validation process. The training was done repeatedly in eighty sets, and Table III shows the experimental output. We also show our training and validation results in graph form in Fig. 12 and Fig. 13. Fig. 12 shows the training and validation accuracy graph for our research. Accuracy here means the proportion of correct predictions. The training accuracy is the accuracy obtained when the model is applied to the training data, whereas the validation accuracy is the accuracy on the validation data. As the graph shows, the training accuracy in our research reaches 100 percent before epoch 30. Fig. 13 shows the training and validation loss graph for our research, with the training loss as the blue line and the validation loss as the orange line. Training loss is the error on the training set, while validation loss is the error of the trained network on the validation set; as the number of epochs increases, both the training and validation errors drop. Our graph shows that the training error keeps falling and reaches its minimum before epoch 30, which means the network learns the data better and better.
Normally, the loss value is reduced after each training repetition. The model prediction is considered perfect if the loss is zero. As Table IV shows, our loss value keeps decreasing at every epoch, which means that our model's predictions are approaching this ideal. The next parameter is accuracy; the accuracy of a model is defined as the percentage of correct predictions for the test data. In our research, the accuracy of the model increases at every epoch and reaches 100 percent before epoch 30.
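The definition of accuracy used above can be written out directly. The sketch below is illustrative; the function name and the sample labels are assumptions and do not correspond to the values in our tables.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: three of the four predictions are correct.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Multiplying the result by 100 gives the percentage reported for each epoch; a value of 1.0 corresponds to 100 percent accuracy.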
Next is the validation loss parameter. Validation loss is the error of the trained network when the data are run through the validation set; as Table IV shows, the validation loss in our research decreases at every epoch, which means the error is reduced at every epoch. The last parameter is validation accuracy, which is the accuracy on the validation data. As the table shows, the validation accuracy in our research reaches 100 percent before epoch 30. Once training is completed, we feed the test images to the model that we have trained using the known and unknown face labels. The images in Fig. 16 and Fig. 17 are correctly identified by the trained model: Fig. 16 is predicted as a known face and Fig. 17 is predicted as an unknown face. Based on the comparison of face recognition approaches in Table V, all the methods mentioned are capable of face recognition, but their accuracy differs. The authors of [68] report that the highest accuracy achieved with the ANN approach is 80%. A study by [69] reports that a multi-class SVM achieves up to 96%. In addition, a study by [70] reports 77% similarity using PCA integrated with the Eigenface approach. From these comparisons, our proposed approach shows that CNN can achieve higher face recognition accuracy than the others. It can therefore be concluded that CNN improves face recognition performance because of the many features available to it.

VI. CONCLUSIONS
Convolutional neural networks have become the main technique in the field of face recognition. This research paper implements a CNN that is trained automatically on the given dataset to predict the classification of images.
The models predict an accurate output and give significant performance using face angles from the front, right (30-45 degrees) and left (30-45 degrees). This will lead to further development of face recognition using deep learning. From the model training experiments, we can conclude that our data set produces good results and that, with this data set, the model can differentiate between the two classes with high prediction accuracy. Hence, CNN is a good technique for face recognition technology.
However, we could obtain a better result by dividing our data set in a 70:20:10 ratio: 70% for the training folder, 20% for the validation folder and 10% for the test folder. We used 255 images for this experiment; a larger data set could give better predictions.
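For 255 images, the proposed 70:20:10 ratio does not divide evenly, so a rounding rule is needed. The sketch below assumes one possible rule, rounding the training and validation counts down and assigning the remainder to the test set; the function name and this rounding convention are assumptions, not part of the experiment.

```python
def split_counts(total, percentages=(70, 20, 10)):
    """Split `total` items into train, validation and test counts."""
    train = total * percentages[0] // 100        # round down
    validation = total * percentages[1] // 100   # round down
    test = total - train - validation            # remainder goes to the test set
    return train, validation, test


counts = split_counts(255)
```

With this rule, 255 images are divided into 178 training, 51 validation and 26 test images.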

VII. FUTURE WORK
This research should be developed further with a larger dataset to ensure that it can be implemented in real vehicles for safety purposes. Moreover, this paper aims to bring face recognition technology into scientific and daily-life applications for locking and unlocking autonomous vehicles. In the near future, face recognition technology will become a common approach in many applications. During the current Covid-19 pandemic, people wear masks whenever they go out in public, and it is hard to identify a person whose mask covers half of the face. Researchers should therefore consider users in a pandemic situation. Furthermore, additional algorithms need to be applied to improve the user experience, especially for disabled users.

... the Government of Malaysia, which provides the MyBrain15 program, for sponsoring this work under the self-fund research grant and L00022 from the Ministry of Science, Technology and Innovation (MOSTI).