Camera Mouse Including "Ctrl-Alt-Del" Key Operation Using Gaze, Blink, and Mouth Shape

This paper presents a camera mouse system with an additional feature: a "CTRL-ALT-DEL" key. Previous gaze-based camera mouse systems consider only how to obtain the gaze and how to make a selection. We propose a gaze-based camera mouse that also provides the "CTRL-ALT-DEL" key. An infrared camera is mounted on top of the display while the user looks ahead. User gaze is estimated from eye gaze and head pose. Blink and mouth detection are used to create the "CTRL-ALT-DEL" key. Pupil knowledge is used to improve the robustness of eye gaze estimation across different users. In addition, a Gabor filter is used to extract face features, and skin color information together with these face features is used to estimate head pose. Experiments were carried out for each method, and the results show that each method performs well. With this system, users can troubleshoot the camera mouse by themselves, which makes the camera mouse more capable.

Keywords: Camera Mouse; User Gaze; Combination keys function


I. INTRODUCTION
A camera mouse, a pointing device that uses only a camera, has recently been gaining a great deal of attention. A camera mouse works like an ordinary mouse, but it requires no additional hardware other than a camera. It has become an important topic because it can be used by people with disabilities. Typically, a mouse is controlled by hand, whereas a camera mouse can be controlled without touching anything; only a web camera is required. The camera mouse offers the same functions as an ordinary mouse, and it can also be used by people who cannot move their limbs. Almost all types of mouse are used as pointing tools, and such a pointing tool allows people with disabilities to use computers and other devices.
Typically, the output of a mouse can be divided into (1) pointing value, (2) left click (making a decision), (3) right click (showing menus), and (4) scroll. The problem with existing camera mouse methods is that they still cannot replace all functions of an ordinary mouse; most of them yield only pointing and decision values. Also, because users cannot use their hands, it is necessary to provide a "CTRL-ALT-DEL" key in order to end the task of a program that is not responding. This function allows users to troubleshoot by themselves. Moreover, a camera mouse has to be accurate and robust against different users, head movement, and illumination changes. Published camera mouse methods can be broadly classified into the following approaches [6] (for example, methods that select a face image and then follow its motion).
A mouse that uses a web camera has been proposed [1]. The mouse pointer follows a face area selected beforehand, and clicking is performed when the pointer stops for a specific time. A camera mouse that lets users move the cursor with their head has been proposed [2]. A camera mouse for people with severe disabilities has been proposed [3]. The system tracks the computer user's movements with a video camera and translates them into movements of the mouse pointer on the screen. Its visual tracking algorithm crops an online template of the tracked feature from the current image and finds the new location of the feature by template matching. A mouse pointer controlled by gesture has been proposed [4]; this system is controlled by moving the user's hand or face. A camera mouse controlled by user gaze has been proposed [5]. This system uses eye tracking to find the eye location, and infrared illumination is used to estimate the dark pupil location, from which user gaze is estimated. A camera mouse driven by a 3D-model-based visual face tracking technique has been proposed [6]. Human facial movement is decomposed into rigid movement, e.g., rotation and translation, and non-rigid movement, such as opening and closing of the mouth and eyes, and facial expressions. A camera mouse based on face and eye blink tracking has been proposed [7]. Face features are tracked and eye blinks are detected using a combination of optical flow and Harris corner detection. A camera mouse based on eye gaze has been proposed [8]; the mean shift algorithm is used to track the detected eyes. Motion-based camera mice are easy to develop. Their weakness is that they require user movement, so they are difficult to use for users who cannot move their limbs. Although the gaze-based approach is more difficult to develop, it has high sensitivity and is convenient.
In this paper we propose a camera mouse based on user gaze with the combination key "CTRL-ALT-DEL". The objective of this research is to allow troubleshooting when the computer is not responding by providing a "CTRL-ALT-DEL" key. Moreover, we improve accuracy and robustness against different users, illumination changes, and head movement. The proposed system uses a single IR camera mounted on top of the screen display. The IR camera improves robustness against illumination changes. User gaze, which controls the pointing value, is estimated from eye gaze and head pose. To create the "CTRL-ALT-DEL" key, we use open-closed mouth detection. When the system runs for the first time, a Haar classifier is used to detect the eye locations. After the eye locations are found, the rough position of the mouth is estimated from them. The detailed positions of the mouth and eye corners are estimated using a Gabor filter. After all face features are found, the Lucas-Kanade optical flow method is used to track them. The "CTRL-ALT-DEL" key is created from the open-closed shape of the mouth. When the mouth is open, SHIFT is selected. When the mouth is open and the left eye is closed, CTRL, ALT, and DEL are selected. To avoid disturbance (noise when the user is talking), these conditions are accepted only when they are completed within two seconds. Robustness against different users is improved by using pupil knowledge. Pupil knowledge such as shape, size, and color is used as the first knowledge. When the first knowledge fails, the sequence of previous pupil locations is used as the next knowledge. Last, when all steps fail, the pupil is estimated from its motion. Robustness against head movement is improved by also basing the gaze estimate on head pose.

The camera mouse system based on user gaze is proposed in Section II, where the blink and mouth detection methods, which are important components of the proposed system, are explained in detail. Section III describes the experimental method and results and shows that the proposed system works well. Finally, concluding remarks related to the experimental results are given.

II. PROPOSED METHOD
The most important problem for the proposed camera mouse system is the combination keys function. Because the system is operated hands-free, it is important to provide the combination key "CTRL-ALT-DEL" so that the user can recover from an undesired computer condition. When the PC system halts, all programs halt, and the only way out is to press CTRL-ALT-DEL to end the program that is not responding. Although the camera mouse program itself may also halt, it is possible to set up our program in a separate memory location so that it always keeps running. Therefore, the proposed camera mouse system aims at providing the combination key "CTRL-ALT-DEL" using an eye-mouth combination.

A. System Configuration
The configuration of the proposed system is shown in Figure 1. As shown in the figure, the user looks at the screen while the camera is mounted on top of the screen display. Only this camera is required. A Dell Optiplex 755 computer with a Core 2 Quad 2.66 GHz CPU and 2 GB of RAM is used. We developed our software in C++ with Visual Studio 2005 and the OpenCV image processing library, which can be downloaded free of charge from its website. The specification of the IR camera is shown in Table 1. The pointing value is determined from head pose and eye gaze.

B. Key Candidate Selection and Determination
A combination of eye gaze and head pose is used to obtain user gaze. Key determination is done by blinking. In general, all Windows programs can be used. For typing, Microsoft Word can be combined with the Windows screen keyboard (Fig. 2): the user looks at the desired key and blinks to select it. In special conditions, combination keys need to be provided. The CTRL-ALT-DEL combination can be entered using eye and mouth shape. Unfortunately, users may also open their mouth unintentionally, for instance when they become sleepy or when they say something. To avoid conflicts with these situations, our mouth shape command is defined as a sequence of states.

[Fig. 1. System configuration. Block labels: User, IR Camera, Eye detection and tracking, Eye Gaze Estimation, Head Poses Estimation, Blinking detection, Open-Closed Mouth detection.]

In our system, the "CTRL-ALT-DEL" key is issued when the mouth shape goes through closed-open-closed within two seconds and the sequence ends with a left-eye blink.
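The closed-open-closed mouth sequence followed by a left-eye blink can be read as a small state machine. The following is a minimal sketch in Python, not the authors' implementation. It assumes that per-frame mouth and eye states are already available, and that the two-second window starts when the mouth opens; the function name, the state names, and the observation format are hypothetical.

```python
def detect_ctrl_alt_del(observations, window=2.0):
    """Return True if the gesture is seen in `observations`.

    Each observation is (time_in_seconds, mouth_open, left_eye_closed).
    Gesture (assumed reading of the text): mouth closed -> open -> closed,
    then a left-eye blink, all within `window` seconds of the mouth opening.
    """
    state = "WAIT_CLOSED"
    start = None  # time at which the mouth opened
    for t, mouth_open, left_eye_closed in observations:
        # Abandon a sequence that has taken too long (e.g. talking, yawning).
        if start is not None and t - start > window:
            state, start = "WAIT_CLOSED", None
        if state == "WAIT_CLOSED":
            if not mouth_open:
                state = "WAIT_OPEN"
        elif state == "WAIT_OPEN":
            if mouth_open:
                state, start = "WAIT_RECLOSE", t
        elif state == "WAIT_RECLOSE":
            if not mouth_open:
                state = "WAIT_BLINK"
        elif state == "WAIT_BLINK":
            if mouth_open:
                # Mouth reopened before the blink: restart the timing.
                state, start = "WAIT_RECLOSE", t
            elif left_eye_closed:
                return True
    return False
```

Requiring an ordered sequence plus a time limit is what lets the command be told apart from talking, where the mouth opens and closes repeatedly without the closing blink.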

C. Gaze Estimation
To analyze eye gaze, the eye must first be detected and tracked. Figure 3 shows the flow of eye detection and tracking. Published eye detection approaches can be broadly classified into two categories: active infrared (IR)-based approaches [10], [11], [19] and traditional image-based passive approaches [12], [13], [14], [15], [16], [17], [18]. Eye detection based on the Hough transform has been proposed [12], [13], [14]; the Hough transform is used to find the pupil. Eye detection based on motion analysis has been proposed [10]. Infrared lighting is used to capture the physiological properties of the eyes (physical properties of the pupils along with their dynamics and appearance) in order to extract regions containing eyes. Motion analysis techniques such as Kalman filtering and mean shift tracking are used, and a support vector machine classifier is used for pupil verification. Eye detection using an adaptive threshold and a morphological filter has been proposed [15]; the morphological filter is used to eliminate undesired eye candidates. Hybrid eye detection using a combination of color, edge, and illumination has been proposed [18]. Eye detection based on motion analysis [10], [11] fails when the eyes are closed or occluded.
We cannot use this method in our system, because our pupil detection has to work when the eye shape changes. When the eyeball moves, the eyebrow and pupil also move and yield two kinds of motion. Because other eye components move together with the eye, it becomes ambiguous whether a motion belongs to the eyebrow or the pupil. Eye detection based on the Hough transform [12], [13], [14] is susceptible to noise. Eye detection using a morphological filter [15], which eliminates noise and undesired eye candidates, is not robust against variation between users; the morphological method does not work when noise has the same shape and size as the pupil. In eye detection based on template matching [16], [17], segments of an input image are compared with previously stored images, and the similarity of each counterpart is evaluated using correlation values.
The problem with simple template matching is that it cannot deal with eye variations in scale, expression, rotation, and illumination. The use of multi-scale templates helps somewhat with this problem. When the eye detector relies only on eye appearance [18], it fails when the eye is not visible or is closed. This method is also affected by variation in users' skin color. Basically, our system detects the eye using the deformable template method [20]. This method matches an eye template against the source images. We create the template by applying a Gaussian smoother to the image. The deformable template method detects the rough position of the eye. Its benefit is that it takes less time than classifier methods.
Although this method is faster than classifier-based methods, it is less robust. In our system, when the deformable template fails to detect the eye position, the Viola-Jones classifier detects the eye; that is, the Viola-Jones method is used only when the deformable template fails. The Viola-Jones classifier employs AdaBoost at each node of the cascade to learn a multitree classifier with a high detection rate at the cost of a low rejection rate. To apply the Viola-Jones classifier in our system, we use the Viola-Jones function in OpenCV [21]. Before using the function, an XML file must be created. Training samples (face or eye images) must be collected. There are two kinds of samples: negative samples, which correspond to non-object images, and positive samples, which correspond to object images. After an image is acquired, OpenCV searches for the face center location and then for the eye center location. By combining the deformable eye template with the Viola-Jones method, the eye location becomes easier to detect. The advantages of this combination are that it is fast and robust against changes in circumstances.

Fig. 3. Eye detection flow
After the rough eye position is found, the subsequent eye position is tracked using the Lucas-Kanade optical flow method, so the eye detection does not have to be repeated. Eye gaze is estimated from the pupil location. Because the system relies on the pupil location, accurate and robust pupil detection is required. The pupil is detected using knowledge about it. The flow of pupil detection is shown in Figure 4. Three types of knowledge are used. We use pupil size, shape, and color as the first knowledge. First, an adaptive threshold is applied to the eye image. The threshold value T is obtained from the average pixel value (mean) μ of the eye image; we set the threshold 27% below the mean.
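The thresholding step can be sketched as follows, reading "27% below the mean" as T = 0.73 μ. This is a minimal illustration in pure Python rather than the authors' OpenCV code. The image is assumed to be a list of rows of grayscale values, and picking the largest dark blob is a simplification of the size-and-shape check described in the text; all function names are hypothetical.

```python
from collections import deque

def pupil_threshold(eye, ratio_below_mean=0.27):
    """Threshold T set a fixed ratio below the mean gray level of the eye image."""
    pixels = [p for row in eye for p in row]
    mean = sum(pixels) / len(pixels)
    return mean * (1.0 - ratio_below_mean)

def binarize(eye, threshold):
    """True marks a dark (candidate pupil) pixel."""
    return [[p < threshold for p in row] for row in eye]

def largest_dark_blob(mask):
    """Return (size, (row, col) centroid) of the largest 4-connected blob, or None."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = None
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if best is None or len(blob) > len(best):
                best = blob
    if best is None:
        return None
    cy = sum(y for y, _ in best) / len(best)
    cx = sum(x for _, x in best) / len(best)
    return len(best), (cy, cx)
```

In a fuller version, each blob's size and roundness would be compared against expected pupil values instead of simply keeping the largest one.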
The pupil appears as black pixels in the image. In the first case, when the pupil appears clearly in the eye image, the adaptive threshold alone is able to detect the pupil location: the pupil is marked as one black circle in the image. Using connected component labeling, we can easily estimate the pupil location. When noise appears in the image, we can distinguish it from the pupil by estimating its size and shape. In the next case, when the eye looks to the right or left, the form of the eye changes, which makes the pupil hard to detect. Noise and interference between the pupil and the eyelid appear, producing other black pixels that have the same size and shape as the pupil. To solve this problem, we use the previous pupil location: a reasonable pupil location is always in the neighborhood of the previous one. In the last case, when all steps above fail to detect the pupil location, we estimate it from its motion. This situation happens when the pupil's black pixels are mixed with other black pixels or when there are no black pixels at all in the image. We give this knowledge the lowest priority to avoid ambiguity between the motion of the pupil and that of other eye components. We monitor the pupil location using its previous location, and we adopt a Kalman filter [22] to estimate the pupil location.

A simple eye model is defined in Figure 5. The eyeball is assumed to be a sphere with radius R. Actually, it is not quite a sphere, but this discrepancy does not affect our methodology. The pupil is located at the front of the eyeball.
The distance from the center gaze to the current gaze is r. Gaze is defined as the angle $\theta_{eye}$ between the normal gaze and r. The relation between R, r, and $\theta_{eye}$ is expressed in equations (3) and (4). The radius of the eyeball ranges from 12 mm to 13 mm according to anthropometric data [23]; hence, we use the anatomical average assumed in [24] in our algorithm. Once r has been found, the gaze angle $\theta_{eye}$ is easily calculated; equation (4) shows that eye gaze is calculated from the value of r. To measure r, the normal gaze must be defined. In our system, the user should look at the center when the system starts running, and the pupil location at that time is recorded as the normal gaze position. To avoid errors when acquiring the normal gaze, the normal gaze position is verified by comparing it with the center of the two eye corners.
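Equations (3) and (4) are not reproduced in this text. As a minimal sketch, the spherical eye model suggests the relation r = R sin θ_eye, so that θ_eye = arcsin(r / R); this form is an assumption here, not a quotation of the paper's equations. The radius value of 12.5 mm is likewise only the midpoint of the 12 mm to 13 mm range quoted above, and the function name is hypothetical.

```python
import math

# Assumption: midpoint of the 12-13 mm range; the value used in [24] is not given here.
EYEBALL_RADIUS_MM = 12.5

def gaze_angle_deg(pupil_mm, normal_gaze_mm, eyeball_radius_mm=EYEBALL_RADIUS_MM):
    """Gaze angle along one axis, assuming r = R * sin(theta).

    `pupil_mm` and `normal_gaze_mm` are pupil positions along the same axis,
    already converted from pixels to millimetres.
    """
    r = pupil_mm - normal_gaze_mm
    # Clamp so that measurement noise cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, r / eyeball_radius_mm))
    return math.degrees(math.asin(ratio))
```

For example, under these assumptions a pupil displacement of half the eyeball radius corresponds to a gaze angle of about 30 degrees.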

D. Head Pose Estimation
Head pose estimation is intrinsically linked with visual gaze estimation. Head pose provides a coarse indication of gaze that can be estimated in situations where the eyes of a person are not visible. When the eyes are visible, head pose becomes a requirement for accurately predicting gaze direction.
The conceptual approaches that have been used to estimate head pose are categorized in [25] into (1) appearance template methods, (2) detector array methods, (3) nonlinear regression methods, (4) manifold embedding methods, (5) flexible models, (6) geometric methods, (7) tracking methods, and (8) hybrid methods. Our proposed head pose estimation is based on geometric methods. Face features such as the eye locations, the ends of the mouth, and the boundary of the face are used to estimate head pose. After the eye locations are found, the mouth location is presumed from them. The two ends of the mouth are searched for using a Gabor filter. The boundary of the face is searched for using skin color information. From the eyes and the two ends of the mouth, the face center is estimated. Head pose is estimated from the difference between the face center and the face boundary. The head pose estimation flow is shown in Figure 6.
Head pose can be calculated with equations (5) to (8), where $f_c$ is the face center, $eyeL$ and $eyeR$ are the locations of the left and right eyes, $mthL$ and $mthR$ are the left and right mouth locations, $f_h$ is the head center, $f_{top}$, $f_{btm}$, $f_L$, and $f_R$ are the top, bottom, left, and right head boundary locations, $r_{ch}$ is the distance between the face and head centers, $R_{head}$ is the radius of the head, and $\theta_{head}$ is the head pose angle.
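Equations (5) to (8) are not reproduced in this text. The sketch below shows one plausible geometric reading of the definitions above, and every formula in it is an assumption: the face center is taken as the mean of the four features, the head center as the midpoint of the boundaries, the head radius as half the boundary extent, and the angle as arcsin(r_ch / R_head), computed separately for yaw and pitch. The function names are hypothetical.

```python
import math

def _clamped_asin_deg(ratio):
    """arcsin in degrees, with the ratio clamped to [-1, 1]."""
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))

def head_pose_deg(eye_l, eye_r, mth_l, mth_r, f_top, f_btm, f_left, f_right):
    """Return (yaw, pitch) in degrees from image coordinates.

    Points are (x, y) tuples; boundaries are scalar coordinates in pixels.
    """
    features = (eye_l, eye_r, mth_l, mth_r)
    # Face center f_c: mean of the four face features (assumed form).
    fc_x = sum(p[0] for p in features) / 4.0
    fc_y = sum(p[1] for p in features) / 4.0
    # Head center f_h: midpoint of the skin-color boundary (assumed form).
    fh_x = (f_left + f_right) / 2.0
    fh_y = (f_top + f_btm) / 2.0
    # Head radius along each axis: half of the boundary extent (assumed form).
    radius_x = (f_right - f_left) / 2.0
    radius_y = (f_btm - f_top) / 2.0
    yaw = _clamped_asin_deg((fc_x - fh_x) / radius_x)
    pitch = _clamped_asin_deg((fc_y - fh_y) / radius_y)
    return yaw, pitch
```

With this reading, a face center that coincides with the head center gives zero yaw and pitch, and the angle grows as the features shift toward one side of the head boundary.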

E. Mouth Detection
The open-closed mouth shape is one of the features used to create the "CTRL-ALT-DEL" key. When the left eye is closed while the mouth is open (within two seconds), the system sends the "CTRL-ALT-DEL" command. This key combination is important when the computer halts. Our mouth shape detection is based on skin color information. Normalized-RGB-based skin color detection is used to extract the mouth [26]. Because the mouth is surrounded by skin, the mouth shape is easy to detect by removing the skin color.
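The idea of extracting the mouth by removing skin-colored pixels can be sketched as follows. Normalized RGB divides each channel by the sum R + G + B, which reduces the dependence on brightness. The threshold ranges below are illustrative placeholders, not values from the paper or from [26], and the function names are hypothetical.

```python
def normalized_rg(pixel):
    """Return the normalized (r, g) chromaticity of an (R, G, B) pixel."""
    red, green, blue = pixel
    total = red + green + blue
    if total == 0:
        return 0.0, 0.0
    return red / total, green / total

def is_skin(pixel, r_range=(0.36, 0.465), g_range=(0.28, 0.363)):
    """Skin test in normalized rg space; the ranges are placeholder assumptions."""
    r, g = normalized_rg(pixel)
    return r_range[0] <= r <= r_range[1] and g_range[0] <= g <= g_range[1]

def non_skin_fraction(region):
    """Fraction of pixels in a mouth region that are not skin colored.

    A larger fraction suggests an open mouth, since more of the region is
    covered by the mouth itself instead of the surrounding skin.
    """
    pixels = [p for row in region for p in row]
    non_skin = sum(1 for p in pixels if not is_skin(p))
    return non_skin / len(pixels)
```

An open-closed decision could then compare this fraction with a threshold tuned on sample images; that threshold would also have to be chosen experimentally.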

F. Blink Detection
For blink-based applications, accuracy and success rate are important. Many techniques based on image analysis have been explored for blink detection. Blink detection based on the Hough transform, used to find the iris location, has been proposed [27]. A detector based on eye corners and iris center has been proposed [28]. A blink detection system that uses spatio-temporal filtering and variance maps to locate the head and find the eye feature points, respectively, has been proposed [29]. A method based on an open-eye template has been proposed [30]. A boosted classifier that detects the degree of eye closure has been proposed [31]. A method based on Lucas-Kanade and normal flow has been proposed [32].
The methods above still do not achieve a perfect success rate. A common problem is that they are difficult to implement in a real-time system because of their computation time. For these reasons, we propose a blink detection method that can be implemented in a real-time system and that aims at a perfect success rate. Basically, our method measures the degree to which the eye is open or closed by estimating the distance between the top arc of the eye and the bottom arc. A Gabor filter is used to extract the arcs of the eye. Using connected component labeling, the top and bottom arcs are detected and measured. The distance between them is used to detect blinking. The blink detection flow is shown in Figure 7.
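The arc-distance criterion can be illustrated with a short sketch. It assumes that the vertical positions of the top and bottom eye arcs have already been extracted for each frame (for example, by the Gabor filter and labeling steps above). The largest distance seen so far is used as the open-eye reference, and the 30% closing ratio is a placeholder; both choices and the function name are assumptions, not details from the paper.

```python
def count_blinks(arc_pairs, closed_ratio=0.3):
    """Count blinks in a sequence of (top_arc_y, bottom_arc_y) pairs.

    The eye is treated as closed when the arc distance drops below
    `closed_ratio` times the largest distance observed so far. A blink is
    counted when the eye reopens after having been closed.
    """
    reference = 0.0   # largest arc distance seen so far (open-eye estimate)
    closed = False
    blinks = 0
    for top_y, bottom_y in arc_pairs:
        distance = max(0.0, bottom_y - top_y)
        reference = max(reference, distance)
        if reference == 0.0:
            continue  # no open eye observed yet
        is_closed = distance < closed_ratio * reference
        if closed and not is_closed:
            blinks += 1
        closed = is_closed
    return blinks
```

Using a running reference avoids a separate calibration step, at the cost of assuming that the eye is fully open at least once early in the sequence.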

III. EXPERIMENTS
To measure the performance of the proposed system, each functional block is tested separately. The experiments cover eye gaze estimation, head pose estimation, blink detection, and open-closed mouth detection.

A. Gaze Detection
The eye gaze estimation experiments cover the eye detection success rate for different users, the influence of illumination, the influence of noise, and accuracy. The eye detection experiment is carried out with six users of different races and nationalities: Indonesian, Japanese, Sri Lankan, and Vietnamese. We collect data from each user while the user makes several eye movements, such as looking forward, right, left, down, and up. Data are collected from three Indonesian users of different races. Two of the Indonesian users have wide eyes with clear pupils; the numbers of images are 552 and 668 samples. The third Indonesian user has slanted eyes, and the pupil is not so clear; the number of images for this user is 882 samples. We also collected data from a Sri Lankan user, who has dark skin and thick eyelids; the number of images is 828 samples.

The first experiment investigates pupil detection accuracy and its variance across users. We count the successful samples and then compute the success rate. Our method is compared with an adaptive threshold method and a template matching method. The adaptive threshold method combines adaptive thresholding with connected component labeling. The template matching method uses a pupil template as a reference and matches it against the images. The results are shown in Table 2. They show that our method has a higher success rate than the others and that it is robust across users (the variance is 16.27).
The next experiment measures the influence of illumination changes on the eye detection success rate, that is, the performance under different illumination conditions. An adjustable light source is applied, and the degradation of the success rate is recorded. To measure the illumination, we used an LM-8000 multi-functional environmental detector. The experimental data are shown in Figure 8.
The experimental data show that our proposed method works even under zero illumination (a dark place). This ability is due to the IR light source, which automatically adjusts the illumination. Our proposed method fails when the illumination is too strong, which may happen when sunlight shines directly into the camera.

The next experiment measures the influence of noise on the eye detection success rate. Normally distributed random noise with a mean of 0 is added to the image, and its standard deviation is varied. Robustness against noise is shown in Figure 9. The experimental data show that our system is sufficiently robust against noise.

Fig. 9. Noise influence against eye detection success rate

The next experiment measures the accuracy of eye gaze estimation. Users look at several points, shown in Figure 10, and the error is calculated from the designated angle and the real angle. The errors are shown in Table 3. These experimental data show that our proposed system has an accuracy of 0.64°.
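The noise test described above adds zero-mean, normally distributed noise of varying standard deviation to the eye images. A minimal sketch of that step is shown here, assuming 8-bit grayscale images stored as lists of rows; the function name and the optional seed parameter are hypothetical, and the standard deviations used in the paper's experiment are not stated in this text.

```python
import random

def add_gaussian_noise(image, sigma, seed=None):
    """Return a copy of `image` with zero-mean Gaussian noise added.

    `image` is a list of rows of integer gray levels in [0, 255].
    Results are rounded and clipped back to the valid 8-bit range.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy.append([
            min(255, max(0, round(p + rng.gauss(0.0, sigma))))
            for p in row
        ])
    return noisy
```

Sweeping `sigma` over a range of values and re-running the detector on each noisy copy yields a success-rate curve of the kind reported in Figure 9.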

B. Head Pose Estimation
This experiment measures the accuracy of head pose estimation in the yaw and pitch directions. To measure head pose, the head center is estimated first. The head center is estimated using skin color information; the RGB normalization method is used to detect skin color (Fig. 11). Next, the face center is estimated using four face features (the two ends of the eyes and the two ends of the mouth). The face features are extracted using a Gabor filter (Fig. 12).
Using the head center and the average position of the four face features, head pose is estimated (Fig. 13).
The complete results of the head pose estimation experiment are shown in Table 4. They show that the error increases as the yaw angle increases. The reason is that when the head turns by a large angle, the ear becomes visible. Because the ear has the same color as the skin, it shifts the estimated head center and introduces error.

C. Blink Detection
This measurement is conducted by counting successful and failed detections when the position of the user's face is constant but the eye gaze changes. The IR camera is placed on top of the monitor display, and the user's face looks forward. The distance between the camera and the user is 30 cm. This experiment is shown in Figure 14. We recorded the success rate while the user's face position was constant and the eye gaze angle changed. The results are shown in Table 6. They show that our method gives a 100% success rate when the pitch angle is below 15° and the roll angle is at most 30°. When the pitch angle of the eye gaze exceeds 15°, the success rate decreases, because the method fails to distinguish between open-eye and closed-eye images. The change in the eye images as the eye gaze changes is shown in Figure 15. The figure shows that as the pitch angle becomes larger, the eye looks like a closed eye, which is unfavorable for our method.

Figure 16 shows the open-closed mouth shape. The experimental data show that our mouth state detection works very well.

IV. CONCLUSION
Each block of the camera mouse with the "CTRL-ALT-DEL" key has been successfully implemented. The eye gaze estimation method performs well: it is robust against different users and illumination changes, with a success rate greater than 96% and a sufficiently small variance. The head pose estimation is simple, yet it performs well when estimating the yaw and pitch directions. Furthermore, blink and mouth shape detection perform well in detecting open-closed eye and mouth conditions. Moreover, none of our methods requires prior calibration. Besides creating the "CTRL-ALT-DEL" key, the approach can also be used to create other combination keys. With this system, the camera mouse becomes more capable when used by people with disabilities.

Fig. 2. Windows screen keyboard. This software allows the user to type into MS Word.

Fig. 4. Flow of the pupil detection method. Pupil detection works in three steps: (1) based on size, shape, and color, (2) based on its sequential position, and (3) based on its motion.

Fig. 6. Head pose estimation flow. Head pose is determined from the face and head centers.

Fig. 7. Blink detection flow. A Gabor filter is used to extract the arcs of the eye.
The images collected from the Japanese user are bright, with slanted eyes. The number of images is 665 samples.

Fig. 8. Illumination influence. This figure shows that our proposed method works with minimum illumination.

Fig. 10. Experiment on eye gaze accuracy. Five points are used to measure the accuracy.

Fig. 11. Head center estimation using skin color information.

Fig. 12. Face features extracted using a Gabor filter. The Gabor image is shown on the left side.

Fig. 13. Head pose estimation. Head pose is estimated using the center of the four face features and the head center.

Fig. 16. Open-closed mouth shape. The mouth shape is extracted using a Gabor filter. The left side is the output of the Gabor filter and the right side shows the mouth images.

TABLE II. ROBUSTNESS AGAINST VARIOUS USERS. THIS TABLE SHOWS THAT OUR METHOD IS ROBUST AGAINST VARIOUS USERS AND HAS A HIGH SUCCESS RATE

TABLE III. EYE GAZE ESTIMATION ACCURACY

TABLE IV. HEAD POSE ESTIMATION IN YAW DIRECTION

TABLE VI. BLINK DETECTION ACCURACY

D. Mouth Detection
The performance of mouth shape detection is measured by detecting open-closed mouth shapes. The user performs the open-closed mouth sequence 10 times while the system analyzes it.