Espousing AI to Enhance Cost-Efficient and Immersive Experience for Human Computer Interaction

Abstract: Recent advances in hardware and interfaces have moved virtual reality (VR) into a new era. Mobility is one of the most important behaviours in VR. This research compares popular VR mobility systems and shows that gesture control is a key technology for enabling distinctive virtual world communication paradigms. Gesture-based movement is especially beneficial when physical space is restricted. With a focus on cost-effectiveness, the study introduces a gesture-based virtual movement (GVM) system that uses artificial intelligence (AI) to remove the need for expensive hardware or controllers for virtual world mobility (walk, jump, and hold in this research). The GVM also aims to prevent dizziness by letting users change trajectory simply by turning their head in the intended direction. The GVM was assessed on its interpreted realism, presence, and spatial drift in the actual environment in comparison with state-of-the-art techniques. The results show that the GVM outperformed the prevailing methods on a number of common interaction components. The empirical analysis also showed that GVM offers users a real-time experience with a latency of ~65 milliseconds.


I. INTRODUCTION
Virtual reality (VR) has been around for years, but it has only lately attracted the attention of consumers and professionals as the technology becomes increasingly economically viable. The VR economy is growing rapidly, with overall global demand estimated to approach four billion in revenue by 2025, with 45 million VR headsets deployed and a global population coverage of 3% [1]. Earlier human-computer interaction approaches compelled human behaviour to conform to the computer's capabilities; VR is unique in that the computer must now mirror the actual environment to deliver the most authentic view feasible. To provide an immersive experience, VR engages the visual, auditory, and tactile senses. Widely available VR mobility methods rely heavily on a controller to explore and move, or on actual relocation within a constrained physical space. They disregard the growing need for a strategy to travel through an unbounded virtual space without physically walking, which causes fatigue [2]. The most extensively used VR movement methods are listed below.

1) Gadgets: A frequent strategy for navigation in the VR world is to use gadgets such as joysticks and gyroscope-based head orientation tracking in VR head-mounted displays. For users focused on controlling movement in VR, these gadgets are intuitive, comfortable, straightforward to use, and productive. However, because of a perceptual mismatch [3] between visual and vestibular inputs [4], joysticks frequently cause the view to move swiftly [5] and erratically, creating dizziness [6].
2) Teleportation: Another typical strategy for reducing dizziness is to provide many gateway locations allowing players to swiftly move from one location to the next. Unfortunately, due to the discontinuous movement that negatively impacts the user's experience and may induce vertigo, these tactics are not organic enough to boost the interactive experience in the virtual environment [7].
3) Walking-in-place (WIP): The WIP approach allows users to travel in a specific location while controlling the character's motion and orientation using real body gesture detection sensors such as Microsoft Kinect [8]. This technique enhances the matching among mechanoreceptors of data from a person's body movements and tactile senses through machine screens, rendering it more natural and potentially lowering operator dizziness. Nevertheless, this technique requires the user to remain in one place and use their entire body, as well as a large amount of underlying hardware, that are costly and not available to all. A good travel experience, on the other hand, must cause less fatigue in a walk-through arrangement [9].

4) Hand gestures: In virtual reality, a gesture is a stance or motion of the user's body which is employed as input. The authors of the current research introduce a gesture-based virtual movement (GVM) system to facilitate an inexpensive solution for supporting individuals with walk-through activities in virtual worlds, which allows users to relax while sitting or standing in place as if they were in reality.

A. Key Contributions
The authors' goal in this study is to enrich the user's immersive experience. The following are the major contributions of this research.

1) Cost efficiency: GVM is a low-cost solution that eliminates the requirement for any additional costly gesture recognition gear.
2) Reducing dizziness: GVM reduces dizziness by letting users modify their trajectory merely by tilting their head in the desired direction, while hand movements recognised as gestures drive movement in VR.
3) Handling strain: GVM relieves the user of physical strain.
4) Usability: The usefulness of the suggested approach is demonstrated by user feedback on several factors such as interpreted realism, presence, and spatial drift in the real world.
5) Real-time experience: With a latency of ~65 milliseconds, the suggested system offers users a real-time experience.

B. Paper Organization
The manuscript is further divided into sections. Section II presents a brief literature survey of the various VR techniques. Section III introduces the proposed model, GVM. Section IV explains the experimentation done and the results achieved, which highlight the suitability of GVM. Section V concludes this research. Finally, Section VI highlights the future work.
Table I shows a comparison of various widely used VR movement methods based on the dimensions of motion sickness and physical strain. Hand gestures have been shown to be a remedy for motion sickness and physical strain; however, using a hand gesture detection system necessitates the acquisition of expensive gear. Thus, the authors introduce a GVM system to facilitate an inexpensive solution for supporting individuals with walk-through activities in virtual worlds, which allows users to relax while sitting or standing in place as if they were in reality.
Mine [12] proposes using hand-based communication to manage mobility and walk-through in a simulated world. A high-end hand-gesture tracking device, such as Leap Motion, delivers input via hand gesture mapping, allowing for bare-hand interactivity [13] in a three-dimensional world. Ni et al. [14] investigate menu selection employing freehand gestures, whereas Kulshreshth et al. [15] provide the findings of the first thorough research on finger-count panels to assess their suitability for 3D menu selection applications. Beattie et al. [16] demonstrate a CAD Engagement Facility that allows users to deconstruct a kinematic model in virtual reality and operate and analyse constituent parts. Lee et al. [17] offer TranSection, a hand-based interaction strategy for playing a strategy game in virtual reality. Salomoni et al. [18] describe research in which recreational virtual world interfaces are reconsidered in view of the rise of head-mounted displays.
These concepts, unfortunately, do not yet include how to handle walk-through activity in a simulated world.

II. LITERATURE SURVEY
Numerous studies have investigated ways to execute a natural and pleasant interaction approach in VR employing Leap Motion to address this research gap. Codd-Downey et al. [19], for instance, offer a finger-tracking movement approach that uses a 2DOF driving paradigm similar to typical mouse and keyboard control in 3D computer gaming. Khundam [20] presents a novel interactive single-hand-gesture control system driven by the palm normal. The results reveal that controlling tour activity with hand gestures is more natural than using a joystick. VR controller hardware varies across approaches, and some studies have developed systems that combine multiple devices for a certain objective. The Oculus Rift and Leap Motion have lately been employed in several studies, particularly in virtual reality. Developers are particularly interested in studying usage patterns and determining which VR interaction methods will be most productive for them in the future.
Prior studies on in-air controllers and hand monitoring intended to develop and deploy VR applications. The precision of hand monitoring is critical for a reliable system. Sato et al. [21] provide a technique for monitoring a user's hand in three dimensions and identifying hand gestures in real time even without any intrusive sensors connected to the hand. Several cams are used to assess the location and direction of a user's hand floating in 3D environment. A neural network that is adequately trained recognises specified motions in a rapid and reliable fashion. 3D item processing for a desktop machine and www.ijacsa.thesai.org 3D movement for a big holistic projection system are two typical applications. Many studies have been done on hand gestures and their uses. Chastine et al. [22] describe research comparing single hand gestures to typical keyboard, mouse, and controller input of first-person gameplay. The purpose of this study is to enable game analysts, architects, and builders to better understand how to include gesture control in current applications. The findings demonstrate that in FPS games, human rehearsals are crucial for gestural-based gaming system performance. As people continued through the activities, users were increasingly skilled at using the gadget, indicating that gesture-based handling can be used by users with no prior knowledge. This feedback helps programmers to employ Leap Motion as a device in virtual reality and ensures that there is a compelling incentive for them to do so in the long term.
Many people use virtual reality headsets to interact with 3D models. Stefan Greuter and David J. Robert [23] present the SpaceWalk technology. This system, which consists of two hardware devices: a motion sensing unit and a cordless VR gear, allows for low-weight full-body VR experience while wandering around the living area. The preponderance of the equipment in this system are made up of an Oculus Rift (DK1) HMD and a backpack tablet which operates standard VR program (Unity3D) alongside their extension script that connects all the elements. Participants may move and engage with things in the virtual world in this research's living area, however this framework is not designed for huge VR environments. Webel et al. [24] describe how to build a moderate, fully interactive, stochastic virtual world setup that allows users to naturally perceive intangible cultural assets. They look at new technology including the Oculus Rift virtual reality headset, Microsoft Kinect, and the Leap Motion controllers. When it comes to constructing HMD VR situations, modern technologies such as the Oculus Rift HMD, Microsoft Kinect, and Leap Motion provide excellent results.
The usage of the Kinect or Leap Motion in conjunction with organic conversational inputs lets users engage directly with the virtualized world. However, because of the user's movement control, this VR system is generally limited to the comparatively small region in front of the sensing element. As a result, adopting engaging hand gestures for motion in VR will increase the VR system's admin tools via rigorous positioning and replacing previous techniques.
Users can employ an expanding number of input gadgets to engage with systems and apps. When building applications for technological innovations, though, there are no defined interface guidelines or benchmarks, and the customer satisfaction suffers the consequences. Jake Araullo and Leigh Ellen Potter [25] give a study that investigates the perspectives of a set of people who used the Oculus Rift and the Leap Motion device to play. The incorporation of blended conventional and non-traditional input methods, as well as depending on existing interface paradigms when leveraging innovative methods, were found to have a detrimental impact on system adoption in this study.
The present research proposes a gesture-based virtual movement (GVM) system that eliminates the need for pricey equipment for immersive virtual movement (i.e., walk/jump/hold for this research) with a focus on affordability. By enabling users to alter the trajectory by merely rotating their head in the desired direction, the GVM also seeks to prevent users from feeling dizzy.

III. PROPOSED MODEL
The authors' goal is to use the user's hand gestures to create movement in the virtual environment. The overall process flow of the suggested GVM is shown in Fig. 1. The overall procedure is divided into the following stages:

A. Input Processing
The suggested approach takes a live video stream from a webcam as input. Rather than being supplied directly to the model, the footage is divided into frames at a rate of 60 frames per second, and each frame is fed to an AI model that recognizes the hand landmarks.

B. Gesture Identification
An artificial intelligence (AI) model built on top of MediaPipe [26] recognises the hand motions. MediaPipe is a platform for creating pipelines that perform inference over any type of sensory input. The AI model works in two phases, palm detection and hand landmarking, carried out by a palm detector and a hand landmark model, respectively.
• A palm detector locates palms in the whole input image and returns an aligned hand bounding box. A single-shot detector model optimised for mobile real-time use finds the initial hand locations.
• A hand landmark model generates high-definition 2.5D landmarks from the hand bounding box cropped by the palm detector. After the palm is detected across the entire image, this second model uses regression, i.e., direct coordinate prediction, to precisely localise 21 3D hand-knuckle positions inside the detected hand regions. Only six of them were used for the suggested gesture model, as seen in Fig. 2. The model learns a reliable internal hand pose representation and is robust to self-occlusions and partially visible hands. In every instance, the landmarks are almost perfectly spotted.
Rather than employing hard computing, the authors opted for soft computing to identify gestures more precisely. The relative placements of various landmarks serve as the basis for the gesture codes. Using relative locations, the authors programmed three distinct gestures: hold, move, and jump.
• Hold: In this gesture, the radius is formed by the line connecting the wrist and the tip of the index finger, as shown in Fig. 3(a).
• Jump: In this gesture, the radius of the circle is established by the line connecting the wrist and the pip of the index finger, and the tips of the remaining fingers are contained within the circle, as seen in Fig. 3(b).
• Move: In this motion, all finger tips are located outside the circle, with the radius being the line between the wrist and the tip of the index finger (see Fig. 3(c)).
Every live-streamed hand gesture is labelled as one of three classes (hold/jump/move), and the gesture calculations are used to classify the gestures accurately. The gesture calculation starts with the Euclidean distance $r$ between the coordinates of the wrist $(x_w, y_w)$ and the index finger PIP $(x_p, y_p)$, as per equation (1):
$$r = \sqrt{(x_w - x_p)^2 + (y_w - y_p)^2} \qquad (1)$$
After $r$ is calculated, the behaviour of the index (j), middle (k), ring (l) and pinky (m) fingers is identified using equations (2) and (3). The gesture, Ω, is then calculated based on equation (4).
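The circle-based rules described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it follows only the verbal descriptions of the three gestures and the distance in equation (1). Several points are assumptions made for illustration: the six landmarks are taken to be the wrist, the index PIP, and the four fingertips, numbered as in MediaPipe's hand landmark model; the radius is always the wrist-to-index-PIP distance; "jump" is read as only the index fingertip lying outside the circle; and "hold" is used as the fallback for every other pose. The function names are hypothetical.

```python
import math

# MediaPipe hand landmark indices. That these are the six landmarks
# used by the authors is an assumption.
WRIST, INDEX_PIP = 0, 6
INDEX_TIP, MIDDLE_TIP, RING_TIP, PINKY_TIP = 8, 12, 16, 20


def distance(a, b):
    """Euclidean distance between two (x, y) points, as in equation (1)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def classify_gesture(landmarks):
    """Return 'move', 'jump' or 'hold' for a list of 21 (x, y) landmarks."""
    wrist = landmarks[WRIST]
    # Circle centred on the wrist; radius = wrist to index PIP (assumed for all gestures).
    radius = distance(wrist, landmarks[INDEX_PIP])

    def outside(tip_index):
        return distance(wrist, landmarks[tip_index]) > radius

    index_out = outside(INDEX_TIP)
    others_out = [outside(i) for i in (MIDDLE_TIP, RING_TIP, PINKY_TIP)]

    if index_out and all(others_out):
        return "move"  # every fingertip lies outside the circle
    if index_out and not any(others_out):
        return "jump"  # only the index fingertip lies outside (assumed reading)
    return "hold"      # fallback for every other pose (assumption)
```

Because the rule compares distances measured from the same wrist point, it does not depend on how far the hand is from the camera, which is one reason relative landmark positions are convenient for this kind of classification.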

C. Database Interaction
The AI Model [27,28] subsequently sends the gesture to the real-time database (Firebase in current research), which updates the motion parameter with the potential movement gestures (isMove, isHold or isJump). The Firebase database gives the system the most recent value of the information as well as modifications to that information by using a single API. The clients are able to retrieve their data from any platform, including the web and mobile devices, owing to real-time synchronization.
On the other side, the Unity3D Engine [29] is coupled to the real-time database. The Unity3D engine serves as the base layer for the present VR experience. Additionally, C# scripting is used to fetch data from the real-time database each time a database update is triggered.
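As a rough illustration of the hand-off between the gesture model and the database, the sketch below builds the motion parameter update for a recognised gesture. The flag names isMove, isHold and isJump come from the description above; representing them as mutually exclusive booleans in a single JSON object is an assumption, as is the function name `motion_update`. The actual write to Firebase is deliberately left out, since the database schema and client calls are not given here.

```python
import json

GESTURES = ("move", "hold", "jump")


def motion_update(gesture):
    """Build the motion-parameter payload for one recognised gesture.

    Exactly one flag is set to True (an assumed encoding).
    """
    if gesture not in GESTURES:
        raise ValueError(f"unknown gesture: {gesture!r}")
    return {
        "isMove": gesture == "move",
        "isHold": gesture == "hold",
        "isJump": gesture == "jump",
    }


if __name__ == "__main__":
    # The serialised payload is what a database client would send.
    print(json.dumps(motion_update("jump")))
```

Keeping the payload this small means each database update carries only the state the Unity3D side needs to react to.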

D. VR Realization
The user has complete freedom to roam around the area and may use gestures to commence any movement. In the virtual environment, neck movement provides the directional input. Owing to rotational tracking, users may freely turn their heads through 160° while viewing the surroundings (as presented in Fig. 4). The avatar moves in both translation and rotation inside the VR environment based on the data received.
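How a head direction and a movement gesture might combine into avatar translation can be sketched as follows. This is only an illustrative model, not the Unity3D C# script used in the system. It assumes a ground plane with coordinates (x, z), a yaw of 0° facing the +z axis, positive yaw turning toward +x, and a constant walking speed; the function name `advance` and all parameters are hypothetical.

```python
import math


def advance(position, yaw_degrees, is_move, speed, dt):
    """Return the avatar's new (x, z) position after one time step.

    The avatar moves along the head's yaw direction only while the
    move gesture is active; otherwise it stays where it is.
    """
    if not is_move:
        return position
    yaw = math.radians(yaw_degrees)
    x, z = position
    return (x + speed * dt * math.sin(yaw),
            z + speed * dt * math.cos(yaw))
```

For example, with a speed of 2 units per second and a step of 0.5 seconds, a yaw of 0° moves the avatar one unit along +z, while a yaw of 90° moves it one unit along +x.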

IV. EXPERIMENTS AND RESULTS
The objective of the current research is to assess the system behaviour, empirical characteristics, and experience aspects that are most important for VR locomotion [30]. To evaluate the efficacy of GVM, a comparative study of four approaches, i.e., walking-in-place, controller/joystick, teleportation, and GVM (the proposed approach), is conducted. The current research investigates the proposed model on two aspects: 1) Latency and 2) User Experience.

A. Environmental Setup
The HTC Vive headset and Epic Games' Steam VR SDK for Unreal Engine 4 were used in the development of the VR locomotion methods tested. With a display resolution of 1080 x 1200 per eye (2160 x 1200 combined pixels), a 90 Hz refresh rate, a 110° field of view, and complete 360° room-scale tracking, the HTC Vive headset delivers high-fidelity visuals. It is well known in the commercial VR industry and is designed for room-scale setups, which use sensors to transform a physical space into a 3D world. The system also supports the HTC Vive tracker, an extra sensor that can be used to track physical objects and translate them into actions or items in the simulated space. Hand gesture detection uses a Logitech C270 HD webcam, which provides crisp HD 720p/30 fps video with a 55-degree diagonal field of view and automatic light adjustment.

1) Walking in place: The participant's limb motions during walking in place must be converted into virtual reality activity. An HTC Vive tracker mounted on the participant's right foot and the HTC Vive controllers were used to record and manage the VR movement velocity and direction, respectively. The VR movement velocity is closely correlated with the users' actual walking speed; that is, the quicker the participants stepped in the real world, the quicker their avatars moved in the simulated space. Right footstep speed is used to imitate left footstep speed. The orientation of the HTC Vive controllers determined the direction of motion. Users have to physically turn themselves in the intended direction in order to adjust the movement's trajectory.
2) Controller/joystick: In this approach, the type of controller can be anything from a straightforward joystick to a gaming remote or a keyboard. To enable controller-based VR movement, the HTC Vive controllers have been used as a touchpad. Motion is initiated by tapping the touchpad, and the velocity of motion is controlled by where the thumb is placed on the touchpad. The HMD system displayed a directional line to indicate the direction of motion, which has been governed by the orientation of the HTC Vive controllers.
3) Teleportation: With this method, the user points or uses a controller to indicate where to teleport to. The grip trigger of the HTC Vive controllers is used. When the trigger is pulled, a visual indicator of the destination appears: a ray with a marker on the ground of the simulated environment. The trigger is pushed to initiate movement. The orientation after teleportation is determined by the participant's body orientation.
4) GVM: GVM is a low-cost solution that eliminates the requirement for any additional costly gesture recognition gear. It reduces dizziness by letting users modify their trajectory merely by tilting their head in the desired direction, while hand movements recognised as gestures drive movement in VR. GVM relieves the participant of physical strain. The usefulness of the suggested approach is demonstrated by user feedback on several factors such as interpreted realism, presence, and spatial drift in the real world. With a latency of approximately 65 milliseconds, the suggested system offers users a real-time experience.
The virtual 3D environment is built using the Unity 3D Engine, version 2019.4.40f1 (LTS), and deployed for the Android and iOS platforms. A simple Unity3D scene, as presented in Fig. 5, is set up for the survey with 3D assets and paths to explore.
Participants can perform various movement actions like Jump/Hold/Move within the environment and move freely. This investigation gathers information to create an assessment of the strategies' efficacy in real-world settings. It moreover gathers information through semi-structured questionnaires to create a "rich description" of the perspectives of the participants.

B. Latency
Latency is the time it takes for information fed in at one end of a connection to appear at the opposite end. Typically, the authors gauge how long it takes for information to travel from one end to the other. In this setup, the round trip time (RTT) is what is actually measured, and the latency (the time for an event to travel from the real-time database to the Unity3D Engine) can be estimated as Δi = 0.5 * RTT, where Δi represents the latency. The tests were run with all clients on the same machine, located behind a 100 Mbit/s connection. For the Firebase Real-time Database, the location [us-central1] was selected. The average latency over three hundred observations is used to summarize the time offset, and it is calculated as approximately 65 milliseconds.
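The estimate Δi = 0.5 * RTT and its average over a set of observations can be written as a short Python sketch. The function names are hypothetical, and the sample values in the usage note are made up for illustration; they are not the measurements from this study.

```python
from statistics import mean


def one_way_latency_ms(rtt_ms):
    """Estimate one-way latency as half the measured round trip time."""
    return 0.5 * rtt_ms


def mean_latency_ms(rtt_samples_ms):
    """Average the one-way latency estimate over all RTT observations."""
    if not rtt_samples_ms:
        raise ValueError("at least one RTT observation is required")
    return mean(one_way_latency_ms(rtt) for rtt in rtt_samples_ms)
```

For instance, three illustrative RTT readings of 120, 140 and 130 ms give a mean estimated latency of 65 ms. Halving the RTT assumes that the outbound and return paths take equal time, which is an approximation.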

C. User Experience
The user experience is assessed using the Game Experience Questionnaire (GEQ) [31]. Owing to its capacity to address a broad variety of experiential aspects with strong reliability, the GEQ is a user experience questionnaire that has been used in numerous areas (including gaming, virtual reality, and location-based services) [32,33]. The use of the GEQ has also been validated in the VR arena in numerous studies on subjects including VR education [34], haptic engagement in VR [35], virtual reality orientation and mobility [36,37] and virtual reality entertainment [38]. The GEQ's Competence, Sensory and Imaginative Immersion, Flow, Tension, Challenge, Negative Affect, Positive Affect, and Tiredness categories are deemed pertinent and helpful for the current investigation of the underlying strategies. Through a sequence of statements in the GEQ questionnaire, each participant was prompted to describe how he or she felt during the session. The questionnaire had 16 statements, such as "I forgot everything around me", each scored on a five-point intensity scale from 0 ('not really') to 4 ('strongly'). At the beginning of the research, demographic information was gathered, including age, gender, frequency of VR exposure ('never, seldom, often, and every day'), and familiarity with VR technology.
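The results later report a normalized mean value (NMV) between 0 and 10 for each GEQ component, but the normalization itself is not spelled out. One plausible reading, shown here purely as an assumption, is a linear rescaling of the mean item score from the 0 to 4 response scale onto 0 to 10. The function name and default parameters are hypothetical.

```python
from statistics import mean


def normalized_mean(scores, low=0, high=4, scale=10):
    """Rescale the mean of item scores from [low, high] to [0, scale].

    Linear rescaling is an assumed normalization, not one stated in the paper.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if any(s < low or s > high for s in scores):
        raise ValueError("score outside the response scale")
    return scale * (mean(scores) - low) / (high - low)
```

Under this reading, a component whose items all receive the top score of 4 maps to 10, and a mean item score of 2 maps to 5.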

1) Analysis of participants:
Participants were recruited within our institution between October 2022 and December 2022. Participants needed to be physically capable of using VR technology, although prior VR experience was not necessary. Participants were informed of the possibility of dizziness as well as their right to withdraw from the research at any moment. Every participant provided informed consent to take part in the study.
The four VR locomotion strategies were tested on thirty people (N = 30, mean age: 22.7, male/female: 18/12). Twelve individuals had only sometimes used virtual reality (VR), whereas eight people had used it regularly. Ten participants had never utilized VR. Twenty participants had earlier used VR; six had done so with HMDs and portable VR headsets, eleven had done so solely with HMDs, and three had done so only with portable VR headsets. Each participant finished the episode satisfactorily.
2) Methodology: After providing informed consent, the participants completed the demographic and VR experience forms in approximately ten minutes. The participants then had a trial opportunity to explore at their leisure a "clean" rendition of the VR world, that is, one that had no time restrictions and did not use the VR locomotion approach, for five minutes on average. The exercise was then completed by the participants in ten minutes on average. The participants could provide verbal comments while traversing using the GVM approach, and the investigators took notes in order to address these issues in the discussion. The GEQ questions were completed once the task had been finished, in five minutes on average.

3) Results & discussions: There were thirty tasks in total, one for each participant. The typical task took thirty-seven minutes to complete. The GVM technique stood out in the majority of the GEQ components (i.e., Competence, Sensory and Imaginative Immersion, Flow, Tension, Challenge, Negative Affect, Positive Affect, Tiredness) in the pairwise mean ranking analysis (depicted in Table II). The participants felt that WIP offered excellent degrees of immersion because of its authentic and organic movement. However, most participants found the approach exhausting due to the difficulty of translating actual body action to virtual reality motion. Others said this aspect added a certain amount of amusement, enjoyment, and exercise. Finally, although the investigators took all necessary precautions, such as setting up a virtual boundary and an open area, participants still reported pausing their exploration of the simulated space out of fear of running into physical items in the real world. The controller/joystick VR movement was found to be simple to use and was described as "pleasant" and "simple". However, a few users mentioned experiencing brief motion sickness at the beginning of the task.
Owing to its visible 'jumps' and irregular mobility, teleportation was deemed the least engaging of the four modalities. In contrast, the participants judged GVM to be more engaging and competent on the majority of the GEQ aspects. In addition to reducing fatigue compared to WIP, its hand gestures also eliminated motion sickness. Users had fewer difficulties using GVM because of the predetermined hand movements. The majority of participants praised GVM and rated it as the easiest and perhaps most enjoyable approach. Fig. 6 and Table III show the normalized mean value (NMV), between 0 and 10, of each technique for every GEQ component.
• Sensory and Imaginative Immersion: The findings indicated statistically significant differences among the four strategies for the Sensory and Imaginative Immersion aspect, favouring GVM, with WIP close behind.
• Flow: Following the test, there have been no appreciable variations in the Flow component amongst the four strategies; nonetheless, participants gave the Teleportation approach a higher rating.
• Tension: GVM obtained the lowest mean value (3.64) in the assessment, followed by WIP, Teleportation, and Controller, in that order.
• Challenge: The challenge score between GVM and Teleportation differed significantly according to the MSR, showing that GVM (mean value: 5.02) is a less difficult approach than teleportation and others.
• Negative Affect: The test for Negative Affect demonstrated significant differences across all technique comparisons. In the mean assessment of GVM, WIP, Controller, and Teleportation, GVM showed the least negative affect. Teleportation, on the other hand, put extra strain on the participants due to the continual transitions in the virtual world.
• Positive Affect: GVM and teleportation had higher values, but the assessment did not reveal any significant differences in the Positive Affect component among the four approaches.
• Tiredness: The Tiredness aspect exhibited substantial variations according to the assessment. In each instance, GVM, Teleportation, and Controller all scored much lower on Tiredness compared to WIP.

V. CONCLUSION
A range of hand gestures must be recognised and reliably classified by gesture recognizers in order to provide improved user interfaces for Virtual or Augmented Reality products. This study compares the most common virtual reality mobility systems and finds that gesture control is a crucial technology for enabling unique virtual world communication paradigms. The present research proposes a gesture-based virtual movement (GVM) system using artificial intelligence (AI) that eliminates the need for pricey equipment for immersive virtual movement (i.e., walk/jump/hold for this research) with a focus on affordability. By enabling users to alter the trajectory by merely rotating their head in the desired direction, the GVM also seeks to prevent users from feeling dizzy. The GVM's interpreted realism, presence, and spatial drift in the real world were evaluated in comparison with cutting-edge methods. According to the empirical analysis, GVM offers users a real-time experience with a latency of ~65 milliseconds. Additionally, the results demonstrated that the GVM outperforms the existing techniques in many standard interaction elements.

VI. FUTURE WORK
The work discussed in this paper provides a proof of concept for using hand motions identified by computer vision to enable movement in a virtual world. However, there remain many opportunities for enhancement and further research.
• Enhancement of Gesture Lexical Items: Authors aim to increase the number of hand gestures available for use in directing the movement of the virtual world. This can entail introducing fresh motions that let users navigate across the area or control items.
• User Interface Layout: It will be crucial to create a user interface that is simple to understand and use as the system grows increasingly complicated and featurerich. We will investigate several methods for creating a user interface that permits individuals to swiftly and simply manipulate the virtual world in upcoming work.
• User Experience: In order to enhance the user experience, we will research several ways to let users know when their gestures have been effectively identified. This can entail adding haptic or visual feedback features to let users know when they have performed a motion correctly.
• Reduce delay: Making the virtual reality environment feel more realistic by lowering system delay may significantly enhance user experience. Optimizing data communication between the AI model and the Unity3D engine is one method for lowering latency. This can entail transferring data using more effective methods or requiring less data to be transferred in real-time.
In conclusion, there are a wide range of prospective directions for further research in this field, including increasing the gesture lexicon, improving gesture detection, enhancing user interface, offering users input, and lowering latency. Future research in these areas has the potential to dramatically improve user engagement and increase the effectiveness and efficiency of the gesture-based movement mechanism.