SkySculptor: Intuitive Drone Control Through Ground-Integrated Radar and Foot Gestures in Smart Indoor Environments

Abstract: SkySculptor is a software application designed to simplify drone control in smart indoor environments. The primary focus is on gesture input for drone control, in particular mid-air free-foot interactions detected by radar sensing. Results obtained with a 15-antenna ultra-wideband 3D radar are presented, establishing a dictionary of six directional swipe gestures for controlling drone functions. Based on these findings, guidelines are proposed for the future development of software applications for drone control in intelligent indoor environments.


I. INTRODUCTION
Efficient, fluid, and expressive interactions that take advantage of body position and movement can be achieved by integrating radar gestures into smart indoor environments. This involves recognizing body position and movement and employing taxonomies to incorporate radar sensing technology [1]. This integration facilitates various interactions, including controlling smart TVs [2], utilizing wearable devices for gesture detection [3], and drone interactions [4]. Airborne gestures can also interact with content displayed on ambient screens [5], while foot movements offer additional avenues for interaction, covering both natural and unnatural gestures [6], [7]. Symbolic touch gestures commonly used on mobile devices can also be incorporated [8].
Recognizing user gestures involves a combination of hardware, such as accelerometers embedded in smart devices like smart rings [3], [9], [10], recognition algorithms (e.g., KNN, Random Forests, Convolutional Neural Networks, DTW), and data sets that establish correlations between gestures and system functions. Radar devices such as Google Soli, Walabot, and Sense2Go exemplify non-contact gesture sensors that offer new interaction possibilities with smart environments. Furthermore, attaching a radar device to a drone enables the control of various objects in a smart indoor environment [11]. Radar-detected gestures can range from basic presence and proximity detection to multitouch gestures or foot-based gestures [6], [12], [13]. Studies on radar device interactions suggest various sets of gestures, some focusing on hand gestures [14], [15], while others advocate interactions that involve the whole body [16]. These examples show that radar device interactions typically involve the upper body, as detailed by Șiean et al. [1]. Despite these advances, the lower part of the body has been overlooked in interaction with radar devices.
Our contribution comprises the following: 1) Software design requirements were defined and incorporated into the design of an interactive system that enables drone control using a radar device integrated into, attached to, or placed on the floor of a smart environment. 2) A set of six radar gestures executed with the foot was proposed, each associated with a specific command for drone control. 3) The software application was implemented using a 15-antenna ultra-wideband 3D radar.

II. RELATED WORK
This section reviews gesture-based drone control and radar-based gesture sensing and recognition.

A. Gesture-based Drone Control
Research on gesture-based human-drone interaction has explored various approaches [17], [18]. Gestures serve as an effective means of conveying emotions, thoughts, and nonverbal communication, and hand gestures are widely accepted as intuitive communication methods [19]. Controlling drones through hand gestures offers simplicity and intuitiveness. Past studies have investigated hand gesture recognition using vision-based sensors such as RGB and depth cameras, as well as Microsoft Kinect sensors [20]. However, recent developments in safe-to-touch drones have spurred interest in novel forms of interaction, including direct touch and manipulation. For example, HoverBall is a ball-shaped quadcopter capable of sports or game applications [21]. Drones have also been used as haptic feedback devices in virtual reality scenarios [22]. However, these systems face limitations related to environmental factors such as light conditions, viewing angles, and spatial positioning.

B. Radar-based Gesture Sensing and Recognition
The core of radar detection lies in the use of electromagnetic waves. When these waves strike a target, a portion of the signal reflects back to the radar source, where the receiver captures it. The characteristics of the received signal, such as frequency, amplitude, and delay time, provide information about the detected object, including its shape, orientation, distance, and speed relative to the radar. Radars are advantageous for human detection (e.g., gestures, postures, movement) because they operate under high illumination, low lighting, or darkness, when obstructed by surfaces or objects [23], [24], and in different weather conditions [25]. Previous research has presented and evaluated multiple techniques to recognize radar gestures. For example, Gigie et al. [14] showcase a case study of data explosion for radar-based human gesture detection. They introduced a simulation framework based on a physical model to generate radar signals corresponding to various human gestures. With the availability of Google Soli in the mobile ecosystem, Leiva et al. [26] devised a hybrid CNN+LSTM deep learning model. They conducted a comprehensive study of mid-air gesture recognition performance while covering the radar sensor with three distinct fabrics (leather, wool, and cotton). The model achieved an average accuracy of 95% and an AUC of 99%. RadarNet [27] recognizes radar gestures with a Convolutional Neural Network and operates efficiently on processors with limited computational capabilities. mHomeGes [28], in turn, is a system designed for smart home interactions that detects radar gestures with an accuracy of 95%. Building on this work, we propose a radar sensor-based gesture recognition system for drone control that extends body gestures with foot gestures.
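The relationship described above between the received signal and the target's distance and speed can be illustrated with a minimal sketch. It assumes free-space propagation and a single reflecting target; the function names and the example carrier frequency are illustrative assumptions and are not taken from SkySculptor or any radar SDK.

```python
# Minimal sketch: how echo delay and Doppler shift map to distance and speed.
# Assumes free-space propagation; all names here are hypothetical.

SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_from_delay(delay_s: float) -> float:
    """Target distance in metres from the round-trip echo delay in seconds.

    The wave travels to the target and back, hence the division by two.
    """
    return SPEED_OF_LIGHT_M_S * delay_s / 2.0


def radial_velocity_from_doppler(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial speed in m/s from the Doppler shift of the reflected signal."""
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return doppler_shift_hz * wavelength_m / 2.0


if __name__ == "__main__":
    # A 2 ns round trip corresponds to a target roughly 0.3 m away,
    # which is the order of magnitude of a foot held above a floor sensor.
    print(round(range_from_delay(2e-9), 3))
    # Example carrier frequency of 6 GHz is an assumption for illustration.
    print(round(radial_velocity_from_doppler(100.0, 6e9), 3))
```

Both functions return signed values, so a negative Doppler shift yields a negative radial velocity, that is, a target moving away from the sensor.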

III. DESIGN REQUIREMENTS FOR SKYSCULPTOR
By analyzing the relevant scientific literature and considering the potential application domain, which focuses on the interaction between drones and gesture-based foot movements in smart indoor environments using integrated, attached, or floor-placed radar devices, we can establish the necessary design requirements for SkySculptor, as shown in Fig. 1. The design criteria for the development of the SkySculptor software application are as follows.
(a) Open source technology: SkySculptor strongly emphasizes the adoption of open technologies to promote the development of innovative, specialized software applications for users who operate drones in smart indoor environments. SkySculptor itself follows this principle: it is a fully developed software application created using open technologies. The goal is to encourage research and support the ongoing advancement of drone interactions in intelligent indoor environments through the use of radar gestures.
(b) Smart spaces orientation: Accurate positioning of the radar device in the recommended locations, as proposed by Șiean et al. [1], enables effective handling of complex gestural interactions between users. Similarly, for interactions between drones and physical objects, the approach suggested by [2] can be applied. This approach is integrated into the SkySculptor software application, with a particular emphasis on the floor area and on interaction through foot movements.
(c) Easily integrate, attach, or place the radar sensor on objects or devices: Traditionally, drone control has been done using joysticks or smartphones. In our scenario, however, we propose a shift towards a radar device that is integrated into, attached to, or placed on the floor for drone interaction. Radar devices can be integrated into various objects or attached by users, giving researchers the opportunity to propose innovative methods of interaction. The SkySculptor software application emphasizes integrating the radar device into the floor and facilitating interaction with the feet. This is motivated by the lack of gestures involving the lower body, despite the numerous proposals for interaction focused on the upper body, as discussed by Șiean et al. [1]. By prioritizing the integration of the radar device into the floor and enabling interaction based on leg movements, we aim to enrich and diversify gestures and interactions in intelligent indoor environments within the context of human-drone interaction (HDI).
Requirement (c) can be satisfied by relying on the principles that govern how radar devices function and detect objects. This requirement provides various options for installing the radar device while still allowing interaction with the drone. At the same time, it addresses the gesture recognition challenges that may arise from low light or unfavorable weather conditions. When open technology is employed in the development of a software application, it becomes easier to extend and improve its functionality. As a result, a further requirement is introduced for SkySculptor, which specifically describes the ease of its development. Devices that can run SkySculptor are those that have a Python interpreter and a USB port for connecting the radar device.
Software applications developed in Python are well known for their fast and reliable performance. Python was chosen so that the tools are compatible with the various devices that have a Python interpreter, providing a consistent experience for users. Using Python,¹ integrating the drone and radar device SDKs was straightforward. Using the drone SDK² and a connection to the radar device, we incorporated the functionality needed to control the drone without difficulty.

IV. IMPLEMENTATION APPROACH AND PROTOTYPE
We propose a gesture recognition system that controls drones using radar sensors, with a focus on foot gestures. To evaluate the technical feasibility of this system, taking into account the availability and affordability of radar technology on the market, a SkySculptor prototype was developed using the 15-antenna Walabot Creator device and the Walabot API.³ The Parrot Mambo drone was used in conjunction with the prototype. The trajectory of the detected target above the radar was captured and expressed as x, y, and z coordinates. Two directional swipe gestures (referred to as swipe left and swipe right) were defined along the y axis. Furthermore, the distance from the sensor on the z axis at which these gestures were made was used to define three active zones (near, close, and far) above the radar. Fig. 2 shows a visual representation of the gesture set. Combining the two directions and three zones yields a total of six gestures. These gestures can be assigned to three types of functions commonly used in drone control: take-off/land for initiating or ending the flight of the drone, start/stop video for controlling the camera (such as taking photos or starting/stopping video recording), and forward/backward for moving the drone in the forward or backward direction. This approach follows that of [2].
¹ The Python programming language is widely recognized as one of the most popular programming languages. It consistently holds the top position in the TIOBE ranking (https://www.tiobe.com/tiobe-index/), with a steady rating of 13.97% from 2023 to 2024.
² https://developer.parrot.com/docs/index.html
³ https://api.walabot.com/sample.html
We conducted a preliminary evaluation of our application by performing foot gestures corresponding to various scenarios: take-off, landing, starting and stopping video recording, and moving the drone forward and backward. A visual representation of these gestures can be seen in Fig. 2. If additional gestures are required, modifications to our basic gesture recognition pipeline may be necessary, such as preprocessing the raw signal or implementing new recognition techniques [29]. Fig. 3 displays θ−R images obtained from the Walabot radar when placed on the floor, corresponding to the gestures shown in Fig. 2, with varying distances measured by the radar sensor.
The Walabot radar sensor was placed on the floor at various locations to generate the heat map screenshots for the different scenarios. This placement was chosen to allow for interaction with the feet. In each scenario, the experimenter sat on a chair and used his right foot. Gestures (a) and (b) were performed within the active zone labeled near. For scenarios (c) and (d), a mouse pad was placed on the floor with the Walabot radar sensor underneath. To represent the situation in which the radar sensor is placed below an object such as a carpet, the experimenter raised his foot higher on the z axis and made gestures within the active zone labeled close. In the last case, the radar sensor was placed on a vibrotactile floor, which is available in our research laboratory. The experimenter executed gestures (e) and (f) to simulate the scenario in which the sensor is integrated into the floor, within the active zone labeled far. This approach follows that of [2].
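The pairing of gestures with drone functions described above can be sketched as a small lookup table. The pairing of zones with function types (near for take-off/landing, close for video, far for forward/backward) follows the scenarios (a) to (f); which swipe direction triggers which function within a zone is an assumption made for illustration, as are all identifiers and command strings.

```python
# Minimal sketch of a six-entry gesture dictionary (2 directions x 3 zones).
# Direction-to-function assignment within each zone is an assumption.
from enum import Enum


class Direction(Enum):
    LEFT = "swipe-left"
    RIGHT = "swipe-right"


class Zone(Enum):
    NEAR = "near"
    CLOSE = "close"
    FAR = "far"


# Hypothetical command names; a real drone SDK defines its own calls.
GESTURE_COMMANDS = {
    (Direction.LEFT, Zone.NEAR): "take_off",
    (Direction.RIGHT, Zone.NEAR): "land",
    (Direction.LEFT, Zone.CLOSE): "start_video",
    (Direction.RIGHT, Zone.CLOSE): "stop_video",
    (Direction.LEFT, Zone.FAR): "forward",
    (Direction.RIGHT, Zone.FAR): "backward",
}


def command_for(direction: Direction, zone: Zone) -> str:
    """Look up the drone command assigned to a (direction, zone) gesture."""
    return GESTURE_COMMANDS[(direction, zone)]


if __name__ == "__main__":
    for (direction, zone), command in GESTURE_COMMANDS.items():
        print(f"{direction.value:>11} in {zone.value:<5} -> {command}")
```

Keeping the dictionary separate from the recognition code means the assignment of gestures to functions can be changed without touching the radar processing.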
Another assessment was made with the experimenter seated, allowing interaction through knee gestures. By sliding the knee to the left and right at varying distances from the radar sensor, the experimenter could access the three active zones (near, close, and far) for interacting with the drone. In this case, the Walabot radar device was placed on the wall, 20 cm above the floor. Alternatively, when the experimenter was not seated, more extensive sliding movements of the left or right foot could be performed to execute foot gestures. Fig. 3 illustrates foot gestures, the distance measured by the radar sensor, and the different placements of the sensors. For more examples of where to place radars, we recommend [1].
Different types of data can be extracted from a radar system, such as velocity, range, and direction of motion. The details obtained vary with the modulation technique of the radar used to control the drone, for example:
• 1D, continuous-wave (CW): modulation separates objects by their velocity. Objects of similar velocity cannot be distinguished by their location. This type is frequently used for motion detection applications. Drones have the potential to be used in various sports, including running [30]; there, a 1D radar system can indicate the moment when a runner crosses the finish line.
• 2D, frequency-shift keying (FSK): modulation separates objects by distance and velocity. Objects are located in a one-dimensional environment, without information about the angle. Because the modulation is FSK, objects are separated only by speed, with the added advantage that distance can be measured. Drones, for example, are found on construction sites before or during the execution of construction work, working alongside human workers [31]; there, a 2D radar can gauge the distance between objects.
• 3D, frequency-modulated continuous wave (FMCW) with multiple-input/multiple-output (MIMO): modulation separates objects by velocity, distance, and angle. Objects of the same speed, distance, and angular position can be detected and their 2D location determined. Multiple transmission and reception antennas increase the resolution of the sensor. Drones have various applications in emergency situations, such as search and rescue operations [32]; in the event of an avalanche, a 3D radar sensor can determine the distance between the rescuers and the individuals in need of assistance.
• 4D: compared to a 3D radar, 4D radars use more antennas and can resolve both the horizontal and the vertical angle. Objects with the same speed, distance, and angular position can be detected, and objects can be located in a 3D environment. Drones or clusters of drones are commonly used for aerial shows [33], as well as for image projection [34] and aiding in projections [35]; a 4D radar sensor can measure the speed of the drones in such displays to coordinate their positions.
We did not use the 1D and 2D radar categories because we want to cover as many SkySculptor functionalities as possible and, in certain situations, distinguish gestures from other objects in the room. Depending on the required resolution, different commercial radar sensors can be chosen, with attention also paid to the operating distance. For example, when the radar sensor is integrated into the floor or a wall, a much shorter operating distance is needed than when the sensor is integrated into the ceiling.
The initial form of the algorithm scans targets to flip the pages of an open document: if the y coordinates of the identified targets were decreasing, the up button was pressed to access the previous page, and if they were increasing, the down button was pressed to access the next page. We modified this algorithm at both the recognition level and the semantic level of gestures. To make recognition more robust than in the original form, we calculate the sum of the evaluated y coordinates. If the values are decreasing and the sum is negative, a swipe-left is identified; if the values are increasing and the sum is positive, a swipe-right is identified. For each identified gesture, we send a specific command via WebSocket to the drone control software, which in our setup runs on the PC, such as: forward [36], [37], backward [38], [39], take a photo [11], etc.
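The swipe rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the prototype's code: "decreasing" and "increasing" are interpreted here as the net change between the first and last sample, the input is assumed to be the sequence of y coordinates of one tracked target, and the function name and return labels are hypothetical.

```python
# Minimal sketch of the swipe-left / swipe-right rule on y coordinates.
# Interpretation of "decreasing/increasing" as net change is an assumption.
from typing import Optional, Sequence


def classify_swipe(y_coords: Sequence[float]) -> Optional[str]:
    """Return "swipe-left", "swipe-right", or None if no rule matches."""
    if len(y_coords) < 2:
        return None  # a single point carries no direction information

    net_change = y_coords[-1] - y_coords[0]  # trend of the trajectory
    total = sum(y_coords)                    # sum of the evaluated y values

    if net_change < 0 and total < 0:
        return "swipe-left"
    if net_change > 0 and total > 0:
        return "swipe-right"
    return None  # trend and sum disagree: treat as an ambiguous movement


if __name__ == "__main__":
    print(classify_swipe([-1.0, -3.0, -6.0]))  # decreasing, negative sum
    print(classify_swipe([1.0, 3.0, 6.0]))     # increasing, positive sum
    print(classify_swipe([5.0, 3.0, 1.0]))     # decreasing, positive sum
```

Requiring both conditions rejects trajectories whose trend and sum disagree, which is one plausible reading of why the sum was added to the original page-flipping rule.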
Radar-sensed gestures are made in the air, without touch. Another useful feature of a gesture is therefore the distance of the foot from the Walabot. We modified the original algorithm to also capture the z coordinate. Because a gesture is made up of a set of points, we calculate the average distance between the foot and the Walabot. As a result, a gesture can be performed near the Walabot (near), at a moderate distance from it (close), or far from it (far). In summary, we identified two swipe gestures × three possible positions for each gesture, that is, a set of six gestures.
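The zone assignment described above can be sketched as follows. The averaging of the z coordinates follows the text; the two threshold values, the centimetre unit, and all identifiers are assumptions chosen only for illustration, since the zone boundaries are not stated here.

```python
# Minimal sketch: assign a gesture to an active zone from its mean z value.
# Threshold values are hypothetical; real boundaries depend on the setup.
from statistics import mean
from typing import Sequence

NEAR_MAX_CM = 15.0   # assumed upper bound of the "near" zone
CLOSE_MAX_CM = 30.0  # assumed upper bound of the "close" zone


def classify_zone(z_coords_cm: Sequence[float]) -> str:
    """Return "near", "close", or "far" from the average foot distance."""
    average_cm = mean(z_coords_cm)  # raises StatisticsError on empty input
    if average_cm <= NEAR_MAX_CM:
        return "near"
    if average_cm <= CLOSE_MAX_CM:
        return "close"
    return "far"


def gesture_label(direction: str, z_coords_cm: Sequence[float]) -> str:
    """Combine a swipe direction with its zone into one gesture label."""
    return f"{direction}@{classify_zone(z_coords_cm)}"


if __name__ == "__main__":
    print(classify_zone([5.0, 10.0, 12.0]))
    print(gesture_label("swipe-left", [40.0, 50.0]))
```

With two directions and three zones, `gesture_label` can produce exactly six distinct labels, matching the size of the gesture set.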

V. RESULTS
The creation and implementation of SkySculptor, a specialized software application designed to improve drone interactions using foot gestures detected by radar, have produced positive results. The software was developed according to specific design requirements to ensure smooth compatibility with open-source technologies. Integrating radar sensors into the application's framework enabled drone control by recognizing foot movements detected by the Walabot radar device. A set of six unique directional swipe gestures was introduced, each linked to a predefined drone function, allowing users to easily perform actions such as take-off/landing, starting/stopping video recording, and changing direction. During the prototype phase, a 15-antenna ultra-wideband 3D radar was used in combination with the Parrot Mambo drone to validate the effectiveness of the gesture recognition system. The ability of the radar sensor to capture detailed foot trajectories at various distances and angles provided reliable data for gesture recognition. Detailed radar heat maps showed the gestures and the spatial positions of the radar sensor. Overall, the successful implementation of SkySculptor demonstrates its potential to extend human-drone interaction in smart indoor environments, paving the way for future advancements in this evolving field.

VI. DISCUSSIONS
The introduction of SkySculptor, a software application aimed at simplifying drone control through radar-based foot gestures in intelligent indoor spaces, raises several points for discussion. The central theme of this article is the effective merging of open-source technologies with the intricacies of smart indoor environments to support a variety of user interaction methods and technological functionalities. Using radar sensors, SkySculptor builds on conventional drone control techniques while offering users an intuitive interface based on foot gestures, which improves the accessibility of indoor drone manipulation. Moreover, the successful deployment and verification of SkySculptor underscore the potential of radar-based gesture recognition systems to transform the dynamics of human-drone interactions. These outcomes collectively pave the way for a wider acceptance of radar-based foot gesture control systems.

VII. LIMITATIONS AND FUTURE WORK
Our implementation uses Python-based APIs to control drones and radar devices. While Python's versatility and widespread support are commendable, certain specialized or restricted environments may have difficulty fully accommodating Python. This is particularly true for wearable devices or embedded systems, which may have limited resources or be optimized for other programming languages. Another limitation is the data transmission capacity of the radar device. Enhancing this capability could enable a wider range of gestures for drone control. Currently, the SkySculptor software tool does not provide a set of user-customizable gestures. Future development considerations include improving the gesture dictionary through code generation and incorporating a radar device capable of processing larger volumes of data. Lastly, drone compatibility is currently limited to Parrot drones. Adapting the software implementation to other drone brands presents a challenge. In addition, limitations extend to factors such as drone size, flight time, and sensor configurations. These challenges present opportunities for further development of the SkySculptor software system. Furthermore, different types of radar produce different types of data. It is therefore useful to investigate other characteristics of radar sensors, such as resolution or field of view, in order to implement various gesture designs. We leave the examination of diverse radar technologies for future research.

VIII. CONCLUSION
The software application SkySculptor presents a novel approach to controlling drones in smart indoor environments by using foot gestures sensed by radar. This addresses a gap in the current literature regarding the use of foot gestures for drone control. The article provides an overview of the context and scope of SkySculptor, emphasizing the fundamental principles of radar sensing and the potential of this technology for controlling drones in intelligent indoor environments. The related work highlights the limited information available on foot gestures for drone control despite previous research efforts. Additionally, the paper presents a detailed workflow for SkySculptor, which includes design requirements, an implementation approach, and a prototype of the proposed system.

Fig. 1. The Walabot radar device, used in SkySculptor, is shown alongside the Parrot Mambo drone and interaction modes.

Fig. 2. Various mid-air gestures combining directional swipes and distance from the sensor enable control of drone functions.

Fig. 3. The radar sensor captures θ−R images of various actions, including (a) take-off, (b) landing, (c) starting and (d) stopping video recording, (e) moving forward, and (f) moving backward. Each image shows foot gestures and the distance measured by the radar sensor.