Oscillation Preventing Closed-Loop Controllers via Genetic Algorithm for Biped Walking on Flat and Inclined Surfaces

In this study, a closed-loop controller is designed via the Genetic Algorithm (GA) to overcome the dynamical insufficiency of the 3D Linear Inverted Pendulum Model (LIPM). The main idea is to keep using the 3D LIPM because of its ease of modeling and to compensate for its flaws with a closed-loop controller. Only the legs are used to suppress the dynamical flaws; in other words, the robot has no upper-body elements, which keeps it more modular. For this purpose, a biped is modeled with the 3D LIPM, one of the best-known modeling methods for humanoid robots thanks to its ease of modeling and fast calculations during trajectory planning. After the simple model is obtained, Model Predictive Control (MPC) is applied to the 3D LIPM to find the reference trajectories for the biped while satisfying the Zero Moment Point (ZMP) criterion. The resulting reference trajectories are applied to the full dynamical model in Matlab Simulink and to the real biped in the laboratory at Istanbul Technical University. The simulation results on flat and inclined surfaces and the real-time experiments on a flat surface reveal some dynamical flaws caused by the simple modeling. To overcome these flaws, a Proportional-Integral (PI) controller is designed, and the optimal values of the controller gains are found by the GA. The results show that the designed controller overcomes the observed flaws and makes the biped move more stably, more smoothly, and without steady-state error.

Keywords: Humanoid robot; biped walking; Model Predictive Control (MPC); Genetic Algorithm (GA); trajectory planning; Zero Moment Point (ZMP); linear inverted pendulum


I. INTRODUCTION
In recent years, with the growing interest in humanoid robots, they have started to replace humans in hazardous environments. Although their use is growing along with this interest, imitating human-like movements still poses many difficulties [1,2]. One of the most interesting human movements is locomotion on two legs, since it can adapt to flat, inclined, and even uneven surfaces. Stable walking can be described as walking without falling to the ground.
To maintain stability during walking, Visual Simultaneous Localization and Mapping, also known as Visual SLAM, has been used in [3] to estimate the Center of Mass (CoM), while the Zero-Moment Point (ZMP) is measured via force/torque sensors. The increased sensing and computational load results in a promising performance, which has also been tested under pushes and perturbations. Moving the torso to maintain stability [4] is another popular approach, extending the stability control problem to that of the humanoid body [5]. Those studies do address the dynamical flaws of control approaches using the simple 3D Linear Inverted Pendulum Model (LIPM), but at the cost of an increased number of sensors, computational load, and system complexity.
In this study, we aim to improve the walking performance of a biped by further exploiting the capabilities of the simpler control approaches based on the 3D LIPM. The proposed method combines the approach in [6], developed for uneven and inclined surfaces, with the method in [7], based on the kinematic resolution of the CoM, and also compensates for some of the dynamical deficiencies of the 3D LIPM. The trajectory generation is performed with the ZMP, but unlike other studies, such as [8] and [9], our objective function takes the ZMP into account as a constraint and aims to minimize the hip tracking error rather than the ZMP error. The justification of this approach is that the derivation of the ZMP uses approximations, while the hip point can be derived more accurately by conventional Jacobian kinematics. Simulation results show that the objective functions defined in [8] and [9] cause the biped to oscillate without a closed-loop controller. Another novelty of this study is the consideration of a biped system alone in the development of improved walking performance and stability, without any compensation coming from the additional DoFs of the torso or the rest of the humanoid body as in the abovementioned studies. The use of the simplified model still gives rise to some oscillations at the hip during walking, and these oscillations are eliminated with a simple feedback controller whose coefficients are determined by the Genetic Algorithm (GA), a benchmark optimization algorithm used in various areas such as redundant robots [10] and the tuning of controller parameters [11]. The success of the proposed method is shown both by simulation results on flat and inclined surfaces and by real-time experimental results on a flat surface.
This paper is organized as follows. The second section provides background. In the third section, the modeling of the biped as a 3D LIPM is explained and the relation between the pendulum and the 12-DoF biped at Istanbul Technical University is given. In the fourth section, the concept of MPC is explained, and the results of the simulation with MPC are presented and discussed. In the fifth section, the proposed closed-loop control method and the developed GA are described and tested with simulation results. In the sixth section, real-time experimental results are provided, with final discussions in the conclusion section.

II. BACKGROUND
In order to maintain stable walking, humanoids must meet some stability criteria. One of the most popular concepts of stable walking is the ZMP [12]. The ZMP is the point on the ground at which the ground reaction forces caused by the movement of the humanoid do not produce any horizontal moment; hence, this point coincides with the ground projection of the center of mass when the robot is stationary. Consequently, keeping the ZMP inside the Support Polygon (SP) of the humanoid during locomotion guarantees the balance of the humanoid [13].
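The criterion above can be illustrated with a small check of whether a ZMP lies inside a convex support polygon. This is only a minimal sketch: the function name, the counter-clockwise vertex convention, and the foot dimensions are assumptions made for illustration and are not taken from the biped used in this study.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def zmp_inside_support_polygon(zmp: Point, polygon: List[Point]) -> bool:
    """Return True if zmp lies inside or on the border of a convex polygon.

    The polygon vertices are assumed to be listed counter-clockwise.
    """
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # z-component of (edge vector) x (vector from edge start to the ZMP).
        # For counter-clockwise vertices a negative value means "outside".
        cross = (x2 - x1) * (zmp[1] - y1) - (y2 - y1) * (zmp[0] - x1)
        if cross < 0.0:
            return False
    return True


# Hypothetical single-support foot sole: 0.20 m x 0.10 m, centred at the origin.
foot = [(-0.10, -0.05), (0.10, -0.05), (0.10, 0.05), (-0.10, 0.05)]

print(zmp_inside_support_polygon((0.02, 0.01), foot))  # ZMP under the sole
print(zmp_inside_support_polygon((0.15, 0.00), foot))  # ZMP ahead of the toe
```

In the double support phase the same test would apply to the convex hull of both feet.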
On the other hand, the exact derivation of the ZMP is a complex task. In order to obtain the ZMP easily, approximate dynamical models have been investigated. Because of its simplicity in the representation of the ZMP, the most commonly used approximate model is the 3D LIPM [14]. This model provides a reasonable ZMP position while the humanoid is walking and can be used for ZMP trajectory planning, but it is too simple to reflect all the dynamical properties of a humanoid; e.g., it does not contain any relation between the foot and the ground. In some studies, the actuator positions are used for the calculation of the CoM [15], but this calculation does not take contact forces into consideration; hence, it cannot suppress the disturbances associated with contact forces. Position feedback of the joints only allows reference tracking control of the joints, without any information about the interaction with the floor; e.g., if the biped is on an inclined surface, the position feedback of the joints cannot reflect the inclination of the surface.
The simplicity of the 3D LIPM has opened the path for the use of Model Predictive Control as the "trajectory planner" for walking in several studies. MPC allows for real-time implementation of optimal control principles and has gained increasing popularity in many areas, from numerous automotive applications [16][17][18] to pH neutralization processes [19]. It is also a suitable control method for trajectory planning of humanoid robots, since the ZMP can be defined by a simple model and the online optimization problem can be solved sufficiently fast while enforcing the constraints on the ZMP. This aspect of MPC is recognized by many researchers in humanoids, starting with the design of a preview controller for the ZMP to generate the walking patterns [8]. After preview controllers, [9] redefined the trajectory generation problem with constraints on the ZMP, using simulations on a humanoid model. That study also considers push recovery, similar to several other studies, such as [20,21].

III. MODELING OF THE BIPED
In this section, the modeling of the biped with the 3D LIPM is discussed and the expressions of the 3D LIPM are given. The 3D LIPM can be used to compute the ZMP simply. A point mass is assumed to be concentrated on the tip of the 3D LIPM, and the pendulum itself is assumed to be massless. Since the pendulum is 3D, the same equations of motion can be used for modeling the pendulum on both the x-axis and the y-axis. In this study, the 3D LIPM is used to model the biped for calculating the ZMP. Fig. 1 shows the isometric view of the 3D LIPM.
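The dynamic model of the 3D LIPM can be stated in matrix form by taking the jerks $\dddot{x}$ and $\dddot{y}$ of the mass concentrated on the tip of the 3D LIPM as control inputs:

$$\frac{d}{dt}\begin{bmatrix} x \\ \dot{x} \\ \ddot{x} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x \\ \dot{x} \\ \ddot{x} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\dddot{x}, \qquad \frac{d}{dt}\begin{bmatrix} y \\ \dot{y} \\ \ddot{y} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} y \\ \dot{y} \\ \ddot{y} \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\dddot{y} \tag{1}$$

Here $x$ and $y$ are the linear positions, $\dot{x}$ and $\dot{y}$ are the linear velocities, and $\ddot{x}$ and $\ddot{y}$ are the linear accelerations of the 3D LIPM. As stated before, $\dddot{x}$ and $\dddot{y}$ are the jerks of the 3D LIPM. $p_x$ and $p_y$, which are the linear positions of the ZMP on the x-axis and y-axis respectively, can be written in terms of the three states:

$$p_x = x - \frac{z}{g}\,\ddot{x}, \qquad p_y = y - \frac{z}{g}\,\ddot{y} \tag{2}$$

Here $z$ is the linear position of the 3D LIPM on the z-axis, which is the height of the 3D LIPM, and $g$ is the acceleration due to gravity. From now on the height of the 3D LIPM will be taken as the constant $z_c$.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 5, 2020, www.ijacsa.thesai.org

Fig. 2 shows the Solidworks drawing of the biped with the pendulum. The point mass of the pendulum is located on the hip of the biped, so the CoM of the biped is assumed to be located on the hip.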

IV. MODEL PREDICTIVE CONTROL

A. Preview of the Model Predictive Control
MPC is an optimal control algorithm that uses a model to make predictions about future outputs of a process while satisfying inequality constraints on the input and output variables. MPC can be used for controlling multi-input multi-output systems. In order to control a system, MPC needs a reasonably accurate model of the system, and it solves an online optimization problem to find the best control action that makes the output follow the reference. Fig. 3 shows the block diagram of the 3D LIPM that is controlled by a Linear MPC during this study. By using the MPC, the reference trajectories for the hip of the biped can be produced while enforcing the constraints on the ZMP. As stated in Section III, the 3D LIPM is transformed into a linear system by taking the height of the pendulum as constant.
In this section, all the equations for MPC are derived following Wieber [9] and only for the x-axis, since, as can be seen from Equations (1) and (2) in Section III, the derivations of the position of the ZMP for the x and y axes are analogous. By taking the height of the 3D LIPM as the constant $z_c$, the output equation turns from a nonlinear equation into a linear one. Equations (1) and (2) can be discretized by trivial integration over the sampling period $T$. With trivial integration, the relation of the next states with the current states and the control signal can be written as follows:

$$\hat{x}_{k+1} = \begin{bmatrix} 1 & T & T^2/2 \\ 0 & 1 & T \\ 0 & 0 & 1 \end{bmatrix}\hat{x}_k + \begin{bmatrix} T^3/6 \\ T^2/2 \\ T \end{bmatrix}\dddot{x}_k \tag{3}$$

$$p_k = \begin{bmatrix} 1 & 0 & -z_c/g \end{bmatrix}\hat{x}_k \tag{4}$$

Here $\hat{x}_k = [x_k \;\; \dot{x}_k \;\; \ddot{x}_k]^T$ is the state vector at step $k$, $\dddot{x}_k$ is the control signal at step $k$, and $p_k$ is the ZMP position at step $k$.
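As an illustration of the discretization described above, the following sketch advances a single-axis state by one sampling period under a constant jerk and evaluates the corresponding ZMP. The function names and the numerical values in the usage lines (pendulum height, jerk) are assumptions chosen for the example only.

```python
from typing import Tuple

G = 9.81  # gravitational acceleration [m/s^2]

State = Tuple[float, float, float]  # (position, velocity, acceleration)


def lipm_step(state: State, jerk: float, T: float) -> State:
    """Integrate a constant jerk over one sampling period T (single axis)."""
    x, v, a = state
    x_next = x + T * v + 0.5 * T * T * a + (T ** 3 / 6.0) * jerk
    v_next = v + T * a + 0.5 * T * T * jerk
    a_next = a + T * jerk
    return (x_next, v_next, a_next)


def zmp_position(state: State, z_c: float) -> float:
    """ZMP of the constant-height pendulum: position minus (z_c / g) * acceleration."""
    x, _, a = state
    return x - (z_c / G) * a


# Usage with a hypothetical pendulum height of 0.30 m and a 0.01 s sampling period.
state: State = (0.0, 0.0, 0.0)
for _ in range(5):
    state = lipm_step(state, jerk=2.0, T=0.01)
print(state, zmp_position(state, z_c=0.30))
```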
The constraint on the position of the ZMP for stable walking at step $k$ can be defined as follows, where $p_k^{\min}$ is the minimum allowed value and $p_k^{\max}$ is the maximum allowed value of the ZMP at step $k$:

$$p_k^{\min} \le p_k \le p_k^{\max} \tag{5}$$

  
The main purpose of the optimization problem is to find all the jerks $\dddot{x}_k$ that minimize the cost function stated below. Formulating the problem as a Quadratic Program (QP) over a finite prediction horizon of $N$ steps allows the optimal control problem to be solved through some simple matrix manipulations instead of having to solve a more complex algebraic Riccati equation. In contrast to the objective functions proposed by Kajita and Wieber [8,9], the aim here is to minimize the tracking error of the CoM while minimizing the jerks, instead of minimizing the tracking error of the ZMP. The reason is that the constraint in Equation (5) already guarantees that the ZMP stays inside the SP. Since the CoM of the 3D LIPM is assumed to be on the hip of the biped, the kinematic relation between the foot soles and the hip can be expressed directly with Jacobian kinematics. Equation (6) shows the objective function on both the x-axis and y-axis defined in this study:

$$\min_{\dddot{x},\,\dddot{y}} \; \sum_{i=k}^{k+N-1} \left[ \tfrac{1}{2} Q_x \left(x_{i+1} - x_{i+1}^{ref}\right)^2 + \tfrac{1}{2} R_x\, \dddot{x}_i^{\,2} + \tfrac{1}{2} Q_y \left(y_{i+1} - y_{i+1}^{ref}\right)^2 + \tfrac{1}{2} R_y\, \dddot{y}_i^{\,2} \right] \tag{6}$$

Here $Q_x$ and $Q_y$ are the weight values of the tracking error and $R_x$ and $R_y$ are the weight values of the control signal.
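By using the recursive relation iterated $N$ times, the relation between the jerks and the coordinates of the ZMP over the whole horizon can be defined as follows:

$$\begin{bmatrix} p_{k+1} \\ \vdots \\ p_{k+N} \end{bmatrix} = \begin{bmatrix} 1 & T & \frac{T^2}{2} - \frac{z_c}{g} \\ \vdots & \vdots & \vdots \\ 1 & NT & \frac{N^2 T^2}{2} - \frac{z_c}{g} \end{bmatrix} \hat{x}_k + \begin{bmatrix} c_0 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ c_{N-1} & \cdots & \cdots & c_0 \end{bmatrix} \begin{bmatrix} \dddot{x}_k \\ \vdots \\ \dddot{x}_{k+N-1} \end{bmatrix}, \qquad c_m = \left(1 + 3m + 3m^2\right)\frac{T^3}{6} - \frac{T z_c}{g} \tag{7}$$

This relation can be shown in compact form as follows:

$$P_{k+1} = P_x\, \hat{x}_k + P_u\, \dddot{X}_k \tag{8}$$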
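The stacked prediction described above can be sketched in a few lines. The code below builds the two prediction matrices for a jerk-driven, constant-height pendulum and predicts the ZMP over the horizon; the function names and the pendulum height used in the usage lines are assumptions, and the QP solution itself is not included.

```python
from typing import List, Sequence, Tuple

G = 9.81  # gravitational acceleration [m/s^2]

Matrix = List[List[float]]


def zmp_prediction_matrices(N: int, T: float, z_c: float) -> Tuple[Matrix, Matrix]:
    """Build P_x (N x 3) and P_u (N x N) so that the ZMP over the next N steps
    equals P_x * state + P_u * jerks."""
    h = z_c / G
    # Row i (0-based) predicts the ZMP (i + 1) steps ahead.
    Px = [[1.0, (i + 1) * T, 0.5 * ((i + 1) * T) ** 2 - h] for i in range(N)]
    Pu = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1):
            m = i - j  # number of steps the jerk has propagated through
            Pu[i][j] = (1 + 3 * m + 3 * m * m) * T ** 3 / 6.0 - T * h
    return Px, Pu


def predict_zmp(Px: Matrix, Pu: Matrix, state: Sequence[float],
                jerks: Sequence[float]) -> List[float]:
    """Evaluate the compact prediction for a given state and jerk sequence."""
    out = []
    for row_x, row_u in zip(Px, Pu):
        free = sum(c * s for c, s in zip(row_x, state))
        forced = sum(c * u for c, u in zip(row_u, jerks))
        out.append(free + forced)
    return out


# Usage: horizon and sampling period as in the simulations, hypothetical height.
Px, Pu = zmp_prediction_matrices(N=150, T=0.01, z_c=0.30)
print(len(Px), len(Px[0]), len(Pu), len(Pu[0]))
```

With zero jerks, the prediction reduces to the free response of the current state, which is a convenient sanity check before the matrices are handed to a QP solver.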

B. Simulation Results of the Linear Model Predictive Control
In this section, the Matlab Simulink simulation results of the biped are given and discussed. Before the Linear MPC can be used in simulations, some parameters need to be defined: the number of states, the number of outputs, the number of control signals, the sampling period, the prediction horizon, the control horizon, the initial conditions of the states and the control signals, and the weights of the optimization problem. There are 3 states, 1 output, and 1 control signal for each axis. The sampling period is chosen as 0.01 s. The prediction horizon is 150 steps and the control horizon is 16 steps. The ratios of the weights are selected as 1000 on both the x and y axes. All the initial conditions are set to 0. Fig. 4 shows the general block diagram of the system. As stated before, the MPC produces jerk inputs for the 3D LIPM while satisfying the constraints on the ZMP locations on both the x-axis and y-axis. While the 3D LIPM tracks the reference trajectories, it also produces reference positions for the biped's CoM, which is assumed to be on the hip of the biped as the point mass of the pendulum. These reference positions are evaluated in the swing leg and stance leg selector block, which determines the reference positions for the swing leg and the stance leg with respect to the phases of cyclic walking and the foot positions. Finally, this block derives reference velocities for the biped's hip point. The reference velocities are passed to the inverse kinematics function, which produces angular velocity references for the 12 DoF of the biped. These angular velocities are differentiated and integrated to find the angular accelerations and angular positions, respectively. The angular accelerations, velocities, and positions are then applied to the biped. Here the magenta line shows the trajectory of the CoM and the green line shows the projection of the CoM.
When the figure is examined, it can be seen that the biped also moves in the positive y-direction, although a position change is expected only in the x-direction when walking in a straight line. Fig. 6 shows the biped's final position and the trajectory of the CoM on the xy-plane. The projection of the trajectory of the CoM overlaps with itself, so only one line can be seen in the figure. The movement in the positive y-direction can also be seen clearly; it results from the oscillation of the biped during the walk. When one leg leaves the ground and becomes the swinging leg, the biped is in the Single Support Phase (SSP). At this time the support polygon reaches its smallest surface area and the walking becomes less robust. The glitches on the trajectory of the CoM can also be seen in the figure.

Fig. 7 shows the trajectory of the CoM on the x, y, and z axes. The first graph is the trajectory on the x-axis, the second is the trajectory on the y-axis, and the third is the trajectory on the z-axis. From the graph of the trajectory on the y-axis, it can be seen that the biped starts oscillating when it is at its maximum distance from the middle position on the y-axis. In addition, as time progresses the offset of the trajectory on the y-axis shifts to positive values.

Fig. 9 shows the trajectories on the x-axis. The blue line is the reference trajectory that is defined for the 3D LIPM. The red line is the trajectory of the 3D LIPM, in other words the output of the closed-loop system with Linear MPC. The green line is the trajectory of the biped's hip and the magenta line is the trajectory of the calculated ZMP. It can be seen that the 3D LIPM follows the reference trajectory without any steady-state error; however, the biped cannot follow this trajectory. The ZMP has peaks at the beginnings and the endings of the steps because of the inertia of the biped.

From the figure it can be seen that the 3D LIPM cannot track the reference well, in other words it tracks the reference slowly, but it has a smooth movement. The biped has a steady-state error that can be seen from the offsets, and it also oscillates when it is at the limits on the y-axis. The ZMP has peaks but nevertheless has no error, because it is a calculated value, not a measured value, as stated before.

Fig. 11 and Fig. 12 show the trajectories of the biped with the objective functions defined by Kajita [8] and Wieber [9] on the x-axis and y-axis, respectively. The blue lines are the reference trajectories, the red lines are the 3D LIPM trajectories, the green lines are the biped's hip trajectories, and the magenta lines are the ZMP trajectories. The main difference from the objective function defined in this study lies in the ZMP trajectories. The oscillations of the biped in the SSP can be seen from the figures, although the optimization problem tries to minimize the ZMP tracking error. These oscillations cause the biped to have steady-state errors on both axes. The figures show that the biped oscillates whether the optimization problem tries to minimize the CoM trajectory error or the ZMP trajectory error.

V. CLOSED-LOOP CONTROLLER VIA GENETIC ALGORITHM
In this section the proposed closed-loop control method and the search for the optimal controller gains are explained.

A. Closed-loop Controller
As stated before, the 3D LIPM is one of the most widely used models for modeling a biped and deriving the ZMP definition. However, despite its ease of modeling and low computational load, the 3D LIPM has some dynamical flaws. The first one is the mass concentrated at the tip of the inverted pendulum, which does not reflect the change in the position of the real CoM of the biped during the movement. The biped used in this study has 12 motors, and the weights and inertia tensors of these motors are much larger than those of the links, so the links can be neglected in the dynamical analysis. The real CoM and the ZMP of the biped are functions of the positions and accelerations of these motors, so their values change during the movement with the position and acceleration of each motor. These changes cannot be expressed by a simple model. So, although the 3D LIPM tracks the reference trajectory well during the movement, the biped cannot track its reference as successfully as the 3D LIPM.
The SSP is the least robust phase of walking because the support polygon has its smallest surface area, which is equal to the area of the support foot's projection on the ground. In addition, while the support polygon has its smallest area, one leg of the biped is swinging and the dynamics of the biped change abruptly, which cannot be expressed by the 3D LIPM. From Fig. 13 it can be clearly seen that the biped is oscillating on the y-axis. These oscillations do not cause the biped to fall in this study, but at higher walking speeds they can cause the biped to fall to the ground. Even if the biped can still walk, it has steady-state errors on both the x-axis and the y-axis, as stated before. These errors can also cause the biped to walk to undesired locations.
In order to eliminate these oscillations, a closed-loop control method based on a PI controller is investigated. The tracking error on the y-axis is stated as follows:

$$e_y = \hat{y} - y \tag{9}$$

Here $\hat{y}$ is the position of the 3D LIPM on the y-axis and $y$ is the position of the biped on the y-axis. The position reference of the biped is the position output of the 3D LIPM. The results of the simulations are examined, and the tracking error is observed in the SSP as expected, so the correction must be applied during the SSP. Because of the discontinuity, applying this correction as a square wave makes the system unstable. To avoid this instability, the correction must be applied as a sine-like waveform, so the error must be modulated. In order to modulate the error signal, it is multiplied by the movement of the swinging leg on the z-axis. The proposed PI controller can be expressed as follows:

$$u_y(t) = K_p\, \hat{z}(t)\, e_y(t) + K_i \int_0^{t} \hat{z}(\tau)\, e_y(\tau)\, d\tau \tag{10}$$

Here $\hat{z}$ is the position of the swinging foot on the z-axis, $e_y$ is the tracking error on the y-axis, $u_y$ is the resulting correction, and $K_p$ and $K_i$ are the controller gains. Fig. 14 shows the detailed block diagram of the biped system with the PI controller and the 3D LIPM with Linear MPC. The red blocks are the blocks added to turn the biped system into a closed-loop system. An accelerometer is added on the hip of the biped to measure the accelerations on all three axes. The measured accelerations of the hip are processed by the position estimator block in order to estimate the position of the hip on the y-axis. This block mainly filters the accelerations and integrates them to obtain positions. After the position of the hip on the y-axis is found, Equations (9) and (10) are implemented by the added blocks.
The optimal values of the controller gains $K_p$ and $K_i$ will be searched by the GA in the next section.
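A discrete-time sketch of the modulated PI correction described in this subsection is given below. It assumes that the tracking error is multiplied by the swing-foot height before both the proportional and the integral terms, and it uses a simple rectangular integration; the class name, the gains, and the sample values are hypothetical and serve only to illustrate the idea.

```python
class ModulatedPI:
    """PI correction on the y-axis whose error is scaled by the swing-foot height.

    Because the swing-foot height is zero while both feet are on the ground,
    the correction vanishes smoothly outside the single support phase.
    """

    def __init__(self, kp: float, ki: float, dt: float) -> None:
        self.kp = kp
        self.ki = ki
        self.dt = dt
        self.integral = 0.0  # running integral of the modulated error

    def update(self, y_lipm: float, y_biped: float, z_swing: float) -> float:
        error = y_lipm - y_biped          # tracking error on the y-axis
        modulated = z_swing * error       # modulation by swing-foot height
        self.integral += modulated * self.dt
        return self.kp * modulated + self.ki * self.integral


# Usage with hypothetical gains and a 0.01 s sampling period.
controller = ModulatedPI(kp=2.0, ki=0.5, dt=0.01)
print(controller.update(y_lipm=0.05, y_biped=0.04, z_swing=0.0))   # double support
print(controller.update(y_lipm=0.05, y_biped=0.04, z_swing=0.02))  # single support
```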

B. Genetic Algorithm
The GA is a stochastic search algorithm based on the theory of natural evolution. It can be used on both constrained and unconstrained optimization problems. The GA starts with a population of candidate solutions, the number of which is the population size. At every step, the GA selects individuals from the population as parents, and the chromosomes of the selected individuals are combined to produce a new generation. The new generation inherits the characteristics of its parents. As in evolution, offspring produced from high-quality parents are expected to have better characteristics. In an optimization problem, the quality of an individual is measured by a fitness function. After successive generations, the population converges toward an optimal solution.
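The procedure described above can be sketched as a small search over two controller gains. This is not the GA implementation used in this study: the tournament selection, blend crossover, Gaussian mutation, elitism, and the quadratic stand-in fitness are all assumptions chosen for a compact example, and the crossover fraction is interpreted here as the share of non-elite children produced by crossover.

```python
import random
from typing import Callable, List, Tuple

Gains = Tuple[float, float]


def genetic_search(fitness: Callable[[Gains], float],
                   bounds: Tuple[float, float],
                   pop_size: int = 50,
                   crossover_fraction: float = 0.8,
                   generations: int = 40,
                   seed: int = 0) -> Tuple[Gains, List[float]]:
    """Minimise fitness over two gains; return the best gains and the best
    fitness value recorded at each generation."""
    rng = random.Random(seed)
    lo, hi = bounds
    population: List[Gains] = [(rng.uniform(lo, hi), rng.uniform(lo, hi))
                               for _ in range(pop_size)]
    history: List[float] = []
    n_elite = 2
    n_cross = int(round(crossover_fraction * (pop_size - n_elite)))

    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        history.append(fitness(ranked[0]))

        def tournament() -> Gains:
            a, b = rng.sample(ranked, 2)
            return a if fitness(a) < fitness(b) else b

        next_pop: List[Gains] = list(ranked[:n_elite])  # elitism keeps the best
        while len(next_pop) < pop_size:
            p1 = tournament()
            if len(next_pop) - n_elite < n_cross:
                # Blend crossover: a random convex combination of two parents.
                p2 = tournament()
                w = rng.random()
                child = (w * p1[0] + (1 - w) * p2[0],
                         w * p1[1] + (1 - w) * p2[1])
            else:
                # Gaussian mutation, clipped to the search bounds.
                sigma = 0.05 * (hi - lo)
                child = (min(hi, max(lo, p1[0] + rng.gauss(0.0, sigma))),
                         min(hi, max(lo, p1[1] + rng.gauss(0.0, sigma))))
            next_pop.append(child)
        population = next_pop

    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(gains: Gains) -> float:
    """Stand-in for a walking simulation: a bowl with its minimum at (3, 1)."""
    return (gains[0] - 3.0) ** 2 + (gains[1] - 1.0) ** 2


best_gains, best_per_generation = genetic_search(toy_fitness, bounds=(0.0, 10.0))
print(best_gains, best_per_generation[-1])
```

In the setting of this paper, the stand-in fitness would be replaced by a full walking simulation that returns the tracking-error cost for a given pair of gains, which is by far the most expensive part of each generation.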
The fitness function is identified as the integral square error on the y-axis and is defined as follows:

$$J = \int_0^{t_f} e_y^2(t)\, dt, \qquad e_y(t) = \hat{y}(t) - y(t) \tag{11}$$

Here the error $e_y$ is the tracking error of the biped on the y-axis, $\hat{y}$ is the position of the 3D LIPM on the y-axis, $y$ is the position of the biped on the y-axis, and $t_f$ is the end time of the walk. The population size is selected as 50 and the crossover rate as 0.8. The optimal gain values $K_p$ and $K_i$ for 20 steps of walking are found as 265.31 and 115.33, respectively. The cost function value is found as 0.0000801 after 34 iterations.

Fig. 15 shows the biped's final position after 20 steps. Here the magenta line shows the trajectory of the CoM and the green line shows the projection of the CoM. No anomaly can be seen in the figure, whereas without the PI controller the offset of the biped's position on the y-axis drifted, which appeared as an anomaly in the corresponding figure. The PI controller thus appears to overcome this steady-state error on the y-axis.

Again, there is no offset change in the y-direction for the CoM, so the biped is walking in a straight line along the x-axis without any slipping along the y-axis.

Fig. 19 shows the trajectories on the x-axis. The blue line is the reference trajectory of the 3D LIPM, the red line is the trajectory output of the 3D LIPM, the green line is the hip trajectory of the biped, and the magenta line is the calculated ZMP trajectory. The 3D LIPM follows the reference trajectory as before, since nothing has been added to the Linear MPC used in the previous section. In addition, unlike in the previous section, the biped tracks the output of the 3D LIPM without error. The ZMP has peaks at the beginnings and the endings of the steps because of the inertia of the biped.

Fig. 20 shows the trajectories on the y-axis in the same order as the previous figure. Again, the 3D LIPM tracks the reference trajectory as in the previous section, but here the biped tracks the reference with only a small remaining oscillation. The ZMP has peaks on the x-axis.

Fig. 21 shows the comparison between the open-loop biped system and the closed-loop biped system on the x-axis. As mentioned before, a PI controller has been added to the system mainly to overcome the oscillations of the biped in the SSP. However, it can be seen that, because of these oscillations, the open-loop biped also slips in the positive x-direction and the positive y-direction. In the figure for the x-axis, the red line, which is the biped's trajectory with the closed-loop controller, tracks the green line, which is the reference, without error, whereas the blue line, which is the biped's trajectory without the PI controller, cannot track the reference. It follows that oscillations during a less robust phase can cause the biped to slip, and that by overcoming these oscillations the biped can follow the trajectory on the sagittal plane without error.

Fig. 22 shows the effects of the PI controller on the y-axis. The blue line is the biped's trajectory without the PI controller, the red line is the biped's trajectory with the PI controller, and the green line is the reference trajectory of the biped. It can be seen that the biped cannot track the reference without the PI controller and that it oscillates in the SSP. Additionally, the amplitude of the movement on the y-axis is not equal to the amplitude of the reference. As time progresses, the biped drifts in the positive y-direction, resulting in an offset error. When the red line is examined, there are very few oscillations compared to the blue line, and there is neither an offset error nor an amplitude error on the y-axis with the PI controller.
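The integral square error used as the fitness function in this subsection can be approximated from sampled trajectories with a rectangular sum. The sketch below assumes equally spaced samples; the function name and the sample values are hypothetical.

```python
from typing import Sequence


def integral_square_error(y_lipm: Sequence[float],
                          y_biped: Sequence[float],
                          dt: float) -> float:
    """Rectangular approximation of the integral of the squared y-axis
    tracking error between the LIPM output and the biped hip position."""
    if len(y_lipm) != len(y_biped):
        raise ValueError("trajectories must have the same number of samples")
    return sum((r - m) ** 2 for r, m in zip(y_lipm, y_biped)) * dt


# Usage with three hypothetical samples taken every 0.01 s.
reference = [0.000, 0.010, 0.020]
measured = [0.000, 0.008, 0.023]
print(integral_square_error(reference, measured, dt=0.01))
```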

C. Simulation Results of the Linear MPC with PI Controller
As a final result of all these steps, the PI controller is suggested for preventing the biped from oscillating in the SSP. Unless these oscillations are suppressed, they result in slipping on both the x-axis and the y-axis at normal walking speeds. Overcoming the oscillations also avoids slipping, which can drive the biped into an unstable region instead of stable walking. Fig. 23 shows all the phases of cyclic walking. The biped starts from the Double Support Phase (DSP) and then alternates between the SSP and the DSP while walking. The shape and the surface area of the support polygon change in a fixed order from phase to phase. The support polygon is a rectangle at the beginning and then becomes a convex hull, which is followed by a smaller rectangle. This two-shape transition continues until the last step, which ends with the same rectangle as at the beginning.

D. Simulation Results of the Closed-Loop Controller on Inclined Surfaces
The suggested PI controller is also examined on inclined surfaces to show its success. First, a simulation is run on an ascending surface; the slope of the surface is selected as and both the kinetic and viscous friction coefficients are selected the same as in the previous simulations. Second, a simulation is run on a descending surface; the slope of the surface is selected the same as for the ascending surface, only the slope direction is changed, and the friction coefficients are selected the same as for the ascending surface. Fig. 24 shows the ascending and descending surfaces from left to right.

It can be seen that the biped slips along the x-axis in both cases; the closed-loop system has no effect on the x-axis, because the biped mainly slips forward because of the slope. From the second figure, however, it can be seen that the PI controller suppresses the oscillations on the y-axis as it does on the flat and ascending surfaces.

VI. EXPERIMENTAL RESULTS
In this section, the experimental results of the biped both without and with the PI controller are given. The biped consists of 12 Dynamixel servo motors. The main controller is a Microautobox 2 from Dspace. All the motors are connected to the Microautobox 2 through the serial port, and the calculations are done in real time on the Microautobox 2.

It can be seen that the biped also moved along the y-axis, although no translation reference is applied on that axis. Because of the slipping, the biped nearly moved off the walking surface.

Fig. 31 shows the scenes from the initial to the final position of the biped during walking for 20 steps with the suggested PI controller. It can be seen that the biped has moved much less along the y-axis compared with Fig. 29. Again, no translation reference is applied on the y-axis; the small remaining translation is caused by the backlash of the motors, so the suggested PI controller can also be considered successful in the real-time experiment. In addition, the steady-state error on the x-axis is prevented by the closed-loop controller.

VII. CONCLUSIONS
The effectiveness of MPC and the suitability of the 3D LIPM for biped modeling have been explained and demonstrated in the literature. Although the simple 3D LIPM cannot reflect all the dynamical properties during walking, as expected, it can be used for trajectory planning and for making calculations faster.
In this paper, a PI controller is suggested in order to avoid dealing with the highly complicated dynamical model of the 12-DoF biped. The aim is to keep using the simple model because of its ease of use while overcoming its dynamical flaws. For this purpose, a biped is modeled with a conventional 3D LIPM, and reference trajectories are created with Linear MPC. The least robust phase of walking, the SSP, is examined, and the PI controller is applied during this phase. The optimal values of the controller gains are searched by the GA, a well-known optimization method, to minimize the tracking error during the SSP. After the optimal gain values are found, the suggested method is first examined on flat and inclined surfaces in Matlab Simulink simulations and then applied to the biped in the laboratory on a flat surface.
The success of the suggested method can be seen in both the simulation results and the real-time experimental results. With the help of the suggested method, the dynamical flaws of the simple model that cause oscillations and steady-state error on the walking surface can be compensated during walking. Without facing the highly complicated dynamical model of a biped, the dynamical flaws of the simple 3D LIPM can be suppressed by the suggested method. The suggested method also shows that this robot does not need any upper-body element to overcome the oscillations and establish stable walking. As future work, the proposed closed-loop control algorithm will be tried on two-dimensional walking in order to reach all the points on a surface. This walking includes rotation of the biped, which is not included in this study, so it will also require angular position tracking in addition to the linear position tracking. Another future study will be push recovery, as it is a very important task for bipeds to maintain stable walking. The control algorithm will also be tried on climbing and descending ladders as future work.