Linear Quadratic Regulator Design for Position Control of an Inverted Pendulum by Grey Wolf Optimizer

In this study, a linear quadratic regulator (LQR) based position controller is designed and optimized for an inverted pendulum system. Two variables, the vertical pendulum angle and the horizontal cart position, must be controlled together to move the pendulum to a desired position. PID controllers are conventionally used for this purpose, and two separate PID controllers are needed to move the pendulum. LQR is an alternative method: the angle and position of the inverted pendulum can be controlled using a single LQR. The main problem in designing an LQR is determining the Q and R matrices, which must be chosen to minimize a defined performance index. These matrices are generally determined by trial and error, but finding the optimum parameters in this way is difficult and not guaranteed. An optimization algorithm can be used instead, making it possible to obtain optimum controller parameters and high performance. For this reason, an optimization method, the grey wolf optimizer, is used to tune the controller parameters in this study.

Keywords—Grey wolf optimizer; inverted pendulum; position controller; linear quadratic regulator; optimized controller design


I. INTRODUCTION
Linear Quadratic Regulator (LQR) is one of the optimal control methods and has been successfully applied to many systems. Selecting the controller parameters is the main problem when designing an LQR controller, because the selected parameters must minimize a performance index. The selection is conventionally made by trial and error, which makes the process difficult and potentially time-consuming and does not guarantee finding the optimum parameters. Optimization algorithms help designers overcome these problems and find one of the optimum solutions.
One of the basic systems in control theory is the DC motor, and the LQR controller is one of the methods used to control its speed and position. Ruderman et al. designed an LQR based speed controller for a DC motor [1]. Abut compared a PID controlled DC motor with an LQR controlled DC motor under disturbance, and the results showed that the LQR based system performed better than the PID based one [2]. Haron designed speed and position controllers for a DC motor using PID and LQR controllers and compared their performance. The results again showed that the LQR controller performed better than the PID controller [3].
Another popular system for control theory applications is the inverted pendulum. Kumar et al. designed an LQR based controller for the balance and trajectory tracking problem of a Self-Erecting Single Inverted Pendulum [4]. They reported that the LQR based system had a faster and smoother stabilizing process than a full state feedback controller designed by pole placement. Prasad et al. analyzed and compared PID and LQR controlled systems under disturbance [5]. The results confirmed the advantages of the LQR controller.
Trial and error is a widely used method for determining the elements of the Q and R matrices of an LQR controller [6]. However, many studies show that optimization algorithms help to determine the optimum parameters for the controller. Ata et al. designed an LQR based controller for an inverted pendulum on a cart. In that study, the elements of the Q and R matrices of the controller were selected by the Artificial Bee Colony (ABC) algorithm to achieve optimum performance. The optimization was carried out on a nonlinear model, and the results showed that the ABC optimized system had good performance [7].
In another study, an unmanned rotorcraft pendulum was controlled using an LQR optimized by the ABC and Particle Swarm Optimization (PSO) algorithms [8]. The designed system was also tested under disturbance, and the results showed that the ABC optimized system performed better than the PSO optimized system. Çatalbaş et al. designed an LQR controller for a Boeing 707 flight model, and the unstable model was controlled successfully by the LQR controller [9].
In this study, an inverted pendulum system is modeled and controlled by an LQR. The Q and R matrices of the LQR are optimized by the Grey Wolf Optimizer (GWO). The entire study is carried out in simulation using Matlab. Two different objective functions are used for the optimization process: first, the performance index of the LQR is used; then, an improved objective function, obtained by adding the settling time and the total absolute error to the performance index, is used. The controller is successfully optimized using both objective functions.

II. LINEAR QUADRATIC REGULATOR
LQR is one of the optimal control methods and is widely used in optimal control problems. It is used to control complex systems that require high performance. A system described by linear differential equations can be written in state-space form, as given in (1) and (2). A is the system matrix, B is the input matrix, C is the output matrix and D is the feedforward matrix; "x" is the state vector, "y" is the output vector and "v" is the input vector. A conventional LQR problem is to find the Q and R matrices that minimize the cost function (performance index), which is based on the input "v" [10]. The performance index "J" is defined as given in (3). The control energy is represented by $v(t)^{T} R\, v(t)$, while the transient energy is expressed as $x(t)^{T} Q\, x(t)$ [11]. Q is a symmetric positive semidefinite matrix and R is a symmetric positive definite matrix.
Designing an LQR controller consists of the following steps:
Step 1: The Q and R matrices that minimize J must be chosen.
Step 2: The algebraic Riccati equation, given in (4), is solved to obtain P using Q and R.
Step 3: The feedback gain matrix K is calculated from P, as given in (5).
Step 4: The system response is checked. If the response does not meet the required specifications, all steps are repeated.
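Equations (1) to (5) are not reproduced in this text. Under the standard continuous-time, infinite-horizon LQR formulation, which is assumed here, they would typically take the form sketched below. The control law $v(t) = -K x(t)$ belongs to that standard formulation rather than to anything stated above, and some references scale J by a factor of 1/2.

```latex
\begin{align}
\dot{x}(t) &= A\,x(t) + B\,v(t) \tag{1}\\
y(t) &= C\,x(t) + D\,v(t) \tag{2}\\
J &= \int_{0}^{\infty} \left( x(t)^{T} Q\, x(t) + v(t)^{T} R\, v(t) \right) dt \tag{3}\\
0 &= A^{T} P + P A - P B R^{-1} B^{T} P + Q \tag{4}\\
K &= R^{-1} B^{T} P, \qquad v(t) = -K\,x(t) \tag{5}
\end{align}
```

In this form, a larger Q penalizes state deviations more heavily, while a larger R penalizes control effort, which is why the choice of the two matrices shapes the closed-loop response.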

     
A pre-compensation factor must be used when the steady-state error of the system is larger than expected. The pre-compensation factor can be calculated using the equation given in (6).

 ( ( ) )  
As can be seen, the system must be well modeled to design an LQR controller. The system must be linearized if it is not linear, and all states of the system must be measurable or observable. LQR design is therefore a complex procedure, but it has an important advantage: all system states can be controlled with a single LQR controller. In this study, the pendulum position and the vertical angle of the pendulum are controlled by one LQR controller.

III. INVERTED PENDULUM ON A CART
The inverted pendulum, which is naturally nonlinear and unstable, is a popular system in control theory. Inverted pendulum balance research is classically based on an inverted pendulum on a cart, where the aim is to balance the pendulum by moving the cart [12]-[14]. The basic system is given in Fig. 1. The inverted pendulum is attached to the cart by a rotating joint. The angle "θ" changes when a sufficient force is applied to the cart. The aim of the system is to balance the pendulum on the vertical axis. Position control of the cart is also possible using an extra controller. The differential equations of the system can be derived using the Euler-Lagrange method. The equations of the system shown in Fig. 1 are given in (7) and (8) [15], [16]. "I" is the moment of inertia of the pendulum, "m" is the mass of the pendulum, "M" is the mass of the cart, "l" is the length of the pendulum, "x" is the cart position, "θ" is the angle between the pendulum and the vertical axis, and "F" is the input force.
When "θ" is sufficiently small, the equations can be linearized and the state-space equations of the system, given in (9), can be obtained. The term "a" used in (9) is given in (10). The system parameters are given in Table I.
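As a sketch of the linearization step, one common frictionless form of the cart-pendulum equations and their small-angle approximation is shown below. Several assumptions go beyond the text: θ is measured from the upward vertical and is positive toward +x, l is taken as the distance from the pivot to the pendulum's center of mass, g is the gravitational acceleration, and Δ is a symbol introduced here that may or may not coincide with "a" in (10). Sign conventions differ between references, so the entries of (9) may differ in sign from these.

```latex
% Nonlinear equations (assumed frictionless form)
\begin{align*}
(M+m)\,\ddot{x} + ml\,\ddot{\theta}\cos\theta - ml\,\dot{\theta}^{2}\sin\theta &= F \\
(I+ml^{2})\,\ddot{\theta} + ml\,\ddot{x}\cos\theta - mgl\sin\theta &= 0
\end{align*}
% Small-angle approximation: \sin\theta \approx \theta, \cos\theta \approx 1,
% and the second-order term \dot{\theta}^{2}\theta is neglected.
\begin{align*}
\Delta &= I(M+m) + Mml^{2} \\
\ddot{x} &= \frac{-m^{2}l^{2}g\,\theta + (I+ml^{2})\,F}{\Delta} \\
\ddot{\theta} &= \frac{(M+m)\,mgl\,\theta - ml\,F}{\Delta}
\end{align*}
```

With the state vector chosen as $[x,\ \dot{x},\ \theta,\ \dot{\theta}]^{T}$, these two expressions fill the second and fourth rows of the state-space matrices. The positive coefficient of θ in the last line reflects the instability of the upright equilibrium.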

IV. GREY WOLF OPTIMIZER
An optimization algorithm minimizes (or maximizes) a function called the objective function. The objective function is defined specifically for a system (or a problem) and is affected by the parameters of the system. Optimization algorithms minimize the objective function by changing its variables according to a particular strategy. In some algorithms, such as the Artificial Bee Colony algorithm, Particle Swarm Optimization and the Grey Wolf Optimizer, this strategy is inspired by creatures in nature.
The Grey Wolf Optimizer algorithm is inspired by grey wolves in nature. They are social animals and live in groups, generally of 5-12 members. There are four levels in a group, called alpha, beta, delta and omega. The group leaders, called alphas, make important decisions about matters such as hunting and sleeping. The alphas are the most dominant wolves in the group. They may not be the strongest members of the group, but they are the best at managing it. Beta wolves help the alphas in everything. When the alphas are away, ill or very old, the betas handle coordination and decision making for the group. They are under the control of the alpha wolves, but they can command the other wolves in the group. They also give feedback to the alphas about the other wolves and their activities. Omega wolves are at the bottom of the hierarchy. They always do what the dominant wolves want, and they are the last wolves allowed to eat. Omega wolves may seem to have no important role in the group, but it has been observed that the group suffers problems such as internal fighting in their absence [17], [18].
The delta wolves are another type; they are responsible for hunting, scouting and sentinel duty, and some of them may be caretakers or elders. Hunters help the alphas and betas. Sentinels protect the group, and scouts watch the surroundings and warn the group of any danger. Caretakers look after the weak or ill wolves.
Grey wolves also have a special hunting strategy. They track and approach the prey, then encircle, pursue and harass it until it stops moving. Finally, they attack the prey. Detailed information on the mathematical model of the algorithm can be found in [17].
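The hierarchy and encircling behavior described above map onto a short update rule in the commonly published form of GWO. The following is a minimal sketch of that rule, not the implementation used in this study: the function name `grey_wolf_optimizer`, the clamping of positions to the bounds, and the `sphere` test function are illustrative assumptions. The default agent and iteration counts simply mirror typical small settings.

```python
import random


def grey_wolf_optimizer(objective, bounds, n_agents=30, n_iter=50, seed=0):
    """Minimize `objective` over the box `bounds` with a basic GWO loop.

    bounds is a list of (low, high) pairs, one per decision variable.
    Returns (best_position, best_score, history of best scores).
    Assumes n_agents >= 3 so that alpha, beta and delta all exist.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    leaders = []   # best three (score, position) pairs found so far
    history = []

    for it in range(n_iter):
        # Rank the current wolves together with the previous leaders and
        # keep the best three as alpha, beta and delta.
        scored = [(objective(w), list(w)) for w in wolves]
        leaders = sorted(leaders + scored, key=lambda pair: pair[0])[:3]
        history.append(leaders[0][0])

        a = 2.0 - 2.0 * it / n_iter   # decreases linearly from 2 toward 0
        for w in wolves:
            for d in range(dim):
                estimate = 0.0
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a    # exploration vs. exploitation
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - w[d])     # distance to this leader
                    estimate += leader[d] - A * D
                lo, hi = bounds[d]
                # Average the three leader-guided estimates, then clamp (assumption).
                w[d] = min(max(estimate / len(leaders), lo), hi)

    best_score, best_position = leaders[0]
    return best_position, best_score, history


# Illustrative test function, not the objective used in the study.
def sphere(x):
    return sum(xi * xi for xi in x)


best_x, best_f, hist = grey_wolf_optimizer(sphere, [(-10.0, 10.0)] * 3)
```

For controller tuning, the decision vector would hold the tunable controller parameters, and `objective` would run the closed-loop simulation and return the chosen cost.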

V. EXPERIMENTAL STUDY
In this study, an inverted pendulum model is designed and controlled by an LQR controller. The entire study is carried out in simulation using Matlab. The Q and R matrices of the LQR controller are optimized by the GWO algorithm. The general block diagram of the LQR controlled system is given in Fig. 2. A, B and C are the system matrices, K is the feedback gain matrix and N is the pre-compensation factor.
The differential equations in the simulation are solved by the four-stage (fourth-order) Runge-Kutta method with a time step of 0.001 s. The total simulation time is 10 s. For the GWO algorithm, the number of search agents (individuals in the group) is set to 30 and the number of iterations to 50. The Q and R matrices are defined as diagonal matrices, and the range of each element is $1 \times 10^{-4}$ to $1 \times 10^{10}$.
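A fixed-step integration of this kind can be sketched as follows, assuming the classical fourth-order Runge-Kutta scheme is what is meant. The helper names `rk4_step` and `simulate` are hypothetical, and the undamped oscillator is placeholder dynamics used only to exercise the integrator; it is not the pendulum model.

```python
import math


def rk4_step(f, t, x, dt):
    """Advance x' = f(t, x) by one classical Runge-Kutta step of size dt."""
    k1 = f(t, x)
    k2 = f(t + dt / 2, [xi + dt / 2 * k for xi, k in zip(x, k1)])
    k3 = f(t + dt / 2, [xi + dt / 2 * k for xi, k in zip(x, k2)])
    k4 = f(t + dt, [xi + dt * k for xi, k in zip(x, k3)])
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(f, x0, dt=0.001, t_end=10.0):
    """Integrate from t = 0 to t_end and return the lists of times and states."""
    n_steps = round(t_end / dt)
    times, states = [0.0], [list(x0)]
    for i in range(n_steps):
        states.append(rk4_step(f, i * dt, states[-1], dt))
        times.append((i + 1) * dt)
    return times, states


# Placeholder dynamics: x'' = -x written as a first-order system.
def oscillator(t, x):
    return [x[1], -x[0]]


times, states = simulate(oscillator, [1.0, 0.0])
```

For the closed-loop system, `f` would instead evaluate the state derivative of the controlled pendulum model at each stage.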
An objective function is needed to tune the Q and R matrices when an optimization algorithm is used. The main objective function for LQR design is the performance index J, given in (3). Optimum controller design is possible when J is used as the objective function. The system outputs obtained when only J is used as the objective function are given in Fig. 3. The settling time of the position output is 14.18 s with 2% tolerance, and the settling time of the θ output is 19.36 s. The maximum error of θ is 0.04° and the performance index J is 0.141. As can be seen, the controller works well but the settling time is very long, which means the results may not meet the design requirements. In this case, an improved objective function is needed. The objective function used to meet the design requirements is given in (11). ST denotes the settling time, and $Z_{1,2}$ are coefficients that increase the effect of ST and of the integral of absolute error on the objective function. $Z_{1,2}$ is selected as $1 \times 10^{8}$.

 ∫ ( )  
At the end of the optimization process, the Q and R matrices were optimized as given in (12) and (13). The value of the performance index J is $3.195 \times 10^{5}$ for these Q and R matrices. The pre-compensation factor N is calculated as -19.884. The system outputs are given in Fig. 4. The settling time of the system is 1.26 s for position control with 2% tolerance and 2.06 s for θ control. The maximum error of the θ angle is 1.77°. As can be seen, the settling time is shortened using the improved objective function, but the error of the θ angle is increased. As a result, both (3) and (11) are successful with GWO, and the choice of objective function depends on the design requirements. The speed of the algorithm is another important parameter. The objective function value versus iteration number is plotted in Fig. 5. GWO reaches the best solution at the 35th iteration.
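The settling time with a tolerance band and the integral of absolute error used in this section can be computed from a sampled response as sketched below. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the trapezoidal rule is one possible quadrature, the additive form in `improved_objective` is an assumption about the structure of the improved objective function, and the first-order step response is made-up test data. The band is passed explicitly because a percentage tolerance is undefined for a zero target such as θ.

```python
import math


def settling_time(t, y, target, band):
    """Return the first sample time after which |y - target| stays within band.

    Returns None if the response is still outside the band at the last sample.
    """
    last_outside = -1
    for i, yi in enumerate(y):
        if abs(yi - target) > band:
            last_outside = i
    if last_outside == len(y) - 1:
        return None
    return t[last_outside + 1]


def integral_abs_error(t, y, target):
    """Approximate the integral of |y - target| with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(t)):
        e_prev = abs(y[i - 1] - target)
        e_curr = abs(y[i] - target)
        total += 0.5 * (e_prev + e_curr) * (t[i] - t[i - 1])
    return total


def improved_objective(J, st, iae, z1=1e8, z2=1e8):
    """Assumed additive form: performance index plus weighted ST and IAE."""
    return J + z1 * st + z2 * iae


# Hypothetical first-order step response toward a target of 1.0.
t = [i * 0.001 for i in range(10001)]
y = [1.0 - math.exp(-ti) for ti in t]
st = settling_time(t, y, target=1.0, band=0.02 * 1.0)   # 2% tolerance band
iae = integral_abs_error(t, y, target=1.0)
```

Because the weights are large, even small changes in settling time dominate the cost, which is consistent with the shorter settling time and the larger performance index reported for the improved objective.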

VI. CONCLUSION
In this study, an LQR based position controller is designed using the GWO algorithm. Determining the Q and R matrices that minimize the performance index is the main problem when designing an LQR controller. This is an optimization problem, and GWO is successfully used to obtain optimum Q and R matrices.
Using only the performance index J helps to design an optimum controller, but the result may not meet design requirements such as settling time or maximum overshoot. In this case, the objective function must be improved by including the system output measures that must meet the design requirements. The settling time and the integral of absolute error may be added to the objective function to obtain a shorter settling time.