Outlier-Tolerance RML Identification of Parameters in CAR Model

Measured data inevitably contain abnormal data even under normal operating conditions. Most existing algorithms, such as least squares identification and maximum likelihood estimation, are easily affected by abnormal data and then exhibit large identification deviations. How to reduce the sensitivity of an existing algorithm to abnormal data, or to build a new parameter identification algorithm with outlier tolerance, is a difficult problem that needs to be addressed in applications of system identification. In this paper, the sensitivity of recursive maximum likelihood (RML) identification to sampled abnormal data is analyzed, and an improved algorithm for the controlled autoregressive (CAR) process is established to improve the outlier tolerance of RML identification when there are outliers in the sampling series. The improved algorithm not only inhibits the negative impact of abnormal data but also improves the quality of the parameter identification results. Simulations given in this paper show that the improved RML algorithm has strong outlier tolerance. The results are relevant to engineering control, signal processing, industrial automation, aerospace and other fields.

Keywords: recursive maximum likelihood identification; parameter identification; outliers; outlier-tolerance identification


I. INTRODUCTION
The most widely used model for describing the relationship between input and output is the difference equation model, which is used in many different fields such as discrete-time control systems [1] and computer-controlled systems [2], [26]. Generally, there are errors in the measurement data sequence $\{y(t_k)\}$ under normal operating conditions. This paper adopts the controlled autoregressive (CAR) process model to describe a discrete-time linear time-invariant control system.
There is a considerable literature on how to identify the parameters of the CAR model. According to the basic principle of system identification, the methods can be divided into the least squares method, the maximum likelihood method, the moment estimation method and the gradient correction method. According to the implementation, they can be divided into batch processing algorithms and sequential algorithms. According to real-time performance, they can be divided into offline and online identification. According to the calculation domain, they can be divided into time-domain and frequency-domain methods. For example, Wang et al. (2012, 2011) studied the recursive maximum likelihood (RML) identification method for the controlled autoregressive moving average (CARMA) model and the CAR model [3]-[4]. Blind maximum likelihood (BML) filter identification for the single-input single-output moving average case was studied in [5]; BML uses the expectation maximization method to calculate the maximum likelihood estimates of the parameters. Wang et al. (2008, 2012) proposed an augmented stochastic gradient identification algorithm for the Hammerstein-Wiener system and a hierarchical least squares algorithm [6]-[7]. Wills et al. (2013) studied the Hammerstein-Wiener model identification problem and put forward a new maximum likelihood (ML) identification method [8]. Gibson et al. (2005) studied ML estimators for the multivariable bilinear model and put forward a new ML estimation method based on the expectation maximization algorithm [9].
The ML identification method was put forward by the British statistician Fisher. It is based on parameter estimation in probability theory and mathematical statistics and can be used to seek the ML values of parameters. The maximum likelihood method has an intuitive and reasonable statistical interpretation and good properties. In addition, it has been deeply studied and widely used in many fields, such as statistical inference and process identification [10]-[11].
However, from the perspective of practical application [12]-[16], the ML method has some limitations and weaknesses that cannot be ignored. For example, the ML identification algorithm lacks tolerance to abnormal data [17], [27]-[28]. A survey of research at home and abroad on the stability of the ML identification algorithm [18]-[22] and on its immunity to abnormal data shows that few results have been reported [23]-[25], and work on fault-tolerant algorithms is also rare. This paper focuses on improving the ML identification method so that the improved algorithm is fault-tolerant. The RML identification algorithm for the CAR model parameters is selected as the object [15], and some improvements are suggested to protect the RML algorithm against interference from pulse-type sensor faults in the control system. In addition, the applicability of the algorithm and the quality of the new identification results are analyzed in detail for measurement data that contain outliers. Sampled data inevitably contain outliers in actual production. The proposed method is actively tolerant to outliers and improves the processing speed and the model precision, which is of great significance in engineering applications.
Supported by National Natural Science Foundation of China (No.6147322, No.61074077). www.ijarai.thesai.org
In order to overcome the bad impact of outliers on the RML algorithm, an impact analysis is given in Section II and a new outlier-tolerant RML identification algorithm is set up in Section III. In Section IV, simulation computation and result analysis are presented, which show that the new algorithm is tolerant to outliers. Finally, some conclusions are given in Section V.

II. IMPACT ANALYSIS OF OUTLIERS ON RML IDENTIFICATION
For a discrete-time linear time-invariant control system with sensor measurement error, the difference equation that describes the relationship between the input and the output of the system can be expressed as
$$A(z)\,y_k = B(z)\,u_k + v_k \tag{1}$$
Where $z$ is the sampling-step shift operator, $u_k$ is the input of the system, $y_k$ is the sensor measurement data, $v_k$ is the measurement noise, and $A(z)$ and $B(z)$ are polynomials of order $n$ and $m$ in the operator $z$. By using the measurement noise derived from (1) for the data obtained during operation of the control system, the likelihood function can be constructed as in (2) and (3). The maximum with respect to the parameter vector $\theta$ can be obtained from (2) or (3). The maximizing argument is called the ML estimator of the parameter vector $\theta$.
In order to solve (2) or (3) for the ML estimator of the parameter vector $\theta$, the partial derivative equation can be deduced and expressed as (4). When the measurement noise is an uncorrelated Gaussian random sequence, (3) and (4) can be expressed as (5). The ML estimate of the parameter vector is denoted $\hat{\theta}_{ML}$ in (6). The RML algorithm is taken from [11] and is given in (7).
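A minimal Python sketch of one recursive update is given here for orientation. It assumes the standard gain and covariance recursion $\hat{\theta}_k = \hat{\theta}_{k-1} + K_k (y_k - \varphi_k^T \hat{\theta}_{k-1})$, which for a CAR model with uncorrelated Gaussian noise has the same form as recursive least squares; it is not claimed to reproduce (7) term by term. The names `rml_step`, `identify`, `theta`, `P`, `phi` and `p0` are illustrative and do not come from the paper.

```python
def rml_step(theta, P, phi, y):
    """One recursive update for y = phi^T theta + v (assumed standard form).

    theta: current estimate (list), P: symmetric covariance-like matrix
    (list of lists), phi: regressor (list), y: new measurement.
    Returns (theta_new, P_new, K, e).
    """
    n = len(theta)
    # One-step prediction residual
    e = y - sum(phi[i] * theta[i] for i in range(n))
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    K = [P_phi[i] / denom for i in range(n)]  # gain vector
    theta_new = [theta[i] + K[i] * e for i in range(n)]
    # P_new = P - K (P phi)^T, valid because P is symmetric
    P_new = [[P[i][j] - K[i] * P_phi[j] for j in range(n)] for i in range(n)]
    return theta_new, P_new, K, e


def identify(regressors, outputs, p0=1e4):
    """Run the recursion over all samples, starting from theta = 0, P = p0 * I."""
    n = len(regressors[0])
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for phi, y in zip(regressors, outputs):
        theta, P, _, _ = rml_step(theta, P, phi, y)
    return theta


# Toy usage with made-up, noise-free data generated by theta = [2, -3].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
Y = [2.0 * x1 - 3.0 * x2 for x1, x2 in X]
print(identify(X, Y))
```

Because the residual `e` enters the update linearly, a single very large measurement error is passed to the estimate with no attenuation, which is the sensitivity analyzed in the next subsection.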

B. Impact Analysis of Outliers on RML identification
It is obvious that if the past …
Proposition 1. For a discrete-time controlled autoregressive process, if the sensor measurement at $t_k$ is abnormal, its adverse effect on the RML estimates of the model parameters starts at $t_k$ and lasts a long time.
In order to describe the continuous impact of abnormal sensor data on RML identification more intuitively, this section uses the 2-order CAR process in (11), where the model coefficients are $\{a_1, a_2, b_1, b_2\}$. The simulated "measured" data sequence with two abnormal data is shown in Fig. 1, where the x-coordinate $k$ is time and the y-coordinate $z$ is the "measured" data. The residual sequences with and without abnormal data are shown in Fig. 2, where the x-coordinate $k$ is time and the y-coordinate $e_z$ is the error of the "measured" data. Fig. 3 shows the four components $[k_1, k_2, k_3, k_4]$ of the gain vector $K$ with abnormal data, where the x-coordinate $k$ is time and the y-coordinates are $k_1$, $k_2$, $k_3$, $k_4$. The curves of the RML identification coefficients $\{a_1, a_2, b_1, b_2\}$ with abnormal data are shown in Section IV.
In order to prevent the negative influence of abnormal data, the literature [12] has successfully proposed a bounded constraint. This method has achieved a good effect for parameter identification of the linear regression model.
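The kind of experiment described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: the coefficients in `TRUE_THETA`, the noise level, the outlier positions and the pulse amplitudes are hypothetical and are not the values behind (11) or Fig. 1, and the recursion is the standard recursive form rather than a transcription of (7).

```python
import random


def simulate_car(n_samples, theta, sigma, outliers, seed=0):
    """Simulate y_k = -a1*y_{k-1} - a2*y_{k-2} + b1*u_{k-1} + b2*u_{k-2} + v_k.

    `outliers` maps a sample index to a pulse added to the measured output only.
    Returns (u, y_measured).
    """
    a1, a2, b1, b2 = theta
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    y = [0.0, 0.0]
    for k in range(2, n_samples):
        v = rng.gauss(0.0, sigma)
        y.append(-a1 * y[k - 1] - a2 * y[k - 2] + b1 * u[k - 1] + b2 * u[k - 2] + v)
    y_meas = [y[k] + outliers.get(k, 0.0) for k in range(n_samples)]
    return u, y_meas


def rml_track(u, y, p0=1e4):
    """Plain recursion; history[k] is the estimate after processing sample k."""
    n = 4
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    history = [list(theta), list(theta)]
    for k in range(2, len(y)):
        phi = [-y[k - 1], -y[k - 2], u[k - 1], u[k - 2]]
        e = y[k] - sum(phi[i] * theta[i] for i in range(n))
        P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
        K = [P_phi[i] / denom for i in range(n)]
        theta = [theta[i] + K[i] * e for i in range(n)]
        P = [[P[i][j] - K[i] * P_phi[j] for j in range(n)] for i in range(n)]
        history.append(list(theta))
    return history


def max_error(theta_hat, theta_true):
    """Largest absolute coefficient error."""
    return max(abs(a - b) for a, b in zip(theta_hat, theta_true))


TRUE_THETA = [-1.5, 0.7, 1.0, 0.5]  # hypothetical a1, a2, b1, b2
u, y = simulate_car(600, TRUE_THETA, sigma=0.1, outliers={300: 50.0, 450: -50.0})
history = rml_track(u, y)
err_before = max_error(history[299], TRUE_THETA)
err_after = max(max_error(history[k], TRUE_THETA) for k in range(300, 310))
print(f"max error before first outlier: {err_before:.4f}")
print(f"max error shortly after it:     {err_after:.4f}")
```

The outlier enters the recursion twice: first through the residual at the sample where it occurs, and then through the regressor for the following two samples, which is why the disturbance of the estimate is not confined to a single step.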
The RML estimation algorithm (7) was revised as (13). The identification algorithm (13) not only makes full use of the normal information in the measured data $y_k$, but also effectively restrains the adverse impact of abnormal data at $t_k$, improving the quality and accuracy of the identification results.
It is worth pointing out that abnormal data appearing before the current moment affect not only the one-step prediction residual, but also the subsequent calculation of the gain vector $K_k$. In addition, the influence is likely to continue for a period of time. Therefore, in order to guarantee the tolerance of the recursive identification algorithm for CAR model parameters to abnormal data, it is necessary to revise the gain vector calculation formula as well. Based on the above analysis, if the measurement data of a discrete-time controlled autoregressive process contain abnormal data, the following recursion method can be used instead of the RML identification algorithm.
By applying the function shown in (12) to the recursion for the CAR model parameters in (16), and calibrating outliers at the sample points online according to (15), the tolerance of the recursive identification algorithm can be effectively improved. In this paper, the modified recursive identification algorithm composed of (15) and (16) is called outlier-tolerant RML identification.
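The idea of bounding the residual and calibrating flagged samples online can be sketched in Python as below. This is a hedged illustration, not the paper's algorithm: a Huber-type clipping function stands in for (12), replacing a flagged measurement by its prediction plus the clipped residual stands in for the calibration step (15), and the threshold `c`, the `warmup` length and the simulated coefficients are all hypothetical choices.

```python
import random


def clip(e, c):
    """Bounded (Huber-type) function: identity inside [-c, c], saturated outside."""
    return max(-c, min(c, e))


def tolerant_rml_track(u, y, c, warmup=50, p0=1e4):
    """Recursion with bounded residual and online calibration of flagged samples.

    Returns (history, y_cal): per-sample estimates and the calibrated outputs.
    With c = infinity nothing is flagged and the ordinary recursion results.
    """
    n = 4
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    y_cal = list(y)  # calibrated copy, used to build later regressors
    history = [list(theta), list(theta)]
    for k in range(2, len(y)):
        phi = [-y_cal[k - 1], -y_cal[k - 2], u[k - 1], u[k - 2]]
        y_pred = sum(phi[i] * theta[i] for i in range(n))
        e = y[k] - y_pred
        if k >= warmup and abs(e) > c:
            e = clip(e, c)          # bound the influence of the residual
            y_cal[k] = y_pred + e   # keep the outlier out of later regressors
        P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
        K = [P_phi[i] / denom for i in range(n)]
        theta = [theta[i] + K[i] * e for i in range(n)]
        P = [[P[i][j] - K[i] * P_phi[j] for j in range(n)] for i in range(n)]
        history.append(list(theta))
    return history, y_cal


def worst_error(history, start, theta_true):
    """Largest absolute coefficient error from sample `start` onward."""
    return max(max(abs(h - t) for h, t in zip(th, theta_true)) for th in history[start:])


# Demo data: hypothetical 2-order CAR coefficients and two pulse outliers.
TRUE_THETA = [-1.5, 0.7, 1.0, 0.5]
rng = random.Random(0)
N = 600
u = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [0.0, 0.0]
for k in range(2, N):
    y.append(1.5 * y[k - 1] - 0.7 * y[k - 2] + 1.0 * u[k - 1] + 0.5 * u[k - 2]
             + rng.gauss(0.0, 0.1))
y[300] += 50.0  # pulse-type sensor faults, added to the measurement only
y[450] -= 50.0

plain, _ = tolerant_rml_track(u, y, c=float("inf"))  # ordinary recursion
robust, y_cal = tolerant_rml_track(u, y, c=1.0)      # assumed threshold c = 1.0
print("plain  worst error:", worst_error(plain, 300, TRUE_THETA))
print("robust worst error:", worst_error(robust, 300, TRUE_THETA))
```

In this sketch the gain vector is protected indirectly: it depends on the data only through the regressor, so keeping the calibrated value `y_cal[k]` out of reach of the outlier is what prevents the gain from being disturbed at the following samples. The warm-up period is needed because the residuals are large for every sample before the estimate has settled.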

IV. SIMULATION COMPUTATION AND RESULT ANALYSIS
The simulation object is the 2-order CAR model shown in (11), with Monte Carlo simulation data that include the two abnormal data shown in Fig. 1. The data are processed with the RML algorithm (7) and with the outlier-tolerant RML identification algorithm (16), respectively. The result is shown in Fig. 4, where the dotted lines are the coefficient curves obtained with the RML algorithm when the data contain outliers, and the solid lines are the coefficient curves obtained with the outlier-tolerant RML algorithm.
In Fig. 4, where the x-coordinate $k$ is time and the y-coordinates $a_1$, $a_2$, $b_1$, $b_2$ are the parameters, the RML curves clearly show two step changes where the data are abnormal. That is to say, the recursive maximum likelihood estimation algorithm is obviously unstable in the presence of outliers. Therefore, if RML is used directly in engineering practice, there will be a large deviation, which may even lead to system crashes and endanger the safe operation of the system. By comparing the curves, it can be clearly seen that the modified algorithm estimates the coefficients more accurately when the data contain abnormal points, and that its estimates change more smoothly. It also effectively overcomes the adverse impact of abnormal data.
and comparing results between RML and fault-tolerant RML parameter identification. The results are shown in Table 1 and Table 2.
Comparing Table 1 and Table 2, it can be seen that the fault-tolerant RML parameter estimates are close to the ordinary RML estimates when a data segment does not contain abnormal points (Segment I), and that they are significantly better than the ordinary RML estimates when the data segment contains abnormal points (Segment II). After the abnormal data segment, that is, in Segment III, the fault-tolerant RML parameter estimates also remain superior to the ordinary RML estimates for a period of time. It can be concluded that, for the CAR model in this paper, the outlier-tolerant RML algorithm is more reliable than the ordinary RML algorithm.
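A segment-wise comparison of this kind can be computed with a few lines of Python. The sketch below covers only the two indexes that the paper names explicitly, the maximum absolute error $M_a$ and the mean absolute error $M_{ae}$; the function name `segment_indexes`, the segment boundaries and the sample estimates are made up for illustration and are not the values of Table 1 or Table 2.

```python
def segment_indexes(estimates, true_value, start, stop):
    """Return (M_a, M_ae) of estimates[start:stop] against true_value."""
    errors = [abs(est - true_value) for est in estimates[start:stop]]
    if not errors:
        raise ValueError("empty segment")
    m_a = max(errors)                 # maximum absolute error
    m_ae = sum(errors) / len(errors)  # mean absolute error
    return m_a, m_ae


# Made-up estimates of one coefficient whose assumed true value is -1.5.
a1_hat = [-1.48, -1.52, -1.10, -1.35, -1.49, -1.50]
segments = {"I": (0, 2), "II": (2, 4), "III": (4, 6)}  # hypothetical boundaries
for name, (start, stop) in segments.items():
    m_a, m_ae = segment_indexes(a1_hat, -1.5, start, stop)
    print(f"Segment {name}: M_a = {m_a:.3f}, M_ae = {m_ae:.3f}")
```

Evaluating the indexes separately before, during and after the abnormal data makes it possible to see whether an algorithm merely survives the outlier or also recovers quickly afterwards.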

V. CONCLUSION
In this paper, a new outlier-tolerant RML algorithm for the CAR model is built. Simulation results show that the outlier-tolerant RML algorithm is reliable and strongly tolerant to outliers in sampled time series, and that it avoids algorithm collapse even if there are outliers in the measurement data set.
Outlier-tolerance ideas have important reference value in many fields, such as complex system automation, signal processing and statistical data processing. Outlier-tolerance ideas and technologies have scientific significance and engineering merit for improving the reliability and accuracy of data processing algorithms. They are receiving more and more attention in computer control of dynamic systems, process automation, high performance computing, aerospace engineering and many other areas.

Fig. 4. Parameter estimate curves using the RML algorithm and the outlier-tolerant RML algorithm, respectively.
In order to clearly express the accuracy and reliability of the fault-tolerant algorithm, three statistical indexes of the parameter identification results were established in this section. They are the maximum absolute error $M_a$, the mean absolute error $M_{ae}$ and the …
It is clear from the above analysis that the abnormal data not only affect the residual error $\hat{v}(k)$ at $t_k$, but …