Expectation-Maximization Algorithms for Obtaining Estimations of Generalized Failure Intensity Parameters

This paper presents several iterative methods based on the Stochastic Expectation-Maximization (EM) methodology for estimating parametric reliability models from random lifetime data. The methodology is related to Maximum Likelihood Estimation (MLE) in the case of missing data. A bathtub-shaped failure intensity for the reliability of a repairable system is presented, and its parameters are estimated through the EM algorithm. Field failure data from an industrial site are used to fit the model. Finally, large-sample interval estimation from the literature is discussed, and the actual coverage probabilities of these confidence intervals are examined using Monte Carlo simulation.

Keywords: Repairable systems reliability; bathtub failure intensity; EM algorithm; estimation; likelihood; Monte Carlo simulation


I. INTRODUCTION
There is an ongoing effort in industry to make systems more reliable and more efficient. The major issues are mainly safety, availability and costs, especially those related to maintenance and lifetime. For industrial companies, these issues translate into requirements of competitiveness and safety, which push those responsible for maintenance management to improve reliability objectives. Most maintenance approaches are based on reliability, such as Peña et al. (2007), Y. Dijoux (2009) and L. Doyen (2012). However, the reliability of industrial systems depends closely on the efficiency of these maintenance actions and on the effective management of the maintenance policy, which requires a realistic modeling of their effects. On the other hand, when a maintenance program is chosen, its efficiency and its impact on system operation are unknown. Hence the idea, in this paper, of modeling the system lifespan and quantifying its degradation state or its failures in order to assess the impact of a maintenance action on system behavior.
The most important characteristic is the evaluation of the system's failure intensity and the detection of its degradation at the appropriate time. In order to optimize maintenance programs by reducing costs, we use Maintenance Optimization by Reliability (MOR), as presented in Dewan and Dijoux (2015).
First, stochastic models of the failure and repair processes of various systems are built. Second, statistical methods are implemented to exploit the failure and maintenance data collected by experts in order to evaluate the performance of these systems. In this context, different methods such as MLE, moment estimation and the EM algorithm are presented in Doyen (2012). Sethuraman and Hollander (2009) developed a non-parametric Bayes estimator for a general imperfect repair model that includes the Brown-Proschan model. Doyen (2011) generalized this approach and considered maximum likelihood estimation. The performance of the Brown-Proschan model when repair effects are unknown is studied in Krit and Rebai (2012) and Krit (2014). Babykina and Couallier (2012) used the EM algorithm to estimate the parameters of a generalization of this model which allows first-order dependency between two consecutive repair effects; they assumed that only some repair effects were unknown. Lim and Lie (2000) proposed another method based on Bayesian analysis: they assumed a prior beta distribution for the parameter p. Langseth and Lindqvist (2003) generalized the Brown-Proschan model to imperfect preventive maintenance and proposed to estimate the parameters of the model with the likelihood function. Franco et al. (2011) studied the classification of the aging properties of generalized mixtures of two or three Weibull distributions in terms of the mixing weights, scale parameters and a common shape parameter, which extends the case of exponential distributions. Earlier, some work was done on the estimation of the three-parameter log-normal distribution based on complete and censored samples. Basak et al. (2009) developed inferential methods based on progressively censored samples from a three-parameter log-normal distribution. In particular, they used the EM algorithm as well as some other numerical methods to determine the MLE of the parameters. The asymptotic variances and covariances of the MLE from the EM algorithm are computed using the missing information principle.
The purpose of this paper is to formulate a general and realistic model in order to describe how the behavior of a repairable system evolves over its whole lifetime. The paper is organized as follows. Section 2 presents the characteristics of the failure process. Section 3 analyzes the various models mixing corrective and preventive maintenance. An application to real data, assessing the quality of the estimators of the model parameters, is given in Section 4. Finally, conclusions are presented in Section 5.

II. CHARACTERISTICS OF THE FAILURE PROCESS
In this section, a bathtub-shaped failure intensity is presented in order to describe the behavior of the intensity over the three phases of the system life, following Krit and Rebai (2013). Two forms are distinguished, which differ by a small change over the service lifetime. In the first form, the failure process is modeled by the superposition of three Poisson processes, the first and the third being non-homogeneous and the second homogeneous. The intensity is chosen as follows: it decreases up to an instant denoted $\tau_0$ according to a Weibull-type function, then remains constant (the degradation of the system does not progress during this phase) up to an instant $\tau_1$, beyond which it increases according to a second Weibull-type function, reflecting degradation. This idea was originally proposed by Mudholkar and Srivastava (1993) in the context of non-repairable systems. This degradation model comprises two terms, one of which is found in the Weibull process; it is obtained under the assumption of perfect corrective maintenance and was stated in Bertholon et al. (2004) as an alternative to the Weibull law. The waiting time until the next failure can be written as the minimum of three random variables:
• a random variable, independent of the other two, following a Weibull law of the first form, with a shift parameter equal to zero;
• a random variable, independent of the other two, following an exponential law with parameter $\eta_0$, which corresponds to a constant failure intensity equal to $1/\eta_0$;
• a random variable following a Weibull law with a shift parameter equal to $\tau_1$.
Our proposal, based on modeling the system behavior, characterizes the failure process by its intensity $\lambda(t)$. Once this intensity is known, the system reliability can be written explicitly. The number of failures up to instant $t$, denoted $N_t$, follows a Poisson law with parameter $\Lambda(t) = \int_0^t \lambda(s)\,ds$. For the present model, note first that the inter-failure times are not independent. In this case, the function $\mathcal{F}_{T_{n+1}/T_1 = t_1, \dots, T_n = t_n}$ gives the conditional law of the next failure instant $T_{n+1}$.
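As an illustration, the quantities of this section can be evaluated numerically. The sketch below is not the paper's formula: it assumes an intensity made of a constant accidental term, a Weibull-type term active up to `tau0` and a shifted Weibull-type term active after `tau1`. The function names and the parameter values are hypothetical. It computes the intensity, the cumulative intensity (the Poisson mean of the number of failures up to t) and the standard non-homogeneous Poisson result P(T_{n+1} > t / T_n = t_n) = exp(-(Λ(t) - Λ(t_n))).

```python
import math

# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(eta0=40.0, eta1=20.0, beta1=0.5,
              eta2=15.0, beta2=2.5, tau0=30.0, tau1=100.0)


def intensity(t, eta0, eta1, beta1, eta2, beta2, tau0, tau1):
    """Assumed bathtub intensity lambda(t) for t > 0 (not the paper's exact formula)."""
    lam = 1.0 / eta0                                   # constant accidental term
    if 0.0 < t <= tau0:                                # early-life term (decreasing if beta1 < 1)
        lam += (beta1 / eta1) * (t / eta1) ** (beta1 - 1.0)
    if t > tau1:                                       # wear-out term (increasing if beta2 > 1)
        lam += (beta2 / eta2) * ((t - tau1) / eta2) ** (beta2 - 1.0)
    return lam


def cumulative_intensity(t, eta0, eta1, beta1, eta2, beta2, tau0, tau1):
    """Closed-form integral of the assumed intensity over [0, t]."""
    accidental = t / eta0
    youth = (min(t, tau0) / eta1) ** beta1
    wear = (max(t - tau1, 0.0) / eta2) ** beta2
    return accidental + youth + wear


def survival_after(t_n, t, **params):
    """P(T_{n+1} > t / T_n = t_n) for a non-homogeneous Poisson process."""
    gap = cumulative_intensity(t, **params) - cumulative_intensity(t_n, **params)
    return math.exp(-gap)


if __name__ == "__main__":
    for t in (10.0, 60.0, 120.0):
        print(t, intensity(t, **PARAMS), cumulative_intensity(t, **PARAMS))
    print(survival_after(113.0, 121.0, **PARAMS))
```

Using the closed form for the cumulative intensity avoids integrating numerically through the singularity of the early-life term at t = 0 when the shape parameter is below 1.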

A. Basic Theory of the EM Algorithm
The EM algorithm, proposed by Dempster et al. (1977), is widely used to solve the likelihood equation in situations with incomplete data. A suitable formulation is needed to facilitate the application of the EM algorithm in our context. We first present the algorithm in its general form.
We denote by $Y$ the random vector corresponding to the observed data $y$. The probability distribution of $Y$ is $g(y; \theta)$, where $\theta$ is the parameter vector of the statistical model. Moreover, $X_c$ denotes the vector of complete data, with distribution function $g_c(x_c; \theta)$, and $Z$ denotes the vector of missing data, so that $X_c = (Y, Z)$. McLachlan and Krishnan (1997) presented an extension of the EM algorithm in which the observed data vector is regarded as a function of the complete data $x_c$, which gives the relation

$$g(y; \theta) = \int_{\mathcal{X}(y)} g_c(x_c; \theta)\, dx_c \tag{5}$$

Given two spaces $\mathcal{X}$ and $\mathcal{Y}$, we observe the incomplete data vector $y = y(x_c)$ in $\mathcal{Y}$ instead of the complete data vector $x_c$ in $\mathcal{X}$. Moreover, several values of $x_c$ may correspond to the same $y$. We denote:
• $\mathcal{L}(\theta; y)$ and $\ell(\theta; y)$, respectively the likelihood and the log-likelihood of the observed data;
• $\mathcal{L}_c(\theta; x_c)$ and $\ell_c(\theta; x_c)$, respectively the likelihood and the log-likelihood of the complete data;
• $Q(\theta/\theta^{(h)}) = E\left[\ell_c(\theta; X_c)/y, \theta^{(h)}\right]$, where $\theta^{(h)}$ is the current estimate of the parameter.
In the EM algorithm, the objective is not to maximize $\ell(\theta; y)$ directly in order to obtain the MLE. Instead, we repeatedly maximize $\ell_c(\theta; x_c)$ averaged over all possible values of the missing data $z$. In other words, it is the objective function $Q(\theta/\theta^{(h)})$ that is maximized repeatedly.
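To make the two steps concrete, here is a minimal sketch on a deliberately simple problem that is not the paper's model: exponential lifetimes with mean `theta`, some of which are right-censored. The missing data are the unobserved lifetimes of the censored units. By the memoryless property, the expected lifetime of a unit censored at c is c + theta, and since the complete-data log-likelihood is linear in the lifetimes, the expected log-likelihood is maximized by the mean of the completed lifetimes. The function name and the data are hypothetical.

```python
def em_censored_exponential(times, censored, theta=1.0, tol=1e-10, max_iter=10_000):
    """EM estimate of an exponential mean from right-censored data.

    times    : observed durations (failure time or censoring time)
    censored : True where the duration is a censoring time
    Requires at least one uncensored observation for convergence.
    """
    n = len(times)
    for _ in range(max_iter):
        # E-step: replace each censored lifetime by its conditional expectation
        # E[T / T > c] = c + theta under the current estimate.
        completed_total = sum(t + theta if c else t for t, c in zip(times, censored))
        # M-step: the complete-data MLE of the mean is the sample mean.
        new_theta = completed_total / n
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


if __name__ == "__main__":
    durations = [2.0, 3.0, 5.0, 4.0]            # hypothetical data
    is_censored = [False, False, True, False]
    print(em_censored_exponential(durations, is_censored))
```

For this toy model the fixed point of the iteration is the closed-form MLE, the total observed time divided by the number of uncensored failures, which gives a simple way to check the loop.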

B. Use of the EM Algorithm
$n$ failure instants are observed on a repairable system, denoted $t = (t_1, \dots, t_n)$. In this example, some observations may be right-censored. The parameter vector of the model is $\theta = (\eta_0, \eta_1, \eta_2, \beta_1, \beta_2)$; $\tau_0$ and $\tau_1$ are fixed at two arbitrary values satisfying $\tau_0 < \tau_1$.
In order to simplify the formulas, a shorthand notation is used for the failure intensity of the Homogeneous Poisson Process (HPP). In this case, the missing data are, for each observation $i$, three indicators specifying whether the failure is accidental or due to degradation, the latter being attributed either to the youth period or to the marked degradation period; an indicator equals 1 when the corresponding cause applies.
The observed data $t_i$ are then realizations of the minimum of four random variables: the variable $\mathcal{H}_i$ associated with the HPP, the two variables associated with the two Weibull forms, and the censoring instant of the $i$-th observation.
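With independent competing causes, the expectation of the cause indicators for an observed (uncensored) failure at time t is the conditional probability of each cause, which is proportional to that cause's hazard at t. The sketch below illustrates this with a constant hazard, an early-life Weibull hazard and a shifted Weibull hazard. These hazard forms, the function names and the parameter values are assumptions for illustration, not the paper's expressions.

```python
# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(eta0=40.0, eta1=20.0, beta1=0.5,
              eta2=15.0, beta2=2.5, tau0=30.0, tau1=100.0)


def cause_hazards(t, eta0, eta1, beta1, eta2, beta2, tau0, tau1):
    """Assumed hazards (accidental, youth, wear-out) at time t > 0."""
    accidental = 1.0 / eta0
    youth = (beta1 / eta1) * (t / eta1) ** (beta1 - 1.0) if 0.0 < t <= tau0 else 0.0
    wear = (beta2 / eta2) * ((t - tau1) / eta2) ** (beta2 - 1.0) if t > tau1 else 0.0
    return accidental, youth, wear


def cause_probabilities(t, **params):
    """P(cause = k / failure at t): each hazard divided by the total hazard."""
    hazards = cause_hazards(t, **params)
    total = sum(hazards)
    return tuple(h / total for h in hazards)


if __name__ == "__main__":
    for t in (10.0, 60.0, 120.0):
        print(t, cause_probabilities(t, **PARAMS))
```

In an EM iteration these probabilities would be recomputed from the current parameter estimate and then used as weights in the maximization step; a censored observation carries no failure cause and would be handled separately.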
Formally, iteration $h + 1$ of the EM algorithm requires the following calculations:

IV. NUMERICAL EXAMPLES
The data analysis is based on a real example concerning a repairable system (a hydraulic pump) from the French nuclear sector, which was used in Bertholon et al. (2004). Six successive failures of this pump were observed, at 18, 30, 82, 113, 121 and 126 months. Estimating the model parameters with the EM algorithm gives the following results:
• $\tau_0$ and $\tau_1$, the instants at which improvement ends and degradation begins, are estimated at 26.685 and 101.412, respectively.
• The inverse of the accidental failure rate, $\eta_0$, is estimated by $\hat{\eta}_0 = 43.855$.
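As a quick check of how the two estimated instants sit relative to the observed failures, the following sketch counts the failures in each phase and converts the instants to years. It uses only the failure times and the estimates quoted above; the variable names and phase labels are hypothetical.

```python
from bisect import bisect_right

failures_months = [18, 30, 82, 113, 121, 126]   # observed failure times (months)
end_improvement = 26.685                        # estimated end of improvement (months)
start_degradation = 101.412                     # estimated start of degradation (months)

# Number of failures at or before each estimated instant.
n_before_end = bisect_right(failures_months, end_improvement)
n_before_start = bisect_right(failures_months, start_degradation)

phase_counts = {
    "improvement": n_before_end,
    "stable": n_before_start - n_before_end,
    "degradation": len(failures_months) - n_before_start,
}
end_improvement_years = end_improvement / 12
start_degradation_years = start_degradation / 12

print(phase_counts)   # {'improvement': 1, 'stable': 2, 'degradation': 3}
print(round(end_improvement_years, 2), round(start_degradation_years, 2))   # 2.22 8.45
```

The improvement phase therefore ends shortly before the second failure, and degradation begins between the third and the fourth failures.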
In order to obtain sound numerical results, a Monte Carlo simulation is used to compare the estimation of our model by the MLE and EM algorithms. Two different cases are considered:
• The first case comprises 100 simulations of samples of size 50 from our model with parameters $\eta_0 = 1$, $\eta_1 = 1$, $\beta_1 = 0.5$, $\eta_2 = 1$, $\beta_2 = 2$, $\tau_0 = 30$, $\tau_1 = 100$.
• The second case comprises 100 simulations of samples of size 50 from our model with the same parameters, except for $\beta_2 = 3$.
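The design of such a study (100 replications of samples of size 50, each producing an estimate) can be sketched as follows. To stay self-contained, the sketch replaces the bathtub model by a plain exponential model with mean `theta`, whose MLE is the sample mean; this substitution, the function name and the seed are assumptions made for illustration. It reports the mean of the estimates, an empirical 95% range, and the actual coverage of the large-sample 95% interval.

```python
import math
import random
import statistics


def monte_carlo_exponential(theta=1.0, n=50, n_rep=100, seed=0):
    """Replicate the MLE of an exponential mean and check large-sample interval coverage."""
    rng = random.Random(seed)
    estimates, covered = [], 0
    for _ in range(n_rep):
        # expovariate takes the rate, i.e. the inverse of the mean.
        sample = [rng.expovariate(1.0 / theta) for _ in range(n)]
        theta_hat = statistics.fmean(sample)            # MLE of the mean
        half_width = 1.96 * theta_hat / math.sqrt(n)    # asymptotic standard error is theta / sqrt(n)
        if theta_hat - half_width <= theta <= theta_hat + half_width:
            covered += 1
        estimates.append(theta_hat)
    cuts = statistics.quantiles(estimates, n=40)        # cut points every 2.5%
    return {
        "mean": statistics.fmean(estimates),
        "interval_95": (cuts[0], cuts[-1]),             # empirical 2.5% and 97.5% points
        "coverage": covered / n_rep,
    }


if __name__ == "__main__":
    print(monte_carlo_exponential())
```

With only 100 replications the coverage estimate itself has a standard error of a few percentage points, so small departures from the nominal 95% should not be over-interpreted.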
The results are reported as a mean and a 95% confidence interval. The next table presents the estimation results:

V. DISCUSSION
Ultimately, according to the results of the preceding tests, the failure process is an NHPP. The empirical cumulative distribution function of the real data evolves in the same direction as the simulated one. This process is therefore governed by our reliability model. Consequently, the estimates show that the system improves until the second failure (during 2.2 years of operation) and that degradation starts from the fourth failure (beyond 8.5 years of operation). With the same unit of data over the improvement and degradation periods, the scale parameters $\eta_1$ and $\eta_2$ over these two periods do not differ significantly. This can easily be checked by means of a classical test of the difference of means. The estimated value of $\beta_2$ is higher than 2, so the failure intensity is increasing and convex, which indicates increasing marginal progress of the degradation state. At the same time, the estimated value of $\beta_1$ is very close to 1, which indicates that the intensity is practically constant. As a result, the failures are rather accidental and cannot be attributed to early-life defects. The model stripped of the improvement period, which is presented in Bertholon et al. (2004), therefore remains able on its own to describe the behavior of the hydraulic pump. In light of the simulations, we make the following observations:
• The estimators $\hat{\eta}_i$ ($i$ = 0, 1 or 2) behave best, except in the first case, where $\hat{\eta}_0$ appears to degrade.
Finally, the EM procedures offer better estimators in the second case. The values of $\beta_2$ are higher than 2, so the curve is convex over the degradation period, as in our model. A potential limitation of our model is that it involves seven parameters. It is indeed difficult to estimate these parameters for small and/or censored samples. For this very reason, the MLE appear more reliable for industrial applications.