A Novel Assessment to Achieve Maximum Efficiency in Optimizing Software Failures: An SRGM with Exponential Log-Normal Distribution

Software reliability is a specialized area of software engineering that deals with the identification of failures during software development. Effective reliability analysis indicates the number of failures that occurred during the development phase, which in turn aids in resolving those failures. This paper presents a novel assessment to detect and eliminate actual software failures efficiently. The approach fits an exponential log-normal distribution of the Generalized Gamma Mixture Model (GGMM) and estimates two parameters using the Maximum Likelihood Estimate (MLE). Standard evaluation metrics, namely Mean Square Error (MSE), Coefficient of Determination (R²), Sum of Squares Error (SSE) and Root Mean Square Error (RMSE), were calculated. Experiments on five benchmark datasets indicate that the proposed technique identifies the actual failures on par with existing models. The proposed software reliability growth model is effective in identifying failures and can help software organizations release software free from bugs just in time.

Keywords: Software reliability; failure rate; reviews; software cost; optimization


I. INTRODUCTION
Software reliability deals with analyzing the failures observed while software is being designed. This methodology evaluates the reliability of software on the basis of a developed model that takes the generated failures into account and forms a basis for assessing reliability. Such methods help to characterize the software by identifying the Mean Time To Failure (MTTF), the Mean Absolute Error (MAE) and the Mean Square Error (MSE). However, no serious attempts have been made to address software failures during the initial development phases, which would support a complete analysis of the system together with a procedure for minimizing failures so that failure-free software can be released just in time. As the number of failures increases, the literature has formulated various strategies and models with the sole objective of identifying software failures and developing strategies to resolve them; in the software industry these are termed review procedures, and their core intention is to minimize software failures. If the number of failures increases, the number of reviews needed to reduce them increases substantially, making it difficult to release the software just in time.
This indirectly increases the overhead cost of the software. Moreover, in traditional approaches to estimating reliability, developers concentrate only on the failures generated. There is no serious attempt to analyze whether a reported failure is a true fault or a failure caused by internal errors such as a network fault, a data transmission failure or other failures at an internal source; because of such internal failures, the end output may be reported as a failure. Traditional systems evaluate the efficiency of the developed software while neglecting this basic distinction between a true failure and an accidental failure. Also, the traditional approaches followed by software teams in estimating failures depend entirely on the knowledge base available in the literature, i.e., the failure rate is based on a supervised learning approach in which the assessment relies mainly on the knowledge source. However, when new software is to be designed, no such knowledge repository is available, and identifying failures together with a clear-cut distinction between true failures and actual failures is a potentially challenging task.
The present article attempts to fill these gaps by meeting the two objectives listed above, viz., discriminating true failures from actual failures, and identifying failures in software for which no such history is available. This article also proposes an approach by which the failure rate can be minimized so that the true failures are reflected. The approach is based on a mathematical model derived from the Exponential Logarithmic Normal Distribution (ELND). This article is structured as follows: Section 2, Background Study, highlights the research carried out in the area of software reliability. Section 3 gives the gist of the ELND approach and its necessity. The datasets considered are presented in Section 4. The methodology is illustrated in Section 5. Section 6 deals with the performance metrics considered in order to analyze the efficiency of the developed model. In Section 7, the results are summarized and discussed. Section 8 concludes the work presented in the preceding sections.

II. BACKGROUND STUDY
To drive developed software towards perfection, every software company tries to adopt software reliability life cycle policies with the objective of developing reliable software. In general practice, after the software is developed and is assumed to be ready for implementation, a testing phase, generally called the review, is conducted. In these reviews, the probability of failure can be determined. If this failure probability is high, steps must be taken to bring down the failure rate substantially before releasing the software to the market. Many models in the literature address this issue with objectives such as developing user-friendly software and developing software that is fully functional, enhances capability and ensures maintainability. With these objectives, software development should be carried out to produce failure-free software that satisfies the user's requirements. Of late, many models in the literature fulfil the objectives of the user requirements. Some of the predominant early models in this area are from [1], which proposed the initial study of software reliability and was followed by a good number of papers that benefit researchers working in this field. The Markov birth-death process was used to model the failure probability, and methodologies were suggested to identify the failure rate. The faults in this regard were identified using the binomial distribution, and the Weibull distribution was considered for identifying the mean value function. Research in this area was taken further by [5], [6] and [7]. In [5], errors, if any exist, are fixed and the failure intensity is proportional to the number of remaining failures. A pictorial view of the failure rates in [6] gave the insight that the failure rate may decay during different time intervals. A Bayesian approach is followed by [7], which gives a derivation for estimating the effect of failures on the software cost. Every failure rate can be projected as a two-class discrete-time model, where the first class represents the error detection process and the second class is used for estimating future errors. In these works, the authors assumed that the failure rate follows a geometric progression [14].
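The two failure-rate assumptions mentioned above (an intensity proportional to the number of remaining failures, and a rate that forms a geometric progression) can be illustrated with a minimal sketch. The function names, parameter names and numeric values below are hypothetical and chosen only for illustration; they are not taken from the cited works.

```python
def proportional_intensity(phi, total_faults, removed):
    """Failure intensity proportional to the number of faults still remaining.

    phi          -- assumed per-fault contribution to the intensity
    total_faults -- assumed number of faults present at the start
    removed      -- number of faults already fixed
    """
    remaining = total_faults - removed
    if remaining < 0:
        raise ValueError("cannot remove more faults than were present")
    return phi * remaining


def geometric_intensity(initial_rate, ratio, removed):
    """Failure rate that shrinks by a constant ratio after every fix."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return initial_rate * ratio ** removed


if __name__ == "__main__":
    # Illustrative values only: compare how both rates fall as faults are fixed.
    for fixed in range(4):
        print(fixed,
              proportional_intensity(0.5, 10, fixed),
              geometric_intensity(2.0, 0.5, fixed))
```

Under the first assumption the rate falls by the same amount after every fix, whereas under the second it falls by the same factor, so early fixes reduce the rate the most.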
The second level of research in this direction was initiated by [16] and [17]. In these works, the estimation of failure rates was based on measures of dispersion and limited to the central limit theorem. The authors also formulated models based on the hypergeometric distribution to find the optimal number of failures in a developed software product.
A new direction for estimating reliability was proposed by [23], where the authors developed a model based on the Gompertz distribution; this methodology has proven to be a well-validated method for the estimation of failures. Research was also extended beyond Non-Homogeneous Poisson Processes to other distributions, such as the family of Pareto distributions, by [24], [25] and [26], where the authors formulated new ideas for estimating the failure rates and identifying the mean time to failure. More recent studies are mostly based on the Weibull, generalized Laplacian, Rayleigh and Gaussian distributions. These models are also confined to the study of reliability based on error rates.
However, in spite of rigorous research in this area, most earlier works are confined to the study of the impact of the failure rate, and some articles tried to project the time between failures. No serious attempt is seen in the literature to minimize the error rate or to discriminate the true error from the actual error. This article is framed to fulfil this objective.

III. EXPONENTIAL LOGARITHMIC NORMAL DISTRIBUTION
To estimate failures, it is necessary to understand the pattern of the failures. Analysis of this pattern helps to distinguish the true failures from the possible non-failures, which must be identified exactly. For this purpose, many models have been presented in the literature [2]-[4], [8], [11]-[15], [18]-[22]. However, these models fail to analyze the true failures. Since the initial data in a failure data model are generally assumed to follow an exponential distribution, this article considers the Exponential Logarithmic Normal Distribution. The Probability Density Function (PDF) for fitting the ELND is given by (1), and equals 0 otherwise.

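where x represents a failure.
The values of p and q are estimated by the method of least squares, using the normal equations

$$\sum_{i=1}^{n} y_i = np + q\sum_{i=1}^{n} t_i \quad\text{and}\quad \sum_{i=1}^{n} t_i y_i = p\sum_{i=1}^{n} t_i + q\sum_{i=1}^{n} t_i^{2} \qquad (2)$$

where y_i denotes the value observed at time t_i and n is the number of observations.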
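The least-squares step for p and q can be sketched as follows. This is a minimal illustration that assumes a straight-line relation y ≈ p + q·t between the time index t and the observed value y. The text does not state which quantity is regressed, so the names `fit_least_squares`, `t` and `y`, and the sample numbers, are hypothetical.

```python
def fit_least_squares(t, y):
    """Solve the two normal equations of a straight-line fit y = p + q * t.

    Returns the pair (p, q). The linear form is an assumption made for
    illustration, not a statement of the paper's exact model.
    """
    n = len(t)
    if n != len(y) or n < 2:
        raise ValueError("t and y must have the same length, at least 2")

    sum_t = sum(t)
    sum_y = sum(y)
    sum_tt = sum(ti * ti for ti in t)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))

    # Determinant of the 2x2 normal-equation system; zero if all t are equal.
    denom = n * sum_tt - sum_t ** 2
    if denom == 0:
        raise ValueError("all t values are identical; q is undefined")

    q = (n * sum_ty - sum_t * sum_y) / denom
    p = (sum_y - q * sum_t) / n
    return p, q


if __name__ == "__main__":
    # Made-up data lying exactly on y = 1 + 2t.
    p, q = fit_least_squares([1, 2, 3, 4], [3, 5, 7, 9])
    print(p, q)
```

Solving the normal equations in closed form keeps the sketch free of external dependencies; a library routine for linear regression would serve equally well.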

IV. DATASETS
To present the proposed methodology, we considered two datasets, namely [9] and [10]. The first dataset, Tandem, consists of failure data from four releases, Release 1 to Release 4, each containing the failures generated. The second dataset considered for the experimentation, Brooks & Motley, contains a failure data set. The datasets considered for the presentation of the proposed model are given below.

V. METHODOLOGY
The data for the experimentation are presented in the above section. For each dataset, the initial estimates of the parameters p and q of the proposed Exponential Logarithmic Normal Distribution are obtained using the method of Least Squares Estimation, and the values so obtained are presented below. Using these estimates, the analysis of the proposed model is carried out.
The first dataset, Tandem, with its four releases (Release 1 to Release 4), and the second failure dataset, Brooks & Motley, are presented in Tables I and II above.
For each dataset, the analysis is carried out in phases: in the first phase, the true failures are estimated, and the experimentation proceeds to minimize the failure rate, as given in Table III.
For each release, the number of actual defects reported is considered; using these defects, the actual failures are predicted and presented below.

VII. RESULTS AND DISCUSSIONS
The results of the performance evaluation metrics are shown in Table IX.
From Table IX, it can be seen that the MSE is low for Release 1 and that R² is close to 1, which signifies that the model performs well.
The SSE and RMSE metrics also show significant values. This indicates that the proposed methodology performs well in predicting the failures. The experimentation carried out across the two datasets, Tandem and Brooks & Motley, is represented below, with figures for each individual failure dataset. Fig. 1, Fig. 4, Fig. 7, Fig. 10 and Fig. 13 depict the actual failures of the various datasets. It can be seen that the values that lie above the curve were reported as failures but are not failures in the original. The present model is novel in identifying the true failures. Fig. 2, Fig. 5, Fig. 8, Fig. 11 and Fig. 14 depict the predicted failures of the various datasets; the same set of failures at the respective times was predicted. Fig. 3, Fig. 6, Fig. 9, Fig. 12 and Fig. 15 depict the residuals evaluated for the various datasets. Together, these results support the methodology and the novelty of the concept. The failures identified are displayed for the datasets considered, and the residuals were calculated for every observation.

VIII. CONCLUSION
This article presents a novel approach for minimizing failures that also helps the software developer to understand the actual failures that arise in a project because of technical flaws. It also highlights the predicted failures, which are not failures but are reported as failures because of technical issues or human error. The work presented on two benchmark datasets helps to understand the potential of the model. The results also indicate the significance of the model; implemented in a software firm, it can help not only to minimize review times but also to release the software just in time, while improving profitability.

Fig. 1. Actual Failures vs No. of Defects for the TANDEM Release 1.

Fig. 2. Predicted Defects vs No. of Defects for the TANDEM Release 1.

In Table I, TW represents the Test Weeks, EH the Execution Hours and ND the No. of Defects. In Table II, TW represents the Test Weeks, EH the Execution Hours and AD the No. of Defects.

In Tables IV to VIII, TW represents the Test Weeks, ND the No. of Defects, PD the Predicted Defects and RES the Residual, and Fault indicates whether the failure is a true failure or not.

TABLE III
In this process, the residuals are obtained by subtracting the actual reported errors from the predicted errors, and the results for the two datasets, Tandem and Brooks & Motley, are tabulated in Table IV to Table VIII. The Fault column in every table gives the outcome of the proposed model on the dataset; it shows how well the proposed model identified the true failures and in turn reduced the failure rate compared with the original dataset.
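The construction of the residual tables can be sketched as follows, using the column labels TW, ND, PD, RES and Fault. The residual is taken as the predicted count minus the actual count, as described in the text. The rule used to fill the Fault column is not stated, so the rule in this sketch is an assumption: a week is marked as a true failure when the reported count does not exceed the prediction. The function name and sample values are hypothetical.

```python
def residual_table(weeks, actual, predicted):
    """Build rows with TW, ND, PD, RES and Fault for one dataset.

    RES   = PD - ND (predicted minus actual reported defects).
    Fault = True when ND <= PD. This threshold is an assumed rule,
            used here only to illustrate how such a column could be filled.
    """
    if not (len(weeks) == len(actual) == len(predicted)):
        raise ValueError("all input sequences must have the same length")

    rows = []
    for tw, nd, pd_ in zip(weeks, actual, predicted):
        rows.append({
            "TW": tw,
            "ND": nd,
            "PD": pd_,
            "RES": pd_ - nd,
            "Fault": nd <= pd_,
        })
    return rows


if __name__ == "__main__":
    # Made-up numbers for two test weeks.
    for row in residual_table([1, 2], [5, 9], [6.0, 7.5]):
        print(row)
```

With this rule, observations lying above the fitted curve have a negative residual and are not counted as true failures.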

TABLE V
To evaluate the outputs of the proposed model, we considered the following metrics: Mean Squared Error (MSE), R², Sum of Squares Error (SSE) and Root Mean Squared Error (RMSE), each computed from the actual and predicted defect counts.
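The four metrics can be computed as in the following minimal sketch, which uses their conventional definitions: SSE is the sum of squared differences between actual and predicted values, MSE is SSE divided by the number of observations n, RMSE is the square root of MSE, and R² is one minus the ratio of SSE to the total sum of squares about the mean. Dividing by n rather than by the degrees of freedom is an assumption, and the function name and sample data are hypothetical.

```python
import math


def evaluation_metrics(actual, predicted):
    """Return SSE, MSE, RMSE and R2 for paired actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    # Sum of squared errors between observations and predictions.
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mse = sse / n  # assumption: divide by n, not by degrees of freedom
    rmse = math.sqrt(mse)

    # Total sum of squares about the mean of the actual values.
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - sse / sst if sst > 0 else float("nan")

    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Made-up defect counts for four test weeks.
    print(evaluation_metrics([2, 4, 6, 8], [2, 5, 6, 7]))
```

An R² close to 1 together with a small MSE indicates that the predicted defect counts track the actual counts closely.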

TABLE IX .
ACTUAL FAILURES FOR TANDEM DATASET RELEASE-4