Hybrid Intelligent System for Sale Forecasting Using Delphi and Adaptive Fuzzy Back-propagation Neural Networks

Abstract: Sales forecasting is one of the most crucial issues addressed in business. The control and evaluation of future sales still concern researchers, policy makers and company managers alike. This research proposes an intelligent hybrid sales forecasting system, Delphi-FCBPN, based on the Delphi method, fuzzy clustering and back-propagation (BP) neural networks with an adaptive learning rate. The proposed model is constructed to integrate expert judgments, obtained with the Delphi method, in order to enhance the FCBPN model. Winter's Exponential Smoothing method is used to take the trend effect into consideration. The data for this research come from an industrial company that manufactures packaging. Analysis of the results shows that the proposed model outperforms three other forecasting models on the MAPE and RMSE measures.


I. INTRODUCTION
Sales forecasting plays a very important role in business strategy. To strengthen the competitive advantage in a constantly changing environment, the manager of a company must make the right decision at the right time based on the information at hand. Obtaining effective sales forecasts in advance can help decision makers to calculate production and material costs, determine the sales price, plan strategic operations management, etc.
A hybrid intelligent system denotes a system that utilizes a parallel combination of methods and techniques from artificial intelligence. As hybrid intelligent systems can solve non-linear prediction problems, this article proposes integrating the hybrid system FCBPN within an ERP architecture to improve and extend the sales management module, so that it provides sales forecasts and meets the needs of the company's decision makers.
The remainder of the article is organized as follows: Section 2 is the literature review. Section 3 describes the construction and role of each component of the proposed sales forecasting system Delphi-FCBPN. Section 4 describes the sample selection and data analysis. Finally, Section 5 provides a summary and conclusions.

II. LITERATURE AND RELATED RESEARCH
Enterprise Resource Planning (ERP) is a standard for a complete enterprise management system. It emphasizes integration of the flow of information relating to the major functions of the firm [29]. There are four typical modules of ERP, and sales management is one of the most important. Sales management is highly relevant to today's business world; it directly impacts the working rate and quality of the enterprise and the quality of business management. Therefore, integrating a sales forecasting system with the sales management module has become an urgent project for many companies that implement ERP systems.
Many researchers conclude that BPN is an effective forecasting method and that it can also be used to find the key factors that help enterprises improve their logistics management level. Zhang, Wang and Chang (2009) [28] utilized back-propagation neural networks (BPN) to forecast safety stock. Zhang, Haifeng and Huang (2010) [29] used BPN for sales forecasting based on an ERP system. They found that BPN can be used as an accurate sales forecasting system.
Reliable prediction of sales has become a vital task in business decision making. Companies that use an accurate sales forecasting system earn important benefits. Sales forecasting is both necessary and difficult. It is necessary because it is the starting point of many tools for managing the business: production schedules, finance, marketing plans and budgeting, and promotion and advertising plans. It is difficult because, regardless of the quality of the methods adopted, predicting the future with certainty is out of reach. The parameters are numerous, complex and often unquantifiable.
Recently, combined intelligence techniques using artificial neural networks (ANNs), fuzzy logic, Particle Swarm Optimization (PSO) and genetic algorithms (GAs) have been demonstrated to be an innovative forecasting approach. Since most sales data are non-linear and complex, many studies apply hybrid models to time-series forecasting. Kuo and Chen (2004) [20] used a combination of neural networks and fuzzy systems to deal effectively with the marketing problem. The rate of convergence of traditional back-propagation networks is very slow because it depends on the choice of the learning rate parameter. However, experimental results (2009 [25]) showed that using an adaptive learning rate parameter during the training process can lead to much better results than the traditional neural network model (BPN).
Many papers indicate that systems hybridizing fuzzy logic and neural networks can perform more accurately than conventional statistical methods and a single ANN. Kuo and Xue (1999) [21] proposed a fuzzy neural network (FNN) as a model for sales forecasting. They utilized fuzzy logic to extract the expert's fuzzy knowledge. Toly Chen (2003) [27] used a model for wafer fab prediction based on a fuzzy back-propagation network (FBPN). That system was constructed to incorporate the judgments of production control experts in order to enhance the performance of an existing crisp back-propagation network. The results showed that the performance of the FBPN was better than that of the BPN. Efendigil, Önüt and Kahraman (2009) [16] utilized a forecasting system based on artificial neural networks (ANNs) and adaptive network-based fuzzy inference systems (ANFIS) to predict fuzzy demand with incomplete information.
Attariuas, Bouhorma and el Fallahi [30] proposed a hybrid sales forecasting system based on fuzzy clustering and back-propagation (BP) neural networks with adaptive learning rate (FCBPN). The experimental results show that the proposed model outperforms the previous and traditional approaches (BPN, FNN, WES, KGFS), which makes it a very promising solution for industrial forecasting. Chang and Wang (2006) [6] used a fuzzy back-propagation network (FBPN) for sales forecasting. The opinions of sales managers about the importance of each input were converted into prespecified fuzzy numbers to be integrated into the proposed system. They concluded that the FBPN approach outperforms other traditional methods such as Grey Forecasting, Multiple Regression Analysis and back-propagation networks. Chang, Liu and Wang (2006) [7] proposed a fusion of SOM, ANNs, GAs and FRBS for PCB sales forecasting. They found that the performance of the model was superior to that of the methods previously proposed for PCB sales forecasting.
Chang, Wang and Liu (2007) [10] developed a weighted evolving fuzzy neural network (WEFuNN) model for PCB sales forecasting. The proposed model was based on a combination of key sales factors selected using GRA. The experimental results showed that this hybrid system is better than previous hybrid models. Chang and Liu (2008) [4] developed a hybrid model based on a fusion of case-based reasoning (CBR) and fuzzy multicriteria decision making. The experimental results showed that the performance of the fuzzy case-based reasoning (FCBR) model is superior to that of traditional statistical models and BPN.
A Hybrid Intelligent Clustering Forecasting System was proposed by Kyong and Han (2001) [22]. It was based on change point detection and artificial neural networks. The basic concept of the proposed model is to obtain significant intervals by change point detection. They found that the proposed models are more accurate and converge better than the traditional neural network model (BPN).
Chang, Liu and Fan (2009) [5] combined K-means clustering and a fuzzy neural network (FNN) to estimate the future sales of PCB. They used K-means to divide the data into clusters, which were then fed into independent FNN models. The experimental results show that the proposed approach outperforms other traditional forecasting models such as BPN, ANFIS and FNN.
Chang, Wang and Tsai (2005) [3] used back-propagation neural networks (BPN) trained by a genetic algorithm (ENN) to estimate the production demand for printed circuit boards (PCB). The experimental results show that the performance of ENN is better than that of BPN.
Hadavandi, Shavandi and Ghanbari (2011) [18] proposed a novel sales forecasting approach that integrates genetic fuzzy systems (GFS) and data clustering to construct a sales forecasting expert system. They used GFS to extract the whole knowledge base of the fuzzy system for sales forecasting problems. Experimental results show that the proposed approach outperforms the previous approaches. This article proposes an intelligent hybrid sales forecasting system, Delphi-FCBPN, based on the Delphi method, fuzzy clustering and back-propagation (BP) neural networks with adaptive learning rate (FCBPN), for sales forecasting in the packaging industry.

III. DEVELOPMENT OF THE DELPHI-FCBPN MODEL
The proposed approach is composed of three stages, as shown in Figure 1: (1) Stage of data collection: the key factors that influence sales are collected using the Delphi method through experts' judgments; (2) Stage of data preprocessing: Rescaled Range Analysis (R/S) is used to evaluate the effects of trend, and Winter's Exponential Smoothing method is used to take the trend effect into consideration; (3) Stage of learning by FCBPN: a hybrid sales forecasting system based on Delphi, fuzzy clustering and back-propagation (BP) neural networks with adaptive learning rate (FCBPN) is used. The data for this study come from an industrial company that manufactures packaging in Tangier and cover the period 2001-2009. The monthly sales amount is the target of the forecasting model.

A. Stage of data collection
Data collection is based on the judgments of production control experts. Some sales managers and production control experts are requested to express their opinions about the importance of each input parameter in predicting sales, and then we apply the Delphi Method to select the key factors for forecasting packaging industry sales.

1) The collection of factors that influence sales
The packaging industry is an industry with a highly variable environment. The variables of packaging industry sales can be subdivided into three domains: (1) the market demand domain; (2) the macroeconomics domain; (3) the industrial production domain. To collect all possible factors from these three domains that can affect packaging industry sales, some sales managers and production control experts are requested to list all possible attributes affecting packaging industry sales.

2) Delphi Method to select the factors affecting sales
The Delphi Method was first developed by Dalkey and Helmer (1963) at the RAND Corporation and has been widely applied in many management areas, e.g. forecasting, public policy analysis and project planning. The principle of the Delphi method is to submit several rounds of questionnaires to a group of experts. After each round, an anonymous synthesis of the responses, together with the experts' arguments, is returned to them. The experts are then asked to revise their earlier answers in light of these elements. As a result of this process (which can be repeated several times if necessary), the differences usually fade and the responses converge towards the "best" answer. In this research, the Delphi Method was used to choose, from all the possible factors collected, the main factors that influence the sales quantity. The procedure of the Delphi Method is as follows:
1. Collect all possible factors that may affect the monthly sales from the ERP database and from the domain experts. This is the first questionnaire survey.
2. Conduct the second questionnaire and ask the domain experts to assign a fuzzy number ranging from 1 to 5 to each factor. The number represents the significance of the factor to the sales.
3. Finalize the significance number of each factor in the questionnaire according to the index generated in step 2.
4. Repeat steps 2 to 3 until the results of the questionnaire converge.
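The questionnaire loop above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `delphi_select`, the use of crisp scores in place of fuzzy numbers, the convergence tolerance `tol` and the selection `threshold` are all assumptions made for illustration.

```python
from statistics import mean

def delphi_select(rounds, tol=0.5, threshold=3.0):
    """Aggregate expert scores round by round until they stabilize.

    rounds: one dict per questionnaire round, mapping a factor name to the
            list of scores (1 to 5) given by the experts in that round.
    tol, threshold: hypothetical convergence tolerance and selection cut-off.
    """
    previous = None
    for number, scores in enumerate(rounds, start=1):
        # Significance index of each factor: mean of the expert scores.
        current = {factor: mean(values) for factor, values in scores.items()}
        if previous is not None:
            # Largest change of any index since the previous round.
            shift = max(abs(current[f] - previous[f]) for f in current)
            if shift <= tol:
                break  # the answers have converged
        previous = current
    selected = [f for f, s in current.items() if s >= threshold]
    return number, current, selected
```

Calling `delphi_select` with the scores of successive rounds returns the round at which the indices stabilized, the final indices and the factors whose index reaches the cut-off.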

B. Data preprocessing stage
Based on the Delphi method, the key factors that influence sales are (K1, K2, K3) (see Table 1). When seasonal and trend variation is present in the time series data, the accuracy of forecasting is affected. R/S analysis is used to detect whether the series data exhibit such effects. If the effects are observed, Winter's exponential smoothing is used to take the effects of seasonality and trend into consideration.

1) R/S analysis (rescaled range analysis)
To eliminate possible trend influence, the rescaled range analysis, invented by Hurst (Hurst, Black, & Simaika, 1965), is used to study records in time or a series of observations at different times. Hurst spent his lifetime studying the Nile and the problems related to water storage. The problem is to determine the design of an ideal reservoir on the basis of the given record of observed discharges from the lake. The R/S analysis is introduced as follows. Consider $X = \{x_1, x_2, \ldots, x_N\}$, where $x_i$ is the sales amount in period $i$, and compute the mean $M_N$, where
$$M_N = \frac{1}{N}\sum_{i=1}^{N} x_i .$$
The standard deviation $S$ is defined as
$$S = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - M_N)^2} .$$
For each point $i$ in the time series, we compute the cumulative deviation from the mean and its range $R$:
$$X_{i,N} = \sum_{u=1}^{i} (x_u - M_N), \qquad R = \max_{1 \le i \le N} X_{i,N} - \min_{1 \le i \le N} X_{i,N} .$$
We compute the $H$ coefficient as
$$H = \frac{\log(R/S)}{\log(aN)},$$
here $a = 1$. When $0 < H < 0.5$, the self-similar correlations at all timescales are anti-persistent, i.e. increases at any time are more likely to be followed by decreases over all later time scales. When $H = 0.5$, the series is uncorrelated. When $0.5 < H < 1$, the self-similar correlations at all timescales are persistent, i.e. increases at any time are more likely to be followed by increases over all later time scales.
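As a rough illustration of the R/S computation, the following Python sketch estimates H from a single window covering the whole series, with a = 1 as in the text. The function name `hurst_rs` is hypothetical, and the sketch assumes a non-constant series (otherwise S is zero).

```python
import math

def hurst_rs(series, a=1.0):
    """Single-window rescaled range estimate of the Hurst coefficient."""
    n = len(series)
    mean = sum(series) / n
    # Population standard deviation S of the series.
    s = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    # Cumulative deviations from the mean, one value per period.
    cumulative = []
    total = 0.0
    for x in series:
        total += x - mean
        cumulative.append(total)
    # Range R of the cumulative deviations.
    r = max(cumulative) - min(cumulative)
    return math.log(r / s) / math.log(a * n)
```

A steadily increasing series such as `list(range(1, 25))` yields a value between 0.5 and 1 (persistent), whereas a strictly alternating series yields a value below 0.5 (anti-persistent).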

2) Winter's Exponential Smoothing
In order to take the effects of seasonality and trend into consideration, Winter's Exponential Smoothing is used to preliminarily forecast the quantity of sales. For time series data, Winter's Exponential Smoothing is used to preprocess all the historical data and to predict the production demand, which is entered into the proposed hybrid model as input variable (K4) (see Table 1). Similar to previous research, we assume α = 0.1, β = 0.1 and γ = 0.9. The data generating process is assumed to be of the form
$$x_t = (a_0 + a_1 t)\,C_t + \varepsilon_t ,$$
and the level is updated as
$$a_0(t) = \alpha\,\frac{x_t}{C_{t-N}} + (1-\alpha)\,[a_0(t-1) + a_1(t-1)],$$
where
$C_t$ is the seasonal factor,
$a_0(t)$ is the exponentially smoothed level of the process at the end of period $t$,
$x_t$ is the actual monthly sales in period $t$,
$N$ is the number of periods in the season ($N = 12$ months),
$a_1(t-1)$ is the trend for period $t-1$,
$\alpha$ is the smoothing constant for $a_0$.

The seasonal factor $C_t$ is updated as follows:
$$C_t = \gamma\,\frac{x_t}{a_0(t)} + (1-\gamma)\,C_{t-N},$$
where $\gamma$ is the smoothing constant for $C_t$. The trend component is updated as
$$a_1(t) = \beta\,[a_0(t) - a_0(t-1)] + (1-\beta)\,a_1(t-1),$$
where $\beta$ is the smoothing constant for $a_1$. Winter's forecasting model is then constructed by
$$\hat{x}_{t+T} = [a_0(t) + T\,a_1(t)]\,C_{t+T-N},$$
where $\hat{x}_{t+T}$ is the estimate, made in time period $t$, of the sales $T$ periods ahead.
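The smoothing recursions can be sketched in Python as follows. This is a minimal multiplicative seasonal smoother, not the implementation used in the study. The initialization (level set to the mean of the first season, zero initial trend, seasonal factors taken as ratios to that mean) is an assumption, as is the function name `winters_forecast`. The series is assumed to be positive, to contain at least one full season, and `horizon` must not exceed `season`.

```python
def winters_forecast(x, season=12, alpha=0.1, beta=0.1, gamma=0.9, horizon=1):
    """Forecast `horizon` periods ahead with multiplicative seasonal smoothing."""
    # Assumed initialization from the first season of data.
    level = sum(x[:season]) / season
    trend = 0.0
    seasonal = [value / level for value in x[:season]]
    for t in range(season, len(x)):
        c_old = seasonal[t - season]  # seasonal factor one season earlier
        new_level = alpha * x[t] / c_old + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal.append(gamma * x[t] / new_level + (1 - gamma) * c_old)
        level = new_level
    last = len(x) - 1
    # Forecast for period last + T reuses the latest factor of the same month.
    return [(level + T * trend) * seasonal[last + T - season]
            for T in range(1, horizon + 1)]
```

For a series that repeats the same twelve monthly values with no trend, the forecasts simply reproduce the seasonal pattern, which is a convenient sanity check.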

C. FCBPN forecasting stage
FCBPN [30] (Fuzzy Clustering and Back-Propagation (BP) Neural Networks with adaptive learning rate) is used to forecast future packaging industry sales. As shown in Figure 2, FCBPN is composed of two steps: (1) using the Fuzzy C-Means clustering method (used in a clusters memberships fuzzy system (CMFS)), the cluster membership levels of each normalized data record are extracted; (2) each cluster is fed into parallel BP networks with a learning rate adapted to the level of cluster membership of the training data records.

1) Extract membership levels to each cluster (CMFS)
Using the Fuzzy C-Means clustering method (used in an adapted fuzzy system (CMFS)), the cluster centers of the normalized data records are found, and consequently the cluster membership levels of each normalized data record can be extracted.

1.1) Data normalization
The input values (K1, K2, K3, K4) are scaled to the interval [0.1, 0.9] to suit the neural networks. The normalization equation is as follows:
$$N_i = 0.1 + 0.8\,\frac{K_i - \min(K_i)}{\max(K_i) - \min(K_i)},$$
where $K_i$ represents a key variable, $N_i$ represents the normalized input (see Table 1), and $\max(K_i)$ and $\min(K_i)$ represent the maximum and minimum of the key variable, respectively.
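A short Python sketch of this rescaling is given below; the function name `normalize` is hypothetical, and the sketch assumes that the variable is not constant (otherwise the denominator is zero).

```python
def normalize(values, low=0.1, high=0.9):
    """Linearly rescale a key variable to the interval [low, high]."""
    smallest, largest = min(values), max(values)
    span = largest - smallest  # assumed non-zero
    return [low + (high - low) * (v - smallest) / span for v in values]
```

The smallest observation maps to 0.1, the largest to 0.9, and values in between are placed proportionally.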

1.2) Fuzzy c-means clustering
In hard clustering, data are divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy c-means (FCM) (developed by Dunn 1973 [14] and improved by Bezdek 1981 [1]), data elements can belong to more than one cluster, and associated with each element is a set of membership levels. It is based on minimization of the following objective function:
$$J_m = \sum_{i=1}^{N}\sum_{j=1}^{C} u_{ij}^{m}\,\|x_i - c_j\|^2, \qquad 1 \le m < \infty,$$
where $u_{ij}$ is the degree of membership of $x_i$ in cluster $j$, $x_i$ is the $i$-th measured data record, $c_j$ is the center of the $j$-th cluster, $N$ is the number of data records and $C$ is the number of clusters. The algorithm iterates four steps (Step 1 to Step 4) until the membership matrix converges.
1.3) The degree of membership levels ($MLC_k$)
In this stage, we use the sigmoid function (Figure 3) to improve the precision of the results and to accelerate the training process of the neural networks. The advanced fuzzy distance $AFD_k(X_i)$ between a data record $X_i$ and a cluster center $c_k$ is then introduced. The membership level $MLC_k(X_i)$ of a record $X_i$ in the $k$-th cluster is inversely related to the distance $AFD_k(X_i)$ from the data record $X_i$ to the cluster center $c_k$. The clusters memberships fuzzy system (CMFS) returns the membership levels of a data record $X$ in each cluster. Thus, we can construct a new training sample $(X_i, MLC_1(X_i), MLC_2(X_i), MLC_3(X_i), MLC_4(X_i))$ for the adaptive neural networks evaluating stage (Figure 2).

2) Adaptive neural networks evaluating stage
The concept of artificial neural networks (ANNs) originates from biological science (neurons in an organism). Its components are connected according to some pattern of connectivity, associated with different weights. The weight of a neural connection is updated by learning. ANNs are able to identify nonlinear patterns by learning from the data set. The back-propagation (BP) training algorithms are probably the most popular ones. The structure of a BP neural network consists of an input layer, a hidden layer and an output layer, containing $I$, $J$ and $L$ nodes, respectively. The numerical weights between the input and hidden layers are denoted $w_{ij}$, and those between the hidden and output layers $w_{jl}$, as shown in Figure 4.
In this stage, we propose an adaptive neural networks evaluating system which consists of four neural networks. Each cluster $k$ is associated with the $k$-th BP network. For each cluster, the training sample is fed into a parallel back-propagation network (BPN) with a learning rate adapted according to the cluster membership level ($MLC_k$) of each record of the training data set. The structure of the proposed system is shown in Figure 2. The adaptive neural networks learning algorithm is composed of two procedures: (a) a feed-forward step and (b) a back-propagation weight training step. These two procedures are explained in detail as follows:
Step 1 - All BP networks are initialized with the same random weights.
Step 2 - Feed-forward. For each $BPN_k$ (associated with the $k$-th cluster), each input factor in the input layer is denoted by $x_i$, and $y_j^k$ and $o_l^k$ represent the outputs in the hidden layer and the output layer, respectively. They can be expressed as follows:
$$y_j^k = f(X_j^k) = f\Big(w_{oj}^k + \sum_{i=1}^{I} w_{ij}^k\,x_i\Big), \qquad o_l^k = f(Y_l^k) = f\Big(w_{ol}^k + \sum_{j=1}^{J} w_{jl}^k\,y_j^k\Big),$$
where $w_{oj}^k$ and $w_{ol}^k$ are the bias weights for setting threshold values, $f$ is the activation function used in both hidden and output layers, and $X_j^k$ and $Y_l^k$ are the intermediate results before applying the activation function $f$. In this study, a sigmoid function (or logistic function) is selected as the activation function. Therefore, the actual outputs $y_j^k$ and $o_l^k$ in the hidden and output layers, respectively, can also be written as:
$$y_j^k = \frac{1}{1 + e^{-X_j^k}}, \qquad o_l^k = \frac{1}{1 + e^{-Y_l^k}} .$$
The activation function $f$ introduces the nonlinear effect to the network and maps the result of the computation to the domain (0, 1).
The global output $o_l$ of the adaptive neural networks is calculated from the outputs $o_l^k$ of the individual networks, such that the effect of the output $o_l^k$ on the global output $o_l$ is both strongly and positively related to the membership level ($MLC_k$) of the data record $X_i$ in the $k$-th cluster.
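The feed-forward pass and the combination of the per-cluster outputs can be sketched in Python as below. The combination rule shown, a membership-weighted average of the network outputs, is an assumption that is merely consistent with the description above; it is not taken from the paper. The function names and the single output node are likewise illustrative choices.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bpn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of one BP network with a single output node."""
    hidden = [sigmoid(b + sum(w * xi for w, xi in zip(weights, x)))
              for weights, b in zip(w_hidden, b_hidden)]
    return sigmoid(b_out + sum(w * h for w, h in zip(w_out, hidden)))

def global_output(outputs, memberships):
    """ASSUMED rule: average of the outputs weighted by MLC_k."""
    total = sum(memberships)
    return sum(m * o for m, o in zip(memberships, outputs)) / total
```

Under this assumed rule, a network whose cluster has a membership level close to zero for a given record contributes almost nothing to the global output for that record.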

Comparisons of FCBPN model with other previous models
Experimental comparison of the outputs of DELPHI-FCBPN with those of other methods shows that the proposed model outperforms the previous approaches (Tables 4-6). We apply two performance measures, the mean absolute percentage error (MAPE) and the root mean square error (RMSE), to compare the FCBPN model with the previous methods:
$$\mathrm{MAPE} = \frac{100}{N}\sum_{t=1}^{N}\frac{|Y_t - P_t|}{Y_t}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}(Y_t - P_t)^2},$$
where $P_t$ is the expected (forecasted) value for period $t$, $Y_t$ is the actual value for period $t$ and $N$ is the number of periods.
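Both measures are straightforward to compute; the Python sketch below uses hypothetical function names and assumes that no actual value is zero.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs(y - p) / abs(y) for y, p in zip(actual, forecast))

def rmse(actual, forecast):
    """Root mean square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, forecast)) / n)
```

For example, for the single month that survives in the results fragment of this text (2009/1, actual 5408, forecast 5131), `mape([5408], [5131])` gives an absolute percentage error of about 5.1%.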
As shown in Table 7, the forecasting accuracy of Delphi-FCBPN is superior to that of the other traditional approaches on both the MAPE and RMSE evaluations.

V. CONCLUSION
Recently, more and more researchers and industrial practitioners have become interested in applying fuzzy theory and neural networks in their routine problem solving. This research combines the Delphi method, fuzzy clustering and back-propagation neural networks with adaptive learning rate (FCBPN) into a hybrid system, which is applied to sales forecasting in the packaging industry.
We applied DELPHI-FCBPN to sales forecasting in a case study of a packaging manufacturer in Tangier. The experimental results of the proposed approach in Section 4 demonstrated that DELPHI-FCBPN is more effective than the previous and traditional approaches (WES, BPN and FNN) on the MAPE and RMSE evaluations.

(3) Stage of learning by FCBPN: We use a hybrid sales forecasting system based on Delphi, fuzzy clustering and back-propagation (BP) neural networks with adaptive learning rate (FCBPN).

Figure 1: Architecture of the Delphi-FCBPN model

Variable   Description
K1         Manufacturing consumer index
N1         Normalized manufacturing consumer index
K2         Offers competitive index
N2         Normalized offers competitive index
K3         Packaging total production value index
N3         Normalized packaging total production value index
K4         Preprocessed historical data (WES)
N4         Normalized preprocessed historical data (WES)
Y0         Actual historical monthly packaging sales
Y          Normalized actual historical monthly packaging sales
Table 1: Description of input forecasting model.

Step 1: Initialize randomly the degrees of membership matrix $U^{(0)} = [u_{ij}]$.
Step 2: Calculate the centroid of each cluster, $C^{(k)} = [c_j]$, with $U^{(k)}$:
$$c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m}\,x_i}{\sum_{i=1}^{N} u_{ij}^{m}} .$$
Step 3: For each point, update its coefficients of being in the clusters ($U^{(k)}$ to $U^{(k+1)}$):
$$u_{ij} = \frac{1}{\sum_{l=1}^{C}\left(\dfrac{\|x_i - c_j\|}{\|x_i - c_l\|}\right)^{2/(m-1)}} .$$
Step 4: If $\|U^{(k+1)} - U^{(k)}\| < \varepsilon$ then STOP; otherwise return to Step 2.
This procedure converges to a local minimum or a saddle point of $J_m$. According to Bezdek [1], an appropriate parameter combination of the two factors ($m$ and $\varepsilon$) is $m = 2$ and $\varepsilon = 0.5$. Using fuzzy c-means, Table 2 shows that the use of four clusters is the best among all the cluster numbers tested.
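Steps 1 to 4 can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions rather than the code used in the study: the stopping test uses the largest absolute change of any membership degree as the matrix norm, the default tolerance `eps` is smaller than the value quoted above, and the function name `fuzzy_c_means` and its parameters are hypothetical.

```python
import random

def fuzzy_c_means(points, c, m=2.0, eps=1e-4, max_iter=200, seed=0):
    """Return (centers, memberships) for `c` fuzzy clusters."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step 1: random membership matrix with rows summing to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    for _ in range(max_iter):
        # Step 2: centroids as means weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / w_sum
                            for d in range(dim)])
        # Step 3: update memberships from the distances to the centroids.
        new_u = []
        for p in points:
            dist = [max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                    for ctr in centers]
            new_u.append([1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c))
                          for j in range(c)])
        # Step 4: stop when no membership degree changes by eps or more.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < eps:
            break
    return centers, u
```

Each row of the returned membership matrix sums to 1, so a record that lies between two cluster centers receives intermediate membership levels in both clusters instead of a hard assignment.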

Figure 4: The structure of back-propagation neural network

Step 3 - Back-propagation weight training. The error function $E$ is defined from the errors in the output nodes, where $t_k$ is a predefined network output (or desired output or target value) and $e_k$ is the error in each output node. The goal is to minimize $E$ so that the weight of each link is adjusted accordingly and the final output matches the desired output. The learning speed can be improved by introducing a momentum term; the momentum constant usually falls in the range [0, 1]. For iteration $n$ and for $BPN_k$ (associated with the $k$-th cluster), the adaptive learning rate in $BPN_k$ and the variation of the weights $\Delta w^k$ depend on the membership level $MLC_k$. We can conclude that the variation of the $BPN_k$ network weights ($w_{oj}^k$ and $w_{ol}^k$) is larger when the membership level ($MLC_k$) of the data record $X_j$ in the $k$-th cluster is high. If the membership level ($MLC_k$) of the data record $X_j$ in the $k$-th cluster is close to zero, the changes in the $BPN_k$ network weights are minimal.
The configuration of the proposed BPN is established as follows:
- Number of neurons in the input layer: I = 4
- Number of neurons in the output layer: L = [value missing in the source]
- Single hidden layer
- Number of neurons in the hidden layer: J = 2
- Network-learning rule: delta rule
- Transformation function: sigmoid function
- Learning rate: 0. [remaining digits missing in the source]
- Momentum constant: 0.02
- Learning times: 20000

IV. EXPERIMENT RESULTS AND ANALYSIS

1) Constructing DELPHI-FCBPN System
The test data used in this study were collected from a sales forecasting case study of a packaging manufacturer in Tangier. The training samples were collected from January 2001 to December 2008, while the testing samples were from January 2009 to December 2009. The proposed DELPHI-FCBPN system was applied to this case to forecast the sales. The results are presented in Table 3.

Table 3: Forecasted results by the DELPHI-FCBPN method.

Figure 7: The MAPE of WES

Month    Actual values    Forecasted values
2009/1   5408             5131

Table 2: Comparison of total distance of different clustering algorithms in.