A Novel Robust Stacked Broad Learning System for Noisy Data Regression

—The robust broad learning system (RBLS) demonstrates good generalization and robustness for solving uncertain data regression tasks. To enhance the representation ability of RBLS, this paper develops a novel robust stacked broad learning system, termed RSBLS, for solving noisy data regression problems. In our work, we expand the traditional BLS into a stacked broad learning system with a deep structure of feature nodes and enhancement nodes. Furthermore, the ℓ1 norm loss function is employed in the objective function of RSBLS to process noisy data, and the augmented Lagrange multiplier (ALM) method is applied to obtain the output weights of RSBLS, which preserves effectiveness and efficiency compared with weighted loss functions. Simulation results on several regression datasets with outliers demonstrate that the proposed RSBLS achieves better robustness than RVFL, BLS, Huber-WBLS, KDE-WBLS and RBLS.


I. INTRODUCTION
Recently, inspired by random vector functional link neural networks (RVFL) [1,2], Chen et al. proposed a novel randomized neural network architecture, the broad learning system (BLS) [3][4][5]. Compared with classical deep learning models, BLS adopts a broad structure, which has the advantages of higher efficiency and fewer parameters. Due to its excellent learning ability, BLS has attracted wide attention from scholars since it was proposed, and has developed rapidly in both theoretical and applied research. To improve the interpretability of BLS, a novel broad neuro-fuzzy model, the fuzzy broad learning system (FBLS), was presented, which reduces the number of fuzzy rules and improves learning accuracy [6]. Subsequently, Guo et al. used FBLS to synthesize multiview HDR images [7], and Ali proposed a novel optic disk and cup segmentation method based on FBLS for glaucoma screening [8]. Furthermore, the compact FBLS (CFBLS), which has better interpretability and fewer fuzzy rules, was designed to balance accuracy and complexity [9]. To improve the representation ability of BLS, a type-2 fuzzy BLS was given in [10]. For sequential data, a recurrent broad learning system and a structured manifold broad learning system (SM-BLS) were presented, respectively [11,12]. To process data with few labels, a semi-supervised broad learning system (SS-BLS) was built by introducing manifold regularization into BLS [13], and other SS-BLS algorithms have been used in semi-supervised classification tasks [14,15]. Moreover, BLS and its variants have been used extensively in various engineering fields, such as traffic forecasting [16], image classification [17], EEG signal classification [18][19][20], and sentiment analysis [21,22].
In practical engineering applications, sensors are susceptible to equipment failure, human interference, the working environment and other factors, so the collected data contain different degrees of noise and outliers, which reduces the generalization of the learning model. To solve the uncertain data regression problem effectively, Chu et al. proposed the weighted broad learning system (WBLS) framework for tackling industrial noisy data [23]. Then, Zheng et al. designed a broad learning system based on the maximum correntropy criterion (BLS-MCC), which uses the criterion to calculate the weights of training samples [24]. In addition, Liu et al. adopted the Cauchy loss function to process noisy data [25]. Meanwhile, the ℓ1 norm cost function and ℓ2 regularization method were used in the robust broad learning system (RBLS) [26], and the elastic-net regularization approach later replaced the ℓ2 regularization in RBLS [27]. Moreover, the robust manifold broad learning system (RM-BLS) was used to predict large-scale noisy chaotic time series [28]. For online sequential learning, Guo et al. presented the online robust echo state broad learning system (OR-ESBLS) [29]. However, although the above models improve the robustness of BLS, these shallow models still lack feature representation capability. Nowadays, multi-layer deep neural networks have powerful representation capability, and BLS has also been expanded with multiple layers [30][31][32]. Therefore, to improve the noisy data processing performance of RBLS, we propose a novel robust stacked broad learning system (RSBLS) for solving outlier-contaminated data regression, which adopts a deep structure of feature nodes and enhancement nodes by stacking, while the ℓ1 norm loss function and ℓ2 regularization method ensure learning accuracy and efficiency.
In brief, the highlights of RSBLS are listed as follows:
 A novel robust stacked broad learning system structure is demonstrated; we present the model architecture and the algorithm description of RSBLS in detail.
 The ℓ1 norm loss function and ℓ2 regularization method are adopted to enhance the robustness of RSBLS.
 Experiments on the benchmark datasets with different noise ratios present the superiority of RSBLS.
The other sections of our manuscript are organized as follows: Section II introduces the basic algorithm description of BLS. Section III presents the architecture and optimization method of the proposed RSBLS. Section IV demonstrates the uncertain data regression results on several datasets with different percentages of outliers. Finally, Section V concludes the paper.

II. RELATED WORKS
BLS is a novel random neural network model with effective and efficient performance, proposed by Chen et al.; its architecture is shown in Fig. 1 [3][4][5]. The modeling process of BLS is given as follows [3][4][5]. Given a training set {(X, Y) | X ∈ R^{N×M}, Y ∈ R^{N×C}}, where X and Y express the features and labels, N indicates the number of training samples.

A. Feature Nodes Generation
The feature nodes are generated through Eq. (1), and the n groups of mapping nodes are combined according to Eq. (2):

Z_p = φ(X W_ep + β_ep), p = 1, 2, …, n    (1)

Z^n = [Z_1, Z_2, …, Z_n]    (2)

where φ(·) denotes the feature mapping function, the weights W_ep and biases β_ep are generated randomly, and each feature mapping produces L_e nodes. In particular, the authors of BLS use a sparse autoencoder to fine-tune the randomly initialized parameters for obtaining better features [3][4][5].
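As an illustration, the random feature mapping of Eqs. (1)-(2) can be sketched in NumPy (the paper's experiments use MATLAB; tanh is an assumed stand-in for φ, and the sparse-autoencoder fine-tuning step is omitted):

```python
import numpy as np

def feature_nodes(X, n=5, L_e=10, seed=0):
    """Eqs. (1)-(2): n random mappings of L_e feature nodes each,
    concatenated into Z^n. tanh stands in for the mapping phi."""
    rng = np.random.default_rng(seed)
    groups = []
    for _ in range(n):
        W_e = rng.uniform(-1, 1, (X.shape[1], L_e))   # random weights W_ep
        beta_e = rng.uniform(-1, 1, (1, L_e))         # random biases beta_ep
        groups.append(np.tanh(X @ W_e + beta_e))      # Z_p = phi(X W_ep + beta_ep)
    return np.hstack(groups)                          # Z^n = [Z_1, ..., Z_n]

Z = feature_nodes(np.random.rand(100, 8))
```

Each of the n groups contributes L_e columns, so Z^n has n·L_e columns in total.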

B. Enhancement Nodes Generation
All the feature nodes in Eq. (2) are enhanced by Eq. (3), each enhancement processing generating L_h nodes, and the enhancement nodes are combined according to Eq. (4):

H_q = ξ(Z^n W_hq + β_hq), q = 1, 2, …, m    (3)

H^m = [H_1, H_2, …, H_m]    (4)

where ξ(·) denotes the activation function, which can be set the same as φ(·); W_hq and β_hq are generated randomly.
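In the same sketch style (NumPy, with ξ also taken to be tanh as an assumption), the enhancement stage of Eqs. (3)-(4) can be written as:

```python
import numpy as np

def enhancement_nodes(Z, m=3, L_h=20, seed=1):
    """Eqs. (3)-(4): m enhancement groups of L_h nodes each,
    computed from the combined feature nodes Z^n."""
    rng = np.random.default_rng(seed)
    groups = []
    for _ in range(m):
        W_h = rng.uniform(-1, 1, (Z.shape[1], L_h))   # random W_hq
        beta_h = rng.uniform(-1, 1, (1, L_h))         # random beta_hq
        groups.append(np.tanh(Z @ W_h + beta_h))      # H_q = xi(Z^n W_hq + beta_hq)
    return np.hstack(groups)                          # H^m = [H_1, ..., H_m]

H = enhancement_nodes(np.random.rand(100, 50))
```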

C. Output Y determination
The output Ŷ of BLS can be calculated according to Eq. (5):

Ŷ = [Z^n | H^m] W = A W    (5)

where A = [Z^n | H^m] joins all feature and enhancement nodes, and W represents the output weight of BLS, determined by the ridge regression approximation of Eq. (6) as Eq. (7):

W = argmin_W ‖A W − Y‖_2^2 + λ‖W‖_2^2    (6)

W = (λI + A^T A)^{−1} A^T Y    (7)
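Eq. (7) is a standard ridge regression solve; a NumPy sketch (function and variable names illustrative) is:

```python
import numpy as np

def output_weights(A, Y, lam=1e-3):
    """Eq. (7): W = (lam*I + A^T A)^{-1} A^T Y, with A = [Z^n | H^m]."""
    k = A.shape[1]
    return np.linalg.solve(lam * np.eye(k) + A.T @ A, A.T @ Y)

# Sanity check on noiseless synthetic data: with a tiny lambda,
# the ridge solution recovers the true linear weights.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 6))
W_true = rng.normal(size=(6, 1))
W = output_weights(A, A @ W_true, lam=1e-8)
```

Using np.linalg.solve rather than forming the explicit inverse is the numerically preferable way to evaluate Eq. (7).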

III. ROBUST STACKED BROAD LEARNING SYSTEM
A. The Structure of RSBLS
BLS, as a randomized learning algorithm, has an effective and efficient structure. Although the broad structure reduces the computational burden and achieves high learning accuracy, BLS with a multi-layer structure can extract deeper representation information [30][31][32][33]. Meanwhile, noise and outliers in the training data seriously affect the accuracy and generalization performance of BLS. Therefore, we propose a novel RSBLS to solve noisy data regression problems. Different from traditional BLS, RSBLS has a multi-layer structure of feature nodes and enhancement nodes: the feature nodes and enhancement nodes of each layer serve as the input of the next layer, and only the feature nodes and enhancement nodes of the final layer are fully connected to the output.
In addition, since outliers usually make up only a fraction of the training data, the noise can be regarded as sparse. The ℓ1 norm function is not only more robust against such sparse contamination but also ensures fast learning, making it especially suitable for large-scale data and deep models [34,35]. RSBLS therefore adopts the ℓ1 norm cost function and the ℓ2 regularization method to solve noisy data regression. Fig. 2 demonstrates the model structure of RSBLS; the RSBLS algorithm is described as follows:

1) RSBLS parameters initialization:
To simplify the RSBLS model, the number of layers is set to U, the feature mapping times of each layer to n_u, the number of neurons in each feature mapping to L_ue, the enhancement processing times of each layer to m_u, and the number of neurons in each enhancement processing to L_uh.
a) Feature nodes generation: For layer u (with input X^u, where X^1 = X is the original data), the input is transformed into feature nodes by Eq. (8), and all the nodes in the feature layer are combined through Eq. (9):

Z_p^u = φ(X^u W_ep^u + β_ep^u), p = 1, 2, …, n_u    (8)

Z^u = [Z_1^u, Z_2^u, …, Z_{n_u}^u]    (9)

b) Enhancement nodes generation: All the feature nodes in Eq. (9) are enhanced as enhancement nodes through Eq. (10), then all the enhancement nodes are connected by Eq. (11):

H_q^u = ξ(Z^u W_hq^u + β_hq^u), q = 1, 2, …, m_u    (10)

H^u = [H_1^u, H_2^u, …, H_{m_u}^u]    (11)

In addition, we combine all the nodes of layer u through Eq. (12) as the input of the next layer:

X^{u+1} = A^u = [Z^u, H^u]    (12)

c) Target output matrix Y determination: To reduce the computational burden of RSBLS, only the feature nodes and enhancement nodes of layer U are fully connected with the output, giving the linear system in Eq. (13); the ℓ1 norm cost function and ℓ2 regularization method are then used to calculate the output weights as in Eq. (14):

Y = A^U W    (13)

W = argmin_W ‖Y − A^U W‖_1 + λ‖W‖_2^2    (14)
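A compact NumPy sketch of the stacked construction in Eqs. (8)-(12) (tanh assumed for both φ and ξ; function and parameter names illustrative):

```python
import numpy as np

def rand_group(M, groups, width, rng):
    """One bank of `groups` random mappings of `width` nodes each."""
    out = []
    for _ in range(groups):
        W = rng.uniform(-1, 1, (M.shape[1], width))
        b = rng.uniform(-1, 1, (1, width))
        out.append(np.tanh(M @ W + b))
    return np.hstack(out)

def stacked_nodes(X, U=2, n_u=3, L_ue=5, m_u=2, L_uh=10, seed=0):
    """Eqs. (8)-(12): for each layer u, build feature nodes Z^u and
    enhancement nodes H^u, then feed A^u = [Z^u, H^u] to layer u+1.
    Only the final A^U is returned (it alone connects to the output)."""
    rng = np.random.default_rng(seed)
    A = X                                   # X^1 = X
    for _ in range(U):
        Z = rand_group(A, n_u, L_ue, rng)   # Eqs. (8)-(9)
        H = rand_group(Z, m_u, L_uh, rng)   # Eqs. (10)-(11)
        A = np.hstack([Z, H])               # Eq. (12): X^{u+1} = A^u
    return A

A_U = stacked_nodes(np.random.rand(50, 4))
```

Each layer's output width is n_u·L_ue + m_u·L_uh regardless of its input width, so the stack can be made arbitrarily deep without growing the final node matrix.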
The outputs of RSBLS can be calculated by Eq. (15):

Ŷ = A^U W    (15)
B. The Optimization of RSBLS
Eq. (14) can be considered as a constrained convex optimization problem; we use the augmented Lagrange multiplier (ALM) approach to solve it, transforming Eq. (14) into Eq. (16):

min_{W, e} ‖e‖_1 + λ‖W‖_2^2  s.t.  e = Y − A^U W    (16)

where e = Y − A^U W denotes the residual and Λ is the vector of Lagrange multipliers of the constraint. The optimal W, e and the Lagrange multiplier Λ are optimized iteratively by the ALM method using Eq. (17):

W^(k+1) = (2λI + μ(A^U)^T A^U)^(−1) (A^U)^T (μ(Y − e^(k)) + Λ^(k))
e^(k+1) = S_{1/μ}(Y − A^U W^(k+1) + Λ^(k)/μ)
Λ^(k+1) = Λ^(k) + μ(Y − A^U W^(k+1) − e^(k+1))    (17)

where μ > 0 is the penalty parameter of the augmented Lagrangian and S_τ(x) = sign(x) max(|x| − τ, 0) is the element-wise soft-thresholding operator.
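The ALM scheme for the ℓ1-loss, ℓ2-regularized problem of Eq. (14) alternates a ridge-type solve for W, an element-wise soft-thresholding step for the residual e, and a multiplier update. A NumPy sketch (fixed penalty μ; function names are illustrative, not the paper's):

```python
import numpy as np

def soft(x, tau):
    """Soft-thresholding S_tau(x) = sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def alm_l1_ridge(A, Y, lam=1e-4, mu=1.0, iters=300):
    """ALM sketch for min_W ||Y - A W||_1 + lam ||W||_2^2
    via the split e = Y - A W (cf. Eq. (16))."""
    N, k = A.shape
    e = np.zeros_like(Y)                    # residual variable
    Lmb = np.zeros_like(Y)                  # Lagrange multipliers
    G = np.linalg.inv(2 * lam * np.eye(k) + mu * A.T @ A) @ A.T
    for _ in range(iters):
        W = G @ (mu * (Y - e) + Lmb)        # ridge-type W update
        R = Y - A @ W
        e = soft(R + Lmb / mu, 1.0 / mu)    # prox of the l1 term
        Lmb = Lmb + mu * (R - e)            # multiplier ascent
    return W

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 5))
Y = A @ rng.normal(size=(5, 1))
W = alm_l1_ridge(A, Y)
```

Note that the matrix G is factored once outside the loop, which is what makes the ℓ1 approach competitive in cost with weighted least-squares schemes.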
The algorithm flow of RSBLS is described in Algorithm 1.
Parameters: the number of layers U; the feature mapping times of each layer n_u and the number of neurons in each feature mapping L_ue; the enhancement processing times of each layer m_u and the number of neurons in each enhancement processing L_uh.

IV. NUMERICAL EXPERIMENTS
All the experiments in this paper are implemented in MATLAB 2019b.

A. Experimental Datasets
In this part, six benchmark datasets from KEEL (http://www.keel.es/), including Concrete, Abalone, Stock, Mortgage, Treasury and Compactiv, are selected to demonstrate the feasibility of RSBLS. Table I gives the corresponding information of these datasets.
In addition, to verify the robustness of RSBLS, the datasets are preprocessed as follows. First, the features and corresponding labels are normalized to the range [0, 1]. Then, 75% of the samples of each original dataset are randomly selected as the training set, and the remaining 25% form the test set. Finally, 10%, 20%, 30%, 40% and 50% uniformly distributed outliers are inserted into the training labels according to Eq. (21):

y_noise = y + V_outlier,  V_outlier ∈ [−0.5, 0.5]    (21)

where y_noise expresses the contaminated training label, which lies in the range [−0.5, 1.5], and V_outlier is a uniformly distributed random outlier.
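The contamination scheme of Eq. (21) can be sketched as follows (NumPy; the paper does not spell out the sampling details, so the index selection and the exact noise distribution here are assumptions):

```python
import numpy as np

def contaminate(y, ratio, seed=0):
    """Eq. (21): add uniform outliers V in [-0.5, 0.5] to a `ratio`
    fraction of the normalized training labels y in [0, 1]."""
    rng = np.random.default_rng(seed)
    y_noise = y.copy()
    idx = rng.choice(y.size, size=int(ratio * y.size), replace=False)
    y_noise[idx] += rng.uniform(-0.5, 0.5, size=idx.size)
    return y_noise                          # values lie in [-0.5, 1.5]

y = np.random.rand(1000)
y_noise = contaminate(y, 0.3)
```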

B. Evaluation Indexes
To present the robustness of RSBLS, all the related algorithms are run 50 times independently, and the average root mean square error (RMSE) of the experimental results, Eq. (22), is recorded as the evaluation index:

RMSE = √( (1/N) Σ_{i=1}^{N} (y_i − f_i)² )    (22)

where y_i indicates the actual value of sample i, f_i represents the output for sample i, and N denotes the number of samples.
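For reference, Eq. (22) in NumPy:

```python
import numpy as np

def rmse(y, f):
    """Eq. (22): root mean square error over N samples."""
    y, f = np.asarray(y, dtype=float), np.asarray(f, dtype=float)
    return float(np.sqrt(np.mean((y - f) ** 2)))

print(rmse([0.0, 0.0], [3.0, 4.0]))  # sqrt((9 + 16) / 2)
```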

1) Parameter settings:
To illustrate the performance of the proposed RSBLS, RVFL [1,2], BLS [3][4][5], Huber-WBLS [23], KDE-WBLS [23] and RBLS [26] are chosen as the contrast algorithms. Among them, the sigmoidal function in Eq. (23) is used as the activation function of all the compared models:

φ(x) = 1 / (1 + e^(−x))    (23)

Some key parameters of RSBLS, RVFL, BLS, Huber-WBLS, KDE-WBLS and RBLS are listed as follows. RVFL: the hidden layer nodes are selected from {50, 100, 150, 200}; the weights and biases are generated randomly in the ranges [−1, 1] and [0, 1], respectively. BLS, Huber-WBLS, KDE-WBLS and RBLS: the numbers of feature mappings and feature nodes are chosen from {5, 10, …, 45, 50}, and the numbers of enhancement nodes are selected from {50, 100, 150, 200}; these models adopt the ℓ2 regularization technique, with the regularization parameter C chosen from {2^−10, 2^−5, 2^0, …, 2^15, 2^20}. Other settings follow the original references. RSBLS: the number of layers is set to 2; the feature mappings and feature nodes of each layer are chosen from {5, 10, …, 45, 50}, and the enhancement nodes of each layer are selected from {50, 100, 150, 200}; ℓ2 regularization is likewise adopted, with C chosen from {2^−10, 2^−5, 2^0, …, 2^15, 2^20}.

2) Results and discussion: Table II to Table VII show the average test RMSE results of RSBLS compared with the other models on the six regression problems with uniformly distributed outliers. As can be seen from Table II to Table VII, with the increase of the contamination rate, the performance of RVFL and BLS gradually becomes worse, while Huber-WBLS, KDE-WBLS and RBLS improve the robustness of BLS. At the same time, RSBLS with 2 hidden layers is clearly more robust than Huber-WBLS, KDE-WBLS and RBLS. The proposed RSBLS with the ℓ1 norm loss function gains the best mean RMSE on all six datasets, which demonstrates the effectiveness of the robust regularization method. Moreover, the uncertain data regression performance of these models across different datasets and contamination rates indicates the strong robustness of RSBLS: at different levels of outliers, RSBLS shows the best consistency.

In summary, RSBLS with the ℓ1 norm loss function can solve uncertain data regression with uniformly distributed outliers effectively; the stacked deep model helps to enhance the robustness of BLS.

V. CONCLUSION
In this paper, we propose a novel robust stacked broad learning system with multiple layers, named RSBLS, for solving uncertain data regression problems. In the proposed RSBLS, we expand BLS into a stacked deep model with multiple layers of feature nodes and enhancement nodes, which helps to extract deep representation information. In addition, the ℓ1 norm loss function is introduced to calculate the output weights of RSBLS, which can process noisy data while ensuring the learning efficiency of the hierarchical model. Experimental results on several regression datasets with different ratios of noise show that RSBLS has better robustness than RVFL, BLS, Huber-WBLS, KDE-WBLS and RBLS.

In the future, since some parameters must be selected by grid search, which limits the search scope, recent swarm intelligence algorithms can be used to choose the parameters of RSBLS.