An Extreme Learning Machine Model Approach on Airbnb Base Price Prediction

Prediction of the base price of Airbnb properties is still a new area of research, especially with the Extreme Learning Machine (ELM). Previous studies have reported several advantages of ELM, such as good generalization performance, fast learning speed, and high prediction accuracy. This paper describes how the ELM approach can be used as a prediction model for the Airbnb base price. In general, the steps are setting the number of hidden neurons, randomly assigning the input weights and hidden layer biases, and calculating the output layer weights; the entire learning process is completed in a single mathematical transformation without iteration. The performance of the model is estimated using mean squared error, mean absolute percentage error, and root mean squared error. An experiment on the London Airbnb dataset with twenty-one input features yields a faster learning speed and better accuracy than the existing model.
Keywords: Airbnb; base price prediction; extreme learning machine; fast learning


I. INTRODUCTION
Airbnb is a property-sharing marketplace that allows property owners and tenants to list their properties online so that guests can pay to stay in them as part of the hospitality business. In the hospitality domain, pricing and revenue management are two of the most frequently studied areas because of the theoretical and practical importance of room pricing. Airbnb hosts must master room pricing to increase their profitability while meeting guest expectations [1].
Paid third-party pricing software is available, but property owners are generally required to enter their regular daily price (base price), and the algorithm then varies the daily price around that base price. As a platform provider, Airbnb does not control how its hosts set prices for their listings, yet it provides a variety of tools to help hosts set their prices more effectively. For instance, hosts can set customized daily rates, weekend prices, and discounts for long-term stays, so determining the base price becomes an essential process [2].
To determine the property price, some researchers use a method with three components: i) a binary classification model that predicts the booking probability of each listing night, ii) a regression model that predicts the optimal price for each listing night, and iii) personalization logic applied on top of the model output to produce the final price suggestions [2].
In contrast to pricing problems where pricing strategies are applied to a large number of identical products, there are no identical products on Airbnb. Each listing on the Airbnb platform offers unique value and experiences to guests. A geographically weighted regression (GWR) approach has been implemented to identify several factors correlated with Airbnb listing prices. The unique nature of Airbnb listings makes it very hard to estimate the accurate demand curve needed to apply conventional revenue maximization pricing strategies. The offline and online evaluation results of this pricing model show that the proposed model performs better than a direct revenue maximization (max-rev) pricing strategy, because a max-rev strategy is likely to suffer from a demand curve that is hard to estimate [3].
Several machine learning algorithms, such as ridge regression, random forest regressor, linear regression, decision tree regressor, and lasso, have been used to forecast housing selling prices. The results show that feature selection is a significant component. Two key measures of machine learning performance are the accuracy of the prediction and the average error (fitness), both of which may be influenced by the features selected at different groups of correlation levels [4].
The ELM has excellent potential for system prediction and modeling, e.g., an ELM-based predictor for real-time frequency stability assessment (FSA) of power systems [5], electricity price forecasting [6], sales forecasting [7], security evaluation of wind power systems [8], and drying system modeling [9]. However, there has been no discussion of the ELM for Airbnb price forecasting.
In recent years, a simple learning algorithm, ELM, was introduced for single hidden layer feedforward networks (SLFNs) [10]. ELM has a faster learning speed and better generalization performance than traditional feedforward network learning algorithms such as the backpropagation (BP) algorithm. ELM achieves similar or better generalization performance than the Support Vector Machine for regression and binary classification, and much better generalization performance for multiclass classification cases [11].
ELM does not need to build an exact mathematical model of the object or understand the characteristics of the object. The technique can locate faults using only a limited number of fault samples for training and learning. ELM picks the input weights and hidden biases randomly and then analytically computes the output weights with the Moore-Penrose generalized pseudo-inverse [12]. The entire process is completed without iterations, so learning is very time-efficient. The ELM method overcomes numerous issues in gradient-based learning algorithms, for example, the learning rate, stopping criterion, local minima, and number of epochs [13]. The objectives of this research are: 1) to provide a new approach for Airbnb base price prediction; 2) to give a more accurate and faster prediction model for Airbnb base price prediction.
*Corresponding Author 179 | Page www.ijacsa.thesai.org

II. RELATED WORKS
Price forecasting has been carried out in previous studies, and several ELM methods have been applied to it. In gold price forecasting [14], a learning algorithm for a single hidden layer feedforward neural network called the Extreme Learning Machine (ELM), which has good learning capacity, is utilized. That study also analyzes five models, namely feedforward backpropagation networks, feedforward networks without feedback, radial basis function networks, Elman networks, and the ELM learning model. The results demonstrate that ELM learning performs better than the other methods.
The research in [15] focuses on how to design an approach that can improve prediction accuracy as well as speed up the prediction process for stock market prediction. First, in order to obtain the most significant features of the market news documents, the article proposes a new feature selection algorithm called NRDC, as well as a new feature weighting algorithm (N-TF-IDF), to help improve the prediction accuracy. Experimental results demonstrate that the N-N-K-ELM model can achieve better performance in terms of both prediction accuracy and prediction speed in most cases.
In [5], ELM is used as a real-time predictor. The article reviews ELM applications in power engineering and then develops an ELM-based predictor for real-time frequency stability assessment (FSA) of power systems. The inputs of the predictor are power system operational parameters, and the output is the frequency stability margin, which measures the security level of the power system subject to a contingency. After offline training with a frequency stability database, the predictor can be applied online for real-time FSA. Benefiting from the fast learning speed of ELM, the predictor can be updated online for enhanced robustness and reliability.
ELM is also used for electricity market price forecasting in [6]. To overcome the drawbacks of earlier approaches, that article proposes a fast electricity market price forecasting method based on a recently developed learning technique for single hidden layer feedforward neural networks, the extreme learning machine (ELM). The new approach also improves price interval forecast accuracy by incorporating a bootstrapping method for uncertainty estimation. The results show the great potential of the proposed approach for accurate online forecasting of spot market prices.
In this research, ELM is used to predict the Airbnb property base price. Its main advantages are a faster learning speed and better generalization performance, with the hypothesis that these will improve the accuracy of the model.

A. Extreme Learning Machine
The Extreme Learning Machine is a simple and effective learning algorithm for single hidden layer feedforward neural networks (SLFNs) [12]. It only needs the number of hidden layer nodes and an infinitely differentiable activation function to be set before training, and it obtains the optimal solution by solving for the minimum-norm least-squares solution of a linear system. The hidden biases and input weights are chosen randomly, and the output weights are determined analytically for a given number of hidden neurons. The entire computation is completed in one pass without iteration.
The output function of ELM for generalized SLFNs is shown in (1),
where β = [β_1, …, β_L]^T is the output weight vector between the hidden layer of L nodes and the m ≥ 1 output nodes, and h(x) = [h_1(x), …, h_L(x)] is a nonlinear feature mapping, that is, the output (row) vector of the hidden layer with respect to the input x. h_i(x) is the output of the i-th hidden node. The output functions of hidden nodes may not be unique; different output functions may be used in different hidden neurons. In particular, in real applications h_i(x) can be formulated as in (2).
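For reference, the output function referred to as (1) is commonly written in the ELM literature as shown below. This is a reconstruction from the definitions of β and h(x) given here, assuming the standard formulation; the additive form used for (2), with weights w_i, biases b_i, and activation function g as defined for (3), is one common choice and is likewise an assumption.

```latex
\begin{align}
f_L(x) &= \sum_{i=1}^{L} \beta_i \, h_i(x) = h(x)\,\beta \tag{1} \\
h_i(x) &= g(w_i \cdot x + b_i) \tag{2}
\end{align}
```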
A standard SLFN with L hidden layer nodes (N_0 ≥ L) and activation function g(x) is mathematically modeled as in (3),
where w_i = [w_i1, w_i2, …, w_in]^T is the weight vector connecting the i-th hidden node and the input nodes, β_i = [β_i1, β_i2, …, β_im]^T is the weight vector connecting the i-th hidden node and the output nodes, b_i is the bias of the i-th hidden node, and w_i · x_j denotes the inner product of w_i and x_j. The network structure is shown in Fig. 1. For an SLFN with L hidden layer nodes and activation function g(x) to approximate the N samples with zero error means that Σ_{j=1}^{N} ‖o_j − t_j‖ = 0; that is, there exist β_i, w_i, and b_i such that (4) holds.
Equation (4) can be written compactly as (5), where H, given in (6), is the hidden layer output matrix of the neural network; the i-th column of H is the output vector of the i-th hidden layer node with respect to the inputs x_1, …, x_N.
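Equations (3) to (6) can be written out as follows. This is a reconstruction from the symbol definitions in the surrounding text, assuming the standard ELM formulation, with β stacked as an L × m matrix of the rows β_i^T and T as an N × m matrix of the target rows t_j^T; it requires the amsmath package.

```latex
\begin{align}
\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) &= o_j, \quad j = 1, \dots, N \tag{3} \\
\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) &= t_j, \quad j = 1, \dots, N \tag{4} \\
H \beta &= T \tag{5} \\
H &= \begin{bmatrix}
g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\
\vdots & \ddots & \vdots \\
g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L)
\end{bmatrix}_{N \times L} \tag{6}
\end{align}
```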
When the activation function is infinitely differentiable, the input connection weights w_i and hidden layer biases b_i can be set randomly at the beginning of training and remain fixed during the training process. The output connection weights are obtained by finding the least-squares solution of the linear system in (7), with the result given in (8),
where H† is the Moore-Penrose generalized inverse of the hidden layer output matrix H. The output weights are thus determined by a mathematical transformation. This avoids the long training phase in which network parameters are iteratively adjusted with some suitable learning parameters (such as the number of iterations and the learning rate).
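The least-squares problem and its solution, referred to as (7) and (8), are usually stated as below. This is a reconstruction under the standard ELM formulation, with T denoting the matrix of training targets (an assumed symbol). When H has full column rank, the pseudo-inverse reduces to H† = (HᵀH)⁻¹Hᵀ.

```latex
\begin{align}
\min_{\beta} \; \lVert H\beta - T \rVert \tag{7} \\
\hat{\beta} = H^{\dagger} T \tag{8}
\end{align}
```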
The ELM learning algorithm can be summarized in the following steps:
Step 1: Given a training set (x_i, t_i), i = 1, 2, …, N, an activation function g(x), and the number of hidden layer nodes L, set the input weights w_i and hidden layer biases b_i randomly.
Step 2: Calculate the hidden layer output matrix H.
Step 3: Calculate the output weights β using the Moore-Penrose generalized inverse H†, as in (8).
Because the ELM algorithm does not require iterative adjustment of the input weights and biases during training, it reduces the complexity of training and markedly improves training speed.
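The training procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the class name `ELMRegressor`, the uniform range [-1, 1] for the random weights, and the seed handling are assumptions introduced here.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


class ELMRegressor:
    """Single hidden layer ELM (hypothetical helper class for illustration)."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Entry (j, i) of the result is g(w_i . x_j + b_i).
        return sigmoid(X @ self.W + self.b)

    def fit(self, X, T):
        n_features = X.shape[1]
        # Random input weights and biases (assumed range [-1, 1]); never updated.
        self.W = self.rng.uniform(-1.0, 1.0, size=(n_features, self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Hidden layer output matrix H with shape (N, L).
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse: beta = pinv(H) T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

Training involves a single pseudo-inverse computation and no gradient updates, which is the source of the speed advantage discussed in the text.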

B. Performance Evaluation
An evaluation needs to be performed to measure performance and to provide feedback that can serve to improve the model. In this research, root mean squared error (RMSE), mean square error (MSE), and mean absolute percentage error (MAPE) measurements were performed [17].
MSE measures prediction accuracy by squaring the error of each observation in a data set and then averaging the squared errors. MSE gives greater weight to large errors than to small ones because each error is squared before summing. MSE can be calculated by (9).
RMSE is the square root of the mean squared error and is formulated as in (10).
MAPE is the average of the absolute differences between the predicted and actual values, expressed as a percentage of the actual values. The formula for the mean absolute percentage error is given in (11),
where y_i is the actual value, ŷ_i is the predicted value, i is the index of the observation, and n is the number of observations.
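The three measures can be computed as in the following sketch, which uses the usual textbook definitions: MSE = (1/n) Σ (y_i − ŷ_i)², RMSE = √MSE, and MAPE = (100/n) Σ |(y_i − ŷ_i) / y_i|. The function names are illustrative, and the MAPE helper assumes that no actual value is zero.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average of the squared differences."""
    n = len(actual)
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))
```

For example, with actual values [100, 200, 400] and predictions [110, 180, 400], the errors are −10, 20, and 0, so the MSE is 500/3 and the MAPE is 20/3 percent.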
A study in [18] showed that using MAPE as a quality measure for regression models is feasible from both a practical and a theoretical point of view.

A. Data Description
The dataset used for this study comes from InsideAirbnb.com. It was downloaded on 9 April 2019 and contains data on all London Airbnb listings that were live on the site on that date, a total of 79,671 listings. The data has 106 features and is saved in .csv format.
In this study, the base price is predicted from 21 input features chosen after preprocessing the Airbnb listing data. In the sample dataset, the input features that contain a price value are the advertised price of the listing, the security deposit fee, the cleaning fee, and the extra people fee.

B. Preprocessing
The dataset needs to be transformed and prepared to suit the modeling needs. The original dataset has 106 features, including quite a few text columns for the various description fields. Some features (columns) are dropped because they are not expected to be useful for predicting price or because they contain many null or NaN entries. Fee-related entries with missing values were replaced with the median to avoid division by zero.
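The two preprocessing operations mentioned above, dropping sparse columns and filling missing fee values with the median, can be illustrated with a small standard-library sketch. The column names, the 0.5 missing-ratio threshold, and the list-of-dictionaries representation are assumptions made for illustration; the text does not state the threshold that was actually used.

```python
from statistics import median


def drop_sparse_columns(rows, max_missing_ratio=0.5):
    """Drop columns whose share of missing (None) values exceeds the threshold."""
    n = len(rows)
    keep = [
        col for col in rows[0]
        if sum(r[col] is None for r in rows) / n <= max_missing_ratio
    ]
    return [{col: r[col] for col in keep} for r in rows]


def impute_median(rows, column):
    """Replace missing (None) values in `column` with the median of observed values."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


# Toy listings; "cleaning_fee" and "notes" are hypothetical column names.
listings = [
    {"cleaning_fee": 30.0, "notes": None},
    {"cleaning_fee": None, "notes": None},
    {"cleaning_fee": 50.0, "notes": "garden"},
    {"cleaning_fee": 40.0, "notes": None},
]
cleaned = impute_median(drop_sparse_columns(listings), "cleaning_fee")
```

In this toy example the mostly empty `notes` column is dropped and the missing cleaning fee is filled with 40.0, the median of the observed fees.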

C. Network Architecture
The ELM model was designed to solve the Airbnb base pricing problem. The proposed method is shown in Fig. 2. Twenty-one inputs from the dataset were fed to the ELM, while the base price was assigned as the output of the ELM. The performance of ELM depends on the type of activation function used and the number of hidden neurons. A sigmoid function was chosen because it is not overly sensitive to user-specified parameters, and the best ELM model was then determined through parameter testing.
The twenty-one features chosen as input after preprocessing, selected for their relationship to the basic fee types of a property price, are shown in Table I.
The ELM parameter testing aims to determine the best parameters for benchmarking and for producing the Airbnb base price predictions. The test is divided into two parts. The first test examines how the proportion of training data and test data affects the evaluation parameter values. The second test determines how the number of hidden neurons affects the evaluation parameter values and execution time; for this purpose, the number of hidden layer neurons is increased from 100 to 1000 in steps of 100. Each parameter value is tested ten times and evaluated based on the average evaluation value and execution time.
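The test protocol can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' experiment code: the helper names, the re-randomized split in every trial, and the choice to time both training and prediction are assumptions, while the split ratios, the hidden neuron grid, and the ten repetitions come from the text.

```python
import time
import numpy as np


def elm_fit_predict(X_train, y_train, X_test, n_hidden, rng):
    """Train a sigmoid ELM and return test predictions (illustrative helper)."""
    W = rng.uniform(-1.0, 1.0, size=(X_train.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)

    def hidden(X):
        return 1.0 / (1.0 + np.exp(-(X @ W + b)))

    beta = np.linalg.pinv(hidden(X_train)) @ y_train
    return hidden(X_test) @ beta


def run_trials(X, y, train_ratio, n_hidden, n_trials=10, seed=0):
    """Return the average test MSE and average wall-clock time over the trials."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_ratio * len(X)))
    mses, times = [], []
    for _ in range(n_trials):
        order = rng.permutation(len(X))  # assumed: new random split per trial
        train, test = order[:n_train], order[n_train:]
        start = time.perf_counter()
        pred = elm_fit_predict(X[train], y[train], X[test], n_hidden, rng)
        times.append(time.perf_counter() - start)
        mses.append(float(np.mean((y[test] - pred) ** 2)))
    return float(np.mean(mses)), float(np.mean(times))


# Parameter grids taken from the text.
TRAIN_RATIOS = (0.70, 0.80)
HIDDEN_SIZES = tuple(range(100, 1001, 100))
```

A full sweep would call `run_trials` once per ratio with 100 hidden neurons, then once per entry of `HIDDEN_SIZES` with the selected ratio.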
The first test determines the size ratio of training data to testing data. The ratios tested are 70:30 and 80:20. At this stage, the number of hidden neurons is used as a control variable and is fixed at 100. The results of testing the training-to-testing data ratio are shown in Table II.
The test results show that the ELM model with a training-to-test data ratio of 70%:30% had better average performance than the model with a ratio of 80%:20% for all accuracy parameters.

A. Number of Hidden Neuron Testing
The next test determines the number of hidden neurons in the hidden layer. The numbers of hidden neurons tested are 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000, as indicated in [20]. The MSE results from this step are shown in Table III.
The results show that the MSE tends to decrease as the number of hidden neurons increases, which is more evident in the chart of average MSE versus number of hidden neurons in Fig. 3.
The experiment results, as depicted in Fig. 3, indicate that 1000 hidden neurons give a better MSE than the other hidden neuron counts for the lowest, highest, and average MSE in training and testing, except for 500 hidden nodes on the average testing MSE. The next accuracy test is for RMSE, as shown in Table IV. It shows the same behavior as MSE: the more hidden nodes in the model, the smaller the RMSE.
When the average RMSE is plotted, this behavior becomes more obvious, as seen in Fig. 4. The last accuracy data is the MAPE for various numbers of hidden neurons, as shown in Table V, which gives the average MAPE results for each testing parameter. The result is similar to the other two accuracy parameters: the more neurons in the hidden layer, the smaller the MAPE. It also demonstrates the same behavior for 500 and 800 nodes. When the average MAPE is charted, the same pattern appears, as shown in Fig. 5.
Fig. 3, 4, and 5 show that as the number of neurons increases, the accuracy tends to increase as well. This could be because more neurons provide more links between the input and output layers, which leads to a better-quality learning process. More links between the input and output layers require more computation time, as indicated in Table VI.
Table VI shows that training with 1000 hidden neurons takes almost six times longer than with 100 hidden neurons. The computation time tends to increase linearly, as shown in Fig. 6. As observed in Fig. 6, the execution time of the model keeps growing as the number of hidden neurons rises because the learning time of ELM is spent mostly on computing the Moore-Penrose generalized inverse H† of the hidden layer output matrix H.
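The dependence of execution time on the number of hidden neurons can be examined in isolation by timing only the pseudo-inverse step. The sketch below is an assumption-laden illustration: it uses a random matrix as a stand-in for H, and the helper names are hypothetical. For a dense N × L matrix with N ≥ L, SVD-based pseudo-inverse routines are generally expected to cost on the order of N·L² operations, so the measured trend may depend on the data size and the linear algebra backend.

```python
import time
import numpy as np


def time_pinv(n_samples, n_hidden, seed=0):
    """Return (seconds, shape) for one pseudo-inverse of a random N x L matrix."""
    rng = np.random.default_rng(seed)
    # Stand-in for the hidden layer output matrix H; sigmoid outputs lie in (0, 1).
    H = rng.uniform(0.0, 1.0, size=(n_samples, n_hidden))
    start = time.perf_counter()
    H_pinv = np.linalg.pinv(H)
    elapsed = time.perf_counter() - start
    return elapsed, H_pinv.shape


def sweep(n_samples, hidden_sizes):
    """Map each hidden layer size to the time of one pseudo-inverse computation."""
    return {size: time_pinv(n_samples, size)[0] for size in hidden_sizes}
```

Calling `sweep` with the training set size and the hidden neuron counts from the text gives a timing curve that can be compared against the reported trend.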
Based on the tests conducted, the parameters used to produce the Airbnb price prediction results with the ELM method are a training-to-test data ratio of 70%:30% and 1000 hidden neurons.

B. Comparison with XGBoost
In the testing phase, the performance of the resulting model was evaluated using the three selected measures, namely MAPE, MSE, and RMSE. For comparison, predictions were also made using the XGBoost method [19]. The test results for the three measures are shown in Table VII. The ELM model produces a MAPE of 3.06% for the training data and 3.07% for the test data, an MSE of 0.038 for the training data and 0.057 for the test data, and an RMSE of 0.196 for the training data and 0.239 for the test data. The XGBoost model produces a MAPE of 6.70% for the training data and 6.75% for the test data, an MSE of 0.181 for the training data and 0.198 for the test data, and an RMSE of 0.4260 for the training data and 0.4454 for the test data. These results show that the ELM model outperforms the XGBoost model on all three accuracy parameters.
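Because RMSE is the square root of MSE, the reported values can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the text; the small gaps it finds are consistent with rounding, or with averaging RMSE over several runs rather than taking the square root of an averaged MSE, although the text does not say which applies.

```python
import math

# (MSE, RMSE) as quoted in the text for each model and data split.
reported = {
    ("ELM", "train"): (0.038, 0.196),
    ("ELM", "test"): (0.057, 0.239),
    ("XGBoost", "train"): (0.181, 0.4260),
    ("XGBoost", "test"): (0.198, 0.4454),
}

# Absolute gap between sqrt(MSE) and the reported RMSE.
gaps = {key: abs(math.sqrt(m) - r) for key, (m, r) in reported.items()}

# Ratio of the reported test MAPE values (XGBoost relative to ELM).
mape_ratio = 6.75 / 3.07
```

All four gaps are below 0.002, and the XGBoost test MAPE is roughly 2.2 times that of the ELM model.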

VI. CONCLUSION
In conclusion, we have developed an ELM prediction model for the Airbnb base price and tested it on London Airbnb listings from April 2019. The model is trained using 21 features, a 70%:30% data split, and a maximum of 1000 neurons. The experiment results show that the model has good accuracy, with a best average MSE of 0.096, RMSE of 0.304, and MAPE of 4.88% for the test data. These accuracies are mostly achieved with 1000 neurons. The experiments show that as the number of neurons rises, the links between the input and output layers increase, which leads to a better quality of learning. The model outperforms the XGBoost algorithm on these accuracy parameters and has a much faster learning time. For further research, the number of neurons for training can be expanded beyond 1000 with more powerful hardware, so that the point at which accuracy reaches its optimum value as the number of neurons grows can be found. The base price prediction can also be extended to daily price prediction with more features, such as scheduled events and holidays.

ACKNOWLEDGMENT
The authors would like to express their gratitude to the Direktorat Penelitian Universitas Gadjah Mada, Indonesia, for providing the research grant.