A Predictive Model for Solar Photovoltaic Power using the Levenberg-Marquardt and Bayesian Regularization Algorithms and Real-Time Weather Data

The stability of power production in photovoltaic (PV) power plants is an important issue for large-scale grid-connected systems because it affects the control and operation of the electrical grid. An efficient forecasting model is proposed in this paper to predict the next-day solar photovoltaic power using the Levenberg-Marquardt (LM) and Bayesian Regularization (BR) algorithms and real-time weather data. The correlations between the global solar irradiance, temperature, solar photovoltaic power, and the time of the year were studied to extract knowledge from the available historical data for the purpose of developing a real-time prediction system. The solar PV power data were extracted from the power plant installed on top of the Faculty of Engineering building at Applied Science Private University (ASU), Amman, Jordan, and the real-time weather data were measured by the ASU weather station on the same campus. A large number of training, validation, and testing experiments were carried out on the available records to optimize the Neural Network (NN) configurations and to compare the performance of the LM and BR algorithms with different sets and combinations of weather data. Promising results were obtained for real-time next-day forecasting, with a Root Mean Square Error (RMSE) of 0.0706 using the Bayesian regularization algorithm with 28 hidden layers and all weather inputs. The Levenberg-Marquardt algorithm provided an RMSE of 0.0753 using 23 hidden layers for the same set of learning inputs. This research shows that the Bayesian regularization algorithm outperforms the reported real-time prediction systems for PV power production.
Keywords: Solar photovoltaic; solar irradiance; PV power forecasting; machine learning; artificial neural networks; Levenberg-Marquardt; Bayesian regularization


I. INTRODUCTION
Rising fuel costs and increasing energy demand, together with ongoing industrial growth and environmental awareness, have drawn attention to the importance of renewable energy sources such as solar Photovoltaic (PV) systems [1,2]. As one of the most important renewable energy sources, PV energy is becoming the dominant clean and reliable energy source, widely used around the world without causing damage to the environment. The term "Photovoltaic" was first used by Alfred [3] to describe the process of converting light into electricity. There are two modes of installation for solar PV power plants: grid-tied and off-grid systems [4]. The first mode is widely used and has proven to be hugely beneficial. Its output depends on the variable weather conditions of the geographical area of the system, which is why it is known as an uncertain, uncontrollable, and non-schedulable power source [5]. The second mode, off-grid systems, is used for isolated or remote areas, normally on a smaller scale.
Many studies reported in the literature suggest different modeling, simulation, and prediction methods for the expected power production of solar PV plants, with the aim of improving investment feasibility and maintaining stable power quality and scheduling [6,7]. Fonseca [8] compared the accuracy of one-day-ahead prediction of the power produced by a 1 MW PV system using two methods: Support Vector Machines (SVM) and Multilayer Perceptron (MP) Artificial Neural Networks (ANNs). The two algorithms obtained almost the same accuracy, with a Mean Absolute Error (MAE) of 0.07 kWh/m² and a Root Mean Square Error (RMSE) of 0.11 kWh/m².
Various forecasting methods of PV power output were reviewed in [9]. It was demonstrated that any model that uses numerically predicted weather data does not take into account the effect of cloud cover and cloud formation at initialization; therefore, sky imaging and satellite data methods are used to predict the PV power output with higher accuracy. The article also outlined some key factors affecting prediction accuracy, such as the forecast horizon, forecasting interval width, system size, and PV panel mounting method (fixed or tracking). A model using a multilayer perceptron-based ANN was proposed in [5] for one-day-ahead forecasting. The daily solar power output and atmospheric temperature for 70 days were used to train the ANN. Across the different settings of the ANN model (number of hidden layers, activation function, and learning rule), the minimum Mean Absolute Percentage Error (MAPE) achieved was 0.855%.
The aim of the work published in [10] was to study the effect of the forecast horizon on the accuracy of the method used to predict PV power production, which was Support Vector Regression (SVR) using numerically predicted weather data. Two forecast horizons were studied: up to 2 hours and up to 25 hours ahead. As expected, forecasting up to 2 hours ahead was more accurate: the RMSE and MAE increased by 13% and 17%, respectively, when the forecast horizon was extended to 25 hours ahead. Cococcioni [11] developed and validated a model that adapted an ANN with tapped delay lines for one-day-ahead forecasting. The inputs were the irradiation and the sampling hours. The model achieved a seasonal MAE ranging from 12.2% to 26% in spring and autumn, respectively. Monteiro [12] compared two short-term forecasting models: the analytical PV power forecasting model (APVF) and the MP PV forecasting model (MPVF), both using numerically predicted weather data and past hourly values of PV electric power production. The two models achieved similar results (RMSE varying between 11.95% and 12.10%) with forecast horizons covering all daylight hours of the next day, which demonstrated their applicability to PV electric power prediction.
Leva [13] proposed a new Physical Hybrid ANN (PHANN) method to improve the accuracy of the standard ANN method. The hybrid method is based on an ANN and clear-sky curves for a PV plant. The PHANN method reduced the Normalized MAE (NMAE) and the Weighted MAE (WMAE) by almost 50% on many days compared with the standard ANN method. In [14], the PV energy production for the next day at 15-minute intervals was accurately predicted with an SVM model that uses historical data for solar irradiance, ambient temperature, and past energy production. The method demonstrated very good accuracy, with R² correlation coefficients of more than 90%, and the coefficient depended strongly on the quality of the weather forecast.
In our previous work [15], we proposed an initial real-time forecasting model for PV power production using ANNs based on the available solar irradiation records for the last few days. In this research work, ANNs were optimized by comparing the Levenberg-Marquardt (LM) and Bayesian Regularization (BR) algorithms to analyze and correlate the available data of temperature, solar irradiance, timing, and the generated solar PV power. The suggested system provides real-time PV energy forecasts for the next 24 hours based on real-time weather data for the last week.

II. PV AND WEATHER DATA

A. PV Systems
There are four separate PV systems installed at the university campus, with a total generation capacity of 550 kWp:
• PV ASU00 (The Test Field): This system was installed in 2013 with a capacity of 56.4 kWp, including a CPV tracker, a polycrystalline tracker, polycrystalline and monocrystalline panels (South and East/West oriented), and thin-film panels.
• PV ASU08 (The Library): A rooftop-mounted 130.1 kWp system of Yingli Solar panels and SMA Sunny Tripower inverters.
• PV ASU09 (Faculty of Engineering): This is the largest rooftop-mounted PV system at ASU, installed on top of the Faculty of Engineering building with a capacity of 264 kWp [16]. It consists of 14 SMA Sunny Tripower inverters (17 kW and 10 kW) connected to Yingli Solar (YL 245P-29b-PC) panels that are tilted by 11° and oriented 36° (S to E) (see Fig. 1).
• PV ASU10 (Deanship of Student Affairs): A rooftop-mounted 117.4 kWp system of Yingli Solar panels and SMA Sunny Tripower inverters.

B. ASU Weather Station
ASU's weather station was installed in 2015 as the first of its kind in Jordan, providing a wide range of weather data measured by the latest sensors and devices [17]. It is located about 175 m from the engineering building, as shown on the map of Fig. 2. The station is equipped with instruments to measure:
• Wind speed and direction (at 4 altitudes between 10 and 36 m).
• Global solar irradiance (including separate direct radiation and diffuse radiation records).
• Subsoil and soil surface temperature.
The real-time weather measurements obtained by the station are updated continuously and published on the website: http://energy.asu.edu.jo as depicted in Fig. 3.
In this research work, we created a two-year dataset using the available hourly records of weather and PV energy data for the period between 16 May 2015 and 15 May 2017. This dataset of 731 days includes 17544 weather records and 17544 PV power values. PV power records were obtained only from the PV ASU09 system, as it is the largest and most stable system on campus.
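The record counts quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that both end dates are included and that exactly one record is stored per hour; the variable names are illustrative only.

```python
from datetime import date

# Dataset period as stated in the text (both end dates assumed inclusive).
start = date(2015, 5, 16)
end = date(2017, 5, 15)

# The period spans 29 February 2016, so it covers 366 + 365 days.
days = (end - start).days + 1

# One record per hour is assumed, as the dataset is described as hourly.
hourly_records = days * 24

print(days, hourly_records)
```

Under these assumptions the calculation gives 731 days and 17544 hourly records, matching the figures stated for both the weather and the PV power series.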

III. FORECASTING SYSTEM
The suggested forecasting system can be represented by the block diagram of Fig. 4 and is described in the following subsections.

A. Data Filtering and Association
The collected weather and PV power data are first filtered, as shown in Fig. 4, to remove any incomplete records, which guarantees the consistency of the dataset. This includes any weather record with no PV power value at the same time and any PV power record with missing weather data.
Based on the timing data, associations were made by matching solar PV power records with weather records, including the record time, temperature, and global solar irradiance. To obtain homogeneous data and reliable machine learning experiments, the final dataset was normalized to the range between 0 and 1.
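The filtering, association, and normalization steps described in this subsection can be sketched as follows. This is an illustrative example only: the dictionary layout, the function names `associate` and `min_max`, and the sample values are assumptions, and min-max scaling is assumed as the normalization method because the text only states that the data were scaled between 0 and 1.

```python
def associate(weather, pv):
    """Match weather and PV records by timestamp and drop incomplete ones.

    weather: dict mapping timestamp -> (temperature, irradiance)
    pv:      dict mapping timestamp -> PV power
    (This layout is hypothetical, chosen only for the illustration.)
    """
    rows = []
    # Keep only timestamps that appear in BOTH sources.
    for ts in sorted(weather.keys() & pv.keys()):
        temp, rad = weather[ts]
        power = pv[ts]
        # Skip records in which any field is missing.
        if None in (temp, rad, power):
            continue
        rows.append((ts, temp, rad, power))
    return rows


def min_max(values):
    """Scale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up sample records, not taken from the ASU dataset.
weather = {
    "2015-05-16 10:00": (24.0, 800.0),
    "2015-05-16 11:00": (26.0, 900.0),
    "2015-05-16 12:00": (None, 950.0),   # missing temperature
    "2015-05-16 13:00": (28.0, 1000.0),  # no PV record at this time
}
pv = {
    "2015-05-16 10:00": 120.0,
    "2015-05-16 11:00": 150.0,
    "2015-05-16 12:00": 160.0,
    "2015-05-16 14:00": 140.0,           # no weather record at this time
}

rows = associate(weather, pv)
norm_power = min_max([r[3] for r in rows])
print(rows)
print(norm_power)
```

Only the 10:00 and 11:00 records survive the filter in this example, since every other timestamp lacks either a weather field or a PV value.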

B. LM and BR ANNs
Compared with similar algorithms [18,19,20], ANNs are known as one of the most powerful machine learning techniques, with a wide range of applications [21,22,23]. ANNs map the inputs non-linearly, through adjustable weights, to the desired targets. The network consists of three layers: the input, hidden, and output layers [24], as illustrated in the example of Fig. 5 for a network of 11 inputs, 23 hidden layers, and one output.
ANNs have shown excellent learning and classification performance when dealing with real-world sensor data [25,26,27]. In this work, we applied the LM and BR algorithms to neural networks and compared their learning performance using different training configurations.
The LM backpropagation optimization algorithm was initially reported in [28] and was later applied to neural networks in [29,30]. The LM algorithm for ANNs is implemented in MATLAB and is known as the fastest supervised backpropagation algorithm, especially when training feedforward ANNs of moderate size [31]. The BR backpropagation algorithm was introduced in [32] and [33] and is also implemented in MATLAB [34]. Both the LM and BR ANNs calculate the derivatives of the network errors with respect to the weights and biases to obtain a Jacobian matrix that is used for the calculations, which means that the performance can only be measured by the mean squared error [29].
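For reference, the role of the Jacobian matrix in both algorithms can be written in the commonly used textbook form given here. The notation is an assumption introduced for illustration and is not taken from this paper: w is the vector of weights and biases, e is the vector of network errors, J is the Jacobian of e with respect to w, μ is the LM damping factor, and α and β are the regularization parameters of the BR objective.

```latex
% Levenberg-Marquardt weight update (standard form, notation assumed):
\[
  \Delta w = -\left(J^{\top} J + \mu I\right)^{-1} J^{\top} e
\]

% Bayesian regularization minimizes a weighted sum of the squared errors
% E_D and the squared weights E_W (standard form, notation assumed):
\[
  F(w) = \beta E_D + \alpha E_W, \qquad
  E_D = \sum_{i=1}^{N} e_i^{2}, \qquad
  E_W = \sum_{j=1}^{M} w_j^{2}
\]
```

A small μ makes the LM step approach the Gauss-Newton step, while a large μ makes it approach a short gradient-descent step. Both expressions are built on squared errors, which is consistent with the restriction to squared-error performance measures noted above.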

C. Training and Testing Experiments
Data of the global solar irradiance ($Rad_d(t)$) and the temperature ($Temp_d(t)$) at an altitude of 1 m are used to form the weather information vector $W_d(t)$ at time $t$ of day $d$. Two neural network models were created based on the LM and BR algorithms, with the mean PV power $P_d(t)$ as the target. The inputs to these models are the current time stamp measured from the beginning of the current year ($T_d(t)$) and the available weather vectors at the same time $t$ over the five days preceding day $d$. So, as depicted in Fig. 6, each input sample of the training dataset consists of:

$$[\,T_d(t),\; W_{d-1}(t),\; W_{d-2}(t),\; \dots,\; W_{d-5}(t)\,]$$

and one output value $P_d(t)$, with

$$W_d(t) = [\,Rad_d(t),\; Temp_d(t)\,]$$

In this work, the MATLAB Neural Networks toolbox was used for a large number of training, validation, and testing experiments with different input combinations and with the number of hidden layers varied from 1 to 30. The sets of inputs used are:
• ALL inputs (1 time value, 5 radiation values, and 5 temperature values).
• 3 inputs (1 time value, 1 radiation value, and 1 temperature value).

Ten experiments were run for each number of hidden layers. In each experiment, the samples of the dataset were randomly shuffled to generate the sub-datasets: 80% for training, 5% for validation, and 15% for testing. Then, the performance was evaluated by calculating the average RMSE over the ten experiments using:

$$\overline{RMSE} = \frac{1}{10}\sum_{k=1}^{10} RMSE_k, \qquad RMSE_k = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - \hat{P}_i\right)^2}$$

where $P_i$ and $\hat{P}_i$ are the measured and predicted normalized PV power values of sample $i$, and $N$ is the number of evaluated samples in experiment $k$.

The network configurations that provided the best performance are listed in Table I. The best training/testing experiments provided an average RMSE of 0.0706, a best testing correlation coefficient of R=0.9660, and a mean square error of 0.00485 when using all inputs to the BR ANNs with 28 hidden layers; the corresponding testing performance is illustrated in Figs. 7 and 8.
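The construction of the 11-value input samples and the random 80/5/15 split can be sketched as follows. This is a minimal illustration, not the MATLAB procedure used in this work: it assumes a gap-free hourly series so that a lag of one day equals 24 positions, and the function names, the seed, and the synthetic series are hypothetical.

```python
import random

HOURS_PER_DAY = 24
LAG_DAYS = 5  # weather of the previous five days, as described in the text


def build_samples(time_of_year, rad, temp, power):
    """Build (inputs, target) pairs: 1 time value, 5 radiation values and
    5 temperature values taken at the same hour on the previous 5 days."""
    samples = []
    for i in range(LAG_DAYS * HOURS_PER_DAY, len(power)):
        # Indices of the same hour on days d-1 ... d-5.
        lags = [i - k * HOURS_PER_DAY for k in range(1, LAG_DAYS + 1)]
        features = [time_of_year[i]]
        features += [rad[j] for j in lags]
        features += [temp[j] for j in lags]
        samples.append((features, power[i]))
    return samples


def split_samples(samples, seed):
    """Shuffle randomly, then split 80% / 5% / 15% (train / val / test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.80 * len(shuffled))
    n_val = round(0.05 * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Synthetic, already normalized series covering 10 days (illustration only).
n_hours = 10 * HOURS_PER_DAY
time_of_year = [h / n_hours for h in range(n_hours)]
rad = [max(0.0, 1.0 - abs((h % 24) - 12) / 6.0) for h in range(n_hours)]
temp = [0.5 + 0.2 * r for r in rad]
power = [0.9 * r for r in rad]

samples = build_samples(time_of_year, rad, temp, power)
train, val, test = split_samples(samples, seed=0)
print(len(samples), len(samples[0][0]), len(train), len(val), len(test))
```

Repeating the split with ten different seeds for each network size would mirror the ten experiments described above. If the filtered dataset contains gaps, the fixed 24-position lag no longer holds and the lookup would have to be done by timestamp instead.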
A histogram of 20 bins is depicted in Fig. 9 for the overall errors. These errors are very low compared with those of the methods and measures reported in the literature and related to the current research, as summarized in Table II.
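The error measures used in this section can be illustrated with a short sketch. The function names and the sample values are hypothetical, and the histogram uses equal-width bins between the smallest and largest error, which is an assumption about how such a plot is typically built rather than a description of Fig. 9.

```python
import math


def rmse(targets, predictions):
    """Root mean square error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def average_rmse(runs):
    """Mean RMSE over several experiments; runs is a list of
    (targets, predictions) pairs, one per experiment."""
    return sum(rmse(t, p) for t, p in runs) / len(runs)


def error_histogram(targets, predictions, bins=20):
    """Count errors (target - prediction) in equal-width bins."""
    errors = [t - p for t, p in zip(targets, predictions)]
    lo, hi = min(errors), max(errors)
    width = (hi - lo) / bins or 1.0  # guard against identical errors
    counts = [0] * bins
    for e in errors:
        # The largest error falls on the upper edge; clamp it to the last bin.
        idx = min(int((e - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Made-up normalized values, for illustration only.
targets = [0.0, 0.5, 1.0, 0.5]
predictions = [0.1, 0.5, 0.9, 0.3]

single = rmse(targets, predictions)
mean_of_runs = average_rmse([(targets, predictions), (targets, targets)])
counts = error_histogram(targets, predictions)
print(single, mean_of_runs, counts)
```

As a consistency check on the reported figures, a mean square error of 0.00485 corresponds to an RMSE of about 0.0696, which is close to the reported average RMSE of 0.0706.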

IV. CONCLUSION
In this research, a predictive forecasting model is proposed by applying the Levenberg-Marquardt and Bayesian Regularization algorithms to neural networks for the purpose of correlating historical weather data to photovoltaic outputs. Two years of hourly data were processed to associate the available temperature and global radiation records with the generated PV power. The associated datasets were used as a source of learning for a neural network model that uses real-time weather data to provide PV power forecasts for the next 24 hours.
After a large number of training/testing experiments, excellent prediction results were obtained using the BR ANNs based on time, temperature, and radiation inputs. These predictions can be used by many energy management systems and power control systems of grid-tied PV plants. The proposed model is being developed into a real-time online application as part of our near-future work.

TABLE II (partial)
Ref.   Forecast horizon   Method   Measure                        Value
...    25 hours ahead     SVR      MAE                            0.076 MWh
[11]   One day ahead      ANN      MAE                            0.122
[12]   One day ahead      APVF     RMSE, MAE                      0.121, 0.0597
[12]   One day ahead      MPVF     RMSE, MAE                      0.1195, 0.0646
[13]   One day ahead      PHANN    NMAE, WMAE                     50% error reduction
[14]   One day ahead      SVM      R² correlation coefficients    90%
[15]   One day ahead      ANN      RMSE, R                        ...

Fig. 1. Rooftop-mounted solar panels on top of the engineering building.

Fig. 5. An example of the structure of ANNs with 23 hidden layers.

Fig. 6. Next-day PV forecasting based on the weather data of the previous five consecutive days.

TABLE I. RESULTS OBTAINED USING DIFFERENT SETS OF INPUTS

TABLE II. A COMPARISON BETWEEN THE FORECASTING PERFORMANCE FOR DIFFERENT METHODS AND MEASURES RELATED TO THE CURRENT RESEARCH