Artificial neural network and SARIMA based models for power load forecasting in Turkish electricity market

  • Ömer Özgür Bozkurt ,

    Contributed equally to this work with: Ömer Özgür Bozkurt, Göksel Biricik, Ziya Cihan Tayşi

    Affiliation Computer Engineering Department, Yıldız Technical University, İstanbul, Turkey

  • Göksel Biricik ,

    goksel@ce.yildiz.edu.tr

    Affiliation Computer Engineering Department, Yıldız Technical University, İstanbul, Turkey

  • Ziya Cihan Tayşi

    Affiliation Computer Engineering Department, Yıldız Technical University, İstanbul, Turkey

Abstract

Load information plays an important role in deregulated electricity markets, since it is the primary factor in critical decisions on production planning, day-to-day operations, unit commitment and economic dispatch. Being able to predict the load over the short term, which covers one hour to a few days, gives power generation facilities and traders an advantage. With the deregulation of electricity markets, a variety of short term load forecasting models have been developed. Deregulation of the Turkish Electricity Market started in 2001, and liberalization is still in progress, with rules taking effect on a predefined schedule. However, very few studies address the Turkish Market. In this study, we introduce two different models for the current Turkish Market, based on Seasonal Autoregressive Integrated Moving Average (SARIMA) and Artificial Neural Network (ANN) methods, and present their comparative performance. The main contribution of this study is building models that cope with the dynamic nature of the deregulated market and are able to run in real time. We also use our ANN based model to evaluate the effect of several factors that are claimed to affect electrical load.

Introduction

Deregulation of the Turkish Electricity Market started in 2001, and liberalization will be completed in a few years with the removal of the consumption limits that determine whether consumers are eligible to choose their distribution company. This will push distributors to offer prices as low as possible to attract new subscribers while remaining profitable. Currently, distribution companies trade with producers at prices that are determined daily by EPIAS [1]. In the near future, as the market becomes fully deregulated and competitive, correct moves in the market will depend on the precision of the expected electricity cost.

Load is fundamental and vital information for power generation facilities and traders, especially in production planning, day-to-day operations, unit commitment and economic dispatch. Load forecasting is done over different horizons according to requirements: long term load forecasting covers one to several years for plant and infrastructure investment decisions; mid-term load forecasting covers a few days to a few months for maintenance scheduling and negotiation of forward contracts; short term load forecasting (STLF) covers one hour to a few days for real time generation control, security analysis and energy transaction planning [2]. STLF can be nationwide [3], regional [4] or for microgrids [5]. The well known rules of supply-demand balance are also valid in electricity markets: the price increases during hours of higher demand and goes down during hours of lower demand, such as nights, weekends and holidays. Demand is shaped hourly, and it is impossible to start or stop production instantaneously in a large power plant; therefore, production planning is mostly done on a daily basis. Thus, STLF plays a crucial role in managing operations in electricity markets.

With the deregulation of electricity markets, a variety of STLF models have been developed. These models include multiple linear regression [6], the Box-Jenkins method and other derived autoregressive models [7], artificial neural networks (ANNs) [8], fuzzy logic systems [9], Kalman filter models [10] and hybrid models [11, 12]. The relationship between external factors and electrical load is both complex and nonlinear. This makes it difficult to predict future values with parametric modeling methods such as time series and linear regression analysis, because parametric methods require assumptions about the rules of the underlying system. ANNs, on the other hand, require a minimum number of assumptions to find the relation between the input and the output. For nonlinear multivariate problems with large datasets, ANNs are known to perform considerably better and therefore seem appropriate for STLF. Conversely, autoregressive models sometimes outperform ANN based models due to the seasonality effect.

Medium and long term forecasts use collected historical load data, weather data, the number of consumers in different classes, the number and characteristics of appliances in the region, population data, electrical equipment sales, and estimates of these quantities for the interval to be forecast [13]. Short term forecasts, on the other hand, use historical load, price and weather data. Introducing the seasonality effect by using day of the week, hour of the day and holiday information as inputs has been shown to increase performance [14].

Deregulation of the Turkish Electricity Market is still in progress, with new rules taking effect. Earlier studies on the Turkish Electricity Market were done on a regional basis and span the period before the actual deregulation. There is also a growing trend to use intermittent sources such as wind and solar energy to produce electricity due to environmental concerns. However, the irregular nature of these resources increases the uncertainty of electrical load. Our primary motivation is to create an accurate STLF system for the current Turkish Electricity Market, since both deregulation and the use of intermittent sources have changed the dynamics of the market.

The literature review on STLF shows that two main streams exist. The first group consists of regression [15-17] and time series methods [7, 18-23], whose performance is reported as Mean Absolute Percentage Error (MAPE) and varies between 1.40% and 7.0%. The second group of studies is either ANN based [8, 24-28] or applies extensions and modifications to ANN, which are referred to as hybrid solutions [6, 12, 29-36]. All these modifications tend to increase forecast performance, and this group of studies reports MAPE values between 0.98% and 14.0%. We have to note that these MAPE values are not standardized: they range from one-hour-ahead forecasts to weekly mean values. Besides, the nature of the electricity markets used in these studies directly affects the forecast performance.

STLF studies for the Turkish market are very limited. Filik et al. develop a statistical model to forecast short, medium and long term load for the regulated Turkish Market. The success of the model for the short term is given as 5.74% MAPE [37, 38]. Topalli et al. work on Turkish load data from 2001 and develop an ANN model after clustering the data according to its characteristics. Their model outperforms the Autoregressive Moving Average (ARMA) model developed for benchmarking and achieves 1.51% weighted average MAPE [39, 40]. Yasin et al. compare the performance of ANN and SVM based methods for STLF using calendar and temperature data of three major cities. They report MAPE values ranging from 2.0% to 3.55% depending on the season [41]. Cevik and Cunkas compare the performance of ANN and Adaptive Neuro Fuzzy System (ANFIS) methods using load data of the Turkish Market between 2009 and 2011. They report MAPE values of 1.85% and 2.02% for ANN and ANFIS, respectively [42].

As we reported above, the studies on the Turkish Electricity Market are inadequate. First, they are outdated and do not fit the deregulating structure of the Market. Second, they each propose one method and do not present comparisons. Our motivation is to fill this gap by proposing two separate models, one based on Seasonal Autoregressive Integrated Moving Average (SARIMA) and the other based on ANN. Several factors, including weather, currency and price, are believed to affect load. However, to the best of our knowledge, only a limited number of studies investigate this subject, and their scope is limited to weather forecasts, as in [43-46]. Another motivation for our work is investigating the correlation between these factors and load by using real data spanning two years.

Our contribution is threefold. First, while building our proposed systems, we used recent deregulated market data, which reflect the dynamic nature of the Turkish Electricity Market. Second, in most of the proposed STLF systems, the next hour is predicted using the previous actual values of the inputs. This approach is not suitable for making a weekly prediction in real time, since it requires actual values to be known beforehand. Our proposed systems are based on weekly predictions and are able to forecast 168 hours ahead. Finally, in contrast to existing studies, we ran extensive test cases using a week from each month of the year. Thus, we obtained fair and unbiased results, which include the effects of special days and seasons.

The rest of this paper is organized as follows. Details of the methods that we used to create our models are given in Methods. Case studies and a detailed discussion of the experimental results are presented in Experimental results. Finally, we conclude the paper and provide guidelines for future work.

Methods

Electrical load is a typical time series, since it consists of successive hourly measurements. Such time series data occur naturally in many application areas, including process control, economic forecasting, marketing, population studies and biomedical science. To understand the characteristics of the physical system that generates a time series, systematic time series analysis methods are employed [47]. An important part of time series analysis is forecasting, which focuses on predicting future events based on the information extracted from the time series. Different approaches are used in time series analysis to forecast the short and long term future. We categorize these methods as parametric and non-parametric.

Parametric methods

Parametric methods that are used for time series forecasting include mathematical models such as Autoregressive (AR), Moving Average (MA), ARMA, Autoregressive Integrated Moving Average (ARIMA) and SARIMA. These models are used frequently in electrical load and price forecasting [48, 49].

All these methods employ a four step approach to create a model. First, the model is formulated as a hypothesis. Then, a specific model is formed from variables selected on the basis of observations. In the third step, the model parameters are estimated by least squares or maximum likelihood. In the final step, the performance of the model is tested with the selected variables and parameters. If the performance of the model meets the predefined criteria, the forecasting model is accepted. Otherwise, new parameters for the model are estimated. This procedure is repeated until a set of model parameters that satisfies the predefined criteria is found [50].

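In AR models, a series of previous values $Z_{t-1}, Z_{t-2}, \ldots, Z_{t-p}$ is used to forecast the value $Z_t$. An AR model can simply be defined as in Eq (1).

$$Z_t = C + \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + \cdots + \phi_p Z_{t-p} + \epsilon_t \tag{1}$$

where $C$ is a constant, $\phi_1, \phi_2, \phi_3, \ldots, \phi_p$ are coefficients, $\epsilon_t$ is the forecast error and $p$ is the number of autoregressive terms. The formula above can also be written as in Eq (2).

$$Z_t = C + \sum_{i=1}^{p} \phi_i Z_{t-i} + \epsilon_t \tag{2}$$

MA models use averages of subsequences. As the process in a time series goes on, each new observation is added to the average and the oldest observation is dropped. The mathematical definition of MA models is given in Eq (3).

$$Z_t = C + \epsilon_t + \sum_{j=1}^{q} \theta_j \epsilon_{t-j} \tag{3}$$

where $\theta_j$ are model parameters, $\epsilon_t$ is the error and $q$ is the number of moving average terms. ARMA models are formed by combining AR and MA models. An ARMA(p, q) model can be expressed as:

$$Z_t = C + \epsilon_t + \sum_{i=1}^{p} \phi_i Z_{t-i} + \sum_{j=1}^{q} \theta_j \epsilon_{t-j} \tag{4}$$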
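As a minimal sketch of Eqs (1) to (4), the one-step-ahead ARMA(p, q) prediction can be written in a few lines of Python. The function name `arma_one_step` and the sample coefficients are illustrative assumptions, not values from this study; the forecast sets the unknown future error to zero.

```python
def arma_one_step(history, errors, c, phi, theta):
    """One-step-ahead ARMA(p, q) forecast with the future error set to zero.

    history: past observations, most recent last
    errors:  past forecast errors, most recent last
    c:       constant term C
    phi:     AR coefficients [phi_1, ..., phi_p]
    theta:   MA coefficients [theta_1, ..., theta_q]
    """
    ar_part = sum(phi[i] * history[-(i + 1)] for i in range(len(phi)))
    ma_part = sum(theta[j] * errors[-(j + 1)] for j in range(len(theta)))
    return c + ar_part + ma_part


# Hypothetical ARMA(2, 1) example with made-up values
forecast = arma_one_step(
    history=[10.0, 12.0], errors=[0.5], c=1.0, phi=[0.5, 0.25], theta=[0.4]
)
```

Passing an empty `theta` reduces the sketch to a pure AR(p) forecast, and an empty `phi` to a pure MA(q) forecast.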

In practice, most of the time series are non-stationary. In order to fit a stationary model, it is necessary to remove non-stationary sources of variation. This can be done by differencing. Integrating ARMA(p, q) process to the dth order creates a model that is capable of describing certain types of non-stationary series [51]. This model is called ARIMA and can be shown as ARIMA(p, d, q), where d is the number of nonseasonal differences needed for stationarity.

Time series may possess seasonal patterns such as daily, weekly or monthly cycles. SARIMA models can be used to model such time series. A SARIMA model is an extended version of the ARIMA model with additional seasonal terms and can be written as ARIMA(p, d, q) × (P, D, Q)_s, where P is the degree of the seasonal AR model, Q is the degree of the seasonal MA model, D is the degree of seasonal integration, and s is the span of the repeating seasonal pattern. A detailed discussion of the variable selection for our SARIMA based model is given in Experimental results.
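The role of the d and D terms can be illustrated with a short differencing sketch. The helper `difference` and the toy series are assumptions made for illustration; for hourly load with a weekly pattern, the seasonal lag would be 168 rather than the 4 used here.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series: linear trend plus a pattern repeating every 4 steps
season = [0.0, 3.0, 1.0, 5.0]
series = [0.5 * t + season[t % 4] for t in range(16)]

# Seasonal differencing (D = 1, s = 4) removes the repeating pattern
seasonal_diff = difference(series, lag=4)
# Non-seasonal differencing (d = 1) removes the remaining constant level
stationary = difference(seasonal_diff, lag=1)
```

Each differencing pass shortens the series by its lag, which is why long seasonal spans need correspondingly long training windows.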

Non-parametric methods

The methods discussed in the previous subsection rely on tuning the parameters of the defined model. However, due to the nature of time series data, the coefficients and the constant can belong to an unknown distribution and may not be describable with parameters. This situation arises especially from the non-stationary nature of the data. To overcome this problem, non-parametric forecasting methods have been introduced [52-54].

In early studies, non-parametric kernel estimators were used to adjust the coefficients of AR, MA, ARMA and ARIMA methods [55]. Later on, ANNs were used as non-parametric estimators for time series forecasting. An ANN can fit a non-parametric and nonlinear function to time series data without guidance [56-58]. ANN is a biologically inspired machine learning method, which simulates the workflow of the human neural system. However, its function approximation and learning algorithms differ from the way that biological nerves behave. The underlying mechanism of ANNs is defining a function by means of a weighted sum of several sigmoids. These sigmoid transfer functions are in fact the combining functions of all relevant explanatory variables. The weights of the sigmoids are determined with regard to the impacts of the input variables and their interrelations, typically with gradient search algorithms. This structure enables the network to fit a nonlinear function to the given data. This is achieved by the multi-layered topology, where the input layer normalizes and weighs the inputs, the hidden layer fits a nonlinear function to the presented data through the transfer functions, and the output layer sums up the results. These layers consist of processing units called neurons. The topology of an ANN is formed by the weighted connection structure between the neurons. A multi-layered topology with direct weighted connections is known as a Feed Forward (FF) network and is visualized in Fig 1. Besides FF, many ANN topologies are presented in the literature, each targeting a specific kind of data and problem. The most commonly used network topologies in short-term electrical load and price forecasting are discussed in [59]. In this study, we used an FF ANN, as such networks have been shown to perform better on forecasting [60].
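The weighted sum of sigmoids described above can be sketched as the forward pass of a single-hidden-layer FF network. This is an illustrative assumption rather than the authors' implementation: the function name `forward`, the use of tanh as the sigmoid-shaped transfer function, and all weights are made up for the example.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a feed-forward network with one hidden layer.

    Hidden neurons apply tanh to a weighted sum of the inputs plus a bias;
    the single output neuron is a linear weighted sum of the hidden outputs.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output
y = forward(
    x=[1.0, -1.0],
    w_hidden=[[0.5, 0.5], [1.0, 0.0]],
    b_hidden=[0.0, 0.0],
    w_out=[2.0, 1.0],
    b_out=0.1,
)
```

Training then amounts to choosing the weights and biases so that outputs such as `y` match the observed load as closely as possible.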

To train a network so that it adapts to the introduced data and produces the desired output, the objective function must be minimized using a learning algorithm. Back Propagation (BP) is the most commonly used error distribution model in the learning phase of ANNs. Different algorithms have been presented to minimize the objective function, or cost. Starting from slowly converging gradient descent, each learning algorithm works well on certain types of datasets or objectives. Scaled Conjugate Gradient (SCG), Levenberg-Marquardt (LM), Quasi-Newton (QN) and Bayesian Regularization (BR) are examples of common learning algorithms used in ANNs. The LM algorithm is reported to run 10 to 100 times faster than BP [61], which makes LM the most convenient learning algorithm in many tasks. Newton's minimization update for the vector x is

$$\Delta x = -\left[J^T(x) J(x)\right]^{-1} J^T(x)\, e(x) \tag{5}$$

where J(x) is the Jacobian matrix and e(x) is the error vector. The LM algorithm is in fact a modification of Newton's minimization, defined as

$$\Delta x = -\left[J^T(x) J(x) + \mu I\right]^{-1} J^T(x)\, e(x) \tag{6}$$

and in practice handles the situations where $J^T(x) J(x)$ cannot be inverted. The μ coefficient adjusts the convergence speed of LM: small values provide fast convergence, whereas large values slow the algorithm down and turn it into steepest descent. μ can be modified during the learning phase, between iterations, to guarantee convergence.
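The LM update of Eq (6) and the adaptation of μ between iterations can be sketched on a toy two-parameter problem. This is a hedged illustration, not the training routine used in the study: it fits a line y = a·x + b instead of network weights, takes the error as model output minus target, and the name `lm_fit` and its default factors are assumptions chosen for the example.

```python
def lm_fit(xs, ys, params, mu=0.001, mu_dec=0.1, mu_inc=10.0, epochs=20):
    """Fit y = a*x + b with Levenberg-Marquardt updates on a toy problem."""

    def residuals(p):
        a, b = p
        return [a * x + b - y for x, y in zip(xs, ys)]

    def sse(e):
        return sum(v * v for v in e)

    a, b = params
    for _ in range(epochs):
        e = residuals((a, b))
        # Jacobian rows are [de/da, de/db] = [x, 1], so J^T J and J^T e are:
        jtj = [[sum(x * x for x in xs), sum(xs)], [sum(xs), float(len(xs))]]
        jte = [sum(x * v for x, v in zip(xs, e)), sum(e)]
        # Solve (J^T J + mu * I) * delta = -J^T e for the 2x2 case
        m00, m01 = jtj[0][0] + mu, jtj[0][1]
        m10, m11 = jtj[1][0], jtj[1][1] + mu
        det = m00 * m11 - m01 * m10
        da = -(m11 * jte[0] - m01 * jte[1]) / det
        db = -(m00 * jte[1] - m10 * jte[0]) / det
        if sse(residuals((a + da, b + db))) < sse(e):
            # Error decreased: accept the step and shrink mu
            a, b, mu = a + da, b + db, mu * mu_dec
        else:
            # Error did not decrease: reject the step and grow mu
            mu *= mu_inc
    return a, b


# Hypothetical noise-free data generated from y = 2x + 1
a_hat, b_hat = lm_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], (0.0, 0.0))
```

Adding μ to the diagonal keeps the 2×2 system solvable even when J^T J alone would be singular, which is the practical benefit noted above.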

Due to the nature of ANN, every feature vector can be presented as an input neuron. This structure enables us to form any subset of the features that affect electrical load and to use these as the inputs of the forecasting network. The subsets of features and their impact on load forecast performance are discussed in the Dataset section.

Experimental results

We evaluated our power load forecasting models on data from deregulated Turkish Market. In this section, we present our dataset, performance evaluation metric and experimental setups of the selected methods.

Dataset

Electrical load depends on several factors, including calendar effect, consumption, electricity price, weather and currency. The effects of these factors can be explained as follows. Calendar effect shapes demand through working hours, holidays, and national or religious days. Consumption corresponds to the electricity demand of both industrial and residential consumers. Electricity price is shaped by both production and trading, and influences load. Weather conditions can change power demand. Temperature, relative humidity, wind speed and wind direction are known to be the most influential weather parameters, since the usage of air conditioners and electrical heaters is directly related to them. Currency is another major factor because it directly affects electricity production costs and cross-border electricity trade agreements.

Selecting the correct combination of input parameters is the key to creating an effective electrical load forecasting system. To attain a good combination, data were collected from several sources related to the factors mentioned above. We evaluated the effects of these factors on load forecast performance using ANN and compared the results with SARIMA. Load, electricity price and weather data were collected hourly between 01.01.2013 and 31.12.2014. Currency data were collected daily. Using these data, we constructed our training and test sets. We established our test sets by selecting the last full week, starting from Monday, of each month in 2014. The data in the 1, 3, 6 and 12 months preceding each test week are used for training. The training and test sets for the selected weeks are given in Table 1, with start and end dates.
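The rule for picking test weeks can be sketched with the Python standard library. The function `last_full_week` is a hypothetical helper that encodes one reading of "the last full week, starting from Monday": the last Monday-to-Sunday span that lies entirely inside the month. It reproduces the two weeks quoted in the text (July 21 to 27 and September 22 to 28, 2014); the other weeks it yields should be checked against Table 1.

```python
import calendar
from datetime import date, timedelta


def last_full_week(year, month):
    """Return (Monday, Sunday) of the last Monday-to-Sunday week inside the month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back to the last Sunday of the month (weekday() is 6 for Sunday)
    last_sunday = last_day - timedelta(days=(last_day.weekday() - 6) % 7)
    # The week starts on the Monday six days earlier
    return last_sunday - timedelta(days=6), last_sunday


# One test week per month of 2014
test_weeks = [last_full_week(2014, m) for m in range(1, 13)]
```

Training windows of 1, 3, 6 or 12 months would then end on the day before each returned Monday.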

Table 1. Start and end dates of the training and test periods for the selected test weeks.

https://doi.org/10.1371/journal.pone.0175915.t001

The hourly load data and market clearing prices for the Turkish Market were gathered from EPIAS [62]. Using these data, we calculated hourly lagged load features: the load of the previous hour, the load at the same hour on the previous day and in the previous week, and the average load over the last 24 hours. Besides load data, we prepared calendar data by marking weekdays, weekends, and Turkish national and religious holidays.
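The lagged load features listed above can be sketched as follows. The function name, the dictionary keys and the toy load series are illustrative assumptions; how lags that fall inside the forecast week are handled is not specified in this section and is left out of the sketch.

```python
def lagged_features(load, t):
    """Lagged-load inputs for hour index t (requires t >= 168)."""
    return {
        "prev_hour": load[t - 1],                     # load one hour earlier
        "prev_day": load[t - 24],                     # same hour, previous day
        "prev_week": load[t - 168],                   # same hour, previous week
        "mean_last_24h": sum(load[t - 24:t]) / 24.0,  # average over last 24 hours
    }


# Hypothetical hourly load: a daily profile of 0..23 repeated
load = [float(h % 24) for h in range(200)]
features = lagged_features(load, 170)
```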

We collected weather data for the major cities of Turkey from Turkish State Meteorological Service [63]. After analyzing their impact on load forecast performance, we selected four prominent cities: İstanbul, Ankara, İzmir and Antalya. The weather data consist of hourly temperature and humidity values for these city centers.

Most of the wholesale trade in Turkish Electricity Market is made using foreign currencies. Thus, foreign exchange currency rates for Euro and US Dollar are collected from the Central Bank of the Republic of Turkey archives [64]. Unfortunately, historical hourly rates for these currencies are not provided in the records. Thus, we used daily currency exchange rates for every hour in a day.

A detailed description of our feature sets and the features within these groups is presented in Table 2. To evaluate the effect of these features on load, we analyzed their correlation coefficients with respect to the hourly load data. Based on the matrix of p-values, there is a significant correlation between the selected input features and load. Besides these statistical observations, we estimated the importance of the inputs using a bootstrap aggregated random-forest ensemble. The out-of-bag importance of the selected features is given in Fig 2, and it shows that all these features have an impact on load.

Table 2. Details of feature sets that are used in ANN based STLF model.

https://doi.org/10.1371/journal.pone.0175915.t002

Performance metric

In this study, we use Absolute Percentage Error (APE) and MAPE to measure the performance of the proposed approaches. APE is calculated by Eq (7) and is used to show the maximum and minimum forecast errors. MAPE, on the other hand, gives an overall performance evaluation of the proposed approaches. The MAPE formula is given in Eq (8). In Eq (7), $LP_i$ is the Load Estimation Plan value, that is, the original value provided by EPIAS, whereas $LE_i$ is the estimated load at hour i. In Eq (8), N corresponds to the total number of estimated hours.

$$APE_i = \frac{\left| LP_i - LE_i \right|}{LP_i} \times 100 \tag{7}$$

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} APE_i \tag{8}$$
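A minimal sketch of the two metrics, assuming APE is the absolute difference between the plan value and the estimate, relative to the plan value and expressed in percent. The function names and the sample numbers are hypothetical.

```python
def ape(planned, estimated):
    """Hourly absolute percentage errors between plan values and estimates."""
    return [abs(lp - le) / lp * 100.0 for lp, le in zip(planned, estimated)]


def mape(planned, estimated):
    """Mean of the hourly absolute percentage errors."""
    errors = ape(planned, estimated)
    return sum(errors) / len(errors)


planned = [100.0, 200.0, 400.0]    # hypothetical LP_i values
estimated = [110.0, 190.0, 400.0]  # hypothetical LE_i values
hourly_ape = ape(planned, estimated)
weekly_mape = mape(planned, estimated)
max_ape, min_ape = max(hourly_ape), min(hourly_ape)
```

Reporting `max_ape` and `min_ape` alongside the mean mirrors how APE is used in this study to show the extreme forecast errors.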

Creating SARIMA model

The electrical load of four consecutive weeks in March 2014 is given in Fig 3. A close inspection of the figure shows that electrical load possesses a distinct weekly seasonal pattern. Thus, in this study, we prefer to build a SARIMA model, which can be written as ARIMA(p, d, q) × (P, D, Q)_s. Determining the values of p, q, d, P, Q and D plays a crucial role in creating a highly accurate SARIMA model. We used the Econometrics Toolbox of Matlab to determine these values and to estimate the parameters of our SARIMA models.

Fig 3. Power load in four consecutive weeks of March 2014.

https://doi.org/10.1371/journal.pone.0175915.g003

In most previous studies, the variables of the proposed ARIMA models are determined intuitively, as in [18, 40, 65]. It is also possible to use the Sample Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) to determine the p and q variables [7, 32]. The ACF and PACF of the electrical load for March 2014 are given in Fig 4, which shows a high correlation between the first few lags and the actual load. The figure also shows that the load at hour t − 168 has a high influence on the load at hour t. This validates our assumption about the weekly seasonal characteristic of the electrical load.
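How a weekly pattern shows up in the ACF can be sketched on synthetic data. The function `sample_acf` uses the common biased estimator, and the toy series (high load for 120 hours, low load for 48 hours, repeated for eight weeks) is an assumption made purely for illustration.

```python
def sample_acf(series, lag):
    """Sample autocorrelation of a series at a given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Hypothetical hourly series with period 168: weekdays high, weekend low
week = [1.0] * 120 + [0.0] * 48
series = week * 8

acf_week = sample_acf(series, 168)  # one full week back
acf_half = sample_acf(series, 84)   # half a week back
```

The autocorrelation at lag 168 is strongly positive while the value at lag 84 is not, which is the kind of signature that motivates a seasonal span of s = 168 for hourly load.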

Fig 4. Autocorrelation function and partial autocorrelation function of load.

https://doi.org/10.1371/journal.pone.0175915.g004

In order to select degrees of p and q parameters, we employed Bayesian Information Criterion (BIC). We estimated several models with different p and q values. Then, for each estimated model, the log-likelihood objective function value is calculated. This value is then used to calculate the BIC measure of fit. In our study, we scanned a wide range of p and q values, and observed that the best fitted model is constructed when both p and q values are set to 1. BIC values of models with different p and q values are given in Table 3.
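The order selection step can be sketched with the standard BIC formula, −2·logL + k·ln(n). The log-likelihood values, the number of observations and the parameter count per model below are hypothetical; in the study, the log-likelihoods come from the estimated SARIMA models.

```python
import math


def bic(log_likelihood, num_params, num_obs):
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return -2.0 * log_likelihood + num_params * math.log(num_obs)


# Hypothetical log-likelihoods of candidate (p, q) orders on 720 hourly points
candidates = {(1, 1): -5000.0, (2, 1): -4999.0, (2, 2): -4998.5}
n_obs = 720

scores = {
    # Assumed parameter count: p AR terms + q MA terms + constant + variance
    order: bic(ll, sum(order) + 2, n_obs)
    for order, ll in candidates.items()
}
best_order = min(scores, key=scores.get)
```

In this made-up example, the small likelihood gains of the larger models do not offset the ln(n) penalty per extra parameter, so the simplest order is selected.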

Table 3. BIC values of SARIMA models with different p & q values.

https://doi.org/10.1371/journal.pone.0175915.t003

We also created another SARIMA model and selected its parameters intuitively. Our selection is based on the idea that the electrical load at time t ($L_t$) depends on the load in the last three hours ($L_{t-1}$, $L_{t-2}$, $L_{t-3}$), the load at the same hour on the previous day ($L_{t-24}$), the load 48 hours ago ($L_{t-48}$) and the load 72 hours ago ($L_{t-72}$). The variables used in these models are given in Table 4.

We evaluated the performance of the proposed SARIMA models using one week of each month in 2014. Table 5 shows the MAPE values of both models for each week. The load estimations of both models and the actual load for the first test week are given in Fig 5. The figure shows that both models perform well on weekdays, but their accuracy is very low on weekends. This is because neither model is supplied with any information about weekdays or weekends. In addition, both models have relatively high MAPE values in test week 7, which corresponds to July. The load estimated by our models and the actual load for this week are given in Fig 6. Test week 7, which starts on July 21 and ends on July 27, overlaps with Ramadan Feast Eve. Thus, the last day of this week has a higher error rate, which increases the overall MAPE value of the week.

Fig 5. Load estimation of both SARIMA models for the last week of January 2014.

Estimations for week 1. Estimations of the BIC-based SARIMA model are shown in the upper part; estimations of the intuitive SARIMA model are given in the lower part.

https://doi.org/10.1371/journal.pone.0175915.g005

Fig 6. Load estimation of both SARIMA models for the last week of July 2014.

Estimations for week 7. Estimations of the BIC-based SARIMA model are shown in the upper part; estimations of the intuitive SARIMA model are given in the lower part.

https://doi.org/10.1371/journal.pone.0175915.g006

The overall performance of the two models is very close. However, our first model, which was built using BIC, outperforms the intuitive model. Thus, we preferred the BIC-based SARIMA model, and hereafter SARIMA refers to our BIC-based model.

Creating neural network model

The architectural properties of an ANN directly affect its performance. In an FF network, the most crucial properties are the hidden layer size, the learning method and the length of the training data. To select and adjust these properties, we ran comparative tests on our dataset.

Choosing the number of hidden neurons is an important problem, since there is no definitive way to determine it. Previous studies report that the most widely used hidden neuron numbers are 2n + 1 and (n + 1)/2, where n is the number of input neurons [66]. We observed that 2n + 1 hidden neurons worked best, especially with shorter training datasets. It is also possible to determine the number of hidden neurons by trial and error. Our trials showed that 20 hidden neurons produced better results with larger training datasets. A detailed comparison of the effect of hidden layer size on load forecast performance using LM is given in Table 6. Since the highest forecast performance is expected with larger training data, we set our hidden layer to 20 neurons.
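The two rules of thumb quoted above are easy to tabulate for the input sizes used in this study (6 inputs for the smallest feature set and 19 when all feature sets are used). The helper name and the choice to round (n + 1)/2 down are assumptions made for this sketch.

```python
def hidden_size_candidates(n_inputs):
    """Rule-of-thumb hidden layer sizes for n input neurons."""
    return {
        "2n+1": 2 * n_inputs + 1,
        "(n+1)/2": (n_inputs + 1) // 2,  # rounded down (assumption)
    }


sizes_smallest = hidden_size_candidates(6)   # smallest reported input count
sizes_largest = hidden_size_candidates(19)   # all feature sets combined
```

The fixed size of 20 chosen by trial and error falls between the two heuristics for the 19-input network.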

Table 6. Impact of hidden layer size on ANN performance, measured with MAPE (%).

Smaller MAPE means higher forecast accuracy. The network is trained using LM. D refers to calendar data, L is previous load estimation plan, P is electricity price, W is weather and C is currency feature sets.

https://doi.org/10.1371/journal.pone.0175915.t006

We discussed popular learning methods for training ANNs in Methods. We compared three different learning methods, LM, BR and SCG, using 20 hidden neurons. The impact of these learning methods across the feature sets, grouped by training dataset size, is compared in Table 7 and visualized in Fig 7. The comparison showed that LM is the most convenient learning method, working reasonably well for every training dataset size. In addition, the overall error decreases as the training dataset gets larger, regardless of the learning algorithm.

Table 7. Learning method performance evaluation across different feature sets, grouped by training set length.

Performance is measured with MAPE (%). Smaller MAPE means higher forecast accuracy. D refers to calendar data, L is previous load estimation plan, P is electricity price, W is weather and C is currency feature sets.

https://doi.org/10.1371/journal.pone.0175915.t007

Fig 7. Performance comparison of NN learning methods across feature sets, measured with MAPE (%).

Smaller MAPE means higher forecast accuracy. D refers to calendar data, L is previous load estimation plan, P is electricity price, W is weather and C is currency feature sets.

https://doi.org/10.1371/journal.pone.0175915.g007

Here we define the detailed parameters of our FF network, based on the hidden layer size and learning method determined above. The number of input neurons varies from 6 to 19, depending on the feature set combination of calendar data (D), previous load estimation plan (L), electricity price (P), weather data (W) and currency (C). The DL combination has 6 input neurons. The DLP, DLW and DLC combinations have 10, 14 and 7 input neurons, respectively. When all of the aforementioned features are used, the system has 19 inputs. We have a fully connected hidden layer with 20 neurons. The hidden neurons are activated with the tansig transfer function. Our output neuron has a linear activation function. Bias is introduced to both hidden and output neurons. Our model does not have input delays or layer delays. We measure the training error with mean squared error, using error minimization only. The LM training algorithm is used with a maximum of 1000 training epochs, 6 validation checks, and μ starting from 0.001, with a decrease factor of 0.1, an increase factor of 10 and a maximum limit of 10,000. Usually, μ values converge to 10. We used the Matlab Neural Network Toolbox to build and train the network.
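The reported input counts are mutually consistent, which a few lines of arithmetic can confirm. The per-group sizes below are inferred from the differences between the reported counts, not read from Table 2, so they should be treated as an assumption.

```python
# Input counts reported in the text for each feature-set combination
reported = {"DL": 6, "DLP": 10, "DLW": 14, "DLC": 7, "DLPWC": 19}

# Inferred size of each added group = combination count minus the DL baseline
added = {
    name[2:]: count - reported["DL"]
    for name, count in reported.items()
    if name in ("DLP", "DLW", "DLC")
}

# Adding every group to the DL baseline should give the full input count
total_inputs = reported["DL"] + sum(added.values())
```

The inferred sizes (4 for P, 8 for W, 1 for C) add up with the DL baseline to the 19 inputs stated for the full feature set.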

Another focus point regarding the forecast performance of ANN is the impact of the feature sets. We evaluated the effect of these features by creating combinations of D, L, P, W and C. The comparative results are given in Table 7. The t-tests indicated that using DL, DP, DLW and DLPWC for load forecasting is statistically significant. Contrary to our expectations, currency has no positive effect on performance, as our tests clearly show. Similarly, weather and price have only minor positive effects. Calendar data and previous load values work well for load forecasting with adequate precision. We observed that an ANN with 20 hidden neurons, trained with the DL features of the previous 12 months using the LM learning algorithm, produced the lowest MAPE. Our test results showed that larger training datasets and simpler feature sets work better for load forecasting with ANN on the Turkish Market.

Comparative discussion

The performance evaluation of both methods is summarized in Table 8. The table is also visualized in Fig 8 using the minimum and maximum APE values and the MAPE values of each test week. The values in Table 8 and Fig 8 are calculated by averaging 20 runs, in order to suppress the sensitivity of ANN to its initial state. Hourly predictions of both models for the 12 test weeks are given in Fig 9 and Fig 10. Our models predict all 168 hours of each test week at once. Therefore, we have a 168-hour-ahead forecast horizon.

Table 8. Performance of implementations, measured with MAPE (%).

Smaller values mean higher forecast accuracy.

https://doi.org/10.1371/journal.pone.0175915.t008

Fig 8. Performance comparison of the proposed approaches, measured with APE.

MAPE values are highlighted on min-max intervals. Smaller values mean higher forecast accuracy.

https://doi.org/10.1371/journal.pone.0175915.g008

Fig 9. Load estimations of SARIMA based model and actual load values for 12 weeks of year 2014.

https://doi.org/10.1371/journal.pone.0175915.g009

Fig 10. Load estimations of ANN based model and actual load values for 12 weeks of year 2014.

https://doi.org/10.1371/journal.pone.0175915.g010

Table 8 is obtained by disregarding a special hour: the MAPE of the 10th week is evaluated after removing the 146th hour, which corresponds to October 28th, 2014 02:00 am. This hour was the end of daylight saving time; since clocks are turned back from 2 am to 1 am, the hour occurred twice and its recorded load is therefore almost doubled. The clock change happens twice a year: at the start of daylight saving time there is a single hour with zero load, and at the end of daylight saving time the load is almost doubled for a single hour. We chose to disregard those hours.
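Excluding such an hour from the weekly error can be sketched as below. The function name and the 1-based hour convention are assumptions made for illustration; the paper's own implementation is not shown in the text.

```python
def mape_excluding(actual, forecast, excluded_hours=()):
    """MAPE (%) over a week, skipping the 1-based hour positions in excluded_hours."""
    skip = set(excluded_hours)
    errors = [
        100.0 * abs(a - f) / abs(a)
        for hour, (a, f) in enumerate(zip(actual, forecast), start=1)
        if hour not in skip
    ]
    return sum(errors) / len(errors)
```

With this convention the 10th test week would be scored as `mape_excluding(actual, forecast, excluded_hours=[146])`. Note also that an hour with zero recorded load makes the percentage error undefined (division by zero), so such an hour has to be left out in any case.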

At first glance, the performance of both methods depends on the season: the average forecasting error for winter weeks is much lower than that for summer weeks, while spring and autumn fall in between. Special cases with unexpected errors are explained below.

The highest forecasting error occurs in the 9th week, which covers the 22nd to the 28th of September. Here, for a single hour, the 75th hour of the week (September 25th, 2014 02:00 am), both methods produce a spurious peak that causes the maximum error of the week. The cause is an unexplained peak at the same hour of the previous week in the input data. The magnitude of the peak with SARIMA is considerably greater than with the ANN. Here the effect of seasonality on the model is revealed: the ANN can smooth noisy values, whereas with SARIMA the noise is directly reflected in the forecasts.
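The mechanism by which a weekly-seasonal model carries a spike forward can be illustrated with a much simpler stand-in. The sketch below is a seasonal naive forecaster, not the SARIMA model used in this study: it simply repeats the value observed 168 hours earlier, so an outlier in the previous week reappears at the same hour of the forecast.

```python
SEASON = 168  # hours in one week


def seasonal_naive(history, horizon=SEASON, season=SEASON):
    """Forecast each hour as the value observed one season earlier.

    Illustrative stand-in only; a SARIMA model combines seasonal lags with
    further autoregressive and moving-average terms.
    """
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[h % season] for h in range(horizon)]
```

If the 75th hour of the last observed week holds an anomalous value, the 75th hour of the forecast holds it as well, while all other hours are unaffected.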

According to the MAPE values, the worst forecasted week is the 7th week. The forecasting error for the last day of this week is higher than expected. This day is not only a Sunday but also Ramadan Feast Eve, and the load during this day is about 20% lower than on an ordinary Sunday. The error on this day inflates the MAPE of the week.

The 4th and 5th weeks include national days: April 23rd is National Sovereignty and Children’s Day, and May 19th is Commemoration of Atatürk, Youth and Sports Day. Those days not only cause forecasting errors but also introduce noise into the training data and increase the forecasting error for the 6th week.

In the worst case, for week 9, the highest forecasting error with ANN is 16.03%, while the highest forecasting error for the SARIMA model is 44.09%. The reason for this error was explained above as noise in the input. When the overall performance of the methods is considered, ANN with calendar data and previous load outperforms SARIMA, although the error trend seems to be the same. SARIMA’s main weakness is that it has no way to distinguish between working days and holidays. Separate models for working days and holidays might be regarded as a solution. However, there are two religious holidays and four national holidays, each of which occurs once a year. Religious holidays shift by 10 days each year. It is also possible for any holiday to be extended by the government if it is close to a weekend. Therefore, separate SARIMA models are not practical, and ANN is the method that provides the distinction required by the nature of electricity consumption.
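One way calendar information can give a network this distinction is sketched below. The encoding (one-hot day of week plus a holiday flag) and the names `calendar_features` and `extra_holidays` are hypothetical and are not claimed to be the feature layout used in this study. Only the two fixed-date national days named above are built in; moving religious holidays and government extensions would have to be supplied per year.

```python
from datetime import date

# Fixed-date national days named in the text (month, day).
FIXED_NATIONAL_DAYS = {(4, 23), (5, 19)}


def calendar_features(day, extra_holidays=frozenset()):
    """Encode a date as a one-hot day of week followed by a holiday flag.

    extra_holidays is a hypothetical parameter: a set of datetime.date
    objects for holidays whose dates change from year to year.
    """
    one_hot = [0] * 7
    one_hot[day.weekday()] = 1  # Monday = 0 ... Sunday = 6
    is_holiday = (day.month, day.day) in FIXED_NATIONAL_DAYS or day in extra_holidays
    return one_hot + [int(is_holiday)]
```

A vector of this kind can be concatenated with previous load values to form the network input, so that a holiday falling on a weekday is not treated as an ordinary working day.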

SARIMA’s benefit seems to be its quick recovery from the effect of holidays. The effect of the national days on the third day of the 4th week and the first day of the 5th week carries over to the next day in the ANN model, whereas SARIMA does not propagate this unexpected effect to the next day. For the fourth day of the 4th week, ANN has 3.4% MAPE while SARIMA has 1.6% MAPE. The difference for the second day of the 5th week is less notable: MAPE values are 4.1% for ANN and 3.1% for SARIMA.

We compare the distribution of errors for the SARIMA and ANN based models in Fig 11 by using the empirical cumulative distribution function. We see that the ANN based model produces lower error than the SARIMA based model on most of the test weeks and that the cumulative error is below 5% in general. However, SARIMA has minor advantages at some points of the 4th and 5th test weeks. This is due to SARIMA’s ability to recover quickly from the effects of the national holidays in these months, as we discussed above.

Fig 11. Empirical cumulative distribution functions for MAPEs of SARIMA and ANN based models on 12 test weeks of year 2014.

https://doi.org/10.1371/journal.pone.0175915.g011
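For readers who wish to reproduce a plot of this kind, an empirical cumulative distribution function can be computed as in the sketch below. The function names are hypothetical, and the input is assumed to be a plain list of percentage errors.

```python
def ecdf(values):
    """Return the sorted values and the fraction of observations <= each value."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def fraction_at_or_below(values, threshold):
    """Share of observations that do not exceed the threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)
```

Plotting the two lists returned by `ecdf` as a step curve gives one line per model, and `fraction_at_or_below(errors, 5.0)` gives the share of errors that stay at or below 5%.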

Conclusion

In this study, we created two separate STLF models based on SARIMA and ANN for the Turkish Electricity Market. We comparatively presented their performances for the last week of each month. In contrast to existing studies, we included weekends and special days in our test sets for a fair and unbiased performance evaluation. Additionally, we evaluated the contribution of globally known factors, such as electricity price, weather parameters and currency, to forecast performance.

When the model performances are averaged over the 12 test weeks, ANN produced 1.80% MAPE and outperformed SARIMA, which had 2.60% MAPE. We can say that the ANN model fits the Turkish Market better than SARIMA. However, in some cases SARIMA performs better than ANN, especially on the forecasts after holidays. This observation motivates one item of our future work: producing a hybrid load forecasting solution.
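For scale, the gap between the two averages can be expressed as a relative reduction in MAPE. This is plain arithmetic on the two figures above, not an additional experimental result:

```latex
\frac{2.60 - 1.80}{2.60} = \frac{0.80}{2.60} \approx 0.31
```

That is, the ANN's average MAPE is roughly 31% lower than that of SARIMA over the 12 test weeks.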

Experimental results showed that when more features are utilized, the model becomes more complex and forecast performance decreases. For this reason, we do not recommend using the load, price, weather and currency feature sets together. Calendar data and load feature sets work best with ANN for forecasting with adequate precision.

Our future work consists of building a hybrid model to produce more accurate forecasts, using the models and the discussion presented here. Moreover, we will use the output of this system as input to a short-term electricity price forecaster. We also plan to evaluate our model after the total liberalization of the Turkish Market.

Acknowledgments

This study is a part of the TÜBİTAK SME-RDI funded project numbered 7140008.

Author Contributions

  1. Conceptualization: OOB GB ZCT.
  2. Data curation: GB ZCT OOB.
  3. Formal analysis: GB ZCT.
  4. Funding acquisition: OOB.
  5. Investigation: ZCT GB.
  6. Methodology: ZCT GB OOB.
  7. Project administration: OOB.
  8. Resources: GB OOB ZCT.
  9. Software: GB OOB ZCT.
  10. Supervision: GB OOB ZCT.
  11. Validation: OOB GB ZCT.
  12. Visualization: GB ZCT OOB.
  13. Writing – original draft: OOB GB ZCT.
  14. Writing – review & editing: OOB GB ZCT.

References

  1. EPIAS—Enerji Piyasalari Isletme Anonim Sirketi. [cited 05.04.2016]. Available from: https://www.epias.com.tr/en/about-us.
  2. Hahn H, Meyer-Nieberg S, Pickl S. Electric load forecasting methods: Tools for decision making. Eur J Oper Res. 2009;199(3):902–907.
  3. Işıklı Esener I, Yüksel T, Kurban M. Short-term load forecasting without meteorological data using AI-based structures. Turk J Electr Eng Co. 2015;23(2):370–380.
  4. Pandey AS, Singh D, Sinha SK. Intelligent hybrid wavelet models for short-term load forecasting. IEEE T Power Syst. 2010;25(3):1266–1273.
  5. Wu X, Hu X, Moura S, Yin X, Pickert V. Stochastic control of smart home energy management with plug-in electric vehicle battery energy storage and photovoltaic array. J Power Sources. 2016;333:203–212.
  6. Vu DH, Muttaqi KM, Agalgaonkar AP. Short-term load forecasting using regression based moving windows with adjustable window-sizes. In: Industry Applications Society Annual Meeting. IEEE; 2014. pp. 1–8.
  7. Taylor JW, McSharry PE. Short-term load forecasting methods: An evaluation based on European data. IEEE T Power Syst. 2007;22(4):2213–2219.
  8. Mandal P, Senjyu T, Urasaki N, Funabashi T. A neural network based several-hour-ahead electric load forecasting using similar days approach. Int J Elec Power. 2006;28(6):367–373.
  9. Mori H, Kobayashi H. Optimal fuzzy inference for short-term load forecasting. IEEE T Power Syst. 1996;11(1):390–396.
  10. Al-Hamadi HM, Soliman SA. Short-term electric load forecasting based on Kalman filtering algorithm with moving window weather and load model. Electr Pow Syst Res. 2004;68(1):47–59.
  11. Zheng T, Girgis AA, Makram EB. A hybrid wavelet-Kalman filter method for load forecasting. Electr Pow Syst Res. 2000;54(1):11–17.
  12. Xiao L, Wang J, Yang X, Xiao L. A hybrid model based on data preprocessing for electrical power forecasting. Int J Elec Power. 2015;64:311–327.
  13. Monterio C. Overview of Electric Power Generation Systems. In: Catalão JP, editor. Electric power systems: advanced forecasting techniques and optimal generation scheduling. CRC Press; 2012. pp. 1–26.
  14. Feinberg EA, Genethliou D. Load forecasting. In: Chow JH, Wu FF, Momoh JA, editors. Applied mathematics for restructured electric power systems. Springer; 2005. pp. 269–285.
  15. Che J. A Novel Hybrid Model for bi-Objective Short-Term Electric Load Forecasting. Int J Elec Power. 2014;(61):259–266.
  16. Rothe JP, Wadhwani AK, Wadhwani S. Short term load forecasting using multi parameter regression; 2009. Available from: arXiv:0912.1015. [cited 05.04.2016].
  17. Chen JF, Wang WM, Huang CM. Analysis of an adaptive time-series autoregressive moving-average (ARMA) model for short-term load forecasting. Electr Pow Syst Res. 1995;34(3):187–196.
  18. Amjady N. Short-term hourly load forecasting using time-series modeling with peak load estimation capability. IEEE T Power Syst. 2001;16(3):498–505.
  19. Deihimi A, Orang O, Showkati H. Short-term electric load and temperature forecasting using wavelet echo state networks with neural reconstruction. Energy. 2013;57:382–401.
  20. Sudheer G, Suseelatha A. A wavelet-nearest neighbor model for short-term load forecasting. Energy Sci Eng. 2015;3:51–59.
  21. Almeshaiei E, Soltan H. A methodology for electric power load forecasting. Alexandria Engineering Journal. 2011;50(2):137–144.
  22. Yang Y, Wu J, Chen Y, Li C. A New Strategy for Short-Term Load Forecasting. Abstr Appl Anal. 2013; Available from: https://www.hindawi.com/journals/aaa/2013/208964/cta/.
  23. Lee CM, Ko CN. Short-term load forecasting using lifting scheme and ARIMA models. Expert Syst Appl. 2011;38(5):5902–5911.
  24. Chen H, Canizares CA, Singh A. ANN-based short-term load forecasting in electricity markets. In: Power Engineering Society Winter Meeting. vol. 2. IEEE; 2001. pp. 411–415.
  25. Kalaitzakis K, Stavrakakis G, Anagnostakis E. Short-term load forecasting based on artificial neural networks parallel implementation. Electr Pow Syst Res. 2002;63(3):185–196.
  26. Wang Y, Gu D, Xu J, Li J. Back propagation neural network for short-term electricity load forecasting with weather features. In: International Conference on Computational Intelligence and Natural Computing. vol. 1. IEEE; 2009. pp. 58–61.
  27. Karsaz A, Mashhadi HR, Mirsalehi MM. Market clearing price and load forecasting using cooperative co-evolutionary approach. Int J Elec Power. 2010;32(5):408–415.
  28. Xiao Z, Ye SJ, Zhong B, Sun CX. BP neural network with rough set for short term load forecasting. Expert Syst Appl. 2009;36(1):273–279.
  29. Hooshmand RA, Amooshahi H, Parastegari M. A hybrid intelligent algorithm based short-term load forecasting approach. Int J Elec Power. 2013;45(1):313–324.
  30. Li P, Li Y, Xiong Q, Chai Y, Zhang Y. Application of a hybrid quantized Elman neural network in short-term load forecasting. Int J Elec Power. 2014;55:749–759.
  31. Quan H, Srinivasan D, Khosravi A, Nahavandi S, Creighton D. Construction of neural network-based prediction intervals for short-term electrical load forecasting. In: IEEE Symposium on Computational Intelligence Applications in Smart Grid. IEEE; 2013. pp. 66–72.
  32. Quan H, Srinivasan D, Khosravi A. Short-term load and wind power forecasting using neural network-based prediction intervals. IEEE T Neur Net Lear. 2014;25(2):303–315.
  33. Liu N, Tang Q, Zhang J, Fan W, Liu J. A hybrid forecasting model with parameter optimization for short-term load forecasting of micro-grids. Appl Energ. 2014;129:336–345.
  34. Kavousi-Fard A, Samet H, Marzbani F. A new hybrid Modified Firefly Algorithm and Support Vector Regression model for accurate Short Term Load Forecasting. Expert Syst Appl. 2014;41(13):6047–6056.
  35. Shayeghi H, Ghasemi A, Moradzadeh M, Nooshyar M. Simultaneous day-ahead forecasting of electricity price and load in smart grids. Energ Convers Manage. 2015;95:371–384.
  36. Lang K, Zhang M, Yuan Y. Improved Neural Networks with Random Weights for Short-Term Load Forecasting. PLOS ONE. 2015;10(12):1–14.
  37. Filik UB, Gerek ON, Kurban M. Hourly forecasting of long term electric energy demand using novel mathematical models and neural networks. Int J Innov Comput I. 2011;7:3545–3557.
  38. Filik UB, Gerek ON, Kurban M. Yuk Tahmini icin Gelistirilen Matematiksel Model ve Uygulamasi. In: V. Enerji Verimliligi ve Kalitesi Sempozyumu. EMO; 2013. pp. 35–39.
  39. Topalli AK, Erkmen I. A hybrid learning for neural networks applied to short term load forecasting. Neurocomputing. 2003;51:495–500.
  40. Topalli AK, Erkmen I, Topalli I. Intelligent short-term load forecasting in Turkey. Int J Elec Power. 2006;28(7):437–447.
  41. Ishik MY, Goze T, Ozcan I, Gungor VC, Aydin Z. Short term electricity load forecasting: A case study of electric utility market in Turkey. In: 3rd International Smart Grid Congress and Fair. 2015. pp. 1–5.
  42. Cevik HH, Cunkas M. A comparative study of artificial neural network and ANFIS for short term load forecasting. In: 6th International Conference on Electronics, Computers and Artificial Intelligence. 2014. pp. 29–34.
  43. Lahouar A, Slama JBH. Day-ahead load forecast using random forest and expert input selection. Energ Convers Manage. 2015;103:1040–1051.
  44. Fattaheian-Dehkordi S, Fereidunian A, Gholami-Dehkordi H, Lesani H. Hour-ahead demand forecasting in smart grid using support vector regression (SVR). Int T Electr Energy. 2014;24(12):1650–1663.
  45. Abedinia O, Amjady N. Short-term load forecast of electrical power system by radial basis function neural network and new stochastic search algorithm. Int T Electr Energy. 2015;.
  46. Hu Z, Bao Y, Xiong T, Chiong R. Hybrid filter–wrapper feature selection for short-term load forecasting. Eng Appl Artif Intel. 2015;40:17–27.
  47. Weron R, Misiorek A. Forecasting spot electricity prices with time series models. In: 3rd International Conference on the European Energy Market. 2005. pp. 133–141.
  48. Khatoon S, Singh AK, et al. Analysis and comparison of various methods available for load forecasting: An overview. In: Innovative Applications of Computational Intelligence on Power, Energy and Controls with Their Impact on Humanity. IEEE; 2014. pp. 243–247.
  49. Campbell PR, Adamson K. Methodologies for load forecasting. In: 3rd International IEEE Conference on Intelligent Systems. IEEE; 2006. pp. 800–806.
  50. Kolmek MA, Navruz I. Forecasting the day ahead price at electricity balancing and settlement market of Turkey by using artificial neural networks. Turk J Electr Eng Co. 2015;23:841–852.
  51. Song YH, Wang XF. Operation of market-oriented power systems. London: Springer-Verlag; 2003.
  52. Vilar-Fernandez JM, Cao R. Nonparametric Forecasting in Time Series—A Comparative Study. Commun Stat-Simul C. 2007;36(2):311–334.
  53. Shang HL, Hyndman RJ. Nonparametric time series forecasting with dynamic updating. Math Comput Simulat. 2011;81(7):1310–1324.
  54. Smith BL, Williams BM, Oswald RK. Comparison of parametric and nonparametric models for traffic flow forecasting. Transport Res C-Emer. 2002;10(4):303–321.
  55. Härdle W, Lütkepohl H, Chen R. A review of nonparametric time series analysis. Int Stat Rev. 1997;65(1):49–72.
  56. Kennedy P. A guide to econometrics. MIT press; 2003.
  57. Pagan A, Ullah A. Nonparametric econometrics. Cambridge university press; 1999.
  58. Scott DW. Multivariate density estimation: theory, practice, and visualization. John Wiley & Sons; 2009.
  59. Weron R. Electricity price forecasting: A review of the state-of-the-art with a look into the future. Int J Forecasting. 2014;30(4):1030–1081.
  60. Rutkowski L. Computational intelligence: methods and techniques. Springer-Verlag Berlin Heidelberg; 2008.
  61. Catalão J, Mariano S, Mendes V, Ferreira L. Short-term electricity prices forecasting in a competitive market: a neural network approach. Electr Pow Syst Res. 2007;77(10):1297–1304.
  62. EPIAS—Enerji Piyasalari Isletme Anonim Sirketi, General Reports. [cited 05.04.2016]. Available from: https://rapor.epias.com.tr/rapor/xhtml/ptfSmfListeleme.xhtml.
  63. Turkish State Meteorological Service. [cited 20.01.2015]. Available from: http://tumas.mgm.gov.tr.
  64. Central Bank of the Republic of Turkey. [cited 20.01.2015]. Available from: http://www.tcmb.gov.tr.
  65. Jaramillo-Moran MA, Gonzalez-Romera E, Carmona-Fernandez D. Monthly electric demand forecasting with neural filters. Int J Elec Power. 2013;49:253–263.
  66. Feng CXJ, Gowrisankar AC, Smith AE, Yu ZGS. Practical guidelines for developing BP neural network models of measurement uncertainty data. J Manuf Syst. 2006;25(4):239–250.