An attention-based recurrent learning model for short-term travel time prediction

Jawad-ur-Rehman Chughtai; Irfan Ul Haq; Muhammad Muneeb

doi:10.1371/journal.pone.0278064

Abstract

With the advent of Big Data technology and the Internet of Things, Intelligent Transportation Systems (ITS) have become inevitable for future transportation networks. Travel time prediction (TTP) is an essential part of ITS and plays a pivotal role in congestion avoidance and route planning. The novel data sources such as smartphones and in-vehicle navigation applications allow traffic conditions in smart cities to be analyzed and forecast more reliably than ever. Such a massive amount of geospatial data provides a rich source of information for TTP. Gated Recurrent Unit (GRU) has been successfully applied to traffic prediction problems due to its ability to handle long-term traffic sequences. However, the existing GRU does not consider the relationship between various historical travel time positions in the sequences for traffic prediction. We propose an attention-based GRU model for short-term travel time prediction to cope with this problem enabling GRU to learn the relevant context in historical travel time sequences and update the weights of hidden states accordingly. We evaluated the proposed model using FCD data from Beijing. To demonstrate the generalization of our proposed model, we performed a robustness analysis by adding noise obeying Gaussian distribution. The experimental results on test data indicated that our proposed model performed better than the existing deep learning time-series models in terms of Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Coefficient of Determination (R²).

Citation: Chughtai J-u-R, Haq IU, Muneeb M (2022) An attention-based recurrent learning model for short-term travel time prediction. PLoS ONE 17(12): e0278064. https://doi.org/10.1371/journal.pone.0278064

Editor: Xiyu Liu, Shandong Normal University, CHINA

Received: June 30, 2022; Accepted: November 9, 2022; Published: December 1, 2022

Copyright: © 2022 Chughtai et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All the implementation details of our work can be found at https://github.com/jawadchughtai/Att_GRU_TTP. The repository is public, and a minimal anonymized dataset is provided in its Dataset folder to replicate our work.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Recent years have witnessed a drastic movement of people from rural to urban areas. 4.1 billion people were living in urban areas in 2017, comprising 55% of the total global population [1]. According to the Population Reference Bureau report, the current population will have grown by 14% by 2050 [2]. Urbanization has significantly improved the quality of life of individuals [3] whereas, on the other hand, it has brought new challenges and raised new concerns [4].

Advancement in information and communication technology has brought about a significant rise in the availability of mobility data collected through multiple data sources, including FCD (Floating Car Data), detectors, cameras, etc. Research groups and companies analyze this data using big data and machine learning to improve people’s living standards [5]. Researchers have used data from multiple sources to improve traffic-related operations with applications in traffic congestion prediction [6], traffic flow prediction [7], traffic speed estimation [8], traffic demand prediction [9], traffic signal control [10], parking space forecasting [11], stay point detection [12], traffic accident prediction [13], accident severity analysis [14], and many others.

One of the essential components of an Intelligent Transportation System (ITS) is Travel Time Prediction (TTP). Accurate TTP helps commuters and travelers make wise decisions about departure time and route selection which, in turn, leads to congestion avoidance. Moreover, it assists logistic operators in improving service quality and reducing transportation costs by avoiding congested routes. Furthermore, TTP helps traffic managers and decision-makers make traffic-related strategies and improve existing operations [15].

Various approaches, including statistical (e.g., Historical Average (HA)), classical time-series (e.g., Auto-Regressive Integrated Moving Average (ARIMA) and variants), machine learning (e.g., Random Forest (RF), Support Vector Regression (SVR)), and deep learning-based approaches (e.g., Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and variants), have been proposed to predict travel time [16–23]. Deep learning-based approaches outperformed their counterparts in prediction tasks because of their ability to deal with non-linearities, traffic trends, and long-term sequences [24].

Recurrent Neural Networks (RNNs) are specialized models developed for sequence learning problems. A simple RNN performs well for short-term sequences. However, it suffers from exploding and vanishing gradient problems when dealing with long-term sequences. Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) were developed to resolve these issues. Both models have shown state-of-the-art performance on various sequence learning tasks with applications ranging from Natural Language Processing (NLP) to traffic prediction [25].

Existing RNN architectures like LSTM and GRU model the context in historical travel time sequences only implicitly. These models give equal weights to all hidden states when used for the TTP task. We introduce an attention mechanism that re-weights the hidden states by leveraging the hidden relationship between distinct positions in the Travel Time (TT) sequence [26].

In this paper, we propose an attention-based GRU model for TTP. We selected GRU due to its simplistic architecture and faster training time. The experimental results on the Q-Traffic dataset show significant improvement in short-term traffic prediction compared to baseline approaches.

The main contributions of the paper can be summarized as follows:

This paper proposes a deep learning model based on GRU for short-term travel time prediction. We introduced self-attention in GRU to address the limitation of GRU in finding the relation across various travel time positions in the input (past) sequences. To the best of our knowledge, no attempt has been made to forecast travel time using traffic flow as input with attention-based GRU.
We compared our proposed model with baseline state-of-the-art statistical, classical time-series, Machine Learning (ML), and Deep Learning (DL) approaches. The comparative results on the Beijing-based FCD dataset show considerable improvement in Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Coefficient of Determination (R²).
Moreover, we performed perturbation analysis by adding noise to our data which validated the generalization abilities of our proposed model.

We organized the remainder of this paper as follows: Section II provides the historical background of our studied area. Section III explains the proposed methodology. Section IV presents the findings and results of our work. Section V concludes the paper.

Related work

The literature on TTP is grouped into two broad categories: traditional approaches and advanced approaches. Traditional approaches include classical approaches and machine learning-based approaches while advanced approaches include deep learning-based approaches, ensemble learning-based approaches and attention-based approaches.

Traditional approaches for TTP

Classical approaches.

Earlier travel time prediction approaches employed statistical theory-based modeling and classical time-series approaches. HA was one of the first statistical theory-based modeling approaches used in TTP studies. In this approach, travel time in the historical period is averaged to get the prediction [16]. HA is computationally fast and does not require any assumption for prediction. However, HA does not consider temporal variations and features, resulting in lower prediction precision. ARIMA is another widely used classical time-series model [17]. ARIMA treats the traffic data as a stationary time series to predict future travel time, and this assumption hampers ARIMA’s ability to predict TT in uncertain or changing traffic conditions. Despite ARIMA’s widespread use, the simple linear model falls short of accurately forecasting nonlinear traffic data.

Machine learning-based approaches.

SVR is one of the most widely used approaches for TTP. Some studies used SVR with nonlinear transformations to handle data complexities. The authors in [18] proposed SVR for freeway TTP. When compared to the classical Support Vector Machine (SVM), an SVM model optimized using the artificial fish swarm approach [27] or a least squared loss function with an equality constraint [28] has been found to improve model precision. k-Nearest Neighbors (k-NN), an example-based or pattern matching-based model, has also been widely employed for similarity pattern matching in travel time problems on urban roads and highways [29]. Typically, k-NN uses Euclidean distance to find k similar patterns and then uses a weighted algorithm to get the final result. Myung et al. [30] employed k-NN on data collected through automatic toll collection and vehicle detector systems to predict highway travel time.

Advanced approaches

Deep learning-based approaches.

DL has attracted researchers’ attention for TTP and remains an active research area. Different DL approaches, including MLP, auto-encoders, CNN, and RNN, have been applied for TTP. MLP is one of the earliest and most widely used approaches for TTP. The authors in [21] proposed a multi-step deep learning approach for TTP. Extensive feature engineering is performed using geospatial feature analysis, principal component analysis, and k-means clustering, followed by a deep-stacked auto-encoder. The findings revealed that the proposed approach performed well for general traffic dynamics but failed to predict travel time in case of rare events. Fu et al. [31] used MLP as a final predictor on top of wide deep recurrent modules to predict travel time. Yuan et al. [32] employed MLP for spatiotemporal learning of travel time by exploiting periodicity in daily and weekly patterns and the road network structure. Researchers have also used CNN to capture the spatial aspects of TTP data. The authors in [20] proposed a global-level representation for CNN to better capture the relationship between the predicted information and historical data points and to overcome local receptive field limitations. However, the proposed approach is validated only on a single highway link. A new local receptive field is proposed in [33] to model nonlinear spatiotemporal relationships in travel time data over multiple highway links. Shen et al. [34] implemented CNN with RNN to learn both spatial and temporal features to improve the prediction of FCD. Likewise, the authors in [35] proposed a graph convolutional network with LSTM to capture spatiotemporal features in urban road travel time data. A graph-based deep learning approach is presented in [36]. The model gives promising results compared to baselines, but only short trajectories are considered with no external features incorporated.
Unlike MLPs and CNNs which are feed-forward neural networks and take data all at once, RNNs act on data sequentially and are frequently employed in the NLP domain. RNNs have also been widely adopted for TTP. Zhao et al. [37] employed GRU on integrated data from remote transportation microwave sensors and dedicated short-range communications to predict travel time. The experiment used two freeway segments yielding better results with data fusion. With the introduction of data sparsity as the spatial scale increased, a neighboring-segments-based strategy is proposed in [23] which employed GRU to predict travel time for the entire trajectory path. Adjacent road segment information addresses trajectory data sparseness due to longer trips. In [22], the authors compared the LSTM-based RNN model with a Back Propagation Neural Network (BPNN) and Deep Belief Network (DBN) using multi-factor data for TTP.

Ensemble learning-based approaches.

Researchers have also employed ensemble approaches for TTP. The most extensively used ensemble approaches for TTP are Gradient Boosting Machine (GBM), eXtreme Gradient Boosting (XGBoost), and RF. The authors in [19] implemented GBM, RF, and ARIMA for multi-step ahead prediction using freeway data and demonstrated better performance of GBM over RF and ARIMA. Verkehr In Städten-SIMulationsmodell (VISSIM) freeway data is used in [38] to predict TT for multi-step ahead prediction. The Gradient Boosting Decision Tree (GBDT) model performed better than the SVM and BPNN. Chen et al. [39] proposed the XGBoost model to predict freeway TT. The results show better performance of XGBoost compared to GBM. The authors in [40] implemented decision tree, RF, XGBoost, and LSTM and demonstrated better performance of RF on freeway data. However, as the data volume increases, the performance of these approaches begins to deteriorate, so some studies combined several algorithms for TTP. Ting et al. [41] proposed an ensemble based on XGBoost and GRU to improve prediction accuracy using freeway data. Similarly, the results from an ensemble of light gradient boosting machine and MLP are combined using a decision tree for TTP in [42]. Recently, an ML-based ensemble has been proposed in [43] that uses Kalman filter, k-NN, and BPNN as base learners. The prediction results are combined using fuzzy soft set theory to improve the prediction accuracy using freeway data. Although the proposed study improves individual models’ prediction accuracy, the criteria for choosing these base learners are not discussed.

Attention-based approaches.

Attention mechanisms or attention-based models have proven to be very powerful and adaptable models in a wide range of transportation applications [6, 8, 44]. The attention mechanism has recently been introduced in TTP, owing to its success in related applications, to learn only the relevant context. The authors in [26, 45] implemented an attention mechanism with LSTM to enhance performance. Ran et al. [46] introduced an attention mechanism with CNN on freeway data for better results. The authors in [47] employed attention with LSTM for joint prediction of travel time and next location.

To summarize, various approaches have been developed for TTP to improve and enhance TTP performance. Different datasets have been used in different studies. Some studies are conducted on freeway data, while others use urban road networks. It is difficult to conclude which approach is better in every scenario. Generally, deep learning performed better in handling data complexities and non-linearities. For traffic data, RNNs have shown promising results due to the nature of the problem (i.e., the traffic conditions at the current timestamp or near future timestamp are dependent on past timestamps).

Therefore, we propose a GRU variant to improve prediction performance by adding an attention method that allows the model to learn only the relevant context rather than considering all historical timestamps (positions) equally. The attention mechanism resulted in enhanced feature space yielding better results.

Proposed methodology

Problem definition

Travel time prediction can be formalized as forecasting future travel time given historical travel time. Let $TT_i^{t}$ denote the travel time of the i-th segment during the t-th time period. Given the historical travel time sequence $\{TT_i^{\tau}\}$ (τ = t − mδ, …, t − δ, t and i ∈ S, where S is the set of segments in the considered study area), the task is to predict the segment travel time at time interval (t + fδ), where δ is the length of one time interval and fδ is the prediction horizon. In this work, we consider δ = 15 minutes, m = 4, and f = 1, 2, 3, 4, which means that previous one-hour observations are used to predict the travel time of the next 15 minutes, 30 minutes, 45 minutes and 60 minutes.
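The windowing described above can be sketched as follows. This is a minimal illustration for a single segment, not the authors' preprocessing code: the function name `make_windows` and the toy series are hypothetical, and the input window follows the index set τ = t − mδ, …, t literally (m + 1 observations). If only the m most recent observations are intended, the slice would start one step later.

```python
import numpy as np

def make_windows(series, m=4, f=1):
    """Build (input, target) pairs from one segment's travel-time series.

    Each input holds the observations at t - m*delta, ..., t (m + 1 values)
    and the target is the observation f intervals ahead, at t + f*delta.
    """
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    for t in range(m, len(series) - f):
        inputs.append(series[t - m : t + 1])
        targets.append(series[t + f])
    return np.array(inputs), np.array(targets)

# Toy series of ten 15-minute intervals; f=2 corresponds to a 30-minute horizon.
tt = np.arange(10, dtype=float)
X, y = make_windows(tt, m=4, f=2)
```

With δ = 15 minutes, calling the function with f = 1, 2, 3, 4 yields the training pairs for the 15-, 30-, 45- and 60-minute horizons.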

GRU

RNNs were proposed to process sequential data efficiently. However, standard RNNs suffer from exploding and vanishing gradient problems as the input sequence lengthens. To overcome these limitations, two specialized variants of RNNs, GRU [48] and LSTM [49], were developed. These models use a gating mechanism to handle long-term time-series sequences. LSTM comprises three gates: input, forget, and output. GRU uses two gates (update gate and reset gate), which speeds up training because it has fewer parameters than LSTM.

In this study, a GRU with an attention mechanism was employed to forecast future TT (short-term) using past travel time sequences. Fig 1 shows the structure of a GRU cell. The reset gate decides how much past information the model needs to forget at each timestamp. Likewise, the update gate is responsible for determining how much past information the model needs to pass at each timestamp. Eqs (1)–(4) show how the two gates govern the flow of information within the GRU network.

$$r_t = \sigma\left(W^r x_t + U^r h_{t-1}\right) \tag{1}$$
$$u_t = \sigma\left(W^u x_t + U^u h_{t-1}\right) \tag{2}$$
$$\tilde{h}_t = \mu\left(W x_t + U\left(r_t \odot h_{t-1}\right)\right) \tag{3}$$
$$h_t = u_t \odot h_{t-1} + \left(1 - u_t\right) \odot \tilde{h}_t \tag{4}$$

where $x_t$ is the input at time t, $r_t$ and $u_t$ denote the reset gate and update gate, respectively, $\tilde{h}_t$ and $h_t$ denote the memory content (current) and memory content (final) at time t, respectively, σ is the sigmoid activation function and μ denotes the tanh activation function. $W^r$, $U^r$ and $W^u$, $U^u$ are the respective weight matrices of the two gates, W and U are the weight matrices of the current memory content, and ⊙ denotes element-wise multiplication.
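To make the gating concrete, the following NumPy sketch runs one common formulation of the GRU update (as in [48], with biases omitted) over a short input sequence. It is an illustration only: the parameter names, the random initialization and the toy input values are assumptions and are not taken from the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU update: reset gate, update gate, candidate state, new state."""
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev)            # reset gate
    u_t = sigmoid(p["Wu"] @ x_t + p["Uu"] @ h_prev)            # update gate
    h_tilde = np.tanh(p["W"] @ x_t + p["U"] @ (r_t * h_prev))  # candidate state
    return u_t * h_prev + (1.0 - u_t) * h_tilde                # blended new state

def init_params(input_dim, hidden_dim, rng):
    """Small random weights; a hypothetical initialization for illustration."""
    shapes = {
        "Wr": (hidden_dim, input_dim), "Ur": (hidden_dim, hidden_dim),
        "Wu": (hidden_dim, input_dim), "Uu": (hidden_dim, hidden_dim),
        "W": (hidden_dim, input_dim), "U": (hidden_dim, hidden_dim),
    }
    return {name: rng.normal(0.0, 0.1, size=shape) for name, shape in shapes.items()}

rng = np.random.default_rng(0)
params = init_params(input_dim=1, hidden_dim=32, rng=rng)

# Five normalized travel-time observations for one segment (made-up values).
sequence = np.array([0.30, 0.35, 0.50, 0.45, 0.40])
h = np.zeros(32)
hidden_states = []
for value in sequence:
    h = gru_step(np.array([value]), h, params)
    hidden_states.append(h)
H = np.stack(hidden_states)  # one hidden state per timestamp
```

Because the new state is a convex combination of the previous state and a tanh output, every entry of `H` stays strictly between -1 and 1 when the initial state is zero.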

Fig 1. Structure of a GRU cell [48].

https://doi.org/10.1371/journal.pone.0278064.g001

The architectural diagram and data flow are illustrated in Figs 2 and 3.

Fig 2. Layers of the proposed GRU model.

https://doi.org/10.1371/journal.pone.0278064.g002

Fig 3. Data flow of proposed GRU model.

https://doi.org/10.1371/journal.pone.0278064.g003

Attention mechanism

The attention mechanism enhances the learning ability of predictive models by focusing on relevant information. The authors in [50] improved weight assignment by giving different weights to different text fragments, thereby enhancing the encoding process in neural machine translation. Subsequently, the attention mechanism has been successfully applied in document classification [51], image caption generation [52], tabular learning [53], and many more. In the context of ITS, attention has recently been applied to traffic congestion prediction [6], traffic speed prediction [8], traffic flow prediction [44] and travel time prediction [26]. Because standard RNN models such as LSTM and GRU cannot explicitly identify the relevance in historical travel time sequences, we have implemented an attention mechanism to learn global trends in travel time sequences. The following three steps explain the attention mechanism. First, GRU computes the hidden states at different timestamps (H = (h₁, h₂, …, h_n)). Second, the weight of each hidden state h_i is computed using a scoring function (i.e., a two-layer deep neural network in our case). Third, the context vector A_t, which is used to get the final prediction, is extracted with an attention function. We illustrate the attention mechanism concept in Fig 4, where the relation between the predicted value (X_t+1) and the historical values (X_t−3, X_t−2, X_t−1, X_t) is shown by the thickness of the arrows. We adapted the attention mechanism implemented for traffic speed prediction in [54], given by Eqs (5)–(7).

$$e_i = W_{(h_2)}\left(W_{(h_1)} h_i + b_{(h_1)}\right) + b_{(h_2)} \tag{5}$$
$$\alpha_i = \frac{\exp(e_i)}{\sum_{k=1}^{n} \exp(e_k)} \tag{6}$$
$$A_t = \sum_{i=1}^{n} \alpha_i h_i \tag{7}$$

where $e_i$ is the score of hidden state $h_i$, the weights and biases of the two hidden layers are denoted by W_(h₁), W_(h₂), b_(h₁) and b_(h₂), respectively, and α_i shows the dependency between h_t (i.e., current position at t) and h_t′ (i.e., the previous position at t′) in H.
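The three steps above (hidden states, scoring, context vector) can be sketched in NumPy as follows. This is a hedged illustration rather than the authors' code: the scoring network is taken to be two linear layers without an intermediate activation, the layer width of 16 is arbitrary, and the hidden states are random stand-ins for GRU outputs.

```python
import numpy as np

def attention_context(H, W1, b1, W2, b2):
    """Score each hidden state, normalize with softmax, return the weighted sum.

    H  : (n, d) hidden states, one row per timestamp
    W1 : (d, k), b1 : (k,)  first scoring layer
    W2 : (k, 1), b2 : float second scoring layer, one score per timestamp
    """
    scores = ((H @ W1 + b1) @ W2 + b2).ravel()   # shape (n,)
    weights = np.exp(scores - scores.max())       # subtract max for stability
    weights = weights / weights.sum()             # softmax over timestamps
    context = weights @ H                         # shape (d,)
    return context, weights

rng = np.random.default_rng(0)
n_steps, hidden_dim, score_dim = 5, 32, 16        # illustrative sizes
H = rng.normal(size=(n_steps, hidden_dim))        # stand-in for GRU hidden states
W1 = rng.normal(0.0, 0.1, size=(hidden_dim, score_dim))
b1 = np.zeros(score_dim)
W2 = rng.normal(0.0, 0.1, size=(score_dim, 1))
b2 = 0.0

context, weights = attention_context(H, W1, b1, W2, b2)
```

The attention weights are positive and sum to one, so the context vector is a weighted average of the hidden states in which more relevant timestamps contribute more.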

Fig 4. An illustration of the attention mechanism.

https://doi.org/10.1371/journal.pone.0278064.g004

Results

Dataset

We evaluated our model on the Q-Traffic dataset presented in [55]. The dataset contains 15,073 road segments spanning 738.91 km, with data from April 1, 2017, to May 31, 2017. All the data is collected around the most crowded area of Beijing (i.e., around the 6th ring road). This dataset also incorporates events happening around that time, such as Summer Palace (May Day), Fish Leong Concert, Chou Chuan-huing Concert, 106th Anniversary of THU and Spring outing, which caused considerably more traffic congestion than usual. The data is aggregated at a 15-minute time interval on every road. The training-test split is 80-20%. The data is normalized to the interval [0, 1]. Travel time for the next 15 minutes, 30 minutes, 45 minutes, and 60 minutes is predicted in this experiment.
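The split and normalization steps can be sketched as below. Two details are assumptions, since the text does not state them: the split is taken to be chronological, and the min-max statistics are computed on the training portion only (so scaled test values may fall slightly outside [0, 1]). The function name and the toy series are hypothetical.

```python
import numpy as np

def split_and_scale(series, train_frac=0.8):
    """Chronological train/test split followed by min-max scaling."""
    series = np.asarray(series, dtype=float)
    cut = int(len(series) * train_frac)
    train, test = series[:cut], series[cut:]
    lo, hi = train.min(), train.max()          # statistics from training data only
    scaled_train = (train - lo) / (hi - lo)
    scaled_test = (test - lo) / (hi - lo)
    return scaled_train, scaled_test, (lo, hi)

series = np.arange(10, dtype=float)            # toy travel-time series
train, test, (lo, hi) = split_and_scale(series)
```

Keeping `lo` and `hi` allows predictions to be mapped back to the original travel-time units before computing error measures.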

Performance metrics

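We used four evaluation measures to evaluate our proposed model: RMSE, MAE, MAPE, and R². The equations for these measures are taken from [56]. The RMSE can be computed using Eq (8).

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(TT_i - \widehat{TT}_i\right)^2} \tag{8}$$

where $TT_i$ denotes the actual travel time, $\widehat{TT}_i$ denotes the predicted travel time, and n is the number of samples. MAE is the average absolute error between $TT_i$ and $\widehat{TT}_i$ and is shown in Eq (9).

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|TT_i - \widehat{TT}_i\right| \tag{9}$$

MAPE denotes the percentage of the difference between the actual and predicted value and is given in Eq (10). A lower value of MAPE indicates high prediction accuracy.

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{TT_i - \widehat{TT}_i}{TT_i}\right| \tag{10}$$

Eq (11) shows the R², which reflects how much variation the model learns.

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(TT_i - \widehat{TT}_i\right)^2}{\sum_{i=1}^{n}\left(TT_i - TT_m\right)^2} \tag{11}$$

Here $TT_m$ denotes the mean travel time value. For optimal prediction, RMSE, MAE, and MAPE should be zero (or close to zero), and R² should be close to one.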
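The four measures can be computed with a few lines of NumPy. The sketch below uses the standard definitions of RMSE, MAE, MAPE and R²; the function name `evaluate` and the toy values are illustrative and not part of the original experiments.

```python
import numpy as np

def evaluate(actual, predicted):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / actual))   # assumes no zero actual values
    r2 = 1.0 - np.sum(err ** 2) / np.sum((actual - actual.mean()) ** 2)
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}

# Toy example: three exact predictions and one that is off by 2.
metrics = evaluate([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 10.0])
```

For this toy input the squared errors sum to 4 and the total sum of squares around the mean of 5 is 20, so RMSE is 1.0, MAE is 0.5, MAPE is 6.25% and R² is 0.8.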

Hyperparameters setting

In our experiments, we set the number of training epochs to 600 and the learning rate to 0.001. One of the important hyperparameters is the number of hidden units, which greatly affects the prediction output. We tested our model with 8, 16, 32, 64, and 128 hidden units and with varying batch sizes (i.e., 16, 32 and 64) and chose the values with the best results. The results of the finalized model with a batch size of 32 are illustrated in Fig 5. It can be seen that when the hidden units are set to 32, we got the smallest values for RMSE, MAE, and MAPE and the highest value for R². By choosing a smaller value or a value greater than 32 for batch size and hidden units, the evaluation measures either give higher values or start diverging from minima. As a result, we set hidden units to 32 in our experiments. To avoid overfitting, a normalization term is added in the loss computation as shown in Eq (12).

$$loss = \left\lVert TT - \widehat{TT} \right\rVert + C\,L_{norm} \tag{12}$$

where $TT$ and $\widehat{TT}$ are the actual and predicted travel times, $L_{norm}$ is the normalization term, and C is a parameter whose value is set to 0.0015 for this experiment.

Fig 5. Comparison of RMSE, MAE, MAPE, and R² results for different hidden unit values.

https://doi.org/10.1371/journal.pone.0278064.g005

Baselines

HA [57]: Historical Average is a simple mathematical model which takes the mean of traffic values in the historical interval as the final prediction.
ARIMA [58]: ARIMA is a widely used time-series model that predicts future traffic data by fitting a parametric model to the historical time series. We used ARIMA from the statsmodels Python package with the order (p, d, q) set to (2, 0, 1).
SVR [59]: SVR is a well-known machine learning model that we train on training data to obtain a relationship between explanatory variables and the target variable. In this experiment, we have used rbf kernel function, ϵ =0.1, and C = 1.
XGBoost [60]: XGBoost is a state-of-the-art model from the decision tree family that employs an ensemble of decision tree regressors for travel time prediction. In our work, we have set max_depth to 7.
MLP [59]: MLP is a feed-forward artificial neural network consisting of fully connected (dense) layers. In this study, we used a three-layer deep neural network. Hidden units are set to 64 and the ReLU activation function is used.
GRU [48]: GRU is an improved variant of the recurrent neural network (readers are referred to Section III for details). In our experiment, a single-layer GRU model with 32 hidden units and the ReLU activation function is used.
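As a point of reference for the simplest baseline in the list above, the Historical Average prediction can be sketched in a few lines. The function name and the toy windows are hypothetical; the window length of four 15-minute intervals is chosen only to mirror the one-hour history used elsewhere in the paper.

```python
import numpy as np

def historical_average(windows):
    """Predict the mean of each historical window (one row per sample)."""
    windows = np.asarray(windows, dtype=float)
    return windows.mean(axis=1)

# Two samples, each with four past travel-time observations (made-up values).
windows = np.array([[10.0, 12.0, 14.0, 16.0],
                    [20.0, 20.0, 22.0, 22.0]])
ha_pred = historical_average(windows)
```

Because HA returns the same value for every horizon, it ignores trends within the window, which is the limitation the learned models are meant to address.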

Performance comparison with baselines

A comparison of the proposed model and baseline approaches for the prediction horizons of 15 minutes, 30 minutes, 45 minutes, and 60 minutes is shown in Table 1. The symbol ⋇ marks entries whose values are negligibly small, indicating the model’s poor performance.

Table 1. Performance evaluation of baselines & proposed (Overall).

https://doi.org/10.1371/journal.pone.0278064.t001

The results demonstrate that SVR with a non-linear kernel performed better than HA and ARIMA. For example, SVR reduces the RMSE from 2.73136 (HA) and 4.25444 (ARIMA) to 2.71936, a reduction of 0.44% and 36.08%, respectively. The ensemble model XGBoost performed better than HA, ARIMA, and SVR. For example, XGBoost shows a reduction of 1.26%, 36.61%, and 0.82% in RMSE against HA, ARIMA, and SVR for the prediction horizon of 15 minutes. Similarly, there is a reduction of 3.72%, 60.94%, and 0.82% in MAE when comparing XGBoost with HA, ARIMA, and SVR for the same prediction horizon. The same trend is visible for MAPE in Table 1. As with RMSE, MAE, and MAPE, we observed an improvement in R² compared to HA, ARIMA, and SVR. For instance, an improvement of 0.56% is reported when comparing HA with XGBoost.
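The percentage reductions quoted here are relative changes with respect to the baseline value. As a quick check of the arithmetic, the SVR figures can be reproduced from the RMSE values given in the text; the helper name `pct_reduction` is illustrative.

```python
def pct_reduction(baseline, improved):
    """Relative reduction of an error measure, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline

svr_vs_ha = round(pct_reduction(2.73136, 2.71936), 2)      # HA RMSE -> SVR RMSE
svr_vs_arima = round(pct_reduction(4.25444, 2.71936), 2)   # ARIMA RMSE -> SVR RMSE
print(svr_vs_ha, svr_vs_arima)  # 0.44 36.08
```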

The results in Table 1 show that neural networks such as MLP, GRU, and our proposed attention-based GRU model performed better than traditional machine learning and time-series models. For example, there is a reduction of 1.93%, 37.04%, 1.49%, and 0.68% in RMSE when comparing MLP with HA, ARIMA, SVR, and XGBoost for the prediction horizon of 15 minutes. Likewise, compared to HA, ARIMA, SVR, XGBoost, and MLP, GRU reduced RMSE by 3.62%, 38.12%, 3.19%, 2.39%, and 1.72%, respectively. Our proposed attention-based GRU has shown a decrease in RMSE, MAE, and MAPE of 4.02%, 10.16%, and 3.7%, respectively, compared to HA. Furthermore, comparing HA with our proposed model, we have seen an improvement of about 3.16% in R². With the attention mechanism, we improved the prediction precision of GRU: the RMSE, MAE, and MAPE decreased by 0.41%, 1.64%, and 1.87%, respectively.

Our proposed model can achieve better prediction performance regardless of how the horizon varies, and the prediction results have a lower tendency to change. Compared to GRU, there is a 0.50 percent, 2.20 percent, and 1.87 percent reduction in RMSE, MAE, and MAPE, respectively, demonstrating that the proposed model can be utilized for both short and long-term prediction without reducing performance significantly.

To demonstrate our model’s performance, we selected a single road and plotted the actual and predicted travel time values for the four prediction horizons. The visualization results on test data for the prediction horizons of 15, 30, 45, and 60 minutes are shown in Figs 6–13. These results show that our proposed model captures the traffic dynamics regardless of the prediction horizon. However, taking into account both the spatial and temporal dimensions can improve the results even more, particularly around local minima/maxima.

Fig 6. Prediction results for 15 minutes horizon on test data (overall).

https://doi.org/10.1371/journal.pone.0278064.g006

Fig 7. Prediction results for 15 minutes horizon on test data (two days).

https://doi.org/10.1371/journal.pone.0278064.g007

Fig 8. Prediction results for 30 minutes horizon on test data (overall).

https://doi.org/10.1371/journal.pone.0278064.g008

Fig 9. Prediction results for 30 minutes horizon on test data (two days).

https://doi.org/10.1371/journal.pone.0278064.g009

Fig 10. Prediction results for 45 minutes horizon on test data (overall).

https://doi.org/10.1371/journal.pone.0278064.g010

Fig 11. Prediction results for 45 minutes horizon on test data (two days).

https://doi.org/10.1371/journal.pone.0278064.g011

Fig 12. Prediction results for 60 minutes horizon on test data (overall).

https://doi.org/10.1371/journal.pone.0278064.g012

Fig 13. Prediction results for 60 minutes horizon on test data (two days).

https://doi.org/10.1371/journal.pone.0278064.g013

Robustness analysis

Noise is unavoidable during the data collection process in real-world circumstances. We have performed a robustness analysis to test the generalization of our proposed model in the presence of noise.

We added a common type of noise that obeys a Gaussian distribution, N(0, σ²) with σ varying from 0.2 to 2, to our dataset after normalizing it to the interval [0, 1]. Fig 14 shows that the change in error measures is small, which indicates that the proposed model generalizes well even on noisy traffic data.
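The perturbation step can be sketched as follows. This is an illustration under stated assumptions: the grid of ten σ values between 0.2 and 2 is hypothetical (the text gives only the range), the data are random stand-ins for normalized travel times, and the model evaluation that would follow each perturbation is omitted.

```python
import numpy as np

def add_gaussian_noise(data, sigma, rng):
    """Return a copy of data perturbed by zero-mean Gaussian noise N(0, sigma^2)."""
    data = np.asarray(data, dtype=float)
    return data + rng.normal(loc=0.0, scale=sigma, size=data.shape)

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=10_000)    # stand-in for normalized travel times
sigmas = np.linspace(0.2, 2.0, 10)            # assumed grid over the stated range

# Empirical standard deviation of the injected noise for each sigma.
noise_std = [np.std(add_gaussian_noise(clean, s, rng) - clean) for s in sigmas]
```

In a full robustness run, each noisy copy would be passed through the trained model and the error measures recomputed, giving one point per σ for a plot like Fig 14.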

Fig 14. Robustness analysis after adding Gaussian noise.

https://doi.org/10.1371/journal.pone.0278064.g014

Conclusion

Travel time is becoming an attractive research area in the traffic prediction domain compared to other traffic variables as it is more interpretable and understandable for those unfamiliar with transportation terms. With cutting-edge traffic data collection technologies in recent years, data-driven approaches have been extensively applied for traffic prediction problems. In this article, we have implemented a GRU model to capture temporal relations in travel time data. We added an attention mechanism to the GRU model to help it learn the relevant information and enhance prediction precision. Experimental results show improvement in the performance when using the proposed attention-based GRU model compared to classical time-series models. Furthermore, we conducted a robustness test and found that the proposed model performed well even when there was noise in the traffic data, which is inevitable, with just a minor degradation in the model’s performance.

In the future, we plan to extend our work by incorporating graph-based neural networks to capture the spatial dimension along with the temporal one on the same dataset. To improve prediction accuracy, we also plan to combine exogenous elements such as weather, peak/non-peak hours, and other factors with traffic data.

References

1. Ritchie H, Roser M. Urbanization 2018. [Online]. Available: https://ourworldindata.org/urbanization
2. Bureau PR. 2018 World population data. 2018. [Online]. Available: https://interactives.prb.org/wpds/2018/index.html
3. Kato T, Uchida K. A study on benefit estimation that considers the values of travel time and travel time reliability in road networks. Transportmetrica A: transport science. 2018, 14(1-2):89–109.
4. Schrank D, Eisele B, Lomax T. Urban mobility report 2019. Texas Transportation Institute, 2019.
5. Quasim MT, Khan MA, Algarni F, Alshahrani MM. Fundamentals of Smart Cities. In: Smart Cities: A Data Analytics Perspective. Springer, 2021:3–16.
6. Zheng C, Fan X, Wang C, Qi J. Gman: A graph multi-attention network for traffic prediction. AAAI Conference on Artificial Intelligence. 2020:1234–1241.
7. Ma C, Dai G, Zhou J. Short-Term traffic flow prediction for urban road sections based on time series analysis and LSTM BILSTM method. IEEE Transactions on Intelligent Transportation Systems. 2021.
8. Abdelraouf A, Abdel-Aty M, Yuan J. Utilizing Attention-Based Multi-Encoder-Decoder Neural Networks for Freeway Traffic Speed Prediction. IEEE Transactions on Intelligent Transportation Systems. 2021.
9. Roy KC, Hasan S, Culotta A, Eluru N. Predicting traffic demand during hurricane evacuation using real-time data from transportation systems and social media. Transportation research part C: emerging technologies. 2021, 131:1033–1039.
- View Article
- Google Scholar
10. Astarita V, Giofre VP, Festa DC, Guido G, Vitale A. Floating Car Data Adaptive Traffic Signals: A Description of the First Real-Time Experiment with Connected Vehicles. Electronics. 2020, 9(1):114.
- View Article
- Google Scholar
11. Yang S, Ma W, Pi X, Qian S. A deep learning approach to real-time parking occupancy prediction in transportation networks incorporating multiple spatio-temporal data sources. Transportation Research Part C: Emerging Technologies. 2019, 107:248–265.
- View Article
- Google Scholar
12. Chen J, Xiao Z, Wang D, Long W, Bai J, Havyarimana V. Stay time prediction for individual stay behavior. IEEE Access. 2019, 7:130085–130100.
- View Article
- Google Scholar
13. Lin DJ, Chen MY, Chiang HS, Sharma PK. Intelligent Traffic Accident Prediction Model for Internet of Vehicles With Deep Learning Approach. IEEE Transactions on Intelligent Transportation Systems. 2021.
- View Article
- Google Scholar
14. Rahim MA, Hassan HM. A deep learning based traffic crash severity prediction framework. Accident Analysis & Prevention. 2021, 154:106090.
- View Article
- Google Scholar
15. Chiabaut N, Faitout R. Traffic congestion and travel time prediction based on historical congestion maps and identification of consensual days. Transportation Research Part C: Emerging Technologies. 2021, 124:102920.
- View Article
- Google Scholar
16. Schmitt EJ, Jula H. On the limitations of linear models in predicting travel times. 2007 IEEE Intelligent Transportation Systems Conference. IEEE, 2007:830–835.
17. Billings D, Yang JS. Application of the ARIMA models to urban roadway travel time prediction-a case study. 2006 IEEE International Conference on Systems, Man and Cybernetics. IEEE, 2006:2529–2534.
18. Wu CH, Ho JM, Lee DT. Travel-time prediction with support vector regression. IEEE transactions on intelligent transportation systems. 2004, 5(4):276–281.
- View Article
- Google Scholar
19. Zhang Y, Haghani A. A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies. 2015, 58:308–324.
- View Article
- Google Scholar
20. Ran X, Shan Z, Fang Y, Lin C. Travel time prediction by providing constraints on a convolutional neural network. IEEE Access. 2018, 6:59336–59349.
- View Article
- Google Scholar
21. Abdollahi M, Khaleghi T, Yang K. An integrated feature learning approach using deep learning for travel time prediction. Expert Systems with Applications. 2020, 139:112864.
- View Article
- Google Scholar
22. Wang M, Li W, Kong Y, Bai Q. Empirical evaluation of deep learning-based travel time prediction. Pacific Rim Knowledge Acquisition Workshop. Springer, 2019:54–65. https://doi.org/10.1007/978-3-030-30639-7_6
23. Qiu J, Du L, Zhang D, Su S, Tian Z. Nei-TTE: intelligent traffic time estimation based on fine-grained time derivation of road segments for smart city. IEEE Transactions on Industrial Informatics. 2019, 16(4):2659–2666.
- View Article
- Google Scholar
24. Yuan H, Li G. A survey of traffic prediction: from spatio-temporal data to intelligent transportation. Data Science and Engineering. 2021, 6(1):63–85.
- View Article
- Google Scholar
25. Lipton ZC, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv:150600019. 2015.
26. Ran X, Shan Z, Fang Y, Lin C. An LSTM-based method with attention mechanism for travel time prediction. Sensors. 2019, 19(4):861. pmid:30791424
- View Article
- PubMed/NCBI
- Google Scholar
27. Long K, Yao W, Gu J, Wu W, Han LD. Predicting freeway travel time using multiple-source heterogeneous data integration. Applied Sciences. 2019, 9(1):104.
- View Article
- Google Scholar
28. Bing Q, Qu D, Chen X, Pan F, Wei J. Arterial travel time estimation method using SCATS traffic data based on KNN-LSSVR model. Advances in Mechanical Engineering. 2019, 11(5):1687814019841926.
- View Article
- Google Scholar
29. Zhao J, Gao Y, Tang J, Zhu L, Ma J. Highway travel time prediction using sparse tensor completion tactics and-nearest neighbor pattern matching method. Journal of Advanced Transportation. 2018.
- View Article
- Google Scholar
30. Myung J, Kim DK, Kho SY, Park CH. Travel time prediction using k nearest neighbor method with combined data from vehicle detector system and automatic toll collection system. Transportation Research Record. 2011, 2256(1):51–59.
- View Article
- Google Scholar
31. Fu K, Meng F, Ye J, Wang Z. Compacteta: A fast inference system for travel time prediction. 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020:3337–3345.
32. Yuan H, Li G, Bao Z, Feng L. Effective Travel Time Estimation: When Historical Trajectories over Road Networks Matter. 2020 ACM SIGMOD International Conference on Management of Data, 2020:2135–2149.
33. Ran X, Shan Z, Shi Y, Lin C. Short-term travel time prediction: a spatiotemporal deep learning approach. International Journal of Information Technology & Decision Making. 2019, 18(04):1087–1111.
- View Article
- Google Scholar
34. Shen Y, Jin C, Hua J. TTPNet: A neural network for travel time prediction based on tensor decomposition and graph embedding. IEEE Transactions on Knowledge and Data Engineering. 2020.
- View Article
- Google Scholar
35. Li X, Wang H, Sun P, Zu H. Spatiotemporal Features—Extracted Travel Time Prediction Leveraging Deep-Learning-Enabled Graph Convolutional Neural Network Model. Sustainability. 2021, 13(3):1253.
- View Article
- Google Scholar
36. Jin G, Wang M, Zhang J, Sha H, Huang J. STGNN-TTE: Travel time estimation via spatial–temporal graph neural network. Future Generation Computer Systems. 2022, 126:70–81.
- View Article
- Google Scholar
37. Zhao J, Gao Y, Qu Y, Yin H, Liu Y, Sun H. Travel time prediction: Based on gated recurrent unit method and data fusion. IEEE Access. 2018, 6:70463–70472.
- View Article
- Google Scholar
38. Cheng J, Li G, Chen X. Research on travel time prediction model of freeway based on gradient boosting decision tree. IEEE access. 2018, 7:7466–7480.
- View Article
- Google Scholar
39. Chen Z, Fan W. A Freeway Travel Time Prediction Method Based on an XGBoost Model. Sustainability. 2021, 13(15):8577.
- View Article
- Google Scholar
40. Qiu B, Fan WD. Machine Learning Based Short-Term Travel Time Prediction: Numerical Results and Comparative Analyses. Sustainability. 2021, 13(13):7454.
- View Article
- Google Scholar
41. Ting PY, Wada T, Chiu YL, Sun MT, Sakai K, Ku WS, et al. Freeway Travel Time Prediction Using Deep Hybrid Model–Taking Sun Yat-Sen Freeway as an Example. IEEE Transactions on Vehicular Technology. 2020, 69(8):8257–8266.
- View Article
- Google Scholar
42. Zou Z, Yang H, Zhu AX. Estimation of Travel Time Based on Ensemble Method With Multi-Modality Perspective Urban Big Data. IEEE Access. 2020, 8:24819–24828.
- View Article
- Google Scholar
43. Li H, Xiong S. Time-varying weight coefficients determination based on fuzzy soft set in combined prediction model for travel time. Expert Systems with Applications. 2022, 189:115998.
- View Article
- Google Scholar
44. Do LN, Vu HL, Vo BQ, Liu Z, Phung D. An effective spatial-temporal attention based neural network for traffic flow prediction. Transportation research part C: emerging technologies. 2019, 108:12–28.
- View Article
- Google Scholar
45. Wu J, Wu Q, Shen J, Cai C. Towards attention-based convolutional long short-term memory for travel time prediction of bus journeys. Sensors. 2020, 20(12):3354. pmid:32545698
- View Article
- PubMed/NCBI
- Google Scholar
46. Ran X, Shan Z, Fang Y, Lin C. A convolution component-based method with attention mechanism for travel-time prediction. Sensors. 2019, 19(9):2063. pmid:31058812
- View Article
- PubMed/NCBI
- Google Scholar
47. Sun J, Kim J. Joint prediction of next location and travel time from urban vehicle trajectories using long short-term memory neural networks. Transportation Research Part C: Emerging Technologies. 2021, 128:103114.
- View Article
- Google Scholar
48. Cho K, Van Merri enboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:14061078. 2014.
49. Hochreiter S, Schmidhuber J. Long short-term memory. Neural computation. 1997, 9(8):1735–1780. pmid:9377276
- View Article
- PubMed/NCBI
- Google Scholar
50. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in neural information processing systems. 2017:30.
- View Article
- Google Scholar
51. Zhao W, Fang D, Zhang J, Zhao Y, Xu X, Jiang X, et al. An effective framework for semistructured document classification via hierarchical attention model. International Journal of Intelligent Systems. 2021, 36(9):5161–5183.
- View Article
- Google Scholar
52. Li X, Ye Z, Zhang Z, Zhao M. Clothes image caption generation with attribute detection and visual attention model. Pattern Recognition Letters. 2021, 141:68–74.
- View Article
- Google Scholar
53. Arık SO, Pfister T. Tabnet: Attentive interpretable tabular learning. AAAI Conference on Artificial Intelligence. 2021:6679–6687.
54. Bai J, Zhu J, Song Y, Zhao L, Hou Z, Du R, et al. A3t-gcn: Attention temporal graph convolutional network for traffic forecasting. ISPRS International Journal of Geo-Information. 2021, 10(7):485.
- View Article
- Google Scholar
55. Liao B, Zhang J, Wu C, McIlwraith D, Chen T, Yang S, et al. Deep sequence learning with auxiliary information for traffic prediction. 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2018:537–546.
56. Xu X, Liu C, Zhao Y, Lv X. Short-term traffic flow prediction based on whale optimization algorithm optimized BiLSTM Attention. Concurrency and Computation: Practice and Experience. 2022:6782.
- View Article
- Google Scholar
57. Liu J, Guan W. A summary of traffic flow forecasting methods. Journal of highway and transportation research and development. 2004, 21(3):82–85.
- View Article
- Google Scholar
58. Ahmed MS, Cook AR. Analysis of freeway traffic time-series data by using Box-Jenkins techniques. 1979.
- View Article
- Google Scholar
59. Smola AJ, Sch olkopf B. A tutorial on support vector regression. Statistics and computing. 2004, 14(3):199–222.
- View Article
- Google Scholar
60. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016:785–794.
