Traffic Flow Prediction Based on Spatial-Temporal Attention Convolutional Neural Network
Abstract: To fully exploit the complex spatial-temporal dynamic correlations in traffic flow and improve prediction accuracy, a spatial attention mechanism and a dilated causal convolutional neural network are introduced, and a traffic flow prediction model based on a spatial-temporal attention convolutional neural network (STACNN) is proposed. First, a gated temporal convolutional network block, built from dilated causal convolutions and gating units, captures the nonlinear temporal dynamic correlations of traffic flow and avoids vanishing or exploding gradients when training on long time series. Second, a spatial attention mechanism automatically assigns attention weights to the traffic sensor nodes in the road network, dynamically attending to the spatial relationships between non-adjacent nodes, and is combined with a graph convolutional neural network to extract the local spatial dynamic correlation features of the road network. Then, a fully connected layer produces the final traffic flow prediction. Finally, 60-minute traffic flow prediction experiments are carried out on two highway traffic datasets, PEMSD4 and PEMSD8. The experimental results show that, compared with the spatio-temporal graph convolutional network (STGCN), a well-performing baseline, the proposed STACNN model improves (that is, lowers) the mean absolute error (MAE) on the two datasets by 2.79% and 1.18%, the mean absolute percentage error (MAPE) by 1.00% and 0.46%, and the root mean square error (RMSE) by 3.80% and 1.25%, respectively. In addition, both the dilated causal convolutional neural network and the spatial attention mechanism contribute positively to the extraction of spatial-temporal dynamic correlation features.
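The gated temporal convolution described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the single-channel input, and the tanh/sigmoid gating form are assumptions made for illustration (a gated linear unit would drop the tanh), and the kernel weights are arbitrary.

```python
import math


def dilated_causal_conv(x, weights, dilation):
    """1-D dilated causal convolution over a list of floats.

    y[t] = sum_k weights[k] * x[t - k * dilation]. Positions before the
    start of the sequence are treated as zero (left padding), so y[t]
    never depends on future values x[t + 1], x[t + 2], ...
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_temporal_conv(x, filter_weights, gate_weights, dilation):
    """Gating unit (assumed form): tanh(filter branch) * sigmoid(gate branch)."""
    f = dilated_causal_conv(x, filter_weights, dilation)
    g = dilated_causal_conv(x, gate_weights, dilation)
    return [math.tanh(a) * sigmoid(b) for a, b in zip(f, g)]


def receptive_field(kernel_size, dilations):
    """Number of past time steps visible after stacking the given layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical flow readings at 5-minute intervals (illustrative values only).
flow = [1.0, 2.0, 3.0, 4.0, 5.0]
summed = dilated_causal_conv(flow, [1.0, 1.0], dilation=2)
gated = gated_temporal_conv(flow, [0.1, 0.1], [0.2, -0.1], dilation=2)
```

Stacking layers with dilations 1, 2, 4 and kernel size 2 gives `receptive_field(2, [1, 2, 4]) == 8`, which is why dilation lets a shallow stack cover a long history without recurrence.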
Key words:
- traffic forecasting
- deep learning
- graph convolution
- attention mechanism
Table 1. Dataset description

| Dataset | Number of sensors | Time range | Number of data points |
| --- | --- | --- | --- |
| PEMSD4 | 307 | January 1 to February 28, 2018 | 16992 |
| PEMSD8 | 170 | July 1 to August 31, 2016 | 17856 |

Table 2. Performance comparison of different methods for one-hour traffic prediction on PEMSD4 and PEMSD8 (unit: %)

| Model | PEMSD4 MAE | PEMSD4 MAPE | PEMSD4 RMSE | PEMSD8 MAE | PEMSD8 MAPE | PEMSD8 RMSE |
| --- | --- | --- | --- | --- | --- | --- |
| HA[1] | 38.56 | 28.17 | 56.85 | 32.06 | 20.34 | 47.51 |
| VAR[7] | 30.68 | 21.51 | 46.92 | 25.60 | 16.94 | 37.51 |
| LSTM[11] | 31.77 | 28.65 | 44.84 | 28.81 | 29.61 | 40.80 |
| T-GCN[16] | 28.04 | 22.81 | 41.21 | 24.01 | 13.95 | 33.98 |
| STGCN[17] | 26.45 | 16.23 | 41.39 | 21.94 | 12.32 | 33.59 |
| STACNN-NT | 24.40 | 15.76 | 38.45 | 21.42 | 12.02 | 33.11 |
| STACNN-NA | 25.15 | 16.25 | 38.65 | 21.41 | 12.60 | 33.10 |
| STACNN | 23.66 | 15.23 | 37.40 | 20.76 | 11.86 | 32.34 |

Table 3. Time consumption of training on datasets (unit: s)

| Model | PEMSD4 | PEMSD8 |
| --- | --- | --- |
| STGCN | 121.03 | 69.20 |
| STACNN-NA | 98.71 | 45.22 |
| STACNN-NT | 235.57 | 110.57 |
| STACNN | 197.52 | 90.51 |
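For reference, the three error metrics reported in Table 2 can be computed as in the sketch below. This follows the standard definitions of MAE, MAPE, and RMSE rather than any code from the paper; the function names are hypothetical, and skipping zero-valued targets in MAPE is an assumption (a common way to avoid division by zero), not something the source specifies.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumption: targets equal to zero are skipped to avoid division by zero.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


# Illustrative values only, not taken from PEMSD4 or PEMSD8.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
scores = (mae(observed, predicted), mape(observed, predicted), rmse(observed, predicted))
```

All three metrics are errors, so lower values indicate a better model.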
[1] NAGY A M, SIMON V. Survey on traffic prediction in smart cities[J]. Pervasive and Mobile Computing, 2018, 50: 148-163. doi: 10.1016/j.pmcj.2018.07.004
[2] LIU Jing, GUAN Wei. A summary of traffic flow forecasting methods[J]. Journal of Highway Transportation Research Development, 2004, 21(3): 82-85. (in Chinese)
[3] ZHOU Xiao, TANG Yuzhou, LIU Qiang. Research on road average speed prediction model based on Kalman filter[J]. Journal of Zhejiang University of Technology, 2020, 48(4): 392-396, 404. (in Chinese)
[4] OKUTANI I, STEPHANEDES Y J. Dynamic prediction of traffic volume through Kalman filtering theory[J]. Transportation Research Part B: Methodological, 1984, 18(1): 1-11. doi: 10.1016/0191-2615(84)90002-X
[5] HAMED M M, AL-MASAEID H R, SAID Z M B. Short-term prediction of traffic volume in urban arterials[J]. Journal of Transportation Engineering, 1995, 121(3): 249-254. doi: 10.1061/(ASCE)0733-947X(1995)121:3(249)
[6] LI Jie, PENG Qiyuan, YANG Yuxiang. Passenger flow prediction for Guangzhou-Zhuhai intercity railway based on SARIMA model[J]. Journal of Southwest Jiaotong University, 2020, 55(1): 41-51. doi: 10.35741/issn.0258-2724.55.1.41 (in Chinese)
[7] ZIVOT E, WANG J H. Modeling financial time series with S-PLUS®[M]. 2nd edition. New York: Springer, 2006: 385-429.
[8] YAO Zhisheng, SHAO Chunfu, GAO Yongliang. Research on methods of short-term traffic forecasting based on support vector regression[J]. Journal of Beijing Jiaotong University, 2006, 30(3): 19-22. doi: 10.3969/j.issn.1673-0291.2006.03.005 (in Chinese)
[9] ZHANG Xiaoli, HE Guoguang, LU Huapu. Short-term traffic flow forecasting based on K-nearest neighbors non-parametric regression[J]. Journal of Systems Engineering, 2009, 24(2): 178-183. (in Chinese)
[10] CHEN Dan, HU Minghua, ZHANG Honghai, et al. Short-term traffic flow prediction of airspace sectors based on Bayesian estimation theory[J]. Journal of Southwest Jiaotong University, 2016, 51(4): 807-814. doi: 10.3969/j.issn.0258-2724.2016.04.028 (in Chinese)
[11] SHAO H X, SOONG B H. Traffic flow prediction with long short-term memory networks (LSTMs)[C]//Proceedings of 2016 IEEE Region 10 Conference (TENCON). Singapore: IEEE, 2016: 2986-2989.
[12] LIU Mingyu, WU Jianping, WANG Yubo, et al. Traffic flow prediction based on deep learning[J]. Journal of System Simulation, 2018, 30(11): 4100-4105, 4114. doi: 10.16182/j.issn1004731x.joss.201811007 (in Chinese)
[13] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]//29th Annual Conference on Neural Information Processing Systems. Montreal: NIPS, 2015: 802-810.
[14] YAO H X, TANG X F, WEI H, et al. Revisiting spatial-temporal similarity: a deep learning framework for traffic prediction[C]//Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019: 5668-5675.
[15] ZHANG J, ZHENG Y, QI D. Deep spatio-temporal residual networks for citywide crowd flows prediction[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. San Francisco: AAAI, 2016: 1655-1661.
[16] ZHAO L, SONG Y J, ZHANG C, et al. T-GCN: a temporal graph convolutional network for traffic prediction[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(9): 3848-3858. doi: 10.1109/TITS.2019.2935152
[17] YU B, YIN H T, ZHU Z X. Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm: IJCAI, 2018: 3634-3640.
[18] TEDJOPURNOMO D A, BAO Z F, ZHENG B H, et al. A survey on modern deep neural network for traffic prediction: trends, methods and challenges[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(4): 1544-1561.
[19] VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks[C]//6th International Conference on Learning Representations. Vancouver: ICLR, 2018: 1-12.
[20] FENG X C, GUO J, QIN B, et al. Effective deep memory networks for distant supervised relation extraction[C]//Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence. Melbourne: IJCAI, 2017: 4002-4008.
[21] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[C]//5th International Conference on Learning Representations. Toulon: ICLR, 2017: 1-14.
[22] SIMONOVSKY M, KOMODAKIS N. Dynamic edge-conditioned filters in convolutional neural networks on graphs[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 29-38.
[23] YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[C]//4th International Conference on Learning Representations. San Juan: ICLR, 2016: 1-13.
[24] DAUPHIN Y N, FAN A, AULI M, et al. Language modeling with gated convolutional networks[C]//34th International Conference on Machine Learning. Sydney: IMLS, 2017: 1551-1559.
[25] GUO S N, LIN Y F, FENG N, et al. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019: 922-929.