• ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
  • Scopus
  • Indexed by Core Journals of China, Chinese S&T Journal Citation Reports
  • Chinese Science Citation Database
Volume 56 Issue 1
Jan. 2021
PENG Bo, TANG Ju, ZHANG Yuanyuan, CAI Xiaoyu, MENG Fanhe. Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network[J]. Journal of Southwest Jiaotong University, 2021, 56(1): 153-159. doi: 10.3969/j.issn.0258-2724.20191169

Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network

doi: 10.3969/j.issn.0258-2724.20191169
  • Received Date: 09 Dec 2019
  • Rev Recd Date: 16 Mar 2020
  • Available Online: 01 Apr 2020
  • Publish Date: 01 Feb 2021
  • To extract effective traffic information directly from videos, a traffic state recognition method based on 3D convolutional neural networks (3D CNN) is proposed. First, with the deep convolutional network C3D (convolutional 3D) as the 3D CNN prototype, the number and position of convolutional layers, the convolutional kernel size, and the 3D convolutional depth were optimized and adjusted, yielding 37 candidate models. Second, video datasets were established to systematically train and test the candidate models, and a traffic state recognition model, C3D*, was proposed. Then, the traffic state recognition results of C3D* and existing 3D convolutional models were tested and analyzed. Finally, traffic recognition results were compared between C3D* and commonly used 2D convolutional networks. The results show that, for video traffic state recognition, the average F value of C3D* reaches 91.32%, which is 12.24%, 26.72% and 28.02% higher than that of C3D, R3D (region convolutional 3D network) and R(2+1)D (ResNets adopting a 2D spatial convolution and a 1D temporal convolution), respectively, demonstrating that the proposed model C3D* is more accurate and effective. Compared with the image recognition results from LeNet, AlexNet, GoogleNet and VGG16, the average F value of C3D* is 32.61%, 69.91%, 50.11% and 69.17% higher, respectively, showing that 3D video convolution outperforms 2D image convolution in traffic state recognition.
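
The abstract reports an "average F value" without defining it. The sketch below assumes it is the per-class F1 score (the harmonic mean of precision and recall) averaged over the traffic state classes. The function names, the three-class layout and the confusion matrix are hypothetical illustrations, not the authors' data or code.

```python
def per_class_f1(confusion):
    """Return the F1 score of each class.

    confusion[i][j] is the number of video clips whose true class is i
    and whose predicted class is j (assumed layout).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(confusion[k]) - tp                       # truly k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


def average_f1(confusion):
    """Unweighted (macro) mean of the per-class F1 scores."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


# Made-up confusion matrix for three hypothetical traffic states.
toy_confusion = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]

if __name__ == "__main__":
    print(f"average F value: {average_f1(toy_confusion):.4f}")
```

Under this reading, a gap such as "12.24% higher" would be a difference between two such averages. The abstract does not say whether the differences are absolute or relative.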

     

    Figures(6)  / Tables(5)
