Processing math: 100%
  • ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
  • Scopus 收录
  • 全国中文核心期刊
  • 中国科技论文统计源期刊
  • 中国科学引文数据库来源期刊

基于3D CNN的道路视频交通状态自动识别

彭博 唐聚 张媛媛 蔡晓禹 孟繁和

陈民武, 田航, 宋雅琳, 陈玲. 基于变频控制策略的同相供电装置可靠性优化方法[J]. 西南交通大学学报, 2020, 55(1): 9-17. doi: 10.3969/j.issn.0258-2724.20180668
引用本文: 彭博, 唐聚, 张媛媛, 蔡晓禹, 孟繁和. 基于3D CNN的道路视频交通状态自动识别[J]. 西南交通大学学报, 2021, 56(1): 153-159. doi: 10.3969/j.issn.0258-2724.20191169
CHEN Minwu, TIAN Hang, SONG Yalin, CHEN Ling. Reliability Optimization of Co-phase Power Supply Device Based on Frequency Conversion Control Strategy[J]. Journal of Southwest Jiaotong University, 2020, 55(1): 9-17. doi: 10.3969/j.issn.0258-2724.20180668
Citation: PENG Bo, TANG Ju, ZHANG Yuanyuan, CAI Xiaoyu, MENG Fanhe. Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network[J]. Journal of Southwest Jiaotong University, 2021, 56(1): 153-159. doi: 10.3969/j.issn.0258-2724.20191169

基于3D CNN的道路视频交通状态自动识别

doi: 10.3969/j.issn.0258-2724.20191169
基金项目: 国家自然科学基金(61703064);重庆市科委基础前沿项目(cstc2017jcyjAX0473);重庆市技术创新与应用示范项目(cstc2018jscx-msybX0295)
详细信息
    作者简介:

    彭博(1986—),男,副教授,硕士生导师,研究方向为交通视频智能分析、城市交通状态识别与演变,E-mail:pengbo351@126.com

  • 中图分类号: U491.1

Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network

  • 摘要: 为了从视频直接有效地提取交通信息,提出了基于三维卷积神经网络 (3D convolutional neural networks,3D CNN)的交通状态识别方法.首先,以C3D (convolutional 3D)深度卷积网络为3D CNN原型,对卷积层数量与位置、平面卷积尺寸及三维卷积深度进行优化调整,形成了37个备选模型;其次,建立了视频数据集,对备选模型进行系统的训练测试,提出了交通状态识别模型C3D*;然后,对C3D* 和现有三维卷积网络模型进行视频交通状态识别测试分析;最后,对比测试了C3D* 及常用二维卷积网络的交通状态识别效果. 对比结果显示:针对视频交通状态识别,C3D* 的F均值为91.32%,比C3D、R3D (region convolutional 3D network)、R (2+1) D (resnets adopting 2D spatial convolution and a 1D temporal convolution)分别高12.24%、26.72%、28.02%;与LeNet、AlexNet、GoogleNet、VGG16的图像识别结果相比,C3D* 的F均值分别高32.61%、69.91%、50.11%、69.17%.

     

  • 图 1  调整卷积层

    Figure 1.  Adjustment of convolutional layers

    图 2  优化平面卷积核尺寸

    Figure 2.  Optimizing size of plane convolutional kernels

    图 3  交通视频数据集制作

    Figure 3.  Making datasets of traffic videos

    图 4  平面卷积尺寸优化结果

    Figure 4.  Optimization results of plane convolutional size

    图 5  卷积核深度优化结果

    Figure 5.  Optimization results of convolutional depth

    图 6  视频测试指标对比

    Figure 6.  Index comparison of video tests

    表  1  既有三维卷积模型测试指标

    Table  1.   Test indexes of existing 3D convolutional models %

    交通状态C3DR3DR (2+1) D
    准确率召回率F准确率召回率F准确率召回率F
    畅通 92.86 96.89 94.83 64.56 95.03 76.88 69.78 97.52 81.35
    缓行 93.75 50.68 65.79 71.17 53.38 61.00 75.32 39.19 51.56
    拥堵 62.11 100.00 76.63 73.77 45.00 55.90 55.14 59.00 57.00
    均值 82.91 82.52 79.08 69.83 64.47 64.60 66.75 65.23 63.30
    下载: 导出CSV

    表  2  C3D删减conv4b或conv5b后识别结果

    Table  2.   C3D recognition results without conv4b or conv5b %

    交通状态C3D 删减 conv4b 后C3D 删减 conv5b 后
    准确率召回率F准确率召回率F
    畅通 98.73 96.89 97.81 100.00 59.01 74.22
    缓行 94.74 60.81 74.07 64.74 83.11 72.78
    拥堵 64.10 100.00 78.13 79.84 99.00 88.39
    均值 85.86 85.90 83.33 81.53 80.37 78.46
    下载: 导出CSV

    表  3  视频检测样例

    Table  3.   Samples of video detection results

    模型
    C3D畅通,Pr=0.57拥堵,Pr=1.00拥堵,Pr=1.00
    R3D缓行,Pr=0.45拥堵,Pr=0.41拥堵,Pr=0.64
    R(2+1)D缓行,Pr=0.49缓行,Pr=0.46拥堵,Pr=0.97
    C3D*畅通,Pr=0.99缓行,Pr=0.97拥堵,Pr=1.00
    真实状态畅通缓行拥堵
    下载: 导出CSV

    表  4  C3D*指标提升(相对于C3D)

    Table  4.   Index increase of C3D* compared to C3D %

    交通状态准确率召回率F
    畅通7.14−12.42−3.25
    缓行−9.7041.8922.31
    拥堵27.89−1.0017.66
    均值8.449.4912.24
    下载: 导出CSV

    表  5  C3D*与二维卷积模型对比

    Table  5.   Comparison between C3D* and 2-d convolutional models %

    交通状态LenetAlxnetGooglenetVGG16C3D*
    准确率召回率F准确率召回率F准确率召回率F准确率召回率F准确率召回率F
    畅通 88.13 32.61 47.61 0 0 0 90.01 36.94 52.38 35.02 100.00 51.87 98.31 76.15 85.82
    缓行 76.64 38.74 51.47 33.01 100.00 49.64 35.69 85.06 50.28 0 0 0 76.36 87.83 81.70
    拥堵 45.44 100.00 62.49 0 0 0 17.79 3.88 6.37 0 0 0 86.06 98.49 91.86
    均值 70.07 57.12 53.85 11.00 33.33 16.55 47.83 41.96 36.35 11.67 33.33 17.29 86.91 87.49 86.46
    对比 C3D*
    下降值
    16.84 30.37 32.61 75.91 54.16 69.91 39.08 45.53 50.11 75.24 54.16 69.17
    下载: 导出CSV
  • WEI L, HONG Y D. Real-time road congestion detection based on image texture analysis[J]. Procedia Engineering, 2016, 137: 196-201. doi: 10.1016/j.proeng.2016.01.250
    SHI X, SHAN Z, ZHAO N. Learning for an aesthetic model for estimating the traffic state in the traffic video[J]. Neurocomputing, 2016, 181: 29-37. doi: 10.1016/j.neucom.2015.08.099
    崔华,袁超,魏泽发,等. 利用FCM对静态图像进行交通状态识别[J]. 西安电子科技大学学报,2017,44(6): 85-90.

    CUI Hua, YUAN Chao, WEI Zefa, et al. Traffic state recognition using state images and FCM[J]. Journal of Xidian University, 2017, 44(6): 85-90.
    LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791
    Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[C]//3rd International Conference on Learning Representations. San Diego: [s.n.], 2015: 1-14.
    JI S, XU W, YANG M, et al. 3D Convolutional neural networks for human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 221-231. doi: 10.1109/TPAMI.2012.59
    TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]//IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 4489-4497.
    XU H, DAS A, SAENKO K. R-C3D: region convolutional 3D network for temporal activity detection[C]//IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 5794-5803.
    TRAN D, WANG H, TORRESANI L, et al. A closer look at spatiotemporal convolutions for action recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6450-6459.
  • 加载中
图(6) / 表(5)
计量
  • 文章访问数:  608
  • HTML全文浏览量:  300
  • PDF下载量:  55
  • 被引次数: 0
出版历程
  • 收稿日期:  2019-12-09
  • 修回日期:  2020-03-16
  • 网络出版日期:  2020-04-01
  • 刊出日期:  2021-02-01

目录

    /

    返回文章
    返回