• ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
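  • Indexed in Scopus
  • National Chinese Core Journal
  • Statistical source journal for Chinese scientific and technical papers
  • Source journal of the Chinese Science Citation Database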

Automatic Traffic State Recognition from Road Videos Based on 3D CNN

PENG Bo, TANG Ju, ZHANG Yuanyuan, CAI Xiaoyu, MENG Fanhe

Citation: PENG Bo, TANG Ju, ZHANG Yuanyuan, CAI Xiaoyu, MENG Fanhe. Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network[J]. Journal of Southwest Jiaotong University, 2021, 56(1): 153-159. doi: 10.3969/j.issn.0258-2724.20191169

doi: 10.3969/j.issn.0258-2724.20191169
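Funding: National Natural Science Foundation of China (61703064); Chongqing Science and Technology Commission Basic and Frontier Research Project (cstc2017jcyjAX0473); Chongqing Technology Innovation and Application Demonstration Project (cstc2018jscx-msybX0295)
Details
    About the author:

    PENG Bo (b. 1986), male, associate professor and master's supervisor. Research interests: intelligent analysis of traffic video; recognition and evolution of urban traffic states. E-mail: pengbo351@126.com

  • CLC number: U491.1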

Automatic Traffic State Recognition from Road Videos Based on 3D Convolution Neural Network

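  • Abstract: To extract traffic information directly and effectively from video, a traffic state recognition method based on three-dimensional convolutional neural networks (3D CNN) is proposed. First, taking the C3D (convolutional 3D) deep convolutional network as the 3D CNN prototype, the number and position of convolutional layers, the plane convolution size, and the 3D convolution depth were adjusted, yielding 37 candidate models. Second, a video dataset was built, the candidate models were systematically trained and tested, and the traffic state recognition model C3D* was proposed. Then, C3D* and existing 3D convolutional network models were tested and analyzed on video traffic state recognition. Finally, the recognition performance of C3D* was compared with that of commonly used 2D convolutional networks. The comparison shows that, for video traffic state recognition, the mean F value of C3D* is 91.32%, which is 12.24, 26.72, and 28.02 percentage points higher than that of C3D, R3D (region convolutional 3D network), and R(2+1)D (ResNets adopting 2D spatial convolution and a 1D temporal convolution), respectively. Compared with the image recognition results of LeNet, AlexNet, GoogleNet, and VGG16, the mean F value of C3D* is higher by 32.61, 69.91, 50.11, and 69.17 percentage points, respectively.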
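The two quantities the abstract tunes, plane convolution size and 3D convolution depth, can be illustrated with a naive single-channel 3D convolution. This is only a minimal sketch, not the C3D* architecture: the function name `conv3d_valid`, the toy clip, and the kernel values are assumptions made for illustration. The kernel's depth spans consecutive frames, which is what lets a 3D CNN respond to motion rather than to a single image.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D convolution (cross-correlation, as in CNN layers).

    clip:   nested lists indexed [frame][row][col] (single channel)
    kernel: nested lists indexed [depth][row][col]
    Returns a nested list of shape (T-d+1, H-kh+1, W-kw+1).
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    d, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - d + 1):              # slide along the time axis
        plane = []
        for y in range(H - kh + 1):         # slide along rows
            row = []
            for x in range(W - kw + 1):     # slide along columns
                acc = 0.0
                for dt in range(d):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: 4 frames of 5x5 pixels, brightness equal to the frame index.
clip = [[[float(t)] * 5 for _ in range(5)] for t in range(4)]

# Hypothetical kernel with depth 2 and a 3x3 plane: it averages the
# frame-to-frame brightness change over each 3x3 neighbourhood.
kernel = [
    [[-1 / 9] * 3 for _ in range(3)],   # weights applied to frame t
    [[1 / 9] * 3 for _ in range(3)],    # weights applied to frame t + 1
]

out = conv3d_valid(clip, kernel)
shape = (len(out), len(out[0]), len(out[0][0]))
```

With a kernel of depth 1 the same routine reduces to an ordinary 2D convolution applied frame by frame, which is the kind of model the abstract compares C3D* against.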

     

  • Figure 1.  Adjustment of convolutional layers

    Figure 2.  Optimizing size of plane convolutional kernels

    Figure 3.  Making datasets of traffic videos

    Figure 4.  Optimization results of plane convolutional size

    Figure 5.  Optimization results of convolutional depth

    Figure 6.  Index comparison of video tests

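    Table 1.  Test indexes of existing 3D convolutional models (%)

    | Traffic state | C3D precision | C3D recall | C3D F | R3D precision | R3D recall | R3D F | R(2+1)D precision | R(2+1)D recall | R(2+1)D F |
    |---|---|---|---|---|---|---|---|---|---|
    | Free-flow | 92.86 | 96.89 | 94.83 | 64.56 | 95.03 | 76.88 | 69.78 | 97.52 | 81.35 |
    | Slow | 93.75 | 50.68 | 65.79 | 71.17 | 53.38 | 61.00 | 75.32 | 39.19 | 51.56 |
    | Congested | 62.11 | 100.00 | 76.63 | 73.77 | 45.00 | 55.90 | 55.14 | 59.00 | 57.00 |
    | Mean | 82.91 | 82.52 | 79.08 | 69.83 | 64.47 | 64.60 | 66.75 | 65.23 | 63.30 |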
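The page does not spell out how the F value is computed. Assuming it is the usual harmonic mean of precision and recall, the C3D columns of Table 1 can be reproduced to within rounding. The sketch below uses that assumption; the function name `f_score` and the English state labels are illustrative choices.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0  # convention when the class is never predicted or never hit
    return 2 * precision * recall / (precision + recall)


# (precision, recall, tabulated F) for C3D, copied from Table 1.
c3d_rows = {
    "free-flow": (92.86, 96.89, 94.83),
    "slow": (93.75, 50.68, 65.79),
    "congested": (62.11, 100.00, 76.63),
}

# Absolute difference between the recomputed F value and the tabulated one.
errors = {
    state: abs(f_score(p, r) - f_tab)
    for state, (p, r, f_tab) in c3d_rows.items()
}
```

The "slow" row shows why the F value matters here: a precision of 93.75% looks strong, but a recall of 50.68% means roughly half of the slow-traffic clips were missed, and the harmonic mean pulls the score down accordingly.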

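    Table 2.  C3D recognition results without conv4b or conv5b (%)

    | Traffic state | Without conv4b: precision | Without conv4b: recall | Without conv4b: F | Without conv5b: precision | Without conv5b: recall | Without conv5b: F |
    |---|---|---|---|---|---|---|
    | Free-flow | 98.73 | 96.89 | 97.81 | 100.00 | 59.01 | 74.22 |
    | Slow | 94.74 | 60.81 | 74.07 | 64.74 | 83.11 | 72.78 |
    | Congested | 64.10 | 100.00 | 78.13 | 79.84 | 99.00 | 88.39 |
    | Mean | 85.86 | 85.90 | 83.33 | 81.53 | 80.37 | 78.46 |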

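    Table 3.  Samples of video detection results

    | Model | Sample 1 | Sample 2 | Sample 3 |
    |---|---|---|---|
    | C3D | Free-flow, Pr=0.57 | Congested, Pr=1.00 | Congested, Pr=1.00 |
    | R3D | Slow, Pr=0.45 | Congested, Pr=0.41 | Congested, Pr=0.64 |
    | R(2+1)D | Slow, Pr=0.49 | Slow, Pr=0.46 | Congested, Pr=0.97 |
    | C3D* | Free-flow, Pr=0.99 | Slow, Pr=0.97 | Congested, Pr=1.00 |
    | Ground truth | Free-flow | Slow | Congested |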
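Each cell of Table 3 pairs a predicted state with a value Pr. The page does not say how Pr is obtained; a common arrangement, assumed here, is a softmax over three class scores with the largest probability reported. The scores below are hypothetical and do not come from any of the tested models.

```python
import math

STATES = ("free-flow", "slow", "congested")


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return (state, Pr) for the most probable traffic state."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return STATES[best], probs[best]


# Hypothetical scores for one clip; a confident model separates them widely.
label, pr = predict([0.2, 3.9, 0.4])
print(f"{label}, Pr={pr:.2f}")
```

Read this way, the low Pr values for R3D in Table 3 (0.41 to 0.64) indicate scores that barely separate the three states, whereas C3D* reports values of 0.97 or higher on all three samples.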

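    Table 4.  Index increase of C3D* compared to C3D (percentage points)

    | Traffic state | Precision | Recall | F |
    |---|---|---|---|
    | Free-flow | 7.14 | −12.42 | −3.25 |
    | Slow | −9.70 | 41.89 | 22.31 |
    | Congested | 27.89 | −1.00 | 17.66 |
    | Mean | 8.44 | 9.49 | 12.24 |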
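Table 4 lists only differences, so the absolute video-test scores of C3D* have to be recovered by adding them to the C3D columns of Table 1. The sketch below does that arithmetic; the dictionary names and the helper `apply_delta` are illustrative, and the numbers are copied from the two tables.

```python
# (precision, recall, F) for C3D, from Table 1.
C3D = {
    "free-flow": (92.86, 96.89, 94.83),
    "slow": (93.75, 50.68, 65.79),
    "congested": (62.11, 100.00, 76.63),
    "mean": (82.91, 82.52, 79.08),
}

# Increase of C3D* over C3D, from Table 4.
DELTA = {
    "free-flow": (7.14, -12.42, -3.25),
    "slow": (-9.70, 41.89, 22.31),
    "congested": (27.89, -1.00, 17.66),
    "mean": (8.44, 9.49, 12.24),
}


def apply_delta(base, delta):
    """Add the per-state differences to the baseline scores."""
    return {
        state: tuple(round(b + d, 2) for b, d in zip(base[state], delta[state]))
        for state in base
    }


c3d_star = apply_delta(C3D, DELTA)

# Cross-check: the recovered mean F should match the average of the per-state F values.
per_state_f = [c3d_star[s][2] for s in ("free-flow", "slow", "congested")]
mean_f_from_states = sum(per_state_f) / len(per_state_f)
```

The recovered mean F value agrees with the 91.32% quoted in the abstract. The per-state rows also show where the gain comes from: C3D* gives up some recall on free-flow clips in exchange for a large recall gain on slow traffic, the state C3D handled worst.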

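    Table 5.  Comparison between C3D* and 2D convolutional models (%)

    | Traffic state | LeNet precision | LeNet recall | LeNet F | AlexNet precision | AlexNet recall | AlexNet F | GoogleNet precision | GoogleNet recall | GoogleNet F | VGG16 precision | VGG16 recall | VGG16 F | C3D* precision | C3D* recall | C3D* F |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
    | Free-flow | 88.13 | 32.61 | 47.61 | 0 | 0 | 0 | 90.01 | 36.94 | 52.38 | 35.02 | 100.00 | 51.87 | 98.31 | 76.15 | 85.82 |
    | Slow | 76.64 | 38.74 | 51.47 | 33.01 | 100.00 | 49.64 | 35.69 | 85.06 | 50.28 | 0 | 0 | 0 | 76.36 | 87.83 | 81.70 |
    | Congested | 45.44 | 100.00 | 62.49 | 0 | 0 | 0 | 17.79 | 3.88 | 6.37 | 0 | 0 | 0 | 86.06 | 98.49 | 91.86 |
    | Mean | 70.07 | 57.12 | 53.85 | 11.00 | 33.33 | 16.55 | 47.83 | 41.96 | 36.35 | 11.67 | 33.33 | 17.29 | 86.91 | 87.49 | 86.46 |
    | Decrease relative to C3D* | 16.84 | 30.37 | 32.61 | 75.91 | 54.16 | 69.91 | 39.08 | 45.53 | 50.11 | 75.24 | 54.16 | 69.17 | - | - | - |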
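The "Mean" row of Table 5 appears to be an unweighted average over the three traffic states, and the last row is the gap to C3D*. The sketch below checks this for the recall columns, under that assumption, and flags models whose recall pattern is 100% for one state and 0% for the other two. The names `recall`, `macro_mean`, `gaps`, and `collapsed` are illustrative.

```python
# Per-state recall (free-flow, slow, congested), copied from Table 5.
recall = {
    "LeNet": (32.61, 38.74, 100.00),
    "AlexNet": (0.0, 100.00, 0.0),
    "GoogleNet": (36.94, 85.06, 3.88),
    "VGG16": (100.00, 0.0, 0.0),
    "C3D*": (76.15, 87.83, 98.49),
}


def macro_mean(values):
    """Unweighted average over classes (every state counts equally)."""
    return sum(values) / len(values)


means = {model: round(macro_mean(v), 2) for model, v in recall.items()}

# Gap to C3D*, which should match the last row of the table.
gaps = {
    model: round(means["C3D*"] - m, 2)
    for model, m in means.items()
    if model != "C3D*"
}

# Models that assign every sample to a single state.
collapsed = [m for m, v in recall.items() if sorted(v) == [0.0, 0.0, 100.0]]
```

A recall pattern of 100/0/0 is what a classifier produces when it outputs the same state for every input, which pins the mean recall at 33.33% for three classes. In Table 5 this applies to AlexNet and VGG16, so their scores reflect a degenerate prediction rather than partial recognition ability.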
Publication history
  • Received:  2019-12-09
  • Revised:  2020-03-16
  • Published online:  2020-04-01
  • Issue date:  2021-02-01
