• ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
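  • Indexed in Scopus
  • National Chinese Core Journal
  • Source journal for Chinese Scientific and Technical Papers and Citations
  • Source journal of the Chinese Science Citation Database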

Masked Face Detection Model Based on Multi-scale Attention-Driven Faster R-CNN

LI Zechen, LI Hengchao, HU Wenshuai, YANG Jinyu, HUA Zexi

Citation: LI Zechen, LI Hengchao, HU Wenshuai, YANG Jinyu, HUA Zexi. Masked Face Detection Model Based on Multi-scale Attention-Driven Faster R-CNN[J]. Journal of Southwest Jiaotong University, 2021, 56(5): 1002-1010. doi: 10.3969/j.issn.0258-2724.20210017

doi: 10.3969/j.issn.0258-2724.20210017
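Funding: National Natural Science Foundation of China (61871335); Fundamental Research Funds for the Central Universities (2682020XG02, 2682020ZT35); National Key Research and Development Program of China (2020YFB1711902)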
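Details
    About the author:

    LI Zechen (born 1996), male, PhD candidate; research interests: image processing and pattern recognition. E-mail: Lizc@my.swjtu.edu.cn

    Corresponding author:

    HUA Zexi (born 1968), male, associate professor, PhD; research interests: intelligent operation and maintenance of rail transit, sensors, and intelligent detection and monitoring. E-mail: huazexi@163.com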

  • CLC number: TP391.41; TP183

Masked Face Detection Model Based on Multi-scale Attention-Driven Faster R-CNN

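  • Abstract: To address face detection under occlusion, such as when a mask is worn, a multi-scale attention-driven Faster R-CNN (MSAF R-CNN) face detection model is proposed. First, to take full account of the multi-scale information of face targets, the Res2Net grouped residual structure is introduced into the original Faster R-CNN framework to obtain finer-grained feature representations. Second, the Res2Net module is improved with a spatial-channel attention structure, so that the attention mechanism adaptively learns target features at different scales. Finally, to learn global information about the target and alleviate overfitting, a weighted spatial pyramid pooling network is embedded at the top of the model, and feature scales are divided in a coarse-to-fine manner. Experimental results on the AIZOO and FMDD face datasets show that the proposed MSAF R-CNN model achieves detection accuracies of 90.37% and 90.11%, respectively, for masked faces, which verifies the feasibility and effectiveness of the model.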
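
The abstract names a weighted spatial pyramid pooling network with coarse-to-fine scale division but does not give its window sizes or weighting scheme. The following is therefore only a minimal sketch of the general idea, not the authors' WSPP-Net: the function name `pyramid_pool`, the grid levels, the use of max pooling, and the per-level scalar weights are all assumptions made for illustration.

```python
from math import ceil, floor


def pyramid_pool(feature_map, levels=(1, 2, 4), weights=None):
    """Max-pool a 2-D feature map over coarse-to-fine grids and concatenate.

    `levels` and `weights` are illustrative assumptions, not values
    taken from the paper. Level n splits the map into an n x n grid.
    """
    h, w = len(feature_map), len(feature_map[0])
    if weights is None:
        weights = [1.0] * len(levels)
    pooled = []
    for n, wt in zip(levels, weights):
        for i in range(n):
            # Row span of grid cell i; floor/ceil keeps every cell non-empty.
            r0, r1 = floor(i * h / n), ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = floor(j * w / n), ceil((j + 1) * w / n)
                cell = [feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)]
                # Scale the pooled response by the (assumed) level weight.
                pooled.append(wt * max(cell))
    return pooled


# A 4 x 4 map holding the values 0..15, pooled at a 1 x 1 and a 2 x 2 grid.
fm = [[r * 4 + c for c in range(4)] for r in range(4)]
print(pyramid_pool(fm, levels=(1, 2)))  # [15.0, 5.0, 7.0, 13.0, 15.0]
```

The output length depends only on the grid levels (here 1 + 4 = 5 values), not on the input size, which is what lets pyramid pooling feed a fixed-size vector to later layers regardless of the region size.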

     

  • Figure 1.  Res2Net module

    Figure 2.  Structure of SCA-Res2Net module

    Figure 3.  Structure of WSPP-Net

    Figure 4.  MSAF R-CNN model (BN: batch normalization)

    Figure 5.  Partial images of datasets

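    Table 1.  Experimental results under different numbers of groups (%)

    | Dataset | Class | 2 groups | 4 groups | 6 groups | 8 groups | 10 groups |
    |---------|-------|----------|----------|----------|----------|-----------|
    | AIZOO   | Face  | 90.43    | 90.32    | 90.11    | 89.92    | 90.10     |
    | AIZOO   | Mask  | 89.95    | 90.37    | 89.86    | 90.27    | 89.50     |
    | AIZOO   | mAP   | 90.19    | 90.35    | 89.99    | 90.10    | 89.80     |
    | FMDD    | Face  | 86.21    | 87.27    | 86.17    | 86.50    | 86.17     |
    | FMDD    | Mask  | 89.99    | 90.11    | 90.04    | 90.21    | 89.99     |
    | FMDD    | mAP   | 88.10    | 88.69    | 88.10    | 88.35    | 88.08     |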

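    Table 2.  Experimental results under different compression ratios (%)

    | Dataset | Class | Ratio 10 | Ratio 12 | Ratio 14 | Ratio 16 | Ratio 18 |
    |---------|-------|----------|----------|----------|----------|----------|
    | AIZOO   | Face  | 90.31    | 90.39    | 90.12    | 90.32    | 90.41    |
    | AIZOO   | Mask  | 89.79    | 90.08    | 90.20    | 90.37    | 89.87    |
    | AIZOO   | mAP   | 90.05    | 90.23    | 90.16    | 90.35    | 90.14    |
    | FMDD    | Face  | 86.98    | 84.89    | 86.26    | 87.27    | 86.30    |
    | FMDD    | Mask  | 89.90    | 89.68    | 90.25    | 90.11    | 89.86    |
    | FMDD    | mAP   | 88.44    | 87.29    | 88.26    | 88.69    | 88.08    |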

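    Table 3.  Experimental results under different window sizes in WSPP-Net (%)

    | Dataset | Class | S1    | S2    | S3    | S4    |
    |---------|-------|-------|-------|-------|-------|
    | AIZOO   | Face  | 90.08 | 90.32 | 90.32 | 90.38 |
    | AIZOO   | Mask  | 90.31 | 90.37 | 90.13 | 90.01 |
    | AIZOO   | mAP   | 90.20 | 90.35 | 90.22 | 90.19 |
    | FMDD    | Face  | 86.51 | 87.27 | 86.56 | 86.45 |
    | FMDD    | Mask  | 89.76 | 90.11 | 89.60 | 89.99 |
    | FMDD    | mAP   | 88.14 | 88.69 | 88.08 | 88.22 |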

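    Table 4.  Performance of different methods (%)

    | Dataset | Class | Model 1 | Model 2 | Model 3 | Model 4 | MSAF R-CNN |
    |---------|-------|---------|---------|---------|---------|------------|
    | AIZOO   | Face  | 87.32   | 90.42   | 89.94   | 90.19   | 90.32      |
    | AIZOO   | Mask  | 78.15   | 89.84   | 89.71   | 89.99   | 90.37      |
    | AIZOO   | mAP   | 82.73   | 90.13   | 89.82   | 90.09   | 90.35      |
    | FMDD    | Face  | 86.01   | 86.41   | 84.44   | 85.05   | 87.27      |
    | FMDD    | Mask  | 77.95   | 90.01   | 89.94   | 90.10   | 90.11      |
    | FMDD    | mAP   | 81.98   | 88.21   | 87.19   | 87.58   | 88.69      |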

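    Table 5.  Ablation experimental results of feature removal and fusion (%)

    | Dataset | Class | Model 5 | Model 6 | Model 7 | MSAF R-CNN |
    |---------|-------|---------|---------|---------|------------|
    | AIZOO   | Face  | 90.43   | 89.92   | 90.40   | 90.32      |
    | AIZOO   | Mask  | 90.05   | 90.03   | 90.00   | 90.37      |
    | AIZOO   | mAP   | 90.24   | 89.97   | 90.20   | 90.35      |
    | FMDD    | Face  | 85.13   | 86.01   | 86.23   | 87.27      |
    | FMDD    | Mask  | 89.93   | 90.00   | 89.98   | 90.11      |
    | FMDD    | mAP   | 87.53   | 88.01   | 88.10   | 88.69      |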
Publication history
  • Received:  2021-01-11
  • Revised:  2021-07-07
  • Published online:  2021-07-16
  • Issue date:  2021-10-15
