Masked Face Detection Model Based on Multi-scale Attention-Driven Faster R-CNN

LI Zechen; LI Hengchao; HU Wenshuai; YANG Jinyu; HUA Zexi

doi:10.3969/j.issn.0258-2724.20210017

Volume 56 Issue 5

Oct. 2021

LI Zechen, LI Hengchao, HU Wenshuai, YANG Jinyu, HUA Zexi. Masked Face Detection Model Based on Multi-scale Attention-Driven Faster R-CNN[J]. Journal of Southwest Jiaotong University, 2021, 56(5): 1002-1010. doi: 10.3969/j.issn.0258-2724.20210017

Received Date: 11 Jan 2021
Revised Date: 07 Jul 2021

Available Online: 16 Jul 2021

Publish Date: 15 Oct 2021

Abstract

This paper proposes a multi-scale attention-driven Faster region-based convolutional neural network (MSAF R-CNN) model for masked face detection. First, building on the Faster R-CNN architecture and exploiting the multi-scale information in faces, Res2Net, a grouped-residual structure, is introduced to model finer-grained features. Then, inspired by the attention mechanism, a novel spatial-channel attention Res2Net (SCA-Res2Net) module is developed to learn multi-scale features adaptively. Finally, to further learn a global feature representation and mitigate overfitting, a weighted spatial pyramid pooling network is embedded on top of the model; it partitions the feature maps into groups from finer to coarser scales. Experimental results on the AIZOO and FMDD datasets show that the proposed MSAF R-CNN model reaches masked face detection accuracies of 90.37% and 90.11%, respectively, verifying the feasibility and effectiveness of the model.
Keywords

- masked face
- deep learning
- attention mechanism
- multi-scale learning
- feature fusion
- object detection
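
The abstract's final step, pooling the feature maps into groups from finer to coarser scales, can be illustrated with a minimal pure-Python sketch of spatial pyramid pooling. This is not the authors' implementation: the function names `max_pool_grid` and `weighted_spp`, the pyramid levels `(4, 2, 1)`, and the use of one scalar weight per level are all assumptions made for illustration, and the abstract does not state how the weights in the paper are defined or learned.

```python
from typing import List, Sequence


def max_pool_grid(fmap: Sequence[Sequence[float]], bins: int) -> List[float]:
    """Max-pool a 2D feature map into a bins x bins grid of cells."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Cell rows run from floor(i*h/bins) to ceil((i+1)*h/bins),
        # so every cell is non-empty even when bins > h.
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            pooled.append(
                max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def weighted_spp(
    fmap: Sequence[Sequence[float]],
    levels: Sequence[int] = (4, 2, 1),            # finer -> coarser (assumed)
    weights: Sequence[float] = (1.0, 1.0, 1.0),   # hypothetical per-level weights
) -> List[float]:
    """Concatenate per-level pooled features, each scaled by its level weight."""
    if len(levels) != len(weights):
        raise ValueError("levels and weights must have the same length")
    out: List[float] = []
    for bins, weight in zip(levels, weights):
        out.extend(weight * v for v in max_pool_grid(fmap, bins))
    return out


if __name__ == "__main__":
    feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    print(weighted_spp(feature_map, levels=(2, 1), weights=(1.0, 0.5)))
```

The output length depends only on the pyramid levels (the sum of `bins * bins` over all levels), not on the input size, which is why this kind of pooling is commonly placed before fixed-size layers.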
