Recognizing people’s identity in construction sites with computer vision: A spatial and temporal attention pooling network

Published: 18 September 2019

Ran Wei, Peter ED Love, Weili Fang*, Hanbin Luo, Shuangjie Xu



Abstract


Several prototype vision-based approaches have been developed to automatically capture and recognize unsafe behavior in construction. These approaches have been difficult to use in practice because they cannot identify the individuals who commit unsafe acts captured in digital images or video. To address this problem, we applied a novel deep learning approach that uses a Spatial and Temporal Attention Pooling Network to remove redundant information contained in a video, enabling a person’s identity to be determined automatically. The approach focuses on: (1) extracting spatial feature maps using the spatial attention network; (2) extracting temporal information using the temporal attention networks; and (3) recognizing a person’s identity by computing the distance between features. To validate the feasibility and effectiveness of the approach, we created a database of videos of people performing their work on construction sites, conducted an experiment, and performed k-fold cross-validation. The results demonstrated that the approach could accurately identify a person from videos captured on construction sites. We suggest that site managers could use our computer-vision approach to automatically recognize individuals who engage in unsafe behavior and to provide them with immediate feedback about their actions and possible consequences.
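
Steps (2) and (3) of the abstract can be illustrated with a minimal Python sketch. This is not the authors' network: it assumes per-frame feature vectors have already been extracted (step 1), replaces the learned temporal attention with a simple dot-product score against a hypothetical `query` vector, and uses Euclidean distance for matching. The names `temporal_attention_pool`, `identify`, `query`, and `gallery` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def temporal_attention_pool(frame_features, query):
    """Pool (T, D) per-frame features into a single (D,) video feature.

    Assumption: attention scores are dot products with a fixed `query`
    vector; a trained model would learn these scores instead.
    """
    scores = frame_features @ query      # one score per frame, shape (T,)
    weights = softmax(scores)            # attention weights, sum to 1
    return weights @ frame_features      # weighted average of the frames


def identify(probe, gallery):
    """Return the gallery identity whose feature is closest to `probe`.

    `gallery` maps a (hypothetical) identity label to a (D,) feature.
    """
    names = list(gallery)
    dists = [np.linalg.norm(probe - gallery[name]) for name in names]
    return names[int(np.argmin(dists))]
```

With a zero `query` every frame receives the same weight, so the pooled feature reduces to the plain average of the frames; a non-zero `query` down-weights frames that score low, which is the sense in which attention pooling discards redundant frames.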


Keywords: Recognition; Convolutional neural network; Recurrent neural network; Videos; Computer vision


LINK: https://doi.org/10.1016/j.aei.2019.100981