
Visual attention framework for identifying semantic information from construction monitoring video

Published: September 28, 2023

Botao Zhong, Luoxin Shen, Xing Pan, Lei Lei

Abstract

Construction safety management has been extensively investigated, and construction cameras are widely used to monitor on-site activities. However, manually analyzing large quantities of video or image data is time-consuming and labor-intensive. Existing studies mostly focus on identifying single elements in videos or images, while deeper semantic understanding of construction scenes as a whole remains limited. Drawing on the attention mechanism, a framework is proposed to address this problem and identify semantic information such as multiple objects, relationships, and attributes from construction videos. The framework comprises a two-step modeling approach: (1) a frame extraction model with an interframe difference mechanism extracts frames/images from construction videos, and (2) an image scene understanding model that integrates a ResNet101 "encoder" and an LSTM + Attention "decoder" identifies semantic information/natural language descriptions from the extracted frames/images. Finally, the proposed framework is validated by multiple experiments on offline image datasets of construction scenes. The contributions of this research are twofold: (1) the proposed visual attention framework represents a significant, data-driven advance in the cross-modal processing of construction videos, images, and natural-language descriptions; and (2) the automatic generation of video semantic information facilitates construction safety management tasks such as estimating workers' safety state and retrieving and storing monitoring videos/images.
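
As a rough illustration of the first step, the sketch below implements an interframe-difference frame extractor with OpenCV. The grayscale conversion, mean-absolute-difference score, and threshold value are assumptions for illustration, not the authors' published implementation.

    # Minimal sketch of interframe-difference frame extraction.
    # diff_threshold is a hypothetical tuning parameter.
    import cv2
    import numpy as np

    def extract_key_frames(video_path, diff_threshold=30.0):
        """Keep a frame only when its mean absolute gray-level
        difference from the previously kept frame exceeds the threshold."""
        cap = cv2.VideoCapture(video_path)
        kept, prev_gray = [], None
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if prev_gray is None or np.mean(cv2.absdiff(gray, prev_gray)) > diff_threshold:
                kept.append(frame)  # scene changed enough to warrant a new key frame
                prev_gray = gray
        cap.release()
        return kept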
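
For the second step, the following condensed PyTorch sketch shows the encoder-decoder-with-attention pattern the abstract names: a ResNet101 backbone producing a spatial feature map, and an LSTM decoder with additive attention over that map generating per-word logits. Layer sizes, the attention form, and vocabulary handling are illustrative assumptions, not the paper's exact architecture.

    # Sketch of a ResNet101 encoder + LSTM-with-attention decoder (PyTorch).
    import torch
    import torch.nn as nn
    import torchvision.models as models

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            resnet = models.resnet101(weights="IMAGENET1K_V1")
            # Drop the average-pool and fc head to keep a spatial feature map.
            self.backbone = nn.Sequential(*list(resnet.children())[:-2])

        def forward(self, images):                      # (B, 3, H, W)
            feats = self.backbone(images)               # (B, 2048, h, w)
            return feats.flatten(2).permute(0, 2, 1)    # (B, h*w, 2048)

    class AttnDecoder(nn.Module):
        def __init__(self, vocab_size, embed_dim=512, hidden_dim=512, feat_dim=2048):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.attn = nn.Linear(feat_dim + hidden_dim, 1)   # additive attention score
            self.lstm = nn.LSTMCell(embed_dim + feat_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def step(self, token, feats, h, c):
            # Score each spatial location against the current hidden state,
            # then pool the feature map into a single attended context vector.
            expanded = h.unsqueeze(1).expand(-1, feats.size(1), -1)
            alpha = torch.softmax(self.attn(torch.cat([feats, expanded], dim=2)), dim=1)
            context = (alpha * feats).sum(dim=1)              # (B, feat_dim)
            h, c = self.lstm(torch.cat([self.embed(token), context], dim=1), (h, c))
            return self.out(h), h, c                          # logits for the next word

At generation time, step would be called once per output word, feeding back the previously predicted token until an end-of-sentence token is produced.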

Keywords: Construction safety management; Monitoring video; Scene understanding; Visual attention framework; Frame extraction; Semantic information

https://www.sciencedirect.com/science/article/pii/S0925753523000644