Hazard analysis: A deep learning and text mining framework for accident prevention
Botao Zhong, Xing Pan, Peter E.D. Love, Jun Sun, Chanjuan Tao
Abstract
Learning from past accidents is pivotal for improving safety in construction. However, hazard records are typically documented and stored as unstructured or semi-structured free text, which makes such data difficult to analyse. This study presents a novel and robust framework that combines deep learning and text mining technologies to analyse hazard records automatically. The framework comprises a four-step modelling approach: (1) identification of hazard topics using a Latent Dirichlet Allocation (LDA) model; (2) automatic classification of hazards using a Convolutional Neural Network (CNN) algorithm; (3) production of a Word Co-occurrence Network (WCN) to determine the interrelations between hazards; and (4) quantitative analysis of keywords using Word Cloud (WC) technology to provide a visual overview of hazard records. The proposed framework is validated by analysing hazard records collected from a large-scale transport infrastructure project. It is envisaged that the use of the framework can provide managers with new insights and knowledge to better ensure positive safety outcomes in projects. The contributions of this research are threefold: (1) it demonstrates that the analysis of hazard records can be automated by combining deep learning and text mining; (2) hazards can be visualized using a systematic and data-driven process; and (3) hazard topics are generated and classified automatically over specific time periods, enabling managers to understand their patterns of manifestation and therefore put in place strategies to prevent them from recurring.
Keywords: Deep learning, Construction, Hazards, Topic model, Text mining, Safety
https://doi.org/10.1016/j.aei.2020.101152
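To make steps (3) and (4) of the framework more concrete, the following is a minimal sketch of how word co-occurrence counts and keyword frequencies might be derived from free-text hazard records. It is not the authors' implementation: the function name `cooccurrence_counts`, the stop-word list, the whitespace tokenisation, and the three sample records are all assumptions introduced here for illustration. Each word pair that appears in the same record is treated as an edge of the co-occurrence network, weighted by the number of records in which it appears, and the per-word record counts are the kind of frequencies a word cloud would scale by.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records, stopwords=frozenset()):
    """Count word and word-pair occurrences across records.

    A word (or pair) is counted at most once per record, so the counts
    are record frequencies rather than raw term frequencies.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for text in records:
        # Crude whitespace tokenisation with punctuation stripping;
        # a real pipeline would likely use a proper tokenizer.
        tokens = {t.strip(".,;:()").lower() for t in text.split()}
        # Sort so that every pair is stored in one canonical order.
        tokens = sorted(t for t in tokens if t and t not in stopwords)
        word_counts.update(tokens)
        pair_counts.update(combinations(tokens, 2))
    return word_counts, pair_counts


# Hypothetical hazard records, invented for illustration only.
records = [
    "Scaffold guardrail missing near edge",
    "Worker without harness near edge",
    "Scaffold plank loose, guardrail missing",
]
stop = frozenset({"near", "without"})

word_counts, pair_counts = cooccurrence_counts(records, stop)

# Heaviest edges of the co-occurrence network.
for (a, b), weight in pair_counts.most_common(3):
    print(f"{a} -- {b}: {weight}")
```

In this sketch the pair counts could be passed to a graph library as weighted edges, and the word counts to a word cloud renderer; both choices are left open because the abstract does not name the tools used.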