Research on an HBIM-Oriented Cross-Modal Image-Text Information Retrieval Method for Historic Buildings
袁嘉梦
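Abstract
How to carry out the protective restoration of historic buildings is an important topic that is drawing growing public attention, and advances in digital technology have brought new opportunities for restoration and renovation work. Historic Building Information Modeling (HBIM), as a parametric object library, can assist the restoration of historic buildings, and managing and reusing the data in an HBIM database creates a need for HBIM information retrieval methods.
This paper first studies a method for constructing a multi-dimensional structured HBIM database: it examines the historical and cultural elements of buildings in depth, classifies, extracts, and organizes building data, and stores the data in four sub-databases so that historic building information is expressed comprehensively. Second, it builds a historic building image-text dataset containing 459 buildings, 6838 images, and 459 descriptive texts; a text information template was created to standardize the content of the descriptive texts, and a set of sample query statements was created for evaluating and testing the algorithms. Third, it proposes a cross-modal image-text information retrieval method for historic buildings, defining the concept of similar buildings and specific similarity measurement rules between image or text data and buildings, so that similar historic buildings can be retrieved from an input image or natural-language text. For image-to-building retrieval, the ViT-B/16 model is used for image feature extraction, reaching a retrieval accuracy of 95.21% on the self-built dataset, significantly higher than other classic image retrieval algorithms. For text-to-building retrieval, the BERT model is used for text feature extraction and the DALL-E 3 model is integrated into the retrieval algorithm to address the difficulty of comparing image and text data across modalities in the historic building domain; experiments determine the best parameter configuration for the text-to-building retrieval algorithm, and comparative experiments verify its accuracy. Finally, for designers and construction personnel on historic building restoration projects, a cross-modal retrieval system for historic buildings is designed and built on the HBIM database and the cross-modal retrieval algorithms, and its practical value is verified by explaining how each sub-database serves as a reference during restoration and renovation.
This paper provides new methods and tools for historic building restoration. Retrieving similar buildings indirectly retrieves the data associated with those buildings in the HBIM database, providing a reference for subsequent restoration assistance and enabling knowledge reuse during the restoration of historic buildings.
Keywords: HBIM database; historic building; cross-modal retrieval; similarity measurement; protective restoration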
Abstract
The protection and restoration of historic buildings is an important topic that is attracting growing public attention, and the development of digital technologies has opened new opportunities for restoration and renovation work. Historic Building Information Modeling (HBIM), as a parametric object library, can assist the restoration of historic buildings. Managing and reusing the data held in HBIM databases in turn creates a need for HBIM information retrieval methods.
This paper first investigates a method for constructing a multi-dimensional structured HBIM database. It examines the historical and cultural elements of buildings in depth, then classifies, extracts, and organizes building data into four sub-databases so that historic building information is expressed comprehensively. Second, a historic building image-text dataset is established, comprising 459 buildings, 6838 images, and 459 descriptive texts. A text information template is created to standardize the content of the descriptive texts, and a set of sample queries is prepared for evaluating and testing the algorithms. Third, a cross-modal image-text retrieval method for historic buildings is proposed. It defines the concept of similar buildings and specific similarity measurement rules between image or text data and buildings, so that similar historic buildings can be retrieved from an input image or natural-language text. For image-to-building retrieval, the ViT-B/16 model is used for image feature extraction, achieving a retrieval accuracy of 95.21% on the self-built dataset, significantly higher than other classical image retrieval algorithms. For text-to-building retrieval, the BERT model is used for text feature extraction, and the DALL-E 3 model is integrated into the retrieval algorithm to address the difficulty of comparing image and text data across modalities in the historic building domain. Experiments determine the optimal parameter configuration for the text-to-building retrieval algorithm, and comparative experiments verify its accuracy. Finally, for designers and construction personnel involved in historic building restoration projects, a cross-modal retrieval system for historic buildings is designed and built on the HBIM database and the cross-modal retrieval algorithms. The practical value of the retrieval system is verified by explaining how each sub-database serves as a reference during restoration and renovation.
This paper presents a new method and tool for the restoration of historic buildings. By retrieving similar buildings, it indirectly facilitates the retrieval of data from the associated HBIM database, providing references for subsequent restoration work and enabling the reuse of knowledge in the restoration process of historic buildings.
Keywords: HBIM database; historic building; cross-modal retrieval; similarity measurement; protective restoration
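
To make the image-to-building retrieval step summarized above more concrete, the following is a minimal sketch, not the method defined in this paper. It assumes that each image has already been converted to a feature vector (for example by a ViT model), that similarity is measured by cosine similarity, and that a building is scored by its best-matching image. The aggregation rule, the function names `cosine` and `rank_buildings`, and the toy three-dimensional vectors are all illustrative assumptions; the paper defines its own similarity measurement rules, which may differ.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_buildings(query_vec, gallery, top_k=3):
    """Rank buildings by similarity to a query image feature vector.

    gallery: list of (building_id, feature_vector) pairs, one pair per image.
    Assumption: a building's score is the maximum cosine similarity over
    all of its images. This is a hypothetical rule chosen for illustration.
    """
    best = defaultdict(lambda: -1.0)
    for building_id, vec in gallery:
        best[building_id] = max(best[building_id], cosine(query_vec, vec))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Toy 3-dimensional features standing in for real image embeddings.
gallery = [
    ("building_A", [1.0, 0.0, 0.0]),
    ("building_A", [0.9, 0.1, 0.0]),
    ("building_B", [0.0, 1.0, 0.0]),
    ("building_C", [0.0, 0.0, 1.0]),
]
query = [0.8, 0.2, 0.0]
results = rank_buildings(query, gallery, top_k=2)
for building_id, score in results:
    print(building_id, round(score, 4))
```

Scoring a building by its best-matching image tolerates galleries that mix facades, interiors, and details of the same building; averaging over a building's images is an equally plausible alternative when the query is expected to resemble the building as a whole.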