Journal of Hydroelectric Engineering (水力发电学报)

Journal of Hydroelectric Engineering, 2022, Vol. 41, Issue (3): 133-141. doi: 10.11660/slfdxb.20220313


Intelligent text analysis of hydropower project progress management based on improved LDA

  • Online: 2022-03-25 Published: 2022-03-25

Abstract: Schedule control is a key task in hydropower project management, and timely summaries of schedule management information help in formulating and adjusting project schedules. In hydropower construction, progress information is mostly recorded as semi-structured or unstructured text, which makes information extraction difficult; automated, intelligent mining of this text is a problem in urgent need of a solution. This paper proposes an intelligent extraction method for hydropower project schedule information, based on an improved LDA model, that extracts key information from schedule management texts. Starting from the Gibbs sampling mechanism of the traditional LDA model, the method takes the associations between words into account and replaces the original random sampling of single words with sampling of word pairs based on their co-occurrence degree. This strengthens the semantic association between words, tightens the relationship among the words within a topic, and improves how accurately the topic words describe the topic. The method is applied to a real hydropower project: 221 weekly construction supervision reports are analyzed, process keywords are extracted for 12 topics, and the main and secondary construction processes are identified from the computed results. The results show that the improved LDA topic model extracts process feature words from hydropower schedule texts better than the traditional LDA topic model, helps improve the efficiency of key process word extraction and information mining for construction schedules, and provides a new tool for intelligent management of hydropower construction.

Key words: hydropower project, construction progress, keyword extraction, improved-LDA topic model, co-occurrence, text intelligence analysis
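
The abstract describes the core idea (sampling co-occurring word pairs instead of single words during Gibbs sampling) but does not give the formulas. The following is a minimal Python sketch of that idea under stated assumptions, not the authors' implementation: the co-occurrence degree is assumed to be a Jaccard-style ratio over documents, the pair-level update is assumed to follow a biterm-style collapsed Gibbs step in which both words of a pair share one topic, and all function names, the threshold, and the toy word lists are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def cooccurrence_degree(docs):
    """Jaccard-style co-occurrence degree per word pair (assumed definition)."""
    doc_freq, pair_freq = Counter(), Counter()
    for doc in docs:
        vocab = sorted(set(doc))
        doc_freq.update(vocab)
        pair_freq.update(combinations(vocab, 2))
    return {(a, b): n / (doc_freq[a] + doc_freq[b] - n)
            for (a, b), n in pair_freq.items()}


def select_pairs(docs, degree, threshold):
    """For each document, keep word pairs whose degree meets the threshold."""
    return [[p for p in combinations(sorted(set(doc)), 2) if degree[p] >= threshold]
            for doc in docs]


def gibbs_word_pairs(doc_pairs, n_topics, alpha=0.1, beta=0.01, n_iter=100, seed=0):
    """Collapsed Gibbs sampling where both words of a pair share one topic."""
    rng = random.Random(seed)
    v = len({w for pairs in doc_pairs for p in pairs for w in p})
    doc_topic = [[0] * n_topics for _ in doc_pairs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    def update(d, z, a, b, delta):
        # Add (delta=+1) or remove (delta=-1) one pair from the count tables.
        doc_topic[d][z] += delta
        topic_word[z][a] += delta
        topic_word[z][b] += delta
        topic_total[z] += 2 * delta

    # Random initial topic for every pair.
    assign = []
    for d, pairs in enumerate(doc_pairs):
        zs = [rng.randrange(n_topics) for _ in pairs]
        for (a, b), z in zip(pairs, zs):
            update(d, z, a, b, +1)
        assign.append(zs)

    for _ in range(n_iter):
        for d, pairs in enumerate(doc_pairs):
            for i, (a, b) in enumerate(pairs):
                update(d, assign[d][i], a, b, -1)
                # Unnormalized conditional probability of each topic for this pair.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][a] + beta) * (topic_word[k][b] + beta)
                    / ((topic_total[k] + v * beta) * (topic_total[k] + 1 + v * beta))
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                update(d, z, a, b, +1)
    return topic_word


# Toy, made-up "weekly report" word lists (not data from the paper).
docs = [
    ["dam", "concrete", "pouring"],
    ["dam", "concrete", "curing"],
    ["tunnel", "excavation", "blasting"],
    ["tunnel", "excavation", "support"],
]
degree = cooccurrence_degree(docs)
doc_pairs = select_pairs(docs, degree, threshold=0.6)
topics = gibbs_word_pairs(doc_pairs, n_topics=2)
for k, counts in enumerate(topics):
    print(k, [w for w, n in counts.most_common(3) if n > 0])
```

Because each retained pair is assigned a single topic, words that frequently appear together are pushed into the same topic, which is the effect the abstract attributes to word-pair sampling. The threshold and the hyperparameters `alpha` and `beta` are illustrative values only.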
