Construction Research Congress 2020
Classifying Semantic Relationships of Utility-Specific Terminologies via LSTM Networks along Shortest Dependency Paths
Publication: Construction Research Congress 2020: Computer Applications
ABSTRACT
Different utility organizations use inconsistent vocabulary, which makes integrating data from distinct sources challenging. A semantic resource that classifies utility terminology into sets of synonyms, hyponyms, and meronyms enables computers to interpret lexical semantics and to avoid mismatches when integrating heterogeneous utility data. However, constructing such a resource requires significant time and effort. This paper presents a shortest dependency path (SDP) long short-term memory (LSTM) approach, SDP-LSTM, that automatically classifies the semantic relation (i.e., is-a, part-of, is-similar, or random) between two utility terms in a definition sentence. The SDP is the shortest path between the two domain terms in the sentence's dependency structure; it retains the words most informative for relation classification while eliminating irrelevant ones. SDP-LSTM uses LSTM units to pick up heterogeneous feature information along the SDP and learns features automatically for semantic classification. The proposed approach was tested on a corpus of definition texts collected from utility design manuals. Preliminary results show an overall accuracy of over 80%, suggesting that the method can serve as a good starting point for constructing a semantic resource for the utility domain.
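To make the SDP idea concrete, the following is a minimal Python sketch of extracting the shortest dependency path between two terms. It is not the authors' implementation. The example sentence, the hand-written dependency edges in `EDGES`, and the function name `shortest_dependency_path` are all assumptions made for illustration; in practice the edges would come from a dependency parser, and tokens would be identified by position rather than by their text.

```python
from collections import deque

# ASSUMPTION: hand-written (head, dependent) edges for the illustrative sentence
# "A valve is a device that controls the flow of water".
# These are not parser output and are not taken from the paper's corpus.
EDGES = [
    ("device", "valve"),
    ("device", "is"),
    ("device", "a"),
    ("valve", "A"),
    ("device", "controls"),
    ("controls", "that"),
    ("controls", "flow"),
    ("flow", "the"),
    ("flow", "water"),
    ("water", "of"),
]


def shortest_dependency_path(edges, source, target):
    """Return the tokens on the shortest path from source to target.

    The dependency tree is treated as an undirected graph and searched
    breadth-first. Returns None if either term is missing or unreachable.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Function words such as "A", "is", "that", and "of" fall off the path.
print(shortest_dependency_path(EDGES, "valve", "water"))
# ['valve', 'device', 'controls', 'flow', 'water']
```

In an SDP-LSTM style pipeline, the tokens on this path (rather than the full sentence) would be the sequence fed to the LSTM units; that downstream step is omitted here.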
Published In
Construction Research Congress 2020: Computer Applications
Pages: 372 - 379
Editors: Pingbo Tang, Ph.D., Arizona State University, David Grau, Ph.D., Arizona State University, and Mounir El Asmar, Ph.D., Arizona State University
ISBN (Online): 978-0-7844-8286-5
Copyright
© 2020 American Society of Civil Engineers.
History
Published online: Nov 9, 2020
Published in print: Nov 9, 2020