Parsing the text of terminology dictionaries

pp. 90-100
Lviv Polytechnic National University

The article outlines the tasks, approaches and stages involved in developing a technology for parsing the text of a multilingual explanatory terminology dictionary. The research was carried out on the “Dictionary of Ukrainian Biological Terminology” (SUBT). This dictionary was chosen from the wide variety of available dictionaries because terminology dictionaries provide a lexical-semantic basis for building systems for the intelligent processing of professional texts, which supply information on specific subject areas. This terminographic work covers the normative general scientific and widely used terminology of the biological sciences, as recorded in modern encyclopedic, general and special dictionaries and in scientific, popular science, educational and informative literature. Once the chosen dictionary has been studied, the model of its lexicographic system will be generalized to other subject areas, which will create the preconditions for an integral multidisciplinary digital lexicographic space. Working with dictionaries that have merely been converted into computer text formats is very inefficient; such texts need to be converted into lexicographic database formats, a special task unknown to classical lexicography. This is what the term “parsing dictionaries” means. In the course of the study, a model of the lexicographic system was constructed, and it serves as the basis for the XML representation. Further work on converting the printed version of the dictionary into an online system is based on an XML file. The typographic design, organization and structure of the printed text of the dictionary are analyzed in order to identify the elements of the conceptual model of the L-system (lexicographic system) of the SUBT. Based on the conceptual model, the structure of an XML document is proposed, which is to serve as an intermediary between the printed version of the dictionary and its implementation as an online lexicographic system. In the future, it is planned to build a universal parsing procedure by improving the structure of the XML document.
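
The idea of an XML document as an intermediary between the printed text and an online system can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the flat-text entry layout (headword, bracketed foreign-language equivalents, definition), the element names `dictionary`, `entry`, `headword`, `equivalent` and `definition`, and the sample entry itself are hypothetical and are not taken from the dictionary or from the XML structure proposed in the article.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical flat-text layout of one entry (NOT the real layout of the dictionary):
#   HEADWORD [lang: equivalent; lang: equivalent] definition text
ENTRY_RE = re.compile(
    r"^(?P<headword>[^\[\]]+?)\s*"      # headword: everything before the bracket
    r"\[(?P<equivalents>[^\]]*)\]\s*"   # bracketed list of foreign equivalents
    r"(?P<definition>.+)$"              # the rest of the line is the definition
)


def parse_entry(line: str) -> ET.Element:
    """Turn one flat-text entry into an <entry> element (illustrative schema)."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry layout: {line!r}")

    entry = ET.Element("entry")
    ET.SubElement(entry, "headword").text = match["headword"].strip()

    # Each equivalent is written as "lang: term"; items are separated by ";".
    for item in match["equivalents"].split(";"):
        if not item.strip():
            continue
        lang, _, term = item.partition(":")
        equivalent = ET.SubElement(entry, "equivalent", {"lang": lang.strip()})
        equivalent.text = term.strip()

    ET.SubElement(entry, "definition").text = match["definition"].strip()
    return entry


def parse_dictionary(lines) -> ET.Element:
    """Collect all non-empty lines into a <dictionary> root element."""
    root = ET.Element("dictionary")
    for line in lines:
        if line.strip():
            root.append(parse_entry(line))
    return root


# Made-up sample entry, used only to exercise the sketch.
sample = ["клітина [en: cell; de: Zelle] the basic structural unit of living organisms"]
xml_text = ET.tostring(parse_dictionary(sample), encoding="unicode")
print(xml_text)
```

In a sketch of this kind the parsing rule (here a single regular expression) is kept separate from the XML schema, so that a different printed layout could in principle be handled by replacing only the rule; a real dictionary would likely need a considerably richer grammar than this one-line pattern.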

