Automatization of functions determination and productivity of terms in terminology systems

pp. 47-49
Zhytomyr Ivan Franko State University

The essence of the problem is the complexity of analysing large-scale terminology systems containing hundreds or thousands of terms. To solve this problem, it is suggested to use the method of network analysis proposed by E. F. Skorokhodko. The method involves representing the terminology system as a matrix, separating the initial, derivative and final terms, and determining their productivity. Provided the source data are represented correctly in the dictionary (that is, according to the accepted format), the method can be automated quite simply. An example of the format of the developed dictionary is given, along with methods for its formalized analysis.

The essence of the problem studied is the following: when traditional terminology systems are composed, it is usually difficult to determine the productivity of the terms in these systems and, therefore, to determine the functions of these terms. The lack of data about the productivity and functions of terms makes it impossible to edit existing terms effectively or to extend the terminology system further.

The previously unresolved part of the problem is the need for automation of the analysis of terminology systems. Automation of the solution to this problem will allow us to obtain objective data about the productivity and functions of each term for a system of terminology of any size.

The purpose of the article is to show a possible way of achieving such automation.

When composing terminological dictionaries, it is expedient to use the following technology to determine the functions and productivity of terms.

1. Prepare a dictionary with a text editor in DOC, RTF, or a similar format. Each record of such a dictionary should consist of a heading and its definition (the divider, as is traditional in dictionaries, may be a dash). Of course, it is assumed that the definitions of terms meet the requirements that formal logic places on definitions.

2. In the definitions, manually mark with a distinct font style (bold, italic, underline, etc.) those terms that also appear in this dictionary as headings.

3. Using any spreadsheet program, such as Excel, construct the matrix Nk × Nk, where Ni is a header term of the dictionary (i = 1, 2, 3, ..., k). In this table, the rows correspond to the header terms and the columns to the terms used in their definitions.

4. In the cells at the intersections of rows and columns, mark the cases in which the definition of the row's heading term contains the column's heading term. This operation can be performed either manually or automatically. To perform it automatically, masks (usually three characters long) must be applied to the ends of words in order to identify the terms, and special software must be developed. The complexity of creating such software corresponds to that of a master's thesis in one of the specialties in the field of "Programming".

5. Based on the matrix Nk × Nk, determine the function of each term (that is, establish which terms are initial, derivative, or final), as well as its productivity (the number of times it occurs in the definitions of other terms). The result can be presented as a table listing the terms in alphabetical order with the function and productivity of each.

6. Using available data visualization software, it is recommended to present the matrix Nk × Nk as a network graph.
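
Steps 3 to 5 can be illustrated with a minimal Python sketch. It is not the software called for in this article, only an illustration under several assumptions: the dictionary is given as plain text with one "heading - definition" record per line; header terms inside definitions are wrapped in asterisks as a stand-in for font marking; terms are matched exactly, so the word-ending masks of step 4 are omitted; a term is treated as initial if its definition uses no header terms, as final if it uses header terms but occurs in no other definition, and as derivative otherwise. The sample entries and the names `parse_entries`, `build_matrix` and `analyse` are hypothetical.

```python
import re

# Hypothetical plain-text stand-in for the dictionary format: one record per
# line, "heading - definition"; header terms inside a definition are wrapped
# in asterisks instead of being marked by font.
SAMPLE = """\
sign - a basic unit of notation
text - a sequence of *sign* units
page - one side of a sheet carrying *text*
book - a bound set of *page* units containing *text*
"""

def parse_entries(raw):
    """Return {heading: [marked terms found in its definition]}."""
    entries = {}
    for line in raw.splitlines():
        if not line.strip():
            continue
        heading, definition = line.split(" - ", 1)
        entries[heading.strip().lower()] = [
            t.lower() for t in re.findall(r"\*([^*]+)\*", definition)
        ]
    return entries

def build_matrix(entries):
    """Rows are header terms; columns are terms used in their definitions."""
    terms = sorted(entries)
    index = {t: i for i, t in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for heading, used in entries.items():
        for t in used:
            # Skip marked words that are not headings, and self-references.
            if t in index and t != heading:
                matrix[index[heading]][index[t]] = 1
    return terms, matrix

def analyse(terms, matrix):
    """Return {term: (function, productivity)} under the stated assumptions."""
    result = {}
    for i, term in enumerate(terms):
        uses = sum(matrix[i])                         # header terms in own definition
        productivity = sum(row[i] for row in matrix)  # definitions that use this term
        if uses == 0:
            function = "initial"
        elif productivity == 0:
            function = "final"
        else:
            function = "derivative"
        result[term] = (function, productivity)
    return result

terms, matrix = build_matrix(parse_entries(SAMPLE))
report = analyse(terms, matrix)
for term in terms:  # alphabetical table, as in step 5
    print(term, *report[term])
```

The same 0/1 matrix can serve as an adjacency matrix for the visualization in step 6, since each nonzero cell corresponds to an edge from a defined term to a term used in its definition.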

Within the framework of the above technology (see items 1 and 2 above), we have published a dictionary of publishing terms (more than 800 terms). However, further research on this terminology system is hindered by the lack of the software tools described above.

Taking the above into account, we consider it expedient for the Technical Committee for the Standardization of Scientific and Technical Terminology, under the auspices of L'viv Polytechnic National University, to initiate the creation of the above-mentioned software tools, which could be useful for studying the terminology of the Ukrainian language in our state. It is expedient to distribute such software on a royalty-free basis.

1. Konverskyi A. Ie. Lohika (tradytsiina ta suchasna) : pidruchnyk / A. Ie. Konverskyi. – K. : Tsentr navchalnoi literatury, 2004. – 536 s.
2. Partyko Z. V. Slovnyk vydavnychykh terminiv / Z. V. Partyko. – Zaporizhzhia : KPU, 2013. – 68 s.
3. Skorokhodko E. F. Semantycheskye sety y avtomatycheskaia obrabotka teksta / E. F. Skorokhodko. – K. : Nauk. dumka, 1983. – 218 s.
4. Skorokhodko E. F. Termin u naukovomu teksti (do stvorennia terminotsentrychnoi teorii naukovoho dyskursu) / E. F. Skorokhodko. – K. : Lohos, 2006. – 98 s.

Partyko Z. Automatization of functions determination and productivity of terms in terminology systems // Website of TC STTS: Herald of L'viv Polytechnic National University "Problems of Ukrainian Terminology". – 2017. – # 869.