The article outlines the problem of finding meaningful units in electronic text documents and analyzes the main shortcomings of existing approaches to extracting knowledge from textual information. It studies how logico-linguistic models of electronic text documents are constructed, and in particular describes and examines the knowledge bases of a system for the automated construction of logico-linguistic models of Ukrainian-language text documents.
The study analyzes how services for determining the uniqueness of electronic text documents operate and considers their main characteristics when checking the originality of an article. The author presents the structure of a system for comparative analysis of electronic text documents by content and outlines the operating principle of each of its major components.
The paper analyzes the shortcomings of existing models of text documents and proposes a uniform content model of a text, formed by synthesizing the logico-linguistic models of its sentences. The work shows the basic steps of the proposed algorithm for constructing content models of text.
The article analyzes the concept of formal algorithms and demonstrates an example of using Turing machine commands to analyze a natural-language sentence. The research proposes an algorithm for the automated linguistic processing of electronic documents based on constructing logico-linguistic models of sentences.