Methods of Semantic Analysis in Annotated Generalization of Text Documents

2020; pp. 53–58
1 Lviv Polytechnic National University, Computer Engineering Department
2 Lviv Polytechnic National University, Computer Engineering Department

The article examines the use of semantic analysis in the generalization (summarization) of text documents. The features of the most widespread methods of generalizing text documents are analyzed, together with the ways in which the quality of their results is evaluated. An improved method of annotative generalization is presented; it uses the principles of latent semantic analysis and elements of fuzzy logic to identify semantically important sentences. A new approach to evaluating the effectiveness of generalization is proposed. It is based on elements of fuzzy logic and on a statistical indicator that assesses the importance of words in the context and class of the document, which makes it possible to determine how closely a summary corresponds to the original document. The results of verifying the proposed tools are presented and confirm their effectiveness.
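
To make the evaluation idea concrete, the following is a minimal sketch of how a word-importance indicator could be used to measure the correspondence between a document and its summary. It assumes the statistical indicator is a TF-IDF-style weight, which the abstract does not name; the functions `tokenize`, `tfidf_weights`, and `coverage` are hypothetical names introduced here for illustration, and the fuzzy-logic component of the proposed approach is omitted entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_weights(document, corpus):
    """Return {word: tf * idf} for `document`, with idf estimated from `corpus`.

    Assumption: the smoothed form idf = log((1 + N) / (1 + df)) + 1 is used,
    so words absent from the corpus still get a finite, positive weight.
    """
    counts = Counter(tokenize(document))
    total = sum(counts.values())
    doc_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(doc_sets)
    weights = {}
    for word, count in counts.items():
        df = sum(1 for s in doc_sets if word in s)  # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[word] = (count / total) * idf
    return weights


def coverage(document, summary, corpus):
    """Share (0..1) of the document's TF-IDF mass carried by words kept in the summary."""
    weights = tfidf_weights(document, corpus)
    total = sum(weights.values())
    if total == 0:
        return 0.0
    kept = set(tokenize(summary))
    return sum(w for word, w in weights.items() if word in kept) / total


if __name__ == "__main__":
    doc = "latent semantic analysis ranks the sentences of the text"
    corpus = [doc, "the cat sat on the mat", "the dog chased the cat"]
    print(coverage(doc, "latent semantic analysis", corpus))
    print(coverage(doc, "the text", corpus))
```

Under these assumptions, a summary that keeps the rarer, topic-specific words of the document scores higher than one of similar length that keeps mostly common words. A score of this kind could then serve as one input to a fuzzy evaluation, though how the authors combine the two is not specified in the abstract.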
