Transformers

Intellectual Analysis of Textual Data in Social Networks Using BERT and XGBoost

This article presents a comprehensive approach to sentiment analysis in social networks by leveraging modern text processing methods and machine learning algorithms. The primary focus is the integration of the Sentence-BERT model for text vectorization and XGBoost for sentiment classification. An extensive study of sentiment-labeled text messages was conducted on the Sentiment140 dataset. The Sentence-BERT model enables the generation of high-quality vector representations of textual data, preserving both lexical and contextual relationships between words.
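The pipeline described in this abstract (Sentence-BERT vectors fed to an XGBoost classifier) might be sketched as follows. This is only an illustration, not the authors' implementation: the checkpoint name `all-MiniLM-L6-v2`, the hyperparameters, and the helper names are assumptions, and the label mapping assumes the commonly distributed Sentiment140 encoding (0 = negative, 4 = positive). The third-party imports are placed inside the training function so the label helper can be used without them.

```python
def sentiment140_to_binary(label):
    """Map a Sentiment140 polarity code to a binary class (assumed encoding: 0 = negative, 4 = positive)."""
    if label == 0:
        return 0
    if label == 4:
        return 1
    raise ValueError(f"unexpected Sentiment140 label: {label!r}")


def train_sentiment_classifier(texts, labels, model_name="all-MiniLM-L6-v2"):
    """Embed texts with a Sentence-BERT model and fit an XGBoost classifier on the vectors.

    Requires the third-party packages sentence-transformers and xgboost.
    The checkpoint name and hyperparameters are illustrative assumptions.
    """
    from sentence_transformers import SentenceTransformer
    from xgboost import XGBClassifier

    encoder = SentenceTransformer(model_name)
    features = encoder.encode(list(texts))  # array of shape (n_texts, embedding_dim)
    targets = [sentiment140_to_binary(label) for label in labels]

    classifier = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
    classifier.fit(features, targets)
    return encoder, classifier
```

Given lists `texts` and `labels` loaded from the dataset, `encoder, classifier = train_sentiment_classifier(texts, labels)` returns both parts, and new messages can then be scored with `classifier.predict(encoder.encode(new_texts))`.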

Data Set Formation Method for Checking the Quality of Learning Language Models of the Transitive Relation in the Logical Conclusion Problem Context

A method for forming a data set has been developed to verify whether pre-trained models can learn transitivity dependencies. The generated data set was used to test how well transitivity dependencies are learned in the natural language inference (NLI) task. A data set of 10,000 samples (MultiNLI) was used to test the RoBERTa model.
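The abstract does not spell out how the samples are formed, so the following is only a minimal sketch of one way a transitivity check could be built: if a premise A entails B and B entails C, then the pair (A, C) should also be labeled as entailment. The function name `transitive_samples` and the example sentences are hypothetical.

```python
from collections import defaultdict


def transitive_samples(entailments):
    """Derive (premise, hypothesis) pairs implied by chaining two known entailments.

    `entailments` is a list of (premise, hypothesis) pairs already labeled as
    entailment. Pairs that are already known, or trivially reflexive, are skipped.
    """
    known = set(entailments)
    by_premise = defaultdict(list)
    for premise, hypothesis in entailments:
        by_premise[premise].append(hypothesis)

    derived = []
    seen = set()
    for a, b in entailments:
        for c in by_premise.get(b, []):
            candidate = (a, c)
            if a != c and candidate not in known and candidate not in seen:
                seen.add(candidate)
                derived.append(candidate)
    return derived


# Hypothetical example pairs, not taken from MultiNLI.
pairs = [
    ("A dog is running in the park.", "An animal is moving outdoors."),
    ("An animal is moving outdoors.", "Something is moving."),
]
samples = transitive_samples(pairs)
```

Each derived pair can then be passed to an NLI model. A model that has learned the transitive relation should predict entailment for it.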