Method for Detection of Disinformation Based on Text Data Analysis Using TF-IDF and Contextual Vector Representations
This article presents an approach to detecting fake news in the digital environment through text analysis with machine learning and natural language processing methods. The proposed method relies on a hybrid text representation that combines frequency features (TF-IDF) with contextual embeddings obtained from the IBM Granite model. A complete data processing pipeline was developed, covering exploratory data analysis (EDA), text preprocessing and tokenization, construction of vector representations, training of a logistic regression model, and computation of key evaluation metrics.
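The hybrid representation described above can be illustrated with a minimal, dependency-free sketch. It is not the article's implementation: the four texts and their labels are invented for illustration, all function names are hypothetical, and `placeholder_embedding` is a hashed bag-of-words stand-in used only so that the example runs without downloading a model (it is not the IBM Granite model and carries no contextual information). The sketch only shows how a TF-IDF vector and a dense vector might be concatenated and passed to a logistic regression classifier.

```python
import math
import re
import zlib
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def fit_idf(docs):
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in sorted(df.items())}

def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def tfidf_vector(text, idf):
    """L2-normalized TF-IDF vector over the fitted vocabulary."""
    counts = Counter(tokenize(text))
    return l2_normalize([counts[t] * weight for t, weight in idf.items()])

def placeholder_embedding(text, dim=8):
    """Assumption: stand-in for a contextual embedding, NOT IBM Granite."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return l2_normalize(vec)

def hybrid_features(text, idf):
    """Concatenate the frequency features with the dense vector."""
    return tfidf_vector(text, idf) + placeholder_embedding(text)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Estimated probability that x belongs to class 1 (fake)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Invented toy corpus for illustration only (1 = fake, 0 = real).
texts = [
    "shocking miracle cure doctors hate",
    "secret plot exposed share before deleted",
    "city council approves annual budget",
    "university publishes peer reviewed study",
]
labels = [1, 1, 0, 0]

idf = fit_idf(texts)
X = [hybrid_features(t, idf) for t in texts]
w, b = train_logreg(X, labels)

for text, label in zip(texts, labels):
    prob = predict_proba(hybrid_features(text, idf), w, b)
    print(label, round(prob, 3), text)
```

In a realistic setting the placeholder would be replaced by embeddings from an actual contextual model, the classifier would come from an established library, and metrics would be computed on held-out data rather than on the training texts.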