Stock prices are notoriously volatile, driven by many factors that make them difficult to predict. Because social media posts and news are among the major drivers of price changes, this paper aims to predict the next-day stock price of four different companies, using both social media and financial datasets covering September 30, 2021, to September 30, 2022, as inputs. The datasets pass through a preprocessing pipeline that includes sentiment analysis, in which tweets are classified with TextBlob and a fine-tuned RoBERTa model to extract new features. The best model achieves an R$^2$ score of 93% and an RMSE of 1.35.