Twitter-sentiment analysis of Moroccan diabetic using Fuzzy C-means SMOTE and deep neural network

Managing diabetes, a lifestyle condition, depends on public awareness, and social media is a powerful tool for building it.  Analyzing diabetes-related tweets can inform health communication strategies aimed at raising awareness of diabetes within the Moroccan community.  However, the tweet corpus is imbalanced, and feature extraction yields very high-dimensional data sets; both problems degrade the quality of sentiment analysis.  This study analyzes the content, sentiment, and reach of tweets related to diabetes in Morocco.  The proposed strategy proceeds in five steps: (a) data collection from Twitter and manual labeling, (b) feature extraction using the TF-IDF technique, (c) dimension reduction using a deep neural network, (d) data balancing using Fuzzy $C$-Means SMOTE, and (e) tweet classification using five well-known classifiers.  The proposed approach was compared with the classic system, which works directly on the high-dimensional, imbalanced tweet data.  In terms of recall, precision, and F1-score, the proposed system achieves highly accurate sentiment analysis within a reasonable CPU time.
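
Steps (b) and (d) of the strategy can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names `tfidf`, `fuzzy_c_means`, and `fcm_smote`, the fuzzifier `m = 2`, the iteration count, and the choice to interpolate between random pairs of points in the same cluster (rather than between nearest neighbours) are all assumptions made for illustration. The deep-network dimension reduction of step (c) and the classifiers of step (e) are omitted.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """TF-IDF weights for tokenized, non-empty docs (a common smoothed IDF variant)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))  # document frequency
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        tf = Counter(doc)
        rows.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vocab, rows


def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means of the points.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Memberships are proportional to distance ** (-2 / (m - 1)).
        for i, p in enumerate(points):
            inv = [max(math.dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0)) for ctr in centers]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def fcm_smote(minority, n_new, c=2, seed=0):
    """Create n_new synthetic minority points (assumes at least two inputs).

    Simplification (assumption): each synthetic point is a random convex
    combination of two points that share the same highest-membership cluster.
    """
    rng = random.Random(seed)
    _, u = fuzzy_c_means(minority, c, seed=seed)
    clusters = {}
    for p, row in zip(minority, u):
        clusters.setdefault(row.index(max(row)), []).append(p)
    groups = [g for g in clusters.values() if len(g) >= 2]
    if not groups:  # fall back to the whole minority class
        groups = [minority]
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(rng.choice(groups), 2)
        lam = rng.random()
        synthetic.append([x + lam * (y - x) for x, y in zip(a, b)])
    return synthetic
```

In use, the rows returned by `tfidf` (or a reduced representation of them) for the minority sentiment class would be passed to `fcm_smote`, and the synthetic rows appended to the training set before classification. Clustering first keeps each interpolation inside one region of the minority class, which is the motivation usually given for cluster-based SMOTE variants.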
