Development of random forest model for stroke prediction | International Journal of Innovative Science and Research Technology (2024)

Authors : Nnanna, Chidera Egegamuka; Nnanna, Ekedebe; Ajoku, Kingsley Kelechi; Okafor, Chidozie Raymond Patrick; Ozor, Chidinma C

Volume/Issue : Volume 9 - 2024, Issue 4 - April

Google Scholar : https://tinyurl.com/2a3u5kn8

Scribd : https://tinyurl.com/4rtxzs6x

DOI : https://doi.org/10.38124/ijisrt/IJISRT24APR2566

Abstract : Stroke is a significant cause of mortality and morbidity worldwide, and early detection and prevention of stroke are essential for improving patient outcomes. Machine learning algorithms have been used in recent years to predict the risk of stroke by leveraging large amounts of clinical and demographic data. The development of a stroke prediction system using the Random Forest machine learning algorithm is the main objective of this thesis. The primary goal of the project is to increase the accuracy of stroke detection while addressing the shortcomings of the current system, which include real-time deployment and interpretability issues with logistic regression. The goals of the research are the development and use of an ensemble machine learning-based stroke prediction system, performance optimization through ensemble machine learning algorithms, performance assessment, and real-time model deployment using Python Django. The study is significant because of its potential to improve public health by lessening the severity and consequences of strokes through early diagnosis and treatment. The research method comprises data collection, preprocessing, model selection, evaluation, and real-time deployment using Python Django. Our dataset consists of 5110 rows with a total size of 69 KB. The performance of our stroke prediction algorithm was evaluated using confusion-matrix metrics: accuracy, precision, recall, and F1-score. At the end of the research, the Random Forest model achieved an accuracy of 98.5%, compared with 86% for the existing logistic regression model.

Keywords : Machine Learning Algorithms, Preprocessing, Random Forest Model, Confusion Matrix, F-Score Measurement, Stroke Prediction.
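
The abstract names four confusion-matrix metrics (accuracy, precision, recall, and F1-score) without showing how they relate. The following is a minimal sketch of their standard definitions for a binary classifier, assuming label 1 means "stroke" and label 0 means "no stroke". The class name `ConfusionMatrix`, the helper `confusion_from_labels`, and the counts in the usage lines are hypothetical illustrations; they are not the authors' code or results.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (assumed positive class: stroke = 1)."""
    tp: int  # stroke cases predicted as stroke
    fp: int  # non-stroke cases predicted as stroke
    fn: int  # stroke cases predicted as non-stroke
    tn: int  # non-stroke cases predicted as non-stroke

    def accuracy(self) -> float:
        # Share of all predictions that are correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        # Share of predicted strokes that are real strokes.
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0

    def recall(self) -> float:
        # Share of real strokes that the model detects.
        actual_pos = self.tp + self.fn
        return self.tp / actual_pos if actual_pos else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


def confusion_from_labels(y_true: Iterable[int],
                          y_pred: Iterable[int]) -> ConfusionMatrix:
    """Tally the four cells from paired true and predicted 0/1 labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, fn=fn, tn=tn)


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only (not from the paper).
    cm = ConfusionMatrix(tp=40, fp=10, fn=10, tn=940)
    print(f"accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
          f"recall={cm.recall():.3f} f1={cm.f1():.3f}")
```

Reporting all four values matters because accuracy counts correct negatives as well as correct positives, while precision, recall, and F1 depend only on how the positive (stroke) class is handled.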
