Affinity Workshop: Women in Machine Learning

Heart Disease Prediction Using Machine Learning Techniques

Asegunloluwa Babalola · Tekena Solomon


Heart disease is one of the major life-threatening diseases in the world. According to a World Health Organization report, about 17 million people die from it every year, and it is projected to affect almost 23.6 million people by the year 2030. Heart disease refers to diseases of the heart and the blood vessels. Given its prevalence and severity, early detection is crucial to reducing its effect on mankind. The high cost and unavailability of healthcare in several places have been a major bottleneck in tackling this disease. This work adopts four machine learning algorithms, namely K-Nearest Neighbor (KNN), Decision Trees, Naïve Bayes, and Logistic Regression, for the prediction of heart disease. Five different datasets from the UCI Machine Learning Repository, with a total of 1190 records and 11 features, were used. During data cleaning, 272 duplicate records were removed, leaving 918 records. The records were 79% male, with ages from 28 to 77, and 21% female, with ages from 30 to 76. The dataset was split using Stratified K-Fold cross-validation over 5 folds, and the accuracy of every fold was calculated using the area under the Receiver Operating Characteristic curve. The model was implemented in the Python programming language because of its vast and easy-to-use libraries. The result of the developed system is a classification of the output into the absence of heart disease (0) and the presence of heart disease (1). It is available on The developed system showed that KNN had the highest accuracy of 91.6%, followed by Naïve Bayes and Logistic Regression with 88% and Decision Trees with 77.2% accuracy. The developed system supports faster early prediction of heart disease because it computes quickly, is always available, and is stress-free to use.
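The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and it uses synthetic data from `make_classification` as a stand-in for the cleaned records (918 = 1190 − 272 records, 11 features). The hyperparameters, the feature scaling step, the `evaluate` helper, and the random seeds are all illustrative assumptions rather than the authors' settings, so the printed scores will not match the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption): 918 records, 11 features, binary label
# where 0 = absence and 1 = presence of heart disease.
X, y = make_classification(
    n_samples=918, n_features=11, n_informative=6, random_state=0
)

# The four algorithms named in the abstract; all settings are assumed.
# Scaling is added for the distance-based and linear models.
models = {
    "K-Nearest Neighbor": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Stratified 5-fold split: each fold keeps roughly the same class ratio.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def evaluate(models, X, y, cv):
    """Return a dict mapping model name to its per-fold ROC AUC scores."""
    results = {}
    for name, model in models.items():
        results[name] = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    return results


results = evaluate(models, X, y, cv)
for name, fold_auc in results.items():
    print(f"{name}: mean AUC = {fold_auc.mean():.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaler is refit on the training portion of each fold, so no information from the held-out fold leaks into preprocessing.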
