Poster in Affinity Workshop: Black in AI
Predicting the Level of Anemia among Ethiopian Pregnant Women using Homogeneous Ensemble Machine Learning Algorithm and Deploy on Cloud-based Framework.
Belayneh Dejene · Tesfamariam Abuhay · Dawit Shibabaw
Keywords: [ Computer Vision ] [ Deep Learning ] [ artificial intelligence ] [ machine learning ]
This study aims to predict the level of anemia among pregnant women in Ethiopia using homogeneous ensemble machine learning algorithms and to deploy the resulting model on a Heroku-based cloud platform for potential users. The data were drawn from the Ethiopian Demographic and Health Survey (EDHS), collected three times at five-year intervals. The data were preprocessed to obtain quality data suitable for training a machine learning model that predicts the level of anemia among pregnant women. The study followed a design science approach. Random forest, CatBoost, and extreme gradient boosting were employed to build the predictive model, each with class decomposition (one versus one and one versus rest) and without class decomposition. Nine experiments were conducted on a total of 29104 instances with 23 features, using an 80/20 training/testing split. Without class decomposition, the overall accuracy of random forest, extreme gradient boosting, and CatBoost is 91.34%, 94.26%, and 97.08.90%, respectively. With one versus one, the overall accuracy of random forest, extreme gradient boosting, and CatBoost is 94.4%, 95.21%, and 97.44%, respectively. With one versus rest, the overall accuracy of random forest, extreme gradient boosting, and CatBoost is 94.4%, 94.54%, and 97.6%, respectively. Finally, we selected CatBoost with one versus rest for artifact development, model deployment, risk factor analysis, and rule generation because it achieved the best performance, with 97.6% accuracy. We identified the most determinant risk factors using feature importance. They include the duration of the current pregnancy, age in 5-year groups, source of drinking water, respondent's occupation, number of household members, wealth index, husband/partner's education level, and birth history.
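The class decomposition and experiment grid described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the four anemia level names, the helper function names, and the rounding of the 80/20 split are assumptions made for illustration, since the abstract does not state them. The sketch only shows how one versus rest and one versus one turn a multiclass target into binary tasks, and how three algorithms and three decomposition settings give nine experiments.

```python
from itertools import combinations, product

# Assumed class labels; the abstract does not list the anemia levels.
LEVELS = ["none", "mild", "moderate", "severe"]


def one_vs_rest_tasks(classes):
    """One binary task per class: that class against all the others."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def relabel_one_vs_rest(labels, positive):
    """Binary targets for a single one-vs-rest task (1 = positive class)."""
    return [1 if y == positive else 0 for y in labels]


def predict_one_vs_rest(scores):
    """Pick the class whose binary model returned the highest score."""
    return max(scores, key=scores.get)


# The nine experiments: three algorithms times three decomposition settings.
ALGORITHMS = ["random forest", "extreme gradient boosting", "CatBoost"]
STRATEGIES = ["none", "one-vs-one", "one-vs-rest"]
EXPERIMENTS = list(product(ALGORITHMS, STRATEGIES))

# 80/20 split of the 29104 instances; rounding the test set down is an assumption.
N_INSTANCES = 29104
N_TEST = N_INSTANCES * 20 // 100
N_TRAIN = N_INSTANCES - N_TEST
```

With k classes, one versus rest trains k binary models while one versus one trains k(k-1)/2, so under the assumed four levels the two settings would need four and six binary models per algorithm, respectively.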