Machine Learning for the Developing World (ML4D): Global Challenges
Paula Rodriguez Diaz · Konstantin Klemmer · Sally Simone Fobi · Oluwafemi Azeez · Niveditha Kalavakonda · Aya Salama · Tejumade Afonja · Ritwik Gupta

Tue Dec 14 06:00 AM -- 01:40 PM (PST) @ None
Event URL: https://ml4d.notion.site/ml4d/Machine-Learning-for-the-Developing-World-ML4D-2021-548251eab3df4517819c4742c2e5c853

While some nations are regaining normality almost a year and a half after the COVID-19 pandemic struck (schools are reopening, face mask mandates are being dropped, economies are recovering), other nations, especially developing ones, are in the midst of their most critical scenarios in terms of health, economy, and education. Although the ongoing pandemic has been a global challenge, it has had local consequences and created needs in developing regions that are not necessarily shared globally. This situation makes us question how global challenges such as access to vaccines, good internet connectivity, sanitation, and water, as well as poverty, climate change, and environmental degradation, among others, have had and will have local consequences in developing nations, and how machine learning approaches can assist in designing solutions that take these local characteristics into account.

Past iterations of the ML4D workshop have explored the development of smart solutions for intractable problems, the challenges and risks that arise when deploying machine learning models in developing regions, and the building of machine learning models with improved resilience. This year, we call on our community to identify and understand the particular challenges and consequences that global issues may bring about in developing regions, while proposing machine learning-based solutions for tackling them.

Additionally, in light of COVID-19's global and local consequences, we will dedicate part of the workshop to understanding the challenges that machine learning research in developing regions has faced since the pandemic started. We aim to support and incentivize ML4D research while considering current challenges by including new sections, such as a guidance and mentorship session for project proposals and a round table session focused on understanding the constraints faced by researchers in our community.

Tue 6:00 a.m. - 6:15 a.m.
Opening Remarks
Tue 6:15 a.m. - 6:18 a.m.
Invited Talk 1 - Intro (Intro to Invited Talk)
Tue 6:18 a.m. - 6:45 a.m.
Invited Talk 1 (Invited Talk)
Milind Tambe
Tue 6:45 a.m. - 6:55 a.m.
Invited Talk 1 - Q&A (Invited Talk - Q&A)
Tue 6:55 a.m. - 6:57 a.m.
Contributed Talk 1 - Intro (Contributed Talk - Intro)
Tue 6:55 a.m. - 7:10 a.m.
(Contributed Talk)   

Text detection in natural scene images has applications in autonomous driving and in navigation assistance for elderly and blind people. However, research on Urdu text detection is usually hindered by a lack of data resources. We have developed a dataset of scene images with Urdu text. We present the use of machine learning methods to detect Urdu text in scene images. We extract text regions using a channel-enhanced Maximally Stable Extremal Region (MSER) method. First, we classify text and noise based on their geometric properties. Next, we use a support vector machine (SVM) for early discarding of non-text regions. To further remove non-text regions, we extract histogram of oriented gradients (HoG) features and train a second SVM classifier. This improves the overall performance of text region detection within the scene images. To support research on Urdu text, we aim to make the data freely available for research use. We also aim to highlight the challenges and the research gap for Urdu text detection.

Hazrat Ali
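
The geometric filtering step mentioned in the abstract above (separating text from noise by the shape of each candidate region) could look roughly like the following. This is only a minimal sketch: the `Region` type, the function names, and every threshold value are hypothetical and are not taken from the authors' work, and the MSER extraction and SVM stages are not shown.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A candidate region, e.g. one produced by an MSER detector (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int
    pixel_count: int  # number of pixels that belong to the region


def looks_like_text(region, min_area=30, max_aspect=8.0, min_fill=0.2):
    """Return True if the region passes simple geometric checks.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    box_area = region.width * region.height
    if box_area < min_area:
        return False  # too small to be a legible character or word
    short_side = min(region.width, region.height)
    long_side = max(region.width, region.height)
    if long_side / short_side > max_aspect:
        return False  # extremely elongated regions are likely lines or edges
    fill_ratio = region.pixel_count / box_area
    return fill_ratio >= min_fill  # very sparse regions are likely noise


def filter_regions(regions, **thresholds):
    """Keep only the candidate regions that pass the geometric checks."""
    return [r for r in regions if looks_like_text(r, **thresholds)]
```

In a pipeline like the one described, the regions that survive this cheap filter would then be passed to the classifier stages.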
Tue 7:10 a.m. - 7:13 a.m.
Poster Session - Intro/Info (Intro)
Tue 7:15 a.m. - 8:40 a.m.
Poster Session
Tue 8:40 a.m. - 8:43 a.m.
Invited Talk 2 - Intro (Intro to Invited Talk)
Tue 8:43 a.m. - 9:10 a.m.
Invited Talk 2 (Invited Talk)
Stephanie Sy
Tue 9:10 a.m. - 9:20 a.m.
Invited Talk 2 - Q&A (Q&A)
Tue 9:20 a.m. - 9:25 a.m.
Problem Pitches - Intro (Intro)
Tue 9:25 a.m. - 10:55 a.m.
Problem Pitches Session
Tue 11:10 a.m. - 11:15 a.m.
Contributed Talks 2 and 3 - Intro (Intro)
Tue 11:15 a.m. - 11:30 a.m.
(Contributed Talk)   

In this paper, we seek to reduce the communication barrier between the hearing-impaired community and the larger society, which is usually not familiar with sign language, in sub-Saharan Africa, the region with the largest occurrence of hearing disability cases, using Nigeria as a case study. The dataset is a pioneer dataset for Nigerian Sign Language and was created in collaboration with relevant stakeholders. We preprocessed the data in readiness for two different object detection models and a classification model, and employed diverse evaluation metrics. We convert the sign texts to speech and deploy the best-performing model in a lightweight sign-to-speech machine learning application that works in real time and achieves impressive results converting sign words/phrases to text and speech.

Steven Kolawole Kolawole
Tue 11:30 a.m. - 11:45 a.m.
(Contributed Talk)   

Governments and international organizations the world over are investing towards the goal of achieving universal energy access for improving socio-economic development. However, in developing settings, monitoring of electrification efforts is typically inaccurate, infrequent, and expensive. In this work, we develop and present techniques for high-resolution monitoring of electrification progress at scale. Specifically, our three unique contributions are: (i) identifying areas with and without electricity access, (ii) quantifying the extent of electrification in electrified areas (percentage/number of electrified structures), and (iii) differentiating between customer types in electrified regions (estimating the percentage/number of residential/non-residential electrified structures). We combine high-resolution 50 cm daytime satellite images with Convolutional Neural Networks (CNNs) to train a series of classification and regression models. We evaluate our models using unique ground truth datasets on building locations, building types (residential/non-residential), and building electrification status. Our classification models show 92% accuracy in identifying electrified regions, 85% accuracy in estimating the percentage of (low/high) electrified buildings within the region, and 69% accuracy in differentiating between (low/high) percentages of electrified residential buildings. Our regressions show R² scores of 78% and 80% in estimating the number of electrified buildings and the number of residential electrified buildings in images, respectively. We also demonstrate the generalizability of our models in never-before-seen regions to assess their potential for consistent and high-resolution measurements of electrification in emerging economies, and conclude by highlighting opportunities for improvement.

Zeal Shah
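
For readers unfamiliar with the two kinds of metric quoted in the abstract above, the sketch below shows how classification accuracy and an R² score are conventionally computed. It assumes that "R2" refers to the standard coefficient of determination; the function names are illustrative and the code is not taken from the authors' evaluation.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the ground-truth labels."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for l, p in zip(labels, predictions) if l == p)
    return correct / len(labels)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot
```

Under this definition, an R² of 1.0 means the predicted building counts match the ground truth exactly, while 0.0 means the model does no better than always predicting the mean count.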
Tue 11:45 a.m. - 12:45 p.m.
Panel Session (Panel)
Rajius Idzalika · Kathleen Siminyu · David Hughes · Alvaro Riascos
Tue 12:45 p.m. - 1:00 p.m.
Virtual Coffee Break (Break)
Tue 1:00 p.m. - 1:03 p.m.
Invited Talk 3 - Intro (Intro)
Tue 1:03 p.m. - 1:30 p.m.
Invited Talk 3 (Invited Talk)
Tue 1:30 p.m. - 1:40 p.m.
Invited Talk 3 - Q&A (Q&A)

Author Information

Paula Rodriguez Diaz (Harvard University (SEAS))
Konstantin Klemmer (University of Warwick, The Alan Turing Institute)
Sally Simone Fobi (Columbia University)
Oluwafemi Azeez (Borealis AI)
Niveditha Kalavakonda (University of Washington)
Aya Salama (Aigorithm Tech)
Tejumade Afonja (Saarland University)

Tejumade Afonja is a graduate student at Saarland University studying Computer Science. Previously, she worked as an AI Software Engineer at InstaDeep Nigeria. She holds a B.Tech in Mechanical Engineering from Ladoke Akintola University of Technology (2015). She’s currently a remote research intern at Vector Institute, where she is conducting research in the areas of privacy, security, and machine learning. Tejumade is the co-founder of AI Saturdays Lagos, an AI community in Lagos, Nigeria focused on conducting research and teaching machine learning related subjects to Nigerian youths. Tejumade is one of the 2020 Google EMEA Women Techmakers Scholars. Tejumade was a co-organizer of the ML4D 2019 NeurIPS workshop and is serving as the lead organizer this year. She is affiliated with several other workshops and communities, such as BIA, WIML, ICLR, Deep Learning Indaba, AI4D, and DSA, where she occasionally serves as a volunteer or mentor.

Ritwik Gupta (University of California, Berkeley)

I am currently a first year Ph.D. student at the University of California, Berkeley co-advised by Drs. Trevor Darrell and Shankar Sastry. My focus is on efficient machine learning for humanitarian assistance and disaster response and the policy surrounding the use of ML in developing areas. I am also the Founder and President of Neural Tangent, a company aimed at creating ML solutions to humanitarian assistance and disaster response problems. I also provide consulting in the space of machine learning, artificial intelligence, edge computing, and remote sensing. Feel free to poke around the site and hopefully you find something thought provoking. If you’re going to be stopping by Berkeley at some point, please reach out if you want a tour or just want to chat!
