AmericasNLP aims to encourage and increase the visibility of research on machine learning approaches for Indigenous languages of the Americas, which, until recently, have often been overlooked by researchers. For the Second AmericasNLP Competition: Speech-to-Text Translation for Indigenous Languages of the Americas, we ask participants to develop, or contribute to the development of, speech-to-text translation systems for five Indigenous languages of the Americas (Bribri, Guaraní, Kotiria, Quechua, and Wa’ikhana), for which available resources are extremely limited. The main task of this competition is speech-to-text translation, and we additionally invite submissions to its two subtasks: automatic speech recognition and text-to-text machine translation.
Wed 5:00 a.m. - 5:15 a.m.
|
Fifteen-minute Competition Overview Video
(
Overview
)
Welcome, workshop schedule, and competition overview. |
Abteen Ebrahimi · Katharina Kann · Thang Vu · Sofía Flores-Solórzano · Rolando Coto-Solano · Rodolfo Joel Zevallos Salazar · Pavel Denisov · Manuel Mager · Luis Chiruzzo · Kristine Stenzel · John E. Ortega · Ivan Vladimir Meza Ruiz · Hilaria Cruz · Félix Arturo Oncevay Marcos · Alexis Palmer · Aldo Alvarez · Adam Wiemerslage
|
Wed 5:15 a.m. - 6:00 a.m.
|
Challenges and Opportunities in NLP for Under-represented Languages
(
Invited Talk by Sebastian Ruder
)
Natural language processing (NLP) technology has seen tremendous improvements in recent years, but most of these successes have been concentrated in languages with large amounts of data. In this talk, I will discuss challenges and potential solutions on the way to scaling NLP to more of the world's 7000 languages. In particular, I will highlight recent progress in NLP for African languages and present methods that are applicable to languages with limited data, such as employing alternative sources of data and multi-modal information. |
|
Wed 6:00 a.m. - 7:15 a.m.
|
Poster Session: Competition Participants
(
Poster Session
)
The teams that participated in the Second AmericasNLP Competition: Speech-to-Text Translation for Indigenous Languages of the Americas present their approaches, one poster per team. |
|
Wed 7:15 a.m. - 8:00 a.m.
|
Challenges in Achieving a Corpus Infrastructure to Advance Research in Computational Linguistics and Natural Language Processing in Native American Languages
(
Invited Talk by Hilaria Cruz
)
Natural Language Processing researchers and computational linguists frequently express disappointment and frustration over the lack of corpora in endangered languages that they can use to train and test their language models. This hindrance is caused in large part by a dwindling number of speakers and language keepers who can create new data such as stories, prayers, political speeches, and everyday conversation. Coupled with this is a severe lack of capacity among speakers of endangered languages to prepare a corpus, including a shortage of transcribers, annotators, and translators. What can NLP researchers do to help create and facilitate corpora in these languages? Collaborating with communities to increase their capacity to develop corpora with community members would be a first step. Furthermore, teaching basic programming courses in local high schools and colleges, working with legacy materials in language archives, and doing fieldwork to collect data alongside community members would greatly enhance the creation of endangered language corpora for NLP. |
|