Deep Learning for Language Documentation and Revitalization

When

3 to 4:30 p.m., Nov. 3, 2023

Deep learning algorithms have progressed enough in recent years that they can be used to build tools that aid language documentation and revitalization. In this talk I will focus on examples of NLP work for Cook Islands Māori, while also drawing on examples from Indigenous languages of Costa Rica. These examples include speech recognition and synthesis, syntactic parsing, phonetic documentation, machine translation and embedding analysis. We have used these to augment corpora, help train language teachers, expand domains of usage for the language, and explore the applicability of deep learning to these tasks. I will share some of the lessons learned from working in extremely low-resource scenarios, and discuss the differences between simulating low-resource computing and actually working with Indigenous and minority languages.