Oral Presentation in Affinity Workshop: LatinX in AI
Oral Presentation 10: An Interactive Framework for Identifying Latent Themes in Large Text Collections
Maria Leonor Pacheco
Experts across diverse academic and professional disciplines struggle to make sense of large amounts of linguistic data. Automatically uncovering latent themes in large textual resources remains an open challenge in natural language processing. Traditionally, researchers and practitioners have approached this challenge either with noisy unsupervised techniques such as topic models, or by manually identifying the relevant themes and annotating them in the text. In this paper, we propose an interactive framework that combines computational and qualitative techniques to discover and ground latent themes in large text collections. Our framework strikes a balance between automated techniques and manual coding, allowing experts to maintain control of their study while reducing the manual effort required.