Position: Thematic Analysis of Unstructured Clinical Transcripts with Large Language Models
Abstract
This position paper examines how large language models (LLMs) can support thematic analysis of unstructured clinical transcripts, a widely used but resource-intensive method for uncovering patterns in patient and provider narratives. We conducted a systematic review of recent studies applying LLMs to thematic analysis, complemented by an interview with a practicing clinician. Our findings reveal that current approaches remain fragmented across multiple dimensions, including the types of thematic analysis, datasets, prompting strategies, and models used, and most notably in evaluation. Existing evaluation methods vary widely (from qualitative expert review to automatic similarity metrics), hindering progress and preventing meaningful benchmarking across studies. We argue that establishing standardized evaluation practices is critical for advancing the field. To this end, we propose an evaluation framework centered on three dimensions: validity, reliability, and interpretability.