Isabel Michelle
Abstract
Mental health communication in India is linguistically fragmented, culturally diverse, and often underrepresented in clinical NLP Khanuja et al. [2021], Prakash et al. [2024]. Current health ontologies and mental health resources are dominated by diagnostic frameworks centered on English and Western culture, leaving a gap in the representation of patient distress expressions in Indian languages Kirmayer et al. [2017], Paniagua [2018]. To address this, we propose Cross-Lingual Graphs of Patient Distress Expressions (CL-PDE), a framework for building cross-lingual mental health ontologies with graph-based methods that capture culturally embedded expressions of distress, align them across languages, and link them to clinical terminology (Figure ??). The corpus comprises patient narratives from counseling transcripts, helplines, online forums, and community health workers across Indian languages, capturing socioeconomic and regional diversity and annotated with linguistic and cultural context. The data is modeled as a heterogeneous graph with two node types: Expression Nodes (culture-bound idioms and metaphors) and Concept Nodes (diagnostic categories from ICD-11, DSM-5, and the DSM-5 Cultural Concepts of Distress). Edges encode intra-lingual links (related expressions within one language), cross-lingual links (equivalent expressions across languages), and expression-concept links (patient language to clinical categories), each annotated with metadata for transparency. This structure preserves culturally grounded expressions that have no direct clinical equivalent while still providing pathways to standardized psychiatric frameworks. Graph construction integrates multilingual LLMs with human-in-the-loop validation to handle uncertain, context-dependent mappings.
Our approach addresses critical gaps in healthcare communication by grounding AI systems in culturally valid representations, enabling more inclusive and patient-centric NLP tools for mental health care in multilingual contexts.
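To make the graph structure described in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a plain in-memory representation; the names `DistressGraph`, `EdgeType`, `add_edge`, and `concepts_for`, the metadata keys `confidence` and `human_validated`, and all node identifiers and texts are hypothetical placeholders introduced only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeType(Enum):
    INTRA_LINGUAL = "intra_lingual"            # related expressions, same language
    CROSS_LINGUAL = "cross_lingual"            # equivalent expressions, different languages
    EXPRESSION_CONCEPT = "expression_concept"  # patient language -> clinical category


@dataclass(frozen=True)
class ExpressionNode:
    node_id: str
    text: str       # placeholder for a culture-bound idiom or metaphor
    language: str


@dataclass(frozen=True)
class ConceptNode:
    node_id: str
    label: str
    source: str     # e.g. "ICD-11" or "DSM-5"


@dataclass
class Edge:
    src: str
    dst: str
    edge_type: EdgeType
    metadata: dict = field(default_factory=dict)


class DistressGraph:
    """Hypothetical heterogeneous graph with typed nodes and typed edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, src, dst, edge_type, **metadata):
        a, b = self.nodes[src], self.nodes[dst]
        if edge_type is EdgeType.EXPRESSION_CONCEPT:
            ok = isinstance(a, ExpressionNode) and isinstance(b, ConceptNode)
        else:
            ok = isinstance(a, ExpressionNode) and isinstance(b, ExpressionNode)
            if ok:
                same_lang = a.language == b.language
                ok = same_lang if edge_type is EdgeType.INTRA_LINGUAL else not same_lang
        if not ok:
            raise ValueError(f"{edge_type.value} edge not allowed from {src} to {dst}")
        self.edges.append(Edge(src, dst, edge_type, metadata))

    def concepts_for(self, expr_id, min_confidence=0.0):
        """Return concept nodes linked to an expression at or above a confidence threshold."""
        return [
            self.nodes[e.dst]
            for e in self.edges
            if e.src == expr_id
            and e.edge_type is EdgeType.EXPRESSION_CONCEPT
            and e.metadata.get("confidence", 0.0) >= min_confidence
        ]


# Illustrative placeholders only; none of these entries come from the corpus.
g = DistressGraph()
g.add_node(ExpressionNode("hi:1", "<Hindi expression A>", "hi"))
g.add_node(ExpressionNode("hi:2", "<Hindi expression B>", "hi"))
g.add_node(ExpressionNode("ta:1", "<Tamil expression C>", "ta"))
g.add_node(ConceptNode("concept:A", "<clinical category>", "ICD-11"))

g.add_edge("hi:1", "hi:2", EdgeType.INTRA_LINGUAL, annotator="a1")
g.add_edge("hi:1", "ta:1", EdgeType.CROSS_LINGUAL, confidence=0.7, human_validated=False)
g.add_edge("hi:1", "concept:A", EdgeType.EXPRESSION_CONCEPT, confidence=0.9, human_validated=True)
```

In this sketch an expression with no expression-concept edge, such as `ta:1`, simply returns an empty list from `concepts_for`, which mirrors the idea of keeping culturally grounded expressions in the graph even when they have no direct clinical equivalent. The per-edge metadata dictionary is one possible place to record the uncertainty and human validation status that the abstract mentions.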