Bridging Reading Accessibility Gaps: Responsible Multimodal Simplification with Generative AI
Abstract
We present a multimodal, retrieval-augmented system that simplifies text and images to improve accessibility in education, healthcare, and technical domains. The system integrates Age-of-Acquisition guidance, word-sense disambiguation, graph-based retrieval-augmented generation (RAG), image captioning, and human-in-the-loop feedback. Across 14,000 items, it improves readability over GPT-4 baselines (+22.21\% SARI, +14.11\% Flesch), increases domain retrieval precision (+11\%), and yields further gains when user feedback is incorporated (+8\% SARI, +15\% satisfaction). In classroom use with 200 K--12 students and additional professional cohorts, users rated outputs as easier to understand and more useful. This Creative Demo highlights how responsible AI design can support accessibility while maintaining semantic integrity.