FoCus: Improving Faithfulness in Chain-of-Thoughts by Training on Structured Reasoning Data
Abstract
Chain-of-Thought (CoT) prompting improves the interpretability of large language models (LLMs) but often lacks faithfulness, yielding post-hoc rationalizations that can be unreliable. To address this issue, we propose FoCus, a condition-grounded framework that explicitly enumerates the conditions of a problem and grounds reasoning on them. Using a two-stage pipeline, FoCus generates faithful reasoning traces that are then used to fine-tune LLMs. On four reasoning benchmarks, FoCus improves average faithfulness by up to 22.95% for DeepSeek-Qwen3-8B, 31.05% for Nemotron-7B, and 29.40% for Qwen3-8B over both the original models and prompt-engineered baselines. These findings demonstrate that explicit condition grounding is an effective strategy for enhancing faithful reasoning in LLMs.