Poster in Workshop: CogInterp: Interpreting Cognition in Deep Learning Models

Deconstructing the Reasoning Process of a Neuro-Fuzzy Agent: From Learned Concepts to Natural Language Narratives

Yumin Zhou · Whye Tung · Hiok Quek


Abstract

A key goal in AI is to understand the internal cognitive processes that drive model decisions by analyzing their underlying algorithms and representations. We present a neuro-fuzzy framework designed to instantiate and analyze a complete cognitive pipeline within a "glass-box" agent. Our framework provides a transparent, multi-level cognitive account by showing how an agent: (1) develops its own perceptual concepts from raw data via regularized end-to-end learning; (2) processes information using these concepts in an explicit, dynamic symbolic reasoning algorithm; and (3) organizes its low-level processing into high-level behavioral strategies, which we reveal by abstracting thousands of raw rules into a handful of core "mental models". By modeling this entire pipeline, we offer a concrete methodology for building and dissecting AI systems whose learned cognitive processes are transparent by design.
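
As a rough illustration of stage (2) only, the sketch below shows how explicit fuzzy rules over named concepts can yield both an output and a readable trace of which rule dominated. It is not the authors' implementation: the Gaussian membership functions, the min t-norm, the weighted-average output, and every name and value in it (`CONCEPTS`, `RULES`, `infer`) are assumptions chosen for illustration. In the framework described above the concepts are learned from raw data, whereas here they are fixed by hand.

```python
import math

def gaussian_membership(x, center, width):
    """Degree in [0, 1] to which x belongs to a fuzzy concept."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Hypothetical concepts as (center, width); hand-set here, not learned.
CONCEPTS = {
    "low": (0.0, 0.25),
    "high": (1.0, 0.25),
}

# Hypothetical rules: one concept label per input feature, plus a crisp output.
RULES = [
    (("low", "low"), 0.0),
    (("low", "high"), 0.5),
    (("high", "low"), 0.5),
    (("high", "high"), 1.0),
]

def firing_strength(inputs, antecedent):
    """Combine per-feature memberships with the min t-norm (fuzzy AND)."""
    return min(
        gaussian_membership(x, *CONCEPTS[label])
        for x, label in zip(inputs, antecedent)
    )

def infer(inputs):
    """Return (output, dominant_antecedent) for a tuple of input features."""
    strengths = [firing_strength(inputs, antecedent) for antecedent, _ in RULES]
    total = sum(strengths)
    if total == 0.0:
        # No rule fires at all (memberships underflowed to zero).
        return 0.0, None
    # Weighted average of rule outputs by firing strength.
    output = sum(s * out for s, (_, out) in zip(strengths, RULES)) / total
    # The strongest rule gives a human-readable account of the decision.
    dominant = max(range(len(RULES)), key=strengths.__getitem__)
    return output, RULES[dominant][0]

if __name__ == "__main__":
    value, rule = infer((1.0, 1.0))
    print(f"output={value:.3f}, dominant rule: IF {rule[0]} AND {rule[1]}")
```

Because every intermediate quantity is a membership degree or a rule firing strength attached to a named concept, the decision can be read back as an IF-THEN statement, which is the kind of transparency the abstract refers to as "glass-box".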
