STAT: Skill-Targeted Adaptive Training
Abstract
Small language models (SLMs) often show little to no improvement when trained on data similar to what is already in their training set (e.g., MATH). We introduce a distillation strategy, STAT, that enables a teacher model to help such a student SLM. The teacher uses its metacognition to create a list of skills needed for the task [Didolkar et al., 2024] and to label each data point with the skills it requires. STAT constructs a missing-skill profile for the SLM by identifying which skills were absent from the model's responses and how frequently each skill was missing. We propose STAT-selected, which performs a weighted selection of training examples according to the missing-skill profile. We also propose STAT-synthetic, an analogous method in which the teacher LLM synthesizes additional examples, again targeted at the missing skills. On MATH, both methods improve Llama-Instruct models by up to 7.5% where naive fine-tuning fails, and they also enhance out-of-distribution performance. These results highlight the promise of skill-aware targeted training for improving SLMs.
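To make the selection step concrete, the following is a minimal sketch of how a missing-skill profile could be tallied and used to reweight training examples. The data structures (`examples` with teacher-assigned `"skills"` labels, and per-example sets of skills the student failed to apply) and function names are illustrative assumptions, not the paper's actual implementation.

```python
import random
from collections import Counter

def missing_skill_profile(examples, student_missing):
    """Count how often each skill was absent from the student's responses.

    examples: list of dicts, each with a "skills" list (teacher-assigned labels).
    student_missing: parallel list of sets of skills the student failed to apply.
    Both structures are hypothetical stand-ins for the teacher/student outputs.
    """
    profile = Counter()
    for ex, missing in zip(examples, student_missing):
        for skill in ex["skills"]:
            if skill in missing:
                profile[skill] += 1
    return profile

def weighted_selection(examples, profile, k, seed=0):
    """Sample k training examples, weighting each one by how often its skills
    appear in the missing-skill profile (the STAT-selected idea, as described
    in the abstract). A small epsilon keeps every example selectable."""
    weights = [sum(profile.get(s, 0) for s in ex["skills"]) + 1e-6
               for ex in examples]
    rng = random.Random(seed)
    return rng.choices(examples, weights=weights, k=k)
```

Under this sketch, examples whose required skills the student most frequently failed to apply are sampled more often, which is the weighted-selection behavior the abstract attributes to STAT-selected.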