Editing Language Models with Natural Language Feedback
in Affinity Workshop: Muslims in ML
Abstract
Even the most sophisticated language models are prone to inaccuracies, bias, and obsolescence, which highlights the need for efficient model editing. Model editing alters a model’s knowledge or representations to achieve specific outcomes without extensive retraining. Prior research has focused on editing factual knowledge within a narrow scope, limited to knowledge triplets of the form (subject, relation, object). Yet as language model applications broaden, so does the need for more diverse editing approaches. In this talk, I will describe our work, which introduces a novel dataset in which edit requests are natural language sequences, expanding editing beyond factual adjustments to a broader suite of modifications, including bias mitigation. This not only improves the accuracy of language models but also makes them more adaptable to evolving information and application demands.