Timezone: »

How to Win LMs and Influence Predictions: Using Short Phrases to Control NLP Models
Sameer Singh

Current NLP pipelines rely significantly on finetuning large pre-trained language models. Relying on this paradigm makes such pipelines challenging to use in real-world settings since massive task-specific models are neither memory- nor inference-efficient, nor do we understand how they fare in adversarial settings. This talk will describe our attempts to address these seemingly unrelated concerns by investigating how specific short phrases in the input can control model behavior. These short phrases (which we call triggers) will help us identify model vulnerabilities and introduce new paradigms of training models.
In the first part of the talk, I will focus on the adversarial setting. I will show how easy it is for adversaries to craft triggers that cause a target model to misbehave when the trigger appears in the input. In the second part of the talk, I will show how these triggers can also be used to “prompt” language models to act as task-specific models, providing a negligible-memory, no-learning way to create classifiers. I will end with a comprehensive study of the interplay between prompting and finetuning, providing some guidelines for effectively performing few-shot learning with large language models.

Author Information

Sameer Singh (University of California, Irvine)

More from the Same Authors