Mitigating Lies in Vision-Language Models
Junbo Li · Xianhang Li · Cihang Xie

In this work, we bring new insights into the honesty of vision-language models, particularly in visual question answering (VQA). After a thorough revisit of the existing ‘lie’ behavior in pure language models, our work extends the notion of ‘lies’ to vision-language models for the first time. The results indicate that lie prefixes have a more pronounced misleading effect on vision-language models than on language models. We also propose a novel visual prefix and show that a consistent vision-language prefix is more threatening to vision-language models. To defend the models against these ‘lies’, we put forward an unsupervised framework based on Gaussian mixture modeling and obtain improvements of 3% against the language prefix and 12% against the vision-language prefix.

Author Information

Junbo Li (University of California, Santa Cruz)
Xianhang Li (University of Central Florida)
Cihang Xie (University of California, Santa Cruz)
