Timezone: »

Training language models for deeper understanding improves brain alignment
Khai Loong Aw · Mariya Toneva
Event URL: https://openreview.net/forum?id=CLPNdsxZCs- »

Building systems that understand information across long contexts is one important goal in natural language processing (NLP). One approach in recent works is to scale up model architectures to accept very long inputs and then train them on datasets to learn to extract critical information from long input texts. However, it is still an open question whether these models are simply learning a heuristic to solve the tasks, or really learning to understand information across long contexts. This work investigates this further by turning to the one system with truly long-range and deep language understanding: the human brain. We show that training language models for long-range narrative understanding results in richer representations that have improved alignment to human brain activity. This suggests they have indeed improved understanding across long contexts. However, although these models can take in thousands of input words, their brain alignment peaks after only 500 words. This suggests possible limitations with either model training or architecture. Overall, our findings have consequences both for cognitive neuroscience by revealing some of the significant factors behind brain-NLP alignment, and for NLP by highlighting limitations with existing approaches for longer-range understanding.

Author Information

Khai Loong Aw (Max Planck Institute for Software Systems (MPI-SWS))
Khai Loong Aw

My research interests are in Machine Learning as well as the long-term goal of working on artificial general intelligence. Towards this goal, I have done research internships working on Machine Learning, Computer Vision and the intersection of Neuroscience and Natural Language Processing (NLP). I am currently working with Dr Mariya Toneva on Bridging AI and Neuroscience. We use human brain activity recordings to study, interpret & improve neural network representations in NLP.

Mariya Toneva (MPI for Software Systems)

More from the Same Authors