
DataMUX: Data Multiplexing for Neural Networks
Vishvak Murahari · Carlos Jimenez · Runzhe Yang · Karthik Narasimhan

Tue Nov 29 02:00 PM -- 04:00 PM (PST) @ Hall J #806
In this paper, we introduce data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation. DataMUX demonstrates that neural networks are capable of generating accurate predictions over mixtures of inputs, resulting in increased inference throughput with minimal extra memory requirements. Our approach uses two key components: 1) a multiplexing layer that applies a fixed linear transformation to each input before combining them to create a "mixed" representation of the same size as a single input, which is then processed by the base network, and 2) a demultiplexing layer that converts the base network's output back into independent representations before producing predictions for each input. We show the viability of DataMUX for different architectures (Transformers, and to a much lesser extent MLPs and CNNs) across six different tasks spanning sentence classification, named entity recognition, and image classification. For instance, DataMUX for Transformers can multiplex up to 20x/40x inputs, achieving up to an 11x/18x increase in inference throughput with absolute performance drops of <2% and <4%, respectively, compared to a vanilla Transformer on MNLI, a natural language inference task. We also provide a theoretical construction for multiplexing in self-attention networks and analyze the effect of various design elements in DataMUX.
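The two components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the fixed transformations are random Gaussian matrices, that the inputs are combined by averaging, and that the base network is the identity. The demultiplexing heads here are untrained random matrices, included only to show the shapes involved. All function names (`make_transforms`, `multiplex`, `demultiplex`) are hypothetical.

```python
import random


def make_transforms(n_inputs, dim, seed=0):
    """Build one fixed random dim x dim matrix per input slot (assumed form)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]
        for _ in range(n_inputs)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def multiplex(inputs, transforms):
    """Apply transform i to input i, then average into a single vector."""
    n = len(inputs)
    mixed = [0.0] * len(inputs[0])
    for x, phi in zip(inputs, transforms):
        for j, value in enumerate(matvec(phi, x)):
            mixed[j] += value / n
    return mixed


def demultiplex(hidden, heads):
    """Map one shared hidden vector to one output vector per input slot."""
    return [matvec(head, hidden) for head in heads]


N, DIM = 4, 8
rng = random.Random(1)
inputs = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N)]

transforms = make_transforms(N, DIM, seed=0)
mixed = multiplex(inputs, transforms)      # same size as a single input
hidden = mixed                             # stand-in for the base network
heads = make_transforms(N, DIM, seed=2)    # untrained, for shapes only
outputs = demultiplex(hidden, heads)

print(len(mixed), len(outputs), len(outputs[0]))  # prints: 8 4 8
```

The point of the sketch is the shape bookkeeping: N inputs of dimension DIM collapse into one vector of dimension DIM, so the base network runs once rather than N times, and the demultiplexing step then produces N separate outputs.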

Author Information

Vishvak Murahari (Princeton University)

Vishvak Murahari is a third-year Ph.D. student at Princeton University focusing on natural language processing and machine learning, advised by Prof. Karthik Narasimhan. He is also a student researcher at Google Brain. He earned his Master's in Computer Science from Georgia Tech, where he was advised by Prof. Devi Parikh and Abhishek Das and worked closely with Prof. Dhruv Batra. He earned his Bachelor's in Computer Science (focus on AI and Devices) from Georgia Tech, where he was advised by Prof. Thomas Ploetz and worked closely with Prof. Aman Parnami. He has previously interned at the Allen Institute for Artificial Intelligence and Microsoft and has deep experience in building both research and applied ML systems. He has published multiple papers at top ML conferences such as NeurIPS, EMNLP, and ECCV.

Carlos Jimenez (Princeton University)

I’m a PhD student at Princeton University, advised by Prof. Karthik Narasimhan. I study AI and ML for natural language processing. My research interests include representation learning, multi-modal learning, task-oriented dialogue, and AI safety.

Runzhe Yang (Princeton University)

I’m “Tony” Runzhe Yang (杨闰哲), currently a final-year Ph.D. student in the Computer Science Department and Neuroscience Institute at Princeton University. Previously, I worked as a research intern at Google Brain, the Simons Foundation, and Cornell University. I received my Bachelor’s degree in Computer Science from the ACM Honors Class, Zhiyuan College, SJTU. Research interests: NLP, RL, and NeuroAI.

Karthik Narasimhan (Princeton University)
