

Poster in Workshop: Machine Learning for Systems

MASE: An Efficient Representation for Software-Defined ML Hardware System Exploration

Cheng Zhang · Jianyi Cheng · Zhewen Yu · Yiren Zhao


Abstract:

Machine learning (ML) accelerators have been studied and used extensively to execute ML models with high performance and low power. However, designing such accelerators normally takes a long time and requires significant effort. Unfortunately, ML software models evolve much faster than the accelerator design cycle, leading to frequent and drastic modifications in model architectures and rendering many accelerators obsolete. Existing design tools and frameworks can provide quick accelerator prototyping, but only for a limited range of models that fit into a single hardware device, such as an FPGA. Furthermore, with the emergence of large language models such as GPT-3, there is an increased need for hardware prototyping of these large models within a many-accelerator system, to ensure the hardware can scale with ever-growing model sizes. The resulting design space is often huge, involving both software and hardware optimization. To address this, we propose a novel representation named MASE IR (Machine-learning Accelerator System Exploration Intermediate Representation) that describes data types, software algorithms, and hardware design constraints. MASE IR opens up opportunities for exploring software and hardware co-optimization at scale. As an application of MASE IR, we implemented a PyTorch-based framework named MASE that automatically optimizes and maps an ML model onto an efficient hardware accelerator system. We believe MASE IR will open new research opportunities for ML system design.
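To make the idea of an IR carrying both software and hardware information concrete, here is a minimal sketch of how per-operator metadata (a fixed-point data type and a hardware parallelism constraint) could be attached to a traced PyTorch graph. This is an illustration only: the metadata keys (`mase_meta`, `precision`, `parallelism`) and the model are assumptions for exposition, not MASE's actual API.

```python
import torch
import torch.fx as fx
import torch.nn as nn

class TinyMLP(nn.Module):
    """A toy model standing in for an arbitrary ML workload."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 8)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

# Trace the model into a graph that passes can analyze and transform.
graph_module = fx.symbolic_trace(TinyMLP())

# Annotate each layer with hypothetical IR-style metadata: a quantized
# data type and an unrolling (parallelism) constraint that a hardware
# mapping pass could consume. Key names here are illustrative.
for node in graph_module.graph.nodes:
    if node.op == "call_module":
        node.meta["mase_meta"] = {
            "precision": {"width": 8, "frac_width": 4},  # fixed-point type
            "parallelism": {"in": 4, "out": 4},          # unroll factors
        }

# A downstream co-optimization pass would read these annotations when
# deciding quantization and hardware resource allocation per operator.
for node in graph_module.graph.nodes:
    if "mase_meta" in node.meta:
        print(node.name, node.meta["mase_meta"])
```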
