
Workshop: NeurIPS 2023 Workshop on Machine Learning for Creativity and Design

HARP: Bringing Deep Learning to the DAW with Hosted, Asynchronous, Remote Processing

Hugo Flores Garcia · Christodoulos Benetatos · Patrick O'Reilly · Aldo Aguilar · Zhiyao Duan · Bryan Pardo

Sat 16 Dec 1:30 p.m. PST — 2:30 p.m. PST


Deep learning models have the potential to transform how artists interact with audio across a range of creative applications. While digital audio workstations (DAWs) like Logic or Pro Tools are the most popular software environments for producing audio, state-of-the-art deep learning models are typically available as Python repositories or web demonstrations (e.g., Gradio apps). Attempts to bridge this divide have focused on deploying lightweight models as DAW plug-ins that run in real time, locally on the CPU. This often requires significant modifications to the models, and precludes large, compute-heavy models and alternative interaction paradigms (e.g., text-to-audio). To bring state-of-the-art models into the hands of artistic creators, we release HARP, a free Audio Random Access (ARA) plug-in for DAWs. HARP supports [h]osted, [a]synchronous, [r]emote [p]rocessing with deep learning models by routing audio from the DAW through Gradio endpoints. Through HARP, Gradio-compatible models hosted on the web (e.g., on Hugging Face Spaces) can become directly usable within the DAW. Using our API, developers can define interactive controls and audio processing logic within their Gradio endpoint. A sound artist can then enter the model's URL into a dialog box on the HARP plug-in, and the plug-in interface will automatically populate controls, prepare routing, and render any processed audio. Thus, sound artists can create and modify audio using deep learning models in-DAW, maintaining an unbroken creative workflow.
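
The division of labor described above (a developer declares controls and processing logic, and the host populates its interface and renders audio) can be illustrated with a minimal sketch. This is not HARP's API and does not use Gradio: `Slider`, `Endpoint`, `apply_gain`, and `render` are hypothetical names invented for illustration, the "model" is a plain gain stage, and networking and asynchronous dispatch are omitted.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Slider:
    """Hypothetical control declaration (not HARP's actual API)."""
    name: str
    minimum: float
    maximum: float
    default: float


@dataclass
class Endpoint:
    """Hypothetical stand-in for a remotely hosted model endpoint."""
    controls: List[Slider]
    process: Callable[[List[float], Dict[str, float]], List[float]]

    def describe_controls(self) -> List[dict]:
        # Metadata a host could query to build its interface automatically.
        return [asdict(control) for control in self.controls]


def apply_gain(samples: List[float], params: Dict[str, float]) -> List[float]:
    # Placeholder "model": scale each sample and clip to [-1.0, 1.0].
    gain = params["gain"]
    return [max(-1.0, min(1.0, sample * gain)) for sample in samples]


def render(endpoint: Endpoint, samples: List[float],
           overrides: Optional[Dict[str, float]] = None) -> List[float]:
    # Host side: start from the declared defaults, apply the user's settings,
    # then hand the audio to the endpoint's processing function.
    params = {control.name: control.default for control in endpoint.controls}
    params.update(overrides or {})
    return endpoint.process(samples, params)


# Developer side: declare one control and the processing logic.
endpoint = Endpoint(controls=[Slider("gain", 0.0, 2.0, 1.0)], process=apply_gain)

if __name__ == "__main__":
    print(endpoint.describe_controls())
    print(render(endpoint, [0.5, -0.25], {"gain": 2.0}))
```

In this sketch the host never needs to know what the model does: it reads the declared controls, collects values, and forwards audio plus parameters, which is the property that lets one plug-in front many different models.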
