NeurIPS NexusRaven: A Commercially-Permissive Language Model for Function Calling

Oral in Workshop: Foundation Models for Decision Making

NexusRaven: A Commercially-Permissive Language Model for Function Calling

Venkat Krishna Srinivasan · Zhen Dong · Banghua Zhu · Brian Yu · Hanzi Mao · Damon Mosk-Aoyama · Kurt Keutzer · Jiantao Jiao · Jian Zhang


Abstract:

The rise of open-source, commercially permissive large language models (LLMs) is revolutionizing generative AI, offering organizations greater control, lower data risk, and cost benefits compared with proprietary models. However, in the field of tool use and function calling, many open-source models, such as Gorilla and ToolLLAMA, depend on proprietary LLMs like GPT-4 for high-quality training data, and the use of such data often faces legal restrictions in competitive commercial applications. In this paper, we introduce NexusRaven-13B, an open-source LLM designed for function calling. Derived from the CodeLLAMA-13B lineage, NexusRaven-13B employs a data curation approach based on multi-step refinement, which yields high-quality training data without relying on GPT-4 distillation. NexusRaven-13B matches GPT-3.5 in zero-shot function-calling accuracy. When combined with our second core technique, demonstration retrieval augmentation, its performance significantly surpasses GPT-4. The code, model, and demo will be available after the review process.
