NeurIPS Hallucination of Large Language Models in Finance: An Empirical Examination

Poster in Workshop: I Can’t Believe It’s Not Better (ICBINB): Failure Modes in the Age of Foundation Models


Haoqiang Kang · Xiao-Yang Liu


Abstract:

Hallucination is recognized as a fundamental deficiency of large language models (LLMs), especially when they are applied to domains such as finance, education, and law. Despite growing concern, empirical studies remain scarce. In this paper, we empirically examine LLMs' hallucination behaviors in financial tasks. First, we investigate the models' ability to explain financial concepts and terminology. Second, we assess their capacity to query historical stock prices. Third, to alleviate hallucination, we evaluate two practical methods: Retrieval-Augmented Generation (RAG) and zero-shot tool learning, in which the model generates a query command for a function. Finally, we find that off-the-shelf LLMs exhibit serious hallucination behaviors in financial tasks. There is therefore an urgent need for research on mitigating LLMs' hallucination.
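To make the tool-learning idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the model is prompted to emit a query command rather than recall a price from memory, and that the command is then executed against a trusted data source. The command format `get_close(...)`, the function `run_query_command`, the ticker `XYZ`, and the prices are all hypothetical placeholders.

```python
import re
from datetime import date
from typing import Optional

# Made-up closing prices for a fictional ticker; stands in for a real data source.
PRICES = {
    ("XYZ", date(2023, 6, 1)): 100.0,
    ("XYZ", date(2023, 6, 2)): 101.5,
}

# Hypothetical command format that the model is asked to emit.
COMMAND = re.compile(
    r'^get_close\(ticker="([A-Z]+)", date="(\d{4})-(\d{2})-(\d{2})"\)$'
)


def run_query_command(command: str) -> Optional[float]:
    """Execute a model-generated query command against the price table.

    Returns None when the command is malformed or the price is unknown,
    so the caller can decline to answer instead of guessing.
    """
    match = COMMAND.match(command.strip())
    if match is None:
        return None
    ticker, year, month, day = match.groups()
    try:
        day_key = date(int(year), int(month), int(day))
    except ValueError:
        # The text matched the pattern but is not a real calendar date.
        return None
    return PRICES.get((ticker, day_key))


if __name__ == "__main__":
    # In practice the command string would come from the LLM's output.
    print(run_query_command('get_close(ticker="XYZ", date="2023-06-01")'))
```

Under these assumptions, the number shown to the user always comes from the lookup, never from the model's free-text output, so a malformed or unanswerable command yields no answer rather than a fabricated price.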
