Workshop: Table Representation Learning Workshop

Augmentation for Context in Financial Numerical Reasoning over Textual and Tabular Data with Large-Scale Language Model

Yechan Hwang · Jinsu Lim · Young-Jun Lee · Ho-Jin Choi

Keywords: [ Numerical Reasoning ] [ Financial QA ] [ Hybrid QA ] [ Data Augmentation ]


Constructing large-scale datasets for numerical reasoning over tabular and textual data in the financial domain is particularly challenging, and commonly used augmentation techniques prove ineffective for financial datasets. To address this, this paper proposes a context augmentation methodology that generates new contexts for the original questions in a financial dataset. To do this, we leverage the hallucination capability of large-scale generative language models. Specifically, we give the language model a prompt that combines context-generation instructions and constraints with each original question and its arithmetic program, and the model produces a plausible context that provides evidence for the question. Experimental results show that reasoning performance improves when a model is trained on the FinQA dataset augmented with our methodology.
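The prompt construction described above might be sketched as follows. This is only an illustration and not the authors' actual prompt: the constraint wording, the function names `extract_operands` and `build_prompt`, and the assumed program format (a FinQA-style string such as `subtract(5829, 5735), divide(#0, 5735)`, where `#n` refers to an earlier step and `const_` marks a built-in constant) are all assumptions introduced here.

```python
import re
from typing import List


def extract_operands(program: str) -> List[str]:
    """Collect the literal arguments of a FinQA-style program, in order.

    Assumption: steps look like op(arg1, arg2); "#n" refers to an earlier
    step and "const_..." is a built-in constant, so neither needs to
    appear in the generated context.
    """
    operands: List[str] = []
    for args in re.findall(r"\w+\(([^)]*)\)", program):
        for arg in args.split(","):
            arg = arg.strip()
            if not arg or arg.startswith("#") or arg.startswith("const_"):
                continue
            if arg not in operands:
                operands.append(arg)
    return operands


def build_prompt(question: str, program: str) -> str:
    """Assemble a context-generation prompt (hypothetical wording)."""
    operands = extract_operands(program)
    constraints = [
        "Write a short financial report passage followed by a small table.",
        "Together they must contain the evidence needed to answer the question.",
        f"Include these values exactly as written: {', '.join(operands)}.",
        "Do not state the final answer.",
    ]
    lines = ["Generate a new context for the question below.", "", "Constraints:"]
    lines += [f"{i}. {c}" for i, c in enumerate(constraints, 1)]
    lines += ["", f"Question: {question}", f"Program: {program}", "", "Context:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        "What was the percentage change in net revenue from 2018 to 2019?",
        "subtract(5829, 5735), divide(#0, 5735)",
    ))
```

Requiring the program's operands to appear verbatim is one plausible way to keep a generated context consistent with the original question and program; the constraints used in the paper may differ.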
