Workshop: NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning: Blending New and Existing Knowledge Systems

Can LLMs Accurately Assess Human Confidence in Climate Statements?

Romain Lacombe · Kerrie Wu · Eddie Dilworth


The potential for public misinformation fueled by “confidently wrong” Large Language Models (LLMs) is especially salient in the climate science and policy domain. We introduce the ICCS dataset, a novel, curated, expert-labeled NLP dataset consisting of 8094 climate science statements and their associated confidence levels collected from the latest IPCC AR6 reports. Using this dataset, we show that recent LLMs can classify human expert confidence in climate-related statements with reasonable—if limited—accuracy, especially in a few-shot learning setting. Overall, models exhibit consistent and significant overconfidence on low and medium confidence statements. We highlight important implications from our results for climate policy and the use of LLMs in information retrieval systems.
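The evaluation summarized above can be illustrated with a minimal sketch. It is not the authors' code: the four-level label scale, the function name `evaluate`, and the toy labels are all assumptions made for illustration, and no figures from the paper are reproduced. The sketch scores predicted confidence levels against expert labels, then reports the mean signed gap per true level. A positive gap means the model assigned more confidence than the experts did.

```python
from collections import defaultdict

# Assumed ordinal scale for illustration; the dataset's exact label set
# is not given in the abstract.
LEVELS = ["low", "medium", "high", "very high"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def evaluate(true_labels, predicted_labels):
    """Return (accuracy, mean signed gap per true confidence level).

    The signed gap is rank(predicted) - rank(true), so a positive mean
    for a level indicates overconfidence on statements at that level.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    correct = 0
    gaps = defaultdict(list)
    for truth, pred in zip(true_labels, predicted_labels):
        correct += truth == pred
        gaps[truth].append(RANK[pred] - RANK[truth])

    accuracy = correct / len(true_labels)
    bias = {level: sum(g) / len(g) for level, g in gaps.items()}
    return accuracy, bias


# Hypothetical toy labels, not drawn from the ICCS dataset.
truth = ["low", "low", "medium", "medium", "high", "very high"]
preds = ["medium", "high", "high", "medium", "high", "high"]

accuracy, bias = evaluate(truth, preds)
print(f"accuracy = {accuracy:.2f}")
for level in LEVELS:
    if level in bias:
        print(f"{level:>9}: mean signed gap = {bias[level]:+.2f}")
```

On real data, `preds` would come from parsing a model's answer to a zero-shot or few-shot prompt, and the per-level gaps would show whether overconfidence concentrates on low and medium confidence statements.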