Abstract
Understanding how climate change affects us and learning about available solutions are key steps toward empowering individuals and communities to mitigate and adapt successfully. As Large Language Models (LLMs) rise in popularity, it is necessary to assess their capability in this domain. In this study, we present a comprehensive evaluation framework, grounded in science communication principles, to analyze LLM responses to climate change topics. The framework emphasizes both the presentational and epistemological adequacy of answers, offering a fine-grained analysis of LLM-generated responses. Spanning 8 dimensions, it discerns up to 30 distinct issues in model outputs. The task is an instructive real-world example of a growing class of problems where LLM abilities can exceed those of humans. To address the resulting evaluation challenges, we provide a practical protocol for scalable oversight that uses AI assistance and relies on raters with relevant educational backgrounds. We evaluate several recent LLMs, and our thorough analysis of the results sheds light on both the potential and the limitations of LLMs in the realm of climate communication.
Authors
Jannis Bulian, Mike Schäfer*, Afra Amini, Heidi Lam, Massimiliano Ciaramita, Ben Gaiarin, Michelle Chen Huebscher, Christian Buck, Niels Mede*, Markus Leippold, Nadine Strauss*
Venue
arXiv