Workshop: Generative AI for Education (GAIED): Advances, Opportunities, and Challenges

Paper 26: The Behavior of Large Language Models When Prompted to Generate Code Explanations

Priti Oli · Rabin Banjade · Vasile Rus · Jeevan Chapagain

Keywords: [ Code Comprehension ] [ Code Explanations ] [ CS Education ] [ large language models (LLMs) ]


This paper systematically investigates the generation of code explanations by Large Language Models (LLMs) for code examples commonly encountered in introductory programming courses. Our findings reveal significant variations in the nature of the code explanations produced by LLMs, influenced by factors such as the wording of the prompt, the specific code examples under consideration, the programming language involved, the temperature parameter, and the version of the LLM. However, a consistent pattern emerges for Java and Python: explanations have a Flesch-Kincaid readability level of approximately grade 7-8 and a consistent lexical density, that is, the proportion of meaningful words relative to the total explanation size. Additionally, the generated explanations consistently achieve high scores for correctness, but lower scores on three other metrics: completeness, conciseness, and specificity.
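As a rough illustration of the two text measures named in the abstract, the sketch below computes a Flesch-Kincaid grade level using the standard formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59, together with a simple lexical density. It is not the authors' implementation: the vowel-group syllable counter, the regular-expression tokenizer, and the small `FUNCTION_WORDS` list are all assumptions made for illustration, and the function names are hypothetical.

```python
import re

# Illustrative function-word list (an assumption; the paper's own list is not given here).
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at", "by",
    "for", "with", "is", "are", "was", "were", "be", "it", "this", "that",
    "as", "from",
}


def tokenize(text):
    """Split text into alphabetic word tokens (a simple assumed tokenizer)."""
    return re.findall(r"[A-Za-z']+", text)


def count_syllables(word):
    """Heuristic syllable count: number of vowel groups, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level; assumes text has at least one word and sentence."""
    words = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def lexical_density(text):
    """Share of tokens that are not in the function-word list."""
    words = [w.lower() for w in tokenize(text)]
    content_words = [w for w in words if w not in FUNCTION_WORDS]
    return len(content_words) / len(words)


if __name__ == "__main__":
    sample = "The loop adds each number to the total. It prints the total at the end."
    print(round(flesch_kincaid_grade(sample), 2), round(lexical_density(sample), 2))
```

A production analysis would more likely use a dictionary-based syllable counter and part-of-speech tagging to identify content words, so scores from this sketch should not be compared directly with the values reported in the paper.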
