Where Are Large Language Models for Code Generation on GitHub?

This study examines how code generated by Large Language Models (LLMs) such as ChatGPT and Copilot appears on GitHub. These LLMs are used most often to generate Python, Java, and TypeScript scripts for data processing, and the generated snippets tend to be short and of low complexity. Projects containing LLM-generated code are often small and led by individuals or small teams, yet they continue to evolve and improve. Compared to human-written code, LLM-generated code undergoes fewer modifications, and changes made because of bugs are rare. Comments on the generated code, however, often lack detailed information. Together, these findings describe how LLM-generated code is used in real-world projects and what that implies in practice.

https://arxiv.org/abs/2406.19544
