llm.c – LLM training in simple, pure C/CUDA

llm.c implements LLM training in simple C/CUDA, without needing large PyTorch or CPython dependencies. The codebase is deliberately kept clean and minimal, which makes training a GPT-2 model straightforward to follow. The project aims to add a direct CUDA implementation for faster training and to optimize the CPU version with SIMD instructions, while keeping simple reference implementations alongside the optimized ones. The workflow starts by downloading and tokenizing a dataset such as tinyshakespeare or TinyStories. Because the baseline code is not yet fast enough to make training from scratch practical, the recommended path is to initialize from the released GPT-2 weights and finetune. The repository also provides a step-by-step tutorial and unit tests that check the C implementation against the PyTorch reference, all in a concise package.
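To give a flavor of the simple-C approach, here is a minimal sketch of how a data loader might read a pretokenized dataset and form one (B, T) batch for next-token prediction. It assumes tokens are stored as raw int32 ids in a headerless .bin file, as the early preprocessing scripts produced; the file path and the B and T values are placeholders, and this is an illustration rather than the repository's actual DataLoader code.

```c
// Illustrative sketch: read B*T+1 int32 token ids from a pretokenized
// .bin file and view them as an (inputs, targets) pair, where targets
// are the inputs shifted by one position (next-token prediction).
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void) {
    const int B = 4;   // batch size (placeholder)
    const int T = 64;  // sequence length (placeholder)

    // assumed path/format: raw int32 tokens, no header
    FILE *f = fopen("data/tiny_shakespeare_train.bin", "rb");
    if (!f) { fprintf(stderr, "failed to open token file\n"); return 1; }

    int32_t *tokens = malloc((size_t)(B * T + 1) * sizeof(int32_t));
    size_t n = fread(tokens, sizeof(int32_t), (size_t)(B * T + 1), f);
    if (n != (size_t)(B * T + 1)) { fprintf(stderr, "short read\n"); return 1; }

    const int32_t *inputs  = tokens;      // token ids fed to the model
    const int32_t *targets = tokens + 1;  // same stream shifted by one
    printf("first input/target pair: %d -> %d\n", (int)inputs[0], (int)targets[0]);

    free(tokens);
    fclose(f);
    return 0;
}
```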

https://github.com/karpathy/llm.c
