gemma.cpp: a lightweight, standalone C++ inference engine for Gemma models

gemma.cpp is a lightweight, standalone C++ inference engine for Google’s Gemma foundation models. It targets experimentation and research, prioritizing simplicity and directness. Model weights and the tokenizer can be downloaded from Kaggle, where several model variants are available. The project builds with CMake and the Clang compiler, and the resulting binary runs from the command line. Other CMake projects can pull in gemma.cpp through CMake’s FetchContent module. Development is active and contributions are welcome; the project follows Google’s Open Source Community Guidelines. Developed by Austin Huang and Jan Wassenberg, gemma.cpp is not an officially supported Google product.
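
As a rough illustration of the FetchContent route, a consuming project's CMakeLists.txt might look something like the sketch below. The project name `my_app`, the source file `main.cc`, the `GIT_TAG` value, the C++17 setting, and the library target name `libgemma` are assumptions made for illustration, not details taken from this summary; the repository's README is the authority on the exact targets and on any additional dependencies that have to be declared.

```cmake
cmake_minimum_required(VERSION 3.14)  # FetchContent_MakeAvailable needs 3.14 or newer
project(my_app CXX)                   # hypothetical project name

# Assumption: a C++17 toolchain is sufficient; check the upstream README.
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

include(FetchContent)

# Download gemma.cpp at configure time.
# Assumption: "main" is a placeholder; pin a specific commit for reproducible builds.
FetchContent_Declare(
  gemma
  GIT_REPOSITORY https://github.com/google/gemma.cpp
  GIT_TAG main
)
FetchContent_MakeAvailable(gemma)

add_executable(my_app main.cc)        # hypothetical source file

# Assumption: the library target is named "libgemma"; verify against the
# upstream CMakeLists.txt before relying on it.
target_link_libraries(my_app PRIVATE libgemma)
```

Configuring with `cmake -B build` and then building with `cmake --build build` would fetch the repository and compile it alongside the application, provided the assumed target name matches what the repository actually exports.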

https://github.com/google/gemma.cpp