LlamaGym – fine-tune LLM agents with online reinforcement learning

LlamaGym simplifies fine-tuning Large Language Model (LLM) agents with online reinforcement learning (RL) in Gym-style environments. LLM-based agents typically do not learn online via RL; LlamaGym provides a single abstract agent class that streamlines the process, making it easy to experiment with agent prompting and hyperparameters across different environments. Usage takes a few steps: implement the class's abstract methods, define a base LLM, and instantiate the agent. The agent then acts, receives rewards, and terminates episodes inside a standard RL loop. LlamaGym prioritizes simplicity over computational efficiency, and it is a work in progress that is open to contributions.
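The workflow described above (implement a few methods, supply a base LLM, instantiate an agent, then act, assign rewards, and terminate episodes in a loop) can be sketched in plain Python. This is a minimal illustration of the pattern only, not LlamaGym's actual code: `SketchAgent`, `ToyEnv`, `ToyAgent`, and `stub_llm` are hypothetical stand-ins, the "LLM" is a hard-coded function, and no real RL update is performed. Check the repository for the real interface.

```python
from abc import ABC, abstractmethod


class SketchAgent(ABC):
    """Hypothetical stand-in for an abstract agent class (not LlamaGym's real API)."""

    def __init__(self, llm):
        self.llm = llm        # any callable mapping prompt text to response text
        self.rewards = []     # rewards collected during the current episode
        self.episodes = []    # total return of each finished episode

    @abstractmethod
    def get_system_prompt(self) -> str: ...

    @abstractmethod
    def format_observation(self, observation) -> str: ...

    @abstractmethod
    def extract_action(self, response: str): ...

    def act(self, observation):
        # Build a prompt, query the LLM, and map its text reply to an action.
        prompt = self.get_system_prompt() + "\n" + self.format_observation(observation)
        return self.extract_action(self.llm(prompt))

    def assign_reward(self, reward):
        self.rewards.append(reward)

    def terminate_episode(self):
        # A real trainer would run an RL update (e.g. PPO) here; this sketch only logs.
        total = sum(self.rewards)
        self.episodes.append(total)
        self.rewards = []
        return {"episode_return": total}


class ToyEnv:
    """Hypothetical toy game: each 'hit' adds 3; 'stay' at 6 or more wins."""

    def reset(self):
        self.total = 0
        return self.total

    def step(self, action):
        if action == 1:                      # hit
            self.total += 3
            if self.total > 10:              # bust ends the episode with a penalty
                return self.total, -1, True
            return self.total, 0, False
        # stay: the episode ends, rewarded only if the total is high enough
        return self.total, (1 if self.total >= 6 else 0), True


class ToyAgent(SketchAgent):
    def get_system_prompt(self) -> str:
        return "You play a toy card game. Reply 'hit' or 'stay'."

    def format_observation(self, observation) -> str:
        return f"Current total: {observation}"

    def extract_action(self, response: str):
        return 0 if "stay" in response.lower() else 1


def stub_llm(prompt: str) -> str:
    # Stand-in for a base LLM: reads the total from the prompt and applies a fixed rule.
    total = int(prompt.rsplit(":", 1)[1])
    return "hit" if total < 6 else "stay"


env = ToyEnv()
agent = ToyAgent(stub_llm)
stats = []

for episode in range(3):
    observation = env.reset()
    done = False
    while not done:
        action = agent.act(observation)                # act based on the observation
        observation, reward, done = env.step(action)
        agent.assign_reward(reward)                    # give the reward to the agent
    stats.append(agent.terminate_episode())            # close out the episode
```

In a real setup the stub would be replaced by a trainable model and a Gym environment, and ending an episode is where a policy update would happen once enough episodes have been collected.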

https://github.com/KhoomeiK/LlamaGym