WebLLM: Llama 2 in the Browser

Llama 2 7B/13B and Llama 2 70B are now available in Web LLM, and users can try them out in the chat demo. The project brings large language models and LLM-based chatbots to web browsers: everything runs inside the browser, with no server support, accelerated by WebGPU. This opens up opportunities to build AI assistants for everyone and to preserve privacy while still benefiting from GPU acceleration. The project aims to bring more diversity to the ecosystem by baking LLMs directly into the client side and running them inside a browser. Instructions for trying out the models, along with some system requirements, are provided at the link below.
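
Because everything depends on WebGPU, a page would typically check that the browser exposes it before downloading any model weights. The following is a minimal sketch of such a check, not Web LLM's own API: `checkWebGpu`, `NavigatorLike`, and `WebGpuStatus` are hypothetical names introduced here, and the small structural types stand in for the real WebGPU typings. The only browser calls it assumes are the standard `navigator.gpu` property and `navigator.gpu.requestAdapter()`.

```typescript
// Minimal structural types so the sketch compiles without extra typings.
// In a real page, the browser's `navigator` object would be passed in.
type GpuAdapterLike = object;

interface GpuLike {
  // Standard WebGPU entry point: resolves to an adapter, or null if none is usable.
  requestAdapter(): Promise<GpuAdapterLike | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Hypothetical status values for this sketch.
type WebGpuStatus = "ok" | "no-webgpu" | "no-adapter";

// Reports whether the given navigator object exposes a usable WebGPU adapter.
async function checkWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  if (!nav.gpu) {
    // The browser does not expose the WebGPU API at all.
    return "no-webgpu";
  }
  const adapter = await nav.gpu.requestAdapter();
  // The API exists, but the browser may still fail to find a suitable GPU.
  return adapter ? "ok" : "no-adapter";
}
```

In a page, this could be called as `checkWebGpu(navigator as NavigatorLike)`, showing a fallback message for any status other than `"ok"`. Taking the navigator as a parameter keeps the function testable outside a browser.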

https://webllm.mlc.ai/