mlx-community/OLMo-2-0325-32B-Instruct-4bit

OLMo 2 32B is claimed to be the first fully-open model to outperform GPT-3.5 Turbo and GPT-4o mini. The linked post provides a recipe for running the 4-bit MLX quantization on a Mac using the llm-mlx plugin: downloading the model fetches roughly 17GB of data, after which you can start an interactive chat with OLMo 2. Prompts such as 'Generate an SVG of a pelican riding a bicycle' can be run with the -o unlimited 1 option, which removes the cap on output tokens. The resulting pelican is surprisingly abstract.
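A minimal sketch of the recipe, assuming the llm CLI is already installed and following the llm-mlx plugin's documented download and chat commands:

    # Install the MLX plugin for the llm CLI
    llm install llm-mlx

    # Download the 4-bit quantized OLMo 2 32B (roughly 17GB)
    llm mlx download-model mlx-community/OLMo-2-0325-32B-Instruct-4bit

    # Start an interactive chat with the model
    llm chat -m mlx-community/OLMo-2-0325-32B-Instruct-4bit

    # Run a single prompt; -o unlimited 1 removes the output token cap
    llm -m mlx-community/OLMo-2-0325-32B-Instruct-4bit \
      'Generate an SVG of a pelican riding a bicycle' -o unlimited 1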

https://simonwillison.net/2025/Mar/16/olmo2/
