AI21 Labs has released Jamba, the first production-grade AI model built on the Mamba architecture, combining the Mamba structured state space model (SSM) with the traditional Transformer architecture for improved performance. Jamba’s 256K-token context window far exceeds that of models such as Meta’s Llama 2. Thanks to its hybrid SSM-Transformer design and mixture-of-experts (MoE) layers, Jamba delivers 3x the throughput on long contexts compared with Transformer-based models like Mixtral 8x7B. It has posted strong benchmark results and is available under the Apache 2.0 license. While the current release is a research model, a safer, aligned version is in the works. With new architectures like Jamba, the AI community can anticipate further advances in AI capabilities.
https://www.maginative.com/article/ai21-labs-unveils-jamba-the-first-production-grade-mamba-based-ai-model/
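To make the hybrid design concrete, here is a minimal sketch of how a model might interleave Mamba (SSM) layers, attention layers, and MoE feed-forward layers into a repeating block. The specific ratios (`attn_every`, `moe_every`) and the `hybrid_layer_schedule` helper are illustrative assumptions for this sketch, not AI21’s published configuration.

```python
# Hypothetical layer schedule for a hybrid SSM-Transformer with MoE.
# NOTE: the ratios below are assumptions for illustration only; they are
# not Jamba's actual published hyperparameters.

def hybrid_layer_schedule(num_blocks: int, layers_per_block: int = 8,
                          attn_every: int = 8, moe_every: int = 2):
    """Return a list of (mixer, ffn) descriptors for each layer.

    mixer: "attention" for every attn_every-th layer, otherwise "mamba".
    ffn:   "moe" for every moe_every-th layer, otherwise "dense".
    """
    schedule = []
    for i in range(num_blocks * layers_per_block):
        mixer = "attention" if i % attn_every == attn_every - 1 else "mamba"
        ffn = "moe" if i % moe_every == moe_every - 1 else "dense"
        schedule.append((mixer, ffn))
    return schedule


if __name__ == "__main__":
    for layer in hybrid_layer_schedule(num_blocks=1):
        print(layer)
```

The intuition behind such interleaving is that SSM layers handle long sequences with linear-time scaling and constant memory per token, while the occasional attention layer preserves the in-context recall that pure SSMs can struggle with, and MoE layers add capacity without increasing per-token compute.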