In a notable development, researchers have introduced BitNet b1.58, a 1-bit LLM variant in which every weight is ternary {-1, 0, 1} (about log2(3) ≈ 1.58 bits per parameter, hence the name). The model matches a full-precision (FP16/BF16) Transformer LLM of the same size and training tokens in perplexity and end-task performance, while being substantially cheaper in latency, memory, throughput, and energy consumption. Beyond setting a new bar for high-performance, cost-effective language models, the 1.58-bit LLM defines a new scaling law and training recipe for future generations of LLMs and opens the door to hardware designed specifically for 1-bit models.
https://arxiv.org/abs/2402.17764
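
For intuition, here is a minimal sketch of the absmean ternary weight quantization the paper describes: weights are scaled by their mean absolute value, then rounded and clipped to {-1, 0, +1}. Function and variable names below are illustrative, not taken from the authors' code.

```python
import numpy as np

def absmean_ternary_quantize(W: np.ndarray, eps: float = 1e-8):
    """Quantize a full-precision weight matrix to ternary {-1, 0, +1}.

    Sketch of the absmean scheme described in the BitNet b1.58 paper:
    scale weights by their mean absolute value, round each entry to the
    nearest integer, and clip to [-1, 1].
    """
    gamma = np.mean(np.abs(W)) + eps           # per-tensor scale (avoid divide-by-zero)
    W_ternary = np.clip(np.round(W / gamma), -1, 1)
    return W_ternary.astype(np.int8), gamma    # keep gamma so W ≈ gamma * W_ternary

# Toy example with a 3x4 weight matrix
W = np.array([[ 0.40, -0.02,  0.75, -0.60],
              [-0.10,  0.05, -0.80,  0.33],
              [ 0.22, -0.45,  0.01,  0.90]])
Wq, gamma = absmean_ternary_quantize(W)
print(Wq)     # entries are only -1, 0, or +1
print(gamma)  # scale used to approximately reconstruct W
```

Because every stored weight is one of three values, matrix multiplications reduce largely to additions and subtractions, which is where the latency and energy savings come from.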