Google Open-Sources Gemma 4: KV Cache Compressed to 3 Bits for a Sixfold Memory Saving; Overall Performance Yet to Be Verified by Third Parties
Google recently released Gemma 4, an open-source multimodal AI model that introduces video and image processing capabilities. Its new TurboQuant technology compresses the KV cache to 3 bits, for a reported sixfold reduction in memory use.
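To make the memory claim concrete, the following is a minimal back-of-the-envelope sketch of how KV cache size scales with bits per stored value. It is not TurboQuant itself and not Gemma 4's real configuration: the function name `kv_cache_bytes` and the model shape (32 layers, 8 KV heads, head dimension 128, 32,768-token context) are hypothetical values chosen for illustration, and the estimate ignores any per-block metadata a real quantization scheme might store.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bits_per_value):
    """Approximate bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    values = 2 * num_layers * num_kv_heads * head_dim * seq_len
    return values * bits_per_value / 8

# Hypothetical model shape for illustration only (NOT Gemma 4's actual config).
shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=32_768)

fp16_bytes = kv_cache_bytes(**shape, bits_per_value=16)  # 16-bit baseline
q3_bytes = kv_cache_bytes(**shape, bits_per_value=3)     # 3-bit compressed cache
ratio = fp16_bytes / q3_bytes

print(f"16-bit cache: {fp16_bytes / 2**30:.2f} GiB")
print(f" 3-bit cache: {q3_bytes / 2**30:.2f} GiB")
print(f"reduction:    {ratio:.2f}x")
```

With these assumed dimensions the cache shrinks from 4.00 GiB to 0.75 GiB. A straight 16-bit to 3-bit ratio works out to about 5.3x, so the sixfold figure presumably rests on a different baseline or on rounding that the source does not detail.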