Groq LPU Sets New LLM Inference Speed Record: 500 Tokens Per Second, Far Outpacing GPUs
U.S. startup Groq announced that its proprietary LPU (Language Processing Unit) has achieved a world-record 500 tokens per second in large language model inference, far surpassing mainstream GPU solutions.
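For context, a throughput figure like 500 tokens per second maps directly to per-token latency and end-to-end generation time. The sketch below shows that arithmetic only; the function names are illustrative and not part of any Groq API:

```python
# Rough arithmetic behind a tokens-per-second headline figure.
# 500 tok/s is the rate reported in the announcement; the helpers
# below are illustrative, not a real benchmarking tool.

def per_token_latency_ms(tokens_per_second: float) -> float:
    """Average time to emit one token, in milliseconds."""
    return 1000.0 / tokens_per_second

def generation_time_s(num_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to stream num_tokens at a steady rate."""
    return num_tokens / tokens_per_second

print(per_token_latency_ms(500))      # 2.0 -> 2 ms per token
print(generation_time_s(1000, 500))   # 2.0 -> 2 s for a 1000-token reply
```

At 500 tokens per second, a full 1000-token response streams in about two seconds, which is the practical difference users notice versus slower GPU-backed endpoints.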