🤖 AI Dev Tools
Gemma 4 Blasts 85 tok/s on Macs – Pip Install Only
Gemma 4 on Apple Silicon just got stupidly fast. One command, 85 tok/s, tools included – cloud services, take notes.
theAIcatchup
Apr 07, 2026
4 min read
⚡ Key Takeaways
-
Gemma 4 hits 85 tok/s on Apple Silicon with one pip install via Rapid-MLX.
𝕏
-
Beats Ollama on decode speed, full tool calling for 18 model families.
𝕏
-
OpenAI-compatible API works with LangChain, Aider, PydanticAI – offline agents unlocked.
𝕏
The 60-Second TL;DR
- Gemma 4 hits 85 tok/s on Apple Silicon with one pip install via Rapid-MLX.
- Beats Ollama on decode speed, full tool calling for 18 model families.
- OpenAI-compatible API works with LangChain, Aider, PydanticAI – offline agents unlocked.
Published by
theAIcatchup
Community-driven. Code-first.
Worth sharing?
Get the best Open Source stories of the week in your inbox — no noise, no spam.