🤖 AI Dev Tools

Gemma 4 Blasts 85 tok/s on Macs – Pip Install Only

Gemma 4 on Apple Silicon just got stupidly fast. One command, 85 tok/s, tools included – cloud services, take notes.

Gemma 4 model running at 85 tokens per second on M3 Ultra Mac benchmark chart

⚡ Key Takeaways

  • Gemma 4 hits 85 tok/s on Apple Silicon with one pip install via Rapid-MLX. 𝕏
  • Beats Ollama on decode speed, full tool calling for 18 model families. 𝕏
  • OpenAI-compatible API works with LangChain, Aider, PydanticAI – offline agents unlocked. 𝕏
Published by

theAIcatchup

Community-driven. Code-first.

Worth sharing?

Get the best Open Source stories of the week in your inbox — no noise, no spam.

Originally reported by Dev.to

Stay in the loop

The week's most important stories from theAIcatchup, delivered once a week.