Intel NPU's LLM Reality Check: 96-Second Loads and CPU Wins on Core Ultra

You'd think Intel's NPU would crush local LLMs. Wrong. On a Core Ultra laptop, the NPU takes 96 seconds to load a model and still trails the CPU.

[Figure: benchmark chart comparing Intel NPU, CPU, and llama.cpp speeds on a Core Ultra laptop]

⚡ Key Takeaways

  • The NPU loads models roughly 20x slower than the CPU (96s vs 5s) and generates no faster.
  • llama.cpp beats both: 22 tok/s and 2s loads on Intel Core Ultra.
  • The NPU fix: export the model with special flags and run it through openvino-genai; standard tools crash.
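
The load times and tok/s figures in the takeaways are the kind of numbers a small timing harness produces. The sketch below is only an assumption about how such measurements might be taken, not the original benchmark: `time_load`, `tokens_per_second`, and `slowdown` are hypothetical helper names, and the loader and token stream are stand-ins for whatever runtime (NPU, CPU, or llama.cpp) is being tested.

```python
import time
from typing import Callable, Iterable, Tuple


def time_load(load_fn: Callable[[], object]) -> Tuple[object, float]:
    """Run a loader once and return (loaded_object, seconds_taken)."""
    start = time.perf_counter()
    obj = load_fn()
    return obj, time.perf_counter() - start


def tokens_per_second(token_stream: Iterable[object]) -> Tuple[int, float]:
    """Consume a stream of tokens and return (token_count, tokens_per_second)."""
    start = time.perf_counter()
    count = sum(1 for _ in token_stream)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very fast dummy streams.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def slowdown(slow_seconds: float, fast_seconds: float) -> float:
    """How many times longer the first measurement took than the second."""
    return slow_seconds / fast_seconds
```

With the load times quoted above, `slowdown(96, 5)` works out to 19.2, which the takeaway rounds to 20x.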
Published by theAIcatchup


Originally reported by Dev.to
