How do I generate an Agora bot token?

Use the agora-token-builder library. Pass your app ID, app certificate, channel name, bot UID, a role, and an expiry timestamp; it returns a signed Agora access token (Agora's own token format, not a JWT), set here to expire after 1 hour. The bot joins as a subscriber with read-only audio access, so it can't be booted by participants.
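A minimal sketch of the token step, assuming the `agora-token-builder` package from PyPI and its `RtcTokenBuilder.buildTokenWithUid` call as described in that package's README. The helper names (`expiry_timestamp`, `build_bot_token`) and the role constant are illustrative, not part of the original project; verify the role value against the library version you install.

```python
import time

# Assumption: the agora-token-builder README lists publisher = 1, subscriber = 2.
ROLE_SUBSCRIBER = 2


def expiry_timestamp(ttl_seconds=3600, now=None):
    """Return the Unix timestamp at which the token should stop being valid."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def build_bot_token(app_id, app_certificate, channel_name, bot_uid, ttl_seconds=3600):
    """Build a subscriber token for the bot (hypothetical wrapper)."""
    # Imported lazily so this module loads even where the package is absent.
    # Install with: pip install agora-token-builder
    from agora_token_builder import RtcTokenBuilder

    return RtcTokenBuilder.buildTokenWithUid(
        app_id,
        app_certificate,
        channel_name,
        bot_uid,
        ROLE_SUBSCRIBER,
        expiry_timestamp(ttl_seconds),
    )
```

Because the expiry is a timestamp you compute yourself, a long-running bot would need to request a fresh token before that time passes.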

Can I use this for calls with more than 10 participants?

In principle, yes: the pattern has no fixed participant cap. Each participant gets one WebSocket to AssemblyAI, so the practical limits are your connection bandwidth, AssemblyAI's WebSocket concurrency limits, and the CPU and memory available to the bot. Start load testing at 10+ participants; most issues surface there.
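To get a feel for the bandwidth side of that answer, here is a back-of-envelope estimate. It assumes 16 kHz, 16-bit, mono PCM per participant, which is an illustrative format rather than one stated in the article, and it ignores WebSocket framing overhead. The function names are hypothetical.

```python
def uplink_bytes_per_second(participants, sample_rate_hz=16_000,
                            bytes_per_sample=2, channels=1):
    """Raw PCM bytes per second sent upstream, one stream per participant."""
    return participants * sample_rate_hz * bytes_per_sample * channels


def uplink_mbit_per_second(participants, **audio_format):
    """Same estimate expressed in megabits per second."""
    return uplink_bytes_per_second(participants, **audio_format) * 8 / 1_000_000


for n in (10, 25, 50):
    print(f"{n} participants: {uplink_mbit_per_second(n):.2f} Mbit/s")
```

Under those assumptions, 10 participants need about 2.56 Mbit/s of sustained uplink, so on most servers the concurrency limit and CPU are likely to bite before raw bandwidth does.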

What happens if a participant drops mid-call?

The bot catches the `on_user_offline` event, cancels that participant's streaming task, and closes their WebSocket cleanly. Any partial transcript is flushed. When they rejoin, a new stream spins up. No data loss, just a turn boundary.
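The drop-and-rejoin handling described above can be sketched with plain `asyncio`. This is not the Agora SDK or the original bot's code: `ParticipantStreams`, `on_user_joined`, and the `flushed` list are hypothetical stand-ins, and the `sleep` call is a placeholder for forwarding PCM frames over a WebSocket.

```python
import asyncio


class ParticipantStreams:
    """Tracks one streaming task per participant UID (illustrative only)."""

    def __init__(self):
        self.tasks = {}     # uid -> asyncio.Task
        self.flushed = []   # (uid, frames_seen) recorded when a stream ends

    async def _stream(self, uid):
        frames = 0
        try:
            while True:
                await asyncio.sleep(0.01)  # placeholder: forward one PCM frame
                frames += 1
        except asyncio.CancelledError:
            # Placeholder for flushing the partial transcript and closing
            # this participant's WebSocket.
            self.flushed.append((uid, frames))
            raise

    def on_user_joined(self, uid):
        self.tasks[uid] = asyncio.create_task(self._stream(uid))

    async def on_user_offline(self, uid):
        task = self.tasks.pop(uid, None)
        if task is None:
            return
        task.cancel()
        try:
            await task  # wait until the cleanup in _stream has run
        except asyncio.CancelledError:
            pass


async def demo():
    streams = ParticipantStreams()
    streams.on_user_joined(42)
    await asyncio.sleep(0.05)          # let the first stream run for a bit
    first = streams.tasks[42]
    await streams.on_user_offline(42)  # participant drops
    streams.on_user_joined(42)         # participant rejoins
    await asyncio.sleep(0)             # let the new task start
    second = streams.tasks[42]
    result = (first.cancelled(), first is not second,
              [uid for uid, _ in streams.flushed])
    await streams.on_user_offline(42)  # tidy up before the loop closes
    return result


demo_result = asyncio.run(demo())
```

Awaiting the cancelled task before returning is the design choice that matters here: it ensures the flush and close finish before a rejoin can start a new stream for the same UID.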

Does this work with Zoom or Google Meet?

No. This pattern depends on Agora's Server SDK and direct PCM frame access. Zoom and Google Meet don't expose that level of audio access to bots. You'd need to use their respective APIs (which don't offer the same architectural efficiency).

Why Real-Time Transcription Bots Just Got a Lot Faster (and Cheaper)

A new open-source pattern shows how to build transcription bots that join video calls as silent observers and stream speaker-identified transcripts in real time. The latency gap between this approach and traditional APIs just became impossible to ignore.

Open Source Beat | Apr 03, 2026 | 6 min read

Figure: Architecture diagram showing audio flow from the Agora video call through the bot to the AssemblyAI WebSocket, with real-time transcript output.

⚡ Key Takeaways

Streaming audio directly from Agora to AssemblyAI eliminates translation layers, cutting transcription latency from 600–900ms to 307ms, roughly a 2–3x improvement.
Real-time speaker diarization (knowing who said what without manual labeling) becomes viable at scale, unlocking meeting intelligence, compliance, and voice agent use cases.
The economics shift: streaming-based pricing scales better than per-minute rates for deployments with multiple concurrent participants, threatening traditional transcription API business models.

#Agora #AssemblyAI #Python #real-time transcription #speech-to-text #voice bots

Originally reported by Dev.to
