Deepgram Nova-3 vs LR-ASD

Both are in the video & clipping category. Side by side: pick the one that fits your stack tonight.

Deepgram Nova-3 ★★★★★

✓ loya-tested 💰 paid 🛠️ wire-up

The speech-to-text that actually gets word-level timestamps right.

rating: 5★
tested: ✓ loya-tested
cost: paid
install: needs-wiring
stars: 0
updated: 5d ago

#transcription #speech-to-text #word-timestamps #diarization #paid #deepgram

avoid if

You only transcribe short voice notes — use free Whisper locally.

LR-ASD ★★★★☆

🆓 free 🐍 sidecar

The 2025 state-of-the-art for 'which face is actually talking.' Fast, tiny, accurate.

rating: 4★
tested: —
cost: free
install: sidecar
stars: 109
updated: 1y ago

#video #active-speaker #python #research #lightweight #open-source

avoid if

You're not building a pipeline yourself. This is a research model, not a product.

why it matters · Deepgram Nova-3

Nova-3 is the transcription engine behind every podcast clipper that ships. You upload audio and get back text with per-word timestamps, speaker labels, and punctuation: the three things you need to cut a clip on a clean sentence boundary instead of mid-word. It costs about 26 cents per hour of audio, and signup comes with a free $200 credit, which gets you through your first 700+ hours before you pay anything. Way more accurate than Whisper on real podcast audio.
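
To make the "clean sentence boundary" point concrete, here is a minimal sketch of snapping a rough clip range to sentence edges using per-word timestamps. It assumes each word arrives as a dict with `start`, `end` (seconds), and `punctuated_word` keys; that shape is an assumption for illustration, so check it against the response you actually get. `snap_to_sentences` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, List, Tuple

# Punctuation that we treat as the end of a sentence.
SENTENCE_END = (".", "?", "!")


def snap_to_sentences(
    words: List[Dict], rough_start: float, rough_end: float
) -> Tuple[float, float]:
    """Widen a rough [start, end] range to whole sentences.

    `words` is assumed to be a time-ordered list of dicts with
    "start", "end", and "punctuated_word" keys (assumed shape).
    """
    if not words:
        raise ValueError("no words to snap to")

    # First word that is still being spoken after rough_start.
    first = next(
        (i for i, w in enumerate(words) if w["end"] > rough_start),
        len(words) - 1,
    )
    # Last word that has started before rough_end.
    last = next(
        (i for i in range(len(words) - 1, -1, -1) if words[i]["start"] < rough_end),
        0,
    )
    last = max(last, first)

    # Walk back until the previous word closes a sentence.
    while first > 0 and not words[first - 1]["punctuated_word"].endswith(SENTENCE_END):
        first -= 1
    # Walk forward until the current word closes a sentence.
    while last < len(words) - 1 and not words[last]["punctuated_word"].endswith(SENTENCE_END):
        last += 1

    return words[first]["start"], words[last]["end"]
```

The returned pair can go straight to whatever cutter you use. The function only ever widens the range, so a clip never starts or ends mid-word.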

why it matters · LR-ASD

LR-ASD is the newest open-source active speaker detection model (Springer IJCV 2025 paper). It tells your video pipeline which person in a multi-face frame is actually talking. Accuracy beats the older TalkNet approach and it's 23 times lighter, so it is fast enough to run on every frame, not just samples. If you're building your own clipping or auto-crop pipeline and accuracy matters more than a pre-built library, this is the one to drop in. MIT-licensed, free, Python.
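
As a rough illustration of where an active speaker model sits in an auto-crop pipeline, the sketch below turns per-frame, per-face scores into one speaker choice per frame. It does not call LR-ASD itself. The input shape (one dict of face-track id to score per frame), the convention that a score above `threshold` means "talking", and the majority-vote smoothing window are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional


def active_speaker_per_frame(
    scores: List[Dict[str, float]],
    threshold: float = 0.0,  # assumed: score above this means "speaking"
    window: int = 5,         # assumed smoothing width, in frames
) -> List[Optional[str]]:
    """Pick one face track per frame, then smooth out single-frame flips."""
    # Step 1: highest-scoring face in each frame, or None if nobody
    # clears the threshold.
    raw: List[Optional[str]] = []
    for frame in scores:
        if frame:
            track, best = max(frame.items(), key=lambda kv: kv[1])
            raw.append(track if best > threshold else None)
        else:
            raw.append(None)

    # Step 2: majority vote over a sliding window so the crop does not
    # jump to another face for a single noisy frame.
    half = window // 2
    smoothed: List[Optional[str]] = []
    for i in range(len(raw)):
        votes = Counter(raw[max(0, i - half): i + half + 1])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed
```

Because the model is light enough to score every frame, the smoothing step can work on dense per-frame output rather than interpolating between samples.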
