fast-asd (Sieve) ★★★★★
Tells your video which person is actually talking. Powers auto-cropping for clips.
why it matters
If you want to take a multi-person podcast and auto-crop it to the vertical 9:16 format that TikTok and Reels want, your pipeline needs to know who is talking at any given second. fast-asd figures that out by combining audio with lip movement detection, so your crop follows the active speaker.
Stale repo (last updated mid-2024) but still works, and the pattern is still how every podcast clipper does speaker tracking under the hood. Python sidecar, MIT, free.
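To make "the crop follows the active speaker" concrete, here is a minimal Python sketch of the step that comes after speaker detection. It does not call fast-asd. The input shape (a list of face boxes with speaking scores) and the function names `pick_active_speaker` and `vertical_crop` are assumptions made up for illustration, so check the repo for the real output format before wiring anything together.

```python
def pick_active_speaker(faces):
    """Return the box of the face with the highest speaking score.

    `faces` is a hypothetical list of ((x1, y1, x2, y2), score) tuples,
    one per detected face in a frame. Returns None if the list is empty.
    """
    if not faces:
        return None
    box, _score = max(faces, key=lambda item: item[1])
    return box


def vertical_crop(frame_w, frame_h, face_box, aspect_w=9, aspect_h=16):
    """Return (left, top, width, height) of a full-height vertical crop
    centered horizontally on the face and clamped to the frame edges."""
    x1, _y1, x2, _y2 = face_box
    crop_h = frame_h
    # Width that gives the target aspect ratio at full frame height.
    crop_w = min(frame_w, frame_h * aspect_w // aspect_h)
    center_x = (x1 + x2) / 2
    left = int(center_x - crop_w / 2)
    # Keep the crop window inside the frame.
    left = max(0, min(left, frame_w - crop_w))
    return left, 0, crop_w, crop_h


# Two faces in a 1920x1080 frame; the second one is speaking.
faces = [((300, 200, 500, 400), 0.1), ((1400, 200, 1600, 400), 0.9)]
speaker = pick_active_speaker(faces)
crop = vertical_crop(1920, 1080, speaker)
```

In a real clipper you would also smooth the crop position across frames so the window doesn't jump every time the score flickers. That part is left out here.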
install
git clone https://github.com/sieve-community/fast-asd
where to find it
No commits in the past year. This doesn't mean it's broken (some small repos are "finished"), but if you hit an install issue, it may not get patched quickly.
Skip it if you aren't building your own video pipeline. Most creators should just pay for OpusClip and skip the plumbing.