Transcribing Tips via OpenAI's Whisper API

A really good post on Hacker News about transcribing tips (comments). Some notes:

A commenter suggested also stripping silences with the following ffmpeg command. Less silence means less room for hallucinations, and it’s also cheaper!

ffmpeg -i video-audio.m4a \
 -af "silenceremove=start_periods=1:start_duration=0:start_threshold=-50dB:\
                    stop_periods=-1:stop_duration=0.02:stop_threshold=-50dB,\
                    apad=pad_dur=0.02" \
 -c:a aac -b:a 128k output_minpause.m4a -y

You can also speed up the audio to shorten it further:

ffmpeg -i video-audio.m4a -filter:a "atempo=2.0" -ac 1 -b:a 64k video-audio-2x.mp3
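
The two commands above can probably be combined into a single pass, so the audio is only re-encoded once. This is a rough sketch, not something from the thread: the output filename is made up, and I'm assuming the filters chain cleanly in this order (silence removal first, then the tempo change).

```shell
# Hypothetical file names; adjust to your own files.
in="video-audio.m4a"
out="video-audio-trimmed-2x.mp3"

# Same silenceremove/apad settings as the first command, followed by
# the 2x atempo filter from the second command.
filter="silenceremove=start_periods=1:start_duration=0:start_threshold=-50dB:stop_periods=-1:stop_duration=0.02:stop_threshold=-50dB,apad=pad_dur=0.02,atempo=2.0"

# Only run when ffmpeg is installed and the input file exists.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$in" ]; then
  ffmpeg -i "$in" -af "$filter" -ac 1 -b:a 64k "$out" -y
fi
```

It's worth listening to a short stretch of the result before uploading, since very aggressive silence trimming plus 2x speed may hurt accuracy on fast talkers.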

Other preprocessing steps, like loudness normalization, can also help.
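
For normalization, one option is ffmpeg's loudnorm filter. The post doesn't say which kind of normalization was meant, so treat this as one possible approach; the target values below are common example settings rather than anything tuned for Whisper, and the output filename is made up.

```shell
# Hypothetical file names; adjust to your own files.
in="video-audio.m4a"
out="video-audio-norm.m4a"

# Single-pass loudness normalization: I is the integrated loudness target,
# TP the true-peak ceiling, LRA the loudness range. Example values only.
norm_filter="loudnorm=I=-16:TP=-1.5:LRA=11"

# Only run when ffmpeg is installed and the input file exists.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$in" ]; then
  ffmpeg -i "$in" -af "$norm_filter" -c:a aac -b:a 128k "$out" -y
fi
```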
