@mkaito
Created November 10, 2012 20:21
Recording a screencast with ffmpeg
ffmpeg -v info \
  -f x11grab -s 1920x1064 -i :0.0+0,16 \
  -f pulse -i "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor" \
  -f pulse -i "alsa_input.usb-Sennheiser_Communication_Sennheiser_USB_headset-00-headset.analog-mono" \
  -vcodec libx264 -preset ultrafast -tune zerolatency -crf 0 \
  -acodec pcm_s16le \
  -filter_complex 'amerge, pan=2:c0=0.3*c0+3*c2:c1=0.3*c1+3*c2' \
  "$rec"
mkaito commented Nov 10, 2012

It took me months to fine-tune this baby. Learning more about ffmpeg was really exciting. It's a very powerful tool, much more powerful than you would ever imagine.

Each of the -f lines defines a source stream. There's one video stream, which covers most of my left monitor, sans the WM status bar, and two audio streams. The first audio stream is a "monitor" stream, and the second is my microphone. A monitor stream is a special PulseAudio stream that turns a sink into a source, letting you record "what you hear". While ALSA is technically capable of doing this via the infamous loopback module, it's a lot easier in PulseAudio.
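The source names in the command are specific to my hardware. To find the equivalents on your own machine, PulseAudio can list them (assuming pactl is available; this needs a running PulseAudio daemon):

```shell
# List all PulseAudio capture sources. The ".monitor" entries are the
# "what you hear" loopbacks corresponding to your output sinks.
pactl list short sources
```

Copy the name of the monitor source and of your microphone into the two -f pulse -i arguments.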

Both the audio and video codecs are chosen for low compression overhead: I want my resources available for the work I'm trying to record, and I recompress the result overnight to obtain more sensible file sizes. I toyed around with truly lossless video codecs such as huffyuv and ffv1, but it seems my hard drives aren't fast enough to handle the I/O pressure. Hence ultrafast x264, which results in around 1 GB per 10 minutes at 1080p with PCM audio.
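The overnight recompression step isn't shown in the gist; a minimal sketch of what it could look like (the output name, CRF value, and audio bitrate here are my assumptions, not the author's settings):

```shell
# Re-encode the fast, near-lossless capture down to a sensible size.
# "$rec" is the capture from the recording command; the rest is hypothetical.
ffmpeg -i "$rec" \
  -c:v libx264 -preset slow -crf 20 \
  -c:a aac -b:a 160k \
  screencast-final.mkv
```

A slow preset with a moderate CRF trades encode time for file size, which is exactly what you want when the encode runs unattended overnight.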

The -filter_complex line is (here comes the punchline) the most complex. Here's the issue: the Sennheiser USB headset provides a very weak, low-level signal, as do most USB microphones, so it needs strong gain to be of any use. I tried for a while to apply an -af volume=3 filter to that source, but found out that filters are applied not to each source individually, but to all streams fed to the encoder at once. After discarding the amix filter, I started playing with amerge and pan. amerge takes any number of audio streams and merges them into one multi-channel stream; in my case, one stereo stream plus one mono stream adds up to 3 channels. The desired output is a single stereo stream, and that's where pan comes in: it lets you remap and remix the channels of a stream while applying gain to each of them. The scary-looking numbers just apply some serious gain to my microphone channel, duplicate it, and mix it into each of the two stereo channels from the monitor stream.
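To make those numbers concrete: after amerge, c0/c1 are the monitor's left/right channels and c2 is the mic, and pan computes each output channel as 0.3× the monitor channel plus 3× the mic channel. The same arithmetic applied to a single sample frame (the sample values here are made up for illustration):

```shell
# One sample frame: monitor left/right and mic, as floats in [-1, 1].
L=0.5; R=0.4; M=0.1
# pan mapping: c0 = 0.3*c0 + 3*c2, c1 = 0.3*c1 + 3*c2
awk -v l="$L" -v r="$R" -v m="$M" \
  'BEGIN { printf "left=%.2f right=%.2f\n", 0.3*l + 3*m, 0.3*r + 3*m }'
# prints: left=0.45 right=0.42
```

The monitor is attenuated to 30% while the mic is boosted 3×, so the voice sits clearly on top of the desktop audio in both output channels.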

And an eternity later, I can finally record a screencast on Linux with ffmpeg. The best part is that I now know what I'm doing, and how everything works. The journey's half the fun!
