FFmpeg

From ArchWiki

From the project home page:

FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec - the leading audio/video codec library.

Installation

Note: You may encounter FFmpeg forks like libav and avconv; see The FFmpeg/Libav situation for a blog article about the differences between the projects and the current status of FFmpeg.

Install the ffmpeg package.

For the development version, install the ffmpeg-gitAUR package. There is also ffmpeg-fullAUR, which is built with as many optional features enabled as possible.

Encoding examples

Note:
  • It is important that parameters are specified in the correct order (e.g. input, video, filters, audio, output). Failing to do so may cause parameters to be skipped or may prevent FFmpeg from executing.
  • FFmpeg should automatically choose the number of CPU threads available. However, you may want to force the number of threads with the -threads number parameter.

See FFmpeg encoding wiki and ffmpeg(1) § EXAMPLES.

Screen capture

FFmpeg includes the x11grab and ALSA virtual devices that enable capturing the entire user display and audio input.

To take a screenshot screen.png:

$ ffmpeg -f x11grab -video_size 1920x1080 -i $DISPLAY -vframes 1 screen.png

where -video_size specifies the size of the area to capture.
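
To capture only part of the screen, x11grab accepts an offset appended to the display name in the form +x,y. The following is a minimal sketch that only builds that input string; the helper name grab_input and the fallback display :0.0 are assumptions made for illustration.

```shell
#!/bin/sh
# Build the x11grab input string "<display>+<x>,<y>" for a capture offset.
# grab_input is a hypothetical helper name, not part of FFmpeg.
grab_input() {
    x_offset=$1
    y_offset=$2
    # Fall back to :0.0 when DISPLAY is unset (an assumption for this sketch).
    printf '%s+%s,%s\n' "${DISPLAY:-:0.0}" "$x_offset" "$y_offset"
}

# Example: capture a 640x480 region whose top-left corner is at (100,200).
# ffmpeg -f x11grab -video_size 640x480 -i "$(grab_input 100 200)" -vframes 1 region.png
```

Here -video_size sets the size of the region and the offset sets its top-left corner.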

To take a screencast screen.mkv with lossless encoding and without audio:

$ ffmpeg -f x11grab -video_size 1920x1080 -framerate 25 -i $DISPLAY -c:v ffvhuff screen.mkv

Here, the Huffyuv codec is used, which is fast, but produces huge file sizes.

To take a screencast screen.mp4 with lossy encoding and with audio:

$ ffmpeg -f x11grab -video_size 1920x1080 -framerate 25 -i $DISPLAY -f alsa -i default -c:v libx264 -preset ultrafast -c:a aac screen.mp4

Here, the x264 codec with the fastest possible encoding speed is used. Other codecs can be used; if writing each frame is too slow (either due to inadequate disk performance or slow encoding), then frames will be dropped and video output will be choppy.

If the video stream should not be saved as a file, but used as a virtual webcam for screen sharing purposes, see v4l2loopback#Casting X11 using FFmpeg.

See also the official documentation.

Recording webcam

FFmpeg includes the video4linux2 and ALSA input devices that enable capturing webcam and audio input.

The following command will record a video webcam.mp4 from the webcam without audio, assuming that the webcam is correctly recognized under /dev/video0:

$ ffmpeg -f v4l2 -video_size 640x480 -i /dev/video0 -c:v libx264 -preset ultrafast webcam.mp4

where -video_size specifies the largest allowed image size from the webcam.

The above produces a silent video. To record a video webcam.mp4 from the webcam with audio:

$ ffmpeg -f v4l2 -video_size 640x480 -i /dev/video0 -f alsa -i default -c:v libx264 -preset ultrafast -c:a aac webcam.mp4

Here, the x264 codec with the fastest possible encoding speed is used. Other codecs can be used; if writing each frame is too slow (either due to inadequate disk performance or slow encoding), then frames will be dropped and video output will be choppy.

See also the official documentation.

VOB to any container

Concatenate the desired VOB files into a single stream and mux them to MPEG-2:

$ cat f0.VOB f1.VOB f2.VOB | ffmpeg -i - out.mp2

x264

Lossless

The ultrafast preset will provide the fastest encoding and is useful for quick capturing (such as screencasting):

$ ffmpeg -i input -c:v libx264 -preset ultrafast -qp 0 -c:a copy output

At the opposite end of the preset spectrum, veryslow encodes more slowly than ultrafast but produces a smaller output file:

$ ffmpeg -i input -c:v libx264 -preset veryslow -qp 0 -c:a copy output

Both examples will provide the same quality output.

Tip: If your computer is able to handle -preset superfast in realtime, you should use that instead of -preset ultrafast. The ultrafast preset compresses far less efficiently than superfast.

Constant rate factor

Use constant rate factor (CRF) encoding when you want a specific output quality. The general approach is to use the highest -crf value that still provides acceptable quality. Lower values give higher quality; 0 is lossless, 18 is visually lossless, and 23 is the default value. A sane range is between 18 and 28. Use the slowest -preset you have patience for. See the x264 Encoding Guide for more information.

$ ffmpeg -i video -c:v libx264 -tune film -preset slow -crf 22 -x264opts fast_pskip=0 -c:a libmp3lame -aq 4 output.mkv

The -tune option can be used to match the type and content of the media being encoded.
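
One way to find the highest acceptable -crf value is to encode the same input at several values and compare the results. The following is a minimal sketch; the helper names crf_output_name and crf_sweep, the chosen CRF values and the .mkv output naming are assumptions made for illustration.

```shell
#!/bin/sh
# Encode the same input at several CRF values to compare size and quality.
# crf_output_name and crf_sweep are hypothetical helper names.
crf_output_name() {
    # $1 = input file, $2 = CRF value; strips the extension and appends the CRF.
    printf '%s-crf%s.mkv\n' "${1%.*}" "$2"
}

crf_sweep() {
    input=$1
    for crf in 18 23 28; do
        ffmpeg -i "$input" -c:v libx264 -preset slow -crf "$crf" \
            -c:a copy "$(crf_output_name "$input" "$crf")"
    done
}

# Usage: crf_sweep video.mp4 && ls -l video-crf*.mkv
```

Comparing the resulting files side by side shows where the quality loss becomes visible for a given source.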

Two-pass (very high-quality)

Audio is disabled (-an) because only video statistics are recorded during the first pass:

$ ffmpeg -i video.VOB -an -vcodec libx264 -pass 1 -preset veryslow \
-threads 0 -b:v 3000k -x264opts frameref=15:fast_pskip=0 -f rawvideo -y /dev/null

The container format is automatically detected from the output file extension (.mkv), and the streams are muxed into it:

$ ffmpeg -i video.VOB -acodec aac -b:a 256k -ar 96000 -vcodec libx264 \
-pass 2 -preset veryslow -threads 0 -b:v 3000k -x264opts frameref=15:fast_pskip=0 video.mkv
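
Two-pass encoding is commonly used to reach a target file size, in which case the -b:v value can be derived from the size and the duration. The following is a minimal sketch of that arithmetic; video_bitrate_k is a hypothetical helper name, and the estimate ignores container overhead.

```shell
#!/bin/sh
# Estimate the video bitrate (in kbit/s) needed to reach a target file size.
# video_bitrate_k is a hypothetical helper; container overhead is ignored.
video_bitrate_k() {
    size_mib=$1      # target file size in MiB
    duration_s=$2    # duration in whole seconds
    audio_k=$3       # audio bitrate in kbit/s
    # 1 MiB = 8388608 bits; FFmpeg's "k" suffix means 1000.
    echo $(( size_mib * 8388608 / duration_s / 1000 - audio_k ))
}

# Usage: pass the same value to both passes, e.g.
# ffmpeg ... -b:v "$(video_bitrate_k 1000 1000 128)k" ...
```

The same bitrate must be given to the first and the second pass.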

Video stabilization

Video stabilization using the vid.stab plugin entails two passes.

First pass

The first pass records stabilization parameters to a file and/or a test video for visual analysis.

  • Records stabilization parameters to a file only
    $ ffmpeg -i input -vf vidstabdetect=stepsize=4:mincontrast=0:result=transforms.trf -f null -
  • Records stabilization parameters to a file and creates the test video "output-stab" for visual analysis
    $ ffmpeg -i input -vf vidstabdetect=stepsize=4:mincontrast=0:result=transforms.trf output-stab

Second pass

The second pass parses the stabilization parameters generated from the first pass and applies them to produce "output-stab_final". Apply any additional filters at this point to avoid a subsequent transcode and preserve as much video quality as possible. In addition to video stabilization, the following example uses:

  • unsharp is recommended by the author of vid.stab. Here we are simply using the defaults of 5:5:1.0:5:5:1.0
  • fade=t=in:st=0:d=4 fades in from black, starting at the beginning of the file, for four seconds
  • fade=t=out:st=60:d=4 fades out to black, starting sixty seconds into the video, for four seconds
  • -c:a pcm_s16le: the XAVC-S codec records in pcm_s16be, which is losslessly transcoded to pcm_s16le

$ ffmpeg -i input -vf vidstabtransform=smoothing=30:interpol=bicubic:input=transforms.trf,unsharp,fade=t=in:st=0:d=4,fade=t=out:st=60:d=4 -c:v libx264 -tune film -preset veryslow -crf 8 -x264opts fast_pskip=0 -c:a pcm_s16le output-stab_final
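
The two passes can be combined into one function. The following is a minimal sketch under stated assumptions: stabilize is a hypothetical wrapper name, the filter options mirror the examples above, and the encoder settings (-preset slow, -crf 18, -c:a copy) are illustrative choices rather than recommendations from vid.stab.

```shell
#!/bin/sh
# Run both vid.stab passes for one input file.
# stabilize is a hypothetical wrapper; encoder settings are illustrative.
stabilize() {
    input=$1
    output=$2
    trf=${3:-transforms.trf}

    # Pass 1: analyse motion and write the transform data; no video is written.
    ffmpeg -i "$input" \
        -vf "vidstabdetect=stepsize=4:mincontrast=0:result=$trf" \
        -f null - || return 1

    # Pass 2: apply the transforms, sharpen, and encode.
    ffmpeg -i "$input" \
        -vf "vidstabtransform=smoothing=30:interpol=bicubic:input=$trf,unsharp" \
        -c:v libx264 -preset slow -crf 18 -c:a copy "$output"
}

# Usage: stabilize shaky.mp4 steady.mkv
```

The second pass is skipped if the first one fails, so a stale transforms file is not applied by accident.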

x265

Example command showing the defaults when libx265 is invoked without any parameters (Constant Rate Factor encoding):

$ ffmpeg -i input -c:v libx265 -crf 28 -preset medium -c:a libvorbis output.mp4

See FFmpeg H.265/HEVC Video Encoding Guide for more information.

Single-pass MPEG-2 (near lossless)

Allow FFmpeg to automatically set DVD standardized parameters. Encode to DVD MPEG-2 at ~30 FPS:

$ ffmpeg -i video.VOB -target ntsc-dvd output.mpg

Encode to DVD MPEG-2 at ~24 FPS:

$ ffmpeg -i video.VOB -target film-dvd output.mpg

Subtitles

Extracting

Subtitles embedded in container files, such as MPEG-2 and Matroska, can be extracted and converted into SRT, SSA, WebVTT among other subtitle formats [1].

  • Inspect a file to determine if it contains a subtitle stream:
$ ffprobe -hide_banner foo.mkv
...
Stream #0:0(und): Video: h264 (High), yuv420p, 1920x800 [SAR 1:1 DAR 12:5], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
  Metadata:
  CREATION_TIME   : 2012-06-05 05:04:15
  LANGUAGE        : und
Stream #0:1(und): Audio: aac, 44100 Hz, stereo, fltp (default)
 Metadata:
 CREATION_TIME   : 2012-06-05 05:10:34
 LANGUAGE        : und
 HANDLER_NAME    : GPAC ISO Audio Handler
Stream #0:2: Subtitle: ssa (default)
  • foo.mkv has an embedded SSA subtitle which can be extracted into an independent file:
$ ffmpeg -i foo.mkv foo.ssa

Add -c:s srt to save the subtitles in the desired format, e.g. SubRip:

$ ffmpeg -i foo.mkv -c:s srt foo.srt

When dealing with multiple subtitles, you may need to specify the stream that needs to be extracted using the -map key:stream parameter:

$ ffmpeg -i foo.mkv -map 0:2 foo.ssa
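
When a file carries several subtitle streams, ffprobe can list their indexes so that each one can be mapped in turn. The following is a minimal sketch; list_subtitle_streams and extract_all_subtitles are hypothetical helper names, and the output naming scheme is an assumption. It only works for text-based subtitles, since image-based formats cannot be converted to SRT.

```shell
#!/bin/sh
# List subtitle streams as "index,codec" lines, then extract each to SRT.
# Both function names are hypothetical helpers, not FFmpeg commands.
list_subtitle_streams() {
    ffprobe -v error -select_streams s \
        -show_entries stream=index,codec_name \
        -of csv=p=0 "$1"
}

extract_all_subtitles() {
    input=$1
    # "codec" absorbs the rest of each line and is not used further.
    list_subtitle_streams "$input" | while IFS=, read -r index codec; do
        # -nostdin keeps ffmpeg from consuming the loop's input.
        ffmpeg -nostdin -i "$input" -map "0:$index" "${input%.*}.$index.srt"
    done
}

# Usage: extract_all_subtitles foo.mkv
```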

Hardsubbing

(instructions based on HowToBurnSubtitlesIntoVideo at the FFmpeg wiki)

Hardsubbing entails merging subtitles with the video. Hardsubs cannot be disabled, nor can their language be switched.

  • Overlay foo.mpg with the subtitles in foo.ssa:
$ ffmpeg -i foo.mpg -vf subtitles=foo.ssa out.mpg

Volume gain

Volume gain can be modified through ffmpeg's filter function. First select the audio stream by using -af or -filter:a, then select the volume filter followed by the number that you want to change the stream by. For example:

$ ffmpeg -i input.flac -af volume=1.5 output.flac

Here volume=1.5 provides a 150% volume gain. To halve the volume, use 0.5 instead of 1.5. The volume filter can also take a decibel value: use volume=3dB to increase the volume by 3 dB or volume=-3dB to decrease it by 3 dB.

Note: Doubling a file's volume gain is not the same thing as doubling its volume. You will have to experiment to find the right volume.
Tip: To know the current average and peak volume of a file, one can use the volumedetect filter: ffmpeg -i input.flac -af volumedetect -f null -. Then, the difference between the target and the current level can be provided to the volume filter to achieve the desired level.
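
The difference between the target and the measured peak can be computed from the volumedetect log. The following is a minimal sketch, assuming the log contains a line ending in "max_volume: <value> dB"; peak_gain is a hypothetical helper name.

```shell
#!/bin/sh
# Derive the gain needed to bring the peak level to a target (default 0 dB).
# peak_gain is a hypothetical helper; it reads volumedetect's log on stdin.
peak_gain() {
    target=${1:-0}
    awk -v target="$target" '/max_volume:/ {
        # The value is the second-to-last field, e.g. "max_volume: -5.0 dB".
        printf "%.1fdB\n", target - $(NF - 1)
        exit
    }'
}

# Usage (volumedetect logs to stderr, hence 2>&1):
# gain=$(ffmpeg -i input.flac -af volumedetect -f null - 2>&1 | peak_gain -1)
# ffmpeg -i input.flac -af "volume=$gain" output.flac
```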

Volume normalization

A given average and peak volume can also be achieved through normalization using the loudnorm filter. To normalize the perceived loudness of a file using ffmpeg's default values for the target average, peak and range loudness (respectively -24 LUFS, -2 dBTP and 7 LU), use:

$ ffmpeg -i input.flac -af loudnorm output.flac

To obtain a different loudness profile, use the i, tp and lra parameters of the filter to indicate respectively the integrated, true peak and loudness range. For example for a higher perceived loudness than the default, use:

$ ffmpeg -i input.flac -af loudnorm=i=-16:tp=-1.5:lra=11:print_format=summary output.flac

In this example, print_format=summary is also added to display the input and output loudness values of the audio file.

Note: The filter also supports two-pass mode, extracting the measured loudness values from a first run and using them in a second run to perform a linear normalization. See ffmpeg loudnorm documentation for more information.
Tip: To know the current loudness measures of a file, use ffmpeg -i input.flac -af loudnorm=print_format=summary -f null -.
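
For the two-pass mode, the values measured in a first run (with print_format=json these are reported as input_i, input_tp, input_lra, input_thresh and target_offset) are passed back to the filter in the second run. The following is a minimal sketch that only assembles the second-pass filter string; loudnorm_pass2_filter is a hypothetical helper name, and the targets reuse the values from the example above.

```shell
#!/bin/sh
# Assemble the loudnorm filter string for the second (linear) pass.
# loudnorm_pass2_filter is a hypothetical helper name.
loudnorm_pass2_filter() {
    # $1..$5 = input_i, input_tp, input_lra, input_thresh, target_offset
    printf 'loudnorm=i=-16:tp=-1.5:lra=11:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Usage with illustrative (made-up) measurements:
# ffmpeg -i input.flac -af "$(loudnorm_pass2_filter -27.2 -4.5 8.1 -37.6 0.4)" output.flac
```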

Extracting audio

$ ffmpeg -i video.mpg output.ext
...
Input #0, avi, from 'video.mpg':
  Duration: 01:58:28.96, start: 0.000000, bitrate: 3000 kb/s
    Stream #0.0: Video: mpeg4, yuv420p, 720x480 [PAR 1:1 DAR 16:9], 29.97 tbr, 29.97 tbn, 29.97 tbc
    Stream #0.1: Audio: ac3, 48000 Hz, stereo, s16, 384 kb/s
    Stream #0.2: Audio: ac3, 48000 Hz, 5.1, s16, 448 kb/s
    Stream #0.3: Audio: dts, 48000 Hz, 5.1 768 kb/s
...

Extract the first (-map 0:1) AC-3 encoded audio stream exactly as it was multiplexed into the file:

$ ffmpeg -i video.mpg -map 0:1 -acodec copy -vn video.ac3

Convert the third (-map 0:3) DTS audio stream to an AAC file with a bitrate of 192 kb/s and a sampling rate of 96000 Hz:

$ ffmpeg -i video.mpg -map 0:3 -acodec aac -b:a 192k -ar 96000 -vn output.aac

-vn disables the processing of the video stream.

Extract a specific time interval of an audio stream:

$ ffmpeg -ss 00:01:25 -t 00:00:05 -i video.mpg -map 0:1 -acodec copy -vn output.ac3

-ss specifies the start point, and -t specifies the duration.

Stripping audio

  1. Copy the first video stream (-map 0:0) along with the second AC-3 audio stream (-map 0:2).
  2. Convert the AC-3 audio stream to two-channel MP3 with a bitrate of 128 kb/s and a sampling rate of 48000 Hz.
$ ffmpeg -i video.mpg -map 0:0 -map 0:2 -vcodec copy -acodec libmp3lame \
-b:a 128k -ar 48000 -ac 2 video.mkv
$ ffmpeg -i video.mkv
...
Input #0, avi, from 'video.mpg':
  Duration: 01:58:28.96, start: 0.000000, bitrate: 3000 kb/s
    Stream #0.0: Video: mpeg4, yuv420p, 720x480 [PAR 1:1 DAR 16:9], 29.97 tbr, 29.97 tbn, 29.97 tbc
    Stream #0.1: Audio: mp3, 48000 Hz, stereo, s16, 128 kb/s
Note: Removing undesired audio streams allows for additional bits to be allocated towards improving video quality.

Splitting files

You can use the copy codec to perform operations on a file without changing the encoding. For example, this allows you to easily split any kind of media file into two:

$ ffmpeg -i file.ext -t 00:05:30 -c copy part1.ext -ss 00:05:30 -c copy part2.ext
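
The same command can be wrapped so that the split point and the extension are taken from the arguments. The following is a minimal sketch; split_at is a hypothetical helper name, and the part1/part2 output names follow the example above. Because the streams are copied rather than re-encoded, the cut generally lands on a keyframe and may not be frame-exact.

```shell
#!/bin/sh
# Split a file into two parts at a given timestamp without re-encoding.
# split_at is a hypothetical helper name.
split_at() {
    input=$1
    point=$2
    ext=${input##*.}   # reuse the input's extension for both parts
    ffmpeg -i "$input" \
        -t "$point" -c copy "part1.$ext" \
        -ss "$point" -c copy "part2.$ext"
}

# Usage: split_at file.mkv 00:05:30
```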

Hardware video acceleration

Encoding and decoding performance may be improved by using hardware acceleration APIs. However, only specific codecs are supported, and the result may not always match that of software encoding.

VA-API

VA-API can be used for encoding and decoding on Intel CPUs (requires intel-media-driver or libva-intel-driver) and on certain AMD GPUs when using the open-source AMDGPU driver (requires libva-mesa-driver). See the FFmpeg documentation for information about available parameters and supported platforms.

An example of encoding using the supported H.264 codec:

$ ffmpeg -threads 1 -i file.ext -vaapi_device /dev/dri/renderD128 -vcodec h264_vaapi -vf format='nv12|vaapi,hwupload' output.mp4
Note: VA-API is generally enabled by ffmpeg's autodetect feature during build time, as long as it detects the respective headers and libraries included in libva, which should be a dependency of the FFmpeg package.

For a quick reference, a constant quality encoding can be achieved with:

$ ffmpeg -vaapi_device /dev/dri/renderD128 -i input.mp4 -vf 'format=nv12,hwupload' -c:v hevc_vaapi -f mp4 -rc_mode 1 -qp 25 output.mp4

If using hevc_vaapi, tune -qp upwards from 25 (visually identical); a very small visual loss starts to appear at 28. If using h264_vaapi, tune upwards from 18 (visually identical); a very small visual loss starts to appear at 20. Also, hevc_vaapi seems to encode 50% faster than h264_vaapi.

NVIDIA NVENC/NVDEC

NVENC and NVDEC can be used for encoding/decoding when using the proprietary NVIDIA driver with the nvidia-utils package installed. The minimum supported GPUs are from the 600 series; see Hardware video acceleration#NVIDIA for details.

This old gist provides some techniques. NVENC is somewhat similar to CUDA, so it works even from a terminal session. Depending on the hardware, NVENC can be several times faster than Intel's VA-API encoders.

To print available options execute (hevc_nvenc may also be available):

$ ffmpeg -help encoder=h264_nvenc

Example usage:

$ ffmpeg -i source.ext -c:v h264_nvenc -rc constqp -qp 28 output.mkv

Intel QuickSync (QSV)

Intel® Quick Sync Video uses media processing capabilities of an Intel GPU to decode and encode fast, enabling the processor to complete other tasks and improving system responsiveness.

This requires a libmfx runtime implementation to be installed. libmfx is a dispatcher library that loads an implementation at runtime based on the underlying hardware platform.

When running under Iron Lake (Gen5) to Ice Lake (Gen11) GPUs, it will load intel-media-sdk as the runtime implementation.

When running under Tiger Lake (Gen12) and newer GPUs, libmfx will load onevpl-intel-gpu. See also the oneVPL-intel-gpu system-requirements.

The runtime implementation cannot be changed or chosen on systems with a single Intel GPU, so the implementation matching the hardware on which it will run should be installed.

Failure to install said runtime will result in errors like the following:

[AVHWDeviceContext @ 0x558283838c80] Error initializing an MFX session: -3.
Device creation failed: -1313558101.

The usage of QuickSync is described in the FFmpeg Wiki. It is recommended to use VA-API [2] with either the iHD or i965 driver instead of using libmfx directly; see the FFmpeg Wiki section Hybrid transcode for encoding examples and Hardware video acceleration#Configuring VA-API for driver instructions.

AMD AMF

AMD added support for H.264-only video encoding on Linux through the AMD Video Coding Engine (GPU encoding) with the AMDGPU PRO proprietary packages, and ffmpeg added support for AMF video encoding. To encode using the h264_amf video encoder, amf-amdgpu-proAUR is required. You may need to point to the ICD file provided by the AMDGPU PRO packages through an environment variable; otherwise ffmpeg could use the open AMDGPU driver's ICD file and be unable to use this video encoder. An example encoding command:

$ VK_DRIVER_FILES=/usr/share/vulkan/icd.d/amd_pro_icd64.json ffmpeg -hwaccel auto -vaapi_device /dev/dri/renderD128 -i input.mkv -c:v h264_amf -rc 1 -b:v 8M h264_amf_8M.mp4

For a quick reference, a constant quality encoding can be achieved with:

$ VK_DRIVER_FILES=/usr/share/vulkan/icd.d/amd_pro_icd64.json ffmpeg -hwaccel auto -vaapi_device /dev/dri/renderD128 -i input.mp4 -c:v h264_amf -f mp4 -rc 0 -qp_b 22 -qp_i 22 -qp_p 22 -quality 2 output.mp4

Tune the three -qp_(b|i|p) values together: 18 is visually identical, and a very small visual loss starts to appear at 22.

Animated GIF

Whilst animated GIFs are generally a poor choice of video format due to their poor image quality, relatively large file size and lack of audio support, they are still in frequent use on the web. The following command can be used to turn a video into an animated GIF:

$ ffmpeg -i input.mp4 -vf "fps=10,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop -1 output.gif

See http://blog.pkh.me/p/21-high-quality-gif-with-ffmpeg.html for more information on using the palette filters to generate high quality GIFs.
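
The frame rate and output width are the two settings most often adjusted to control GIF size. The following is a minimal sketch that builds the filtergraph from those two values; gif_filter is a hypothetical helper name, and the added scale step with flags=lanczos is an assumption based on common practice, not part of the command above.

```shell
#!/bin/sh
# Build a palette-based GIF filtergraph for a given frame rate and width.
# gif_filter is a hypothetical helper name; height follows the aspect ratio.
gif_filter() {
    fps=$1
    width=$2
    printf 'fps=%s,scale=%s:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse\n' \
        "$fps" "$width"
}

# Usage (for the GIF muxer, -loop 0 loops forever and -loop -1 disables looping):
# ffmpeg -i input.mp4 -vf "$(gif_filter 10 480)" -loop 0 output.gif
```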

Preset files

Populate ~/.ffmpeg with the default preset files:

$ cp -iR /usr/share/ffmpeg ~/.ffmpeg

Create new and/or modify the default preset files:

~/.ffmpeg/libavcodec-vhq.ffpreset
vtag=DX50
mbd=2
trellis=2
flags=+cbp+mv0
pre_dia_size=4
dia_size=4
precmp=4
cmp=4
subcmp=4
preme=2
qns=2

Using preset files

Enable the -vpre option after declaring the desired -vcodec

libavcodec-vhq.ffpreset

  • libavcodec = Name of the vcodec/acodec
  • vhq = Name of specific preset to be called out
  • ffpreset = FFmpeg preset filetype suffix

Tips and tricks

Reduce verbosity

Use a combination of the following options to reduce verbosity to the desired level:

  • -hide_banner: prevents ffmpeg from outputting its copyright notice, build options and library versions
  • -loglevel: modulates verbosity (fine-tuning options are available), e.g. -loglevel warning
  • -nostats: disables printing of encoding progress/statistics
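
The three options can be combined in a small wrapper so they do not have to be retyped. The following is a minimal sketch; the function name ffq and the warning log level are arbitrary choices made for illustration.

```shell
#!/bin/sh
# Run ffmpeg with reduced verbosity; all arguments are passed through.
# ffq is a hypothetical wrapper name.
ffq() {
    ffmpeg -hide_banner -loglevel warning -nostats "$@"
}

# Usage: ffq -i input.mkv -c copy output.mp4
```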

Output the duration of a video

$ ffprobe -select_streams v:0 -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 file.ext
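
The command above prints the duration in seconds with a fractional part. The following is a minimal sketch for turning that value into HH:MM:SS; format_duration is a hypothetical helper name, and it assumes a numeric input (it does not handle a value of N/A).

```shell
#!/bin/sh
# Convert a duration in seconds (e.g. "7108.96") to HH:MM:SS.
# format_duration is a hypothetical helper; fractional seconds are dropped.
format_duration() {
    total=${1%.*}
    total=${total:-0}   # treat inputs such as ".5" as zero whole seconds
    printf '%02d:%02d:%02d\n' \
        $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

# Usage:
# format_duration "$(ffprobe -v error -select_streams v:0 \
#     -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 file.ext)"
```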

Output stream information as JSON

$ ffprobe -v quiet -print_format json -show_format -show_streams file.ext

Create a screenshot of the video every X seconds

$ ffmpeg -i file.ext -an -s 319x180 -vf fps=1/100 -qscale:v 75 %03d.jpg

See also