Dear Netflix

As long as you are busy re-encoding your content, can you please fix Star Trek: Voyager? It makes my eyes bleed.

The method that I use when converting content is to never trust what the content provider has told you, but to instead analyze every piece of content that is to be converted, even if it comes from the same series, from the same publisher, on the same media type.

I use the command line version of MediaInfo and some output from FFmpeg to get things done. I prefer Bash shell scripting as it is what I am most familiar with.

Get your information from MediaInfo:
mediainfo "$inputfile" > info.tmp

Capture the frames per second from the video:
fps=$(cat info.tmp | grep Frame | grep "[Rr]ate" | grep -v "[Mm]ode" | cut -d ":" -f2 | tr -d " fps" | head -1)

If the FPS reports as either empty or Variable then force a framerate that works. If I know that the content came from Europe I force it to 25fps whereas if it came from the US I force it to 23.976fps. You may need to review your content post encode to make sure you did not introduce telecine judder.
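A minimal sketch of that fallback, assuming a hypothetical $region variable that you set yourself based on where the content came from (MediaInfo will not tell you), looks like this:

if [ "$fps" == "" ] || [ "$fps" == "Variable" ]
then
    # $region is not something MediaInfo provides; set it by hand per title
    if [ "$region" == "EU" ]
    then
        fps="25.000"
    else
        fps="23.976"
    fi
fi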

Check to see if your content is interlaced, progressive, or uses the MBAFF method of interlacing:
scan1=$(cat info.tmp | grep "\(Interlaced\|Progressive\|MBAFF\)" | head -1 | cut -d ":" -f2 | tr -d " ")

If the content is in an MPEG Program Stream container, reports as 29.970fps, and does not announce whether it is interlaced, progressive, or MBAFF, then the content is actually 23.976fps using soft telecine:
if [ "$fps" == "29.970" ] && [ "$scan1" == "" ] && [ "$mpegps" == "MPEG-PS" ]
then
    fps="23.976"
    scan1="Progressive"
fi
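One piece not shown above is where $mpegps comes from. A sketch of how I would derive it, assuming the General section of the MediaInfo dump names the container on its Format line, is:

mpegps=$(cat info.tmp | grep "MPEG-PS" | head -1 | cut -d ":" -f2 | tr -d " ")

With anything other than a Program Stream the variable stays empty and the test above falls through.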

The odds are high that your media group received content from your provider in an MPEG-PS VOB container and did not look for interlaced content.

Detecting everything mentioned above ensures that fewer frames are encoded, eliminates telecine judder, means you do not have to worry about encoding interlacing artifacts, allows the bits per pixel to be spent more efficiently, and helps provide higher video quality for the customer.

In addition, order of operations can be important when encoding content. I always deinterlace first if necessary, then force the detected or overridden FPS, crop the content, resize or scale the content, and finally rotate the content. An example from my script is as follows:
ffmpeg -fpsprobesize $gop -i $inputfile -pix_fmt yuv420p $totaltime -vsync 1 -sn \
    -vcodec libx264 -map $vtrack $scan -r $fps \
    -vf "crop=$w1:$h1:$x1:$y1,scale=$fixedwidth:$fixedheight$fixrotation" \
    -threads 0 -b:v:$vtrack $averagevideobitrate -bufsize $buffer -maxrate $maximumvideobitrate -minrate $minimumvideobitrate \
    -strict experimental -acodec aac -map $audio -b:a:$audio $audiobitrate -ac 2 -ar $audiofrequency \
    -af "aresample=async=1:min_hard_comp=0.100000:first_pts=0" \
    -pass 1 -preset $newpreset -profile:v $defaultprofile -qmin 0 -qmax 63 \
    -keyint_min $minkeyframe -g $gop $newtune -x264opts no-scenecut -map_metadata -1 \
    -f mp4 -y $outputfile

Now go forth and encode.

Intelligent video encoding

I have been saying this for a few years now. Netflix has finally gotten on the bandwagon.

I worked at RealNetworks for over six years and became their onsite encoding expert for creating H.264 video with AAC audio in an MP4 container using FFmpeg after just three years. Our group was laid off when their Helix Streaming Media Server, which I supported, was discontinued.

I have converted most of my Blu-ray and DVD content, including one HD-DVD, to MP4 files and have found, just as the article says, that not all video is created equal. Why? Movement is expensive. In addition, grain is movement. Please do not get me started on encoding artifacts in the source media. NeatVideo, if you know how to use it, can help with both grain and encoding artifacts without having to resort to sharpening. The use of sharpening is, in my opinion, the refuge of the inept unless the source is so low quality that it looks like a blur. Even then, use it sparingly and only when it is absolutely needed. If you want a challenge, run NeatVideo against the movie Fight Club.

As an example, encode for yourself both a high-action video and a low-action video with x264 at CRF 21 using the veryfast preset and the baseline profile. When you are finished, use MediaInfo to look at the bit per pixel density (BPP) of the output video. The action video will have a much higher bitrate and BPP density than the low-action video. As such, you should target what the video requires.
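A hedged example of that experiment, with placeholder file names of my own choosing:

ffmpeg -i action-clip.mkv -vcodec libx264 -preset veryfast -profile:v baseline -crf 21 -an -y action-test.mp4
ffmpeg -i dialog-clip.mkv -vcodec libx264 -preset veryfast -profile:v baseline -crf 21 -an -y dialog-test.mp4
mediainfo action-test.mp4 | grep "Bits/(Pixel\*Frame)"
mediainfo dialog-test.mp4 | grep "Bits/(Pixel\*Frame)"

The grep lines pull the Bits/(Pixel*Frame) value straight out of the MediaInfo text report.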

My procedure for finding a decent bitrate is as follows:

1) Encode the video using the veryfast preset and the baseline profile at CRF 21 to find the bit per pixel density.

2) Perform a two-pass encode with the medium preset and the high444 profile using the BPP value found in that video; the bitrate math is sketched below. You will see that both the initial CRF-encoded video and the two-pass video are about the same size and, obviously, have the same BPP density. The output “CRF” value, as reported by FFmpeg, will be about 19.4 due to compression. I have covered this before. Don’t take my word for it; use the Moscow University Video Quality Measurement Tool.
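MediaInfo's BPP figure is simply bitrate divided by width x height x frame rate, so turning a measured BPP back into a target bitrate is a one-liner. A sketch of step 2, assuming $bpp, $width, $height, and $fps were read off the CRF test file with MediaInfo and $inputfile and $outputfile are placeholders:

targetbitrate=$(echo "($bpp * $width * $height * $fps)/1" | bc)
ffmpeg -i "$inputfile" -vcodec libx264 -preset medium -profile:v high444 -b:v $targetbitrate -pass 1 -an -f mp4 -y /dev/null
ffmpeg -i "$inputfile" -vcodec libx264 -preset medium -profile:v high444 -b:v $targetbitrate -pass 2 -acodec aac -strict experimental -y "$outputfile"

The division by 1 in bc simply truncates the result to a whole number of bits per second.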

The reason for the medium preset is that mobile devices and other hardware decoders (Roku, Apple TV, etc…) all have limitations on playing H.264 video content that has more than three reference frames. To date I have found no device that cannot handle the high444 profile, which prioritizes the luma (Y’) channel over chrominance (Cb Cr) even though manufacturers state that they only support the main profile with CABAC. The only devices that I have not tested were the old school Blackberry phones.
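If you would rather pin that limit explicitly than trust the preset, a hedged one-liner, assuming ffmpeg's libx264 wrapper, is:

ffmpeg -i "$inputfile" -vcodec libx264 -preset medium -profile:v high444 -refs 3 -crf 21 -acodec copy -y "$outputfile"

Here -refs 3 caps the reference frame count regardless of what the preset would otherwise choose.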

On a side note, use the output from both MediaInfo and FFmpeg to find the width, height, and FPS of the source, as well as the source audio frequency and bitrate. If you know what you are doing you can detect telecined content in MPEG-PS containers (VOB) so that you do not duplicate frames when encoding. In addition, forcing the frame rate to what the source media says it is will keep the framerate solid. Advanced class is automatic crop detection (beware “The Right Stuff” and “Tron Legacy”) and audio normalization, if your hearing is poor like mine is.
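For the crop detection piece, a sketch using ffmpeg's cropdetect filter; the seek point and frame count here are arbitrary, and titles whose framing changes mid-film (like the two mentioned above) need samples from more than one spot:

ffmpeg -ss 600 -i "$inputfile" -vf cropdetect -frames:v 1000 -an -f null - 2>&1 | grep crop= | tail -1

The filter logs its suggested crop=w:h:x:y values to stderr, and the last one can be fed straight into the crop filter shown earlier.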

How will this affect your production workflow? If you decide to implement this, not much. All that you need to do is perform a test encode to find the BPP density and then have your MBR content encoded to the same BPP density. If you are converting a series, do a test convert of a few episodes and find the right bitrate for you.

Extreme encoding settings, quality, and size.

I’ve been meaning to do some output quality testing and have finally gotten around to it. Because I like my content to be streamable via RTSP, RTMP, and HTTP (HLS or DASH), I encode to bitrate, as RTSP can be sensitive to bitrate fluctuation. I do my testing using CRF 21 for consistency of output and speed. For this testing I used the MSU Video Quality Measurement Tool, which will put out bad frames, a spreadsheet, and even a video showing you the differences between one video and another.

My typical encode is done using the medium preset. It uses a distance of three reference frames, which is compatible with hardware decoders.[1] I will also encode using the high444 profile, which, while technically unsupported by mobile phones, does in fact work. To date I have had zero problems with those settings when I tested multiple handsets from multiple manufacturers during my time at RealNetworks supporting their former product, Helix Server.

When I am going to encode to bitrate I do a first pass using CRF so that I can get a better idea of what the bit per pixel density is, but I encode that pass using the veryfast preset and the baseline profile. When I perform my two-pass encoding I encode to the bit per pixel density that the CRF file reported in MediaInfo. If you look at the first pass of a two-pass encode it will be smaller than the second pass, as the second pass puts back the bits lost to the compression used on the first pass. This behavior got me thinking.

The tests that I just ran were:
1) Encode using the veryfast preset and the baseline profile using CRF 21.

2) Encode using the medium preset and the high444 profile at CRF 21 with the following options (a full command is shown after the list):
-x264opts b-adapt=2:direct=auto:me=tesa:subme=11:aq-mode=2:aq-strength=1.0:fast_pskip=0:rc_lookahead=72:partitions=p8x8:trellis=2:weightp=2:merange=64:bframes=8
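For completeness, test 2 as a full command, with placeholder file names (test 1 is the same command with the veryfast preset, the baseline profile, and no -x264opts string):

ffmpeg -i source.mp4 -vcodec libx264 -preset medium -profile:v high444 -crf 21 -an -x264opts b-adapt=2:direct=auto:me=tesa:subme=11:aq-mode=2:aq-strength=1.0:fast_pskip=0:rc_lookahead=72:partitions=p8x8:trellis=2:weightp=2:merange=64:bframes=8 -y test-high444.mp4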

I took the files and then remuxed them into AVI, as MSU VQMT was having issues with the MP4 container.

ffmpeg -i input.mp4 -vcodec copy -an output.avi

Note that the input file framerate was 23.976fps and the output framerate became 47.952fps. Did this invalidate my test?[2] Possibly, but MediaInfo only looks at a small part of the video stream. If your video mixes 29.970fps interlaced content with 23.976fps content then it will know nothing of the 23.976fps content later in the video stream. Yes, I have seen this issue happen with several MPEG-PS files.

After remuxing the files and running them through MSU VQMT, I was not surprised to see that there were no quality differences between the baseline file and the high444 file. The SSIM reported in the spreadsheet was “AVG: 0.97723”, which I feel is in line with entropy encoding, and the only other difference was the size of the video stream.

The baseline file, as reported by MediaInfo, is as follows:
----------------------------------------
Video
ID                                       : 0
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Baseline@L3.0
Format settings, CABAC                   : No
Format settings, ReFrames                : 1 frame
Codec ID                                 : avc1
Duration                                 : 1mn 0s
Bit rate                                 : 1 459 Kbps
Width                                    : 854 pixels
Height                                   : 322 pixels
Display aspect ratio                     : 2.35:1
Frame rate mode                          : Variable
Frame rate                               : 47.952 fps
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.111
Stream size                              : 10.4 MiB (99%)
Writing library                          : x264 core 142 r2479 dd79a61
Encoding settings                        : cabac=0 / ref=1 / deblock=1:-1:-1 / analyse=0x1:0x111 / me=hex / subme=2 / psy=1 / psy_rd=1.00:0.15 / mixed_ref=0 / me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=8 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=0 / weightp=0 / keyint=120 / keyint_min=12 / scenecut=40 / intra_refresh=0 / rc_lookahead=10 / rc=crf / mbtree=1 / crf=21.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
----------------------------------------

The high444 profile with the extra x264 options looks like this:

----------------------------------------
Video
ID                                       : 0
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L3.0
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 4 frames
Codec ID                                 : avc1
Duration                                 : 59s 997ms
Bit rate                                 : 1 364 Kbps
Width                                    : 854 pixels
Height                                   : 322 pixels
Display aspect ratio                     : 2.35:1
Frame rate mode                          : Variable
Frame rate                               : 47.952 fps
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.103
Stream size                              : 9.76 MiB (99%)
Writing library                          : x264 core 142 r2479 dd79a61
Encoding settings                        : cabac=1 / ref=3 / deblock=1:-1:-1 / analyse=0x3:0x10 / me=tesa / subme=11 / psy=1 / psy_rd=1.00:0.15 / mixed_ref=1 / me_range=64 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=0 / chroma_qp_offset=-3 / threads=8 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=8 / b_pyramid=2 / b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 / keyint=120 / keyint_min=12 / scenecut=40 / intra_refresh=0 / rc_lookahead=72 / rc=crf / mbtree=1 / crf=21.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=2:1.00
----------------------------------------

Note the Bit Per Pixel density is lower on the more compressed version. This is expected because the video stream is smaller due to higher compression. As noted above the bits are put back and your Bit Per Pixel density is returned to what is expected when using two pass encoding.

What did I learn here? Video quality is directly affected by bitrate while compression merely makes the video stream smaller with no visible increase in quality. With two pass encoding to the target Bit Per Pixel density the quality will be higher at the same bitrate but may have some differences. For example I converted the fight scene from They Live many years ago using two similar bitrate based methods and they did not come out the same. You can see that video on YouTube here.

The question we are left with is: how much time do I really want to spend making the file just a bit smaller at the exact same quality? Me, not that much time.

1) I will always remember that three reference frames are the maximum distance thanks to a scene in Monty Python and the Holy Grail.

…And Saint Attila raised the hand grenade up on high, saying, “O LORD, bless this Thy hand grenade that with it Thou mayest blow Thine enemies to tiny bits, in Thy mercy.” And the LORD did grin and the people did feast upon the lambs and sloths and carp and anchovies and orangutans and breakfast cereals, and fruit bats and large chu… [At this point, the friar is urged by Brother Maynard to “skip a bit, brother”]… And the LORD spake, saying, “First shalt thou take out the Holy Pin, then shalt thou count to three, no more, no less. Three shall be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, neither count thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then lobbest thou thy Holy Hand Grenade of Antioch towards thy foe, who being naughty in My sight, shall snuff it.”

2) 23.976 * 2 == 47.952
ffprobe.exe sw4-gout-test-crf-baseline.avi
ffprobe version N-67742-g3f07dd6 Copyright (c) 2007-2014 the FFmpeg developers
built on Nov 16 2014 22:10:05 with gcc 4.9.2 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-zlib
libavutil      54. 13.100 / 54. 13.100
libavcodec     56. 12.101 / 56. 12.101
libavformat    56. 13.100 / 56. 13.100
libavdevice    56.  3.100 / 56.  3.100
libavfilter     5.  2.103 /  5.  2.103
libswscale      3.  1.101 /  3.  1.101
libswresample   1.  1.100 /  1.  1.100
libpostproc    53.  3.100 / 53.  3.100
Input #0, avi, from 'sw4-gout-test-crf-baseline.avi':
Metadata:
encoder         : Lavf56.13.100
Duration: 00:01:00.02, start: 0.000000, bitrate: 1469 kb/s
Stream #0:0: Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 854x322 [SAR 920:1037 DAR 40:17], 1459 kb/s, 47.95 fps, 23.98 tbr, 47.95 tbn, 47.95 tbc

Star Wars Episode 4

I have three versions of Star Wars Episode 4 and four images in each of the screenshots below. This should provide an overview of the challenges involved in performing color correction. Clockwise from the top left:

1) RAW VOB file from the Star Wars Ep 4 “GOUT” edition.

2) GOUT modified to MP4 in Sony Vegas with no filters. Vegas hates MPEG-PS audio tracks and that VOB reports time incorrectly.

3) Despecialized version 2.5 by Harmy.

4) Editdroid’s version from the 1993 Laserdisc in VOB format.

You will note that the GOUT version in Vegas looks washed out. This is caused by a levels issue: the video uses studio range (16-235) while Vegas displays it as full-range RGB (0-255). I can change the levels in Vegas, using one of the built-in presets, to make it look exactly the same as it does outside of Vegas.
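For what it is worth, the same stretch can be done outside of Vegas; a hedged ffmpeg equivalent, assuming you want to expand studio range to full range with the scale filter, would be:

ffmpeg -i "$inputfile" -vf "scale=in_range=tv:out_range=pc" -vcodec libx264 -crf 21 -acodec copy -y "$outputfile"

Whether you actually want to bake that expansion into a file, rather than just fixing the preview levels, is another question.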

The color palette used in GOUT is the same as the Laserdisc because they both came from the same master. I would love to get my hands on the 1985 Laserdisc release, but that thing is beyond rare.

The Despecialized edition suffers from the f’ing Hollywood look, with teal and orange slathered all over it as well as oversaturated colors. To fix issues like that I have to skew cyan towards blue. Desaturating yellow and red helps to fix the New Jersey fake-tan look seen in most movies. Green is occasionally oversaturated. Couple all of that with lightness adjustments for cyan, yellow, magenta, red, green, and blue and things get complicated quickly. Wait, levels are sometimes off as well. Add that to the mix.

I will be using the Editdroid version as my source and I am hoping to alter the color palette to be more in line with GOUT. Preliminary results at this point do not look promising at all. It looks like Harmy used the AAV ColorLab plugin, which I have, to modify colors. Sadly that plugin seems to have the side effect, at least on my machine, of screwing up some shades of orange, like traffic cones and the orange Pinto in The Blues Brothers. My monitor is color balanced using a Spyder3Pro.


From the research that I have done, there is no longer any such thing as a “correct” version of Star Wars that a mere mortal like me can get their hands on. My hope is that Disney will fix the color issues in any reissues it may put out, but that is a pipe dream at best.

http://en.wikipedia.org/wiki/List_of_changes_in_Star_Wars_re-releases

[Screenshots 001-016: GOUT / GOUT in Vegas / Despecialized / Editdroid comparison frames]