Color grading and how to do it the hard way

I hadn’t paid much attention to color grading until I was blatantly subjected to it via the first Transformers movie. Yes, I have no taste, but because I grew up in the 80s I was compelled to watch it in the theater. A while later I ran across this page that addressed my concerns, but then I asked myself if it would be possible to make it suck less.

At the time I was using Sony Vegas Movie Studio HD Platinum 11 but couldn’t find any decent method to do color grading the way I wanted to do it until I ended up stumbling upon AAV ColorLab’s plugin. I was able to scrape away a lot of the problems, but in some cases it just wouldn’t work like I hoped it would.

I then decided to give DaVinci Resolve a try, but I found the interface less than intuitive even after watching several videos on the subject, so back to AAV ColorLab it was.

Around that time I really started to get back into photography and had a picture that needed some major color correction, and it was then that I discovered Selective Color in Photoshop CS5. I looked and looked for a video equivalent but didn’t find anything for years, until I found that somebody had added the same selective color logic that Photoshop uses to FFmpeg. FFmpeg is my bag, and after screwing around with the filter I finally found a workflow that sucks but works consistently.

The thing that is most important to me when I am attempting to unscrew that horrible teal and orange color grading, which is often cyan and orange color grading, is to make sure that white remains white, grey remains grey, and black remains black. This also includes skin color and green trees and grass. If things are done properly, skin color returns to normal and removing the overbearing teal/cyan that is slathered over the screen brings the colors back to what they were, or at least close to what they might have been.
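
One way to see why adjusting only the reds, yellows, cyans, and blues can leave white, grey, and black alone is a toy model of how a selective color filter weights pixels. The Python below is a sketch under stated assumptions: the function name `range_weights` and the max/mid/min weighting are illustrative and are not copied from Photoshop or FFmpeg. In this model a pixel only counts toward a color range in proportion to how far apart its channels are, so anything with equal red, green, and blue gets a weight of zero everywhere.

```python
def range_weights(r, g, b):
    """Toy model: how strongly an RGB pixel (0..255) belongs to each color range."""
    hi, mid, lo = sorted((r, g, b), reverse=True)
    rgb_scale = hi - mid  # how much one primary dominates the other two
    cmy_scale = mid - lo  # how much one primary is missing compared to the others
    return {
        "reds": rgb_scale if r == hi else 0,
        "greens": rgb_scale if g == hi else 0,
        "blues": rgb_scale if b == hi else 0,
        "cyans": cmy_scale if r == lo else 0,
        "magentas": cmy_scale if g == lo else 0,
        "yellows": cmy_scale if b == lo else 0,
    }


# Neutral pixels have no weight in any color range, so they are left alone.
for neutral in [(255, 255, 255), (128, 128, 128), (0, 0, 0)]:
    print(neutral, range_weights(*neutral))

# A teal pixel and an orange pixel for comparison.
print("teal", range_weights(0, 128, 128))
print("orange", range_weights(255, 128, 0))
```

In this model a teal pixel lands entirely in the cyans range and an orange one is split between reds and yellows, which lines up with those being the ranges that get adjusted the most.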

Now don’t get me wrong. There are a lot of things done right with color grading such as the Matrix series of movies and others like “Because of Winn Dixie” which have obvious color grading but help bring the movie to life rather than perform second degree assault on a person’s retinas. What annoys me to no end is when a perfectly good movie gets “the treatment” by somebody who thought the original movie needed a bit of help when put on DVD or Blu-ray. Don’t believe me? Take a look at the differences between “The Alien Legacy” on DVD and the “Alien Anthology” on Blu-ray. The DVDs in “The Alien Legacy” release didn’t have a lot of color grading on them outside some mild low, midrange, and high adjustments, but the “Alien Anthology” was broken so hard I had to fix several sections by hand in Sony Vegas.

For example, the scene where John Hurt’s character Kane descended into the cave filled with eggs was missing most of the eggs because they crushed the blacks. After a lot of work dicking around with gamma and levels I was able to get back most of the eggs that were present in the DVD, which I used for reference. Once done I was able to do a single pass with FFmpeg to unscrew the damage that was done. It now looks very similar to, but not exactly like, the DVD version.

Now how do I find a non-horrible setting for color grading? I take a screenshot of the movie where the color grading sucks, pop it into Adobe Photoshop, and start tweaking Selective Color until it sucks less. In some instances it can all come together in ten screenshots or fewer, however in other instances I’ll have to go across an entire series to find the right global settings. To date, for every movie series where I have attempted to shrink the sucking wound that is teal/cyan and orange color grading, I have been able to reuse the same settings across multiple movies, with the Alien series being the exception.

Below are a few examples from my FFmpeg script that may provide insight into what this looks like.

Alien 1979:
-vf selectivecolor=reds=0 -0.20 -0.20 0:yellows=0 0 -0.20 0.10:cyans=-0.66 -0.50 0.20 0.75:blues=0 0 -0.50 0.15

-vf selectivecolor=cyans=-0.33 0.45 0.33 -0.15

-vf selectivecolor=reds=0 -0.15 -0.15 0:yellows=0 0 -0.20 0.10:cyans=-0.33 0.25 0.33 -0.15

DC Extended Universe:
-vf selectivecolor=reds=0 -0.15 -0.15 0:yellows=0 0 -0.2 0.1:cyans=-0.33 0.33 0.33 -0.20

Harry Potter series:
-vf selectivecolor=cyans=-0.33 0.33 0.66 -0.2:greens=0.15 0.15 -0.15 0

Lord of the Rings and The Hobbit:
-vf selectivecolor=reds=0 -0.15 -0.15 0.15:yellows=0 0 -0.2 0:greens=-0.25 0.25 0 -0.15:cyans=0 0.50 0.50 -0.33

Marvel Cinematic Universe (MCU):
-vf selectivecolor=reds=0 -0.2 -0.2 0.1:yellows=0 0 -0.2 0.05:cyans=-0.50 0.50 0.50 -0.30

-vf selectivecolor=reds=0 -0.1 -0.1 0.1:yellows=0 0 -0.1 0.05:cyans=0 0.1 0.1 -0.05
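
For anyone scripting this, each color range in the examples above takes four numbers: the cyan, magenta, yellow, and black adjustments, each between -1 and 1. Because the values are separated by spaces, the filter has to be quoted when typed into a shell. Below is a small Python sketch that builds the filter string and an argument list, which sidesteps the quoting. The helper names `selectivecolor_filter` and `ffmpeg_args`, the file names, and the `-c:a copy` audio choice are assumptions for illustration and are not part of the original script.

```python
# Color ranges accepted by FFmpeg's selectivecolor filter.
RANGES = {"reds", "yellows", "greens", "cyans", "blues",
          "magentas", "whites", "neutrals", "blacks"}


def selectivecolor_filter(adjustments):
    """Build a selectivecolor filter string from {range_name: (c, m, y, k)}."""
    parts = []
    for name, values in adjustments.items():
        if name not in RANGES:
            raise ValueError(f"unknown color range: {name}")
        if len(values) != 4:
            raise ValueError(f"{name} needs four values (C, M, Y, K)")
        if any(not -1 <= v <= 1 for v in values):
            raise ValueError(f"{name} values must be between -1 and 1")
        parts.append(name + "=" + " ".join(f"{v:g}" for v in values))
    return "selectivecolor=" + ":".join(parts)


def ffmpeg_args(src, dst, adjustments):
    """Return an argument list suitable for subprocess.run (no shell quoting needed)."""
    return ["ffmpeg", "-i", src, "-vf", selectivecolor_filter(adjustments),
            "-c:a", "copy", dst]


# The Harry Potter settings from above; file names are placeholders.
harry_potter = {"cyans": (-0.33, 0.33, 0.66, -0.2),
                "greens": (0.15, 0.15, -0.15, 0)}
print(ffmpeg_args("input.mkv", "output.mkv", harry_potter))
```

Passing the list to `subprocess.run` would start the encode. Printing it first makes it easy to check the filter string before committing to a long render.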

Using selective color in FFmpeg causes rendering on my current PC to slow down for each color that is modified as reflected by my overall CPU usage. I currently believe that this is a memory bandwidth issue on my machine but will not know until I upgrade to something with a bit more power.

How to extract and convert closed caption files the hard way.

I will be using the terms “closed captions” and “subtitles” interchangeably in this post because it isn’t always possible to know if the source binary image based SUP file you have has either closed captions, which include both descriptive text and dialog, or subtitles, which contain only dialog, in them.

I’ve been watching a few TV shows on Hulu and believe that at least Cloak & Dagger as well as Stitchers had their subtitles ripped from an m2ts file using either HdBr Stream Extractor v9 or MeGUI, likely from Blu-ray, and converted using Subtitle Edit. Why do I believe that this is the case?

More often than not I will see a sentence in italics that has two or more words touching each other near the middle of the sentence because the distance between italic letters in pixels is much smaller than between normal letters. Why it doesn’t happen as often across the entire sentence is beyond me at this time. Then again, I have a massive replace list. A space width of ten or eleven pixels works well for most Blu-ray content. Subtitle Edit likes DVD subtitles to be around 6-8 pixels apart because the letters are lower resolution. Your mileage will vary.

You can adjust Subtitle Edit to look for letters closer together or further apart based on the number of pixels you tell it are in a space, but this is a global setting for each input file and cannot be adjusted specifically for italics because everything is in an image based format, specifically a SUP file. For example, if you tell it to look for letters/blocks closer together, you will likely get a lot of individual characters instead of words. If you tell it to look for letters/blocks further apart, you will merge a lot of words together.

Thiscanbea badthing. I t c a n a l s o b e a b a d t h in g.
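
The effect of that single pixel setting can be sketched in a few lines of Python. This is a simplified illustration and not Subtitle Edit's actual algorithm: the `layout` and `group_into_words` helpers, the 10 pixel glyph width, and the gap sizes are all made up for the example. Each glyph is a box with a start and end position, and a new word starts whenever the gap to the previous box is at least `space_px`.

```python
def layout(text, letter_gap, word_gap, glyph_width=10):
    """Fake a row of glyph boxes as (char, x_start, x_end) tuples."""
    glyphs, x, gap = [], 0, 0
    for char in text:
        if char == " ":
            gap = word_gap  # the next glyph sits a word gap away
            continue
        x += gap
        glyphs.append((char, x, x + glyph_width))
        x += glyph_width
        gap = letter_gap
    return glyphs


def group_into_words(glyphs, space_px):
    """Start a new word whenever the gap between boxes is >= space_px."""
    words, current, prev_end = [], "", None
    for char, x_start, x_end in glyphs:
        if prev_end is not None and x_start - prev_end >= space_px:
            words.append(current)
            current = ""
        current += char
        prev_end = x_end
    if current:
        words.append(current)
    return " ".join(words)


# Tightly set italics: 3 pixels between letters, only 8 between words.
italic = layout("be a bad", letter_gap=3, word_gap=8)
for space_px in (3, 8, 10):
    print(space_px, repr(group_into_words(italic, space_px)))
```

With these made-up numbers a threshold of 8 recovers the words, 10 glues them together, and 3 splits every letter apart. A single global threshold has to work for the normal and italic text in the same file, which is where it falls down.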

Subtitle Edit has two methods for OCR.
1) Tesseract. This does a decent job but I no longer use it as it has problems with some fonts and italics.

2) Binary Image Compare. Blu-ray MPEG-TS and DVD MPEG-PS containers use images for playback of closed captions. This is what I use and what I think Hulu also uses. It is also the recommended option for Subtitle Edit.

Binary Image Compare has to be trained to look at the many different fonts that you can come across right down to the letter, number, punctuation, and symbol level. The process to teach it what each letter looks like, typically multiple times for the same exact letter early on, is onerous and will crush your soul. When it comes across a “block” of information it asks you what it is and if it is italic or not. You can expand the block to fit quotes and the like, but you cannot shrink it from what it originally detects. Sometimes it will detect “rt” as a single block so you have to add “rt” as a letter. This is most common with italics, but depending on the font it can also affect normal letters and numbers.

When Subtitle Edit comes across a word it doesn’t know it will ask you to do one of a few things.

A) Add to names/noise list (case sensitive)
This is for things like Hogwarts or WebRTC.

B) Add to user dictionary
This is for adding words that are case insensitive that are not in its default dictionary.

C) Add pair to OCR replace list
A lower case L looks the same as an upper case “I” in most sans-serif fonts. You will need to use this to fix things like:
Iower (with uppercase i's)
lower (with lower case L's)

This can also fix words that are too close together, for example replacing “Thiscanbea” with “This can be a”.

D) Google it.
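
The idea behind the replace list in option C can be sketched in Python as a table of wrong/right pairs applied to whole words only. This is an illustration of the concept, not how Subtitle Edit stores or applies its list, and the `apply_replace_list` name and the sample pairs are assumptions. Matching whole words is what lets a misread “Iower” get fixed without touching a legitimate standalone “I”.

```python
import re


def apply_replace_list(text, pairs):
    """Replace each wrong word with its fix, matching whole words only."""
    for wrong, right in pairs.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement keeps backslashes in `right` literal.
        text = re.sub(pattern, lambda _match, fix=right: fix, text)
    return text


# Hypothetical pairs collected after a first OCR pass.
pairs = {
    "Iower": "lower",               # upper case i misread for lower case L
    "Thiscanbea": "This can be a",  # words that ran together
}
print(apply_replace_list("I said Iower it. Thiscanbea problem.", pairs))
```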

Best practice is to have Subtitle Edit just rack up the words it doesn’t know so you can bang the majority of the duplicates out after the first full run. Also set the Max. error% value to 1.0 percent for higher accuracy. Run again to catch some more and then run it until there is nothing left to fix. Don’t be surprised if a few more words pop up on the second or third pass.

I’ve added a lot of characters and words to the database from multiple TV series and movies as some of them use unique fonts and have both unique words and names in them. I use the website frequently because what Subtitle Edit provides in its interface is very limited.

In some cases Subtitle Edit will fail to show a letter or will detect it incorrectly. If you see this then you can simply click on the line of text that has the problem in the main window, navigate to the specific character, and update it accordingly. I do not recommend updating an I to be an l or an l to be an I. If you do that you will be playing whack-a-mole forever. Set it, forget it, and use “Add pair to OCR replace list” on the fly as failures come up.

Do not be surprised if your subtitles are not properly aligned. I currently use either Easy Subtitles Synchronizer or Subtitle Edit to fix this problem depending upon my mood. In a few rare cases I had to tear down the text based subtitle that Subtitle Edit created, remove the portions that didn’t line up at all, and then add them back by hand. Always make a backup before editing. Your mileage may vary.

And last but not least, don’t forget punctuation, specifically when the subtitles come from an MPEG-PS VOB file from a DVD. A buddy of mine who gives lectures on advanced sed usage helped me with this sed script, which is almost indecipherable, at least to me, because regex in Subtitle Edit is insufficient for my needs.

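# Fix apostrophes that were read as commas (don ,t -> don't). This has to run before the " ," rule below.
s/\([[:alpha:]]\) ,\([[:alpha:]]\)/\1'\2/g
s/\([[:alpha:]]\)' \([[:alpha:]]\)/\1'\2/g
# Remove stray spaces before commas and periods.
s/\([[:alpha:]]\) ,/\1,/g
s/\([[:alnum:]]\) \./\1./g
s/ , /, /g
# The dot is escaped so it matches a literal period instead of any character.
s/ \. /. /g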
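
For anyone without sed handy on Windows, here is a rough Python sketch of the same cleanup. It is a port made under a few assumptions and is not a drop-in replacement: it only handles ASCII letters, the `RULES` and `fix_punctuation` names are made up, and it deliberately runs the apostrophe rules first and escapes the literal dot so that a word like “a” sitting between two spaces is not turned into a period.

```python
import re

# (pattern, replacement) pairs, applied in order. Order matters: the
# apostrophe fixes have to run before the generic "space before comma" fix.
RULES = [
    (r"([A-Za-z]) ,([A-Za-z])", r"\1'\2"),  # don ,t -> don't
    (r"([A-Za-z])' ([A-Za-z])", r"\1'\2"),  # It' s  -> It's
    (r"([A-Za-z]) ,", r"\1,"),              # word , -> word,
    (r"([A-Za-z0-9]) \.", r"\1."),          # word . -> word.
    (r" , ", ", "),
    (r" \. ", ". "),                        # literal dot, not "any character"
]


def fix_punctuation(text):
    """Apply every rule in order to the given subtitle text."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


print(fix_punctuation("Well , I don ,t know . It' s fine ."))
```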

Subtitle Edit, in my experience, is not suitable for automation and requires a lot of hands-on work to get things right. If Hulu is using Subtitle Edit then I feel that they either don’t know better, assume that the subtitles they receive are perfect, or don’t give a shit. I’m not sure which one is worse.

Please don’t take my word about subtitle automation being sub-optimal. Give the following a whirl and compare it against what you created via Subtitle Edit’s GUI.

"C:\Program Files\Subtitle Edit\SubtitleEdit" /convert "inputfile.sup" SubRip