FFmpeg and 'Fist of Fury'

I have spent quite some time making backups of my personal DVD collection, backups suitable for playback on my home-built AMD FX-8150 computer and its fancy new widescreen monitor. I have developed a technique that works quite well for me and this computer, and eventually decided to write up an example of how I encode such movies. I am writing this in real time as I carefully work through the ripping and transcoding of one of my favourite martial arts movies, 'Fist of Fury': making all the required decisions, setting the encoding running and writing it all up in my web editor as I go.

Some might be interested to know that this page is the long-promised sequel to an older page called 'The Matrix, Mencoder and Matroska', which I guess at least confirms my penchant for tricky page titles! If this slightly self-indulgent page proves useful to you, feel free to send an email to me from the link at the base of this page.

Working on the video...

I should mention at the outset that I am running Slackware Linux on this computer, and perhaps this explains the close attention to fine detail and the obsessive care with the commandline that I will demonstrate! Nevertheless I initially head straight for a gui, as I use SMPlayer to investigate the title structure of the DVD. My good friends Trial and Error guide me and I eventually select track 6 and rip this title and its associated chapters to disk:

tccat -i /dev/dvd -d 2 -T 6,-1,1 > fist.vob

It takes tccat a while to create the vob file so I can mention here, while I am waiting, that I used to use vobcopy for this step but found the increased possibilities that came with Transcode and friends a very big drawcard, so vobcopy is on the backburner for the moment. Anyway tccat has finally produced a 4 gig vob file and the next step is to use the git FFmpeg's cropdetect filter to calculate cropping parameters to lose the black borders top and bottom. The cropdetect filter is used as follows:

ffmpeg -y -ss 00:10:00 -i fist.vob -vf cropdetect=24:16:0 \
       -vframes 300 -f rawvideo -an /dev/null
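The three values passed to cropdetect are its limit (the black threshold), round (the multiple that the detected size is rounded to) and reset arguments. The filter logs a crop=W:H:X:Y suggestion for each frame it examines, and these can vary a little from frame to frame. A minimal sketch, assuming the log lines end in crop=W:H:X:Y, that picks out the most frequently suggested value (the function name is my own invention):

```shell
# Print the W:H:X:Y value that cropdetect suggested most often.
# Reads FFmpeg's log output on stdin.
most_common_crop() {
    grep -o 'crop=[0-9]*:[0-9]*:[0-9]*:[0-9]*' |
        sort | uniq -c | sort -rn |
        awk 'NR == 1 { sub(/^crop=/, "", $2); print $2 }'
}

# Hypothetical usage (FFmpeg writes its log to stderr, hence 2>&1):
# ffmpeg -y -ss 00:10:00 -i fist.vob -vf cropdetect=24:16:0 \
#        -vframes 300 -f rawvideo -an /dev/null 2>&1 | most_common_crop
```

Taking the most common value rather than the last one is only one possible choice; a dark scene in the sampled frames could still mislead the detection, so a visual check remains worthwhile.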

The generated numbers 720:416:0:80 (which represent the width and height of the cropped picture, followed by the x and y offsets of its top left corner) can then be used with MPlayer's 'rectangle' filter to assess just how aggressive the cropping will appear when the video is eventually extracted:

mplayer -vf rectangle=720:416:0:80 -ss 00:10:00 fist.vob
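The crop numbers can also be checked with a little arithmetic. The sketch below assumes a PAL source frame of 720x576, which is a guess on my part based on the 25 fps used later at the muxing stage; the function names are hypothetical.

```shell
# crop_fits W H X Y SRC_W SRC_H
# Succeeds when the crop rectangle lies inside the source frame.
crop_fits() {
    [ $(( $1 + $3 )) -le "$5" ] && [ $(( $2 + $4 )) -le "$6" ]
}

# border_rows H Y SRC_H
# Rows removed from the top and from the bottom of the picture.
border_rows() {
    echo "top=$2 bottom=$(( $3 - $1 - $2 ))"
}

# Assumed 720x576 source frame (PAL); adjust for an NTSC disc.
crop_fits 720 416 0 80 720 576 && border_rows 416 80 576
```

With these assumed dimensions the crop removes 80 rows from both the top and the bottom, which is what symmetric letterbox borders should look like.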

The movie looks fine at this size and when played at full screen on my widescreen monitor it is more than watchable. Now is the time again to break out the git FFmpeg which I have used with such alliterative power in the title of this page! The following commandline, which uses a very good quality x264 preset, is a very slight variation of one that I have learnt from Lou on the Ubuntu Forums:

ffmpeg -i fist.vob -vf crop=720:416:0:80 \
       -c:v libx264 -preset slow -tune film -crf 22 -threads 0 -an \
       fist.h264

Bear in mind that this FFmpeg command is based on FFmpeg version N-39902-g788a60d, downloaded and compiled on April 17th 2012, and will probably not work on older versions. On my new 8-core computer this FFmpeg command only takes 28 minutes running at about 100 fps, unlike the overnight job for my old laptop!
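The quoted encoding time can be sanity-checked from figures that appear elsewhere on this page: the running time of 6112.224 seconds reported by SoX and the 25 fps given to MP4Box. A rough sketch, with a function name of my own choosing:

```shell
# encode_minutes DURATION_SECONDS SOURCE_FPS ENCODE_FPS
# Estimated wall-clock minutes for an encode running at ENCODE_FPS.
encode_minutes() {
    awk -v d="$1" -v f="$2" -v e="$3" \
        'BEGIN { printf "%.1f\n", d * f / e / 60 }'
}

encode_minutes 6112.224 25 100   # prints 25.5
```

That is roughly 152,800 frames, and the estimate of about 25 minutes sits comfortably close to the 28 minutes observed at "about 100 fps".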

Working on the audio...

The video stream is done and looks fine in SMPlayer: colours are crisp and movement is nice and clean, with an eventual file size of 500 MB. Now to turn to the audio which, dare I mention it, is not the finest feature of 'Fist of Fury'! FFmpeg shows the following audio streams in the big vob file:

Stream #0:1[0x80]: Audio: ac3, 48000 Hz, 5.1(side), s16, 448 kb/s
Stream #0:2[0x81]: Audio: ac3, 48000 Hz, 5.1(side), s16, 448 kb/s
Stream #0:3[0x82]: Audio: ac3, 48000 Hz, stereo, s16, 192 kb/s

FFplay will play these streams individually so that they can be easily identified. Just note that the audio stream numbers start at zero for this playback, so -ast 0 is the first audio stream, the one listed above as #0:1. The following is what I discover from these streams:

ffplay -ss 200 -ast 0 fist.vob # English language stream
ffplay -ss 200 -ast 1 fist.vob # Cantonese language stream
ffplay -ss 200 -ast 2 fist.vob # English language commentary
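The two numbering schemes are easy to mix up: the -ast values used above count audio streams only, while the FFmpeg listing numbers every stream in the file. A small sketch that prints both side by side from the stream listing (the function name is hypothetical, and it assumes the listing format shown above):

```shell
# Read FFmpeg "Stream #..." lines on stdin and print, for each audio
# stream, its zero-based audio index followed by its stream specifier.
audio_stream_map() {
    awk '/Audio:/ {
        spec = $2                     # e.g. "#0:1[0x80]:"
        sub(/^#/, "", spec)           # drop the leading "#"
        sub(/[^0-9:].*$/, "", spec)   # drop "[0x80]:" or "(eng):"
        sub(/:$/, "", spec)           # drop a bare trailing colon
        printf "%d %s\n", n++, spec
    }'
}

audio_stream_map <<'EOF'
Stream #0:1[0x80]: Audio: ac3, 48000 Hz, 5.1(side), s16, 448 kb/s
Stream #0:2[0x81]: Audio: ac3, 48000 Hz, 5.1(side), s16, 448 kb/s
Stream #0:3[0x82]: Audio: ac3, 48000 Hz, stereo, s16, 192 kb/s
EOF
```

For this disc the output pairs 0 with 0:1, 1 with 0:2 and 2 with 0:3, so the Cantonese stream played with -ast 1 is stream 0:2 when it comes to extraction.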

The English language version is pretty horrible and I will admit that I have a loathing of English audio streams with foreign language movies so I will extract the Cantonese stream and also the English commentary by Bey Logan which sounds quite entertaining. I am caught a little with the English commentary as it is made over the English audio stream! First to extract the 2 streams:

ffmpeg -i fist.vob -map 0:2 cantonese_quiet.wav && \
ffmpeg -i fist.vob -map 0:3 commentary_quiet.wav

I have always found that the audio extracted from DVDs is a little too quiet for my taste and for the little speakers on my laptop, so usually I increase the volume. FFmpeg has a nice option to increase the volume but my old friend SoX can analyse the 2 audio streams and recommend the largest increase in volume that will not cause clipping. A very full set of statistics on both sound streams can be found by running the following commands:

sox cantonese_quiet.wav --show-progress -n stat 
sox commentary_quiet.wav --show-progress -n stat

It is a long-term project of mine to dive into these statistics a little more fully. For the moment all that I will really use is the volume increase and, to tell the truth, I usually don't even use the full value given by SoX:

cantonese_quiet.wav                  commentary_quiet.wav

Samples read:         586773504      Samples read:         586773504
Length (seconds):   6112.224000      Length (seconds):   6112.224000
Scaled by:         2147483647.0      Scaled by:         2147483647.0
Maximum amplitude:     0.471802      Maximum amplitude:     0.285278
Minimum amplitude:    -0.401672      Minimum amplitude:    -0.291656
Midline amplitude:     0.035065      Midline amplitude:    -0.003189
Mean    norm:          0.010827      Mean    norm:          0.016347
Mean    amplitude:    -0.000000      Mean    amplitude:    -0.000000
RMS     amplitude:     0.018709      RMS     amplitude:     0.027477
Maximum delta:         0.318237      Maximum delta:         0.297607
Minimum delta:         0.000000      Minimum delta:         0.000000
Mean    delta:         0.002975      Mean    delta:         0.001445
RMS     delta:         0.006416      RMS     delta:         0.003810
Rough   frequency:         2619      Rough   frequency:         1059
Volume adjustment:        2.120      Volume adjustment:        3.429
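The 'Volume adjustment' figure is simply the reciprocal of the largest absolute sample value, that is, the gain that would bring the loudest peak exactly to full scale. A minimal sketch that reproduces the figures from the maximum and minimum amplitudes in the table (the function name is my own):

```shell
# volume_adjustment MAX_AMPLITUDE MIN_AMPLITUDE
# Largest gain that keeps the loudest peak at or below full scale (1.0).
volume_adjustment() {
    awk -v max="$1" -v min="$2" 'BEGIN {
        if (min < 0) min = -min              # absolute value of the minimum
        peak = (max > min) ? max : min
        printf "%.3f\n", 1 / peak
    }'
}

volume_adjustment 0.471802 -0.401672   # Cantonese:  prints 2.120
volume_adjustment 0.285278 -0.291656   # commentary: prints 3.429
```

Note that for the commentary it is the minimum amplitude, not the maximum, that sets the limit.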

A fascinating difference between the 2 streams? So now to boost the volume reasonably conservatively on both streams using the 'Volume adjustment' figures:

sox -v 1.5 --show-progress cantonese_quiet.wav cantonese_loud.wav && \
sox -v 2.5 --show-progress commentary_quiet.wav commentary_loud.wav 
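How conservative these two gains are can be checked by multiplying each stream's loudest peak (taken from the SoX statistics) by the chosen gain; anything below 1.0 leaves headroom. A rough sketch with a hypothetical function name:

```shell
# peak_after_gain PEAK GAIN
# Loudest sample value after the gain is applied (1.0 is full scale).
peak_after_gain() {
    awk -v p="$1" -v g="$2" 'BEGIN { printf "%.3f\n", p * g }'
}

peak_after_gain 0.471802 1.5   # Cantonese:  prints 0.708
peak_after_gain 0.291656 2.5   # commentary: prints 0.729
```

Both streams end up peaking at a little over 70% of full scale, so neither gain comes close to clipping.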

A quick test shows that the volume is quite acceptable on my laptop; certainly it should not be any louder. Finally I finish off the work on the audio streams by encoding with neroAacEnc. To my ears this gives far better results than other AAC encoders and it is now my preferred option over AAC produced by faac, over mp3 or even over my old friend Ogg Vorbis.

neroAacEnc -q 0.55 -if cantonese_loud.wav -of cantonese.mp4 && \
neroAacEnc -q 0.55 -if commentary_loud.wav -of commentary.mp4

A quick playback of these completed audio files confirms that they are just what I am after for this small project, but that uses up all of the time I have available for this project today so I leave work on the subtitles for tomorrow.

Working on the subtitles...

Converting subtitles to srt from DVD is a somewhat laborious task but well worth doing, if only for the buzz of creating your own srt files and for having the ability to correct the often fractured English seen in many subtitles. First to test the DVD with MPlayer for available subtitles:

mplayer dvd://6 -vo null -ao null -frames 0 -v 2>&1 | grep sid
subtitle ( sid ): 0 language: en

The single subtitle makes at least the opening stages very easy. Normal practice is to identify the exact stream by adding the sid number to 0x20, so in this case (sid 0) I can extract subtitle stream 0x20 from the vob file with tcextract as follows:

cat fist.vob | tcextract -x ps1 -t vob -a 0x20 > subtitle_stream_en.ps1
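For a disc with several subtitle tracks, the addition of the sid number to 0x20 can be left to the shell. A tiny sketch, with a function name that is purely my own:

```shell
# subtitle_stream_id SID
# Print the stream id for MPlayer's subtitle id SID, i.e. 0x20 + SID.
subtitle_stream_id() {
    printf '0x%02x\n' $(( 0x20 + $1 ))
}

subtitle_stream_id 0   # prints 0x20, the value passed to tcextract -a above
subtitle_stream_id 3   # prints 0x23
```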

The time has come now to start using some of the utilities that come as part of subtitleripper, and the first is subtitle2pgm which converts the raw subtitle stream to a series of images. These images must be read by some OCR software a little later so I always start with a small time-limited test run:

subtitle2pgm -c 255,255,0,255 -e 00:05:00,10 -i subtitle_stream_en.ps1 -o Fist_of_Fury

This step must be done with some diligence as the -c option selects the vital grey scale for the images. The text on these images must be solid and well differentiated or the OCR software will fail quite badly. The documentation that comes with subtitle2pgm suggests the following combinations to experiment with:

-c 255,0,255,255
-c 255,255,0,255
-c 255,255,255,0
-c 0,255,255,255
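One way to organise the experiment is to generate a short test run for each combination, each writing to its own output prefix so the image sets can be compared side by side. The sketch below only prints the commands; the test_ prefixes and the function name are assumptions of mine.

```shell
# Print one short test command per grey-level combination.
# Nothing is executed here; pipe the output to sh to run the tests.
print_colour_tests() {
    for c in 255,0,255,255 255,255,0,255 255,255,255,0 0,255,255,255; do
        # Tag each output prefix so the image sets do not overwrite each other.
        tag=$(printf '%s' "$c" | tr ',' '_')
        echo "subtitle2pgm -c $c -e 00:05:00,10 -i subtitle_stream_en.ps1 -o test_$tag"
    done
}

print_colour_tests
```

Each printed command reuses the time-limited -e setting from the test run above, so only a handful of images per combination need to be inspected.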

It is a simple matter of trial and error but I have found the best colours for these subtitles in my own commandline and the next command creates the literally hundreds of images for the OCR software to tackle. They will be compressed to save a little on space:

subtitle2pgm -c 255,255,0,255 -g 2 -i subtitle_stream_en.ps1 -o Fist_of_Fury

A little over 600 images are produced in the case of 'Fist of Fury', and the final test comes as the pgm2txt script calls gocr to look at all of these images and produce the actual text for the subtitles. I have found that time and care taken with this step is time well spent, and my best discovery was the use of the -v option, as demonstrated below, which calls up an image viewer (xv in my case) whenever gocr is puzzled and asks for correction:

pgm2txt -v -f en Fist_of_Fury

I would be interested to hear from those who use other image viewers for this purpose; the setting can be found in the configuration section of the pgm2txt script. Finally, yet another utility from subtitleripper writes the actual srt file:

srttool -s -i Fist_of_Fury.srtx -o Fist_of_Fury.srt

That is basically the easy part done! Now it is time to go through Fist_of_Fury.srt with my favourite text editor, correcting the errors that gocr has let through. This particular effort has been more than usually accurate, I suspect because of the new version of gocr that I have installed, and I have only spent 2 hours carefully correcting the 46 page document generated by the above commands. That is enough for this day; tomorrow for the fun bit of muxing and tagging the final product.
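A couple of quick checks can narrow down where to look before the manual pass. The sketch below counts the subtitle entries (which should roughly match the number of images) and lists lines containing an underscore or a vertical bar. It assumes, and this is only an assumption, that gocr marks characters it cannot read with an underscore and that a capital I is sometimes read as a bar; the function names are hypothetical.

```shell
# count_cues FILE
# Number of subtitle entries, counted by their "start --> end" lines.
count_cues() {
    grep -c -e ' --> ' "$1"
}

# flag_ocr_suspects FILE
# Print "line-number:text" for lines containing "_" or "|".
flag_ocr_suspects() {
    grep -n '[_|]' "$1"
}

# Hypothetical usage:
# count_cues Fist_of_Fury.srt
# flag_ocr_suspects Fist_of_Fury.srt
```

This will not catch misread words that happen to look plausible, so it is a supplement to reading the file through, not a replacement.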

Muxing and tagging...

The hard work has been done and there are only a few choices to make, the first being which container to park all of these files in. For this particular project I am using mp4 and MP4Box, but another good choice would be Matroska and the amazing gui MKVMerge. The MP4Box syntax is fairly straightforward and the muxing is pretty fast:

MP4Box -add "fist.h264:fps=25.00:name=h264 Video Stream" \
       -add "cantonese.mp4:lang=zho:name=AAC Cantonese Audio" \
       -add "commentary.mp4:lang=eng:name=AAC English Commentary Audio" \
       -new Fist_of_Fury.mp4

Note that I do not add in the subtitle stream as MP4Box actually converts it to another format, the name of which temporarily eludes me, and this has shown some odd subtitle effects in my experience. I prefer the subtitles external anyway for ease of further editing, mind you another option would be to use a Matroska container and then it would be an easy matter to move the subtitles in and out with MKVMerge gui for editing. Yet another project for another day! It remains now to add some tags to the completed file and for this purpose I use AtomicParsley:

AtomicParsley Fist_of_Fury.mp4 --metaEnema --title "Fist of Fury" \
              --encodingTool "FFmpeg N-39902-g788a60d & x264 0.122.2183 c522ad1 & NeroAacEnc" \
              --year 1972 --stik Movie --freefree --overWrite

The movie is now complete, weighs in at about 700 megabytes, plays well on my computer screen and has absorbed a lot of spare time over the last 4 days. And now I shall leave this page and actually watch this great martial arts movie!

And in conclusion...

I owe Lou a special 'thank you!' for introducing me to FFmpeg and x264 encoding in the first place, and also Mosu for his page that first demonstrated to me the painstaking art of producing srt subtitles from DVDs. Please feel free to contact me with any errors of fact that you have found on this page; any errors of opinion will probably remain uncorrected. In the meantime I am getting ready to start work on my next project of ripping and transcoding my copy of the great Spanish language film 'El laberinto del fauno'. What about you?