Editing Subtitles in Linux

I have been a lover of world and regional cinema for decades. Subtitles are the essential tool that has enabled me to enjoy the best movies in various languages and from various countries.

If you enjoy watching movies with subtitles, you might have noticed that sometimes the subtitles are not synced or not correct.

Did you know that you can edit subtitles and make them better? Let me show you some basic subtitle editing in Linux.

Extracting subtitles from closed captions data

Around 2012 or 2013, I came to know of a tool called CCExtractor. Over time, it has become one of my vital tools, especially when I come across a media file that has subtitles embedded in it.

CCExtractor analyzes video files and produces independent subtitle files from the closed captions data.

CCExtractor is a cross-platform, free and open source tool. It has matured quite a bit since its formative years and has been part of GSoC and Google Code-in now and then.

To put it simply, the tool runs a series of processing steps one after another to give you an extracted subtitle file.

You can follow the installation instructions for CCExtractor on this page.

After installing it, run the following to extract subtitles from a media file:

ccextractor <path_to_video_file>

The output of the command will be something like this:

$ ccextractor $something.mkv
CCExtractor 0.87, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: $something.mkv
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]
--------------------------------------------------------------------------
Opening file: $something.mkv
File seems to be a Matroska/WebM container
Analyzing data in Matroska mode
Document type: matroska
Timecode scale: 1000000
Muxing app: libebml v1.3.1 + libmatroska v1.4.2
Writing app: mkvmerge v8.2.0 ('World of Adventure') 64bit
Title: $something
Track entry:
Track number: 1
UID: 1
Type: video
Codec ID: V_MPEG4/ISO/AVC
Language: mal
Name: $something
Track entry:
Track number: 2
UID: 2
Type: audio
Codec ID: A_MPEG/L3
Language: mal
Name: $something
Track entry:
Track number: 3
UID: somenumber
Type: subtitle
Codec ID: S_TEXT/UTF8
Name: $something
99% | 144:34
100% | 144:34
Output file: $something_eng.srt
Done, processing time = 6 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

It basically scans the media file. In this case, it found that the media file is in Malayalam and that the container is .mkv (Matroska). It extracted the subtitles into a file with the same name as the video file, with _eng appended.
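If you have a whole folder of videos, the same command can be repeated for each file. Here is a minimal Python sketch of that idea. The helper names build_commands and extract_all are my own, and it assumes ccextractor is on your PATH and is called with nothing but the file path, exactly as above.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(directory, pattern="*.mkv"):
    """Return one ccextractor command per matching file, sorted by name."""
    return [["ccextractor", str(path)]
            for path in sorted(Path(directory).glob(pattern))]


def extract_all(directory, pattern="*.mkv"):
    """Run ccextractor on each matching file and report its exit code."""
    if shutil.which("ccextractor") is None:
        raise RuntimeError("ccextractor was not found on PATH")
    for command in build_commands(directory, pattern):
        # A non-zero exit code is only reported, not treated as fatal,
        # because a file without captions should not stop the batch.
        result = subprocess.run(command)
        print(command[-1], "-> exit code", result.returncode)
```

Calling extract_all("/path/to/videos") would then process every .mkv file in that folder, one after another.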

CCExtractor is a wonderful tool that can be used alongside Subtitle Editor, which I will cover in the next section, to improve subtitles.

Interesting Read: There is an interesting synopsis of subtitles at vicaps that explains why subtitles are important to us. It also goes into quite a bit of detail about movie-making, for those interested in such topics.

Editing subtitles with the Subtitle Editor tool

You are probably aware that most subtitles are in .srt format. The beautiful thing about this format is that you can load it in your text editor and make small fixes by hand.

An .srt file looks something like this when opened in a simple text editor:

1
00:00:00,959 --> 00:00:13,744
"THE CABINET
OF DR. CALIGARI"

2
00:00:40,084 --> 00:01:02,088
A TALE of the modern re-appearance of an 11th Century Myth
involving the strange and mysterious influence
of a mountebank monk over a somnambulist.

The subtitle excerpt I have shared is from a pretty old German movie called The Cabinet of Dr. Caligari (1920).
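Because the format is this simple, it is also easy to read with a short script. The following is a minimal Python sketch, not a full SubRip parser. It assumes that each entry is a number line, a timing line and one or more text lines, and that entries are separated by a blank line. The names Cue, to_ms and parse_srt are made up for this example.

```python
import re
from dataclasses import dataclass

# Matches a timing line such as "00:00:00,959 --> 00:00:13,744".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content):
    """Split the text into blank-line separated entries and parse each one."""
    cues = []
    for block in re.split(r"\n\s*\n", content.lstrip("\ufeff").strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip anything that does not look like a subtitle entry
        fields = match.groups()
        cues.append(Cue(int(lines[0].strip()), to_ms(*fields[:4]),
                        to_ms(*fields[4:]), "\n".join(lines[2:])))
    return cues


SAMPLE = """1
00:00:00,959 --> 00:00:13,744
"THE CABINET
OF DR. CALIGARI"

2
00:00:40,084 --> 00:01:02,088
A TALE of the modern re-appearance of an 11th Century Myth
"""

cues = parse_srt(SAMPLE)
print(len(cues), cues[0].start_ms, cues[1].end_ms)  # 2 959 62088
```

Keeping the times as plain milliseconds makes the later fixes (changing the frame rate, moving the start) simple arithmetic.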

Subtitle Editor is a wonderful tool when it comes to editing subtitles. It can be used to manipulate the timing and frame rate of a subtitle file to bring it in sync with the media file, adjust the duration of the gaps in between, and much more. I’ll share some of the basic subtitle editing here.

First, install subtitleeditor the same way you installed ccextractor, using your favorite installation method. On Debian, you can use this command:

sudo apt install subtitleeditor

When you have it installed, let’s see some of the common scenarios where you need to edit a subtitle.

Manipulating Frame-rates to sync with Media file

If you find that the subtitles are not synced with the video, one of the reasons could be the difference between the frame rates of the video file and the subtitle file.

How do you know the frame rates of these files, then?

To get the frame rate of a video file, you can use the mediainfo tool. You may need to install it first using your distribution’s package manager.

Using mediainfo is simple:

$ mediainfo somefile.mkv | grep Frame
 Format settings                          : CABAC / 4 Ref Frames
 Format settings, ReFrames                : 4 frames
 Frame rate mode                          : Constant
 Frame rate                               : 25.000 FPS
 Bits/(Pixel*Frame)                       : 0.082
 Frame rate                               : 46.875 FPS (1024 SPF)

You can see that the frame rate of the video file is 25.000 FPS. The other frame rate shown is for the audio. Why particular frame rates are used in video encoding, audio encoding and so on is a different subject with a lot of history behind it.
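If you would rather pick the number out with a script than read it off the grep output, something like the following Python sketch could work. It only parses the text that mediainfo prints, and it assumes that the video section comes before the audio section, as in the output above, so the first "Frame rate" line is the video one. The function names are my own.

```python
import re
import subprocess


def read_mediainfo(path):
    """Return the plain-text report that the mediainfo command prints."""
    return subprocess.run(["mediainfo", path], capture_output=True,
                          text=True, check=True).stdout


def video_frame_rate(mediainfo_text):
    """Return the first 'Frame rate' value in the report, or None."""
    for line in mediainfo_text.splitlines():
        key, separator, value = line.partition(":")
        # "Frame rate mode" must not match, so compare the whole key.
        if separator and key.strip() == "Frame rate":
            match = re.match(r"\s*([\d.]+)\s*FPS", value)
            if match:
                return float(match.group(1))
    return None


SAMPLE = """\
 Frame rate mode                          : Constant
 Frame rate                               : 25.000 FPS
 Bits/(Pixel*Frame)                       : 0.082
 Frame rate                               : 46.875 FPS (1024 SPF)
"""

print(video_frame_rate(SAMPLE))  # 25.0
```

With mediainfo installed, video_frame_rate(read_mediainfo("somefile.mkv")) would do both steps in one go.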

Next is to find out the frame rate of the subtitle file, and this is slightly complicated.

Usually, subtitles are distributed in a zipped format. Unzipping the .zip archive gives you the subtitle file, which ends in .srt. Along with it, there is usually also a .info file with the same name, which may sometimes contain the frame rate of the subtitle.

If not, it is usually a good idea to download the subtitle from a site that lists the frame rate information. For this specific German file, I will be using OpenSubtitles.org.

As you can see in the link, the frame rate of the subtitle is 23.976 FPS. Quite obviously, it won’t play well with my video file with frame rate 25.000 FPS.

In such cases, you can change the frame rate of the subtitle file using the Subtitle Editor tool:

Select all the contents of the subtitle file by pressing CTRL+A. Go to Timings -> Change Framerate and change the frame rate from 23.976 fps to 25.000 fps, or whatever is desired. Save the changed file.
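The arithmetic behind such a conversion is simple. A line that appears at a given frame is shown at frame / 23.976 seconds in the original and at frame / 25 seconds in the faster video, so every timestamp gets multiplied by 23.976 / 25. Here is a minimal Python sketch of that calculation. It is my own illustration of the idea and not Subtitle Editor's actual code, and the function names are made up.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(match):
    """Convert a matched HH:MM:SS,mmm timestamp to milliseconds."""
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def format_ms(total_ms):
    """Format milliseconds back into HH:MM:SS,mmm."""
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def change_framerate(srt_text, src_fps, dest_fps):
    """Rescale every timestamp on the timing lines by src_fps / dest_fps."""
    factor = src_fps / dest_fps

    def rescale(match):
        return format_ms(round(to_ms(match) * factor))

    lines = []
    for line in srt_text.splitlines(keepends=True):
        if "-->" in line:  # leave the subtitle text itself untouched
            line = TIMESTAMP.sub(rescale, line)
        lines.append(line)
    return "".join(lines)


print(change_framerate("00:00:40,084 --> 00:01:02,088", 23.976, 25.0))
# 00:00:38,442 --> 00:00:59,545
```

Going from 23.976 fps to 25 fps makes every timestamp a little earlier, about 4% in this case, which adds up to several minutes by the end of a full-length movie.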

Changing the Starting position of a subtitle file

Sometimes the above method is enough; sometimes it is not.

You might find cases where the frame rate is the same but the subtitles start at a different point than the dialogue in the movie or media file.

In such cases, do the following:

Select all the contents of the subtitle file by pressing CTRL+A. Go to Timings and select Move Subtitles.

Set the new starting position of the subtitle file. Save the changed file.
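Moving the start is the same as adding one fixed offset to every timestamp in the file. A minimal Python sketch of that, with made-up function names and with times that would fall before zero clamped to zero (my own choice for the example), could look like this:

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(match):
    """Convert a matched HH:MM:SS,mmm timestamp to milliseconds."""
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def format_ms(total_ms):
    """Format milliseconds back into HH:MM:SS,mmm."""
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(srt_text, offset_ms):
    """Add offset_ms (which may be negative) to every timing line."""

    def shift(match):
        # Clamp at zero so a large negative offset cannot produce
        # a negative timestamp.
        return format_ms(max(0, to_ms(match) + offset_ms))

    lines = []
    for line in srt_text.splitlines(keepends=True):
        if "-->" in line:  # only timing lines carry timestamps to move
            line = TIMESTAMP.sub(shift, line)
        lines.append(line)
    return "".join(lines)


print(shift_srt("00:00:00,959 --> 00:00:13,744", 2500))
# 00:00:03,459 --> 00:00:16,244
```

A positive offset delays the subtitles and a negative one makes them appear earlier.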

If you want to be more accurate, play the movie or media file in mpv and click on the timing display that shows how much of the file has elapsed; clicking on it also reveals the milliseconds.

I like to be as precise as possible, which is difficult in mpv because human reaction time is imprecise. If I want to be super accurate, I use something like Audacity, but that is another ball game altogether as you can do so much more with it. That may be something to explore in a future blog post as well.

Manipulating Duration

Sometimes even doing both is not enough, and you have to shrink or extend durations to make the subtitles sync with the media file. This is one of the more tedious tasks, as you have to fix the duration of each sentence individually. This can happen especially if the media file has a variable frame rate (rare nowadays, but you still get such files).

In such a scenario, you may have to edit the durations manually, as automation is not possible. The better way is either to fix the video file (not possible without degrading the video quality) or to get the video from another source at a higher quality and then transcode it with the settings you prefer. This is again a major undertaking that I could shed some light on in a future blog post.

Conclusion

What I have shared above is more or less about improving existing subtitle files. If you were to start from scratch, you would need loads of time. I haven’t covered that at all because subtitling an hour of video can easily take 4-6 hours or even more, depending on the skills of the subtitler, patience, context, jargon, accents, whether the subtitler is a native English speaker, the translator and so on, all of which make a difference to the quality of the subtitle.

I hope you find this interesting and from now onward, you’ll handle your subtitles slightly better. If you have any suggestions to add, please leave a comment below.

  • Thanks a lot for a clear and simple introduction to subtitle editing.

    I had struggled for quite a while trying to sync subtitles for a movie which I want to share with a friend not familiar with English.

  • Hello Shirish, thank you for the detailed explanation. What do you recommend if we need to change the position of the subtitle, to the top-left or bottom-right corner, for example?
    Subtitle: subrip
    Video codec: libx264
    Audio codec: aac
    Any tool we can run in command line?

  • – A sync method that I use a lot: identify (by sound) the starting time / frame of the first and the last spoken lines in the movie, then “Timings” -> “Scale”. (Obviously after I make sure the framerate is right.)
    – For line-by-line syncing the Waveform is just great. Configure the keyboard shortcuts your own way and you won’t be touching the mouse. (Never used the Keyframes, though: anytime I tried to generate them, Subtitleeditor froze.)
    – All in all, a great piece of software; sadly the documentation is scarce.

  • This is the best way:

    `Synchronization > Visual sync`

    https://youtu.be/gqEhvccKygU

    If your mkv file has another language as the default instead of yours, you can change its metadata using the command line tool mkvpropedit. You can get this tool by installing https://www.videohelp.com/software/MKVToolNix.

    So to remove flag-default from the first audio track and set it on the second audio track:

    `mkvpropedit movie.mkv --edit track:a1 --set flag-default=0 --edit track:a2 --set flag-default=1`

    Sometimes you need to remove the forced flag:

    `mkvpropedit -v movie.mkv -v --edit track:a2 --set flag-forced=0`

    Alternatively, you can use the MKVToolNix GUI to remove undesired languages and remux the movie file, but it will take longer than just changing the metadata.

    Regarding mpv, it accepts srt. So open txt subtitles in subtitleedit and save as srt.

    • I have used mkvtoolnix as well, and it is available in Debian. I’m not sure whether the one I know and have used is different from the one you are referring to: https://mkvtoolnix.download/index.html. Have you been able to successfully build mkvtoolnix from source (the one from videohelp), or is it just the binary? I/We usually look at tools which can be built from source using free software tools.