A few weeks ago someone complained about screencasts being (ab)used as a replacement for documentation. I think I saw it somewhere over at reddit or dzone. Well, that rant wasn't all that interesting, really. However, it did point out two real issues: screencasts are useless to the deaf, and you lose the advantages of text (searchable, quick to scan etc.). I couldn't agree more with those bits - it's something that has bothered me for ages.
The former can be addressed with subtitles. The latter, however, is more complicated than that. Solving that problem would also solve the issue that screencasts don't really reveal much of their content to search engines. If you put countless hours of work into your screencasts, it would be pretty sad if your audience were unable to find them.
Interestingly, subtitles are also the solution to this problem. Once they are written (or copy/pasted from the script) you have a complete transcript together with the timing information. With those two pieces (text and time) you can create a transcript where each line can be clicked to seek to that position.
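To make the "text and time" idea a bit more concrete, here is a minimal Python sketch (not the code behind the demonstration) that turns a list of (start time, text) pairs into a clickable list. The `data-seek` attribute and the `#t=` link target are made-up names for illustration; a player-side script would still have to read them and tell the player to seek.

```python
from html import escape


def transcript_html(cues):
    """Render (start_seconds, text) pairs as a list of clickable lines.

    `data-seek` is a hypothetical attribute name; the script that drives
    the player would read it and seek to that position.
    """
    items = []
    for start, text in cues:
        minutes, seconds = divmod(int(start), 60)
        label = f"{minutes}:{seconds:02d}"
        items.append(
            f'<li><a href="#t={start}" data-seek="{start}">'
            f"[{label}] {escape(text)}</a></li>"
        )
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


# Example cues in the style of the demonstration: one line per 10 seconds.
cues = [(0, "0, 1, [...] 9"), (10, "10, 11, [...] 19"), (20, "20, 21, [...] 29")]
print(transcript_html(cues))
```

Since the transcript ends up as ordinary markup, search engines and people who prefer scanning text get the full content without touching the player.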
Theory and practice are two entirely different things of course. Therefore I created a small demonstration:
Note: Flash 9.0 r115 aka 9,115,0 or better is required - it's H.264 in an MP4 container.
The big number in the middle counts the seconds (0-59). And the slate gray quad at the bottom jumps in 1/10th second steps. The video runs at 10 frames per second, which means the quad advances one step each frame. (If you did the math... yes, there are 600 frames altogether. Thanks to the awesome H.264 codec it's only 91 KB in size.)
The transcript only lists the seconds. Pretty pointless, but it should illustrate the concept just fine. E.g. if you click on the "20, 21, [...] 29" line the player seeks to 0:20. If you take a look at the markup, you'll see that the transcript is there in plain sight. Pretty nice, isn't it?
The demonstration does lack subtitles, however. Unfortunately FlowPlayer doesn't support subtitles yet and I was also too lazy to convert my SSA subtitle file to SRT (Aegisub) and then to TTXT (MP4Box).
There is one issue though: you can only seek to the nearest keyframe. Unfortunately, Flash doesn't behave like most players, which start at the keyframe in front of the seek position and silently decode everything up to that point. In this example it works flawlessly, because I inserted a keyframe every 100 frames (every 10 seconds at 10 frames per second), which means there is one at each seek position. Without accurately positioned keyframes, seeking won't work this well.
Instead of placing a keyframe every 10 seconds you could place one every 5 seconds. This should work sort of alright (if you "snap down" the seek positions) without creating too much bloat. Well, it's not perfect since you end up with too few or too many keyframes at some places and there also won't be keyframes at strategically good places.
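"Snapping down" just means rounding each seek position down to the closest multiple of the keyframe interval. A minimal Python sketch, assuming the keyframes really do sit at exact multiples of the interval starting at 0 (the function name is made up):

```python
import math


def snap_down(position, interval=5.0):
    """Return the latest keyframe time at or before `position` (in seconds).

    Assumes keyframes sit at exact multiples of `interval`, starting at 0.
    """
    return math.floor(position / interval) * interval


for t in (0, 4.9, 5, 12.3, 20):
    print(t, "->", snap_down(t))
```

A transcript line that starts at 12.3 seconds would then seek to 10 seconds, so the viewer sees a little too much rather than missing the beginning of the line.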
Instead of making the keyframes match the seek positions, it should also be possible to do it the other way around. That is, take the keyframe positions and split the transcript up at those points.
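Splitting the transcript at the keyframes could look roughly like the following Python sketch. It assumes you already have a non-empty list of keyframe times in seconds from somewhere; how to get that list is exactly the open question here, so the numbers in the example are made up.

```python
from bisect import bisect_right


def group_by_keyframe(keyframes, cues):
    """Group (start, text) cues under the last keyframe at or before each start.

    `keyframes` is a non-empty list of times in seconds. Returns a list of
    (keyframe_time, joined_text) pairs in chronological order.
    """
    keyframes = sorted(keyframes)
    groups = {k: [] for k in keyframes}
    for start, text in cues:
        index = bisect_right(keyframes, start) - 1
        # Cues that start before the first keyframe fall back to the first one.
        key = keyframes[max(index, 0)]
        groups[key].append(text)
    return [(k, " ".join(texts)) for k, texts in groups.items() if texts]


# Made-up keyframe times and cues, purely for illustration.
keyframes = [0, 7.2, 15.0]
cues = [(0, "a"), (3, "b"), (8, "c"), (15, "d")]
print(group_by_keyframe(keyframes, cues))
```

Each resulting chunk becomes one clickable line (or paragraph) that seeks to its keyframe, so the seek target is always exact, at the price of chunk boundaries that don't necessarily match the sentence boundaries.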
I haven't found any command line tools for this task yet, but according to Tinic Uro it's possible from the Flash side at least:
Since [mp4,.m4v,.m4a,.mov and .3gp] files contain an index unlike old FLV files, we can provide a list of save seek points, e.g. times you can seek to without having the play head jump around. You'll get this information through the onMetaData callback in an array with the name 'seekpoints'. On the downside, some files are missing this information which also means that these files are not seekable at all! This is very different from the traditional FLV file format which is rather based on the notion of key frames to determine the seek points.
They are only an option with the FLV container format, which doesn't seem to work well (or at all) with H.264 or (HE-)AAC streams. Using that outdated format together with outdated codecs isn't really the smartest thing to do nowadays.
2-pass encoding is very different. During the first pass the video is analyzed and the second pass does the actual encoding. Thanks to the statistics from the first pass the encoder knows the good places for a keyframe. Scene changes (a cut) are a good place for a keyframe, for example. While keyframes are pretty big, they can deal far better with drastic changes. The result is a more sensible file with improved quality (at the same file size) for the small price of a longer encoding time.
The x264 encoder, for example, allows 2-pass encoding. After the first pass the statistics are dumped into a plain text file. Now you could, for example, take a text editor and insert the keyframes manually (ugh) or use a small script to insert keyframes at all seek positions.
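Such a script could look roughly like the Python sketch below. Big caveat: it assumes that every frame line in the statistics file starts with an `in:<frame number>` field and carries a `type:<letter>` field, with `I` marking a keyframe. That layout is an assumption, and the sample line further down is made up and simplified, so check it against what your encoder version actually writes before relying on it.

```python
import re


def seek_frames(seek_seconds, fps):
    """Convert seek positions in seconds to frame numbers."""
    return {round(s * fps) for s in seek_seconds}


# Assumed layout of a frame line: "in:<n> ... type:<letter> ...".
FRAME_LINE = re.compile(r"^in:(\d+) ")
TYPE_FIELD = re.compile(r"\btype:\w")


def force_keyframes(stats_lines, frames):
    """Rewrite the type field to 'I' on every line whose frame is in `frames`.

    Lines that do not look like frame lines (e.g. a header) pass through
    unchanged.
    """
    result = []
    for line in stats_lines:
        match = FRAME_LINE.match(line)
        if match and int(match.group(1)) in frames:
            line = TYPE_FIELD.sub("type:I", line, count=1)
        result.append(line)
    return result


# At 10 fps, seek positions every 10 seconds land on frames 0, 100, 200, ...
frames = seek_frames(range(0, 60, 10), fps=10)
sample = ["in:100 out:100 type:P q:22.00;"]  # made-up, simplified line
print(force_keyframes(sample, frames))
```

The frame number arithmetic is the safe part; the rewriting of the statistics lines is the part that depends on the file format.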
It looks like this is the only option right now if you want to add keyframes at specific positions. Well, I guess it's alright once the whole process is automated.
The scraw.ssa subtitle file was written with a text editor. Usually I use Aegisub for this kind of task.
The matching markup was generated by SSAMarkup.java. Yet another hideous program. Apparently I didn't care much about String performance in this case. The parsing is also a bit on the ugly side of things, but it should be pretty solid.
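For the curious, the parsing part boils down to something like the following sketch. It is written in Python for brevity and is not SSAMarkup.java itself. It assumes the usual SSA field order (Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text) and timestamps of the form h:mm:ss.cc; a file with a different Format line would need the field indexes adjusted.

```python
def parse_time(stamp):
    """Convert an SSA timestamp (h:mm:ss.cc) to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_dialogue(line):
    """Return (start, end, text) for a 'Dialogue:' line, or None otherwise.

    Assumes the usual field order, where Start and End are the second and
    third fields and Text is the tenth and last one.
    """
    if not line.startswith("Dialogue:"):
        return None
    # Split at most 9 times: the text itself may contain commas.
    fields = line[len("Dialogue:"):].strip().split(",", 9)
    if len(fields) < 10:
        return None
    return parse_time(fields[1]), parse_time(fields[2]), fields[9]


line = "Dialogue: Marked=0,0:00:20.00,0:00:30.00,Default,,0000,0000,0000,,20, 21, [...] 29"
print(parse_dialogue(line))
```

Limiting the number of splits is the one detail that matters: the text is the last field, and any comma in it would otherwise chop the line into too many pieces.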
The whole concept can also be taken to the extreme, if the screencast is rather long, that is. You can, for example, spice up the transcript with headlines for the sections and paragraphs for each step. Of course there will be spans all over the place, but you get a very nicely structured text document, which should be rather nice to read/navigate.
With some formatting magic it could also do wonders for interviews. Often subtitles alone would help a lot since many interviews are recorded in a noisy environment. And if you're really unlucky the audio signal is clipped, which makes you bleed out of your ears in no time.
Well, that's it for today. I hope the presented ideas are of some use to you. I for one would really love seeing screencasts like this. The benefits are pretty overwhelming from my point of view. It's of course a lot of additional work, but it's not that much if you're already using a script or if you would have made subtitles either way.