Tuesday, May 24, 2011
Crowd-captioning, yes please!
In 2009 YouTube announced their captioning API for developers who wanted to create tools to help users add captions to their videos.
I've tried YouTube's captioning utility a few times on my videos. Of course the speech recognition service had trouble with words like Quechua and morpheme, but from my perspective it took me only 20 minutes to correct the captions and thereby give the Google Speech Recognizer more data about my domain-specific terms. I know it took 5 recordings of "such that" in logical denotations on my Android phone for it to recognize what I was saying. 5 recordings isn't bad for training. It also provided a much more accurate transcription on the second video, hopefully because it had the first video to train the model on, both in terms of phonetic transcriptions for the typical participants in my username's videos and in terms of my username's lexicon.
While watching the Ali G video posted earlier, I was thinking about how our Ling 101 or Language and Mind/Society students would benefit from captions. The "iLanguage" of the Ali G character is quite interesting. In addition to teaching some basic principles of linguistic investigation, we could even give the students a bonus question to spot the inconsistencies in Ali G's fake grammar. The transcript feature is pretty awesome: essentially you can click your way through the video. I don't really want to copy the video and post it under my username... but at the moment that's the lowest-effort solution I can come up with for getting captions on the video.