
Crowd-captioning, yes please!


In 2009 YouTube announced their captioning API for developers who wanted to create tools to help users add captions to their videos.

I've tried YouTube's captioning utility a few times on my videos. Of course the speech recognition service had trouble with words like Quechua and morpheme, but it took me only 20 minutes to correct the captions and thereby give the Google Speech Recognizer more data about my domain-specific terms. I know it took 5 recordings of "such that" in logical denotations on my Android phone before it recognized what I was saying, and 5 recordings isn't bad for training. It also provided a much more accurate transcription on the second video, hopefully because it had the first video to train the model on, both in terms of phonetic transcriptions for the typical participants in my username's videos and in terms of the lexicon of my username.
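If you'd rather fix captions offline and upload them afterwards, the corrected text has to live in some timed-text file. Here is a minimal sketch, assuming the SubRip (.srt) format and made-up cue timings; the function names `format_timestamp` and `to_srt` are my own for illustration and are not part of any YouTube API.

```python
def format_timestamp(seconds):
    """Render a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Turn (start_seconds, end_seconds, text) tuples into SRT text.

    Each cue becomes a numbered block: index, time range, caption text.
    Blocks are separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues, standing in for hand-corrected captions.
    corrected = [
        (0.0, 2.5, "Today we look at Quechua."),
        (2.5, 6.0, "Each morpheme is glossed separately."),
    ]
    print(to_srt(corrected))
```

Keeping the domain-specific words (Quechua, morpheme) spelled correctly in a file like this also means the fixes aren't lost if the captions ever need to be uploaded again.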

While watching the Ali G video posted earlier, I was thinking about how our Ling 101 or Language and Mind/Society students would benefit from captions. The "iLanguage" of the Ali G character is quite interesting: in addition to informing students about some basic principles of linguistic investigation, we could even give them a bonus question to spot the inconsistencies in Ali G's fake grammar. The transcript feature is pretty awesome; essentially you can click your way through the video. I don't really want to copy the video and post it under my username... but at the moment that's the lowest-effort solution I can come up with for getting captions on the video.

