Showing posts from 2011

Redmine on Ubuntu 10.10

Most tutorials for setting up Redmine on Ubuntu end with using WEBrick to test, but I wanted a production setup. This is a reconstruction of what finally worked (assuming no prior installs of Ruby or Rails, but a fully set up LAMP stack):

$ sudo apt-get install ruby-dev redmine
$ sudo gem install passenger
$ sudo apt-get install apache2-dev libapr1-dev libaprutil1-dev
$ echo "export PATH=/var/lib/gems/1.8/bin:$PATH" >> ~/.bashrc
$ sudo /var/lib/gems/1.8/bin/passenger-install-apache2-module
$ sudo a2enmod rewrite
$ sudo vim /etc/apache2/sites-enabled/000-default
LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-3.0.11/ext/apache2/
PassengerRoot /var/lib/gems/1.8/gems/passenger-3.0.11
PassengerRuby /usr/bin/ruby1.8
...snip...
<VirtualHost *:80>
ServerName
DocumentRoot /usr/share/redmine/public
<Directory /usr/share/redmine/public>
AllowOverride all
Options -MultiViews
</Directory>
</Virtua…

Bad Aliens cutified with

Bare Singulars and Bare Plurals

Since my final paper for our Advanced Semantics Seminar on Plurality, generics and bare singulars (including incorporated nouns) and bare plurals have been near and dear to my heart.

Naturally, a post on generic comparisons on the Language Log quickly got my attention. Liberman argues that using generic plurals toys with the gap between statistically significant generalizations and grammatical genericity, so that the results are presented in a way that misleads the public, and in some cases the use of generic plurals seems to mislead the scientists themselves. He cites a number of examples by Sarah-Jane Leslie about "Generics and Generalization":
"Ticks carry Lyme Disease", although only a minority of ticks do so (14% in one study). "Mosquitoes carry West Nile Virus", though the highest infection rate found in the epicenter of a recent epidemic was estimated at 3.55 per thousand (and the rate was essentially zero outside of the epicenter)."…

Fighting the Unicode Fight

Almost anytime I have to build a new corpus, the Unicode Fight returns. I lived many Unicode-Fight-free years once Linux became 100% Unicode, but now I'm using Mac OS X.

The default file.encoding for Mac is MacRoman. I've tried all sorts of Google searches to find the right keywords for the proper way (using the System Preferences) to set the default to UTF-8, to no avail. I really hate Google's new (~6 months ago) search algorithm, which tries to guess what we mean to ask and doesn't include all the keywords we query. It makes it nearly impossible to find anything long-tail-ish.
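To see why the default matters, here is a minimal sketch of what happens when UTF-8 bytes are read back as MacRoman. It uses Python purely for illustration (the setup in this post was Java/Groovy), the sample word is made up, and `mac_roman` is the name of Python's built-in codec for that encoding.

```python
# Sketch: the same bytes read under two different default encodings.
# The sample text is an arbitrary example, not taken from a real corpus.
text = "Montréal"

utf8_bytes = text.encode("utf-8")         # what a UTF-8 corpus file contains

garbled = utf8_bytes.decode("mac_roman")  # what a MacRoman default makes of it
restored = utf8_bytes.decode("utf-8")     # what a UTF-8 default makes of it

print(garbled)
print(restored)
```

The accented letter takes two bytes in UTF-8, so the MacRoman reading comes out one character longer than the original and the accent turns into two unrelated symbols.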

This is when it started working in Java/Groovy:

created a file /etc/launchd.conf and put this into it:
setenv JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8

For general purposes:

added this to my ~/.vimrc
set encoding=utf-8
set fileencoding=utf-8

added this to my /etc/bashrc
export LC_CTYPE=en_CA.UTF-8
export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8

Changed my Terminal > Preferences > Encodings to only UTF-8, and…

Bootstrapping Android Best Practices

Today I'll be giving a talk at Android Montreal:

Bootstrapping Android Best Practices

Like human languages, programming languages are one part syntax, one part vocabulary, and one part culture/socio-linguistics. Too often when learning a new language we focus on syntax and vocabulary, but not enough on culture/best practices. Sure, in our courses we might learn that the French like wine and baguettes, and wear berets, but on the ground it's not really that simple (n'est-ce pas?). In this tutorial we "immerse" ourselves in the culture of two projects to simultaneously learn syntax, vocab and best practices for getting things done in Android Development.

We have selected a few repositories: two that show best practices, and two pairs of pidgins vs. best practices that show Android development that is not yet fully formed.

Beginner-Friendly Best Practice Learning Grounds
MyTracks
* Beginners: Using the GPS
* Beginners: Localization
* Beginners: Choosing the right tool in your toolbox for saving

One busy month

I have been really busy with my other app and conferences so I didn't push out any new releases of AuBlog, and it shows in my active installs. The active installs are staying pretty constant.

I have a couple of projects to work on before I can release version 2 of AuBlog which I would like to make GUI-free and really focus down on a couple of core features. I'm targeting January...

Total active installs

Recording voice, eye-gaze and touch on Android tablets

We presented our codebase which records voice, eye gaze and touch on Android tablets at the Academy of Aphasia annual meeting a few weeks ago. Our poster is here

2011 (with A. Marquis and A. Achim) "Aphasia Assessment on Android: recording voice, eye-gaze and touch for the BAT," Academy of Aphasia 49th Annual Meeting, Montréal.

I wanted to make it as easy as possible to reuse our code so I made a couple of videos to walk through the project and explain it in non-technical terms.

The first video talks about the Android side which simply collects the video, audio and touch data.

The second video talks about the "server side" where a lot of the open source repositories are used and the really exciting data extraction and analysis takes place.

The third video gives an overview of how to get the code.

The fourth video is a lot longer than the others because it shows how you can adapt the project to your own experiment and also how you can use GitHub to manag…

"Bébés Bescherelle" aka recent proof that morphosyntax is acquired as young as 11 months

Bébés Bescherelle!
Categories: Humanities, Research and Creation, Graduates, Professors
By Pierre-Etienne Caza
For about forty years, language development specialists have maintained that verbs are complex to learn. That is supposedly why they appear so late, around the age of 18 months, in children's speech. But that does not mean that children have not started decoding the subtleties of conjugation well before then. "Children are able to recognize verb endings from the age of 11 months," says Alexandra Marquis, who is publishing the surprising results of her doctoral research this fall in the journal Cognition, in collaboration with her research supervisor, Professor Rushen Shi of the Department of Psychology. This research, the first in the world to demonstrate the ability to analyze conjugated words in such young babies, was born out of a questioning of…

Researchers with Open Data or Open Source are more likely to be cited

At the ETAP2 (Experimental and Theoretical Advances in Prosody) conference a strange thing happened while I was presenting my poster. A guy came over and spent about half an hour talking to me about Open Data and Open Source. I got the sense that he was recruiting for something, but I assumed he was probably a professor looking for grad students.

After looking him up on the internet I discovered Heather Piwowar, a postdoc in Data One, a project sponsored at NASA to encourage researchers to keep their data (and their research) open and available. From what I can see, they have some publications which show that if you keep your data open, and your source open, you're far more likely to be cited. That makes sense: people can open your data and look at it. By opening your data, you bring interest to your data and your research.

I'm trying to put my finger on why we as linguists are not completely confident in opening our data. I think one part of it is that we think someone else …

I'm just a dude like any other programmer

After listening to "Should Google+ require you to use your real name?" on one fine sunny bike ride, I was left wondering whether my justification for anonymity might be more common than the authors might think. My wondering stopped there, until this evening when I giggled at one of my GitHub messages.
There are many nefarious reasons to use a handle. Some people hide behind anonymity to post nasty comments on YouTube, troll in general, say abusive things or start mass riots in countries where freedom of speech isn't common. But not all reasons for anonymity are nefarious; some are just about having a level playing field. As "cesine" I've quietly listened on user groups while others suggest new Barbie wallpapers and flirty penguins to bring some of the female persuasion over to Linux, etc. I once made an eye-fluttering Tux and put it on my website, wondering if they might catch on that there was something different about the operator of that server.
Anonymity for …

Precision vs. Recall defined for linguists

Precision and recall are two interesting pieces of terminology from computer science which help linguists decide how to divide tasks best done by a linguist from those best done by a script or some sort of automation; in other words, when it needs to be perfect, and when good enough is good enough.

Recall means getting back all the examples in your data which display the phenomenon you are looking for. You can get high recall by writing a script which returns a lot of results. There is always a second step: going through the examples yourself, as a human, to filter out the extraneous ones. Getting high recall is generally a good first step when you start your research (think of a Google web search: you really want to know all the authors that have written on your topic...)

Precision means getting data that you can run stats on and get statistical significance. High precision means all the results are what you were looking for. Getting high precision is important to make any claims or generalizations, …
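The two measures can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the function name `precision_recall`, the sentence numbers, and the "greedy" and "picky" scripts are all hypothetical, and it uses the standard definitions (hits divided by what was returned, and hits divided by what was wanted).

```python
def precision_recall(returned, wanted):
    """Return (precision, recall) for a script's results.

    returned: the examples the script gave back
    wanted:   the examples a linguist would accept as real instances
    """
    returned, wanted = set(returned), set(wanted)
    hits = returned & wanted
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(wanted) if wanted else 0.0
    return precision, recall


# Hypothetical corpus: sentences 1-4 really contain the construction.
wanted = {1, 2, 3, 4}

# A greedy script returns a lot: every real example plus four extraneous ones.
greedy = {1, 2, 3, 4, 5, 6, 7, 8}
# A picky script returns little, but everything it returns is right.
picky = {1, 2}

print(precision_recall(greedy, wanted))
print(precision_recall(picky, wanted))
```

The greedy script has perfect recall but only half of its results are usable until a human filters them; the picky script has perfect precision but misses half of the real examples.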

Watchmes for AuBlog

I made some quick-n-dirty Watchmes

How to use AuBlog for blogging via typing

How to use AuBlog for blogging via dictations

The machine transcriptions are hilarious, and not very useful. AuBlog uses an Open Source machine transcription software (Sphinx). It needs to be trained to your "iLanguage" (vocabulary) to return quality results...

App Stats 20 days - Conclusion: Calling for Malay localizers :)

Let's take a look at my Android Market stats during open beta testing for Iteration I Aug 9-29 : User Interface

I have a total of 15 active users according to the Android Market; most of the users have either Android 2.2 (Froyo) or 2.3.3 (Gingerbread).

Android versions:


My first published app!

I published my first app yesterday on the market; it's called AuBlog.


It's designed to let me blog while biking. I figured some others might be interested too, so I put it on the market. I didn't advertise anywhere; I'll post about the organic statistics as they come in.
Posted using my Android

Le P'Tit Train du Nord Bike Trip

Day 1 - 109km @ 19.06km/h
Created by My Tracks on Android.
Total Distance: 109.23 km
Total Time: 9:53:21
Moving Time: 5:43:53
Average Speed: 11.05 km/h
Average Moving Speed: 19.06 km/h
Max Speed: 39.00 km/h
Min Elevation: 61 m
Max Elevation: 423 m (1388 ft)
Elevation Gain: 1825 m (5986 ft)
Max Grade: 24 %
Min Grade: -26 %
Recorded: Fri Jul 29 08:10:01 EDT 2011
Activity type: cycling
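
The gap between Average Speed and Average Moving Speed is just a matter of which time you divide by. A minimal Python sketch using the Day 1 figures above (the helper name `hours` is made up for illustration, and this is not necessarily how My Tracks computes its numbers):

```python
def hours(hms):
    """Convert an 'H:MM:SS' string to a number of hours."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h + m / 60 + s / 3600


distance_km = 109.23                      # Total Distance, Day 1

average = distance_km / hours("9:53:21")  # Total Time includes the breaks
moving = distance_km / hours("5:43:53")   # Moving Time excludes them

print(round(average, 2), round(moving, 2))
```

Rounded to two decimals, these come out to the 11.05 km/h and 19.06 km/h reported for Day 1.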

Day 2 - Part 1 : 62km @ 19.23km/h
Created by My Tracks on Android.
Total Distance: 61.74 km
Total Time: 5:08:52
Moving Time: 3:12:41
Average Speed: 11.99 km/h
Average Moving Speed: 19.23 km/h
Max Speed: 31.00 km/h
Min Elevation: 172 m
Max Elevation: 325 m
Elevation Gain: 788 m
Max Grade: 17 %
Min Grade: -18 %
Recorded: Sat Jul 30 07:36:59 EDT 2011

Day 2 - Part 2 : 27km @ 22.59km/h
Created by My Tracks on Android.
Total Distance: 27.15 km
Total Time: 2:16:41
Moving Time: 1:12:07
Average Speed: 11.92 km/h
Average Moving Speed: 22.59 km/h
Max Speed: 35.00 km/h
Min Elevation: 177 m
Max Elevation: 30…

Phantom of the Floppera

Thanks to Google+ my circles of my circles share lovely things.

George Whiteside made a pretty fantastic (d)iskette (O)rgan which he posted on YouTube as Phantom of the Floppera [Download source code]. He discusses some of the frequency properties of floppies.

between the 5¼" and 3½" drives, the 3½" drives produced the most audible low frequencies, and the 5¼" drives produced the cleanest and highest notes. I hypothesized that the older drives were more over-engineered than their modern descendents, allowing them to operate farther out of spec, i.e. play higher notes. The newer drives are flimsier by comparison, and literally rattled better at low frequencies.

10.6.8 update spells Joy for Minimacs everywhere

If, after updating to 10.6.8, you get into a reboot loop, never fear: the update is the same as every other update, except there is a step involving replacing the kernel.

This is very easy to do if you either (a) download the kernel and save it on your Minimac before you update to 10.6.8, or (b) have a Mac-formatted USB key that you can copy it onto after your Minimac starts looping.

Here is the super-condensed minimal effort path to get you into Minimac heaven... (no not a dead Minimac, a running one), at least until Lion comes out.

On another computer (preferably a Mac or Ubuntu)
* Download the legacy kernel [mirror]
* Put it on a Mac-formatted USB key

On the Reboot Looping Minimac
* Hold down Shift as you boot
* At the boot loader screen type (once you start typing it will appear in black letters on the bottom of the screen) recovery=yes, -x
* Once it has finished loading, plug in the USB key
* Copy the legacy_kernel-10.6.8.bz2 to your Minimac
* Double click on it to unzip it
* Move the legacy_kernel-10.6.8 to …

Lake Champlain Bike Trip

A 7 day trip of about 620k (took a day off to visit family) ending with a free open air concert by Offenbach!

After 4 years without a week-long bike trip, I finally took a vacation. I biked from Montreal down the Chambly canal to the Richelieu river and around Lake Champlain. The easiest and most beautiful section of the lake (Alburg Springs to Burlington, 78k) was replaced by a hilly segment after visiting family (East Montpelier to Burlington, 87k).

Lessons Learned
* Lakes are rivers which couldn't drain because they are surrounded by hills; roads surrounding lakes are NOT flat.
* Biking on a canal or a river is much, much flatter than biking around a lake.
* Biking around Lake Champlain difficulty: medium to hard.
* Biking on a canal difficulty: easy.
* Recommendation for biking around Lake Champlain: have a support vehicle which drives around your camping gear, don't haul it up the mountains unless that's your goal (biking uphill with weight).

Bottom Line
* Before thi…