Monday, September 27, 2004

Are algorithms always objective?

I have long questioned the bias, intentional or otherwise, of search engines. It irks me that Google so frequently claims its algorithm is objective, when most information we generate is subjective, despite our best intentions. I have little doubt that Google intends to be objective, but so, probably, does Dan Rather.

A recent article from USC explores how an algorithm can be more biased than human editors, and the folks at Search Engine Watch weigh in as well. I hope the notion that algorithms are not inherently objective continues to catch on.

Thursday, September 23, 2004

In Pisa this November

I'm visiting the University of Pisa for the month of November. Paolo Ferragina and Antonio Gulli are my hosts.

If any Nutch or Lucene folks would like to meet while I'm there, please send me a note.

Tuesday, September 07, 2004

Foo Camp

I'll be at Foo Camp this weekend. I'm riding my bike up again. I'll arrive around 10am Saturday and stay through Sunday morning. Please send me a note if you'll be there too and would like to chat.

Monday, September 06, 2004

Creative Commons Search

Creative Commons has announced its Nutch-based search engine. It crawls CC-licensed pages and indexes their license properties, making them searchable. I did most of the initial development, using it as a motivating case when adding metadata support to Nutch. Now I've handed it off to Mike Linksvayer at Creative Commons. Battelle already blogged it, showing that he has the scoop on even the developers!

This is cool in several ways. It demonstrates how easily Nutch can be extended to do stuff that would be hard to do with any other search engine. (This is all of the CC-specific code.) It's also cool since it helps folks find content they can reuse, like songs that can be sampled, art that can be clipped and text that can be excerpted.