[Developers] Work queues, leader boards, etc

Stefano Mazzocchi stefano at metaweb.com
Fri Feb 20 20:09:10 UTC 2009


Tom Morris wrote:

[snip]

> The general point of understanding the context that the user is
> operating in is an important one though.  I was playing with
> Geographer yesterday and really question its premise.  I got presented
> with a whole raft of things that were a) misspelled, but in GNIS with
> their correct spelling, b) U.S. census places, which clearly have geo
> information associated with them somewhere, c) small NH towns which
> are in GNIS or d) were Roman names for modern day British cities and
> towns.  These problems each have solutions which don't have anything
> to do with users dragging pushpins around maps of areas that they
> aren't familiar with and the fact that they now have geocoordinates is
> just going to mask the real problem.  Gazetteers have been well
> understood since print days.  Freebase needs to be making better use
> of existing data sources like GNIS.

There are several different points that you raise here and I feel 
compelled to address them separately:

1) "Gazetteers exist, so use them": I don't think the day will ever come 
when Freebase runs out of external databases to harvest or 
cross-reference data against. Your point is valid, and I received the 
exact same criticism from people internally, but I feel it looks at the 
problem exclusively from the data quality angle, while I'm more 
interested in the contribution-enticing angle.

Generally speaking, it's easier and more natural for computer scientists 
(and I know because I'm one) to think in terms of 'better programs' 
rather than 'better social dynamics'. Believe me: I have to fight that 
natural tendency every day too :-)

But it boils down to this: my ability to contribute personally (directly 
or via bots that do it for me) doesn't scale as fast as an effective and 
healthy social dynamic of contribution would.

2) Users' time is anything but a scarce resource when you consider the 
scale of the problem at hand and the scale of the potential audience. If 
you don't find an app compelling because the data doesn't interest you 
(or because you know you could do a better job writing a bot to convert 
existing data, so it feels like you're wasting your energy on the task 
at hand), then you might not be part of the target audience for that 
particular app.

Writing an app that will appeal to everybody, from data lovers to 
elementary school kids, is not part of the goal (at least for now).

3) "The fact that they now have geocoordinates is just going to mask the 
real problem": this is the focal issue for me.

My very personal opinion on this matter (and, beware: many people inside 
and outside the company disagree with me on this) is that an incorrect 
statement is better than no statement at all.

Allow me to show you why I think that way: imagine a scenario where a 
person needs some information about a topic and stumbles upon a page on 
freebase.com (or a similar app driven by Freebase data that allows the 
user to contribute back).

This topic has a location associated with it. In one case, the location 
is incorrectly georeferenced; in the other, there is no georeferencing 
information. When there is georeferencing information, the app shows the 
location on a map (like www.freebase.com does). The map is empty if 
there are no lat/long values.

Which one of the two scenarios is more likely to entice a casual 
contribution?

First, the users can be separated into two groups: those who realize the 
information is incorrect and those who don't. The probability of 
enticing contribution from those who don't is, obviously, zero (aka, 
"mask the real problem", as you say).

But what about the probability of contributions from the other group in 
the two different cases?

Unfortunately, we don't have hard evidence about this (yet), but my 
personal experience tells me that I'm much more drawn to fix things than 
to add new information.

Here is a breakdown of my personal perception:

1) fixing things doesn't require ontological thinking: I trust that the 
data is in the right place, it's just wrong.

2) the easier it is for me to estimate the burden of contribution 
(dragging a pushpin on a map vs. entering lat/long by hand), the higher 
my probability of contributing.

3) if the data is there but wrong, and the path to contribution is 
understood, the task at hand is "fix one thing".

The reward *and* the cognitive burden are easy to estimate. That means 
that I can easily take a few seconds to diverge from my task at hand and 
take a small but rewarding pleasure in my contribution.

4) if the data is *not* there, the task (at least to me) feels like "fix 
one in a collection of things that are missing this kind of data".

I was not in the mood for that, I just needed information about this 
topic. I don't even want to get started on something that requires me to 
change what I was doing and think differently. The reward estimate is 
low (one less thing in a million without coordinates is hardly 
exciting when you're not in the mood for that) and the cognitive burden 
is much harder to estimate (how do I know I'm not putting the data in 
the wrong place? I'm certainly not in the mood for schema lookup and 
ontological harmonization thinking).

                                 - o -

My work on Geographer (and more to come) is meant not only to take part 
in the prototyping of appealing data-entry apps around Freebase, but 
also to spawn constructive re-thinking and re-evaluation of assumptions 
that we sometimes take for granted and that emerged in completely 
different environments (say, libraries or museums) and solidified around 
completely different production/consumption dynamics.

Don't get me wrong: your criticism is valid, constructive and 
well-meaning and, please, keep it coming because it's extremely 
valuable. But I hope you now understand better where I'm coming from 
with Geographer.

-- 
Stefano Mazzocchi                              Application Catalyst
Metaweb Technologies, Inc.                      stefano at metaweb.com
-------------------------------------------------------------------


