[Developers] Freebase and the Politics of Groups
Robert Cook
robert at metaweb.com
Sun Sep 16 20:40:34 UTC 2007
Clay is a great writer on this subject, and these particular essays
have held up well over time.
Below are some answers to your questions/problems:
On Sep 16, 2007, at 12:35 PM, Rich Morin wrote:
> While re-reading a couple of Clay Shirky's essays, I
> started wondering how well Freebase's current policies
> will handle unintentional and/or malicious degradations
> of the data.
>
> Clearly, the Metaweb developers have thought about the
> matter; the permissions structure (patterned after Unix)
> gives them a powerful tool for controlling access. I
> haven't seen any mention of an explicit "rollback", but
> I assume that something of this nature exists.
>
> However, this leaves Metaweb (and/or the user community)
> with the problem of administering permissions, etc. The
> experience of Wikipedia indicates that this is possible,
> but that it can involve a significant amount of effort.
Fortunately, these permissions are limited to schemas, which will
change much less frequently than other data. The limiting factor for
schemas isn't the number of people with permissions but, rather, the
number of people who are experienced enough with data modeling to make
the right choices. Our hope is to find 100 of these people over time,
who should be able to handle a large flow of schema changes.
We currently don't have plans to lock down instances the way
Wikipedia does, but if we do, it should be easier to maintain them
given the greater structure inherent in the Metaweb infrastructure.
>
> Moreover, I'd contend that Freebase differs quite a bit
> from Wikipedia in the ways in which it will be used. If
> I'm reading a Wikipedia article, I may be able to guess
> that some misinformation has been inserted. However, if
> I'm doing data mining on Freebase, the misinformation may
> be obscured by the processing.
I think this depends on the nature of the error or vandalism. With a
structured data set, you can discover inconsistencies automatically
(e.g., people who died in the 19th century are unlikely to have acted
in a film).
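
As a rough illustration of that kind of automatic check, here is a
minimal sketch. The Person and FilmCredit records and their field names
are hypothetical stand-ins, not the actual Freebase schema or API; the
check simply flags film credits dated after the actor's recorded death.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical record; not the real Freebase /people/person type.
    name: str
    death_year: Optional[int]  # None if unknown or still living


@dataclass
class FilmCredit:
    # Hypothetical record linking an actor to a film.
    actor: str
    film: str
    release_year: int


def find_posthumous_credits(people: List[Person],
                            credits: List[FilmCredit]) -> List[FilmCredit]:
    """Return credits whose film was released after the actor's recorded death."""
    death_by_name = {p.name: p.death_year
                     for p in people if p.death_year is not None}
    suspicious = []
    for credit in credits:
        died = death_by_name.get(credit.actor)
        if died is not None and credit.release_year > died:
            suspicious.append(credit)
    return suspicious
```

A flagged credit is only a candidate for review, since posthumous
releases and name collisions do happen; the point is that structured
data makes the candidate list cheap to produce.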
Regarding Paul's response to this email, we will soon be rolling out
the ability to see changes to a topic in a more human-readable way,
as well as changes to all instances of a type, and all changes made
by a particular user to any topic.
One report that we are considering is: "Show me all of the changes to
topics in this domain, ordered from lowest to highest reputation of the
user who made the change", where reputation is determined crudely by
how long the user has been active and how many long-standing
contributions they have made that haven't been reversed. This would
bring forward edits that are more likely to be vandalism or errors.
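
A minimal sketch of how such a report might be ordered follows. The
User and Change records, the reputation formula, and the edit_weight
value are all assumptions made for illustration; the only part taken
from the description above is that reputation grows with time active
and with unreversed contributions, and that low-reputation edits sort
first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class User:
    # Hypothetical fields, not an actual Freebase user record.
    name: str
    days_active: int
    surviving_edits: int  # long-standing contributions never reversed


@dataclass
class Change:
    topic: str
    user: str
    summary: str


def reputation(user: User, edit_weight: float = 10.0) -> float:
    """Crude score: time active plus weighted count of surviving edits.

    The weight of 10.0 is an arbitrary placeholder.
    """
    return user.days_active + edit_weight * user.surviving_edits


def review_queue(changes: List[Change], users: List[User]) -> List[Change]:
    """Order changes so edits by the lowest-reputation users come first."""
    by_name = {u.name: u for u in users}

    def score(change: Change) -> float:
        user = by_name.get(change.user)
        # Unknown users get the lowest possible score.
        return reputation(user) if user is not None else 0.0

    return sorted(changes, key=score)
```

Because sorted() is stable, changes by users with equal scores keep
their original (for example, chronological) order.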
Also, as you suggested above, we have mechanisms to roll back all
changes by a particular user and to detect large-scale changes in real
time.
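
One way to picture a per-user rollback, purely as a sketch: assume
every write is kept in an append-only log and the current state is
rebuilt by replaying that log while skipping one user's writes. The
Write record and the replay function are hypothetical and say nothing
about how the Metaweb store actually implements this.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Write:
    # Hypothetical log entry: one user sets one property on one topic.
    user: str
    topic: str
    prop: str
    value: str


def replay(log: List[Write],
           excluded_user: Optional[str] = None) -> Dict[Tuple[str, str], str]:
    """Rebuild state from the log, ignoring all writes by excluded_user."""
    state: Dict[Tuple[str, str], str] = {}
    for write in log:
        if excluded_user is not None and write.user == excluded_user:
            continue  # drop this user's contribution entirely
        state[(write.topic, write.prop)] = write.value  # later writes win
    return state
```

Skipping a user's writes restores whatever value the previous writer
left, and later edits by other users are preserved.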
None of these techniques will solve the problem of a high-reputation
user entering disinformation, which ultimately will limit the kinds
of data that can be created in a "post-hoc moderated" system like
Wikipedia or Freebase. (Although I have a friend who works at a
university teaching hospital who says doctors there regularly use
Wikipedia as a reference for diseases, which, despite my boundless
enthusiasm for Wikipedia's quality control processes, makes me quite
nervous.)
> I'd be pleased to see responses, but be sure to read these
> two essays before weighing in...
>
> A Group Is Its Own Worst Enemy
> http://shirky.com/writings/group_enemy.html
>
> Social Software and the Politics of Groups
> http://shirky.com/writings/group_politics.html
>
> -r
> --
> http://www.cfcl.com/rdm Rich Morin
> http://www.cfcl.com/rdm/resume rdm at cfcl.com
> http://www.cfcl.com/rdm/weblog +1 650-873-7841
>
> Technical editing and writing, programming, and web development
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers