[Data-modeling] Proposed changes to pseudonyms

Christopher R. Maden crism at metaweb.com
Sat Jan 19 02:13:09 UTC 2008


The current model for pseudonyms in Freebase is not working, and we 
propose to change it this coming week.  Please comment or object here if 
you care about this issue.

THE STATUS QUO:

Currently, a Pseudonym is a type of its own.  It is linked to the 
Person, Musical Group, or other thing that it represents.  For instance, 
Moby is a Pseudonym of Richard Melville Hall, a Person and Musical Artist.

Creative works produced under a pseudonym (currently, this only applies 
to Musical Album and Musical Release) are credited to the actual 
creator, with a “Released by Pseudonym” property that points to an 
instance of Pseudonym.  The compilation album _Go: The Very Best of Moby 
Remixed_ is credited to Richard Melville Hall, released by the pseudonym 
Moby.

This model is not working out.  Some creative artists, such as Sting and 
Bob Dylan, are far better known by their pseudonyms than by their real 
names; even worse, at some point the real name and the pseudonym were 
merged, so we now have nonsensical data like _The Dream of the Blue 
Turtles_, recorded by Sting, released by the pseudonym Sting.  The model 
also fails to address the careful collector’s desire for accurate 
crediting, such as Robert Heinlein vs. Robert A. Heinlein vs. Robert 
Anson Heinlein vs. R.A. Heinlein, since it does not make sense to 
establish each of those variations as a discrete instance of Pseudonym.

THE NEW PROPOSAL:

We will eliminate the Pseudonym type.  A single person (band, etc.) will 
have a single topic in Freebase; the community can decide on what the 
primary name is, and other working identities will be present as 
aliases.  (It is likely that “Moby” would be the name for the topic, with 
aliases of “Richard Melville Hall” and “Voodoo Child.”)  Existing 
instances of Pseudonym will be merged into their identities.
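
As a rough sketch of what merging a Pseudonym into its identity might 
look like, here is a small Python model.  The Topic class and the 
merge_pseudonym function are hypothetical names invented for this 
illustration; they are not the Freebase schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical stand-in for a single Freebase topic."""
    name: str                                   # primary (display) name
    aliases: set[str] = field(default_factory=set)


def merge_pseudonym(identity: Topic, pseudonym: str,
                    make_primary: bool = False) -> Topic:
    """Fold a former Pseudonym instance into the topic of its identity.

    If make_primary is true, the pseudonym becomes the display name and
    the old display name is kept as an alias.
    """
    if pseudonym == identity.name:
        return identity                         # already merged, nothing to do
    if make_primary:
        identity.aliases.add(identity.name)
        identity.aliases.discard(pseudonym)
        identity.name = pseudonym
    else:
        identity.aliases.add(pseudonym)
    return identity


# Assumed outcome of the community's choice: "Moby" is the primary name.
hall = Topic("Richard Melville Hall")
merge_pseudonym(hall, "Moby", make_primary=True)
merge_pseudonym(hall, "Voodoo Child")
```

Whether a given pseudonym becomes the primary name or an alias is left 
to a flag here, since that choice is the community's to make.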

A new supporting type for Creative Work will be created.  It will have a 
machine-readable string property called “Credited to,” which will have 
the literal string present on the work (the spine of the CD, the film 
credits, the book cover).  _Go: The Very Best..._ will be linked to Moby 
as the performer, with the “Credited to” property set to “Moby.”  This 
is redundant as long as the performer’s name is “Moby,” but if someone 
changes his display name to R. Melville Hall, the credit will remain 
correct.
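
To illustrate why the literal credit stays correct, here is a minimal 
Python sketch.  The class and field names (Topic, Credit, CreativeWork, 
credited_to) are assumptions made for this example, as is the choice to 
hang the credit record off the work; the real types may be laid out 
differently.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical stand-in for a person or band topic."""
    name: str


@dataclass
class Credit:
    """Hypothetical supporting type holding the literal credit string."""
    credited_to: str        # exactly as printed on the work


@dataclass
class CreativeWork:
    title: str
    performer: Topic        # link to the single topic for the artist
    credit: Credit


moby = Topic("Moby")
album = CreativeWork(
    title="Go: The Very Best of Moby Remixed",
    performer=moby,
    credit=Credit(credited_to="Moby"),
)

# Someone later changes the topic's display name...
moby.name = "R. Melville Hall"

# ...the link follows the topic, but the literal credit is untouched.
performer_name = album.performer.name
printed_credit = album.credit.credited_to
```

The link and the string are deliberately independent: the link answers 
“who made this,” and the string answers “what does the cover say.”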

The drawback to this proposal is that it becomes a little harder to find 
everything credited to a shared pseudonym such as Franklin W. Dixon. One 
can filter all books by the “Credited to” property, as long as they are 
all spelled correctly, but one would not be able to simply click on 
“Franklin W. Dixon” and then see all the books credited to “him.” 
However, the complications and confusion of the current model seem to 
outweigh this drawback.
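
The filtering workaround, and its sensitivity to spelling, can be 
sketched in Python as follows.  The find_by_credit function and the 
book records are invented for illustration (the titles are 
placeholders, not real data), and exact string matching is an 
assumption about how such a filter would behave.

```python
def find_by_credit(works, credited_to):
    """Return the works whose literal credit matches exactly."""
    return [w for w in works if w["credited_to"] == credited_to]


# Placeholder records; the third has a misspelled credit (missing period).
books = [
    {"title": "Book A", "credited_to": "Franklin W. Dixon"},
    {"title": "Book B", "credited_to": "Franklin W. Dixon"},
    {"title": "Book C", "credited_to": "Franklin W Dixon"},
]

matches = find_by_credit(books, "Franklin W. Dixon")
titles = [w["title"] for w in matches]
```

Here “Book C” is silently missed because of the misspelling, which is 
exactly the weakness of relying on a string rather than a linked topic.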

It is possible that the Pseudonym type may re-emerge later to handle these 
interesting cases after we better understand the data involved, but this 
simpler model seems like a cleaner way to approach the problem at this 
point.

Thanks,
Chris
-- 
Christopher R. Maden
Data Architect
Metaweb Technologies, Inc.
<URL: http://www.metaweb.com/ >

