[Data-modeling] Proposed changes to pseudonyms
Christopher R. Maden
crism at metaweb.com
Sat Jan 19 02:13:09 UTC 2008
The current model for pseudonyms in Freebase is not working, and we
propose to change it this coming week. Please comment or object here if
you care about this issue.
THE STATUS QUO:
Currently, a Pseudonym is a type of its own. It is linked to the
Person, Musical Group, or other thing that it represents. For instance,
Moby is a Pseudonym of Richard Melville Hall, a Person and Musical Artist.
Creative works produced under a pseudonym (currently, this only applies
to Musical Album and Musical Release) are credited to the actual
creator, with a “Released by Pseudonym” property that points to an
instance of Pseudonym. The compilation album _Go: The Very Best of Moby
Remixed_ is credited to Richard Melville Hall, released by the pseudonym
Moby.
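
For concreteness, the current arrangement can be sketched roughly as
follows. This is only an illustration in Python, not the actual Freebase
schema or API; the class and field names (Topic, Pseudonym, MusicalAlbum,
released_by_pseudonym) are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    """Hypothetical stand-in for a Person, Musical Group, etc."""
    name: str


@dataclass
class Pseudonym:
    """Status quo: a pseudonym is a separate instance linked to what it represents."""
    name: str
    represents: Topic


@dataclass
class MusicalAlbum:
    """The work is credited to the actual creator, with an optional pseudonym link."""
    title: str
    artist: Topic
    released_by_pseudonym: Optional[Pseudonym] = None


hall = Topic("Richard Melville Hall")
moby = Pseudonym("Moby", represents=hall)
go = MusicalAlbum(
    "Go: The Very Best of Moby Remixed",
    artist=hall,
    released_by_pseudonym=moby,
)
```

Note that the album points at two separate objects, and the name most
people know ("Moby") lives on the secondary one.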
This model is not working out. Some creative artists, such as Sting and
Bob Dylan, are far better known by their pseudonyms than by their real
names; even worse, at some point the real name and the pseudonym were
merged, so we now have nonsensical data like _The Dream of the Blue
Turtles_, recorded by Sting, released by the pseudonym Sting. The model
also does not address the careful collector’s desire for accurate
crediting, such as Robert Heinlein vs. Robert A. Heinlein vs. Robert
Anson Heinlein vs. R.A. Heinlein; it would not make sense to establish
each of those variations as a discrete instance of Pseudonym.
THE NEW PROPOSAL:
We will eliminate the Pseudonym type. A single person (band, etc.) will
have a single topic in Freebase; the community can decide on what the
primary name is, and other working identities will be present as
aliases. (It is likely that “Moby” would be the name for the topic, with
aliases of “Richard Melville Hall” and “Voodoo Child.”) Existing
instances of Pseudonym will be merged into their identities.
A new supporting type for Creative Work will be created. It will have a
machine-readable string property called “Credited to,” which will have
the literal string present on the work (the spine of the CD, the film
credits, the book cover). _Go: The Very Best..._ will be linked to Moby
as the performer, with the “Credited to” property set to “Moby.” This
is redundant as long as the performer’s name is “Moby,” but if someone
changes his display name to R. Melville Hall, the credit will remain
correct.
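
A rough sketch of the proposed shape, again in Python and again with
invented names: Topic, CreativeWork, and credited_to are assumptions for
illustration, and the sketch puts the credit string directly on the work
rather than on a separate supporting type, purely to keep it short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """One topic per person or band; other working identities are aliases."""
    name: str
    aliases: List[str] = field(default_factory=list)


@dataclass
class CreativeWork:
    """Links to the topic, plus the literal credit string printed on the work."""
    title: str
    performer: Topic
    credited_to: str


moby = Topic("Moby", aliases=["Richard Melville Hall", "Voodoo Child"])
go = CreativeWork(
    "Go: The Very Best of Moby Remixed",
    performer=moby,
    credited_to="Moby",
)

# Someone later changes the topic's display name; the old name becomes an alias.
moby.aliases.append(moby.name)
moby.name = "R. Melville Hall"

# The link follows the topic, but the printed credit is unchanged.
print(go.performer.name)
print(go.credited_to)
```

The credit is a plain string copied from the work itself, so it is
unaffected by renames or merges of the topic it is attached to.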
The drawback to this proposal is that it becomes a little harder to find
everything credited to a shared pseudonym such as Franklin W. Dixon. One
can filter all books by the “Credited to” property, as long as they are
all spelled correctly, but one would not be able to simply click on
“Franklin W. Dixon” and then see all the books credited to “him.”
However, the complications and confusion of the current model seem to
outweigh this drawback.
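
The filtering described above, and its sensitivity to spelling, might
look something like this. It is a sketch only: the Book class, the
books_credited_to helper, and all titles and writer names are
placeholders, not real Freebase data or queries.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Book:
    title: str
    author: str        # name of the topic for the actual writer
    credited_to: str   # literal string printed on the cover


def books_credited_to(books: List[Book], credit: str) -> List[Book]:
    """Return the books whose printed credit matches the given string exactly."""
    return [b for b in books if b.credited_to == credit]


# Placeholder data: two hypothetical ghostwriters sharing one pseudonym.
books = [
    Book("Placeholder Book One", "Ghostwriter A", "Franklin W. Dixon"),
    Book("Placeholder Book Two", "Ghostwriter B", "Franklin W. Dixon"),
    # Entered without the period, so an exact-match filter misses it.
    Book("Placeholder Book Three", "Ghostwriter B", "Franklin W Dixon"),
    Book("Placeholder Book Four", "Ghostwriter A", "Some Other Credit"),
]

matches = books_credited_to(books, "Franklin W. Dixon")
for book in matches:
    print(book.title, "-", book.author)
```

The third book is silently dropped because of the missing period, which
is the "as long as they are all spelled correctly" caveat in practice.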
It is possible that the Pseudonym type may re-emerge later to handle these
interesting cases after we better understand the data involved, but this
simpler model seems like a cleaner way to approach the problem at this
point.
Thanks,
Chris
--
Christopher R. Maden
Data Architect
Metaweb Technologies, Inc.
<URL: http://www.metaweb.com/ >