[Data-modeling] how much work done on modeling of personal names -- even for surname + given name?
Robert Cook
robert at metaweb.com
Wed Mar 4 18:23:26 UTC 2009
We've talked about this in the past but were overwhelmed by the edge
cases and didn't see that much value for all of the work. That may
have changed. This was the thinking:
We could add a "surname" property to /people/person which would
expect /people/family_name.
We could add another property, "given name(s)", to /people/person which
would expect /givennames/given_name.
- The given-name type should probably be moved to /people/.
- If a person has more than one known given name, the names would
be ordered appropriately.
We would then link people to the appropriate family and given names.
This could be done semi-algorithmically, but it would probably suffer
from a lot of errors. Questionable mappings (if they could in fact be
detected) could be queued up for human review.
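To make the semi-algorithmic part concrete, here is a rough Python sketch of a splitter that guesses given names and surname from a topic name and pushes anything it finds suspicious into a review queue. Everything in it is an assumption for illustration: the lookup tables, the NameGuess class and the function names are hypothetical, and none of it talks to Freebase.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical lookup tables, far from complete.
HONORIFICS = {"dr", "dr.", "mr", "mr.", "mrs", "mrs.", "ms", "ms.", "prof", "prof."}
SUFFIXES = {"jr", "jr.", "sr", "sr.", "ii", "iii", "iv"}
SURNAME_PARTICLES = {"van", "von", "de", "der", "den", "di", "da", "la", "le"}


@dataclass
class NameGuess:
    topic_name: str
    given_names: List[str]   # kept in the order they appear
    surname: str
    reasons: List[str] = field(default_factory=list)

    @property
    def needs_review(self) -> bool:
        # Any recorded reason sends the guess to a human.
        return bool(self.reasons)


def guess_name_parts(topic_name: str) -> NameGuess:
    """Naive Western-order guess: given name(s) first, surname last."""
    reasons = []
    if "," in topic_name:
        reasons.append("comma in name")  # may be "Surname, Given"
    tokens = topic_name.replace(",", " ").split()
    # Drop leading honorifics and trailing generational suffixes.
    while tokens and tokens[0].lower() in HONORIFICS:
        tokens.pop(0)
    while tokens and tokens[-1].lower() in SUFFIXES:
        tokens.pop()
    if len(tokens) < 2:
        reasons.append("fewer than two name tokens")
        return NameGuess(topic_name, tokens, "", reasons)
    # The surname is the last token, extended leftward over particles
    # such as "van", but always leaving at least one given name.
    start = len(tokens) - 1
    while start > 1 and tokens[start - 1].lower() in SURNAME_PARTICLES:
        start -= 1
    if start < len(tokens) - 1:
        reasons.append("surname particle")
    given = tokens[:start]
    if len(given) > 2:
        reasons.append("more than two given names")
    return NameGuess(topic_name, given, " ".join(tokens[start:]), reasons)


def split_for_review(topic_names):
    """Return (accepted guesses sorted by surname, guesses needing review)."""
    accepted, review_queue = [], []
    for name in topic_names:
        guess = guess_name_parts(name)
        (review_queue if guess.needs_review else accepted).append(guess)
    accepted.sort(key=lambda g: (g.surname.lower(), g.given_names))
    return accepted, review_queue
```

Note that a name written family-name-first, such as "Mao Zedong", passes through this sketch unflagged and with the parts swapped, which is exactly the kind of error that is hard to detect automatically.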
The properties could be added to /people/person and probably some of
the most obvious name mappings could happen (particularly if there are
people in the community willing to do it.) It would probably burn a
couple million primitives.
This raises a bigger question -- what other things would people like
to do with it? See distributions of given names over time? Or the
geographic distribution of family names?
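As one example of what linked given names would allow, here is a small Python sketch that counts given names per decade of birth. It assumes the data has already been pulled out as (given name, birth year) pairs; the function name and that input shape are hypothetical, not anything Freebase provides.

```python
from collections import Counter, defaultdict


def given_names_by_decade(people):
    """Count given names per decade.

    `people` is an iterable of (given_name, birth_year) pairs,
    an assumed input shape for this sketch.
    """
    counts = defaultdict(Counter)
    for given_name, birth_year in people:
        decade = (birth_year // 10) * 10  # e.g. 1905 -> 1900
        counts[decade][given_name] += 1
    return dict(counts)
```

Calling `given_names_by_decade(pairs)[1900].most_common(5)` would then list the five most frequent given names among people born in the 1900s, provided that decade is present in the data.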
R
On Mar 4, 2009, at 9:46 AM, Ed Laurent wrote:
> Not sure if/how Metaweb is handling this but I think it's a great
> idea. This will be especially useful for bibliographic import/export
> down the road. One suggestion is to create a "Person name" type
> composed of machine-readable string properties and then write a
> script to populate those properties from the space-separated words in
> the Topic names of "Person" topics. There isn't a formal standard that
> I've seen, but most Western people are listed as PersonalName
> MiddleName or MiddleInitial (if available) SurName. Contingencies
> would need to be made for "Dr.", "Van", etc., and a validation system
> could probably be set up using Typewriter. This suggestion is similar
> to "Scientific name" of "Organism classification".
>
> -Ed
>
>
> On Wed, Mar 4, 2009 at 12:18 PM, Raymond Yee <raymond.yee at gmail.com>
> wrote:
> I would like to be able to sort topics of /people/person (e.g.,
> http://www.freebase.com/view/people/person) by the person's surname
> but don't see any generic properties for a person's surname and given
> name. Am I missing something here, or is there no such field in the
> commons?
>
> I realize that modeling personal names is non-trivial (see
> http://en.wikipedia.org/wiki/Personal_name for example) -- but having
> a way to indicate surnames+first names would be very helpful for many
> topics.
>
> I looked around on the data-modeling list archives but couldn't find
> any discussion around this issue.
>
> What should I do to have sorting on surnames?
>
> -Raymond
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling