[Data-modeling] how much work done on modeling of personal names -- even for surname + given name?

Tom Morris tfmorris at gmail.com
Wed Mar 4 18:40:11 UTC 2009


On Wed, Mar 4, 2009 at 1:23 PM, Robert Cook <robert at metaweb.com> wrote:
> We've talked about this in the past but were overwhelmed by the edge cases
> and didn't see that much value for all of the work.  That may have changed.
>  This was the thinking:
> We could add a "surname" property to /people/person which would
> expect /people/family_name
> We could add another property, "given name(s)" to person which would
> expect /givennames/given_name
>    - this probably should be moved to /people/
>    - if a person has more than one known given name, then they would be
> ordered appropriately

You can't decouple given names and surnames and get accurate results.
All the elements of a single name need to be tied together, with the
aliasing of multiple names layered on top of that.  Otherwise you
can't correctly model Mary Smith and Mrs. Mary Smith Jones: she's
never known as Mary Smith Smith.
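
As a rough illustration of the difference, here is a minimal Python
sketch.  The class and field names (PersonName, Person, "middle") are
hypothetical and are not part of any existing schema; the "decoupled"
lists are only one reading of how separate given-name and surname
properties would end up being combined.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class PersonName:
    """One complete name: all of its parts stay together (hypothetical type)."""
    given: Tuple[str, ...]
    surname: str
    middle: Tuple[str, ...] = ()  # e.g. a birth surname carried as a middle name

    def display(self) -> str:
        return " ".join((*self.given, *self.middle, self.surname))


@dataclass
class Person:
    """A person with any number of complete names (aliases)."""
    names: List[PersonName] = field(default_factory=list)

    def displays(self) -> List[str]:
        return [n.display() for n in self.names]


# Coupled model: each alias is a self-contained name.
mary = Person(names=[
    PersonName(given=("Mary",), surname="Smith"),
    PersonName(given=("Mary",), middle=("Smith",), surname="Jones"),
])
print(mary.displays())

# Decoupled model: given names and surnames hang off the person separately,
# so nothing records which surname belongs with which given names.
given_names = ["Mary", "Smith"]
surnames = ["Smith", "Jones"]
decoupled = [" ".join(given_names + [s]) for s in surnames]
print(decoupled)  # includes a name she was never known by
```

The coupled version can only ever yield names that were explicitly
recorded, while the decoupled version has to guess at combinations.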

You'll also want a place to put honorific prefixes (Dr., Prof., Gen.,
Rev., etc.) and generational suffixes (Jr., Sr., III), particularly if
you're going to be constructing "full" names out of their component
pieces.  Nicknames are used in most English-speaking cultures.  Other
cultures have similar conventions, such as the German Rufname (call
name).
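
To make the "constructing full names out of their component pieces"
point concrete, here is a small Python sketch.  Everything in it is an
assumption for illustration: the StructuredName type, its field names,
the example person, and the formatting rules (prefix first, suffix last
with no comma) are not taken from any real schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StructuredName:
    """Hypothetical structured name with optional prefix, suffix and nickname."""
    given: Tuple[str, ...]
    surname: str
    prefix: Optional[str] = None    # honorific, e.g. "Dr.", "Rev."
    suffix: Optional[str] = None    # generational, e.g. "Jr.", "III"
    nickname: Optional[str] = None  # call name, e.g. "Bill"

    def full(self) -> str:
        # Join only the parts that are present, in a fixed (assumed) order.
        parts = [self.prefix, *self.given, self.surname, self.suffix]
        return " ".join(p for p in parts if p)

    def informal(self) -> str:
        # Prefer the nickname; fall back to the first given name.
        # Assumes at least one given name is recorded.
        first = self.nickname or self.given[0]
        return f"{first} {self.surname}"


name = StructuredName(
    given=("William", "Henry"),
    surname="Smith",
    prefix="Dr.",
    suffix="Jr.",
    nickname="Bill",
)
print(name.full())      # Dr. William Henry Smith Jr.
print(name.informal())  # Bill Smith
```

Keeping the prefix, suffix and nickname as separate fields means the
formal and informal forms can both be derived from one record.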

> This raises a bigger question -- what other things would people like to do
> with it?  See distributions of given names over time?  Of geographic
> distribution of family names?

This seems like the domain of applications.  Once it's possible to get
the data, people can do whatever they want with it.  But to start
with, just being able to sort, as Raymond wants to do, or to provide
accurate data for that government form with the Surname field, would
be a big step forward.
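
For instance, once the surname is available as its own field, sorting
becomes a one-liner.  This is only a sketch in Python; the Name type
and the sample people are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Name(NamedTuple):
    """Hypothetical minimal name record with a separate surname field."""
    given: str
    surname: str


def surname_key(n: Name) -> Tuple[str, str]:
    # Sort by surname first, then given name, ignoring case.
    return (n.surname.casefold(), n.given.casefold())


people: List[Name] = [
    Name("Adam", "Young"),
    Name("Zoe", "Adams"),
    Name("Mary", "Smith"),
]

# Without a surname field, all one can sort on is the display string.
by_display = sorted(people, key=lambda n: f"{n.given} {n.surname}".casefold())

# With it, the usual "surname, given name" ordering falls out directly.
by_surname = sorted(people, key=surname_key)

print([f"{n.given} {n.surname}" for n in by_display])
print([f"{n.surname}, {n.given}" for n in by_surname])
```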

Tom

