[Data-modeling] how much work done on modeling of personal names -- even for surname + given name?

glenn mcdonald gmcdonald at itasoftware.com
Thu Mar 5 18:31:03 UTC 2009


I'll just reforward this once, but I've done this name-structure  
research, too, and concluded that trying to model the actual  
complexity is a tar pit. Thus my results-oriented suggestion, which I  
encourage you to reconsider after you've realized what you're in for  
otherwise!


Begin forwarded message:

> From: glenn mcdonald <gmcdonald at itasoftware.com>
> Date: 4 March 2009 5:16:19pm EST
> To: Freebase data modeling mailing list <data-modeling at freebase.com>
> Subject: Re: [Data-modeling] how much work done on modeling of  
> personal names -- even for surname + given name?
> Reply-To: Freebase data modeling mailing list
> <data-modeling at freebase.com>
>
> Here's another approach: instead of isolating just surname, make your
> new property be /people/person/sortname. You can pre-populate this
> (both in bulk for current data and automatically for new data) by
> taking the regular names and flipping the last words to the beginning
> (special-casing the obvious suffixes), but then any individual
> sortname can be edited to override this. The big advantage of this is
> that it keeps you out of the quagmire of modeling all the internal
> semantic complexity of worldwide naming patterns, but still allows
> you to model the sorting, which is the thing you most often care about.
>
> name: Marcus Wagner
> sortname: Wagner, Marcus
>
> name: Dr. Marcus Wagner, Jr.
> sortname: Wagner, Jr., Dr. Marcus
>
> name: Gabriel José de la Concordia García Márquez
> sortname: García Márquez, Gabriel José de la Concordia
>
> name: 相川 七瀬 (Aikawa Nanase in kanji)
> sortname: あいかわななせ (Aikawa Nanase in hiragana)
>
> name: The Grinch
> sortname: Grinch, The
>
> glenn
>
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
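
The pre-population rule described in the forwarded message (flip the last word to the front, special-case the obvious suffixes, let any individual sortname be overridden) could be sketched roughly as below. This is only an illustration under stated assumptions, not Freebase's implementation: the function names, the SUFFIXES list, and the overrides mapping are all hypothetical, and the post does not say which suffixes count as "obvious".

```python
# Hypothetical suffix list; the post only says "the obvious suffixes".
SUFFIXES = {"jr", "jr.", "sr", "sr.", "ii", "iii", "iv"}


def default_sortname(name):
    """Guess a sortname by moving the last word (plus any suffix) to the front."""
    words = name.split()
    suffix = []
    # Peel trailing suffixes such as "Jr." so they stay next to the surname.
    while len(words) > 1 and words[-1].lower() in SUFFIXES:
        suffix.insert(0, words.pop())
    if len(words) < 2:
        return name  # nothing to flip; leave the name unchanged
    last = words.pop().rstrip(",")
    return ", ".join([last] + suffix) + ", " + " ".join(words)


def sortname(name, overrides):
    """Use a hand-edited sortname when one exists, else the automatic guess."""
    return overrides.get(name, default_sortname(name))


# Hand-edited entries for names the simple flip gets wrong (assumed storage).
overrides = {
    "Gabriel José de la Concordia García Márquez":
        "García Márquez, Gabriel José de la Concordia",
    "相川 七瀬": "あいかわななせ",
}

print(default_sortname("Marcus Wagner"))           # Wagner, Marcus
print(default_sortname("Dr. Marcus Wagner, Jr."))  # Wagner, Jr., Dr. Marcus
print(default_sortname("The Grinch"))              # Grinch, The
```

The automatic guess covers the first, second, and last examples from the message. The García Márquez and Aikawa Nanase examples are exactly the cases the simple flip gets wrong, which is why they appear here as overrides rather than as extra rules.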

