[Data-modeling] how much work done on modeling of personal names -- even for surname + given name?
Scott Meyer
sm at metaweb.com
Thu Mar 5 20:09:49 UTC 2009
Tom Morris wrote:
> I agree that modeling in the abstract is a fool's game, but I'm a
> little confused by the assertion that there's no structured name data
> available. I run into it at every turn. Here's a database that I was
> looking at loading last night
> http://bioguide.congress.gov/scripts/biodisplay.pl?index=K000107 with
> surname and given names stored separately. Here's the same person in
> another database that has the surname split out
> http://dbpedia.org/page/John_F._Kennedy Pretty much any
> corporate/organizational/governmental database with personal names is
> going to have some type of structure.
Right, and it is irksome to give up that structure to load things
into Freebase. However, how useful would it be to have a mixture
of more and less structured data?
If we created the "obvious" CVT with first_name and last_name and
used that (in addition to the current name) in cases where we have
structured data, how valuable would that be?
Would you constrain query results to people having structured name
data? If not, then you're stuck trying to figure out a sort order;
if so, you may miss results.
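
To make the trade-off concrete, here is a rough Python sketch. The
Person class, its first_name/last_name fields, and the sample records
are all made up for illustration; this is not Freebase's schema or
query API, just a toy model of "some records have structured names,
some only have the plain name".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str                         # the existing free-text name
    first_name: Optional[str] = None  # hypothetical structured field
    last_name: Optional[str] = None   # hypothetical structured field


# Illustrative data: two records with structured names, two without.
people = [
    Person("John F. Kennedy", first_name="John", last_name="Kennedy"),
    Person("Madonna"),
    Person("Abraham Lincoln", first_name="Abraham", last_name="Lincoln"),
    Person("Bill Clinton"),
]

# Option 1: constrain results to people with structured names.
# The sort by surname is well defined, but unstructured records vanish.
structured_only = sorted(
    (p for p in people if p.last_name is not None),
    key=lambda p: (p.last_name, p.first_name or ""),
)


# Option 2: keep everyone and fall back to the plain name when there
# is no surname. Nobody is dropped, but the ordering is inconsistent.
def sort_key(p: Person) -> str:
    return p.last_name if p.last_name is not None else p.name


everyone = sorted(people, key=sort_key)

print([p.name for p in structured_only])
print([p.name for p in everyone])
```

In the first list Madonna and Bill Clinton are missing. In the second,
Bill Clinton is filed under "B" while Kennedy and Lincoln are filed by
surname, which is the sort-order problem in miniature.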
-Scott