[Developers] looking for people
John Giannandrea
jg at metaweb.com
Mon Mar 30 15:57:17 UTC 2009
Tom Morris wrote:
> (honorifics, middle names/initials, generational indicators)
> - indexing birth and death dates - they're a key identifying factor
Both good ideas. We have some code for dealing with names, but it is
part of our reconciliation efforts and, as far as I know, not yet in
the standard APIs.
We don't currently index non-text values (e.g. dates) in
api/service/search, but we will consider it.
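As a rough illustration of the name handling discussed above (honorifics,
middle names/initials, generational indicators), here is a minimal sketch
in Python. It is not the reconciliation code mentioned above and not part
of any Freebase API; the function names parse_name and match_key and the
token lists HONORIFICS and GENERATIONAL are hypothetical, chosen only for
this example.

```python
import re

# Hypothetical, deliberately short token lists for illustration only.
HONORIFICS = {"mr", "mrs", "ms", "dr", "sir", "rev", "hon", "sen", "rep"}
GENERATIONAL = {"jr", "sr", "ii", "iii", "iv"}


def parse_name(full_name):
    """Split a personal name into honorific, given, middle, family, suffix."""
    # Replace periods and commas with spaces so "Jr." and "Jr" compare equal.
    tokens = re.sub(r"[.,]", " ", full_name).split()

    honorific = None
    suffix = None
    # A leading token from the honorific list is treated as an honorific.
    if tokens and tokens[0].lower() in HONORIFICS:
        honorific = tokens.pop(0)
    # A trailing token from the generational list is treated as a suffix.
    if tokens and tokens[-1].lower() in GENERATIONAL:
        suffix = tokens.pop()

    given = tokens[0] if tokens else None
    family = tokens[-1] if len(tokens) > 1 else None
    middle = tokens[1:-1]  # everything between given and family name

    return {
        "honorific": honorific,
        "given": given,
        "middle": middle,
        "family": family,
        "suffix": suffix,
    }


def match_key(full_name):
    """Build a loose comparison key: lowercased given name plus family name."""
    parts = parse_name(full_name)
    return " ".join(p.lower() for p in (parts["given"], parts["family"]) if p)
```

Under these assumptions, "Thomas Leo Clancy Jr." and "Thomas Clancy" yield
the same key, while the middle name and suffix stay available as separate
fields for stricter matching.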
> - better extraction and/or indexing of names from Wikipedia articles
> - they almost always have the full name of the person in the lead
> sentence and sometimes have aliases as well.
We do currently index Wikipedia text as part of the search API. So a
search for Thomas Leo will find Tom Clancy as the first match.
I agree we should also do a better job of extracting specific aliases
from Wikipedia text.
> I'm not sure what data sets you guys use for testing, but the
> Congressional Bios might make an interesting addition if you don't
> already have them. It's about 12,000 names, all of which should exist
> in Freebase.
Thanks for the suggestion; we will check it out.
-jg