[Developers] freebase and dbpedia
Robert Cook
robert at metaweb.com
Mon Nov 10 17:14:46 UTC 2008
Also, I should point out that most Wikipedia templates are quite noisy
and pose serious problems for automatic extraction. This makes sense
-- information in these templates is intended as static content in a
document, not as structured machine-readable data. There are major
variations in the formatting of dates, number values (instance-by-instance
variations between metric and imperial units, for example), parenthetical
asides, multivalue subfields, etc. It surprises me that some of these
templates are as clean as they are -- we really are lucky to enjoy the
efforts of so many compulsive organizers in the Wikipedia community.
But these clean templates are the exception, not the rule.
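
To make the kind of noise described above concrete, here is a minimal
sketch of normalizing one infobox field. It assumes a hypothetical
"height" field whose values mix metric and imperial units and carry
parenthetical asides; the function name and the patterns are
illustrative assumptions, not the rules Freebase actually uses.

```python
import re

# Illustrative patterns only (assumptions, not Freebase's extraction rules).
PAREN_ASIDE = re.compile(r"\([^)]*\)")              # e.g. "(6 ft 2 in)"
CENTIMETRES = re.compile(r"(\d+(?:\.\d+)?)\s*cm\b")
METRES = re.compile(r"(\d+(?:\.\d+)?)\s*m\b")
FEET_INCHES = re.compile(r"(\d+)\s*ft(?:\s*(\d+)\s*in)?")


def height_to_metres(raw):
    """Return a height in metres, or None if the value is not understood."""
    # Drop parenthetical asides, which usually repeat the value in other units.
    text = PAREN_ASIDE.sub(" ", raw).strip().lower()

    match = CENTIMETRES.search(text)
    if match:
        return round(float(match.group(1)) / 100, 2)

    match = METRES.search(text)
    if match:
        return round(float(match.group(1)), 2)

    match = FEET_INCHES.search(text)
    if match:
        feet = int(match.group(1))
        inches = int(match.group(2) or 0)
        return round((feet * 12 + inches) * 0.0254, 2)

    # Refuse to guess: unparseable values are left for manual review.
    return None
```

Returning None instead of a best guess reflects the conservative stance
on data quality: a value that cannot be parsed with confidence is simply
not extracted.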
Freebase has taken a conservative approach to data quality, so our data
growth from Wikipedia has been gradual. We are getting better at
mapping these templates, and, indeed, the Freebase corpus itself is
now being used to increase the certainty of extraction and thus the
yield. In the last few weeks, we have improved the quality of our
extraction algorithms and added hundreds of new template and category
mappings, and we are continuing to do so.
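
As a rough illustration of using an existing corpus to gauge the
certainty of an extraction, the sketch below scores a template mapping
by how often its extracted values agree with facts that are already
known. This is an assumption about how such a check could look, not a
description of the actual Freebase pipeline; the function name and the
data layout (dicts keyed by (topic, property) pairs) are hypothetical.

```python
def mapping_agreement(extracted, known):
    """Fraction of extracted values that match already-known values.

    Both arguments are dicts keyed by (topic, property) pairs; this layout
    is a hypothetical simplification. Returns None when no extracted value
    overlaps with the known facts, since nothing can be checked then.
    """
    checked = 0
    agreed = 0
    for key, value in extracted.items():
        if key in known:
            checked += 1
            if known[key] == value:
                agreed += 1
    if checked == 0:
        return None
    return agreed / checked
```

A mapping with high agreement on the overlapping facts could then be
trusted more for the values that are new, while a low score would flag
the mapping for review.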
R
On Nov 10, 2008, at 3:02 AM, John Giannandrea wrote:
>
> Ravi Iyer wrote:
>> I'm considering using the guid links to merge the data, but I
>> thought I'd ask what plans were already in place to import the
>> dbpedia data into freebase first.
>
> The dbpedia RDF specifically links to the Freebase data via an
> owl:sameAs predicate.
>
> The reason you see more RDF assertions at
> http://dbpedia.org/page/Bobby_Rahal
> than at
> http://rdf.freebase.com/rdf/guid.9202a8c04000641f800000000031a897
> is that dbpedia is exporting more raw data, such as wikipedia
> categories and all the infobox properties, at a low level.
>
> Freebase only has the data that has been specifically hand mapped to
> freebase schema. For example, because we don't have a racecar driver
> type yet, we have not mapped available data like
> p:firstWin • 1982 (xsd:integer) into freebase data. Over time we
> expect more of this data to get mapped automatically into
> corresponding freebase types.
>
> -jg
>
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers