[Developers] Built With Freebase: Creative Commons Car Pictures

Paul Houle paul at ontology2.com
Fri Apr 3 20:19:56 UTC 2009


Tom Morris wrote:
>
> o.e. ?
>   
ontological engineering
>
> I'd be interested in:
>
> - A relative comparison of the strengths and weaknesses of Freebase as
> compared to DBpedia, if there's more to it than what's described above
>   
    The short answer is that it depends on the problem domain. Freebase
Commons has very good types for some things, mediocre types for others,
and practically no types for the rest. Likewise, there are some areas
where people have been great about filling in infoboxes in Wikipedia,
and other areas where they've done poorly. There was a good blog posting
a while back about a guy researching presidents who could get good
infobox data from DBpedia for only about half of them. My impression is
that more than 50% of topics in Freebase are untyped; in most cases,
however, a person could easily assign an existing type.

    Taking a wider view, DBpedia is drawing from a large collection of
more than 10 gigafacts. Today's information extraction technology isn't
up to the task of extracting them, but the existence of large generic
databases may lead to rapid progress. If the government were interested
in lowering the unemployment level, it could hire people to get the job
done for less than it cost to bail out AIG. DBpedia's serious problem at
the moment, in my mind, is that it's going to need a Cyc-like mechanism
to attach context to triples if it's going to deal with the large amount
of information about fiction that's in Wikipedia. Many Wikipedians and
the W3C think the problem can be dealt with by applying context to
topics, but I disagree.
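
    To make the triple-versus-topic distinction concrete, here's a toy
sketch in Python. The Statement shape, the context labels and the
facts_in() helper are all made up for illustration; this is not how Cyc,
DBpedia or Freebase actually store anything. The point is just that one
topic can carry statements that hold in different contexts, which a
single "fictional" flag on the topic can't express.

```python
from collections import namedtuple

# Hypothetical statement shape: an ordinary triple plus a context slot.
Statement = namedtuple("Statement", "subject predicate obj context")

statements = [
    # True inside the stories, not in the real world.
    Statement("Sherlock Holmes", "lives_at", "221B Baker Street",
              "fiction"),
    # True in the real world, even though the subject is fictional.
    Statement("Sherlock Holmes", "created_by", "Arthur Conan Doyle",
              "real world"),
    Statement("Arthur Conan Doyle", "born_in", "Edinburgh",
              "real world"),
]


def facts_in(context, stmts):
    """Return the plain triples asserted in the given context."""
    return [(s.subject, s.predicate, s.obj)
            for s in stmts if s.context == context]


real_world_facts = facts_in("real world", statements)
fiction_facts = facts_in("fiction", statements)
```

Here "Sherlock Holmes" shows up as a subject in both result lists, so
the context has to ride on the statement, not on the topic.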

    Freebase, on the other hand, has pulled in very high-quality,
structured "database"-style data, like the fueleconomy.gov data. It's
an alternative approach to the semweb that feels object-relational and
is a particularly good match for any kind of data that would fit well
in a relational db. It's got a sort of "context" in the quads, but it's
not at all like Cyc.
> - Feedback/suggestions on facilities that Freebase could implement to
> make this type of resolution project easier (and capture the results
> automatically).  Things like the candidate identification that your
> crawl did, as well as a way to crowd source fixing the false positives
> and confirming marginal scoring resolutions.
>
>   
    Well, the first step of the "o.e." I'm talking about is to
construct taxonomic skeletons for DBpedia and Freebase and lay them side
by side; I'll be able to say something more informed at that point.
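
    Roughly, "lay them side by side" could look like the sketch below.
It assumes each skeleton is a plain child -> parent mapping with no
cycles, and it pairs types by lowercased label only, which is far cruder
than real resolution would need. The type names are toy placeholders,
not actual DBpedia or Freebase types, and both helper functions are
hypothetical.

```python
def ancestors(node, parent_of):
    """Follow child -> parent links from node up to the root."""
    chain = []
    while node in parent_of:  # assumes the skeleton has no cycles
        node = parent_of[node]
        chain.append(node)
    return chain


def side_by_side(left, right):
    """Pair types whose lowercased labels match; keep the rest unpaired."""
    left_nodes = set(left) | set(left.values())
    right_nodes = set(right) | set(right.values())
    right_by_key = {n.lower(): n for n in right_nodes}

    rows = [(n, right_by_key.get(n.lower())) for n in sorted(left_nodes)]
    matched = {m for _, m in rows if m is not None}
    # Types that exist only on the right-hand side.
    rows += [(None, n) for n in sorted(right_nodes - matched)]
    return rows


# Toy skeletons (child -> parent); placeholders, not real type systems.
skeleton_a = {"Car": "Vehicle", "Vehicle": "Thing", "Person": "Thing"}
skeleton_b = {"car": "vehicle", "truck": "vehicle"}

rows = side_by_side(skeleton_a, skeleton_b)
for left_type, right_type in rows:
    print("%-10s | %s" % (left_type or "-", right_type or "-"))
```

The unpaired rows are the interesting ones: they're where one side has
a type the other lacks, or where the labels simply don't line up.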

