[Developers] libraries/techniques for extracting data from the Wikipedia to feed to freebase

David Roberts dvdr18 at gmail.com
Thu Feb 26 05:37:03 UTC 2009


I've submitted an issue for uploading BODR (Blue Obelisk Data
Repository) chemical element/isotope data into Freebase, if you're
interested:
https://bugs.freebase.com/browse/DA-636

--
David Roberts
http://purl.org/david



2009/2/26 Raymond Yee <raymond.yee at gmail.com>:
> Does anyone out there have a lot of experience scraping Wikipedia for
> facts? The applications are many, but some examples I have in mind
> right now include:
>
> 1) extracting data about chemical elements -- e.g. boiling points of
> elements
>
> 2) American politicians at the federal, state, and municipal levels
>
> 3) visual artists and their works
>
> One thing that has surprised me about Freebase is the patchiness of
> the data in it. I wanted to plot the boiling points of the elements
> against atomic number, but many of the elements are missing boiling
> points. If you go to
>
> http://is.gd/kVb1
>
> and hit "Read>>", you'll get a list of elements without boiling
> points (as of 2009-02-26T04:53:34.3750Z, that is).
>
> So what I'd like to do is use a set of Wikipedia parsers to extract
> data that I find useful and push it into Freebase for some projects I
> have in mind. My quick experience with DBpedia is that it's no better
> for chemical elements -- but I might just be misunderstanding it.
>
> Does Freebase have any tools it can release that we can adapt for
> specific purposes to push more data into Freebase?
>
> Thanks,
> -Raymond
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers
>
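
The extraction step Raymond describes (pull a fact such as a boiling
point out of an article's infobox, then list the elements for which
no value was found) can be sketched roughly as below. This is only a
minimal illustration, not a tool from Freebase or Wikipedia: the
sample wikitext, the field names "atomic number" and
"boiling point K", and the helpers parse_infobox, to_float,
boiling_points and missing_boiling_points are all assumptions made up
for the example. Real infobox templates use different field names and
change over time, and pushing the results into Freebase is not shown.

```python
import re

# Hypothetical, simplified infobox wikitext keyed by article title.
# Field names and values are illustrative only, not copied from Wikipedia.
SAMPLE_PAGES = {
    "Iron": "{{Infobox element\n| name = Iron\n| atomic number = 26\n"
            "| boiling point K = 3134\n}}",
    "Helium": "{{Infobox element\n| name = Helium\n| atomic number = 2\n"
              "| boiling point K = 4.222\n}}",
    "Oganesson": "{{Infobox element\n| name = Oganesson\n"
                 "| atomic number = 118\n| boiling point K = \n}}",
}

# Matches one "| key = value" line; the value may be empty.
FIELD_RE = re.compile(
    r"^\|[ \t]*([^=|\n]+?)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def parse_infobox(wikitext):
    """Return a dict of the "| key = value" fields found in the wikitext."""
    return {key: value for key, value in FIELD_RE.findall(wikitext)}


def to_float(text):
    """Convert text to a float, or return None if it is blank or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def boiling_points(pages):
    """Return (atomic number, title, boiling point or None), sorted by number."""
    rows = []
    for title, wikitext in pages.items():
        fields = parse_infobox(wikitext)
        rows.append((
            int(fields["atomic number"]),
            title,
            to_float(fields.get("boiling point K", "")),
        ))
    return sorted(rows, key=lambda row: row[0])


def missing_boiling_points(rows):
    """Return the titles of elements for which no boiling point was found."""
    return [title for _, title, bp in rows if bp is None]


if __name__ == "__main__":
    rows = boiling_points(SAMPLE_PAGES)
    for number, title, bp in rows:
        print(number, title, bp)
    print("missing:", missing_boiling_points(rows))
```

The rows with a value are what one would plot against atomic number;
the "missing" list is the same kind of gap report as the query linked
above, only computed from the scraped text rather than from Freebase.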

