[Developers] WEX (wpid/fbid lookup) database

Alexander Marks al at metaweb.com
Fri Nov 7 09:37:31 UTC 2008


Using /wikipedia/en instead of /wikipedia/en_id in that sample code will give you the name of the Wikipedia article, plus the names of all of its redirects. They will be MQL key-quoted, however, so let me know if you need help decoding them. The latest quad dump is always linked from http://download.freebase.com/datadumps/ under the "2. Link Export" heading.
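
For reference, here is a rough sketch of that variation with the decoding step included. It assumes MQL key quoting writes each reserved character as "$" followed by four hex digits of its Unicode code point (so "$0028" stands for "("). The helper names decode_mql_key and wikipedia_names are made up for illustration and are not part of any Freebase library.

```python
import re

# Assumption: a quoted character is "$" plus four hex digits of its
# Unicode code point, e.g. "$0028" for "(".
_QUOTED = re.compile(r"\$([0-9A-Fa-f]{4})")


def decode_mql_key(key):
    """Turn an MQL key-quoted string back into plain text."""
    return _QUOTED.sub(lambda m: chr(int(m.group(1), 16)), key)


def wikipedia_names(lines, namespace="/wikipedia/en"):
    """Yield (guid, decoded name) for every key in the given namespace."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip rows that are not four tab-separated columns
        src, prop, dst, val = fields
        if prop == "/type/object/key" and dst == namespace:
            yield src, decode_mql_key(val)
```

Passing the open quad dump file as lines should yield one pair per article name or redirect, so a guid with several redirects appears several times.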

Al

----- Original Message -----
From: "Sam Halliday" <sam.halliday at gmail.com>
To: "For discussions about MQL, Freebase API and apps built on Freebase" <developers at freebase.com>
Sent: Friday, November 7, 2008 12:52:42 AM GMT -08:00 US/Canada Pacific
Subject: Re: [Developers] WEX (wpid/fbid lookup) database

Good to know about the WEX coming soon!

Sorry, typo in my original e-mail... it's the wpname I am most  
interested in, not the wpid. If future freebase dumps included wpname  
and the wikipedia redirects, I'd be very very happy.

For future reference, where is the freebase-datadump-quadruples.tsv  
file? I can't find it in the freebase data dump.

On 6 Nov 2008, at 23:11, Alexander Marks wrote:

> Hey Sam. We're working on a new WEX dump that will be out soon. Note  
> that you can, however, generate the guid->wpid map that you want  
> using the quad dump with something like this python script:
>
>  dump = open("freebase-datadump-quadruples.tsv", "r")
>  out = open("guid2wpid.tsv", "w")
>  for line in dump:
>      # strip the trailing newline so val does not carry it into the output
>      src, prop, dst, val = line.rstrip("\n").split("\t")
>      if prop == "/type/object/key" and dst == "/wikipedia/en_id":
>          out.write("%s\t%s\n" % (src, val))
>  dump.close()
>  out.close()
>
> which will give you a format like this:
>
>  /guid/9202a8c04000641f8000000000009e89 3746
>  /guid/9202a8c04000641f8000000000032ded 25493
>  ...
>
> As for Wikipedia article redirects, you will always need the WEX
> dump for that, although replacing /wikipedia/en_id with
> /wikipedia/en above might get you part of the data you want. The
> /wikipedia/en Freebase namespace contains both the name of the
> Wikipedia article and the names of all redirects to that article.
>
> Hope that helps,
>
> Al
>
> ----- Original Message -----
> From: "Sam Halliday" <sam.halliday at gmail.com>
> To: developers at freebase.com
> Sent: Tuesday, November 4, 2008 2:30:09 AM GMT -08:00 US/Canada  
> Pacific
> Subject: [Developers] WEX (wpid/fbid lookup) database
>
> Hi all,
>
> I noticed that freebase released a new database dump for October, but
> no corresponding WEX dump. My interests are not in the WEX part of the
> latter database, but in the piece that links wpids to fbuids and
> gathers all the Wikipedia redirects... I am confused why this is not a
> part of the freebase download in the first place.
>
> I used freebase for a project on the assumption that new data would be
> available on a quarterly basis. Is this not the case for the WEX data?
>
> I'd also like to request that the wpid and redirect tables be included
> in the freebase data dumps in the future, and not exclusively in the
> WEX data. I understand that the WEX can be built from the wikipedia
> data, but the wpid/fbuid lookup cannot... it must be freebase that do
> this.
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers


