[Developers] /wikipedia/en_id not giving results for redirected ids
Shug Boabby
shug.boabby at gmail.com
Thu Jul 24 21:35:10 UTC 2008
Given that the WPIDs are no more reliable than the names, and that the
Names also have human-readable bonus points, I don't think there is
any reason to use the WPIDs at all. My project has to use either WPID
or Wikipedia Names as IDs (unfortunately I cannot use Freebase IDs)...
and I've gone for the Name that is the end of the redirect trail at
that moment in time, plus a Freebase lookup (both local and remote are
supported) to try and get some future-proofing in case names change.
I would ask that future WEX dumps document that the name is (sort of)
the Wikipedia name... and I would still like somebody to clarify for
me how to get the actual "wikipedia/en" name, given the article.name
from the WEX dumps (spaces to underscores, but what else?).
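
For illustration, the name-based lookup quoted further down could be built from a WEX article.name roughly as follows. This is only a sketch: the helper names (wex_name_to_key, build_lookup_query) are hypothetical, and the only transformation applied is the spaces-to-underscores step mentioned in this thread. Any further escaping that "/wikipedia/en" keys may require is not confirmed here and is deliberately left out.

```python
import json


def wex_name_to_key(article_name):
    """Turn a WEX article.name into a candidate /wikipedia/en key.

    Assumption: only the spaces-to-underscores step is applied, since that
    is the only rule stated in this thread. Other characters pass through
    unchanged, which may not match the real key encoding.
    """
    return article_name.replace(" ", "_")


def build_lookup_query(article_name):
    """Build the query that asks for the GUID and the canonical en_id.

    The shape mirrors the query quoted in this thread: the "key" clause
    matches on the Wikipedia name, and the "a:key" clause asks for the
    numeric ID stored under /wikipedia/en_id.
    """
    return {
        "a:key": {"namespace": "/wikipedia/en_id", "value": None},
        "guid": None,
        "key": {
            "namespace": "/wikipedia/en",
            "value": wex_name_to_key(article_name),
        },
    }


if __name__ == "__main__":
    # Print the JSON body for the redirect page discussed in this thread.
    print(json.dumps(build_lookup_query("Mr Spock"), indent=2))
```

The sketch only produces the query body; sending it to Freebase (or to a local copy) is left to whatever client is already in use.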
2008/7/24 Alec Flett <alecf at metaweb.com>:
> It really depends on what you're trying to do.
>
> I think in general the previous advice has been based on people having
> Freebase IDs and mapping them back to Wikipedia, for which the en_id
> is appropriate.
>
> If you want to go the other way, it sounds like names make more sense.
>
> Alec
>
> On Jul 24, 2008, at 2:30 AM, Shug Boabby wrote:
>
>> But this advice contradicts what Freebase have previously advised...
>> to use the WPID instead of the Wikipedia Name. Now you're saying that
>> for future proofing, only the Name can be used? I'd strongly suggest
>> that Freebase store the redirected WPIDs as well as the names.
>>
>> Also... what is the formula to change between the Wikipedia Names in
>> the WEX files and the names in the Freebase API? Is it simply a case
>> of changing spaces to underscores, or is there other magic? I'd also
>> strongly suggest that this be made consistent in future dumps.
>>
>> 2008/7/24 Kurt Bollacker <kurt at metaweb.com>:
>>>
>>> I guess Brian beat me to that suggestion.
>>>
>>>
>>> On Wed, Jul 23, 2008 at 11:01:33PM +0000, Kurt Bollacker wrote:
>>>>
>>>> On Wed, Jul 23, 2008 at 11:40:38PM +0100, Shug Boabby wrote:
>>>>> Hi all,
>>>>>
>>>>> The following query will return the Freebase GUID for a Wikipedia
>>>>> article with the given WPID (this corresponds to the "Spock" page).
>>>>>
>>>>> {
>>>>> "guid" : null,
>>>>> "key" : {
>>>>> "namespace" : "/wikipedia/en_id",
>>>>> "value" : "53571"
>>>>> }
>>>>> }
>>>>>
>>>>> However, the following returns null (this ID corresponds to the
>>>>> "Mr_Spock" page, which redirects to "Spock").
>>>>>
>>>>> {
>>>>> "guid" : null,
>>>>> "key" : {
>>>>> "namespace" : "/wikipedia/en_id",
>>>>> "value" : "3462975"
>>>>> }
>>>>> }
>>>>>
>>>>> Why does this happen? How do I fix it?
>>>>
>>>> If you store "Mr_Spock" beside "3462975", you could use:
>>>>
>>>> {
>>>> "a:key" : {
>>>> "namespace" : "/wikipedia/en_id",
>>>> "value" : null
>>>> },
>>>> "guid" : null,
>>>> "key" : {
>>>> "namespace" : "/wikipedia/en",
>>>> "value" : "Mr_Spock"
>>>> }
>>>> }
>>>>
>>>> Which returns:
>>>>
>>>> {
>>>> "a:key" : {
>>>> "namespace" : "/wikipedia/en_id",
>>>> "value" : "53571"
>>>> },
>>>> "guid" : "#9202a8c04000641f8000000000068479",
>>>> "key" : {
>>>> "namespace" : "/wikipedia/en",
>>>> "value" : "Mr_Spock"
>>>> }
>>>> }
>>>>
>>>> You now get the numeric wpid of the actual article and the freebase
>>>> GUID.
>>>>
>>>>
>>>>
>>>> Kurt :-)
>>>>
>>> _______________________________________________
>>> Developers mailing list
>>> Developers at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/developers
>>>