[Developers] Bulk download from sandbox.freebase.com

Kavitha Srinivas ksrinivs at gmail.com
Tue Oct 9 18:20:05 UTC 2007


Hello,
    We wrote some JavaScript that uses cursors to read every instance
of a topic from sandbox.freebase.com.  The first few reads from the
cursor succeed, but after that the reads keep timing out.  We amended
our script to (a) retry several times, which did not help because it
appears to get stuck at the same point each time, and (b) use a limit
of 10 per read, but that will simply not scale.  Any help is
appreciated.
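
For reference, the retry strategy described above, combining (a) and
(b), might be sketched as below.  This is only an illustration in
Python, not the script we ran.  The names read_all and fetch_page are
hypothetical: fetch_page stands in for whatever issues one cursor read
and is assumed to return the results plus the next cursor, with a falsy
cursor meaning the end of the results.  It also assumes the service
tolerates a smaller limit on a retried read, which has not been
confirmed.

```python
import time


def read_all(fetch_page, limit=100, min_limit=10, max_retries=4,
             base_delay=1.0, sleep=time.sleep):
    """Yield every result from a cursor-paged read.

    fetch_page(cursor, limit) is a hypothetical callable that returns
    (results, next_cursor) and raises TimeoutError when a read times out.
    A cursor of None means "start from the beginning" (an assumption).
    """
    cursor = None
    while True:
        page_limit = limit  # start each page at the full page size
        for attempt in range(max_retries + 1):
            try:
                results, next_cursor = fetch_page(cursor, page_limit)
                break
            except TimeoutError:
                if attempt == max_retries:
                    raise  # give up; the caller can log the last cursor
                # Wait longer each time, then retry with a smaller page.
                sleep(base_delay * 2 ** attempt)
                page_limit = max(min_limit, page_limit // 2)
        yield from results
        if not next_cursor:
            return
        cursor = next_cursor
```

Resetting the limit after each successful page keeps throughput high
when only a few pages are slow, which a fixed limit of 10 would not.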

Thanks!
Kavitha


On Oct 8, 2007, at 3:53 PM, Tim Kientzle wrote:

> If you can come up with a good way to do this
> and performance really turns out to be a problem,
> we *may* be able to host it internally and provide
> the dump as a downloadable file.  (Have the dump
> regenerated once a week or so, perhaps?)
>
> No promises, but it's a possibility.
>
> You should, of course, try to make it work
> externally first.  Our internal connections
> are faster, but may not be as much faster as you
> think.
>
> TBKK
>
> Shawn Simister wrote:
>> I'm eager to see how this turns out. Even with a cursor returning 100
>> instances at a time it could take a while to get every instance in
>> Freebase. Maybe there's some way that Metaweb could run your query
>> locally and just let us download the JSON results as one file. Then we
>> can convert those query results to RDF locally. You might also consider
>> contacting the DBpedia.org folks. If I remember correctly, they were
>> also interested in getting RDF dumps of the Freebase data.
>>
>> Shawn
>>
>> John Giannandrea wrote:
>>> Kavitha Srinivas wrote:
>>>
>>>> Here's what we tried -- we tried to get any instance of /common/topic
>>>> and dump all of its links to other instances of /common/topic. When
>>>> we try this with no explicit limits set, this gives us some randomly
>>>> selected instances (within a default limit, which I guess is 100).
>>>>
>>>
>>> To succeed at this you will need to use the "cursor" feature of MQL
>>> documented here.
>>>
>>> http://www.freebase.com/view/helptopic?id=%239202a8c04000641f800000000544e139#cursors
>>>
>>> -jg
>>>
>>>
>>> _______________________________________________
>>> Developers mailing list
>>> Developers at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/developers
>>>
>>>
>>
>>


