[Developers] Bulk download from sandbox.freebase.com
Kavitha Srinivas
ksrinivs at gmail.com
Tue Oct 9 18:20:05 UTC 2007
Hello,
We wrote some JavaScript that uses cursors to read every instance of a
topic from sandbox.freebase.com. The first few reads from the cursor
succeed, but after that every read times out. We amended our script to
(a) retry several times, which did not work because it appeared to get
stuck at the same point each time, and (b) fetch data with a limit of
10, but that will simply not scale. Any help is appreciated.
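
For illustration only, a cursor-paging loop with retries might look like
the sketch below. It is not the script mentioned above, and it is in
Python rather than JavaScript. Everything named here is an assumption:
`fetch` is a hypothetical transport function supplied by the caller, and
the envelope shape (a "query" key plus a "cursor" key that is true on
the first request, a string while more pages remain, and false at the
end) is our reading of the MQL cursor convention and should be checked
against the documentation.

```python
import time
from typing import Any, Callable, Dict, Iterator, List, Union

Cursor = Union[bool, str]
# Hypothetical transport: takes a request envelope, returns the decoded response.
Fetch = Callable[[Dict[str, Any]], Dict[str, Any]]


def read_all(fetch: Fetch, query: List[Dict[str, Any]],
             start_cursor: Cursor = True, max_retries: int = 5,
             base_delay: float = 1.0,
             sleep: Callable[[float], None] = time.sleep
             ) -> Iterator[Dict[str, Any]]:
    """Yield every result of `query`, one cursor page at a time."""
    cursor = start_cursor  # assumed: True asks the service for a new cursor
    while cursor is not False:
        envelope = {"query": query, "cursor": cursor}
        for attempt in range(max_retries):
            try:
                response = fetch(envelope)
                break
            except TimeoutError:
                # Wait longer before each retry of the same page.
                sleep(base_delay * 2 ** attempt)
        else:
            # Every attempt timed out; report the cursor so a later run
            # can pass it back in as start_cursor.
            raise RuntimeError(f"gave up at cursor {cursor!r}")
        yield from response["result"]
        # Assumed: a missing or false cursor means there are no more pages.
        cursor = response.get("cursor", False)
```

Passing the last good cursor back in as `start_cursor` is what would let
a run pick up where it stalled instead of starting over, provided the
service still accepts that cursor.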
Thanks!
Kavitha
On Oct 8, 2007, at 3:53 PM, Tim Kientzle wrote:
> If you can come up with a good way to do this
> and performance really turns out to be a problem,
> we *may* be able to host it internally and provide
> the dump as a downloadable file. (Have the dump
> regenerated once a week or so, perhaps?)
>
> No promises, but it's a possibility.
>
> You should, of course, try to make it work
> externally first. Our internal connections
> are faster, but may not be as much faster as you
> think.
>
> TBKK
>
> Shawn Simister wrote:
>> I'm eager to see how this turns out. Even with a cursor returning 100
>> instances at a time it could take a while to get every instance in
>> Freebase. Maybe there's some way that Metaweb could run your query
>> locally and just let us download the JSON results as one file. Then we
>> can convert those query results to RDF locally. You might also consider
>> contacting the DBpedia.org folks. If I remember correctly, they were
>> also interested in getting RDF dumps of the Freebase data.
>>
>> Shawn
>>
>> John Giannandrea wrote:
>>> Kavitha Srinivas wrote:
>>>
>>>> Here's what we tried -- we tried to get any instance of /common/topic
>>>> and dump all of its links to other instances of /common/topic. When
>>>> we try this with no explicit limits set, this gives us some randomly
>>>> selected instances (within a default limit, which I guess is 100).
>>>>
>>>
>>> To succeed at this you will need to use the "cursor" feature of MQL,
>>> documented here:
>>>
>>> http://www.freebase.com/view/helptopic?id=%239202a8c04000641f800000000544e139#cursors
>>>
>>> -jg
>>>
>>>
>>> _______________________________________________
>>> Developers mailing list
>>> Developers at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/developers
>>>
>>>
>>
>>