[Freebase-discuss] [BULK] Re: Crawling Freebase for BTC 2012
tsheldon at google.com
Mon Apr 30 19:41:22 UTC 2012
URL params and user-agents (and, better yet, source IP) would be great to
have for this.
Unfortunately, the 403 replies won't actually have an x-metaweb-tid on them,
as they're blocked before they reach that system.
If you want to send your source IP directly to me
(tsheldon[at]google[dot]com), I can help diagnose why you got blocked.
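As a rough illustration of the identifiers discussed in this thread, here is a
minimal Python sketch of how a crawler might tag its requests and record the
diagnostic fields per response. The User-Agent string, the "client" URL
parameter, and the function names are all hypothetical, not anything Freebase
prescribes; the only name taken from the thread is the x-metaweb-tid header.

```python
import urllib.parse
import urllib.request

# Hypothetical identifiers: choose values that make your crawler recognisable.
USER_AGENT = "btc2012-crawler/0.1 (contact: you@example.org)"
TAG_PARAM = ("client", "btc2012")


def tagged_request(url):
    """Return a Request carrying an identifying User-Agent and URL parameter."""
    parts = urllib.parse.urlsplit(url)
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query.append(TAG_PARAM)
    tagged = urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )
    return urllib.request.Request(tagged, headers={"User-Agent": USER_AGENT})


def diagnostic_record(url, status, headers):
    """Collect the fields worth reporting for one response.

    `headers` is any mapping of response header names to values. The tid is
    None when the header is absent, which is the case for the 403 replies.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "url": url,
        "status": status,
        "user_agent": USER_AGENT,
        "tid": lowered.get("x-metaweb-tid"),
    }
```

Logging one such record per lookup gives you the URL param, User-Agent, and
(where present) transaction id to quote in a report. The source IP is not
visible from this code and has to be looked up separately.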
On Mon, Apr 30, 2012 at 11:17 AM, Michael Masouras <masouras at google.com> wrote:
> Hi Andreas, do you have anything we can identify your request by
> (User-Agent, url param) ?
> Your responses should have an x-metaweb-tid header - if you can pass a few
> of those that were problematic it can help us chase the problem down.
> On Sun, Apr 29, 2012 at 11:56 PM, Andreas Harth <andreas at harth.org> wrote:
>> Hi guys,
>> On 16/04/12 14:37, Andreas Harth wrote:
>>> I think I'll just go ahead, start the crawl and see how far I get.
>> for the Billion Triple Challenge crawl I was able to do 80694
>> lookups before hitting 403's (ACCESS DENIED). Single thread
>> access, below 100k lookups/day.
>> Please advise.
>> Best regards,
>> You are receiving this message because you are subscribed to the
>> Freebase-discuss mailing list.
>> To post a message to the list: Freebase-discuss at freebase.com
>> To unsubscribe, view archives, etc: http://lists.freebase.com/**
Freebase Operations | Google Inc. | San Francisco