[Developers] 503 error when trying to do a large write to Sandbox
Alexander Botero-Lowry
alex at foxybanana.com
Fri Feb 22 06:47:36 UTC 2008
I de-typed the old mess, and I did import in 25 exchange rate blocks.
>>> res = metaweb.readall([{'type':'/user/alexbl/default_domain/exchange_rate', 'target_of_exchange':None, 'amount':None, 'date_of_rate':None, 'source_of_exchange':'US $'}])
>>> len(res)
2516
[22:45] alex at temperantia: ~> wc -l USD-AUD-90-99.txt
2516 USD-AUD-90-99.txt
Seems like doing it in small blocks solved the problem. I will start importing
the other data a bit later tonight. At what point is it a good idea to start
doing these imports on freebase proper instead of sandbox?
Alex
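
For anyone following along, the block-by-block approach can be sketched as a small helper. This is only a minimal sketch, not the importer itself: `chunks` is a hypothetical name, the list of integers merely stands in for the generated queries, and the actual `metaweb.write` call is left out.

```python
def chunks(items, size):
    """Yield successive slices of `items`, each holding at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Start at 0 so the first block is included.
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for the list of generated queries (one per input line).
queries = list(range(2516))
blocks = list(chunks(queries, 100))
```

Each block would then be handed to `metaweb.write(block, credentials)` in turn, as in the importer quoted below.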
On Thu, Feb 21, 2008 at 01:03:38PM -0800, Bryan Cheung wrote:
> Alex,
>
> I believe that there are problems with having multiple queries in 1
> envelope. There has been some discussion about it here:
>
> http://lists.freebase.com/pipermail/developers/2008-January/001197.html
>
> Also, you may want to look at using "create":"unless_connected"
> instead of "create":"unless_exists" in the MQL Write Grammar section
> of the Freebase API Documentation:
>
> http://www.freebase.com/view/guid/9202a8c04000641f800000000544e111
>
> Essentially, "if you use unless_connected, then Metaweb looks for a
> matching object that is already connected. If it cannot find one, it
> creates a new one and connects it... ...Note that unless_connected
> only makes sense in nested clauses".
>
> Bryan
>
>
> On Feb 21, 2008, at 11:38 AM, Alexander Botero-Lowry wrote:
>
> > On Thu, Feb 21, 2008 at 11:02:56AM -0800, Jason Douglas wrote:
> >> I'm out of my depth here, but could your issue be related to this
> >> thread?
> >>
> >> Comparison operation in write?
> >> http://lists.freebase.com/pipermail/developers/2008-January/001198.html
> >>
> > I don't think so. I was doing a [{...},{...},{...}] style query.
> >
> > Alex
> >> -jason
> >>
> >>
> >> On Feb 20, 2008, at 4:31 PM, Alexander Botero-Lowry wrote:
> >>
> >>> Hi,
> >>>
> >>> Monday night I tried to import 2516 records as a single query on Sandbox.
> >>> After fixing some bugs in how I was formatting my dates, I finally got the
> >>> query to execute, but I received a 503. I tried the query a few more times
> >>> and then checked Sandbox to make sure nothing had happened, and lo and
> >>> behold, some of the entries showed up! At that point I realized that I
> >>> wasn't sure all of them were there, so I rewrote my importer script to
> >>> do it in blocks of 100, and I got an error that there were 2 unique
> >>> entries, which I looked up and determined was the result of
> >>> create=unless_exists not being able to disambiguate! So it seems like
> >>> somehow, even though I was using create=unless_exists, the entry got
> >>> added twice. I'm not exactly sure how transactions work internally,
> >>> so I can't really speculate further on how that happened.
> >>>
> >>> My importer follows:
> >>>
> >>> #!/usr/bin/env python
> >>>
> >>> import metaweb
> >>>
> >>> USERNAME=''
> >>> PASSWORD=''
> >>>
> >>> MONTH_MAP = {'Jan':1, 'Feb':2, 'Mar':3, 'Apr':4, 'May':5, 'Jun':6,
> >>>              'Jul':7, 'Aug':8, 'Sep':9, 'Oct':10, 'Nov':11, 'Dec':12}
> >>> TYPEID = '/user/alexbl/default_domain/exchange_rate'
> >>> SOURCE_CURR = {'name':'US $', 'type':'/finance/currency'}
> >>> TARGET_CURR = {'name':'Australian dollar', 'type':'/finance/currency'}
> >>>
> >>> def generate_query(a):
> >>>     # each input line holds a date (e.g. 2-Jan-90) and a rate
> >>>     data = a.split()
> >>>     # FIXME: find a better way to do this
> >>>     # rewrite the date from D-Mon-YY to YYYY-MM-DD
> >>>     rate_date = data[0].split('-')
> >>>     rate_date[2] = '19'+rate_date[2]
> >>>     rate_date[1] = "%02d" % (MONTH_MAP[rate_date[1]])
> >>>     rate_date[0] = "%02d" % int(rate_date[0])
> >>>     data[0] = '-'.join(reversed(rate_date))
> >>>
> >>>     q = {'create':'unless_exists',
> >>>          'id':None,
> >>>          'type':[TYPEID],
> >>>          'source_of_exchange':SOURCE_CURR,
> >>>          'target_of_exchange':TARGET_CURR,
> >>>          'amount':float(data[1]),
> >>>          'date_of_rate':data[0]
> >>>          }
> >>>     return q
> >>>
> >>> if __name__ == '__main__':
> >>>     query = [ generate_query(a) for a in open('USD-AUD-90-99.txt') ]
> >>>     credentials = metaweb.login(USERNAME, PASSWORD)
> >>>     # write in blocks of 100; start at 0 so the first block is not skipped
> >>>     for a in range(0, len(query), 100):
> >>>         result = metaweb.write(query[a:a+100], credentials)
> >>>         for r in result:
> >>>             print r['create'], r['id']
> >>>
> >>>
> >>> Before I added the stepper, it simply did what's inside the for loop
> >>> with the whole query list.
> >>>
> >>> The data format is like:
> >>> 2-Jan-90 0.7855
> >>> 3-Jan-90 0.7818
> >>> ...
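
One possible answer to the FIXME in the importer is to let the standard library do the date conversion. The sketch below assumes each input line holds a date and a rate separated by whitespace, as the importer's `a.split()` implies. `parse_rate_line` is a hypothetical name, and `%b` matches English month abbreviations only under the default (C or English) locale.

```python
from datetime import datetime


def parse_rate_line(line):
    """Turn a line such as '2-Jan-90 0.7855' into ('1990-01-02', 0.7855)."""
    date_text, amount_text = line.split()
    # %d accepts a day without a leading zero, %b is the abbreviated month
    # name (locale-dependent), and %y is a two-digit year.
    parsed = datetime.strptime(date_text, "%d-%b-%y")
    return parsed.strftime("%Y-%m-%d"), float(amount_text)
```

Note that `%y` maps 69 through 99 to 1969 through 1999, so it is safe for this 1990 to 1999 file, whereas the hard-coded `'19'` prefix would need changing for later years.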
> >>>
> >>> I will most likely write a script to query the ids and then de-type them,
> >>> and then do an import again with the 100-block-step version to see if
> >>> that's the only problem.
> >>>
> >>> Luckily this was all on sandbox :)
> >>>
> >>> Thanks,
> >>> Alex
> >>>
> >>> _______________________________________________
> >>> Developers mailing list
> >>> Developers at freebase.com
> >>> http://lists.freebase.com/mailman/listinfo/developers