[Developers] corruption in first column of freebase-datadump-quadruples.tsv.bz2 for 2008/07/01

Arthur van Hoff AVH at zing.net
Sat Jul 19 19:57:36 UTC 2008


It might have to do with newlines in some of the fields. I've had a
similar problem dumping some podcast entries.
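
A minimal Python sketch of how damage like this could be spotted, and how
a field could be escaped before it is written out. It assumes a well-formed
first column is "/guid/" followed by exactly 32 hex digits, which is what
the intact rows in the quoted output below look like. The names
escape_field and find_bad_rows are hypothetical, chosen for illustration;
they are not part of any Freebase tooling, and this is only a guess at the
failure mode, not a description of the actual dump code.

```python
import re

# Assumption: a valid first column is "/guid/" plus exactly 32 hex digits,
# matching the intact rows in the quoted grep output.
GUID_RE = re.compile(r"/guid/[0-9a-f]{32}")


def escape_field(value):
    """Escape characters that would otherwise break a tab-separated row."""
    return (value.replace("\\", "\\\\")   # escape backslashes first
                 .replace("\t", "\\t")
                 .replace("\n", "\\n"))


def find_bad_rows(lines):
    """Yield (line_number, first_column) for rows whose first column
    is not a well-formed GUID."""
    for number, line in enumerate(lines, start=1):
        first = line.rstrip("\n").split("\t", 1)[0]
        if not GUID_RE.fullmatch(first):
            yield number, first


if __name__ == "__main__":
    # Two rows taken from the quoted output: one intact, one corrupted.
    sample = [
        "/guid/9202a8c04000641f80000000087fcb24\t/type/content/length\t\t2932\n",
        "/guid/\\nTried to play the station on Z\t"
        "/user/zsi_editorial/editorial/comment/quality\t\t66\n",
    ]
    for number, first in find_bad_rows(sample):
        print(number, first)
```

Unlike the grep pattern quoted below, which only flags rows whose GUID does
not start with 9, this checks the whole first column, so it would also
catch a corrupted value that happens to begin with 9.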

> -----Original Message-----
> From: developers-bounces at freebase.com [mailto:developers-
> bounces at freebase.com] On Behalf Of Kurt Bollacker
> Sent: Saturday, July 19, 2008 12:40 PM
> To: For discussions about MQL,Freebase API and apps built on Freebase
> Subject: Re: [Developers] corruption in first column of
> freebase-datadump-quadruples.tsv.bz2 for 2008/07/01
> 
> 
> My, that does look like a serious bug.  We'll look into it ASAP.
> 
> 								Kurt :-)
> 
> 
> 
> 
> 
> On Sat, Jul 19, 2008 at 10:29:41AM -0700, Robin H. Johnson wrote:
> > There's some weird non-GUID data in the first column.
> > Easiest way to find it is:
> > # grep '^/guid/[^9]' freebase-datadump-quadruples.tsv.20080701 -C1
> >
> > These strings appear in the first column, and aren't referenced
> > anywhere else.
> >
> > The output from the above command follows:
> >
> > /guid/9202a8c04000641f80000000087fcb24	/type/content/length		2932
> > /guid/\nTried to play the station on Z	/user/zsi_editorial/editorial/comment/quality		66
> > /guid/\nTried to play the station on Z	/user/zsi_editorial/editorial/comment/reviewed		2008-06-29T12:17:35.0000Z
> > /guid/9202a8c04000641f8000000000000624	/type/object/type	/type/property
> > --
> > /guid/9202a8c04000641f80000000087fd16c	reverse_of:/location/location/geolocation	/guid/9202a8c04000641f80000000087fd082
> > /guid/\nPlaylist download failed!" 920	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T08:08:37.0006Z
> > /guid/\nPlaylist download failed!" 920	/user/zsi_editorial/editorial/comment/quality		59
> > /guid/9202a8c04000641f8000000000000413	/type/usergroup/member	/guid/9202a8c04000641f80000000042b2e87
> > --
> > /guid/9202a8c04000641f80000000087fd0a5	/common/document/in_reply_to	/guid/9202a8c04000641f80000000087903f1
> > /guid/\nfor her Fresh Brew by becoming	/type/object/name	/lang/en	Caffeinated Ponderings
> > /guid/9202a8c04000641f8000000000000258	/type/property/schema	/guid/9202a8c04000641f8000000000000023
> > --
> > /guid/9202a8c04000641f80000000087fd134	/business/employment_tenure/from		1978
> > /guid/check Blog for updates" 9202a8c0	/type/object/name	/lang/en	Sea Kayak Podcasts.com
> > /guid/9202a8c04000641f800000000000020b	/type/property/master_property	/guid/9202a8c04000641f800000000000000a
> > --
> > /guid/9202a8c04000641f80000000087fd08e	/metropolitan_transit/transit_stop/transit_lines	/guid/9202a8c04000641f80000000087fd018
> > /guid/\nat System.Web.UI.Page.ProcessR	/user/zsi_editorial/editorial/comment/quality		12
> > /guid/\nat System.Web.UI.Page.ProcessR	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T03:28:44.0000Z
> > /guid/9202a8c04000641f8000000000000333	/type/object/type	/type/usergroup
> > --
> > /guid/9202a8c04000641f80000000087fc8e5	/business/employment_tenure/person	/guid/9202a8c04000641f80000000087fc8e7
> > /guid/\nImage does not appear on devic	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T08:15:02.0004Z
> > /guid/\nImage does not appear on devic	/user/zsi_editorial/editorial/comment/quality		73
> > /guid/9202a8c04000641f8000000000000418	reverse_of:/community/discussion_thread/topic	/guid/9202a8c04000641f8000000007b1af05
> > --
> > /guid/9202a8c04000641f80000000087fd033	/type/object/type	/common/topic
> > /guid/\nThe freebase entry for Wallstr	/user/zsi_editorial/editorial/comment/reviewed		2008-07-01T06:38:23.0000Z
> > /guid/\nThe freebase entry for Wallstr	/user/zsi_editorial/editorial/comment/quality		77
> > /guid/9202a8c04000641f800000000000038f	/type/permission/controls	/guid/9202a8c04000641f8000000000000398
> > --
> > /guid/9202a8c04000641f80000000087fcfa4	/type/content/text_encoding	/guid/9202a8c04000641f800000000000388e
> > /guid/\nPlayback on the device stutter	/user/zsi_editorial/editorial/comment/quality		49
> > /guid/\nPlayback on the device stutter	/user/zsi_editorial/editorial/comment/reviewed		2008-06-27T06:30:45.0007Z
> > /guid/9202a8c04000641f800000000000049f	/type/permission/controls	/guid/9202a8c04000641f80000000000008f3
> > --
> > /guid/9202a8c04000641f80000000087fd070	/metropolitan_transit/transit_stop/transit_lines	/guid/9202a8c04000641f80000000087fd018
> > /guid/\nE'Mu" 9202a8c04000641f80000000	/type/object/key	/guid/9202a8c04000641f8000000001143432	ARTIST212836
> > /guid/\nE'Mu" 9202a8c04000641f80000000	/type/object/key	/guid/9202a8c04000641f800000000114342d	735ae537-9825-40b0-af80-3e342ddd5a55
> > /guid/9202a8c04000641f800000000000012b	/type/object/key	/boot	has_left_order
> > --
> > /guid/9202a8c04000641f80000000087fc680	reverse_of:/metropolitan_transit/transit_stop/service_hours	/guid/9202a8c04000641f8000000000cafd67
> > /guid/\n-Sales Agents" 9202a8c04000641	/type/object/name	/lang/en	Professional Development for Women and Minorities
> > /guid/9202a8c04000641f8000000000000343	/type/object/name	/lang/en	Domain owners
> >
> >
> > --
> > Robin Hugh Johnson
> > Gentoo Linux Developer & Infra Guy
> > E-Mail     : robbat2 at gentoo.org
> > GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85
> 
> 
> 
> > _______________________________________________
> > Developers mailing list
> > Developers at freebase.com
> > http://lists.freebase.com/mailman/listinfo/developers
> 

