[Developers] corruption in first column of freebase-datadump-quadruples.tsv.bz2 for 2008/07/01

Kurt Bollacker kurt at metaweb.com
Sat Jul 19 19:39:34 UTC 2008


My, that does look like a serious bug.  We'll look into it ASAP.

								Kurt :-)

On Sat, Jul 19, 2008 at 10:29:41AM -0700, Robin H. Johnson wrote:
> There's some weird non-GUID data in the first column.
> Easiest way to find it is:
> # grep '^/guid/[^9]' freebase-datadump-quadruples.tsv.20080701 -C1
> 
> These strings appear in the first column, and aren't referenced anywhere else.
> 
> The output from the above command follows:
> 
> /guid/9202a8c04000641f80000000087fcb24	/type/content/length		2932
> /guid/\nTried to play the station on Z	/user/zsi_editorial/editorial/comment/quality		66
> /guid/\nTried to play the station on Z	/user/zsi_editorial/editorial/comment/reviewed		2008-06-29T12:17:35.0000Z
> /guid/9202a8c04000641f8000000000000624	/type/object/type	/type/property	
> --
> /guid/9202a8c04000641f80000000087fd16c	reverse_of:/location/location/geolocation	/guid/9202a8c04000641f80000000087fd082	
> /guid/\nPlaylist download failed!" 920	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T08:08:37.0006Z
> /guid/\nPlaylist download failed!" 920	/user/zsi_editorial/editorial/comment/quality		59
> /guid/9202a8c04000641f8000000000000413	/type/usergroup/member	/guid/9202a8c04000641f80000000042b2e87	
> --
> /guid/9202a8c04000641f80000000087fd0a5	/common/document/in_reply_to	/guid/9202a8c04000641f80000000087903f1	
> /guid/\nfor her Fresh Brew by becoming	/type/object/name	/lang/en	Caffeinated Ponderings
> /guid/9202a8c04000641f8000000000000258	/type/property/schema	/guid/9202a8c04000641f8000000000000023	
> --
> /guid/9202a8c04000641f80000000087fd134	/business/employment_tenure/from		1978
> /guid/check Blog for updates" 9202a8c0	/type/object/name	/lang/en	Sea Kayak Podcasts.com
> /guid/9202a8c04000641f800000000000020b	/type/property/master_property	/guid/9202a8c04000641f800000000000000a	
> --
> /guid/9202a8c04000641f80000000087fd08e	/metropolitan_transit/transit_stop/transit_lines	/guid/9202a8c04000641f80000000087fd018	
> /guid/\nat System.Web.UI.Page.ProcessR	/user/zsi_editorial/editorial/comment/quality		12
> /guid/\nat System.Web.UI.Page.ProcessR	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T03:28:44.0000Z
> /guid/9202a8c04000641f8000000000000333	/type/object/type	/type/usergroup	
> --
> /guid/9202a8c04000641f80000000087fc8e5	/business/employment_tenure/person	/guid/9202a8c04000641f80000000087fc8e7	
> /guid/\nImage does not appear on devic	/user/zsi_editorial/editorial/comment/reviewed		2008-06-23T08:15:02.0004Z
> /guid/\nImage does not appear on devic	/user/zsi_editorial/editorial/comment/quality		73
> /guid/9202a8c04000641f8000000000000418	reverse_of:/community/discussion_thread/topic	/guid/9202a8c04000641f8000000007b1af05	
> --
> /guid/9202a8c04000641f80000000087fd033	/type/object/type	/common/topic	
> /guid/\nThe freebase entry for Wallstr	/user/zsi_editorial/editorial/comment/reviewed		2008-07-01T06:38:23.0000Z
> /guid/\nThe freebase entry for Wallstr	/user/zsi_editorial/editorial/comment/quality		77
> /guid/9202a8c04000641f800000000000038f	/type/permission/controls	/guid/9202a8c04000641f8000000000000398	
> --
> /guid/9202a8c04000641f80000000087fcfa4	/type/content/text_encoding	/guid/9202a8c04000641f800000000000388e	
> /guid/\nPlayback on the device stutter	/user/zsi_editorial/editorial/comment/quality		49
> /guid/\nPlayback on the device stutter	/user/zsi_editorial/editorial/comment/reviewed		2008-06-27T06:30:45.0007Z
> /guid/9202a8c04000641f800000000000049f	/type/permission/controls	/guid/9202a8c04000641f80000000000008f3	
> --
> /guid/9202a8c04000641f80000000087fd070	/metropolitan_transit/transit_stop/transit_lines	/guid/9202a8c04000641f80000000087fd018	
> /guid/\nE'Mu" 9202a8c04000641f80000000	/type/object/key	/guid/9202a8c04000641f8000000001143432	ARTIST212836
> /guid/\nE'Mu" 9202a8c04000641f80000000	/type/object/key	/guid/9202a8c04000641f800000000114342d	735ae537-9825-40b0-af80-3e342ddd5a55
> /guid/9202a8c04000641f800000000000012b	/type/object/key	/boot	has_left_order
> --
> /guid/9202a8c04000641f80000000087fc680	reverse_of:/metropolitan_transit/transit_stop/service_hours	/guid/9202a8c04000641f8000000000cafd67	
> /guid/\n-Sales Agents" 9202a8c04000641	/type/object/name	/lang/en	Professional Development for Women and Minorities
> /guid/9202a8c04000641f8000000000000343	/type/object/name	/lang/en	Domain owners
> 
> 
> -- 
> Robin Hugh Johnson
> Gentoo Linux Developer & Infra Guy
> E-Mail     : robbat2 at gentoo.org
> GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85
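
[Editorial note] The grep in the quoted report matches any first column whose character after "/guid/" is not "9". A slightly stricter check can be sketched in Python. This is only a sketch under stated assumptions: it treats a valid first column as "/guid/" followed by exactly 32 lowercase hex digits (the shape of the uncorrupted rows quoted above), and the names GUID_RE, find_bad_rows, and scan_file are illustrative, not part of any Freebase tooling.

```python
import re

# Assumption: a well-formed first column is "/guid/" plus 32 lowercase hex
# digits, matching the uncorrupted rows in the quoted grep output.
GUID_RE = re.compile(r"/guid/[0-9a-f]{32}")


def find_bad_rows(lines):
    """Yield (line_number, first_column) for rows whose first column is not a GUID."""
    for number, line in enumerate(lines, start=1):
        # Columns are tab-separated; only the first one is inspected.
        first_column = line.rstrip("\n").split("\t", 1)[0]
        if not GUID_RE.fullmatch(first_column):
            yield number, first_column


def scan_file(path):
    """Return all suspicious rows in an uncompressed quadruples dump."""
    # errors="replace" keeps the scan going if the corruption includes bad bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(find_bad_rows(handle))
```

Calling scan_file("freebase-datadump-quadruples.tsv.20080701") would list every row that fails the check, including any whose corrupted text happens to begin with "9", which the grep pattern would miss.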





