[Developers] corruption in first column of freebase-datadump-quadruples.tsv.bz2 for 2008/07/01
Alexander Marks
al at metaweb.com
Sun Jul 20 02:09:56 UTC 2008
Thanks for spotting this. I've grepped out those 17 corrupt records and updated the tsv at our downloads site. Kurt has identified the origin of the problem, so the next dump will have a complete fix. Anyone who already has this version doesn't need to re-download; just run:

grep '^/guid/9' freebase-datadump-quadruples.tsv > freebase-datadump-quadruples.tsv.fixed
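The grep in this message keeps any line whose first column starts with /guid/9. A stricter variant is sketched here in case it is useful. It rests on an assumption taken only from the intact sample lines in the quoted message: a valid first column is /guid/ followed by exactly 32 lowercase hex digits and then a tab. The helper name keep_valid_guids is hypothetical and not part of any Freebase tooling.

```shell
#!/bin/sh
# Sketch only. Assumption: a well-formed record starts with "/guid/" plus
# exactly 32 lowercase hex digits, immediately followed by a tab.
# keep_valid_guids is a hypothetical helper name.

# A literal tab character, built portably for use inside the grep pattern.
TAB=$(printf '\t')

keep_valid_guids() {
    # Reads TSV records on stdin and writes only the well-formed ones to stdout.
    grep -E "^/guid/[0-9a-f]{32}${TAB}"
}

# Example: filter the compressed dump without unpacking it to disk first.
#   bzcat freebase-datadump-quadruples.tsv.bz2 | keep_valid_guids > freebase-datadump-quadruples.tsv.fixed
```

Unlike the prefix test, this also rejects a line that happens to start with /guid/9 but is truncated or contains spaces inside the first column. If the dump ever uses another GUID form, the pattern would need loosening.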
Al
----- Original Message -----
From: "Robin H. Johnson" <robbat2 at gentoo.org>
To: developers at freebase.com
Sent: Saturday, July 19, 2008 10:29:41 AM (GMT-0800) America/Los_Angeles
Subject: [Developers] corruption in first column of freebase-datadump-quadruples.tsv.bz2 for 2008/07/01
There's some weird non-GUID data in the first column.
Easiest way to find it is:
# grep '^/guid/[^9]' freebase-datadump-quadruples.tsv.20080701 -C1
These strings appear in the first column, and aren't referenced anywhere else.
The output from the above command follows:
/guid/9202a8c04000641f80000000087fcb24 /type/content/length 2932
/guid/\nTried to play the station on Z /user/zsi_editorial/editorial/comment/quality 66
/guid/\nTried to play the station on Z /user/zsi_editorial/editorial/comment/reviewed 2008-06-29T12:17:35.0000Z
/guid/9202a8c04000641f8000000000000624 /type/object/type /type/property
--
/guid/9202a8c04000641f80000000087fd16c reverse_of:/location/location/geolocation /guid/9202a8c04000641f80000000087fd082
/guid/\nPlaylist download failed!" 920 /user/zsi_editorial/editorial/comment/reviewed 2008-06-23T08:08:37.0006Z
/guid/\nPlaylist download failed!" 920 /user/zsi_editorial/editorial/comment/quality 59
/guid/9202a8c04000641f8000000000000413 /type/usergroup/member /guid/9202a8c04000641f80000000042b2e87
--
/guid/9202a8c04000641f80000000087fd0a5 /common/document/in_reply_to /guid/9202a8c04000641f80000000087903f1
/guid/\nfor her Fresh Brew by becoming /type/object/name /lang/en Caffeinated Ponderings
/guid/9202a8c04000641f8000000000000258 /type/property/schema /guid/9202a8c04000641f8000000000000023
--
/guid/9202a8c04000641f80000000087fd134 /business/employment_tenure/from 1978
/guid/check Blog for updates" 9202a8c0 /type/object/name /lang/en Sea Kayak Podcasts.com
/guid/9202a8c04000641f800000000000020b /type/property/master_property /guid/9202a8c04000641f800000000000000a
--
/guid/9202a8c04000641f80000000087fd08e /metropolitan_transit/transit_stop/transit_lines /guid/9202a8c04000641f80000000087fd018
/guid/\nat System.Web.UI.Page.ProcessR /user/zsi_editorial/editorial/comment/quality 12
/guid/\nat System.Web.UI.Page.ProcessR /user/zsi_editorial/editorial/comment/reviewed 2008-06-23T03:28:44.0000Z
/guid/9202a8c04000641f8000000000000333 /type/object/type /type/usergroup
--
/guid/9202a8c04000641f80000000087fc8e5 /business/employment_tenure/person /guid/9202a8c04000641f80000000087fc8e7
/guid/\nImage does not appear on devic /user/zsi_editorial/editorial/comment/reviewed 2008-06-23T08:15:02.0004Z
/guid/\nImage does not appear on devic /user/zsi_editorial/editorial/comment/quality 73
/guid/9202a8c04000641f8000000000000418 reverse_of:/community/discussion_thread/topic /guid/9202a8c04000641f8000000007b1af05
--
/guid/9202a8c04000641f80000000087fd033 /type/object/type /common/topic
/guid/\nThe freebase entry for Wallstr /user/zsi_editorial/editorial/comment/reviewed 2008-07-01T06:38:23.0000Z
/guid/\nThe freebase entry for Wallstr /user/zsi_editorial/editorial/comment/quality 77
/guid/9202a8c04000641f800000000000038f /type/permission/controls /guid/9202a8c04000641f8000000000000398
--
/guid/9202a8c04000641f80000000087fcfa4 /type/content/text_encoding /guid/9202a8c04000641f800000000000388e
/guid/\nPlayback on the device stutter /user/zsi_editorial/editorial/comment/quality 49
/guid/\nPlayback on the device stutter /user/zsi_editorial/editorial/comment/reviewed 2008-06-27T06:30:45.0007Z
/guid/9202a8c04000641f800000000000049f /type/permission/controls /guid/9202a8c04000641f80000000000008f3
--
/guid/9202a8c04000641f80000000087fd070 /metropolitan_transit/transit_stop/transit_lines /guid/9202a8c04000641f80000000087fd018
/guid/\nE'Mu" 9202a8c04000641f80000000 /type/object/key /guid/9202a8c04000641f8000000001143432 ARTIST212836
/guid/\nE'Mu" 9202a8c04000641f80000000 /type/object/key /guid/9202a8c04000641f800000000114342d 735ae537-9825-40b0-af80-3e342ddd5a55
/guid/9202a8c04000641f800000000000012b /type/object/key /boot has_left_order
--
/guid/9202a8c04000641f80000000087fc680 reverse_of:/metropolitan_transit/transit_stop/service_hours /guid/9202a8c04000641f8000000000cafd67
/guid/\n-Sales Agents" 9202a8c04000641 /type/object/name /lang/en Professional Development for Women and Minorities
/guid/9202a8c04000641f8000000000000343 /type/object/name /lang/en Domain owners
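To see only the offending rows, without the -C1 context lines, an inverted match with line numbers is one option. This is a minimal sketch, assuming that every intact record starts with /guid/ plus 32 lowercase hex digits and a tab, as the good lines in the output suggest. The function name list_bad_guids is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: an intact first column is "/guid/" plus exactly
# 32 lowercase hex digits, followed by a tab. list_bad_guids is a
# hypothetical helper name.

# A literal tab character for the pattern.
TAB=$(printf '\t')

list_bad_guids() {
    # -v inverts the match so only malformed records are printed;
    # -n prefixes each one with its line number in the input.
    grep -n -v -E "^/guid/[0-9a-f]{32}${TAB}"
}

# Example:
#   list_bad_guids < freebase-datadump-quadruples.tsv.20080701
```

The line numbers make it easy to jump to each bad record in the file, for example with sed -n 'Np'. Note that this also reports any line that does not start with /guid/ at all, which the original grep would not show.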
--
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail : robbat2 at gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85