[Developers] Announcing freebase-utils - data reconciliation and loading utilities

Tom Morris tfmorris at gmail.com
Thu Jul 30 17:06:10 UTC 2009


I've created a Google Code project to hold some of the utilities that
I put together for various projects.

   http://code.google.com/p/freebase-utils/

The initial code includes:

1. Loader for the U.S. National Register of Historic Places database
from the National Park Service.  It demonstrates the use of dbfpy to
read xBase files, parsing KML files, some name munging and
reconciliation techniques (including querying Wikipedia), and, of
course, a wide variety of MQL read and write queries.

2. Loader for Congressional Biography IDs.  It focuses more on the
various personal name transformations needed to find the correct
person in Freebase.
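
As a rough illustration of the kind of personal name transformation
mentioned in item 2, here is a minimal Python sketch.  It is not taken
from freebase-utils: the function name name_variants, the assumed
"LAST, First Middle, Suffix" input format, and the suffix list are all
hypothetical.  It only generates candidate spellings to try against a
lookup; it does not query Freebase.

```python
# Hypothetical helper for illustration only; not part of freebase-utils.
# Assumes names arrive as "LAST, First Middle, Suffix".

SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}  # assumed list, not exhaustive


def name_variants(raw):
    """Return candidate spellings of a name, most specific first."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if not parts:
        return []

    # Title-case an all-caps surname; note this mangles names such as
    # "McDONALD", which a real loader would need to special-case.
    last = parts[0].title() if parts[0].isupper() else parts[0]
    given = parts[1].split() if len(parts) > 1 else []
    suffix = ""
    if len(parts) > 2 and parts[2].rstrip(".").lower() in SUFFIXES:
        suffix = parts[2]

    variants = []

    def add(tokens):
        # Join non-empty tokens and skip duplicates, preserving order.
        name = " ".join(t for t in tokens if t)
        if name and name not in variants:
            variants.append(name)

    if given:
        add(given + [last, suffix])       # full name with suffix
        add(given + [last])               # full name without suffix
        add([given[0], last])             # first and last only
        if len(given) > 1:
            # first name, middle initial, last name
            add([given[0], given[1][0] + ".", last])
    else:
        add([last])
    return variants
```

Each variant would then be tried in turn against the name lookup, and
the first one that matches exactly one person is taken as the match.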

These were written as one-time throwaway programs (yes, I'm old enough
to know better), so they're a bit of a mess, but I've started the work
to refactor some of the useful bits out so they can be reused.  If
nothing else, they provide complete working examples that were
actually used in production.

In addition to the refactoring work, I'll be adding information to the
wiki describing the process of doing these types of loads.

All the code has been released under the Eclipse Public License, so it
can be used for commercial work as long as you contribute your
improvements to the common code back.

I plan to add to this collection over time and am happy to have others
contribute, so if you have code that you think would be a useful
addition, let me know.

Tom