[Data-modeling] associating OMB codes with US government agencies -- best way to model?

Raymond Yee raymond.yee at gmail.com
Wed Jul 15 14:57:50 UTC 2009


Hi everyone,

I've been working on modeling the US federal government.  I
have found that in the US Federal Budget (and in other contexts
such as the reporting of Recovery money), government agencies and
their sub-agencies ("bureaus") are associated with an "agency code" and
a "bureau code".  For example, the Department of Agriculture has an
agency code of 005 and one of its bureaus, the National Agricultural
Statistics Service, is denoted by an agency code of 005 and a bureau
code of 15.  (See
http://www.gpoaccess.gov/usbudget/fy10/pdf/db_guide.pdf for more
details.)
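
To make the shape of these codes concrete, here is a minimal Python sketch of
an agency code / bureau code pair.  The class name, the key() helper, and the
"005-15" key format are my own illustrative assumptions, not anything defined
by OMB or Freebase.  The codes are kept as strings so that leading zeros
survive.

```python
from typing import NamedTuple, Optional


class OmbCode(NamedTuple):
    """An OMB identifier: an agency code plus an optional bureau code."""
    agency_code: str                   # string, so "005" keeps its leading zeros
    bureau_code: Optional[str] = None  # None when the code names the agency itself

    def key(self) -> str:
        """Return one lookup key, e.g. "005-15" (hypothetical format)."""
        if self.bureau_code is None:
            return self.agency_code
        return f"{self.agency_code}-{self.bureau_code}"


usda = OmbCode("005")        # Department of Agriculture
nass = OmbCode("005", "15")  # National Agricultural Statistics Service
print(usda.key())
print(nass.key())
```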

I've done some reconciliation work to map these "OMB codes" (named after
the Office of Management and Budget) to the Freebase IDs of their
respective government agencies.  I have an OPML file
(http://labs.dataunbound.com/doc/2009/06/OMB_A_11_C_reconciled.v0.1.xml)
that shows my current mapping, and a Yahoo! UI treeview rendition of
the data at
http://labs.dataunbound.com/doc/2009/06/govt.treeview.v0.1.html

Now I'd like to upload those codes to Freebase, but would like to get 
some advice about how to model this data.  I've set out two schemas in 
the sandbox:

1) The first is found at 
http://www.sandbox-freebase.com/type/schema/base/stimulustracking/agency_with_omb_code 
-- I created an "Agency with OMB Code" type, which has an included type
of /government/government_agency and three additional
properties:  agency_code, bureau_code, and label.  You'll see
from
http://www.sandbox-freebase.com/view/base/stimulustracking/agency_with_omb_code?domain=%2Fbase%2Fstimulustracking
that I would then type USDA as an "Agency with OMB Code".   I
loosely call this the "is-a" approach -- USDA "is-a" "Agency with OMB Code".

2) The second is found at 
http://www.sandbox-freebase.com/type/schema/base/stimulustracking/omb_code 
-- I created an "OMB Code" type, which has the properties:  agency_code,
bureau_code, label, and corresponding agency.  The corresponding agency
property is a link with an expected value type of
/government/government_agency.  Currently, an OMB code object would be a
/common/topic (though this might be a bad idea....).  An example:
http://www.sandbox-freebase.com/view/guid/9202a8c04000641f800000000cad7bd9/-/base/stimulustracking
, which links to USDA through the corresponding agency property.  I
think of this as the "has-a" approach:  USDA "has-a" OMB code.

One thing to note:   not every US federal government agency/bureau has
an OMB code and not every code corresponds to a particular agency -- but
I'd still like to record all the OMB codes I can find.  For example, there is
an OMB code for the "energy programs" of the Department of Energy.  My
understanding is that "energy programs" doesn't correspond to an
administrative entity but to a collection of programs in DOE.
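
A rough Python sketch of the two schemas may make the difference easier to
see.  This is only an analogy using dataclasses, not Freebase schema or MQL;
the class and function names are hypothetical, and the code values given for
"Energy Programs" are placeholders rather than the real OMB codes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GovernmentAgency:
    """Stand-in for a /government/government_agency topic."""
    name: str


# Approach 1 ("is-a"): the agency topic itself carries the code properties,
# so a code can only be recorded where an agency topic exists.
@dataclass
class AgencyWithOmbCode(GovernmentAgency):
    agency_code: str = ""
    bureau_code: Optional[str] = None
    label: str = ""


# Approach 2 ("has-a"): the code is its own object and optionally links
# to an agency, so a code without an administrative entity still fits.
@dataclass
class OmbCode:
    agency_code: str
    bureau_code: Optional[str]
    label: str
    corresponding_agency: Optional[GovernmentAgency] = None


def codes_without_agency(codes: List[OmbCode]) -> List[OmbCode]:
    """Return the codes that are not linked to any agency."""
    return [c for c in codes if c.corresponding_agency is None]


usda = GovernmentAgency("Department of Agriculture")

# "is-a": USDA is typed as an agency with an OMB code.
usda_is_a = AgencyWithOmbCode(name=usda.name, agency_code="005",
                              label="Department of Agriculture")

# "has-a": separate code objects, one linked and one unlinked.
codes = [
    OmbCode("005", None, "Department of Agriculture", usda),
    # Placeholder code values; only the missing agency link matters here.
    OmbCode("000", "00", "Energy Programs", None),
]

for code in codes_without_agency(codes):
    print(code.label)
```

In this sketch the "has-a" shape can hold the "energy programs" code with no
agency attached, whereas the "is-a" shape would need an agency topic to hang
the code on.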

My question:  is #1 or #2 better?  Neither?  Can you suggest a better
model? I realize that I face this type of modeling issue a lot -- for
example, BWV numbers are associated with the compositions of J.S. Bach
-- and I started to model this in
http://www.freebase.com/type/schema/base/jsbach/bach_composition by
extending /music/composition to /base/jsbach/bach_composition and adding a
BWV property.  But is this the best way to do this modeling?

Thanks!
-Raymond



