[Data-modeling] "Duck Types" in Freebase

Paul Houle paul at ontology2.com
Fri Aug 7 18:14:47 UTC 2009


     One pattern I've seen in how types are used in Freebase, and in how the
system pushes me to design schemas, is similar to the "Duck Typing" used
in many languages.  Rather than working out which existing type is
correct for the value of a property, it often makes sense to define a
new type.

    I was designing a set of types for describing nuclear reactors and 
this came up.  Many nuclear reactors contain a "Moderator",  which is a 
substance that slows neutrons down,  increasing the interaction 
cross-sections,  and reducing the amount of fissile material required to 
make a critical mass.

    My first instinct in modeling is to say,  "A NuclearModerator isA 
Substance" so that the system allows you to say that

http://www.freebase.com/view/en/graphite

    is used as a moderator in a reactor,  but you can't say that

http://www.freebase.com/view/en/frank_zappa

    is used as a moderator. Now,  Freebase doesn't have a "Substance" 
type;  instead it has "Chemical Element",  "Chemical Compound", 
"Ingredient", "Nutrient" and other nonoverlapping types.  In 
particular,  some moderators are compounds,

http://www.freebase.com/view/en/heavy_water

    and others are elements or allotropes of elements,  so there's no 
existing "master type" that contains all of the substances which could 
be moderators.

    The right thing to do in this case is to create a new type with a
name like "NuclearReactorModerator", and constrain the "Moderator"
property to take values of that type.
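
    A minimal sketch of the pattern in plain Python, not Freebase's actual API. The `Topic` class, the `set_moderator` helper, the `example_reactor` topic and the type names written as strings (including "MusicalArtist") are all made up for illustration; only "NuclearReactorModerator" comes from the proposal above.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a Freebase topic; not the real API.
@dataclass
class Topic:
    name: str
    types: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)

MODERATOR_TYPE = "NuclearReactorModerator"  # the new "duck type"

def set_moderator(reactor: Topic, substance: Topic) -> None:
    """Set the Moderator property, accepting only topics that carry the duck type."""
    if MODERATOR_TYPE not in substance.types:
        raise TypeError(f"{substance.name} is not a {MODERATOR_TYPE}")
    reactor.properties["moderator"] = substance

# An element and a compound share no "master type" here,
# but both can simply be given the new type.
graphite = Topic("graphite", {"ChemicalElement", MODERATOR_TYPE})
heavy_water = Topic("heavy_water", {"ChemicalCompound", MODERATOR_TYPE})
frank_zappa = Topic("frank_zappa", {"MusicalArtist"})

reactor = Topic("example_reactor", {"NuclearReactor"})
set_moderator(reactor, heavy_water)  # accepted: heavy water carries the type

try:
    set_moderator(reactor, frank_zappa)  # rejected: no moderator type
except TypeError as err:
    rejected = str(err)
```

    The check looks only at whether the topic carries the new type, not at what kind of substance it is, which is the sense in which this resembles duck typing.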

    Practically, this works quite well.  After adding a few reactor
instances, you'd have about 5-10 items under NuclearReactorModerator.
If we find a new kind of reactor that uses a new moderator, it's not
hard to add the type to that substance.  In the meantime, the new type
lets FB provide useful autocompletion for the property, a list of
possible moderator materials, and an annotation on each moderator
material recording that it has that use.

    The one area where I feel a little uncomfortable is that I'd like an 
official and consistent answer on how I should say

"Reactor X has no moderator"

    which is the case in

http://www.freebase.com/view/en/liquid_metal_cooled_reactor

    I'd like something a little more definitive than leaving the field 
blank,  since FB's "Open World Assumption" means that lots of fields are 
going to be left blank because nobody bothered to fill them in.  Many 
people link to

http://www.freebase.com/view/en/independent

but there's also

http://www.freebase.com/view/guid/9202a8c04000641f800000000bc15fef

and

http://www.freebase.com/view/guid/9202a8c04000641f800000000bd0b576

and

http://www.freebase.com/view/guid/9202a8c04000641f800000000bb2bb90

and

http://www.freebase.com/view/guid/9202a8c04000641f800000000bd5dd72

Something ought to be done about this...
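
    To make the distinction concrete, here is a small Python sketch of one possible convention, not an official Freebase answer. It assumes a single shared sentinel topic (the hypothetical `NO_MODERATOR`) that itself carries the moderator type, so that "known to have none" can be told apart from "nobody filled it in". The record layout, the ids `some_reactor` and `another_reactor`, and the function name are assumptions for illustration.

```python
# Hypothetical sentinel topic meaning "known to have no moderator".
# It carries the moderator type so it is a legal value for the property.
NO_MODERATOR = {"id": "no_moderator", "types": {"NuclearReactorModerator"}}

def moderator_status(reactor: dict) -> str:
    """Classify a reactor record as 'unknown', 'none', or 'has moderator'."""
    value = reactor.get("moderator")  # missing key: nobody filled it in
    if value is None:
        return "unknown"
    if value is NO_MODERATOR:
        return "none"
    return "has moderator"

heavy_water = {
    "id": "heavy_water",
    "types": {"ChemicalCompound", "NuclearReactorModerator"},
}

blank = {"id": "some_reactor"}  # open world: field left blank
fast = {"id": "liquid_metal_cooled_reactor", "moderator": NO_MODERATOR}
moderated = {"id": "another_reactor", "moderator": heavy_water}
```

    With one agreed sentinel, a query for reactors without a moderator matches a single well-known topic instead of several competing "none" topics.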










