[Data-modeling] "Duck Types" in Freebase
Paul Houle
paul at ontology2.com
Fri Aug 7 18:14:47 UTC 2009
Something I've seen in how types are used in Freebase, and in how the
system pushes me to design schemas, is a pattern similar to the "Duck
Typing" used in many languages. Rather than working out which
existing type is correct for the value of a property, it often makes
sense to define a new type.
I was designing a set of types for describing nuclear reactors and
this came up. Many nuclear reactors contain a "Moderator", which is a
substance that slows neutrons down, increasing the interaction
cross-sections, and reducing the amount of fissile material required to
make a critical mass.
My first instinct in modeling is to say, "A NuclearModerator isA
Substance" so that the system allows you to say that
http://www.freebase.com/view/en/graphite
is used as a moderator in a reactor, but you can't say that
http://www.freebase.com/view/en/frank_zappa
is used as a moderator. Now, Freebase doesn't have a "Substance"
type; instead it has "Chemical Element", "Chemical Compound",
"Ingredient", "Nutrient" and other nonoverlapping types. In
particular, some moderators are compounds,
http://www.freebase.com/view/en/heavy_water
and others are elements or allotropes of elements, so there's no
existing "master type" that contains all of the substances which could
be moderators.
The right thing to do in this case is to create a new type with a
name like "NuclearReactorModerator", and require values of the
"Moderator" property to be of that type.
Practically, this works quite well. After adding a few reactor
instances, you'd have about 5-10 items under
NuclearReactorModerator. If we find a new kind of reactor that uses a
new moderator, it's not hard to add the type to that substance. In the
meantime, the new type lets FB autocomplete values for the property,
gives us a list of possible moderator materials, and annotates each
moderator material with the fact that it has that use.
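To make the pattern concrete, here is a minimal Python sketch of the
idea, not of how Freebase itself is implemented. The Topic class, the
set_moderator and autocomplete helpers, and every type assignment
other than NuclearReactorModerator are assumptions made up for
illustration.

```python
from dataclasses import dataclass, field

# Type name taken from the post; everything else here is illustrative.
MODERATOR_TYPE = "NuclearReactorModerator"


@dataclass
class Topic:
    """A topic with a set of type names, loosely like a Freebase topic."""
    id: str
    name: str
    types: set = field(default_factory=set)


def set_moderator(reactor: dict, substance: Topic) -> None:
    """Accept the value only if it carries the narrow "duck" type."""
    if MODERATOR_TYPE not in substance.types:
        raise ValueError(f"{substance.name} is not a {MODERATOR_TYPE}")
    reactor["moderator"] = substance.id


def autocomplete(topics, expected_type: str, prefix: str) -> list:
    """Suggest names of topics that have the expected type and match the prefix."""
    prefix = prefix.lower()
    return sorted(
        t.name for t in topics
        if expected_type in t.types and t.name.lower().startswith(prefix)
    )


# The type assignments below are assumptions, not actual Freebase data.
graphite = Topic("/en/graphite", "Graphite", {MODERATOR_TYPE})
heavy_water = Topic("/en/heavy_water", "Heavy water",
                    {"Chemical Compound", MODERATOR_TYPE})
frank_zappa = Topic("/en/frank_zappa", "Frank Zappa", {"Musical Artist"})
topics = [graphite, heavy_water, frank_zappa]

reactor = {}
set_moderator(reactor, graphite)  # accepted: graphite has the moderator type
print(autocomplete(topics, MODERATOR_TYPE, ""))  # every known moderator
```

The point of the sketch is that membership in the narrow type is
what matters, whether the substance is also an element, a compound,
or something else entirely.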
The one area where I feel a little uncomfortable is that I'd like an
official and consistent answer on how I should say
"Reactor X has no moderator"
which is the case in
http://www.freebase.com/view/en/liquid_metal_cooled_reactor
I'd like something a little more definitive than leaving the field
blank, since FB's "Open World Assumption" means that lots of fields are
going to be left blank because nobody bothered to fill them in. Many
people link to
http://www.freebase.com/view/en/independent
but there's also
http://www.freebase.com/view/guid/9202a8c04000641f800000000bc15fef
and
http://www.freebase.com/view/guid/9202a8c04000641f800000000bd0b576
and
http://www.freebase.com/view/guid/9202a8c04000641f800000000bb2bb90
and
http://www.freebase.com/view/guid/9202a8c04000641f800000000bd5dd72
Something ought to be done about this...
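For what it's worth, the distinction I'm after can be sketched in a
few lines of Python. This is only an illustration of the three states
(not filled in, explicitly none, has a value); NO_MODERATOR is a
hypothetical sentinel, not an existing Freebase topic or feature, and
the reactor dicts are made-up examples.

```python
# Hypothetical sentinel meaning "we know there is no moderator".
NO_MODERATOR = object()


def describe_moderator(reactor: dict) -> str:
    """Tell apart a blank field, an explicit 'none', and a real value."""
    if "moderator" not in reactor:
        # Open world: a blank field says nothing either way.
        return "unknown"
    value = reactor["moderator"]
    if value is NO_MODERATOR:
        return "no moderator"
    return f"moderated by {value}"


unfilled = {}                              # nobody filled the field in
unmoderated = {"moderator": NO_MODERATOR}  # explicit statement of absence
moderated = {"moderator": "/en/graphite"}

for r in (unfilled, unmoderated, moderated):
    print(describe_moderator(r))
```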