[Data-modeling] Denormalised data

Scott Meyer sm at metaweb.com
Wed Apr 8 17:10:03 UTC 2009


Philip Kendall wrote:
> Both /cvg/computer_videogame and /cvg/game_version deliberately contain
> developer, publisher and release date fields.
> 
> There's a certain lack of documentation around these fields, but the
> convention seems to be that the developer/publisher/date of the original
> version of the game go in the /cvg/computer_videogame properties, and the
> developer/publisher/date of any conversions/ports go in the
> /cvg/game_version properties.
> 
> However, this leads to a question as to what is "best practice" when
> filling in the /cvg/game_version for the original version of the game:
> should they be filled in with the same values as were put in
> /cvg/computer_videogame (thus meaning that the two could get out of
> sync) or should they be left empty (thus meaning that any apps have to
> know about this structure)?
> 
> Any views?

Yeah, fix the data model.

This need for documentation, convention, gardening bots to enforce
convention, etc., is exactly why denormalization (duplication of data)
is a problem.  The problem for application developers is "How do I
ask about all versions of a game?"  Can I just grab all the versions
or do I need to make a special case for "the original version"?  Currently,
the answer is the worst possible one: get all versions and then merge
the "original version" information from the video game topic (carefully!)
into the list of versions as it may or may not be there.
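
To make that merge concrete, here is a rough Python sketch of what an
application has to do today. The dictionary shapes, the field names
(developer, publisher, release_date, versions) and the helper names are
assumptions made up for illustration, not the real schema or API, and the
sample values in any usage are placeholders.

```python
# Illustrative only: plain dicts stand in for the video game topic and its
# version CVTs. Field names are assumed, not taken from the real schema.
FIELDS = ("developer", "publisher", "release_date")


def key(record):
    """Return the (developer, publisher, release_date) tuple for a record."""
    return tuple(record.get(f) for f in FIELDS)


def all_versions_current(game):
    """Build the full version list under the current convention.

    The original version's data lives on the game topic itself, and may or
    may not also be duplicated as a version, so it has to be merged in
    carefully to avoid listing it twice.
    """
    versions = [dict(v) for v in game.get("versions", [])]
    original_key = key(game)
    # Nothing recorded at the topic level: nothing to merge.
    if all(value is None for value in original_key):
        return versions
    # Only add the topic-level data if no version already carries it.
    if not any(key(v) == original_key for v in versions):
        versions.insert(0, dict(zip(FIELDS, original_key)))
    return versions
```

Note that the duplicate check only works while the two copies agree. Once
they drift out of sync, the same release shows up twice, which is the
problem Philip describes.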

Since /cvg/game_version is a CVT, it seems like the reasonable thing
to do to represent an "original version" is to create a new property,
/cvg/computer_videogame/original_version, which also refers to something
of type /cvg/game_version. Typically this would refer to a CVT which is
also referred to by the /cvg/computer_videogame/versions property,
so the cost is one extra primitive multiplied by 17,000 video games.

If you want the "original version", just ask for it: no sorting of versions or
special cases. If you want all versions (including the original version), that is
just a property too.
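
As a minimal sketch of the proposed shape, assuming Python dataclasses as a
stand-in for the topic and its CVTs: the class names, field names and the
add_version helper below are hypothetical and only illustrate the idea that
original_version points at the same version object that versions already
contains, rather than holding a second copy of the data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GameVersion:
    """Stand-in for a /cvg/game_version CVT (field names are assumed)."""
    developer: Optional[str] = None
    publisher: Optional[str] = None
    release_date: Optional[str] = None


@dataclass
class VideoGame:
    """Stand-in for a /cvg/computer_videogame topic."""
    name: str
    versions: List[GameVersion] = field(default_factory=list)
    # One extra link per game; it refers to an entry already in `versions`.
    original_version: Optional[GameVersion] = None


def add_version(game, version, original=False):
    """Attach a version to a game, optionally marking it as the original."""
    game.versions.append(version)
    if original:
        # Same object, not a copy, so the two can never get out of sync.
        game.original_version = version
    return version
```

With this shape, game.original_version answers "what was the original?"
and game.versions answers "what are all the versions?", with no merging
step in the application.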

-Scott

