[Data-modeling] refactoring sports schema: sports league phylogeny pattern
Dan Milbrath
dmilbrath at metaweb.com
Thu Jul 24 22:11:03 UTC 2008
So, I'm taking a fresh look at the types in the sports domain and the
related team-sport domains (basketball, baseball, hockey, ...).
First up: phylogeny patterns for sports leagues/conferences/divisions
Previously I'd modeled each of these separately (basketball division,
baseball league, etc.).
The motivation was that the levels of the hierarchy have different names
in different sports, for instance:
Basketball
* League - NBA
* Conference - Eastern Conference, Western Conference
* Division - Atlantic Division, ...
* Team - Boston Celtics, ....
Baseball
* League - Major League Baseball
* League - American League, National League
* Division - American League West, ...
* Team - Detroit Tigers, ...
tylerkelley on the discussion boards asked that we use a
contains/contained by structure instead, as we do with location.
For that thread, see:
http://www.freebase.com/discuss/threads/guid/9202a8c04000641f8000000008745a60
It seemed worth a try, so I went ahead and renamed 'sports league' to
'sports association' -- this seemed better than 'organization' or
'league' as a general name.
I also added a contains/contained by property to it and included
'sports association' on the existing types (basketball division,
baseball league, ...).
It seems to work pretty well. See:
http://www.freebase.com/view/en/major_league_baseball
http://www.freebase.com/view/en/detroit_tigers
http://www.freebase.com/view/guid/9202a8c04000641f80000000004bc40b
http://www.freebase.com/view/en/boston_celtics
You can see that in the first case (baseball), I've hidden the old,
redundant properties that previously appeared on 'baseball league'. For
comparison, I've left them visible on 'basketball division'.
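To make the contains/contained by idea concrete, here is a minimal
sketch in Python. It is only an illustration of the pattern, not the
Freebase schema or an MQL query: the class name SportsAssociation, the
'label' field and the add_child/ancestors helpers are all hypothetical
names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SportsAssociation:
    """One node in the hierarchy: a league, conference, or division."""
    name: str
    label: str  # sport-specific name for this level (hypothetical field)
    contained_by: Optional["SportsAssociation"] = None
    contains: List["SportsAssociation"] = field(default_factory=list)

    def add_child(self, child: "SportsAssociation") -> None:
        # Keep both directions of the link in sync.
        child.contained_by = self
        self.contains.append(child)

    def ancestors(self) -> List[str]:
        # Walk the contained_by links up to the top-level association.
        names = []
        node = self.contained_by
        while node is not None:
            names.append(node.name)
            node = node.contained_by
        return names


# Baseball uses "League" at two levels; one generic type handles that.
mlb = SportsAssociation("Major League Baseball", "League")
al = SportsAssociation("American League", "League")
al_west = SportsAssociation("American League West", "Division")
mlb.add_child(al)
al.add_child(al_west)

print(al_west.ancestors())
```

The point of the sketch is that a single generic type with one pair of
properties covers every level, and the sport-specific name becomes data
rather than a separate type.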
I'd also point out that I left 'teams' as its own property because
'contains' might lead users to say 'Atlantic Division' contains 'Boston
Celtics'... and then 'Boston Celtics' contains 'Kevin Garnett' -- and
before you know it, people are being typed as associations. That said,
this nuance might be lost on people entering information, so it's a
little risky.
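A rough sketch of why a separate 'teams' property helps, again in
Python and again with made-up names (Team, Association, add_association,
add_team and roster are assumptions for illustration, not anything in
the actual schema). If 'contains' only accepts associations and teams
go in their own property, a team can never slip into the containment
chain, and neither can its players.

```python
class Team:
    """A team; its players go on the roster, not under 'contains'."""

    def __init__(self, name):
        self.name = name
        self.roster = []


class Association:
    """A league, conference, or division."""

    def __init__(self, name):
        self.name = name
        self.contains = []  # sub-associations only
        self.teams = []     # teams are kept in a separate property

    def add_association(self, child):
        # Reject anything that is not itself an association.
        if not isinstance(child, Association):
            raise TypeError("contains only accepts associations")
        self.contains.append(child)

    def add_team(self, team):
        if not isinstance(team, Team):
            raise TypeError("teams only accepts teams")
        self.teams.append(team)


atlantic = Association("Atlantic Division")
celtics = Team("Boston Celtics")
atlantic.add_team(celtics)
celtics.roster.append("Kevin Garnett")

# Trying to put a team under 'contains' is refused.
try:
    atlantic.add_association(celtics)
    rejected = False
except TypeError:
    rejected = True

print(rejected)
```

This only guards against the mistake when something enforces the
restriction; with free-form data entry the risk described above
remains.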
Open questions
* I've had some feedback that contains/contained by are too generic and
may confuse people. Other suggestions for property names that will work
well across different sports?
* Any objections to proceeding with this throughout the other sports? It
will involve including 'sports association' on all the types I'm
replacing and re-entering the data in the new properties.
* Once done, any objections to hiding or removing the (now) redundant
properties on types like basketball division, hockey conference, ...?
* Is there any value in retaining these types if/when these properties
are removed?