[Data-modeling] refactoring sports schema: sports league phylogeny pattern

Dan Milbrath dmilbrath at metaweb.com
Thu Jul 24 22:11:03 UTC 2008


I'm taking a fresh look at the types in the sports domain and the 
related team-sport domains (basketball, baseball, hockey, ...).
First up: phylogeny patterns for sports leagues, conferences, and divisions.

Previously I'd modeled each of these separately (basketball division, 
baseball league, etc.).
The motivation was that each sport uses different names for the levels 
of its hierarchy, for instance:

Basketball
* League - NBA
* Conference - Eastern Conference, Western Conference
* Division - Atlantic Division, ...
* Team - Boston Celtics, ....

Baseball
* League - Major League Baseball
* League - American League, National League
* Division - American League West, ...
* Team - Detroit Tigers, ...

tylerkelley on the discussion boards asked that we use a 
contains/contained by structure instead, as we do with location.
For that thread, see: 
http://www.freebase.com/discuss/threads/guid/9202a8c04000641f8000000008745a60

It seemed worth a try, so I went ahead and renamed 'sports league' to 
'sports association' -- this seemed better than 'organization' or 
'league' as a general name.
I also added a contains/contained by property to it and included this 
type on the existing types (basketball division, baseball league, ...).
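
To make the contains/contained by idea concrete, here is a minimal sketch 
in Python. It is only an illustration, not how Freebase stores anything: 
the class name SportsAssociation and the add() and ancestors() helpers are 
assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps comparison identity-based, which avoids recursing
# through the parent/child links when two nodes are compared.
@dataclass(eq=False)
class SportsAssociation:
    """Hypothetical stand-in for the 'sports association' type."""
    name: str
    contained_by: Optional["SportsAssociation"] = None
    contains: List["SportsAssociation"] = field(default_factory=list)

    def add(self, child: "SportsAssociation") -> "SportsAssociation":
        """Link a sub-association in both directions and return it."""
        child.contained_by = self
        self.contains.append(child)
        return child

    def ancestors(self) -> List[str]:
        """Names of the enclosing associations, nearest first."""
        names = []
        node = self.contained_by
        while node is not None:
            names.append(node.name)
            node = node.contained_by
        return names


# Baseball: league -> league -> division
mlb = SportsAssociation("Major League Baseball")
american_league = mlb.add(SportsAssociation("American League"))
al_west = american_league.add(SportsAssociation("American League West"))

# Basketball: league -> conference -> division
nba = SportsAssociation("NBA")
eastern = nba.add(SportsAssociation("Eastern Conference"))
atlantic = eastern.add(SportsAssociation("Atlantic Division"))
```

The same type covers both sports even though the level names differ; the 
sport-specific vocabulary ("conference", "division") lives in the names 
rather than in separate types.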

It seems to work pretty well. See:
http://www.freebase.com/view/en/major_league_baseball
http://www.freebase.com/view/en/detroit_tigers

http://www.freebase.com/view/guid/9202a8c04000641f80000000004bc40b
http://www.freebase.com/view/en/boston_celtics

You can see that in the first case (baseball), I've hidden the old, 
redundant properties that previously appeared on 'baseball league'. For 
comparison, I've left them visible on 'basketball division'.

I'd also point out that I left 'teams' as its own property because 
'contains' might lead users to say 'Atlantic Division' contains 'Boston 
Celtics'... and then 'Boston Celtics' contains 'Kevin Garnett' -- and 
before you know it, people are being typed as associations. That said, 
this nuance might be lost on people entering information, so it's a 
little risky.
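
A rough sketch of why 'teams' is kept apart from 'contains', again in 
Python. The Association and Team classes and the add_association() and 
add_team() methods are hypothetical names for illustration only; the 
type checks stand in for whatever the schema would actually enforce.

```python
class Team:
    """Hypothetical team type; its roster holds people, not associations."""

    def __init__(self, name):
        self.name = name
        self.roster = []


class Association:
    """Hypothetical 'sports association' with two separate properties."""

    def __init__(self, name):
        self.name = name
        self.contains = []  # sub-associations only
        self.teams = []     # member teams, kept apart from 'contains'

    def add_association(self, child):
        # Refuse anything that is not itself an association, so teams
        # (and, one step further, players) never end up in 'contains'.
        if not isinstance(child, Association):
            raise TypeError("'contains' only accepts associations")
        self.contains.append(child)

    def add_team(self, team):
        if not isinstance(team, Team):
            raise TypeError("'teams' only accepts teams")
        self.teams.append(team)


atlantic = Association("Atlantic Division")
celtics = Team("Boston Celtics")
celtics.roster.append("Kevin Garnett")
atlantic.add_team(celtics)

# Trying to file the team under 'contains' is rejected.
try:
    atlantic.add_association(celtics)
    rejected = False
except TypeError:
    rejected = True
```

With a single generic 'contains' property there would be no natural place 
to stop the chain from division to team to player.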


Open questions
* I've had some feedback that contains/contained by are too generic and 
may confuse people. Are there other suggestions for property names that 
would work well across different sports?
* Any objections to proceeding with this throughout the other sports? It 
will involve including 'sports association' on all the types I'm 
replacing and re-entering the data in the new properties.
* Once done, any objections to hiding or removing the (now) redundant 
properties on types like basketball division, hockey conference, ...?
* Is there any value in retaining these types if/when these properties 
are removed?







