[Data-modeling] "area" property of "/location/location" type

Robert Cook robert at metaweb.com
Wed Apr 2 18:40:00 UTC 2008


Again, I think this is an example of two distinct use cases: 1) the
definitive value, which is there for people who really don't want to be
bothered with all of the detail (most of the world); and 2) all of the
possible variants, be they measurements taken at a different time,
with a different methodology, or by different people, which are there
for the enthusiasts (those who care enough about the data to ensure
it's complete and accurate).  Take, for example, the over-modeling I
did on earthquakes:

http://www.freebase.com/view/en/loma_prieta_earthquake

The magnitude is blown out into a CVT (compound value type) and can
have multiple magnitude values on different scales from different
sources.  This is certainly interesting to seismologists and
geo-geeks, but for somebody who wants an ordered list of the top 30
most destructive earthquakes, the detail simply gets in the way.
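
To make the tension concrete, here is a minimal Python sketch of a
magnitude CVT.  It is not the actual Freebase schema: the class names,
the field names, and the scale preference order are all assumptions
made for illustration, and magnitude stands in for "destructiveness".
It shows that an ordered list can only be produced after each
earthquake's readings have been collapsed to a single value.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class MagnitudeReading:
    """One row of a hypothetical magnitude CVT."""
    value: float
    scale: str   # e.g. "Mw", "Ms", "ML"
    source: str  # who reported the reading


@dataclass
class Earthquake:
    name: str
    readings: List[MagnitudeReading] = field(default_factory=list)

    def preferred_magnitude(
        self, scale_order: Sequence[str] = ("Mw", "Ms", "ML")
    ) -> Optional[float]:
        """Collapse the readings to one value.

        Assumed policy: take the first reading found on the most
        preferred scale; return None if no reading matches.
        """
        for scale in scale_order:
            for reading in self.readings:
                if reading.scale == scale:
                    return reading.value
        return None


def top_by_magnitude(quakes: List[Earthquake], n: int) -> List[str]:
    """Names of the n largest earthquakes by preferred magnitude."""
    ranked = [q for q in quakes if q.preferred_magnitude() is not None]
    ranked.sort(key=lambda q: q.preferred_magnitude(), reverse=True)
    return [q.name for q in ranked[:n]]
```

The ranking function never looks at the list of readings directly;
everything depends on the policy inside preferred_magnitude, which is
exactly the decision the casual user does not want to make.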

The same could be said for the "current" vs "complete" list of board  
members of a company, members of a sports team, products of a  
manufacturer, or population of a country.  Jeff T pointed out that I  
went as far as creating a specific time-series property to capture the  
changing name of a company:

<http://www.freebase.com/view/guid/9202a8c04000641f8000000005b7ab1f>
(See "Previous names")

So far my (kind of lame) solution is to use two distinct properties.
This is an OK stopgap, but, like all denormalizations, it's confusing
and semantically sub-optimal.  The "simple" property really should be
generated by a query on the complete property, although it's not clear
how that could support the "top 30 earthquakes" example above in a
performant way.  For now, though, we have to decide case by case when
to keep the simple representation and when to add a CVT-enriched
property that can hold all of the possible representations.
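
One way to picture a "simple" property that is derived from the
complete one, while keeping ordered queries cheap, is to recompute the
simple value whenever a new measurement is written.  The sketch below
is only an illustration under stated assumptions: the Location and
AreaMeasurement classes, their field names, and the "most recent
measurement wins" rule are all hypothetical, not anything Freebase
provides.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AreaMeasurement:
    """One entry of the hypothetical complete (CVT-style) property."""
    square_km: float
    measured_on: date
    source: str


@dataclass
class Location:
    name: str
    measurements: List[AreaMeasurement] = field(default_factory=list)
    area: Optional[float] = None  # the derived "simple" value

    def add_measurement(self, measurement: AreaMeasurement) -> None:
        """Store the measurement and refresh the derived value.

        Assumed rule: the most recently dated measurement wins.
        """
        self.measurements.append(measurement)
        latest = max(self.measurements, key=lambda m: m.measured_on)
        self.area = latest.square_km


def largest(locations: List[Location], n: int) -> List[str]:
    """Names of the n largest locations, reading only the simple value."""
    known = [loc for loc in locations if loc.area is not None]
    known.sort(key=lambda loc: loc.area, reverse=True)
    return [loc.name for loc in known[:n]]
```

The sort reads only the cached value, so it never touches the
measurement history; the cost is that the derived value is still
stored twice and has to be kept in sync on every write.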

Thoughts?

R

On Apr 2, 2008, at 10:47 AM, Jeff Prucher wrote:

>> -----Original Message-----
>> From: data-modeling-bounces at freebase.com
>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of
>> Kirrily Robert
>> Sent: Wednesday, April 02, 2008 10:04 AM
>> To: Freebase data modeling mailing list
>> Subject: Re: [Data-modeling] "area" property of
>> "/location/location" type
>>
>>
>> ----- "Jonathan W. Lowe" <jlowe at giswebsite.com> wrote:
>>> The "/location/location" type has an "area" property that accepts
>>> multiple values.  Should this property instead be restricted to one
>>> value?
>>>
>>> Unless someone can identify a location having more than one valid
>>> area, I recommend a schema change that restricts
>>> /location/location/area to one value.
>>
>> Time series!  Uhhh... forget I said that.  Please.  PLEASE?
>
> Well, that and the fact that different sources can have different, equally
> valid, area measurements for the same location at the same time, depending
> on methodology. But neither of those reasons is why the current property is
> non-unique; the real reason is that I forgot to check the box. Me, I'm
> tempted to just make it unique until we actually have some time-series data
> to input. Right now, it's practically all WP infobox loads, which implies
> reasonably current area.  (There are 24 locations with multiple values, all
> of which appear to be differences in rounding, conversion, or using the
> different measurement units, which will have to be cleaned up before we can
> make the property unique.)
>
> Jeff P
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling


