[Data-modeling] Upcoming Schema Changes
Robert Cook
robert at metaweb.com
Wed Feb 13 01:51:03 UTC 2008
Jeff -- I personally agree that a more general mechanism is
preferable, something that lets a time series and "the definitive
value at the moment" coexist without creating two properties
through denormalization. In the meantime, I'm afraid that the
representation will probably have to be more ad hoc, since the
capability isn't there yet. Indeed, the presence of so many special
cases would push development priorities in this more general
direction, as well as provide guidance for a better solution.
The real problem, I think, is the lack of use cases. We can imagine the
power of a general system, but in practice, for example, I haven't
seen much time series data. The "places lived" experiment has
so far not proven very successful at collecting data.
R
On Feb 12, 2008, at 5:40 PM, Jeff Thompson wrote:
> I agree that "denormalization" (separate properties for the current
> value and the historical series) is
> attractive simply because we can do it with the existing code. But
> is that the only attraction?
> The question I'm pushing is: If we have a time-series of "places
> lived" for Robert Cook, why has
> no one yet called for a separate property for "current living place"
> for Robert Cook? Is that because
> they are happy to manually view the time series on the Freebase
> page, or manually cook up (no pun intended)
> a property-specific query to get the current value from the time
> series? I'm trying to shift the burden
> of proof on the question...
>
> Christopher R. Maden wrote:
>> Jeff Thompson <jeff at thefirst.org> wrote:
>>> Concerning option #2, can MQL answer the following queries right
>>> now:
>>> * Who is the spouse of Nicole Kidman? (i.e., the latest non-couch-
>>> jumping spouse)
>>> http://www.freebase.com/view/en/nicole_kidman
>>> * Where did Barack Obama get his degree? (i.e., the latest degree)
>>> http://www.freebase.com/view/en/barack_obama
>>> * Where does Robert Cook live? (i.e., now)
>>> http://www.freebase.com/view/guid/9202a8c04000641f80000000008427e9
>>
>> Kind of. One can structure a query for all of Nicole Kidman’s
>> domestic relationships which have a start date but no end date.
>> Similarly, one can ask for the first of Obama’s degrees, reverse-
>> sorted by date.
>>
>> However, this does not handle incomplete knowledge, as when we
>> know that Obama received these three degrees but have a date for
>> only one of them. And the client doesn’t currently know how
>> to distinguish time-series properties from others. It would be
>> possible, as Robert suggests, to add something like the property
>> hints that currently let the client know about disambiguating
>> properties; these would suggest that a property is expected to be
>> time-valued. That would, in turn, require standardizing on the way
>> of representing the dates themselves...
>>
>> In short, this is a feature that we want, and have been batting
>> around for some time, but we need some help defining the use cases
>> so we can best meet your needs. Is the denormalization that Robert
>> proposes prohibitively problematic? Is a certain amount of data
>> contortion acceptable, to fit a standardized representation of time-
>> series data? Are there other approaches we haven’t considered?
>>
>> ~Chris
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
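
The two strategies Chris describes above (taking the entry that has a
start date but no end date, and taking the first entry after a reverse
sort by date) can be sketched in plain Python rather than MQL. This is
only an illustration under stated assumptions: the record shape (the
"value", "start", "end" and "date" keys), the function names, and the
placeholder sample data are all hypothetical and are not Freebase's
schema or API. The sketch also shows the incomplete-knowledge problem:
when a date is missing, it declines to answer rather than guess.

```python
from datetime import date

# Hypothetical record shapes (not a Freebase schema):
#   interval entry: {"value": str, "start": date or None, "end": date or None}
#   dated entry:    {"value": str, "date": date or None}


def current_value(series):
    """Return the value of the single entry with a start date and no end date.

    Returns None when zero or several entries qualify, since the
    "current" value is then undetermined.
    """
    open_entries = [
        e for e in series
        if e.get("start") is not None and e.get("end") is None
    ]
    if len(open_entries) == 1:
        return open_entries[0]["value"]
    return None


def latest_value(series):
    """Return the value with the most recent date (reverse sort, take first).

    Returns None if the series is empty or any entry lacks a date:
    an undated entry could be the latest one, so we cannot tell.
    """
    if not series or any(e.get("date") is None for e in series):
        return None
    ordered = sorted(series, key=lambda e: e["date"], reverse=True)
    return ordered[0]["value"]


# Placeholder data, invented purely for illustration.
relationships = [
    {"value": "Spouse A", "start": date(1990, 1, 1), "end": date(2001, 1, 1)},
    {"value": "Spouse B", "start": date(2006, 1, 1), "end": None},
]
degrees = [
    {"value": "Degree X", "date": date(1983, 1, 1)},
    {"value": "Degree Y", "date": date(1991, 1, 1)},
]
degrees_incomplete = degrees + [{"value": "Degree Z", "date": None}]

print(current_value(relationships))
print(latest_value(degrees))
print(latest_value(degrees_incomplete))
```

Returning None for the incomplete series is one possible design choice;
a client could instead return all candidates and let the user decide.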