[Data-modeling] Upcoming Schema Changes
Jeff Thompson
jeff at thefirst.org
Thu Feb 14 20:17:01 UTC 2008
Yes, I'd agree this is how we have to go in the near term.
May I ask your thinking on something related? When you mention "a more general
mechanism", this is usually handled by a rules language. The
Freebase philosophy is to keep the user interface, MQL, and front-end data model at
the common-sense (philosophically naive) level so that lots of people can
contribute, and then to layer the more sophisticated view
on at another level for the experts. This abstraction will have to be done with a
rules language. (For example, a rule to infer the causal relation between events based
on sub-events, or a rule to infer that Bob has a valved heart because Bob is a Person,
persons are mammals, and mammals have valved hearts.) Do you envision a version of MQL
doing the heavy lifting on this, or do you expect the expert to load the Freebase data
into another system that can be queried with a more sophisticated rules language?
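
To make the "valved heart" example concrete, here is a minimal Python sketch of
that kind of inheritance rule. It is only an illustration under made-up
assumptions: the IS_A and HAS tables and both function names are hypothetical,
not part of MQL or the Freebase schema, and the sketch lumps "instance of" and
"subclass of" into a single link for brevity.

```python
# Hypothetical facts for illustration only; not Freebase data or schema.
# IS_A mixes "instance of" (Bob -> Person) and "subclass of" (Person -> Mammal).
IS_A = {"Bob": "Person", "Person": "Mammal"}
HAS = {"Mammal": {"valved heart"}}


def ancestors(entity):
    """Yield the entity and every type reachable by following IS_A links."""
    seen = set()
    while entity is not None and entity not in seen:
        seen.add(entity)  # guard against accidental cycles in the facts
        yield entity
        entity = IS_A.get(entity)


def inferred_properties(entity):
    """Collect properties asserted on the entity or inherited from its types."""
    props = set()
    for type_name in ancestors(entity):
        props |= HAS.get(type_name, set())
    return props


if __name__ == "__main__":
    print(inferred_properties("Bob"))
```

The naive front-end data would only need to say that Bob is a Person; the
"expert" layer supplies the remaining links and the rule that walks them.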
Robert Cook wrote:
> Jeff -- I personally agree that a more general mechanism is
> preferable, something that allows time series and "the definitive
> value at the moment" to coexist without creating two properties
> with a denormalization. In the meantime, I'm afraid that the
> representation will probably have to be more ad hoc since the
> capability isn't yet there. Indeed, the presence of so many special
> cases would force development priorities into this more general
> direction, as well as provide guidance for a better solution.
>
> The real problem, I think, is the lack of use cases. We can imagine the
> power of a general system, but in practice, for example, I haven't
> seen too much time series data. The experiment of "places lived" has
> so far not proven very successful at collecting much data.
>
> R
>
> On Feb 12, 2008, at 5:40 PM, Jeff Thompson wrote:
>
>> I agree that "denormalization" (separate properties for the current
>> value and the historical series) is
>> attractive simply because we can do it with the existing code. But
>> is that the only attraction?
>> The question I'm pushing is: If we have a time-series of "places
>> lived" for Robert Cook, why has
>> no one yet called for a separate property for "current living place"
>> for Robert Cook? Is that because
>> they are happy to manually view the time series on the Freebase
>> page, or manually cook up (no pun intended)
>> a property-specific query to get the current value from the time
>> series? I'm trying to shift the burden
>> of proof on the question...
>>
>> Christopher R. Maden wrote:
>>> Jeff Thompson <jeff at thefirst.org> wrote:
>>>> Concerning option #2, can MQL answer the following queries right
>>>> now:
>>>> * Who is the spouse of Nicole Kidman? (i.e., the latest non-couch-
>>>> jumping spouse)
>>>> http://www.freebase.com/view/en/nicole_kidman
>>>> * Where did Barack Obama get his degree? (i.e., the latest degree)
>>>> http://www.freebase.com/view/en/barack_obama
>>>> * Where does Robert Cook live? (i.e., now)
>>>> http://www.freebase.com/view/guid/9202a8c04000641f80000000008427e9
>>> Kind of. One can structure a query for all of Nicole Kidman’s
>>> domestic relationships which have a start date but no end date.
>>> Similarly, one can ask for the first of Obama’s degrees, reverse-
>>> sorted by date.
>>>
>>> However, this does not handle incomplete knowledge: for
>>> instance, if we know that Obama received these three degrees but have
>>> a date for only one of them. And the client doesn’t currently know how
>>> to distinguish time-series properties from others. It would be
>>> possible, as Robert suggests, to add something like the property
>>> hints that currently let the client know about disambiguating
>>> properties; these would suggest that a property is expected to be
>>> time-valued. That would, in turn, require standardizing on the way
>>> of representing the dates themselves...
>>>
>>> In short, this is a feature that we want, and have been batting
>>> around for some time, but we need some help defining the use cases
>>> so we can best meet your needs. Is the denormalization that Robert
>>> proposes prohibitively problematic? Is a certain amount of data
>>> contortion acceptable, to fit a standardized representation of time-
>>> series data? Are there other approaches we haven’t considered?
>>>
>>> ~Chris
>> _______________________________________________
>> Data-modeling mailing list
>> Data-modeling at freebase.com
>> http://lists.freebase.com/mailman/listinfo/data-modeling