[Data-modeling] Upcoming Schema Changes
Ed Laurent
spatial.db at gmail.com
Tue Feb 12 06:51:09 UTC 2008
From a user's point of view, separate current and historical properties may be
more intuitive if only one or two of these properties were associated
with a topic. Any more and it gets pretty clunky. I am also a little concerned
about data quality if users are required to re-enter data when it shifts
from current to historical.
My preference is for a new MQL function that searches for the most recent
date (or most recent of start and end dates for time spans that may contain
missing data) along with better documentation for how to enter time series
data. Recent discussions have highlighted a need for better handling of
dated data across the board. However, I'm not the one doing the
programming...
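
To make the "most recent" rule concrete, here is a minimal Python sketch of the
selection logic only. It is not MQL and not an existing Freebase API: the
DatedValue class, its field names, and the most_recent function are all
hypothetical, and the sample numbers are made up. It assumes each entry carries
an optional start date and an optional end date, and ranks entries by whichever
of the two is known and latest.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class DatedValue:
    """Hypothetical time-series entry; either date may be missing."""
    value: float
    start: Optional[date] = None
    end: Optional[date] = None

    def latest_known_date(self) -> Optional[date]:
        # Use the later of start/end, skipping whichever is missing.
        known = [d for d in (self.start, self.end) if d is not None]
        return max(known) if known else None


def most_recent(series: Sequence[DatedValue]) -> Optional[DatedValue]:
    """Return the entry with the latest known date, or None if none is dated."""
    dated = [e for e in series if e.latest_known_date() is not None]
    if not dated:
        return None
    return max(dated, key=lambda e: e.latest_known_date())


# Made-up sample data: two dated entries, one span with a missing end date,
# and one entry with no dates at all.
series = [
    DatedValue(100.0, start=date(2000, 4, 1)),
    DatedValue(200.0, start=date(2006, 7, 1)),
    DatedValue(150.0, start=date(2003, 1, 1), end=None),
    DatedValue(999.0),
]
latest = most_recent(series)
```

Entries with no dates at all are skipped rather than guessed at, which is one
reason clearer guidance on entering time series data would still matter.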
-Ed
On Feb 12, 2008 1:32 AM, Robert Cook <robert at metaweb.com> wrote:
> I've been working on loading the current CIA World Factbook, and like
> you I'm convinced that there will ultimately be a lot of utility for
> properties containing time series values. The real question is what
> to do when an API user wants to see a single value -- the most
> recent one -- without a complicated query. These two use cases (time
> series vs. the canonical up-to-date value) are a little at odds, and
> it seems like there are at least a couple of possible solutions:
>
> 1. Create a second time-series property for population. This would be
> a denormalization, where the most recent value is also copied into the
> current population property.
>
> 2. Add functionality to MQL to ask for the "most recent" value in a
> time series if it's queried as a single value. Of course, MQL would
> have to know about dated integers and dated floating point values,
> along with any other dated CVT. There would have to be some
> additional indicator on the property at the schema level that
> indicated it was a time series so that MQL would know when to apply
> this behavior.
>
> #1 could be done right now, whereas something like #2 would probably
> be nontrivial. There may be better ideas.
>
> Do you see a serious problem adding "historical population" on
> Statistical Region to hold your time-series data?
>
> R
>
> On Feb 11, 2008, at 9:47 PM, Jeff Thompson wrote:
>
> > Bryan Cheung wrote:
> >> The type company currently has properties which have an expected type
> >> of money value. However, these are monetary values associated with a
> >> given point in time. As such, we have modeled a type dated money value
> >> to represent a money value at a specific date. The expected values of
> >> the monetary properties of type company should be updated to use the
> >> type dated money value.
> >>
> >> * /business/company/net_income
> >> * /business/company/operating_income
> >> * /business/company/revenue
> >
> > (This is my cue to beat an old horse.) Will you only allow a single Dated
> > Money Value to show the net_income for 2008? That is, when we know the
> > net_income for 2009, will you *add* this information to the data for 2008
> > to keep the running history, or will you delete the information for 2008?
> >
> > I ask because the population property for San Francisco has a Dated
> > Integer for 2006 (the most recent census). I added the census population
> > for 2000, but someone deleted it, probably assuming it was a mistake.
> >
> > - Jeff
> >
> >
> > _______________________________________________
> > Data-modeling mailing list
> > Data-modeling at freebase.com
> > http://lists.freebase.com/mailman/listinfo/data-modeling