[Data-modeling] Modeling uncertainty
Iain Sproat
iainsproat at gmail.com
Tue Feb 17 06:53:50 UTC 2009
I have been entering data for early medieval
chiefs <http://www.freebase.com/view/en/tato> and this is a minefield of
uncertainty (there's a reason it's called the
Dark Ages!). Most dates are given as circa, and even where dates are given,
they often conflict with information in another reference (the
accounts of medieval chroniclers tend not to be reliable and disagree
with one another).
UI-wise, it would be great to be able to enter a date such as c 0510 or
0614 - 0620 and have it automatically recorded as an uncertain date with a
date range. To avoid confusion, the UI should hide the uncertainty fields
until an uncertain date is entered explicitly.
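As a rough sketch of the input handling described above (not an existing Freebase feature), the two example inputs could be recognised with a few regular expressions. The names `ParsedDate` and `parse_year_input` are made up for illustration, years are treated as plain integers, and the accepted spellings of "circa" are an assumption.

```python
import re
from typing import NamedTuple, Optional


class ParsedDate(NamedTuple):
    best: Optional[int]      # single most likely year, if one was given
    earliest: Optional[int]  # lower bound of an explicit range
    latest: Optional[int]    # upper bound of an explicit range
    uncertain: bool          # True when the UI should reveal the uncertainty fields


# Assumed spellings: "c 0510", "c. 510", "ca 510", "circa 510".
_CIRCA = re.compile(r"^(?:c|ca|circa)\.?\s*(\d{1,4})$", re.IGNORECASE)
_RANGE = re.compile(r"^(\d{1,4})\s*-\s*(\d{1,4})$")
_EXACT = re.compile(r"^(\d{1,4})$")


def parse_year_input(text: str) -> ParsedDate:
    """Classify a typed year as exact, circa, or an explicit range."""
    text = text.strip()
    m = _CIRCA.match(text)
    if m:
        # "circa" gives a best guess but no bounds, so the bounds stay empty.
        return ParsedDate(int(m.group(1)), None, None, True)
    m = _RANGE.match(text)
    if m:
        lo, hi = int(m.group(1)), int(m.group(2))
        if lo > hi:
            raise ValueError("range start is later than range end")
        return ParsedDate(None, lo, hi, True)
    m = _EXACT.match(text)
    if m:
        # A plain year is treated as certain; the uncertainty UI stays hidden.
        return ParsedDate(int(m.group(1)), None, None, False)
    raise ValueError(f"unrecognised date input: {text!r}")
```

Only the circa and range forms set `uncertain`, which matches the idea of keeping the extra fields hidden until an uncertain date is actually typed.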
Schema-wise, I think it would be best to replace the Date type used
throughout the schema with a CVT that includes a most likely date, an
earliest likely date, and a latest likely date.
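To make the three-field idea concrete, here is a minimal sketch of such a value as a Python dataclass. It is not a Freebase schema definition: the class name `UncertainDate`, the field names, the use of integer years, and the `overlaps` check for comparing two sources are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UncertainDate:
    most_likely: Optional[int] = None  # best single estimate (year)
    earliest: Optional[int] = None     # earliest likely year
    latest: Optional[int] = None       # latest likely year

    def __post_init__(self):
        given = [y for y in (self.earliest, self.most_likely, self.latest)
                 if y is not None]
        if not given:
            raise ValueError("at least one of the three fields is required")
        if given != sorted(given):
            raise ValueError("expected earliest <= most_likely <= latest")

    @property
    def is_uncertain(self) -> bool:
        # A value with no bounds behaves like an ordinary exact date.
        return self.earliest is not None or self.latest is not None

    def span(self) -> Tuple[float, float]:
        """Possible range; a missing bound is treated as open-ended."""
        if not self.is_uncertain:
            return (self.most_likely, self.most_likely)
        lo = self.earliest if self.earliest is not None else float("-inf")
        hi = self.latest if self.latest is not None else float("inf")
        return (lo, hi)

    def overlaps(self, other: "UncertainDate") -> bool:
        """True if two dates (e.g. from two chronicles) could both hold."""
        a_lo, a_hi = self.span()
        b_lo, b_hi = other.span()
        return a_lo <= b_hi and b_lo <= a_hi
```

For example, `UncertainDate(most_likely=510, earliest=505, latest=515)` overlaps a source that gives 514 to 520, but not one that states 530 exactly, which is one way to flag references that genuinely disagree.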
On a similar matter, I'd like to see a way in the UI to provide references
for all links so that someone can query the source material which provided
the data, e.g. to attribute a date to Origo Gentis
Langobardorum <http://www.freebase.com/view/en/origo_gentis_langobardorum>
or Historia gentis
Langobardorum <http://www.freebase.com/view/en/historia_gentis_langobardorum>.
Is this the purpose of the 'attribution' property in the
link <http://www.freebase.com/type/schema/type/link> type?
Iain (sprocketonline)
On Tue, Feb 17, 2009 at 4:16 AM, Scott Blomquist <scott at blomqui.st> wrote:
> I meant "Is there any reason why we should not make the range of
> uncertainty..."
>
>
> On Mon, Feb 16, 2009 at 4:15 PM, Scott Blomquist <scott at blomqui.st> wrote:
>
>> Very interesting suggestion. Is there any reason we should make the range
>> of uncertainty able to be +x/-y instead of +/-x? I.e. I can imagine a case
>> where someone's date of birth is known to be around a certain time, but it
>> could have been as much as 3 years earlier, but no more than 6 months later.
>> I realize that serves to make things more complex, but I think it's a more
>> accurate reflection of many kinds of date uncertainty. I think you're right
>> about using a number and unit of time, though. If we accept dates, I fear
>> that some data might get mistakenly entered thinking that this replaces the
>> estimated start or end dates in the Event type itself.
>>
>>
>> On Mon, Feb 16, 2009 at 3:24 PM, Kirrily Robert <kirrily at metaweb.com> wrote:
>>
>>> I wonder whether, for events at least, we could have a co-type for
>>> uncertainly dated events that specifies a range of uncertainty. Jeff would
>>> kick me for this (if he weren't off having a baby) but call it something
>>> like "Uncertainly timed event" and have two properties, "Degree of
>>> uncertainty of start date", and "Degree of uncertainty of end date". These
>>> expect a CVT which is an integer and a unit of time, eg. 3 days, 6 months,
>>> 1000 years. For "Spring 1985" assuming it were the northern hemisphere, you
>>> could just put in a date of 1 May 1985 and allow 6 weeks' uncertainty, or
>>> thereabouts.
>>> This would leave you able to put an estimated date in the ordinary date
>>> fields on event, allowing it to appear neatly in timelines and whatnot, but
>>> also provide the information about the degree of uncertainty.
>>>
>>> K.
>>>
>>> On 16/02/2009, at 3:09 PM, Scott Blomquist wrote:
>>>
>>> I just encountered a scenario today that would benefit from the same
>>> solution as your date example. I found some events whose times I've only
>>> been able to pin down so far to "Spring 1985", and I don't think I have any
>>> good way to represent that in an event today.
>>>
>>> On Mon, Feb 16, 2009 at 10:28 AM, Tom Morris <tfmorris at gmail.com> wrote:
>>>
>>>> Is anyone doing work on modeling uncertainty? I'm specifically
>>>> interested in dates and locations.
>>>>
>>>> Location - If I'm told that something is "near" or "in the vicinity
>>>> of" a location, currently my choices are to either not record the fact
>>>> or to guess at a way to reduce the precision in a way that's still
>>>> accurate. I could say that something which is "near Boston" is "in
>>>> Massachusetts," but a) that might not be true and b) that's not the
>>>> information that I have.
>>>>
>>>> Dates - The simple case is "circa," but it would also be useful to
>>>> deal with both open and closed ranges (e.g. before 1945, after 1999,
>>>> or September 2008-December 2008). Currently the only type of range
>>>> that can be encoded is ones which can be made by truncating precision
>>>> (ie 2009 == 1 Jan 2009 - 31 Dec 2009).
>>>>
>>>> Tom
>>>> _______________________________________________
>>>> Data-modeling mailing list
>>>> Data-modeling at freebase.com
>>>> http://lists.freebase.com/mailman/listinfo/data-modeling
>>>>
>>>
>>>
>>>
>>> --
>>> http://scott.blomqui.st
>>>
>>>
>>> --
>>> Kirrily Robert
>>> Freebase Community Director
>>> kirrily at metaweb.com
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> http://scott.blomqui.st
>>
>
>
>
> --
> http://scott.blomqui.st
>
>
>