[Data-modeling] Events

Jeff Thompson jeff at thefirst.org
Thu Feb 14 22:23:56 UTC 2008


Jeff Prucher wrote:
>> ----- "Jeff Prucher" <jeff at metaweb.com> wrote:
>>> I agree. A use case is military conflicts, for which Wikipedia already
>>> contains extractable super/sub-conflict relationships. (E.g. "the
>>> invasion of normandy" is a part of "world war II"; "the battle of
>>> omaha beach" is part of the invasion of normandy, etc.)
>> OK, that sounds sane.
>>
>> I was thinking this morning on the train about event 
>> locations.  If historic periods are just long-running events, 
>> then what is the location of the 20th century? Or of the 
>> Impressionist Movement in art?  
>>
>> One thing I found when populating some data in event and 
>> historic period was that it starts to run into problems of 
>> dated locations, too.  What was the location of the 
>> historical period "Classical Antiquity"?  Is "Italy" a 
>> sensible location to list there, given that Italy didn't 
>> exist as such until quite recently?  If, instead, we list 
>> locations as they were at the time, then it becomes very 
>> difficult to, e.g., search for events occurring in any given Balkan state.
> 
> Of course, since political boundaries are shifting constantly, would it even
> make sense to try to use only current locations for events? If Kosovo
> secedes from Serbia, would we have to go into all the events, no matter how
> ancient,  that we claim occurred in Serbia, and figure out which ones should
> be in modern Kosovo instead? 

This is a side-effect of shielding Freebase users from inference rules (since rule
languages are way scary). Freebase is "flattened": many topics have a property which can be inferred from
another property, but for the sake of simplicity, the *result* of the inference
is hand-entered and the implicit rule which it came from is not represented in the
system.  For example, the Siblings relation can be inferred from Parent(s) and their children,
but sibling values are directly hand-coded (and can be inconsistent).  Consider also that San Francisco
is Contained By "San Francisco Bay Area", "Northern California" and "California".  Why mention
just these and not "United States" or "North America"?  Because these are the results of the transitive
"contains" inference that the Freebase users chose to flatten into property values.

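To make the contrast concrete, here is a rough sketch of what the un-flattened version might look like: only the direct facts are stored, and the flattened values are computed by rules. This is not how Freebase works or any Freebase API; the dictionaries, the function names (contained_by, siblings) and the people in PARENTS are all made up for illustration.

```python
# Hypothetical store of *direct* "contained by" facts (one hop each).
DIRECT_CONTAINED_BY = {
    "San Francisco": ["San Francisco Bay Area"],
    "San Francisco Bay Area": ["Northern California"],
    "Northern California": ["California"],
    "California": ["United States"],
    "United States": ["North America"],
}


def contained_by(place, edges=DIRECT_CONTAINED_BY):
    """Apply the transitive rule: return every container of `place`, nearest first."""
    seen = []
    queue = list(edges.get(place, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # guard against cycles or duplicate paths
        seen.append(current)
        queue.extend(edges.get(current, []))
    return seen


# Hypothetical parent facts (made-up people) for the Siblings rule.
PARENTS = {
    "Alice": {"Carol", "Dave"},
    "Bob": {"Carol", "Dave"},
    "Eve": {"Frank"},
}


def siblings(person, parents=PARENTS):
    """Infer siblings as other people who share at least one parent."""
    mine = parents.get(person, set())
    return sorted(
        other
        for other, theirs in parents.items()
        if other != person and mine & theirs
    )


print(contained_by("San Francisco"))
print(siblings("Alice"))
```

With the rule in place, "United States" and "North America" fall out of the query for free, and the sibling values cannot drift out of sync with the parent values, because neither is stored by hand.
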
Likewise, the name of a spot on the Earth changes over time.  The fact
that it is called "Kosovo" is inferred from the geolocation of the event according to the name
it had at that moment in time, and according to whose point of view.
(Is Taiwan contained by China?  Yes or no, as inferred according to the claims of the
various Taiwan political parties.)  So, yes, when the basis of the inference changes, you
must go through all the properties in Freebase to hand-code the results of the revised hidden
inference rule.  This works great in 90 percent of the cases, which is what Freebase is aimed at.
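
For comparison, a minimal sketch of the alternative, where the name of a spot is resolved at query time from the date and the point of view instead of being hand-coded on every event. Everything here is an assumption for illustration: the NameClaim record, the names_for function, the spot id, the place names, the years and the viewpoints are all invented and do not describe any real place or any Freebase schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NameClaim:
    """One claim that a spot carries a name, for a period, per a viewpoint."""
    name: str
    start_year: int           # inclusive
    end_year: Optional[int]   # exclusive; None means the claim is still current
    viewpoint: str            # "any" means nobody disputes the claim


# Invented claims about one invented spot.
CLAIMS = {
    "spot-1": [
        NameClaim("Old Realm", 0, 1900, "any"),
        NameClaim("Country A", 1900, None, "party-x"),
        NameClaim("Country A", 1900, 2000, "party-y"),
        NameClaim("Country B", 2000, None, "party-y"),
    ],
}


def names_for(spot_id: str, year: int, viewpoint: str) -> List[str]:
    """Return the names the spot has in `year` according to `viewpoint`."""
    result = []
    for claim in CLAIMS.get(spot_id, []):
        started = claim.start_year <= year
        not_ended = claim.end_year is None or year < claim.end_year
        if started and not_ended and claim.viewpoint in ("any", viewpoint):
            result.append(claim.name)
    return result


print(names_for("spot-1", 1850, "party-x"))
print(names_for("spot-1", 2005, "party-x"))
print(names_for("spot-1", 2005, "party-y"))
```

An event would then store only its geolocation and date. When a boundary claim changes, one NameClaim entry is edited rather than every event that ever occurred there.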


