[Data-modeling] Location question
Tom Morris
tfmorris at gmail.com
Mon Jul 13 19:14:35 UTC 2009
I don't think that you can get around the fact that something is
either part of a set or not part of a set. If the Northwest and the
Pacific Northwest and Tom's Rainy States contain different locations,
they've got to be different things.
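
A minimal sketch of that point, using the broad and narrow "Northwestern United States" memberships quoted further down in this thread. The variable and function names are hypothetical and only for illustration:

```python
# Hypothetical names; memberships are the examples quoted later in this thread.
broad_northwest = frozenset({"Washington", "Oregon", "Idaho", "Montana", "Wyoming"})
narrow_northwest = frozenset({"Washington", "Oregon"})


def same_location(a, b):
    """Treat two set-defined locations as the same thing only if their members match."""
    return a == b


# The narrow variant is a proper subset of the broad one, so under this
# rule the two would have to be separate things.
is_same = same_location(broad_northwest, narrow_northwest)
is_subset = narrow_northwest < broad_northwest
```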
The time-mediation thing is an orthogonal issue to my mind and one
which definitely needs to be addressed.
Union vs containment is mostly a matter of semantics, I think. I
don't have a problem with the United States "containing" not only the
48 contiguous states, but also Alaska, Hawaii, and the Territories,
unless there's a rule that a Location must be a single closed polygon.
The whole containment thing is currently a mess though because of a
hack put into the data model/data entry to work around query/client
limitations. Because there's no easy way to do recursive containment
queries, people have been told to enter multiple levels of containment
in the "contains" property. This makes it difficult, or at least a lot
of extra work, to figure out what the true containment hierarchy is.
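
To illustrate the extra work involved, here is a rough sketch of recovering the direct hierarchy from flattened "contains" data. It assumes the data is acyclic and that every intermediate level also lists what it contains; the function name and sample data are hypothetical, not anything from the actual schema or API:

```python
def direct_containment(contains):
    """Recover direct parent -> child edges from a flattened 'contains' mapping.

    `contains` maps each location to every location entered under it, at
    any depth. A child is kept as a direct child only if no other listed
    child of the same parent also lists it.
    """
    direct = {}
    for parent, children in contains.items():
        direct[parent] = {
            child
            for child in children
            if not any(
                child in contains.get(other, set())
                for other in children
                if other != child
            )
        }
    return direct


# Hypothetical sample data with multiple levels entered under each location.
flattened = {
    "United States": {"Washington", "King County", "Seattle"},
    "Washington": {"King County", "Seattle"},
    "King County": {"Seattle"},
    "Seattle": set(),
}
hierarchy = direct_containment(flattened)
```

If an intermediate level is missing its own entries (say Washington did not list Seattle), this sketch would wrongly report Seattle as a direct child of the United States, which is part of why the flattened data is hard to trust.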
Tom
On Mon, Jul 13, 2009 at 2:39 PM, Richard Newman <rnewman at twinql.com> wrote:
> Thanks for the insight, Ed. Details inline…
> On 12 Jul 2009, at 12:09 AM, Ed Laurent wrote:
>
> You ask a couple of great questions and you described the problem well.
>
> To summarize, there are three primary issues: 1) the "Location" type needs
> some work to more explicitly define relationships among locations, 2)
> locations are rarely defined concretely, and 3) there are often multiple
> topics for the same concept of a location that differ in one or more ways.
>
> Indeed. Arguably 3) is a consequence of 2).
>
> Issue 1: A couple weeks ago I started an "Enhanced location" base to build
> and test potential properties of the Location type to improve its spatial
> semantics. Currently, I have "Location intersection" and "Location union"
> types to describe locations as either occurring within the area of
> intersection of two or more overlapping locations or within the combined
> area of two or more adjacent or overlapping locations. That's about as far
> as I've gotten with these types and I welcome all feedback and assistance
> with them.
>
> Looks good. Those are certainly valuable things to have, though union does
> raise some questions about common use of contains — does the Pacific
> Northwest contain Washington et al, or is it the union of those states?
> Certainly Washington contains its cities, but probably the United States (in
> its role as a location) is a union of its territories. (Something to do with
> shared boundaries or strict containment, perhaps.)
>
> Issue 2: This will always be a problem. Perhaps there are standard
> descriptions of the entire world, but I am unaware of them (CIA World
> Factbook?), and they would necessarily be time mediated.
>
> Indeed. Even if there were standard descriptions, there would also be
> non-standard descriptions, such as for contested territories (or even basic
> disagreement!).
>
> One way to describe such standards would be to import all the location
> topics with their definitions and link them to their sources, which would
> have a property for the date. For example, location entities of a political
> map could be described using FGDC compliant metadata. I've started modeling
> FGDC compliant metadata in my MapCentral base but have not yet made it to
> Entity and Attribute Information, which could include the location topics
> under the Enumerated domain (I think). See 2001 NLCD for an example of how
> I'm describing maps.
>
> Source information would certainly be a good addition, though at some cost
> in complexity (the same problem thorough use of metadata always has). I'm
> afraid I'm out of my depth with the intricacies of GIS standards, so I'm
> little help there!
>
> I'm a little concerned that the use of attribution-style annotations to turn
> wooly definitions into concrete ones (as opposed to straightforward
> annotation of time-varying political maps from a single source, for example)
> will result in either bad data or an explosion of topics — Joe's Pacific
> Northwest — though I admit to having no good solution for this. Perhaps only
> the more concrete end of the spectrum should be encoded, and the rest should
> be a problem for humans and tools.
>
> Issue 3: A system is needed to organize similar but different location
> topics to describe how they are similar and/or different. I'm using the
> "Code category" type of my Land Cover base to do this for land cover
> classes. See "Spruce-Fir forest" for an example. Here, I have a generic
> concept of a land cover category and am linking it to all the defined land
> cover codes/classes that fall under that umbrella, which are linked to their
> classification systems, which in turn are linked to relevant publication(s).
> The "Classification code" type provides properties to describe how the land
> cover codes/classes are similar and/or different (i.e., equals, overlaps,
> contains, contained by).
>
> That seems a good way to approach classification. I'm still thinking through
> how a similar approach might apply to collections of locations; the thing
> that's tripping me up is trying to ensure that I (and others) could do what
> they want in Parallax and simple MQL queries, whilst still being vaguely
> correct.
>
> Perhaps a different kind of location — "conceptual location" — which is an
> umbrella concept for the various concrete instantiations, and itself links
> to locations? I'm thinking something like
> "Northwestern United States" (concept)
>   - sometimes considered to consist of:
>       Washington, Oregon, Idaho, Montana, Wyoming
>   - variants:
>       "Broad Northwestern United States" (or unnamed)
>         - location union: Washington, Oregon, Idaho, Montana, Wyoming
>       "Narrow Northwestern United States"
>         - location union: Washington, Oregon
>
> That allows a user to find the concept they're looking for, then pick a
> specific variant of the concept (which can be annotated with a time).
>
> Maybe I need more coffee before addressing this again :)
> Thanks,
> -R