[Data-modeling] Location question
Richard Newman
rnewman at twinql.com
Mon Jul 13 18:39:49 UTC 2009
Thanks for the insight, Ed. Details inline…
On 12 Jul 2009, at 12:09 AM, Ed Laurent wrote:
> You ask a couple of great questions and you described the problem
> well.
>
> To summarize, there are three primary issues: 1) the "Location" type
> needs some work to more explicitly define relationships among
> locations, 2) locations are rarely defined concretely, and 3) there
> are often multiple topics for the same concept of a location that
> differ in one or more ways.
Indeed. Arguably 3) is a consequence of 2).
> Issue 1: A couple weeks ago I started an "Enhanced location" base to
> build and test potential properties of the Location type to improve
> its spatial semantics. Currently, I have "Location intersection" and
> "Location union" types to describe locations as either occurring
> within the area of intersection of two or more overlapping locations
> or within the combined area of two or more adjacent or overlapping
> locations. That's about as far as I've gotten with these types and I
> welcome all feedback and assistance with them.
Looks good. Those are certainly valuable things to have, though union
does raise some questions about common use of contains — does the
Pacific Northwest contain Washington et al, or is it the union of
those states? Certainly Washington contains its cities, but probably
the United States (in its role as a location) is a union of its
territories. (Something to do with shared boundaries or strict
containment, perhaps.)
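To make that distinction a little more concrete, here is a rough sketch in Python. It is only a toy model of my own, not anything that exists in Freebase: each location is a set of made-up "cell" ids, and the function names `strictly_contains` and `is_union_of` are hypothetical.

```python
# Toy model (assumption): a location is a set of invented atomic cell ids.
seattle = frozenset({"wa-1"})
spokane = frozenset({"wa-2"})
washington = frozenset({"wa-1", "wa-2", "wa-3"})  # wa-3: land outside any city
oregon = frozenset({"or-1", "or-2"})
pacific_northwest = washington | oregon


def strictly_contains(outer, inner):
    """True if inner lies wholly inside outer and outer has area left over."""
    return inner < outer


def is_union_of(whole, parts):
    """True if the parts together cover exactly the whole, with nothing left over."""
    return frozenset().union(*parts) == whole


# Washington contains its cities but is not merely their union.
print(strictly_contains(washington, seattle))
print(is_union_of(washington, [seattle, spokane]))

# In this toy data the Pacific Northwest is exactly the union of its states.
print(is_union_of(pacific_northwest, [washington, oregon]))
```

In this model a union also strictly contains each of its parts, so the two relations are not exclusive; what separates them is whether the parts exhaust the whole.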
> Issue 2: This will always be a problem. Perhaps there are standard
> descriptions of the entire world, but I am unaware of them (CIA
> World Factbook?), and they would necessarily be time mediated.
Indeed. Even if there were standard descriptions, there would also be
non-standard descriptions, such as for contested territories (or even
basic disagreement!).
> One way to describe such standards would be to import all the
> location topics with their definitions and link them to their
> sources, which would have a property for the date. For example,
> location entities of a political map could be described using FGDC
> compliant metadata. I've started modeling FGDC compliant metadata in
> my MapCentral base but have not yet made it to Entity and Attribute
> Information, which could include the location topics under the
> Enumerated domain (I think). See 2001 NLCD for an example of how I'm
> describing maps.
Source information would certainly be a good addition, though at some
cost in complexity (the same problem thorough use of metadata always
has). I'm afraid I'm out of my depth with the intricacies of GIS
standards, so I'm little help there!
I'm a little concerned that the use of attribution-style annotations
to turn wooly definitions into concrete ones (as opposed to
straightforward annotation of time-varying political maps from a
single source, for example) will result in either bad data or an
explosion of topics — Joe's Pacific Northwest — though I admit to
having no good solution for this. Perhaps only the more concrete end
of the spectrum should be encoded, and the rest should be a problem
for humans and tools.
> Issue 3: A system is needed to organize similar but different
> location topics to describe how they are similar and/or different.
> I'm using the "Code category" type of my Land Cover base to do this
> for land cover classes. See "Spruce-Fir forest" for an example.
> Here, I have a generic concept of a land cover category and am
> linking it to all the defined land cover codes/classes that fall
> under that umbrella, which are linked to their classification
> systems, which in turn are linked to relevant publication(s). The
> "Classification code" type provides properties to describe how the
> land cover codes/classes are similar and/or different (i.e., equals,
> overlaps, contains, contained by).
That seems a good way to approach classification. I'm still thinking
through how a similar approach might apply to collections of
locations; the thing that's tripping me up is trying to ensure that I
(and others) can still do what we want in Parallax and in simple MQL
queries, whilst still being vaguely correct.
Perhaps a different kind of location — "conceptual location" — which
is an umbrella concept for the various concrete instantiations, and
itself links to locations? I'm thinking something like
    "Northwestern United States" (concept)
      - sometimes considered to consist of:
          Washington, Oregon, Idaho, Montana, Wyoming
      - variants:
          "Broad Northwestern United States" (or unnamed)
            - location union: Washington, Oregon, Idaho, Montana, Wyoming
          "Narrow Northwestern United States"
            - location union: Washington, Oregon
That allows a user to find the concept they're looking for, then pick
a specific variant of the concept (which can be annotated with a time).
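A minimal sketch of that shape in Python, purely for illustration. The class and field names (`ConceptualLocation`, `LocationVariant`, `as_of`) are my own placeholders, not Freebase types or properties, and locations are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class LocationVariant:
    name: Optional[str]          # variants may be unnamed
    members: frozenset           # the "location union" for this variant
    as_of: Optional[str] = None  # optional time annotation (hypothetical field)


@dataclass
class ConceptualLocation:
    name: str
    variants: list = field(default_factory=list)

    def sometimes_includes(self):
        """Every location that appears in at least one variant."""
        return frozenset().union(*(v.members for v in self.variants))

    def always_includes(self):
        """Locations common to every variant (empty if there are no variants)."""
        sets = [v.members for v in self.variants]
        return frozenset.intersection(*sets) if sets else frozenset()


broad = LocationVariant(
    "Broad Northwestern United States",
    frozenset({"Washington", "Oregon", "Idaho", "Montana", "Wyoming"}),
)
narrow = LocationVariant(
    "Narrow Northwestern United States",
    frozenset({"Washington", "Oregon"}),
)
northwest = ConceptualLocation("Northwestern United States", [broad, narrow])

print(sorted(northwest.sometimes_includes()))
print(sorted(northwest.always_includes()))
```

The "sometimes considered to consist of" list falls out of the variants rather than being stored separately, which avoids the two drifting apart.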
Maybe I need more coffee before addressing this again :)
Thanks,
-R