[Data-modeling] help

Kirrily Robert kirrily at metaweb.com
Thu Jun 12 21:27:05 UTC 2008


Francois, you need to go to http://lists.freebase.com/mailman/listinfo/data-modeling 
  to unsubscribe, as described in the text below.

K.

On Jun 12, 2008, at 1:48 PM, François Lamotte wrote:

> On Thu, Jun 12, 2008 at 10:26 PM, <data-modeling-request at freebase.com> wrote:
>> Send Data-modeling mailing list submissions to
>>       data-modeling at freebase.com
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>       http://lists.freebase.com/mailman/listinfo/data-modeling
>> or, via email, send a message with subject or body 'help' to
>>       data-modeling-request at freebase.com
>>
>> You can reach the person managing the list at
>>       data-modeling-owner at freebase.com
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Data-modeling digest..."
>>
>>
>> Today's Topics:
>>
>>  1. Re: Tweaks to the food domain (Kirrily Robert)
>>  2. Re: Tweaks to the food domain (Kirrily Robert)
>>  3. Re: Thoughts on disease/treatment (Benjamin Good)
>>  4. Re: Tweaks to the food domain (Jeff Prucher)
>>  5. Re: Tweaks to the food domain (Tom Morris)
>>  6. Re: Tweaks to the food domain (John Giannandrea)
>>  7. Re: Thoughts on disease/treatment (Dan Ruderman)
>>  8. Re: Thoughts on disease/treatment (Benjamin Good)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Wed, 11 Jun 2008 16:24:38 -0700
>> From: Kirrily Robert <kirrily at metaweb.com>
>> Subject: Re: [Data-modeling] Tweaks to the food domain
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <2805EFC2-6E39-4779-91B8-570F032793E6 at metaweb.com>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> On Jun 11, 2008, at 3:17 PM, Jeff Prucher wrote:
>>
>>> Does anyone have any thoughts on the second part of my question,
>>> regarding the design of the whisk(e)y type?  I'm in favor of splitting
>>> a "distillery" type off of "whisky" (current instances are a mix of
>>> both), similar to the beer/brewery and wine/wine producer types.
>>
>>
>> +1
>>
>>
>> --
>> Kirrily Robert
>> Freebase Community Director
>> kirrily at metaweb.com
>> http://freebase.com/
>>
>>
>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 11 Jun 2008 16:25:08 -0700
>> From: Kirrily Robert <kirrily at metaweb.com>
>> Subject: Re: [Data-modeling] Tweaks to the food domain
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <11C71AAD-3246-4EBD-9595-36AF9338782C at metaweb.com>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> On Jun 11, 2008, at 3:26 PM, Christopher R. Maden wrote:
>>
>>> Jeff Prucher wrote:
>>>> Does anyone have any thoughts on the second part of my question,
>>>> regarding the design of the whisk(e)y type?  I'm in favor of splitting
>>>> a "distillery" type off of "whisky" (current instances are a mix of
>>>> both), similar to the beer/brewery and wine/wine producer types.
>>>
>>> +2
>>
>>
>> Hey!
>>
>> --
>> Kirrily Robert
>> Freebase Community Director
>> kirrily at metaweb.com
>> http://freebase.com/
>>
>>
>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 11 Jun 2008 16:52:11 -0700
>> From: Benjamin Good <ben.mcgee.good at gmail.com>
>> Subject: Re: [Data-modeling] Thoughts on disease/treatment
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <8F301606-D6B6-431A-A443-7EEFBFC69218 at gmail.com>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> They do indeed have an OWL version of this anatomical beast, though
>> its native representation is Protege frames.  There is a link to a
>> "lite" version of this ontology here: http://obofoundry.org/cgi-bin/detail.cgi?id=fma_lite
>>
>> I mention this here on the main list rather than in the Medicine
>> discussion forum because it's yet another example of a problem that I
>> think everyone in the semantic web community who is interested in
>> freebase (which really should be everyone) needs an answer to:
>> What is a good protocol for moving OWL ontologies into and out of
>> freebase?  Can this problem be dealt with in a domain/ontology
>> independent manner?
>>
>> I am facing this problem directly in my work on the entity describer
>> (www.entitydescriber.org).  One of the greatest selling points I've had
>> within the bioinformatics community is that I can use it as a generic
>> Gene Ontology (GO) annotation service by pointing the topic lookup at
>> the GO group type
>> (http://www.freebase.com/view/biology/gene_ontology_group)
>> (and that, as an indication of the already extensive community,
>> someone unknown to me had already imported the GO).  However, there
>> are a number of problems:
>>
>> 1) the GO import could be made much more useful through better
>> integration with the rest of freebase.  For example,
>> http://www.freebase.com/view/en/biological_reproduction
>> should very likely be linked to GO:reproduction
>> (http://www.freebase.com/view/guid/9202a8c04000641f800000000520ead4).
>> How might this be automated?
>>
>> 2) is there a mechanism to try to update the freebase data when the GO
>> is updated?
>>
>> 3) users of things like the GO really expect and depend on inferences
>> - especially hierarchical ones (isa, partof).  What is a good pattern
>> for representing hierarchy and executing queries across hierarchies
>> within the context of freebase?
>>
>> For the entity describer project we are currently debating whether to
>> establish (or find) a queryable repository of ontologies like the GO
>> for answering queries that require inference, or to try to get
>> everything done in the context of freebase.
>>
>> -Ben
>>
>>
>>
>>
>>
>>
>>
>>
>> On Jun 6, 2008, at 2:03 PM, Dan Ruderman wrote:
>>
>>> Hi Faye,
>>>
>>> Faye Li wrote:
>>>> I have a to-do item regarding anatomical structures affected by
>>>> diseases on my whiteboard but didn't get to it this time. The type
>>>> "Anatomical structure" exists today without any properties (see
>>>> http://www.freebase.com/tools/schema/medicine/anatomical_structure)
>>>> and the plan was to try to flesh out the schema there. Any
>>>> properties you'd suggest?
>>>>
>>> The fundamental concept I would add is a
>>> contains/part-of hierarchy to the structures.  I found
>>> this web site which has thought a lot of this through and
>>> may have an ontology which is importable (e.g. they
>>> have an OWL version):
>>> http://sig.biostr.washington.edu/projects/fm/AboutFM.html
>>> Populating the hierarchy might be tricky, though, unless
>>> the anatomical names are consistent with the freebase ones.
>>> My biggest concern would be that the hierarchy would be
>>> different in different organisms (e.g. cerebral cortex is part of
>>> the mammalian brain but not part of the reptilian brain), and
>>> I'm not sure if there is a ready source for that information.
>>> Because of these complexities I'd probably leave the properties
>>> as they are.
>>>
>>>
>>>> As for "Disease Cause" (which will be renamed etiology shortly,
>>>> with the original name saved as an alias), I was thinking about
>>>> refactoring the type. Each cause needs to be qualified with evidence
>>>> level, something along the lines of "evident, probable, possible". I
>>>> was also considering adding an enumeration property/new type for
>>>> etiology category that would list "bacterial, viral, chemical,
>>>> parasitic", etc. I would appreciate your expertise if you have time
>>>> to talk about this offline.
>>>>
>>> My knowledge is somewhat limited, but I enjoy thinking about
>>> these things and would be glad to chat about them.
>>>
>>> Dan
>>>
>>> _______________________________________________
>>> Data-modeling mailing list
>>> Data-modeling at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/data-modeling
>>
>>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Wed, 11 Jun 2008 17:02:45 -0700
>> From: "Jeff Prucher" <jeff at metaweb.com>
>> Subject: Re: [Data-modeling] Tweaks to the food domain
>> To: "'Freebase data modeling mailing list'"
>>       <data-modeling at freebase.com>
>> Message-ID: <003201c8cc1f$a4277180$bc01a8c0 at p4>
>> Content-Type: text/plain;       charset="US-ASCII"
>>
>> OK then!  Try this on for size:
>> http://sandbox.freebase.com/tools/schema/food/whisky
>>
>> (The big problem is really going to be reconciling with the topics
>> derived from Wikipedia -- most of the topics currently typed as
>> "whisky" are either distilleries or a combination article about both
>> a whisky and its distillery.)
>>
>> Jeff
>>
>>> -----Original Message-----
>>> From: data-modeling-bounces at freebase.com
>>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of
>>> Christopher R. Maden
>>> Sent: Wednesday, June 11, 2008 3:26 PM
>>> To: Freebase data modeling mailing list
>>> Subject: Re: [Data-modeling] Tweaks to the food domain
>>>
>>> Jeff Prucher wrote:
>>>> Does anyone have any thoughts on the second part of my question,
>>>> regarding the design of the whisk(e)y type?  I'm in favor of splitting
>>>> a "distillery" type off of "whisky" (current instances are a mix of
>>>> both), similar to the beer/brewery and wine/wine producer types.
>>>
>>> +2
>>>
>>>
>>> --
>>> Christopher R. Maden
>>> Data Architect
>>> Metaweb Technologies, Inc.
>>> <URL: http://www.metaweb.com/ >
>>> _______________________________________________
>>> Data-modeling mailing list
>>> Data-modeling at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/data-modeling
>>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Wed, 11 Jun 2008 21:00:47 -0400
>> From: "Tom Morris" <tfmorris at gmail.com>
>> Subject: Re: [Data-modeling] Tweaks to the food domain
>> To: "Freebase data modeling mailing list" <data-modeling at freebase.com>
>> Message-ID:
>>       <c5f3f16f0806111800m27925fe2x5ef1e6b802801428 at mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> I think the separation is useful, but are we talking about
>> distilleries or companies or brands?  Leaving aside for a minute the
>> fact that purists might not consider blended "whisky" to be whisky at
>> all, have a look at http://www.scotlandwhisky.com/about/brands.  The
>> Johnnie Walker brand's "spiritual home" is the Cardhu distillery, but
>> Ballantine has no identifiable distillery associated with it.  Neither
>> of them actually distills any spirits.
>>
>> Tom
>>
>> On Wed, Jun 11, 2008 at 8:02 PM, Jeff Prucher <jeff at metaweb.com> wrote:
>>> OK then!  Try this on for size:
>>> http://sandbox.freebase.com/tools/schema/food/whisky
>>>
>>> (The big problem is really going to be reconciling with the topics
>>> derived from Wikipedia -- most of the topics currently typed as
>>> "whisky" are either distilleries or a combination article about both
>>> a whisky and its distillery.)
>>>
>>> Jeff
>>
>>
>> ------------------------------
>>
>> Message: 6
>> Date: Wed, 11 Jun 2008 22:48:44 -0700
>> From: John Giannandrea <jg at metaweb.com>
>> Subject: Re: [Data-modeling] Tweaks to the food domain
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <56BADC2D-3C60-4D21-97FC-2829A448148A at metaweb.com>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>>
>> I'd suggest that the whisky is the brand, and if it's a blend it may or
>> may not list some of the contributing distilleries.
>> Whisky as a food or beverage will presumably acquire manufacturer or
>> distributor properties from elsewhere.
>> -jg
>>
>> Tom Morris wrote:
>>> I think the separation is useful, but are we talking about
>>> distilleries or companies or brands?  Leaving aside for a minute the
>>> fact that purists might not consider blended "whisky" to be whisky at
>>> all, have a look at http://www.scotlandwhisky.com/about/brands.  The
>>> Johnnie Walker brand's "spiritual home" is the Cardhu distillery, but
>>> Ballantine has no identifiable distillery associated with it.  Neither
>>> of them actually distills any spirits.
>>
>>
>> ------------------------------
>>
>> Message: 7
>> Date: Thu, 12 Jun 2008 11:16:35 -0700
>> From: Dan Ruderman <dan at appliedproteomics.com>
>> Subject: Re: [Data-modeling] Thoughts on disease/treatment
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <48516803.3040207 at appliedproteomics.com>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Hi Ben,
>>
>>
>> Benjamin Good wrote:
>>> 1) the GO import could be made much more useful through better
>>> integration with the rest of freebase.  For example,
>>> http://www.freebase.com/view/en/biological_reproduction
>>> should very likely be linked to GO:reproduction
>>> (http://www.freebase.com/view/guid/9202a8c04000641f800000000520ead4).
>>> How might this be automated?
>>>
>> I think this lies at the core of the reconciliation problem, and may
>> require curation by hand unless we can find some organization which
>> has already placed GO in this broader context.
>>> 2) is there a mechanism to try to update the freebase data when the
>>> GO is updated?
>>>
>> I had uploaded what was current as of 2007-02-18.  I had attempted to
>> track version information through the "Data Source" property on
>> each GO group, but it looks like that did not work as I had expected.
>> It would probably not be that difficult to determine the changes in
>> the GO since then and add/deprecate groups (and hopefully reference
>> information about the new version somehow).
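
The add/deprecate step Dan mentions amounts to a set difference over term identifiers, which might be sketched roughly as below. The function name `diff_terms` and the identifiers are placeholders, not taken from any GO release or from the Freebase schema.

```python
def diff_terms(old_ids, new_ids):
    """Compare two snapshots of term identifiers.

    Returns the identifiers that would need to be added (present only in
    the new snapshot) and deprecated (present only in the old snapshot).
    """
    old, new = set(old_ids), set(new_ids)
    return {
        "add": sorted(new - old),        # terms introduced since the old upload
        "deprecate": sorted(old - new),  # terms no longer in the new snapshot
    }


# Placeholder identifiers, purely for illustration.
uploaded = ["T1", "T2", "T3"]
current = ["T2", "T3", "T4"]
changes = diff_terms(uploaded, current)
```

This only detects terms that appear or disappear; renamed or merged terms would need extra handling that is not shown here.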
>>
>>> 3) users of things like the GO really expect and depend on
>>> inferences - especially hierarchical ones (isa, partof).  What is a
>>> good pattern for representing hierarchy and executing queries across
>>> hierarchies within the context of freebase?
>>>
>> One of the most attractive features of Freebase is the ease
>> with which hierarchies can be represented in terms of
>> properties.  I leveraged this concept when creating the
>> schema for GO.  This mechanism extends naturally to queries.
>> Could you be a bit more specific in stating the type of hierarchy
>> and queries which interest you?  Perhaps that could help the
>> list participants to generate some specific examples.
>>
>> best
>> Dan
>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 8
>> Date: Thu, 12 Jun 2008 13:25:58 -0700
>> From: Benjamin Good <ben.mcgee.good at gmail.com>
>> Subject: Re: [Data-modeling] Thoughts on disease/treatment
>> To: Freebase data modeling mailing list <data-modeling at freebase.com>
>> Message-ID: <CE4EE457-DB07-4024-A296-AE3689F2C8FE at gmail.com>
>> Content-Type: text/plain; delsp=yes; format=flowed; charset=US-ASCII
>>
>>
>> On Jun 12, 2008, at 11:16 AM, Dan Ruderman wrote:
>>
>>> Hi Ben,
>>>
>>>
>>> Benjamin Good wrote:
>>>> 1) the GO import could be made much more useful through better
>>>> integration with the rest of freebase.  For example,
>>>> http://www.freebase.com/view/en/biological_reproduction
>>>> should very likely be linked to GO:reproduction
>>>> (http://www.freebase.com/view/guid/9202a8c04000641f800000000520ead4).
>>>> How might this be automated?
>>>>
>>> I think this lies at the core of the reconciliation problem, and may
>>> require curation by hand unless we can find some organization which
>>> has already placed GO in this broader context.
>>
>> I tend to agree.  The ideal case would be for an organization like the
>> GO consortium itself or the OBO foundry that it begot to take charge
>> of this task.  That being said, I bet there are quite a few examples
>> where direct correspondence could be determined automatically at
>> import.  (This has got to be something the data harvesting teams at
>> freebase know quite a lot about.)
>>
>>
>>>
>>>> 2) is there a mechanism to try to update the freebase data when the
>>>> GO is updated?
>>>>
>>> I had uploaded what was current as of 2007-02-18.  I had attempted to
>>> track version information through the "Data Source" property on
>>> each GO group, but it looks like that did not work as I had expected.
>>> It would probably not be that difficult to determine the changes in
>>> the GO since then and add/deprecate groups (and hopefully reference
>>> information about the new version somehow).
>>>
>>>> 3) users of things like the GO really expect and depend on
>>>> inferences - especially hierarchical ones (isa, partof).  What is a
>>>> good pattern for representing hierarchy and executing queries across
>>>> hierarchies within the context of freebase?
>>>>
>>> One of the most attractive features of Freebase is the ease
>>> with which hierarchies can be represented in terms of
>>> properties.  I leveraged this concept when creating the
>>> schema for GO.  This mechanism extends naturally to queries.
>>> Could you be a bit more specific in stating the type of hierarchy
>>> and queries which interest you?  Perhaps that could help the
>>> list participants to generate some specific examples.
>>
>> Sure.  I'd like to find all gene groups (and all genes when the
>> associations get into freebase) that have been annotated with any
>> Cellular Component or with any component of the cytoplasm.  This is
>> basically the same as a query for all of the organism classifications
>> lower than a particular classification.  I can see how to request
>> hierarchical chains when I know how many levels I want to traverse,
>> but I would also like to be able to handle queries when I don't know
>> in advance how far down they will go.  This could be solved through an
>> iterative request but, as it seems like such a general case, it would
>> be great if there were some sort of standard way to deal with it.  Here
>> is MQL for getting several levels down from 'bird' or 'Aves'.
>>
>> [
>>  {
>>    "lower_classifications" : [
>>      {
>>        "lower_classifications" : [
>>          {
>>            "lower_classifications" : [
>>              {
>>                "lower_classifications" : [
>>                  {
>>                    "lower_classifications" : []
>>                  }
>>                ],
>>                "name" : null
>>              }
>>            ],
>>            "name" : null
>>          }
>>        ],
>>        "name" : null
>>      }
>>    ],
>>    "name" : null,
>>    "scientific_name" : "Aves",
>>    "type" : "/biology/organism_classification"
>>  }
>> ]
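
A rough sketch of the two options Ben describes, in Python: generating the fixed-depth query for any chosen depth, and the iterative walk for when the depth is not known in advance. The names `nested_query`, `descendants`, `fetch_children`, and the `TOY` data are all hypothetical; `fetch_children` stands in for whatever single-level read a client would issue, and nothing here calls the real Freebase API.

```python
from collections import deque


def nested_query(depth, root_filter):
    """Build a query structure that descends `depth` levels.

    Mirrors the shape of the hand-written query above: each level asks
    for "name" and a nested "lower_classifications" list.
    """
    node = {"name": None, "lower_classifications": []}
    for _ in range(depth - 1):
        node = {"name": None, "lower_classifications": [node]}
    node.update(root_filter)  # constraints that identify the root topic
    return [node]


def descendants(root, fetch_children):
    """Breadth-first walk for unknown depth: one request per visited node.

    `fetch_children(name)` is an assumed callable returning the names
    directly below `name`.
    """
    seen = {root}
    queue = deque([root])
    found = []
    while queue:
        current = queue.popleft()
        for child in fetch_children(current):
            if child not in seen:  # guard against cycles in messy data
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Toy in-memory hierarchy standing in for query results (not real data).
TOY = {"root": ["a", "b"], "a": ["c"], "c": ["d"]}

query = nested_query(2, {"scientific_name": "Aves",
                         "type": "/biology/organism_classification"})
names = descendants("root", lambda name: TOY.get(name, []))
```

The iterative version trades one large request for many small ones, which is the cost Ben is pointing at when he asks for a standard way to express this in a single query.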
>>
>> In an answer to a previous inquiry about subsumption in freebase, John
>> Giannandrea said "So why dont we support strict type inheritance?
>> Well because real world data is messy."  OK, cool, I'm not
>> suggesting that Freebase should support inheritance at the level of
>> Types, but why not explicitly support the general case of what he
>> calls "phylogeny patterns"?  Though real world data is indeed very
>> messy, there is a lot of very valuable, fairly clean data (e.g. the
>> GO, the FMA, etc.) that would be much more easily merged into freebase
>> given a definition (even just a consensus agreement) of what the
>> general-purpose broader-than/narrower-than relationship should be and
>> how it should be interacted with in MQL.
>>
>> -Ben
>>
>> ------------------------------
>>
>> _______________________________________________
>> Data-modeling mailing list
>> Data-modeling at freebase.com
>> http://lists.freebase.com/mailman/listinfo/data-modeling
>>
>>
>> End of Data-modeling Digest, Vol 10, Issue 11
>> *********************************************
>>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling

-- 
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com
http://freebase.com/





