From kirrily at metaweb.com Mon Jun 1 18:52:58 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 1 Jun 2009 11:52:58 -0700 Subject: [Data-modeling] Feedback on RABJ feature In-Reply-To: References: <3BC1794D-CCA5-4B37-AA09-224AA3F849D5@metaweb.com> Message-ID: <67EB1F42-9A75-4B31-8ECF-102265F42BFD@metaweb.com> On May 30, 2009, at 10:04 AM, Spencer Kelly wrote: > is there an official count going somewhere? we should have a party/ > get slashdotted at 100% - isn't it somewhat close? i'll make one if > there isn't one... Bryan just ran a report. Looks like there are 1.1m untyped topics, or about 20%. I've asked him to pull out a sample of them so we can see what sort of topics there are. But it would be AWESOME to have a community effort for 100%. What can we do to encourage this effort? K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From spencerkelly86 at gmail.com Mon Jun 1 22:31:54 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Mon, 1 Jun 2009 19:31:54 -0300 Subject: [Data-modeling] Linking Freebase tables to Wikipedia articles In-Reply-To: <4A22F70F.7080104@maden.org> References: <4A22F70F.7080104@maden.org> Message-ID: i talked to brian karlak about this a few weeks ago. i was suggesting using wex to make a public list of wikipedia articles with juicy lists and tables .. like a to-do list. so when someone imports it they cross it out or whatever. never got made, - wex is too big for me. http://www.freebase.com/discuss/threads/guid/9202a8c04000641f800000000b19e3e7 love the idea, lots of people are working on the same task. we should organize it./ :) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090601/a260f91a/attachment.htm From jack.alves at gmail.com Wed Jun 3 13:45:19 2009 From: jack.alves at gmail.com (Jack Alves) Date: Wed, 3 Jun 2009 06:45:19 -0700 Subject: [Data-modeling] attribution Message-ID: <554723b0906030645q2fe57c33l91df0695645222@mail.gmail.com> I noticed creator and attribution properties are set to the user when data is created via freebase.com. Are there any cases when they would be different when data is created using freebase.com? What is the intended use for the attribution property? Can anything be inserted in that property? Could it be used to store a webpage object id or url that asserts the data is true? Is attribution intended only for cases where data sets get inhaled like wikipedia or musicbrainz data? This topic is slightly related to something I want to try soon. For every link, I want a way for any user to indicate whether they believe the assertion to be true or false. I imagine this could be accomplished by pointing a protected user or group object to the link. How should something like this be modeled? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090603/bbe03e6d/attachment-0001.htm From zenkat at metaweb.com Wed Jun 3 19:41:50 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Wed, 3 Jun 2009 12:41:50 -0700 Subject: [Data-modeling] attribution In-Reply-To: <554723b0906030645q2fe57c33l91df0695645222@mail.gmail.com> References: <554723b0906030645q2fe57c33l91df0695645222@mail.gmail.com> Message-ID: On Jun 3, 2009, at 6:45 AM, Jack Alves wrote: > I noticed creator and attribution properties are set to the user > when data is created via freebase.com. Are there any cases when > they would be different when data is created using freebase.com? When writing via freebase.com, the attribution should always show the writing user. 
I do not believe there is any mechanism available in standard MQL to change this.

> What is the intended use for the attribution property? Can anything
> be inserted in that property? Could it be used to store a webpage
> object id or url that asserts the data is true? Is attribution
> intended only for cases where data sets get inhaled like wikipedia
> or musicbrainz data?

Some of our large internal data loads, such as the Wikipedia update, use a specialized "attribution" node that is cotyped with /type/attribution and /dataworld/provenance. The properties of these types contain information about the data load, including the information source and the software tool used for loading.

However, the attribution mechanism is not generally available for annotating links, nor is it accessible to users via freebase.com.

> This topic is slightly related to something I want to try soon. For
> every link, I want a way for any user to indicate whether they
> believe the assertion to be true or false. I imagine this could be
> accomplished by pointing a protected user or group object to the
> link. How should something like this be modeled?

Currently, we do not support any mechanism for adding properties to links -- with the exception of the sort index.

In any case, attribution is probably not the correct way of doing this, as re-attributing to a "protected user" would obscure the original provenance of the link.

Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090603/376ddb21/attachment.htm

From jeff at metaweb.com Wed Jun 3 20:43:54 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Wed, 3 Jun 2009 13:43:54 -0700
Subject: [Data-modeling] Changes to disambiguating properties on CVTs
Message-ID: <9A0C2D8C4C834FBEAC437F27BF0BFCA9@p4>

In conjunction with some client changes, we'd like to change (slightly) the behavior of properties on compound value types.
Properties on CVTs can have one of three states: disambiguator, not a disambiguator, and hidden. In the current client, the "not a disambiguator" properties can only be seen in table views, and are particularly hard to edit through the client.

The proposal is to change the client's default behavior with CVT properties. All properties on a CVT would be displayed on the topics that expect the CVT, unless they are marked as hidden (that is to say, all non-hidden properties will behave the way disambiguating properties currently do). Disambiguators would take on slightly different semantics, and would be used to identify the primary or most important properties on the CVT. For example, on the Education CVT, the most important properties are Student and Institution; for Film Performance they would probably be Actor, Film, and Character. This only affects CVTs; no change would be made to the way properties are handled on standard and enumerated types.

These changes will help us to build better display logic into the client. We'll have greater ability to highlight key properties in the client through interface cues and more intelligent property ordering on key templates such as the topic and view page.

The change to client code will only really affect CVTs with non-disambiguating properties that should instead be hidden. We'd hope to clean these up in the commons types before the code change. Users are encouraged to review their own CVT types and flag as hidden any non-disambiguating properties that they do not want displayed. A later step would be to review existing CVTs and identify the primary properties.

We hope to make these changes along with next week's client release on Tuesday (June 9).

Jeff Prucher
Type Librarian & Ontologist
Metaweb Technologies, Inc.
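
For anyone who finds the display rule easier to follow as code, here is a minimal sketch in Python. Only the three state names and the Student/Institution example come from the message above; the `CvtPropertyState` enum, the function names, and the `degree` and `internal_note` properties are hypothetical and are not taken from the actual client code.

```python
from enum import Enum


class CvtPropertyState(Enum):
    """The three states a property on a CVT can have (identifiers are illustrative)."""
    DISAMBIGUATOR = "disambiguator"
    NOT_DISAMBIGUATOR = "not a disambiguator"
    HIDDEN = "hidden"


def shown_on_topic_current(state):
    # Current client: only disambiguators appear on topics that expect the CVT.
    return state is CvtPropertyState.DISAMBIGUATOR


def shown_on_topic_proposed(state):
    # Proposed client: every property appears unless it is marked hidden.
    return state is not CvtPropertyState.HIDDEN


def is_primary(state):
    # Proposed semantics: disambiguators mark the most important properties.
    return state is CvtPropertyState.DISAMBIGUATOR


# Hypothetical Education CVT; only Student and Institution come from the message.
education = {
    "student": CvtPropertyState.DISAMBIGUATOR,
    "institution": CvtPropertyState.DISAMBIGUATOR,
    "degree": CvtPropertyState.NOT_DISAMBIGUATOR,
    "internal_note": CvtPropertyState.HIDDEN,
}

current = [p for p, s in education.items() if shown_on_topic_current(s)]
proposed = [p for p, s in education.items() if shown_on_topic_proposed(s)]
primary = [p for p, s in education.items() if is_primary(s)]

print(current)
print(proposed)
print(primary)
```

Under this reading, a non-disambiguating property such as the hypothetical `degree` starts appearing on topics after the change, which is why any property that should stay out of view needs to be flagged as hidden beforehand.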
From iainsproat at gmail.com Thu Jun 4 13:20:18 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Thu, 4 Jun 2009 17:20:18 +0400 Subject: [Data-modeling] attribution In-Reply-To: References: <554723b0906030645q2fe57c33l91df0695645222@mail.gmail.com> Message-ID: > a webpage object id or url that asserts the data is true? to nitpick, a source would assert that the data is verifiable, but not necessarily true (e.g. sources for the Aliens base). Wikipedia's got a comprehensive write up on verification http://en.wikipedia.org/wiki/Wikipedia:V > For every link, I want a way for any user to indicate whether they believe the assertion to be true or false. I imagine this could be > accomplished by pointing a protected user or group object to the link +1 to the idea, but in practice this meta-meta-data could get complex. I threw together a *prototype* schema of how this might happen, but please note that this *DOES NOT WORK* in the client. (doesn't show links) A verification type linking a link to a source of verification, and also to user reviews of this source. A verification sourcetype, which can be added to a website, written work etc.. User's can review if this is reliable or not. A verification reviewtype which would allow users to check out a verification/source and leave a report. A verification statustype which would allow users to review if it is verified, reliable, not present or whatever... Iain On Wed, Jun 3, 2009 at 11:41 PM, Brian Karlak wrote: > > > > On Jun 3, 2009, at 6:45 AM, Jack Alves wrote: > > I noticed creator and attribution properties are set to the user when data > is created via freebase.com. Are there any cases when they would be > different when data is created using freebase.com? > > > When writing via freebase.com, the attribution should always show the > writing user. I do not believe there is any mechanism available standard > MQL to change this, > > What is the intended use for the attribution property? 
Can anything be > inserted in that property? Could it be used to store a webpage object id or > url that asserts the data is true? Is attribution intended only for cases > where data sets get inhaled like wikipedia or musicbrainz data? > > > Some of our large internal data loads, such as the wikipedia update, use a > specialized "attribution" node that is cotyped with /type/attribution and > /dataworld/provenance. The properties of these type contain information > about the dataload, including the information source and software tool used > for loading. > > However, the attribution mechanism is not generally available for > annotating links in general, nor is it accessible to users via > freebase.com. > > This topic is slightly related to something I want to try soon. For every > link, I want a way for any user to indicate whether they believe the > assertion to be true or false. I imagine this could be accomplished by > pointing a protected user or group object to the link. How should something > like this be modeled? > > > Currently, we do not support any mechanisms for adding properties to links > -- except for the exception of sort index. > > In any case, attribution is probably not the correct way of doing this, as > re-attributing to a "protected user" would obscure the original provenance > of the link. > > Brian > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090604/496755fa/attachment.htm

From brendan at metaweb.com Thu Jun 4 23:22:34 2009
From: brendan at metaweb.com (brendan)
Date: Thu, 4 Jun 2009 16:22:34 -0700
Subject: [Data-modeling] remove /architecture/tower type
Message-ID: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com>

At Kirrily and sprocketonline's suggestion, I am going to remove this type. It has no properties and is too vague a category of structure to merit its own type. The structure type covers the important properties. The 12 existing towers will be de-typed manually (they will still be structures). Any objections?

Brendan

From tigre119 at gmail.com Thu Jun 4 23:45:17 2009
From: tigre119 at gmail.com (tigre119 at gmail.com)
Date: Fri, 5 Jun 2009 09:45:17 +1000
Subject: [Data-modeling] remove /architecture/tower type
In-Reply-To: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com>
References: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com>
Message-ID: <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com>

Hi Brendan, I am Tim.

I disagree and think 'towers' is an important class of objects to keep. They are a distinct kind of structure, such as the Eiffel Tower, and now there are thousands of mobile phone towers and radio transmission towers, as well as many tower-type skyscrapers in modern cities. They are a class of objects in their own right, in my view.

Why do you want to remove the class? I am new; I thought the more the merrier was the approach. 'Structure' is a very broad term: sentences, stories and thoughts have structures. 'Building' is another term; however, 'tower' means a type of building or construction. Is it better to expand the tower class rather than remove it?

My freebase is about marsupials and I want to fine-tune it; in fact it needs to contain all available info about marsupials.
How can I set my freebase so it gathers data automatically and can it index it for me itself after collecting data from round the web? So that queries automatically gather and present the details about any species of marsupials and associated plants and animals, from a range of data bases such as Berkley Uni, Smithsonian inst., DBPedia and others. the data results are needed as a mash-up result from several sources, displayed as a form or a webpage. On Fri, Jun 5, 2009 at 9:22 AM, brendan wrote: > At Kirrily and sprocketonline's suggestion, I am going to remove this > type. It has no properties and is too vague a category of structure > to merit it's own type. The structure type covers the important > properties. The 12 existing tower's will be de-typed manually (they > will still be structure's) Any objections? > > Brendan > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090605/72c8ce2f/attachment-0001.htm From spatial.db at gmail.com Fri Jun 5 02:30:51 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Thu, 4 Jun 2009 22:30:51 -0400 Subject: [Data-modeling] Marsupials Message-ID: On Thu, Jun 4, 2009 at 7:45 PM, wrote: > Hi brendan > i am tim [snip] > > My freebase is about marsupials and i want to fine-tune it, in fact it > needs to contain all available info about marsupials. > How can I set my freebase so it gathers data automatically and can it index > it for me itself after collecting data from round the web? > So that queries automatically gather and present the details about any > species of marsupials and associated plants and animals, from a range of > data bases such as Berkley Uni, Smithsonian inst., DBPedia and others. 
> the data results are needed as a mash-up result from several sources,
> displayed as a form or a webpage.

Hi Tim,

There's a lot of data that will be useful to you already in Freebase (e.g., Organism classification). You may want to check out the Biology, LitCentral, Conservation Action, Bird Conservation, and Bird Infobases for ideas on modeling marsupials. In particular, the Biology schema and most of the LitCentral and Conservation Action schema were modeled to be relevant to any taxa.

There are many ways to get data into Freebase. However, linking or adding data from other database systems will likely require some work on your part modeling compatible Freebase schema, as well as obtaining and formatting the data or keys from other sources. Note that these other data sets must be under Creative Commons or compatible licensing. There are ways to automate the import of data into Freebase, but you will need to program most of them yourself. The help files accessible at the top of any Freebase page are a great place to start learning more.

-Ed
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090604/dde9248f/attachment.htm

From tigre119 at gmail.com Fri Jun 5 09:56:21 2009
From: tigre119 at gmail.com (tigre119 at gmail.com)
Date: Fri, 5 Jun 2009 19:56:21 +1000
Subject: [Data-modeling] Marsupials
In-Reply-To:
References:
Message-ID: <9e406a50906050256v4880748bqcf7636b7f7914685@mail.gmail.com>

Hi Brendan, thanks for your reply; that will be a great help. I know that, to gather data in Mozilla XUL, a 'template' has a query and rules to lazy-generate data parsed from multiple RDF datasources.

On Fri, Jun 5, 2009 at 12:30 PM, Ed Laurent wrote:
>
> On Thu, Jun 4, 2009 at 7:45 PM, wrote:
>
>> Hi brendan
>> i am tim
>
> [snip]
>
>>
>> My freebase is about marsupials and i want to fine-tune it, in fact it
>> needs to contain all available info about marsupials.
>> How can I set my freebase so it gathers data automatically and can it >> index it for me itself after collecting data from round the web? >> So that queries automatically gather and present the details about any >> species of marsupials and associated plants and animals, from a range of >> data bases such as Berkley Uni, Smithsonian inst., DBPedia and others. >> the data results are needed as a mash-up result from several sources, >> displayed as a form or a webpage. > > > Hi Tim, > > There's a lot of data that will be useful to you already in Freebase (e.g., > Organism classification). > You may want to check out the Biology, > LitCentral , Conservation Action, > Bird Conservation , and Bird Infobases for ideas on modeling marsupials. In particular, the Biology schema > and most of the LitCentral and Conservation Action schema were modeled to be > relevant to any taxa. > > There are many ways to get data into Freebase. However, linking or adding > data from other database systems will likely require some work on your part > modeling compatible Freebase schema, as well as obtaining and formatting the > data or keys from other sources. Note that these other data sets must be > under Creative Commons or compatible licensing. There are ways to automate > the import of data into Freebase but you will need to program most of them > yourself. The help files accessible at the > top of any Freebase page are a great place to start learning more. > > -Ed > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090605/9e21a447/attachment.htm

From iainsproat at gmail.com Fri Jun 5 12:20:24 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Fri, 5 Jun 2009 16:20:24 +0400
Subject: [Data-modeling] remove /architecture/tower type
In-Reply-To: <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com>
References: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com> <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com>
Message-ID:

Tim,

Structure is found in the Architecture commons, so it is specific to architectural structures and is applicable to most things from aqueducts to zoos (including broadcast towers and skyscrapers).

The main criterion for adding a new type is whether it has any properties that aren't covered anywhere else, e.g. the floor area property in the building type or span length in the bridge type. The tower type doesn't have any properties, which is a good indicator that the schema wasn't appropriate.

My first thought was to try to improve the type, but I couldn't think of a property that wasn't already covered by the structure type and was also appropriate to the wide range of purposes of towers, from historical fortress towers to communication towers and everything in between. But that's not to say there isn't a property! If anyone can think of a property relevant to the tower type, please reply to this discussion.

That said, there are properties appropriate for very specific types of structures which aren't already covered by existing types, e.g. the number of microwave transmitters on a TV broadcast tower. If you would like to model this, you can create a new "Communication tower" type. See the help page on creating schemas and feel free to submit a question here or on the discussion boards.

If the type is deleted, you can still search for tall towers by using a view to select structures above 250m in height, for example.
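
The height-based search could probably also be written as an MQL read query. Here is a rough sketch in Python that only builds and prints the query, without sending it anywhere. The property name `height_meters` on /architecture/structure and the helper name `tall_structures_query` are assumptions for illustration, so please check them against the actual schema before relying on this.

```python
import json


def tall_structures_query(min_height_m):
    """Build an MQL-style read query for structures taller than min_height_m.

    The property name "height_meters" is an assumption, not confirmed
    against the /architecture/structure schema.
    """
    return [{
        "type": "/architecture/structure",
        "name": None,                    # ask for the name to be returned
        "height_meters": None,           # ask for the height to be returned
        "height_meters>": min_height_m,  # comparison constraint on the same property
    }]


query = tall_structures_query(250)
print(json.dumps(query, indent=2))
```

The point is that "tall tower" becomes a constraint on an existing structure property rather than a separate type.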
Iain (sprocketonline) On Fri, Jun 5, 2009 at 3:45 AM, wrote: > Hi brendan > i am tim > I disagree and think 'towers' is an important class of objects to keep. > In the sense that they are a distinct structure such as the Eiffel tower > and now there are thousands of mobile phone towers and radio transmission > towers as well as many tower type sky scrapers in modern cities. > They are a class of objects in their own right in my view. > Why do you want to remove the class? > I am new I thought the more the merrier was the approach. > Structures is a very broad term, sentences, stories and thoughts have > structures. > Buildings is another term, however 'tower' means a type of building or > construction. > Is it better to expand the tower class rather than remove it? > > My freebase is about marsupials and i want to fine-tune it, in fact it > needs to contain all available info about marsupials. > How can I set my freebase so it gathers data automatically and can it index > it for me itself after collecting data from round the web? > So that queries automatically gather and present the details about any > species of marsupials and associated plants and animals, from a range of > data bases such as Berkley Uni, Smithsonian inst., DBPedia and others. > the data results are needed as a mash-up result from several sources, > displayed as a form or a webpage. > > > > > On Fri, Jun 5, 2009 at 9:22 AM, brendan wrote: > >> At Kirrily and sprocketonline's suggestion, I am going to remove this >> type. It has no properties and is too vague a category of structure >> to merit it's own type. The structure type covers the important >> properties. The 12 existing tower's will be de-typed manually (they >> will still be structure's) Any objections? 
>> >> Brendan >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling >> > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090605/9dff30d3/attachment.htm From brendan at metaweb.com Fri Jun 5 16:54:21 2009 From: brendan at metaweb.com (brendan) Date: Fri, 5 Jun 2009 09:54:21 -0700 Subject: [Data-modeling] remove /architecture/tower type In-Reply-To: References: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com> <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com> Message-ID: Tim, I created the type a long time ago, so my intuition about towers mirrored yours. But I've been convinced otherwise. I agree with Iain on the criteria for creating a new type. The wikipedia article "Tower" opens with: "Towers are tall human-made structures that are almost always taller than they are wide, usually by a significant margin" dictionary definitions don't go much further... which doesn't inspire a lot of confidence as far as precise definitions go. So, to sum it up, "tower" seems like a specific something but I failed to find evidence that it actually is :) As ridiculous as they might sound. If a person searching freebase is really stuck on finding things that people call towers, they are better off just searching for "tower" in the name. I do, however, support addition of types for some of the specific things you mentioned. Brendan On Jun 5, 2009, at 5:20 AM, Iain Sproat wrote: > Tim, > > Structure is found in the Architecture commons, so is specific to > architectural structures and is applicable to most things from > aqueducts to zoos (including broadcast towers and skyscrapers). 
> > The main criteria for adding a new type is whether there are any > properties in it that aren't covered anywhere else. e.g. Floor area > property in the building type or span length in the bridge type. > The tower type doesn't have any properties, which is a good > indicator that the schema wasn't appropriate. > > My first thought was to try and improve the type, but I couldn't > think of a property that wasn't already covered by the structure > type and was also appropriate to the wide range of purposes of > towers, from historical fortress towers, to communication towers and > everything in between. But that's not to say there isn't a > property! If anyone can think of a property relevant to the tower > type, please reply to this discussion. > > That said, there are properties appropriate for very specific types > of structures which aren't already covered by existing types, e.g. > number of microwave transmitters on a TV broadcast tower. If you > want, you can create a new "Communication tower" type if you would > like to model this. See the help page on creating schemas and feel > free to submit a question here or on the discussion boards. > > If the type is deleted, you can still search for tall towers by > using a view to select structures above 250m in height, for example. > > Iain (sprocketonline) > > On Fri, Jun 5, 2009 at 3:45 AM, wrote: > Hi brendan > i am tim > I disagree and think 'towers' is an important class of objects to > keep. > In the sense that they are a distinct structure such as the Eiffel > tower and now there are thousands of mobile phone towers and radio > transmission towers as well as many tower type sky scrapers in > modern cities. > They are a class of objects in their own right in my view. > Why do you want to remove the class? > I am new I thought the more the merrier was the approach. > Structures is a very broad term, sentences, stories and thoughts > have structures. 
> Buildings is another term, however 'tower' means a type of building > or construction. > Is it better to expand the tower class rather than remove it? > > My freebase is about marsupials and i want to fine-tune it, in fact > it needs to contain all available info about marsupials. > How can I set my freebase so it gathers data automatically and can > it index it for me itself after collecting data from round the web? > So that queries automatically gather and present the details about > any species of marsupials and associated plants and animals, from a > range of data bases such as Berkley Uni, Smithsonian inst., DBPedia > and others. > the data results are needed as a mash-up result from several > sources, displayed as a form or a webpage. > > > > > On Fri, Jun 5, 2009 at 9:22 AM, brendan wrote: > At Kirrily and sprocketonline's suggestion, I am going to remove this > type. It has no properties and is too vague a category of structure > to merit it's own type. The structure type covers the important > properties. The 12 existing tower's will be de-typed manually (they > will still be structure's) Any objections? > > Brendan > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090605/acca3b12/attachment-0001.htm

From iainsproat at gmail.com Fri Jun 5 17:54:22 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Fri, 5 Jun 2009 21:54:22 +0400
Subject: [Data-modeling] [BULK] Re: remove /architecture/tower type
In-Reply-To:
References: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com> <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com>
Message-ID:

I would support having a separate type for communications towers (and another for fortification towers if required). Skyscrapers are already covered. Anyone searching for towers could run a query pulling those 3 types together.

Would it be better if we renamed 'tower' to 'communication tower' (and added relevant properties, e.g. number of antennae, broadcast spectrum, etc.)?

A rough definition as used in structural engineering: the difference between a tower and a normal structure depends on how dynamic the building is, i.e. how much it moves with the wind, earthquakes, etc. This defines the difference between the Great Pyramid of Giza (138 m tall, not a tower) and the Montjuic communications tower (136 m tall and a tower). Dynamic properties of a building are calculated using the height as one of many factors.

Iain

On Fri, Jun 5, 2009 at 8:54 PM, brendan wrote:
> Tim,
> I created the type a long time ago, so my intuition about towers mirrored
> yours. But I've been convinced otherwise. I agree with Iain on the criteria
> for creating a new type. The wikipedia article "Tower" opens with:
>
> "*Towers* are tall human-made structures that
> are almost always taller than they are wide, usually by a significant
> margin"
>
> dictionary definitions don't go much further...
>
> which doesn't inspire a lot of confidence as far as precise definitions go.
> So, to sum it up, "tower" seems like a specific something but I failed to
> find evidence that it actually is :) As ridiculous as they might sound.
If > a person searching freebase is really stuck on finding things that people > call towers, they are better off just searching for "tower" in the name. I > do, however, support addition of types for some of the specific things you > mentioned. > > Brendan > > On Jun 5, 2009, at 5:20 AM, Iain Sproat wrote: > > Tim, > Structure is found > in the Architecture commons, > so is specific to architectural structures and is applicable to most things > from aqueducts to > zoos (including broadcast > towers and skyscrapers > ). > > The main criteria for adding a new type is whether there are any properties > in it that aren't covered anywhere else. e.g. Floor area property in the > building type or span > length in the bridge type. The tower type doesn't have any properties, which is a good indicator > that the schema wasn't appropriate. > > My first thought was to try and improve the type, but I couldn't think of a > property that wasn't already covered by the structure type and was also > appropriate to the wide range of purposes of towers, from historical > fortress towers , to communication > towers and everything in > between. But that's not to say there isn't a property! If anyone can think > of a property relevant to the tower type, please reply to this discussion. > > That said, there are properties appropriate for very specific types of > structures which aren't already covered by existing types, e.g. number of > microwave transmitters on a TV broadcast tower. If you want, you can create > a new "Communication tower" type if you would like to model this. See the > help page > on creating schemas and feel free to submit a question here or on the discussion > boards . > > If the type is deleted, you can still search for tall towers by using a view > to select structures above 250m in > height, for example. 
> > Iain (sprocketonline) > > On Fri, Jun 5, 2009 at 3:45 AM, wrote: > >> Hi brendan >> i am tim >> I disagree and think 'towers' is an important class of objects to keep. >> In the sense that they are a distinct structure such as the Eiffel tower >> and now there are thousands of mobile phone towers and radio transmission >> towers as well as many tower type sky scrapers in modern cities. >> They are a class of objects in their own right in my view. >> Why do you want to remove the class? >> I am new I thought the more the merrier was the approach. >> Structures is a very broad term, sentences, stories and thoughts have >> structures. >> Buildings is another term, however 'tower' means a type of building or >> construction. >> Is it better to expand the tower class rather than remove it? >> >> My freebase is about marsupials and i want to fine-tune it, in fact it >> needs to contain all available info about marsupials. >> How can I set my freebase so it gathers data automatically and can it >> index it for me itself after collecting data from round the web? >> So that queries automatically gather and present the details about any >> species of marsupials and associated plants and animals, from a range of >> data bases such as Berkley Uni, Smithsonian inst., DBPedia and others. >> the data results are needed as a mash-up result from several sources, >> displayed as a form or a webpage. >> >> >> >> >> On Fri, Jun 5, 2009 at 9:22 AM, brendan wrote: >> >>> At Kirrily and sprocketonline's suggestion, I am going to remove this >>> type. It has no properties and is too vague a category of structure >>> to merit it's own type. The structure type covers the important >>> properties. The 12 existing tower's will be de-typed manually (they >>> will still be structure's) Any objections? 
>>> >>> Brendan >>> _______________________________________________ >>> Data-modeling mailing list >>> Data-modeling at freebase.com >>> http://lists.freebase.com/mailman/listinfo/data-modeling >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090605/1a6c67bc/attachment.htm From jeff at metaweb.com Fri Jun 5 18:37:51 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Fri, 5 Jun 2009 11:37:51 -0700 (PDT) Subject: [Data-modeling] Music Festival type: delete? Message-ID: <251339503.66491244227071373.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> http://www.freebase.com/view/music/festival This is an old type, with no properties of its own. It's causing some problems because "Event" is an included type, even though the type is used for both recurring and one-off events. There are newer types for Concert and Concert Tour, which cover similar ground, and there has been a request to delete the Music Festival type. There's a discussion with further background here: http://www.freebase.com/view/guid/9202a8c04000641f800000000b20b737 What do people think?
Jeff From jeff at metaweb.com Fri Jun 5 23:25:08 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Fri, 5 Jun 2009 16:25:08 -0700 (PDT) Subject: [Data-modeling] Refactoring appointees out of Government Position Held Message-ID: <1505574559.68471244244308249.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> The Government Position Held CVT includes a property for "appointed by" that expects the "political appointer" type. This is creating a lot of problematic data because the Government Position Held schema only really works with legislatures and executive positions (president, prime minister, monarch, etc.), but users are putting in people appointed to cabinets, ministries, departments, agencies, embassies, and similar types of entities. To deal with this, we'd like to remove the "appointed by" property entirely, and use a new schema to deal with all kinds of appointments (including, but not limited to, government). The new appointments schema is currently in a base, but will be promoted to the commons as part of this refactoring: http://appointments.freebase.com/. Please speak up if this will affect any of your projects. Jeff From tfmorris at gmail.com Sat Jun 6 13:09:54 2009 From: tfmorris at gmail.com (Tom Morris) Date: Sat, 6 Jun 2009 09:09:54 -0400 Subject: [Data-modeling] [BULK] Re: remove /architecture/tower type In-Reply-To: References: <6003565A-5852-45EB-97E7-6235D38D9C2D@metaweb.com> <9e406a50906041645r32489074me944f27f3e59e619@mail.gmail.com> Message-ID: Personally I don't have a problem with propertyless types, particularly when they map to a commonly understood human classification of things, but if we're going to ban them, then they should all be banned. City/Town is probably one of the most widely used offenders. A real issue here is discoverability. If the word "Tower" pops into someone's head when they come across the Leaning Tower of Pisa, how do they find out that "Structure" is really the type it is supposed to be?
Poor mapping to the way people think about things, as well as lack of regularity in the schema, increases friction in the typing process. Something that tripped me up recently was "Manufacturer." There is no such type and I didn't have time to dig through the schema to find out what the right way to do it was, so I just dumped whatever I was looking at into "Company." There needs to be a way to guide folks through the schema from the term they are thinking of to the appropriate official type. > Would it be better if we rename 'tower' as 'communication tower'? (and add > relevant properties, e.g. number of antennae, broadcast spectrum etc.). A better name would be 'broadcast facility' or 'antenna farm' or something else which matched just that set of properties. These things don't have to be towers. They can be on a hill or on a tower (oops, structure) used for something else entirely like a bell tower, clock tower, or office tower. Some of the important characteristics like height above mean ground level (or whatever the technical term is) are only partially related to their towerness. > A rough definition when used in structural engineering: the difference > between a tower and a normal structure depends on how dynamic the building > is, i.e. how much it moves with the wind, earthquakes etc. This defines > the difference between The Great Pyramid of Giza (138m tall, not a tower) > and the Montjuic communications tower (136 m tall and a tower). Dynamic > properties of a building are calculated using the height as one of many > factors. With all due respect to the structural engineering field, I think everyone knows a tower when they see one. Despite its incredibly slow dynamics, pretty much everyone thinks of the not very tall and not at all flexible structure at Pisa as a tower. If the Tower type does disappear, a height-to-cross-sectional-area ratio is probably a better match to the commonly understood definition of tower.
Tom From pauljmackay at gmail.com Sun Jun 7 20:46:20 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Sun, 7 Jun 2009 13:46:20 -0700 Subject: [Data-modeling] What is the best way to prototype a schema? Message-ID: I'm trying to develop some new schema entities and hit some issues after trying to modify some sample entries. Deletion of a topic can only be flagged so now I have some topics that are not correct. So what is the recommended approach: Use a personal namespace? What happens when other existing topics get your new type attached to them? Use the sandbox? I understand that is wiped each week? How to preserve some sample entries? Are there any simple import/export tools that can be used? How to know when it is logical to migrate into a public Base? Thanks paul -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090607/db7efc5c/attachment.htm From iainsproat at gmail.com Mon Jun 8 06:22:37 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Mon, 8 Jun 2009 10:22:37 +0400 Subject: [Data-modeling] What is the best way to prototype a schema? In-Reply-To: References: Message-ID: On Mon, Jun 8, 2009 at 12:46 AM, Paul Mackay wrote: > I'm trying to develop some new schema entities and hit some issues after > trying to modify some sample entries. Deletion of a topic can only be > flagged so now I have some topics that are not correct. Don't worry, happens all the time. Just flag them for delete, they will soon get removed. But do spend some time playing on the sandbox first. > > > So what is the recommended approach: > > Use a personal namespace? What happens when other existing topics get your > new type attached to them? Sandbox, if your types are temporary and you don't mind them getting deleted on Monday. But personal namespace is the way to go if you don't have the time to do everything in a week. > > > Use the sandbox? I understand that is wiped each week? 
How to preserve some > sample entries? Are there any simple import/export tools that can be used? > > How to know when it is logical to migrate into a public Base? If you are creating a few types around a single subject, it is better to create it in a base. If you create a base, you can edit it and unselect the "*Include this Base in the directory and in Freebase search*" option. That way, you can experiment with schema, and it won't be seen by others. But your types won't be visible on topics unless you access them via your base. (it appends an &domain=xxxxx to the end of the url) > > Thanks > > paul > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090608/12fa4841/attachment.htm From faye at metaweb.com Mon Jun 8 19:30:51 2009 From: faye at metaweb.com (Faye Harris) Date: Mon, 08 Jun 2009 12:30:51 -0700 Subject: [Data-modeling] What is the best way to prototype a schema? In-Reply-To: References: Message-ID: <4A2D66EB.20409@metaweb.com> Paul Mackay wrote: > I'm trying to develop some new schema entities and hit some issues > after trying to modify some sample entries. Deletion of a topic can > only be flagged so now I have some topics that are not correct. What's an example of such an incorrect topic? Topics are editable, so you may not need to delete them. If the incorrect part of the topic is its type information, you can simply remove the type association from the topic. BTW, you can delete a type created in your personal base from within the schema editor. As Iain has pointed out, sandbox is the best place for experimentation. If you run into questions or problems during schema design, feel free to post a question or request for collaboration. 
Almost every page in Freebase includes a link to start a discussion thread. I also recommend checking to see if any other user is working or has worked on similar types or schema with similar design patterns. The Freebase community is here to help. -- Faye From kirrily at metaweb.com Wed Jun 10 21:55:20 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Wed, 10 Jun 2009 14:55:20 -0700 Subject: [Data-modeling] Freebase Hack Day, July 11th in San Francisco Message-ID: Hey everyone, we're planning a Freebase Hack Day to be held on July 11th in San Francisco. It's a free event for all Freebase developers, data geeks, etc. If you'd like to attend, there are more details and a signup form here: http://blog.freebase.com/2009/06/10/freebase-hack-day-ii-the-return-of-hack-day/ K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From tfmorris at gmail.com Sun Jun 14 19:23:37 2009 From: tfmorris at gmail.com (Tom Morris) Date: Sun, 14 Jun 2009 15:23:37 -0400 Subject: [Data-modeling] /lang/de vs /en/german_language Message-ID: These two got flagged for merge recently which reminded me that I've been meaning to ask why there are two parallel sets of language definitions. The /type/lang topics appear to be used internally for things like tagging strings and users languages, etc, but semantically it appears to be identical to the topics which are typed /language/human_language. I imagine that the /type/lang languages were needed for bootstrapping, but are they still needed. If they are, is there a way that they can be hidden or made less visible as well as locked so that people don't try to add the /language/human_language type to them (as happened recently to German when someone picked the wrong one). Tom From pauljmackay at gmail.com Mon Jun 15 04:19:36 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Sun, 14 Jun 2009 21:19:36 -0700 Subject: [Data-modeling] Different display of topics on sandbox? 
Message-ID: Hi, I've been experimenting in the sandbox and found that topics had a different, much more basic view than they would in regular Freebase. The type definitions rarely seemed to show up. Any idea why that is? thanks paul -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090614/8c1cfd46/attachment.htm From iainsproat at gmail.com Mon Jun 15 06:12:40 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Mon, 15 Jun 2009 10:12:40 +0400 Subject: [Data-modeling] Different display of topics on sandbox? In-Reply-To: References: Message-ID: It will probably be refreshed in a few hours when the sandbox gets cleaned up on Monday. Meanwhile, in the sandbox, you need to click "Add more facts" at the top right of the page to get all the options. The Freebase staff have been experimenting in the sandbox this last week. Iain On Mon, Jun 15, 2009 at 8:19 AM, Paul Mackay wrote: > Hi, > > I've been experimenting in the sandbox and found that topics had a > different much more basic view than they would in regular Freebase. The type > definitions rarely seemed to show up. Any idea why that is? > > thanks > > paul > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090615/c61c36b1/attachment.htm From kirrily at metaweb.com Mon Jun 15 17:29:06 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 15 Jun 2009 10:29:06 -0700 Subject: [Data-modeling] Different display of topics on sandbox? In-Reply-To: References: Message-ID: <467E7397-59C9-465F-B6A4-22EC5055698D@metaweb.com> On 14/06/2009, at 11:12 PM, Iain Sproat wrote: > It will probably be refreshed in a few hours when the sandbox gets > cleaned up on Monday.
> > Meanwhile in the sandbox you need to click "Add more facts" at the > top right of the page to get all the options. > The freebase staff have been experimenting in the sandbox this last > week. Thanks Iain. I should also probably note that this is a pending change which will roll out to the main website at some point this week. The goal is to make the topic page more friendly to casual visitors, without reducing any of the edit functionality (and in fact improving it for data geeks). To that end, we're splitting the topic page into a "consumer" version and a "geek" version. What you see initially is the "consumer" version, but if you click on "Add more facts" you'll wind up on the edit page, which is pretty much the same as the current topic page. Once you're on that page, clicking any links from there will keep you in the edit version of the page. (Note, however, that searches and saved views will lead you back to the consumer page. We're working on using cookies to store your preferences about this.) K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From alecf at metaweb.com Mon Jun 15 20:15:21 2009 From: alecf at metaweb.com (Alec Flett) Date: Mon, 15 Jun 2009 13:15:21 -0700 Subject: [Data-modeling] /lang/de vs /en/german_language In-Reply-To: References: Message-ID: <13C07B3E-DC68-482E-A985-1C4E3217E674@metaweb.com> On Jun 14, 2009, at 12:23 PM, Tom Morris wrote: > These two got flagged for merge recently which reminded me that I've > been meaning to ask why there are two parallel sets of language > definitions. The /type/lang topics appear to be used internally for > things like tagging strings and users languages, etc, but semantically > it appears to be identical to the topics which are typed > /language/human_language. > > I imagine that the /type/lang languages were needed for bootstrapping, > but are they still needed. 
If they are, is there a way that they can > be hidden or made less visible as well as locked so that people don't > try to add the /language/human_language type to them (as happened > recently to German when someone picked the wrong one). You've described the situation well - the /type/lang instances are part of how the system works, and even though they're similar to /en/german_language, they're not topics and they shouldn't be user-visible. I've just filed a bug CLI-8407 on this... Alec > > Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090615/cf4d811c/attachment.htm From jeff at metaweb.com Tue Jun 16 22:10:06 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 16 Jun 2009 15:10:06 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: <1724166281.155921245189741906.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> I've been working on a model for Products With Ingredients (catchy name, eh?) over on sandbox: It's pretty minimal, with two types: Product and Ingredient. The "product with ingredients" type can be used both with a consumer product (<https://www.sandbox-freebase.com/view/guid/9202a8c04000641f800000000c461acb>) or with a brand or product line (<https://www.sandbox-freebase.com/view/en/corn_flakes>), depending on where the ingredients make the most sense (i.e., all packages of Corn Flakes have the same ingredients, so putting the type at the Brand level makes the most sense). There are two things I'm seeing with my example data that don't quite work in the model, though, and I'm not quite sure what the best way to resolve them is. One is the Corn Flakes ingredient "Milled corn".
Should the Ingredient topic be "Milled Corn", should it just be "Corn", or do we need a CVT to allow people to modify the ingredient ("Corn", "milled")? The toothpaste has this ingredient also: "sodium lauryl sulfate (from coconut oil)", which I think is the same issue. The other one is ingredients within ingredients: the toothpaste tube lists this ingredient: "fruit extracts (strawberry, banana, and other natural flavors)". Treat as four separate ingredients, and punt on the relationship? I'm tempted toward this one -- if you're looking for potential allergens, or animal-based ingredients, or the like, you don't care whether the offending item is in a main ingredient or is an ingredient of an ingredient. Thoughts? Jeff From gordon at metaweb.com Tue Jun 16 22:24:46 2009 From: gordon at metaweb.com (Gordon Mackenzie) Date: Tue, 16 Jun 2009 15:24:46 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: On Jun 16, 2009, at 3:10 PM, Jeff Prucher wrote: > I've been working on a model for Products With Ingredients (catchy > name, eh?) over on sandbox: > > > > It's pretty minimal, with two types: Product and Ingredient. The > "product with ingredients" type can be used both with a consumer > product ( >) or with a brand or product line ( >), depending on where the ingredients make the most sense (i.e., > all packages of Corn Flakes have the same ingredients, so putting > the type at the Brand level makes the most sense). > > There are two things I'm seeing with my example data that don't > quite work in the model, though, and I'm not quite sure what the > best way to resolve them is. One is the Corn Flakes ingredient > "Milled corn". 
Should the Ingredient topic be "Milled Corn", should > it just be "Corn", or do we need a CVT to allow people to modify the > ingredient ("Corn", "milled")? The toothpaste has this ingredient > also: "sodium lauryl sulfate (from coconut oil)", which I think is > the same issue. I like a CVT: the ingredient, plus how it is treated/processed. > The other one is ingredients within ingredients: the toothpaste tube > lists this ingredient: "fruit extracts (strawberry, banana, and > other natural flavors)". Treat as four separate ingredients, and > punt on the relationship? I'm tempted toward this one -- if you're > looking for potential allergens, or animal-based ingredients, or the > like, you don't care whether the offending item is in a main > ingredient or is an ingredient of an ingredient. > All ingredients... Maybe a property for common allergens contained by this product (a great reverse of this in real life, for my wife, is the way Whole Foods stores clearly label gluten-free products with a special pale green border around the price sign)? Allergens: shellfish, peanuts, dairy products (lactose?), gluten, etc. > Thoughts? > > Jeff > From spatial.db at gmail.com Tue Jun 16 22:24:57 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Tue, 16 Jun 2009 18:24:57 -0400 Subject: [Data-modeling] Products with ingredients In-Reply-To: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <1724166281.155921245189741906.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: Possibly surprising, but I would suggest keeping it as a simple ordered list of ingredients. Order should be indexed if possible because it is semantically important: it implies relative amounts. Use "Milled corn" instead of "Corn", and "sodium lauryl sulfate (from coconut oil)". "Processed ingredient" could be a separate type down the road if anyone wants to take it on.
"Strawberry extract", "Banana extract", and "Natural flavors" should work. My 2 cents, -Ed On Tue, Jun 16, 2009 at 6:10 PM, Jeff Prucher wrote: > I've been working on a model for Products With Ingredients (catchy name, > eh?) over on sandbox: > > > It's pretty minimal, with two types: Product and Ingredient. The "product > with ingredients" type can be used both with a consumer product (< > https://www.sandbox-freebase.com/view/guid/9202a8c04000641f800000000c461acb>) > or with a brand or product line (< > https://www.sandbox-freebase.com/view/en/corn_flakes>), depending on where > the ingredients make the most sense (i.e., all packages of Corn Flakes have > the same ingredients, so putting the type at the Brand level makes the most > sense). > > There are two things I'm seeing with my example data that don't quite work > in the model, though, and I'm not quite sure what the best way to resolve > them is. One is the Corn Flakes ingredient "Milled corn". Should the > Ingredient topic be "Milled Corn", should it just be "Corn", or do we need a > CVT to allow people to modify the ingredient ("Corn", "milled")? The > toothpaste has this ingredient also: "sodium lauryl sulfate (from coconut > oil)", which I think is the same issue. > > The other one is ingredients within ingredients: the toothpaste tube lists > this ingredient: "fruit extracts (strawberry, banana, and other natural > flavors)". Treat as four separate ingredients, and punt on the relationship? > I'm tempted toward this one -- if you're looking for potential allergens, or > animal-based ingredients, or the like, you don't care whether the offending > item is in a main ingredient or is an ingredient of an ingredient. > > Thoughts? > > Jeff > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090616/7f3525ef/attachment.htm From robert at metaweb.com Tue Jun 16 22:29:44 2009 From: robert at metaweb.com (Robert Cook) Date: Tue, 16 Jun 2009 15:29:44 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: On Jun 16, 2009, at 3:10 PM, Jeff Prucher wrote: > I've been working on a model for Products With Ingredients (catchy > name, eh?) over on sandbox: > > > > It's pretty minimal, with two types: Product and Ingredient. The > "product with ingredients" type can be used both with a consumer > product ( >) or with a brand or product line ( >), depending on where the ingredients make the most sense (i.e., > all packages of Corn Flakes have the same ingredients, so putting > the type at the Brand level makes the most sense). > > There are two things I'm seeing with my example data that don't > quite work in the model, though, and I'm not quite sure what the > best way to resolve them is. One is the Corn Flakes ingredient > "Milled corn". Should the Ingredient topic be "Milled Corn", should > it just be "Corn", or do we need a CVT to allow people to modify the > ingredient ("Corn", "milled")? The toothpaste has this ingredient > also: "sodium lauryl sulfate (from coconut oil)", which I think is > the same issue. I would err on the side of simpler data input (to increase the chances that the schema is actually used). For that reason, I think that "milled corn" is fine. If queries need to find all corn-based ingredients, we then can either refactor data after we have a lot of it, perhaps using your suggested modifier property or we could create a phylogeny pattern that, for instance, encodes that "milled corn" is a type of "corn", and then MQL queries could use this structure. 
Either way, hew to the existing data and we'll solve the query problems as we go. > > The other one is ingredients within ingredients: the toothpaste tube > lists this ingredient: "fruit extracts (strawberry, banana, and > other natural flavors)". Treat as four separate ingredients, and > punt on the relationship? I'm tempted toward this one -- if you're > looking for potential allergens, or animal-based ingredients, or the > like, you don't care whether the offending item is in a main > ingredient or is an ingredient of an ingredient. This is probably a good guideline - if there are sub-ingredients, they should probably be broken out when the data is added. The only problem here is that ordering matters -- on the original contents list, there is more of item N than item N+1 in the product. If you break them out, it's unclear where they should end up in the list. (As an aside, I can see that the ordering was lost in your corn flakes example -- this is a bug in the client when you add multiple property values at once, their ordering is lost.) R From kurt at spaceship.com Tue Jun 16 22:47:02 2009 From: kurt at spaceship.com (Kurt Bollacker) Date: Tue, 16 Jun 2009 15:47:02 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <20090616224702.GN19554@spaceship.com> On Tue, Jun 16, 2009 at 03:29:44PM -0700, Robert Cook wrote: > > The other one is ingredients within ingredients: the toothpaste tube > > lists this ingredient: "fruit extracts (strawberry, banana, and > > other natural flavors)". Treat as four separate ingredients, and > > punt on the relationship? I'm tempted toward this one -- if you're > > looking for potential allergens, or animal-based ingredients, or the > > like, you don't care whether the offending item is in a main > > ingredient or is an ingredient of an ingredient. 
> > This is probably a good guideline - if there are sub-ingredients, they > should probably be broken out when the data is added. The only > problem here is that ordering matters -- on the original contents > list, there is more of item N than item N+1 in the product. If you > break them out, it's unclear where they should end up in the list. This is exactly the same problem that Kirrily has to solve with her Recipe schema. It seems like the ingredient type should work for both products and recipes. It also seems like recipes have the same containment tree pattern. A cherry pie has a "double butter crust" as an ingredient, but this is a recipe as well. So products should be able to contain other products as ingredients. For example, there is a Ritter chocolate bar that contains corn flakes as an ingredient. Kurt :-) From kirrily at metaweb.com Tue Jun 16 22:50:11 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Tue, 16 Jun 2009 15:50:11 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <49D08EDF-C7A7-4AAB-9235-2BDC553FAD3C@metaweb.com> Can we make it clearer that this is for both food products and non-food products? That is, you should be able to include window cleaner, snail bait, paint, etc. K. On 16/06/2009, at 3:10 PM, Jeff Prucher wrote: > I've been working on a model for Products With Ingredients (catchy > name, eh?) over on sandbox: > > > > It's pretty minimal, with two types: Product and Ingredient. The > "product with ingredients" type can be used both with a consumer > product ( >) or with a brand or product line ( >), depending on where the ingredients make the most sense (i.e., > all packages of Corn Flakes have the same ingredients, so putting > the type at the Brand level makes the most sense).
> > There are two things I'm seeing with my example data that don't > quite work in the model, though, and I'm not quite sure what the > best way to resolve them is. One is the Corn Flakes ingredient > "Milled corn". Should the Ingredient topic be "Milled Corn", should > it just be "Corn", or do we need a CVT to allow people to modify the > ingredient ("Corn", "milled")? The toothpaste has this ingredient > also: "sodium lauryl sulfate (from coconut oil)", which I think is > the same issue. > > The other one is ingredients within ingredients: the toothpaste tube > lists this ingredient: "fruit extracts (strawberry, banana, and > other natural flavors)". Treat as four separate ingredients, and > punt on the relationship? I'm tempted toward this one -- if you're > looking for potential allergens, or animal-based ingredients, or the > like, you don't care whether the offending item is in a main > ingredient or is an ingredient of an ingredient. > > Thoughts? > > Jeff > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From alecf at metaweb.com Tue Jun 16 22:51:48 2009 From: alecf at metaweb.com (Alec Flett) Date: Tue, 16 Jun 2009 15:51:48 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <20090616224702.GN19554@spaceship.com> References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <20090616224702.GN19554@spaceship.com> Message-ID: On Jun 16, 2009, at 3:47 PM, Kurt Bollacker wrote: >>> >> >> This is probably a good guideline - if there are sub-ingredients, >> they >> should probably be broken out when the data is added. The only >> problem here is that ordering matters -- on the original contents >> list, there is more of item N than item N+1 in the product. 
If you >> break them out, it's unclear where they should end up in the list. > > This is exactly the same problem that Kirrily has to solve with her > Recipe schema. It seems like the ingredient type should work for both > products and recipes. +1 I know data input is simpler without the CVT, but at the same time having "Amount" in some form makes it much easier to query for "products which have > 10g dextrose per serving" - you can't do that if you just say that the ingredient amounts can only be inferred by their index within the product. Alec > It also seems like recipes have the same > containment tree pattern. A cherry pie has a "double butter crust" as > an ingredient, but this is a recipe as well. So products should be > able to contain other products as ingredients. For example, there is > a Ritter chocolate bar that contains corn flakes as an ingredient. > > > > Kurt :-) From robert at metaweb.com Tue Jun 16 22:54:34 2009 From: robert at metaweb.com (Robert Cook) Date: Tue, 16 Jun 2009 15:54:34 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <20090616224702.GN19554@spaceship.com> Message-ID: <65A82DD0-7354-4FEC-A381-A17A04189D53@metaweb.com> This is true, but (I think) most products don't include this information, so such queries will probably not work. On Jun 16, 2009, at 3:51 PM, Alec Flett wrote: > I know data input is simpler without the CVT, but at the same time > having "Amount" in some form makes it much easier to query for > "products which have > 10g dextrose per serving" - you can't do that > if you just say that the ingredient amounts can only be inferred by > their index within the product.
From kirrily at metaweb.com Tue Jun 16 22:56:35 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Tue, 16 Jun 2009 15:56:35 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <20090616224702.GN19554@spaceship.com> Message-ID: <15263043-CC12-45A3-BAFD-A64296A8C32B@metaweb.com> On 16/06/2009, at 3:51 PM, Alec Flett wrote: > I know data input is simpler without the CVT, but at the same time > having "Amount" in some for makes it much easier to query for > "products which have > 10g dextrose per serving" - you can't do that > if you just say that the ingredient amounts can only be inferred by > their index within the product. But this information is very seldom known, and relies on working with a standardised amount of the product. How much red food colouring is there in kool-aid, either by percentage or weight? We have no idea. Let alone what the secret ingredients are in non-food products, other than that eg. Brand X Floor Polish contains beeswax because it says so on the label. K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From faye at metaweb.com Tue Jun 16 23:04:31 2009 From: faye at metaweb.com (Faye Harris) Date: Tue, 16 Jun 2009 16:04:31 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <642437608.155971245190206827.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <4A3824FF.7090007@metaweb.com> Yay! Glad to see this making it to the data modeling mailing list since our discussions. > There are two things I'm seeing with my example data that don't quite work in the model, though, and I'm not quite sure what the best way to resolve them is. One is the Corn Flakes ingredient "Milled corn". 
Should the Ingredient topic be "Milled Corn", should it just be "Corn", or do we need a CVT to allow people to modify the ingredient ("Corn", "milled")? I'd be inclined to make "milled corn" a stand-alone topic. My main concerns regarding data like this are searchability and reusability. Users need to be able to link all products containing "milled corn" to the ingredient, and searching by that ingredient should turn up all of the products that use it. Secondly, in cooking and in chemistry, the process by which a base ingredient is enhanced or modified is usually considered part of its identification and not, I feel, a CVT-level annotator. A recipe that calls for preserved plums cannot be duplicated with fresh plums, and for a consumer, fermented tofu becomes an acquired taste whereas regular tofu is widely accepted. Those, for the purpose of identification and labeling, constitute completely different ingredients and should be modeled as such. > The toothpaste has this ingredient also: "sodium lauryl sulfate (from coconut oil)", which I think is the same issue. > SLS can be derived from coconut oil or palm kernel oil. I'm not sure if either could possibly be less of a skin irritant. Most products (I'd say 90+%?), however, don't specify from which their SLS is derived. The KISS method here would then produce three distinct topics: SLS, SLS (from coconut oil), and SLS (from palm kernel oil). But then wouldn't most topics link to the (unannotated) SLS? Almost seems like a hierarchy is needed to set up a parent-children relationship between SLS and its two (slight) variations. > The other one is ingredients within ingredients: the toothpaste tube lists this ingredient: "fruit extracts (strawberry, banana, and other natural flavors)". Treat as four separate ingredients, and punt on the relationship? +1 on treating as separate ingredients here. 
-- Faye From jeff at metaweb.com Tue Jun 16 23:21:25 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 16 Jun 2009 16:21:25 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: Message-ID: <383919793.156281245194485648.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> ----- "Alec Flett" wrote: > From: "Alec Flett" > To: "Freebase data modeling mailing list" > Sent: Tuesday, June 16, 2009 3:51:48 PM GMT -08:00 US/Canada Pacific > Subject: Re: [Data-modeling] Products with ingredients > > On Jun 16, 2009, at 3:47 PM, Kurt Bollacker wrote: > > >>> > >> > >> This is probably a good guideline - if there are sub-ingredients, > > >> they > >> should probably be broken out when the data is added. The only > >> problem here is that ordering matters -- on the original contents > >> list, there is more of item N than item N+1 in the product. If > you > >> break them out, it's unclear where they should end up in the list. > > > > This is exactly the same problem that Kirrily has to solve with her > > Recipe schema. It seems like the ingredient type should work for > both > > products and recipies. > > +1 > > I know data input is simpler without the CVT, but at the same time > having "Amount" in some for makes it much easier to query for > "products which have > 10g dextrose per serving" - you can't do that > > if you just say that the ingredient amounts can only be inferred by > their index within the product. There is a separate model for nutritional information, which applies only to food products. The amount of dextrose in a serving can't be determined from a list of ingredients, since you'd have to know the quantity of dextrose in each ingredient as well as the quantity of each ingredient in the product. 
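To make that point concrete, here is a small worked sketch with invented numbers (none come from a real label). It assumes the dextrose in a serving is the sum, over ingredients, of the ingredient's grams per serving times the fraction of that ingredient that is dextrose. A plain ingredient list supplies neither figure.

```python
def dextrose_per_serving(grams_of_ingredient, dextrose_fraction):
    """Sum grams of each ingredient times its dextrose fraction.

    Both mappings are hypothetical inputs; an ingredient list alone
    provides neither of them.
    """
    return sum(
        grams * dextrose_fraction.get(name, 0.0)
        for name, grams in grams_of_ingredient.items()
    )


# Invented values, for illustration only.
grams = {"milled corn": 25.0, "sugar syrup": 4.0}
fraction = {"milled corn": 0.0, "sugar syrup": 0.5}

print(dextrose_per_serving(grams, fraction))
```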
Jeff From jeff at metaweb.com Tue Jun 16 23:25:39 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 16 Jun 2009 16:25:39 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: <20090616224702.GN19554@spaceship.com> Message-ID: <147018019.156341245194739553.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> ----- "Kurt Bollacker" wrote: > From: "Kurt Bollacker" > To: "Freebase data modeling mailing list" > Sent: Tuesday, June 16, 2009 3:47:02 PM GMT -08:00 US/Canada Pacific > Subject: Re: [Data-modeling] Products with ingredients > > On Tue, Jun 16, 2009 at 03:29:44PM -0700, Robert Cook wrote: > > > The other one is ingredients within ingredients: the toothpaste > tube > > > lists this ingredient: "fruit extracts (strawberry, banana, and > > > other natural flavors)". Treat as four separate ingredients, and > > > > punt on the relationship? I'm tempted toward this one -- if you're > > > > looking for potential allergens, or animal-based ingredients, or > the > > > like, you don't care whether the offending item is in a main > > > ingredient or is an ingredient of an ingredient. > > > > This is probably a good guideline - if there are sub-ingredients, > they > > should probably be broken out when the data is added. The only > > problem here is that ordering matters -- on the original contents > > list, there is more of item N than item N+1 in the product. If you > > > break them out, it's unclear where they should end up in the list. > > This is exactly the same problem that Kirrily has to solve with her > Recipe schema. It seems like the ingredient type should work for > both > products and recipies. It also seems like recipies have the same > containment tree pattern. A cherry pie has a "double butter crust" > as > an ingredient, but this is a recipe as well. So products should be > able to contain other products as ingredients. For example, there is > a Ritter chocolate bar that contains corn flakes as an ingedient. 
Many topics will be typed as both /food/ingredient and /business/product_ingredient, but I think they should exist as cotypes, rather than being the same type. (I really don't want to know what recipes you have that use 1,1,1-trichloroethane, for example, although it's used in a lot of non-food products.) But there's nothing stopping you from including /en/corn_flakes as an ingredient of the Ritter bar -- it's just a topic like any other. Jeff From jeff at metaweb.com Tue Jun 16 23:37:33 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 16 Jun 2009 16:37:33 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: <1827530175.156421245195086234.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <1189000484.156441245195453588.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Thanks for your input, everyone! I'm taking the approach suggested here by Faye (and a bunch of other people), and I like the idea of adding a phylogeny pattern to the Ingredient type. Any suggestions for what the parent/child property names should be? I don't have a good solution to the ordering problem of ingredients-within-ingredients, but unless we insert a CVT and ask people to explicitly enter the ingredient order, I don't know how much we should rely on the index values, even if the client didn't have a bug with the indexing. (Actually, an explicit order property would work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc. for ingredients within ingredients.) Jeff ----- "Faye Harris" wrote: > From: "Faye Harris" > To: "Freebase data modeling mailing list" > Sent: Tuesday, June 16, 2009 4:04:31 PM GMT -08:00 US/Canada Pacific > Subject: Re: [Data-modeling] Products with ingredients > > Yay! Glad to see this making it to the data modeling mailing list > since > our discussions. > > There are two things I'm seeing with my example data that don't > quite work in the model, though, and I'm not quite sure what the best > way to resolve them is. 
One is the Corn Flakes ingredient "Milled > corn". Should the Ingredient topic be "Milled Corn", should it just be > "Corn", or do we need a CVT to allow people to modify the ingredient > ("Corn", "milled")? > I'd be inclined to make "milled corn" a stand-alone topic. My main > concerns regarding data like this are searchability and reusability. > Users need to be able to link all products containing "milled corn" to > > the ingredient, and searching by that ingredient should turn up all of > > the products that use it. Secondly, in cooking and in chemistry, the > process by which a base ingredient is enhanced or modified is usually > > considered part of its identification and not, I feel, a CVT-level > annotator. A recipe that calls for preserved plums cannot be > duplicated > with fresh plums, and for a consumer, fermented tofu becomes an > acquired > taste whereas regular tofu is widely accepted. Those, for the purpose > of > identification and labeling, constitute completely different > ingredients > and should be modeled as such. > > The toothpaste has this ingredient also: "sodium lauryl sulfate > (from coconut oil)", which I think is the same issue. > > > SLS can be derived from coconut oil or palm kernel oil. I'm not sure > if > either could possibly be less of a skin irritant. Most products (I'd > say > 90+%?), however, don't specify from which their SLS is derived. The > KISS > method here would then produce three distinct topics: SLS, SLS (from > coconut oil), and SLS (from palm kernel oil). But then wouldn't most > topics link to the (unannotated) SLS? Almost seems like a hierarchy is > > needed to set up a parent-children relationship between SLS and its > two > (slight) variations. > > The other one is ingredients within ingredients: the toothpaste tube > lists this ingredient: "fruit extracts (strawberry, banana, and other > natural flavors)". Treat as four separate ingredients, and punt on the > relationship? 
> +1 on treating as separate ingredients here. > > -- Faye > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From pauljmackay at gmail.com Wed Jun 17 00:02:03 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Tue, 16 Jun 2009 17:02:03 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <1189000484.156441245195453588.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <1827530175.156421245195086234.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <1189000484.156441245195453588.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: I'd like to ask if it might be worth considering this a little differently. What about a "Detailed product" schema that could include ingredients (more aimed at all products not just food) but also other factors as well? Things like manufacturing process, location, recycling information, etc. Or is that too much in one schema? Also labeling information could be useful, but perhaps that belongs in a "Labeled product" schema. I'm basing this comment mostly on the ideas presented here http://www.worldchanging.com/archives/007256.html. Also the "embodied energy" information being stored in WattzOn here ( http://www.wattzon.com/stuff) could potentially be pulled in. paul On Tue, Jun 16, 2009 at 4:37 PM, Jeff Prucher wrote: > Thanks for your input, everyone! I'm taking the approach suggested here by > Faye (and a bunch of other people), and I like the idea of adding a > phylogeny pattern to the Ingredient type. Any suggestions for what the > parent/child property names should be? > > I don't have a good solution to the ordering problem of > ingredients-within-ingredients, but unless we insert a CVT and ask people to > explicitly enter the ingredient order, I don't know how much we should rely > on the index values, even if the client didn't have a bug with the indexing. 
> (Actually, an explicit order property would work if we asked users to enter > 1, 2, 3, 3a, 3b, 3c, 4, etc. for ingredients within ingredients.) > > Jeff > > > ----- "Faye Harris" wrote: > > > From: "Faye Harris" > > To: "Freebase data modeling mailing list" > > Sent: Tuesday, June 16, 2009 4:04:31 PM GMT -08:00 US/Canada Pacific > > Subject: Re: [Data-modeling] Products with ingredients > > > > Yay! Glad to see this making it to the data modeling mailing list > > since > > our discussions. > > > There are two things I'm seeing with my example data that don't > > quite work in the model, though, and I'm not quite sure what the best > > way to resolve them is. One is the Corn Flakes ingredient "Milled > > corn". Should the Ingredient topic be "Milled Corn", should it just be > > "Corn", or do we need a CVT to allow people to modify the ingredient > > ("Corn", "milled")? > > I'd be inclined to make "milled corn" a stand-alone topic. My main > > concerns regarding data like this are searchability and reusability. > > Users need to be able to link all products containing "milled corn" to > > > > the ingredient, and searching by that ingredient should turn up all of > > > > the products that use it. Secondly, in cooking and in chemistry, the > > process by which a base ingredient is enhanced or modified is usually > > > > considered part of its identification and not, I feel, a CVT-level > > annotator. A recipe that calls for preserved plums cannot be > > duplicated > > with fresh plums, and for a consumer, fermented tofu becomes an > > acquired > > taste whereas regular tofu is widely accepted. Those, for the purpose > > of > > identification and labeling, constitute completely different > > ingredients > > and should be modeled as such. > > > The toothpaste has this ingredient also: "sodium lauryl sulfate > > (from coconut oil)", which I think is the same issue. > > > > > SLS can be derived from coconut oil or palm kernel oil. 
I'm not sure > > if > > either could possibly be less of a skin irritant. Most products (I'd > > say > > 90+%?), however, don't specify from which their SLS is derived. The > > KISS > > method here would then produce three distinct topics: SLS, SLS (from > > coconut oil), and SLS (from palm kernel oil). But then wouldn't most > > topics link to the (unannotated) SLS? Almost seems like a hierarchy is > > > > needed to set up a parent-children relationship between SLS and its > > two > > (slight) variations. > > > The other one is ingredients within ingredients: the toothpaste tube > > lists this ingredient: "fruit extracts (strawberry, banana, and other > > natural flavors)". Treat as four separate ingredients, and punt on the > > relationship? > > +1 on treating as separate ingredients here. > > > > -- Faye > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090616/0dc65e66/attachment.htm From robert at metaweb.com Wed Jun 17 00:59:14 2009 From: robert at metaweb.com (Robert Cook) Date: Tue, 16 Jun 2009 17:59:14 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <1189000484.156441245195453588.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <1189000484.156441245195453588.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: On Jun 16, 2009, at 4:37 PM, Jeff Prucher wrote: > Thanks for your input, everyone! I'm taking the approach suggested > here by Faye (and a bunch of other people), and I like the idea of > adding a phylogeny pattern to the Ingredient type. 
> Any suggestions for what the parent/child property names should be?

It could be a classic "a kind of" / "is a generalization of"
phylogeny. For example:

- Milled corn is "a kind of" corn
- Corn "is a generalization of" milled corn.

Perhaps there is something more elegant and domain-specific, but my
addled mind couldn't conjure that.

> I don't have a good solution to the ordering problem of
> ingredients-within-ingredients, but unless we insert a CVT and ask
> people to explicitly enter the ingredient order, I don't know how
> much we should rely on the index values, even if the client didn't
> have a bug with the indexing. (Actually, an explicit order property
> would work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc.
> for ingredients within ingredients.)

Index values really are semantically relevant (see Film performances,
for example, where billing order is a major contractual obligation).
Also, I think a hand-numbered system would lead to madness. And the
client can and should be fixed as well.

One solution here could be to add a property to the ingredient type,
"Contains ingredients", which allows one to store these
sub-ingredients when appropriate. This makes the simple case of data
entry straightforward (enter what you see on the package into the
product type), while ultimately supporting the allergy use case you
mentioned. And, for the allergy use case, you could add an additional
property on the ingredient for the specific allergy it causes.

That way, nobody is forced to break these things out during data
entry, and it doesn't break the "order by ingredient amount" semantics
on the product.

So, I'm proposing two phylogenies. The world craves hierarchy. Sorry,
anarchists.

R

> Jeff
>
>
> ----- "Faye Harris" wrote:
>
>> From: "Faye Harris"
>> To: "Freebase data modeling mailing list"
>> Sent: Tuesday, June 16, 2009 4:04:31 PM GMT -08:00 US/Canada Pacific
>> Subject: Re: [Data-modeling] Products with ingredients
>>
>> Yay!
Glad to see this making it to the data modeling mailing list >> since >> our discussions. >>> There are two things I'm seeing with my example data that don't >> quite work in the model, though, and I'm not quite sure what the best >> way to resolve them is. One is the Corn Flakes ingredient "Milled >> corn". Should the Ingredient topic be "Milled Corn", should it just >> be >> "Corn", or do we need a CVT to allow people to modify the ingredient >> ("Corn", "milled")? >> I'd be inclined to make "milled corn" a stand-alone topic. My main >> concerns regarding data like this are searchability and reusability. >> Users need to be able to link all products containing "milled corn" >> to >> >> the ingredient, and searching by that ingredient should turn up all >> of >> >> the products that use it. Secondly, in cooking and in chemistry, the >> process by which a base ingredient is enhanced or modified is usually >> >> considered part of its identification and not, I feel, a CVT-level >> annotator. A recipe that calls for preserved plums cannot be >> duplicated >> with fresh plums, and for a consumer, fermented tofu becomes an >> acquired >> taste whereas regular tofu is widely accepted. Those, for the purpose >> of >> identification and labeling, constitute completely different >> ingredients >> and should be modeled as such. >>> The toothpaste has this ingredient also: "sodium lauryl sulfate >> (from coconut oil)", which I think is the same issue. >>> >> SLS can be derived from coconut oil or palm kernel oil. I'm not sure >> if >> either could possibly be less of a skin irritant. Most products (I'd >> say >> 90+%?), however, don't specify from which their SLS is derived. The >> KISS >> method here would then produce three distinct topics: SLS, SLS (from >> coconut oil), and SLS (from palm kernel oil). But then wouldn't most >> topics link to the (unannotated) SLS? 
Almost seems like a hierarchy >> is >> >> needed to set up a parent-children relationship between SLS and its >> two >> (slight) variations. >>> The other one is ingredients within ingredients: the toothpaste tube >> lists this ingredient: "fruit extracts (strawberry, banana, and other >> natural flavors)". Treat as four separate ingredients, and punt on >> the >> relationship? >> +1 on treating as separate ingredients here. >> >> -- Faye >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Wed Jun 17 16:55:19 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Wed, 17 Jun 2009 09:55:19 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: <842715229.157491245257673436.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <1788912805.157511245257719374.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> ----- "Robert Cook" wrote: > > I don't have a good solution to the ordering problem of ingredients- > > > within-ingredients, but unless we insert a CVT and ask people to > > explicitly enter the ingredient order, I don't know how much we > > should rely on the index values, even if the client didn't have a > > bug with the indexing. (Actually, an explicit order property would > > > work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc. for > > ingredients within ingredients.) > > Index values really are semantically relevant (see Film performances, > > for example, where billing order is a major contractual obligation). > > Also, I think hand numbered system would lead to madness. And the > client can and should be fixed as well. 
>
> One solution here could be to add a property to the ingredient type,
> "Contains ingredients", which allows one to store these
> sub-ingredients when appropriate. This makes the simple case of data
> entry straightforward (enter what you see on the package into the
> product type), while ultimately supporting the allergy use case you
> mentioned. And, for the allergy use case, you could add an additional
> property to ingredient to the specific allergy it causes.

The trouble there is that it will require users to know *which*
"enriched flour" or "chocolate chips" topic to select -- many of these
compound ingredients have common names, but it's unreasonable to
expect them to be the same from manufacturer to manufacturer (or even
from product to product, in many cases). Unless I completely
misunderstand you (which is possible), this actually makes the simple
case harder -- rather than simply entering what you see on the box,
for all compound ingredients you have to first explore Freebase to
determine whether the compound ingredient (with all the same
ingredients, in the same order) already exists, and, if it doesn't, to
edit the new ingredient topic you've created to add all of its
ingredients. People are far more likely, I think, to grab the first
ingredient they find with the right name, and not take the next steps
to determine whether it's really the same or not. (What we really need
for this is two-level disambiguation in the client!)

Jeff

> That way, nobody is forced to break these things out during data
> entry, and it doesn't break the "order by ingredient amount" semantics
> on the product.
>
> So, I'm proposing two phylogenies. The world craves hierarchy. Sorry
> anarchists.
> > R > > > > > Jeff > > > > > > ----- "Faye Harris" wrote: > > > >> From: "Faye Harris" > >> To: "Freebase data modeling mailing list" >> modeling at freebase.com> > >> Sent: Tuesday, June 16, 2009 4:04:31 PM GMT -08:00 US/Canada > Pacific > >> Subject: Re: [Data-modeling] Products with ingredients > >> > >> Yay! Glad to see this making it to the data modeling mailing list > >> since > >> our discussions. > >>> There are two things I'm seeing with my example data that don't > >> quite work in the model, though, and I'm not quite sure what the > best > >> way to resolve them is. One is the Corn Flakes ingredient "Milled > >> corn". Should the Ingredient topic be "Milled Corn", should it just > > >> be > >> "Corn", or do we need a CVT to allow people to modify the > ingredient > >> ("Corn", "milled")? > >> I'd be inclined to make "milled corn" a stand-alone topic. My main > >> concerns regarding data like this are searchability and > reusability. > >> Users need to be able to link all products containing "milled corn" > > >> to > >> > >> the ingredient, and searching by that ingredient should turn up all > > >> of > >> > >> the products that use it. Secondly, in cooking and in chemistry, > the > >> process by which a base ingredient is enhanced or modified is > usually > >> > >> considered part of its identification and not, I feel, a CVT-level > >> annotator. A recipe that calls for preserved plums cannot be > >> duplicated > >> with fresh plums, and for a consumer, fermented tofu becomes an > >> acquired > >> taste whereas regular tofu is widely accepted. Those, for the > purpose > >> of > >> identification and labeling, constitute completely different > >> ingredients > >> and should be modeled as such. > >>> The toothpaste has this ingredient also: "sodium lauryl sulfate > >> (from coconut oil)", which I think is the same issue. > >>> > >> SLS can be derived from coconut oil or palm kernel oil. 
I'm not > sure > >> if > >> either could possibly be less of a skin irritant. Most products > (I'd > >> say > >> 90+%?), however, don't specify from which their SLS is derived. > The > >> KISS > >> method here would then produce three distinct topics: SLS, SLS > (from > >> coconut oil), and SLS (from palm kernel oil). But then wouldn't > most > >> topics link to the (unannotated) SLS? Almost seems like a hierarchy > > >> is > >> > >> needed to set up a parent-children relationship between SLS and > its > >> two > >> (slight) variations. > >>> The other one is ingredients within ingredients: the toothpaste > tube > >> lists this ingredient: "fruit extracts (strawberry, banana, and > other > >> natural flavors)". Treat as four separate ingredients, and punt on > > >> the > >> relationship? > >> +1 on treating as separate ingredients here. > >> > >> -- Faye > >> _______________________________________________ > >> Data-modeling mailing list > >> Data-modeling at freebase.com > >> http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From robert at metaweb.com Wed Jun 17 17:28:29 2009 From: robert at metaweb.com (Robert Cook) Date: Wed, 17 Jun 2009 10:28:29 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <1788912805.157511245257719374.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <1788912805.157511245257719374.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: On Jun 17, 2009, at 9:55 AM, Jeff Prucher wrote: > > ----- "Robert Cook" wrote: > >>> I don't have a good solution to the ordering problem of ingredients- >> >>> within-ingredients, but unless we insert a CVT and ask people to 
>>> explicitly enter the ingredient order, I don't know how much we >>> should rely on the index values, even if the client didn't have a >>> bug with the indexing. (Actually, an explicit order property would >> >>> work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc. for >>> ingredients within ingredients.) >> >> Index values really are semantically relevant (see Film performances, >> >> for example, where billing order is a major contractual obligation). >> >> Also, I think hand numbered system would lead to madness. And the >> client can and should be fixed as well. >> >> One solution here could be to add a property to the ingredient type, >> >> "Contains ingredients", which allows one to store these sub- >> ingredients when appropriate. This makes the simple case of data >> entry straightforward (enter what you see on the package into the >> product type), while ultimately supporting the allergy use case you >> mentioned. And, for the allergy use case, you could add an >> additional >> >> property to ingredient to the specific allergy it causes. > > The trouble there is that it will require users to know *which* > "enriched flour" or "chocolate chips" topic to select -- many of > these compound ingredients have common names, but it's unreasonable > to expect them to be the same from manufacturer to manufacturer (or > even from product to product, in many cases). Unless I completely > misunderstand you (which is possible), this actually makes the > simple case harder -- rather than simply entering what you see on > the box, for all compound ingredients you have to first explore > Freebase to determine whether the compound ingredient (with all the > same ingredients, in the same order) already exists, and, if it > doesn't, to edit the new ingredient topic you've created to add all > of its ingredients. 
People are far more likely, I think, to grab > the first ingredient they find with the right name, and not take the > next steps to determine whether it's really the same or not. (What > we really need for this is two-level disambiguation in the client!) One solution would be to create a topic with a long name -- enter it exactly as it appears on the label such as "Enriched flour - (wheat, niacin, iron, baby powder, sawdust, DDT)". And then it's not necessary to enter the subingredients or they can be added later. And the subingredients could be made a disambiguator for good measure. This also makes data input less painful and defers the structure to later, which is probably the best way to actually get data into the system. Of course, this approach will in time create too many topics that appear when one types "enriched flour" into Freebase Suggest, but if the user types in one of the subingredients it should give more precise results. R From bryan.cheung at metaweb.com Wed Jun 17 21:49:38 2009 From: bryan.cheung at metaweb.com (Bryan Cheung) Date: Wed, 17 Jun 2009 14:49:38 -0700 Subject: [Data-modeling] Uniquifying non-unique properties Message-ID: Cross-post from the dev list: http://lists.freebase.com/pipermail/developers/2009-June/003000.html From jeff at metaweb.com Thu Jun 18 19:59:59 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Thu, 18 Jun 2009 12:59:59 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: Message-ID: <2017452479.162401245355199377.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> ----- "Robert Cook" wrote: > > One solution would be to create a topic with a long name -- enter it > > exactly as it appears on the label such as "Enriched flour - (wheat, > > niacin, iron, baby powder, sawdust, DDT)". And then it's not > necessary to enter the subingredients or they can be added later. And > > the subingredients could be made a disambiguator for good measure. 
> > This also makes data input less painful and defers the structure to > later, which is probably the best way to actually get data into the > system. > > Of course, this approach will in time create too many topics that > appear when one types "enriched flour" into Freebase Suggest, but if > > the user types in one of the subingredients it should give more > precise results. This would answer. Anyone else have any comments or thoughts on this before I load the schema? Jeff From pauljmackay at gmail.com Thu Jun 18 21:37:10 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Thu, 18 Jun 2009 14:37:10 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <2017452479.162401245355199377.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <2017452479.162401245355199377.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: Any comments on the mail I sent 2 days ago? cheers paul On Thu, Jun 18, 2009 at 12:59 PM, Jeff Prucher wrote: > > ----- "Robert Cook" wrote: > > > > > One solution would be to create a topic with a long name -- enter it > > > > exactly as it appears on the label such as "Enriched flour - (wheat, > > > > niacin, iron, baby powder, sawdust, DDT)". And then it's not > > necessary to enter the subingredients or they can be added later. And > > > > the subingredients could be made a disambiguator for good measure. > > > > This also makes data input less painful and defers the structure to > > later, which is probably the best way to actually get data into the > > system. > > > > Of course, this approach will in time create too many topics that > > appear when one types "enriched flour" into Freebase Suggest, but if > > > > the user types in one of the subingredients it should give more > > precise results. > > This would answer. Anyone else have any comments or thoughts on this before > I load the schema? 
> > Jeff > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090618/266a2338/attachment.htm From faye at metaweb.com Thu Jun 18 21:59:19 2009 From: faye at metaweb.com (Faye Harris) Date: Thu, 18 Jun 2009 14:59:19 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <2017452479.162401245355199377.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <2017452479.162401245355199377.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <4A3AB8B7.4080308@metaweb.com> Jeff Prucher wrote: > ----- "Robert Cook" wrote: > > >> One solution would be to create a topic with a long name -- enter it >> >> exactly as it appears on the label such as "Enriched flour - (wheat, >> >> niacin, iron, baby powder, sawdust, DDT)". >> > > This would answer. Anyone else have any comments or thoughts on this before I load the schema? > The main problem with this is you can't arrive at the products that use enriched flour by clicking on a property link from a single "enriched flour" topic. Rather, you have to do a keyword search for products based on matching all the various "enriched flour - (foo, bar, bat, baz)" ingredient topics with the words "enriched" and "flour". That's quite a loss in queriability. The schema is fine to get us started, but we're still going to try to put together some phylogeny pattern in place (in the near future) right? 
-- Faye

From jeff at metaweb.com Fri Jun 19 18:25:06 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Fri, 19 Jun 2009 11:25:06 -0700 (PDT)
Subject: [Data-modeling] Products with ingredients
In-Reply-To: <1472566331.166811245435561212.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID: <523450980.166831245435906563.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>

Sorry, Paul -- I overlooked this amidst all the other responses.

Certainly, the "product with ingredients" type is meant for any kind of product that has ingredients, not just food. Your other suggestions are interesting, and would maybe need to go on another type. (Or on the existing Consumer Product type itself?) In the short term, I'd like to get the Product with Ingredients type up, but additional properties can always be added.

In terms of specifics, Manufacturing Process is an interesting idea; someone (Sprocket? Spencer?) has done some work with modeling processes already. I don't know much about the subject, but it might be an interesting model.

Location is a bit trickier, since many products are produced in many locations (I guess this can mean two things -- a product that comprises parts from more than one manufacturer, and multiple manufacturing plants that produce the same product).* I suppose making the property non-unique would serve, although you wouldn't necessarily know which kind of multiple-location product you were getting.

Recycling might be trickier, since what's recyclable varies hugely from region to region and time to time.

Faye suggested off-line that a Certification property might be useful (things like "organic", various "green" certifications -- we could have a greenwashing base!, UL, etc.) Again, some of these might vary from manufacturing plant to manufacturing plant (Coke is only certified as kosher in some markets, for example).
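
One way to picture the per-market variation described above is a small mediator record, in the spirit of the CVTs discussed elsewhere in this thread. The sketch below is Python and purely illustrative: the class, the field names, and the sample values (including the market labels) are hypothetical. They are not part of any Freebase schema and are not real certification data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertificationRecord:
    """Hypothetical mediator: one certification claim, scoped to a market."""
    product: str
    certification: str
    market: str


# Placeholder values only; "Market A" and "Market B" stand in for real regions.
records = [
    CertificationRecord("Example Cola", "kosher", "Market A"),
    CertificationRecord("Example Cola", "organic", "Market B"),
]


def certified_in(records, product, certification):
    """List the markets where a product carries a given certification."""
    return sorted(
        r.market
        for r in records
        if r.product == product and r.certification == certification
    )
```

Keeping the market on the record, rather than on the product topic, is what would let a single product carry a certification in some places and not in others.
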
Jeff *Product With Ingredients already points out the fact that we'll need multiple topics for some products, like Coke**, where the ingredients vary from location to location. **No, I don't know why I'm obsessed with Coke today. I don't even drink it! ----- "Paul Mackay" wrote: > From: "Paul Mackay" > To: "Freebase data modeling mailing list" > Sent: Tuesday, June 16, 2009 5:02:03 PM GMT -08:00 US/Canada Pacific > Subject: Re: [Data-modeling] Products with ingredients > > I'd like to ask if it might be worth considering this a little > differently. What about a "Detailed product" schema that could include > ingredients (more aimed at all products not just food) but also other > factors as well? Things like manufacturing process, location, > recycling information, etc. Or is that too much in one schema? Also > labeling information could be useful, but perhaps that belongs in a > "Labeled product" schema. > > I'm basing this comment mostly on the ideas presented here > http://www.worldchanging.com/archives/007256.html . Also the "embodied > energy" information being stored in WattzOn here ( > http://www.wattzon.com/stuff ) could potentially be pulled in. > > paul > > > On Tue, Jun 16, 2009 at 4:37 PM, Jeff Prucher < jeff at metaweb.com > > wrote: > > > Thanks for your input, everyone! I'm taking the approach suggested > here by Faye (and a bunch of other people), and I like the idea of > adding a phylogeny pattern to the Ingredient type. Any suggestions for > what the parent/child property names should be? > > I don't have a good solution to the ordering problem of > ingredients-within-ingredients, but unless we insert a CVT and ask > people to explicitly enter the ingredient order, I don't know how much > we should rely on the index values, even if the client didn't have a > bug with the indexing. (Actually, an explicit order property would > work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc. for > ingredients within ingredients.) 
> > Jeff > > > ----- "Faye Harris" < faye at metaweb.com > wrote: > > > From: "Faye Harris" < faye at metaweb.com > > > > To: "Freebase data modeling mailing list" < > data-modeling at freebase.com > > > Sent: Tuesday, June 16, 2009 4:04:31 PM GMT -08:00 US/Canada Pacific > > > Subject: Re: [Data-modeling] Products with ingredients > > > > > > > Yay! Glad to see this making it to the data modeling mailing list > > since > > our discussions. > > > There are two things I'm seeing with my example data that don't > > quite work in the model, though, and I'm not quite sure what the > best > > way to resolve them is. One is the Corn Flakes ingredient "Milled > > corn". Should the Ingredient topic be "Milled Corn", should it just > be > > "Corn", or do we need a CVT to allow people to modify the ingredient > > ("Corn", "milled")? > > I'd be inclined to make "milled corn" a stand-alone topic. My main > > concerns regarding data like this are searchability and reusability. > > Users need to be able to link all products containing "milled corn" > to > > > > the ingredient, and searching by that ingredient should turn up all > of > > > > the products that use it. Secondly, in cooking and in chemistry, the > > process by which a base ingredient is enhanced or modified is > usually > > > > considered part of its identification and not, I feel, a CVT-level > > annotator. A recipe that calls for preserved plums cannot be > > duplicated > > with fresh plums, and for a consumer, fermented tofu becomes an > > acquired > > taste whereas regular tofu is widely accepted. Those, for the > purpose > > of > > identification and labeling, constitute completely different > > ingredients > > and should be modeled as such. > > > The toothpaste has this ingredient also: "sodium lauryl sulfate > > (from coconut oil)", which I think is the same issue. > > > > > SLS can be derived from coconut oil or palm kernel oil. I'm not sure > > if > > either could possibly be less of a skin irritant. 
Most products (I'd > > say > > 90+%?), however, don't specify from which their SLS is derived. The > > KISS > > method here would then produce three distinct topics: SLS, SLS (from > > coconut oil), and SLS (from palm kernel oil). But then wouldn't most > > topics link to the (unannotated) SLS? Almost seems like a hierarchy > is > > > > needed to set up a parent-children relationship between SLS and its > > two > > (slight) variations. > > > > The other one is ingredients within ingredients: the toothpaste > tube > > lists this ingredient: "fruit extracts (strawberry, banana, and > other > > natural flavors)". Treat as four separate ingredients, and punt on > the > > relationship? > > > +1 on treating as separate ingredients here. > > > > -- Faye > > _______________________________________________ > > > > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Fri Jun 19 18:42:23 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Fri, 19 Jun 2009 11:42:23 -0700 (PDT) Subject: [Data-modeling] Products with ingredients In-Reply-To: <6586947.166861245436075490.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> ----- "Faye Harris" wrote: > From: "Faye Harris" > To: "Freebase data modeling mailing list" > Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 US/Canada Pacific > Subject: Re: [Data-modeling] Products with ingredients > > Jeff Prucher wrote: > > ----- "Robert Cook" wrote: > > > > > >> One solution would be to create a topic with a long name -- 
enter > it > >> > >> exactly as it appears on the label such as "Enriched flour - > (wheat, > >> > >> niacin, iron, baby powder, sawdust, DDT)". > >> > > > > This would answer. Anyone else have any comments or thoughts on this > before I load the schema? > > > > The main problem with this is you can't arrive at the products that > use > enriched flour by clicking on a property link from a single "enriched > > flour" topic. Rather, you have to do a keyword search for products > based > on matching all the various "enriched flour - (foo, bar, bat, baz)" > ingredient topics with the words "enriched" and "flour". That's quite > a > loss in queriability. > > The schema is fine to get us started, but we're still going to try to > > put together some phylogeny pattern in place (in the near future) > right? I plan to add a phylogeny pattern before moving the schema to freebase.com, which should help queryability. It doesn't address the fact that topics named things like "enriched flour (that, that, the other thing)" are exceedingly ugly, however (no-one said it was called "prettybase.com", though). I was going to post a revised schema to sandbox, with the double-phylogeny pattern suggested by Robert, but it got horribly munged in the process. I'll try to fix it, but it might not be till next week. Jeff From iainsproat at gmail.com Fri Jun 19 19:01:50 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Fri, 19 Jun 2009 23:01:50 +0400 Subject: [Data-modeling] Products with ingredients In-Reply-To: <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <6586947.166861245436075490.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: "In terms of specifics, Manufacturing Process is an interesting idea; someone (Sprocket? Spencer?) has done some work with modeling processes already." process.freebase.com is where the process schema lives. 
It's fairly abstract, so I think can be used OK with ingredient manufacturing - the 'process actor ' type would allow you to note the factory location where the manufacturing is done. The schema is in its infancy, so I'd appreciate any feedback. Iain (sprocketonline) On Fri, Jun 19, 2009 at 10:42 PM, Jeff Prucher wrote: > > ----- "Faye Harris" wrote: > > > From: "Faye Harris" > > To: "Freebase data modeling mailing list" > > Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 US/Canada Pacific > > Subject: Re: [Data-modeling] Products with ingredients > > > > Jeff Prucher wrote: > > > ----- "Robert Cook" wrote: > > > > > > > > >> One solution would be to create a topic with a long name -- enter > > it > > >> > > >> exactly as it appears on the label such as "Enriched flour - > > (wheat, > > >> > > >> niacin, iron, baby powder, sawdust, DDT)". > > >> > > > > > > This would answer. Anyone else have any comments or thoughts on this > > before I load the schema? > > > > > > > The main problem with this is you can't arrive at the products that > > use > > enriched flour by clicking on a property link from a single "enriched > > > > flour" topic. Rather, you have to do a keyword search for products > > based > > on matching all the various "enriched flour - (foo, bar, bat, baz)" > > ingredient topics with the words "enriched" and "flour". That's quite > > a > > loss in queriability. > > > > The schema is fine to get us started, but we're still going to try to > > > > put together some phylogeny pattern in place (in the near future) > > right? > > I plan to add a phylogeny pattern before moving the schema to freebase.com, > which should help queryability. It doesn't address the fact that topics > named things like "enriched flour (that, that, the other thing)" are > exceedingly ugly, however (no-one said it was called "prettybase.com", > though). 
I was going to post a revised schema to sandbox, with the
> double-phylogeny pattern suggested by Robert, but it got horribly munged in
> the process. I'll try to fix it, but it might not be till next week.
>
> Jeff

From tfmorris at gmail.com Fri Jun 19 23:11:09 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Fri, 19 Jun 2009 19:11:09 -0400
Subject: [Data-modeling] Products with ingredients
In-Reply-To:
References: <6586947.166861245436075490.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID:

I think including processes is overkill for most uses. If a manufacturer cares about the process used to produce something, they'll specify it and give different part numbers to things produced by different processes.

For non-foodstuffs, the list of "ingredients" for manufactured things is often called a Bill of Materials (BOM) and everything is driven by part numbers. Internally, part numbers are assigned at a very fine level of granularity, with two nominally identical components manufactured by different manufacturers being assigned different part numbers so that they can be tracked. Things which can change have a revision code as a piece of their part number. Parts which can be substituted for each other are specified explicitly. Obviously any part can have its own BOM, so these can be nested to arbitrary levels.

The model number that a consumer sees, like a Dell Dimension 7777, is just the tip of the iceberg.
Internally within Dell, they know exactly which plant manufactured a given instance, what rev etch was used on the PCBs, what firmware was blasted into the ROMs, what revision of which marketing collateral went into the box, etc., etc. Something as simple as a new "Read Me First" letter in the box is under revision control and will cause versions of part numbers to get bumped. On the other hand, things which are considered equivalent can be substituted at will, so if they've decided that Samsung and Hynix SDRAM can be substituted for each other, this will be mostly invisible at the BOM level.

I imagine that pretty much the same thing happens for food. What gets called "high fructose corn syrup" on a product label could be a variety of different ingredients from different sources that are tracked very explicitly by the food manufacturer. Some can be substituted for each other and some can't. As long as the nutritional contents are within bounds, none of this is visible to the consumer. It's only when there's a problem that people go back into the system and track back to which lot of a specific ingredient from what vendor was used in manufacturing a specific batch of food.

I think the level of granularity that you want to track all this stuff at to start with is the end user level. If you start getting into the complexity that the manufacturers deal with, you'll get overwhelmed.

Tom

From spencerkelly86 at gmail.com Sat Jun 20 17:27:26 2009
From: spencerkelly86 at gmail.com (Spencer Kelly)
Date: Sat, 20 Jun 2009 14:27:26 -0300
Subject: [Data-modeling] Products with ingredients
In-Reply-To:
References: <6586947.166861245436075490.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID:

On Fri, Jun 19, 2009 at 8:11 PM, Tom Morris wrote:
> I think the level of granularity that you want to track all this stuff
> at to start with is the end user level.
If you start getting into the
> complexity that the manufacturers deal with, you'll get overwhelmed.

+1. this is handled with the material type, which 'product ingredient' should just cotype.

From spencerkelly86 at gmail.com Mon Jun 22 12:34:20 2009
From: spencerkelly86 at gmail.com (Spencer Kelly)
Date: Mon, 22 Jun 2009 09:34:20 -0300
Subject: [Data-modeling] space mission
Message-ID:

hello all,

I'm feeling that the time has come to bulk up our space mission schema. if unfamiliar, the schema (one of our first) is well-built, but modest - and doesn't support modeling the things a space mission does outside of its 'Mission destination' property, which are quite numerous.

Principally, i think we should now treat launches and landings as events. This is a smooth way to begin adding data here, and also most generous toward other types (crowd event, televised event, disaster...) that could be adding data here. A date property isn't sufficient for these subjects.

as well, i think mission should cotype project focus, for funding, design, and management information.

space missions take part in other events, like flybys, dockings, accidents, particular experiments, diplomatic events, and spacewalks - which are all begging to become structured data.

sounds like a ripe subject for a bit of brainstorming...

From jeff at metaweb.com Tue Jun 23 20:24:38 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Tue, 23 Jun 2009 13:24:38 -0700
Subject: [Data-modeling] Organization mergers
Message-ID:

Tfmorris just pointed out (in a thread here: ) that although we have a good schema for company mergers, we don't have a similar model on the Organization type. I think that simply replicating the merger schema from the /business/company type would work well and be uncontroversial, but I thought I would check here to see if anyone had further thoughts.

I also wonder if we should have properties for Spun Off/Spun Off From, as we do on the Company type.

Thoughts?

Jeff Prucher
Type Librarian & Ontologist
Metaweb Technologies, Inc.

From jeff at metaweb.com Tue Jun 23 21:18:15 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Tue, 23 Jun 2009 14:18:15 -0700
Subject: [Data-modeling] Products with ingredients
In-Reply-To: <121132659.228131245436943914.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID: <7D281CB850B1461CB59ECE7398B9CC46@p4>

OK, I've got the double-phylogeny pattern working now. Take a look here:
http://www.sandbox-freebase.com/type/schema/business/product_ingredient

And here's a table view of the ingredients of a breakfast cereal I found in the office kitchen:
http://www.sandbox-freebase.com/view/user/jeff/default_domain/views/cranberry_almond_crunch_ingredients

I'm not really happy with the "variety of" and "generalization of" names, but I'm not coming up with anything better. Any suggestions would be most welcome.
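
For readers who cannot load the sandbox schema, here is a minimal Python sketch of one reading of the pattern: the two property names mentioned above, treated as a reciprocal pair. This is an assumption, not a dump of the actual schema; the class, the helper functions, and the sample topics (taken from the sodium lauryl sulfate example earlier in the thread) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ingredient:
    """Hypothetical stand-in for an ingredient topic."""
    name: str
    variety_of: list = field(default_factory=list)         # more general topics
    generalization_of: list = field(default_factory=list)  # more specific topics


def link(specific, general):
    """Record both directions so either topic can be the starting point."""
    specific.variety_of.append(general)
    general.generalization_of.append(specific)


def all_varieties(ingredient):
    """Return the names of every topic reachable through generalization_of."""
    found = []
    stack = list(ingredient.generalization_of)
    while stack:
        node = stack.pop()
        if node.name not in found:
            found.append(node.name)
            stack.extend(node.generalization_of)
    return sorted(found)


sls = Ingredient("Sodium lauryl sulfate")
sls_coconut = Ingredient("Sodium lauryl sulfate (from coconut oil)")
sls_palm = Ingredient("Sodium lauryl sulfate (from palm kernel oil)")
link(sls_coconut, sls)
link(sls_palm, sls)
```

Starting from the general topic, `all_varieties(sls)` collects both annotated variants, which is the kind of single-topic navigation that plain long-named topics would not give.
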
Jeff > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher > Sent: Friday, June 19, 2009 11:42 AM > To: Freebase data modeling mailing list > Subject: Re: [Data-modeling] Products with ingredients > > > ----- "Faye Harris" wrote: > > > From: "Faye Harris" > > To: "Freebase data modeling mailing list" > > > Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 > US/Canada Pacific > > Subject: Re: [Data-modeling] Products with ingredients > > > > Jeff Prucher wrote: > > > ----- "Robert Cook" wrote: > > > > > > > > >> One solution would be to create a topic with a long name -- enter > > it > > >> > > >> exactly as it appears on the label such as "Enriched flour - > > (wheat, > > >> > > >> niacin, iron, baby powder, sawdust, DDT)". > > >> > > > > > > This would answer. Anyone else have any comments or > thoughts on this > > before I load the schema? > > > > > > > The main problem with this is you can't arrive at the products that > > use enriched flour by clicking on a property link from a single > > "enriched > > > > flour" topic. Rather, you have to do a keyword search for products > > based on matching all the various "enriched flour - (foo, bar, bat, > > baz)" > > ingredient topics with the words "enriched" and "flour". > That's quite > > a loss in queriability.> > > > The schema is fine to get us started, but we're still going > to try to > > > > put together some phylogeny pattern in place (in the near future) > > right? > > I plan to add a phylogeny pattern before moving the schema to > freebase.com, which should help queryability. It doesn't > address the fact that topics named things like "enriched > flour (that, that, the other thing)" are exceedingly ugly, > however (no-one said it was called "prettybase.com", though). 
> I was going to post a revised schema to sandbox, with the > double-phylogeny pattern suggested by Robert, but it got > horribly munged in the process. I'll try to fix it, but it > might not be till next week. > > Jeff > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From jeff at metaweb.com Tue Jun 23 21:32:41 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 23 Jun 2009 14:32:41 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: Message-ID: <5A63D50EC4E341D7B82AF35995914AFB@p4> _____ From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Spencer Kelly Sent: Saturday, June 20, 2009 10:27 AM To: Freebase data modeling mailing list Subject: Re: [Data-modeling] Products with ingredients On Fri, Jun 19, 2009 at 8:11 PM, Tom Morris wrote: I think the level of granularity that you want to track all this stuff at to start with is the end user level. If you start getting into the complexity that the manufacturers deal with, you'll get overwhelmed. +1. this is handled with the material type, which 'product ingredient' should just cotype. OK, I think you're right, Tom. We can get users to enter data from a carton; beyond that, we'd need much more information than is easily available (in addition to the added complexities of the model). As far as cotyping "material" goes, I'm torn -- it has one of the phylogeny patterns we want already, and I suppose that all ingredients are technically materials, but the material type seems largely intended for engineering materials (hull material, etc.), which doesn't seem to be quite the same thing. Jeff -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090623/b0174f0c/attachment.htm From faye at metaweb.com Wed Jun 24 01:02:15 2009 From: faye at metaweb.com (Faye Harris) Date: Tue, 23 Jun 2009 18:02:15 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <7D281CB850B1461CB59ECE7398B9CC46@p4> References: <7D281CB850B1461CB59ECE7398B9CC46@p4> Message-ID: <4A417B17.3070009@metaweb.com> Very cool! That rice flour is called a "variety of" rice in the schema is indeed very odd. Based on the sandbox examples, this schema seems to use "variety of" for two types of relationships: 1) variety of, e.g. brown rice is a "variety of" rice 2) derived from, e.g. rice flour is "derived from" rice The former relationship is categorical, the latter relates to post-processing. -- Faye Jeff Prucher wrote: > OK, I've got the double-phylogeny pattern working now. Take a look here: > http://www.sandbox-freebase.com/type/schema/business/product_ingredient > > And here's a table view of the ingredients of a breakfast cereal I found in > the office kitchen: > http://www.sandbox-freebase.com/view/user/jeff/default_domain/views/cranberr > y_almond_crunch_ingredients > > I'm not really happy with the "variety of" and "generalization of" names, > but I'm not coming up with anything better. Any suggestions would be most > welcome. 
> > Jeff > > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher >> Sent: Friday, June 19, 2009 11:42 AM >> To: Freebase data modeling mailing list >> Subject: Re: [Data-modeling] Products with ingredients >> >> >> ----- "Faye Harris" wrote: >> >> >>> From: "Faye Harris" >>> To: "Freebase data modeling mailing list" >>> >> >> >>> Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 >>> >> US/Canada Pacific >> >>> Subject: Re: [Data-modeling] Products with ingredients >>> >>> Jeff Prucher wrote: >>> >>>> ----- "Robert Cook" wrote: >>>> >>>> >>>> >>>>> One solution would be to create a topic with a long name -- enter >>>>> >>> it >>> >>>>> exactly as it appears on the label such as "Enriched flour - >>>>> >>> (wheat, >>> >>>>> niacin, iron, baby powder, sawdust, DDT)". >>>>> >>>>> >>>> This would answer. Anyone else have any comments or >>>> >> thoughts on this >> >>> before I load the schema? >>> >>>> >>>> >>> The main problem with this is you can't arrive at the products that >>> use enriched flour by clicking on a property link from a single >>> "enriched >>> >>> flour" topic. Rather, you have to do a keyword search for products >>> based on matching all the various "enriched flour - (foo, bar, bat, >>> baz)" >>> ingredient topics with the words "enriched" and "flour". >>> >> That's quite >> >>> a loss in queriability.> > >>> The schema is fine to get us started, but we're still going >>> >> to try to >> >>> put together some phylogeny pattern in place (in the near future) >>> right? >>> > > >> I plan to add a phylogeny pattern before moving the schema to >> freebase.com, which should help queryability. It doesn't >> address the fact that topics named things like "enriched >> flour (that, that, the other thing)" are exceedingly ugly, >> however (no-one said it was called "prettybase.com", though). 
>> I was going to post a revised schema to sandbox, with the >> double-phylogeny pattern suggested by Robert, but it got >> horribly munged in the process. I'll try to fix it, but it >> might not be till next week. >> >> Jeff >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling >> >> > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090623/107ab154/attachment.htm From spencerkelly86 at gmail.com Wed Jun 24 14:53:42 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Wed, 24 Jun 2009 11:53:42 -0300 Subject: [Data-modeling] Products with ingredients In-Reply-To: <4A417B17.3070009@metaweb.com> References: <7D281CB850B1461CB59ECE7398B9CC46@p4> <4A417B17.3070009@metaweb.com> Message-ID: > > > Based on the sandbox examples, this schema seems to use "variety of" for > two types of relationships: > 1) variety of, e.g. brown rice is a "variety of" rice > 2) derived from, e.g. rice flour is "derived from" rice > > The former relationship is categorical, the latter relates to > post-processing. > ha, ya faye, this is exactly what happened in developing the materials type. the two types of relationships you mentioned are technically the same, but are at a different granularity. 'variety of' just means they were 'derived from' the same thing at some point of time... ... the problem is that this is unrealistic to model - the differences in all the ingredients by the specific processes ('derived from') that made them, like 'milled for 3.5 hours and baked with bicarbonate' or something. 
so 'derived from' only applies to a few topics, and should be handled with the 'process' type, though i think this type still needs some work.

(breath)

i still vote that 'Variety of' and 'Generalisation of' should be mapped with the materials type. It will just mean renaming its properties... and it's already a commons type. the distinction between a material and a product ingredient can be handled with our food type.

one more note, will this type interact with dish?

From tfmorris at gmail.com Wed Jun 24 17:37:02 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Wed, 24 Jun 2009 13:37:02 -0400
Subject: [Data-modeling] Products with ingredients
In-Reply-To: <4A417B17.3070009@metaweb.com>
References: <7D281CB850B1461CB59ECE7398B9CC46@p4> <4A417B17.3070009@metaweb.com>
Message-ID:

I'm with Faye. It seems very weird to have rice flour and rice so strongly related. I don't consider rice flour to be a generalization of rice at all. About the only places where they would potentially be interchangeable would be for nutritional information or for allergies. You might be able to substitute basmati rice for jasmine rice if you didn't care too much about the difference in texture or maintaining cultural authenticity, but if you substituted rice flour (of any variety), you'd be in a whole heap of trouble.

The examples in the schema descriptions (yay for descriptions!) seem to have the same problem. You can get lavender oil out of a lavender plant, but they aren't generalizations of each other. If anything, the generalization would be aromatic oil or fragrance or something.

For most applications, it's more useful to have things linked together because of common properties rather than because they are made from the same source material or by the same process.
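
The distinction being argued over can be sketched by keeping the two relationships as separate link kinds rather than one overloaded property. The Python below is only an illustration under that assumption: the relation names `variety_of` and `derived_from`, the triple layout, and the helper are hypothetical, and the sample topics are the ones mentioned in this thread.

```python
# Hypothetical triples: (subject, relation, object).
links = [
    ("Brown rice", "variety_of", "Rice"),
    ("Basmati rice", "variety_of", "Rice"),
    ("Rice flour", "derived_from", "Rice"),
    ("Lavender oil", "derived_from", "Lavender"),
]


def related(links, relation, target):
    """Names linked to target by exactly one kind of relation."""
    return sorted(s for s, r, o in links if r == relation and o == target)
```

With the kinds kept apart, a query for varieties of rice would not pull in rice flour, while a query for things derived from rice still could.
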

-1 for making this even more obscure by linking in Material.

Tom

On Tue, Jun 23, 2009 at 9:02 PM, Faye Harris wrote:
> Very cool!
>
> That rice flour is called a "variety of" rice in the schema is indeed very
> odd.
>
> Based on the sandbox examples, this schema seems to use "variety of" for
> two types of relationships:
> 1) variety of, e.g. brown rice is a "variety of" rice
> 2) derived from, e.g. rice flour is "derived from" rice
>
> The former relationship is categorical, the latter relates to
> post-processing.
>
> -- Faye
>
> Jeff Prucher wrote:
>> OK, I've got the double-phylogeny pattern working now. Take a look here:
>> http://www.sandbox-freebase.com/type/schema/business/product_ingredient
>>
>> And here's a table view of the ingredients of a breakfast cereal I found in
>> the office kitchen:
>> http://www.sandbox-freebase.com/view/user/jeff/default_domain/views/cranberry_almond_crunch_ingredients
>>
>> I'm not really happy with the "variety of" and "generalization of" names,
>> but I'm not coming up with anything better. Any suggestions would be most
>> welcome.
>>
>> Jeff
>>
>>> -----Original Message-----
>>> From: data-modeling-bounces at freebase.com
>>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher
>>> Sent: Friday, June 19, 2009 11:42 AM
>>> To: Freebase data modeling mailing list
>>> Subject: Re: [Data-modeling] Products with ingredients
>>>
>>> ----- "Faye Harris" wrote:
>>>> From: "Faye Harris"
>>>> To: "Freebase data modeling mailing list"
>>>> Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 US/Canada Pacific
>>>> Subject: Re: [Data-modeling] Products with ingredients
>>>>
>>>> Jeff Prucher wrote:
>>>>> ----- "Robert Cook" wrote:
>>>>>> One solution would be to create a topic with a long name -- enter it
>>>>>> exactly as it appears on the label such as "Enriched flour - (wheat,
>>>>>> niacin, iron, baby powder, sawdust, DDT)".
>>>>>
>>>>> This would answer. Anyone else have any comments or thoughts on this
>>>>> before I load the schema?
>>>>
>>>> The main problem with this is you can't arrive at the products that
>>>> use enriched flour by clicking on a property link from a single "enriched
>>>> flour" topic. Rather, you have to do a keyword search for products
>>>> based on matching all the various "enriched flour - (foo, bar, bat, baz)"
>>>> ingredient topics with the words "enriched" and "flour". That's quite
>>>> a loss in queriability.
>>>>
>>>> The schema is fine to get us started, but we're still going to try to
>>>> put together some phylogeny pattern in place (in the near future) right?
>>>
>>> I plan to add a phylogeny pattern before moving the schema to
>>> freebase.com, which should help queryability. It doesn't
>>> address the fact that topics named things like "enriched
>>> flour (that, that, the other thing)" are exceedingly ugly,
>>> however (no-one said it was called "prettybase.com", though).
>>> I was going to post a revised schema to sandbox, with the
>>> double-phylogeny pattern suggested by Robert, but it got
>>> horribly munged in the process. I'll try to fix it, but it
>>> might not be till next week.
>>>
>>> Jeff
>>> _______________________________________________
>>> Data-modeling mailing list
>>> Data-modeling at freebase.com
>>> http://lists.freebase.com/mailman/listinfo/data-modeling
>>
>> _______________________________________________
>> Data-modeling mailing list
>> Data-modeling at freebase.com
>> http://lists.freebase.com/mailman/listinfo/data-modeling
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090624/c819d1b6/attachment.htm

From jeff at metaweb.com Wed Jun 24 21:17:54 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Wed, 24 Jun 2009 14:17:54 -0700
Subject: [Data-modeling] Products with ingredients
Message-ID: <45A11A4F7D4248D584F0A2299E7C2FBD@amd>

There are three examples upthread that led to the phylogeny pattern, each of which is a slightly different case:

(variety) <--> (generalization)
Milled corn <--> Corn
Sodium lauryl sulfate (from coconut oil) <--> Sodium lauryl sulfate
Enriched flour (foo, bar, bazz, fazz) <--> Enriched flour

Faye's division fits this pretty well: Milled corn is derived from corn; SLS (from coconut) is a variety of SLS, and is also derived from coconut; enriched flour (etc., usw) is a variety of enriched flour. (Reviewing this thread, I note that Ed suggested a Processed Ingredient type way back at the outset.)

The big question is, would we be asking for trouble by adding a parent/child relationship to this, in addition to the two phylogeny patterns? Or should we just punt it for now?

Jeff

From kirrily at metaweb.com Wed Jun 24 22:52:05 2009
From: kirrily at metaweb.com (Kirrily Robert)
Date: Wed, 24 Jun 2009 15:52:05 -0700
Subject: [Data-modeling] Fwd: Help wanted: consumer medical ontology
References:
Message-ID:

Mike Shwe asked me to forward this to you. He's looking for people with experience in the medicine field, to help with filling out consumer-oriented medical information, eg. diseases and symptoms. If anyone's interested, please contact him directly.

K.

Begin forwarded message:

> From: Mike Shwe
> Date: June 12, 2009 4:48:17 PM PDT
> To: Kirrily Robert
> Subject: Help wanted: consumer medical ontology
>
> Kirrily,
>
> We're looking for Freebase community members to contribute to
> /medicine, in a consumer medical ontology: filling out the topics and
> properties that we've currently got in diseases, symptoms, risk
> factors, treatments, and medical specialties. Reconciliation with
> other medical ontologies and coding systems will be included.
>
> In contrast to the UMLS (http://www.nlm.nih.gov/research/umls/),
> which is more targeted at medical professionals, we're looking for
> information geared towards consumers, in terms of a lay-person's
> vocabulary and level of abstraction.
>
> Community members that are interested should let us know how much
> they think they'll be able to contribute to the effort, e.g.,
> disease profiles (symptoms, risk factors, treatments) for X
> diseases, disease ontologies for some field of medicine, or some
> other measurable piece of the consumer medical area.
>
> Thanks,
> Mike

-- 
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com
http://freebase.com/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090624/cbd25ff6/attachment-0001.htm

From duncan.oliver at gmail.com Fri Jun 26 20:11:57 2009
From: duncan.oliver at gmail.com (Duncan Oliver)
Date: Fri, 26 Jun 2009 15:11:57 -0500
Subject: [Data-modeling] Proposed Type (cvg): Computer Game Distribution Channel
Message-ID:

Posted about this about a month ago in discussion. Thought I would propose a fix. The thread and my proposal are here:
http://www.freebase.com/view/guid/9202a8c04000641f80000000083da842

I think a new type needs to be made for services like Xbox Live and WiiWare to illustrate how a game is delivered. These aren't really platforms for games, and should not be typed as such.

--- Duncan

From faye at metaweb.com Fri Jun 26 21:24:04 2009
From: faye at metaweb.com (Faye Harris)
Date: Fri, 26 Jun 2009 14:24:04 -0700
Subject: [Data-modeling] Products with ingredients
In-Reply-To: <45A11A4F7D4248D584F0A2299E7C2FBD@amd>
References: <45A11A4F7D4248D584F0A2299E7C2FBD@amd>
Message-ID: <4A453C74.9000403@metaweb.com>

+1 on adding the parent-child relationship so that we'd have both "variety of" and "derives from" properties, instead of just the former for both purposes.
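
For illustration only, here is a minimal Python sketch of what keeping the two relationships separate could look like. It is not the actual Freebase schema: the two dictionaries and the helper names are hypothetical, and the sample data is taken from the examples given upthread.

```python
# Hypothetical stand-ins for two separate properties; not real Freebase property ids.
VARIETY_OF = {
    "brown rice": ["rice"],
    "SLS (from coconut)": ["sodium lauryl sulfate"],
}
DERIVED_FROM = {
    "rice flour": ["rice"],
    "milled corn": ["corn"],
    "SLS (from coconut)": ["coconut"],
}


def closure(start, relation):
    """Return everything reachable from `start` by following `relation` links."""
    found = []
    stack = list(relation.get(start, []))
    while stack:
        item = stack.pop()
        if item not in found:
            found.append(item)
            stack.extend(relation.get(item, []))
    return found


def is_variety_of(a, b):
    """True only for the categorical relationship, never for post-processing."""
    return b in closure(a, VARIETY_OF)


def is_derived_from(a, b):
    """True only for the post-processing relationship."""
    return b in closure(a, DERIVED_FROM)
```

With the properties split this way, a query for varieties of rice would return brown rice but not rice flour, while one topic (the coconut-derived SLS) can still carry both kinds of link.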

-- Faye

From stefano at metaweb.com Fri Jun 26 22:42:10 2009
From: stefano at metaweb.com (Stefano Mazzocchi)
Date: Fri, 26 Jun 2009 15:42:10 -0700
Subject: [Data-modeling] Freebase Schemas weirdness
Message-ID: <4A454EC2.9090405@metaweb.com>

While working on the schema explorer I came across a bunch of things that don't look quite right, notably:

1) /atom is a domain about.... feeds. Am I the only one to think that is really a bad name for it? (I understand this was created a long time ago when the graph was small, but still)

2) /nytimes, /wikipedia and /imbd are domains that contain one type each and no instances.... are these really domains (they are typed as so) or just key namespaces?

3) why aren't /pipeline and /dataworld a subdomain of /freebase like /freebase/apps is?

4) why is /cricket not co-typed "/freebase/domain_profile" like all the other consumer domains are?

5) why are there so many special namespaces off of the root node? are they used or are they vestigial? see http://tinyurl.com/qqenbz

Thanks!

-- 
Stefano Mazzocchi
Application Catalyst
Metaweb Technologies, Inc.
stefano at metaweb.com
-------------------------------------------------------------------

From jeff at metaweb.com Fri Jun 26 23:37:48 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Fri, 26 Jun 2009 16:37:48 -0700
Subject: [Data-modeling] Products with ingredients
In-Reply-To: <4A453C74.9000403@metaweb.com>
Message-ID: <32473E87FA9E488982DE6C26F0EA5194@p4>

Here's a revised schema with the derives from/derivative properties. Lemme know whatcha think:

Jeff

From vishal at metaweb.com Fri Jun 26 23:41:42 2009
From: vishal at metaweb.com (Vishal Talwar)
Date: Fri, 26 Jun 2009 16:41:42 -0700 (PDT)
Subject: [Data-modeling] Proposed Type (cvg): Computer Game Distribution Channel
In-Reply-To:
Message-ID: <1840422291.292011246059702854.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>

Hi Duncan,

You probably saw my +1 response to your post, but just in
case, I'm +1-ing it again here.

Also, I've asked Kirrily and Jeff about making you an admin of the CVG domain and they both think it's a great idea; if you'd like to be one, it's all yours :) If not I'll be happy to make the changes you've suggested (assuming no interested party objects).

If you want to know what you're getting into, here's a help page on the subject:
http://www.freebase.com/view/guid/9202a8c04000641f800000000b75f213

Please let me know what you decide.

- Vishal [/user/vtalwar]

From roland.bouman at gmail.com Sun Jun 28 19:46:24 2009
From: roland.bouman at gmail.com (Roland Bouman)
Date: Sun, 28 Jun 2009 21:46:24 +0200
Subject: [Data-modeling] graph, nodes, relationships - a few questions
Message-ID:

Hi All!

this is the first time I'm posting on the data-modeling list - i hope this is the right place.

I am trying to understand MQL so I'm reading the MQL Reference Guide. I have a few questions about a few things I read, and I was hoping to get some help here. In particular, I am a tad confused about some of the things I read in http://mql.freebaseapps.com/ch02.html

In "2.1.
Nodes and Relationships" (http://mql.freebaseapps.com/ch02.html#id2942817) it reads:

"...the Metaweb graph is a set of nodes and a set of links or relationships between those nodes.
....however, the nodes in the graph hold no information themselves.
....All the interesting data in the database is stored in the form of relationships between nodes (or between nodes and primitive values).
...Graphs can be represented visually using circles to represent nodes and arrows between the circles to represent relationships.
"

So far, so good. As I continue reading "2.2. Properties" (http://mql.freebaseapps.com/ch02.html#id2943383), I am confused by this:

"...You may have wondered about the fact that these property identifiers look so much like node identifiers.
....this means, of course, [..] that properties are themselves nodes in the Metaweb graph.
....Since properties are nodes, they can appear in the From column of a table of tuples, and can have relationships themselves."

Now, from what I understand, these "properties" are actually relationships (it's just represented as property in the tabular representation of the graph). But to me this sounds like it contradicts 2.1:

On the one hand, nodes hold no information themselves, rather the relationships between nodes hold the interesting information, yet at the same time, relationships are themselves actually nodes. I also have trouble seeing how this would be visualized "using circles to represent nodes and arrows between the circles to represent relationships". I mean, I can see how it is possible to still visualize a graph if you elect beforehand which nodes to draw as circles, and which ones as connecting lines, but my gut feeling says you cannot visually represent the same node both as a circle and as a connecting line without duplication.

Just to be clear - I am not interested in actually drawing graphs - I am just looking for a way to understand this node/relationship duality.
I'd greatly appreciate any pointers / explanations. kind regards, -- Roland Bouman http://rpbouman.blogspot.com/ Author of "Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL", http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html From narphorium at gmail.com Sun Jun 28 20:33:20 2009 From: narphorium at gmail.com (Shawn Simister) Date: Sun, 28 Jun 2009 16:33:20 -0400 Subject: [Data-modeling] graph, nodes, relationships - a few questions In-Reply-To: References: Message-ID: <4A47D390.5090500@gmail.com> Roland Bouman wrote: > Hi All! > > this is the first time I'm posting on the data-modeling list - i hope > this is the right place. > > I am trying to understand MQL so I'm reading the MQL Reference Guide. > I have a few questions about a few things I read, and I was hoping to > get some help here. > In particular, I am a tad confused about some of the things I read in > http://mql.freebaseapps.com/ch02.html > > In "2.1. Nodes and Relationships" > (http://mql.freebaseapps.com/ch02.html#id2942817) it reads: > > "...the Metaweb graph is a set of nodes and a set of links or > relationships between those nodes. > ....however, the nodes in the graph hold no information themselves. > ....All the interesting data in the database is stored in the form of > relationships between nodes (or between nodes and primitive values). > ...Graphs can be represented visually using circles to represent nodes > and arrows between the circles to represent relationships. > " > > So far, so good. As I continue reading "2.2. Properties" > (http://mql.freebaseapps.com/ch02.html#id2943383), I am confused by > this: > > "...You may have wondered about the fact that these property > identifiers look so much like node identifiers. > ....this means, of course, [..] that properties are themselves nodes > in the Metaweb graph. 
> ....Since properties are nodes, they can appear in the From column of > a table of tuples, and can have relationships themselves." > > Now, from what I understand, these "properties" are actually > relationships (it's just represented as property in the tabular > representation of the graph). > But to me this sounds like it contradicts 2.1: > > On the one hand, nodes hold no information themselves, rather the > relationships between nodes hold the interesting information, yet at > the same time, relationships are themselves actually nodes. > I also have trouble seeing how this would be visualized "using circles to > represent nodes and arrows between the circles to represent > relationships". > I mean, I can see how it is possible to still visualize a graph if you > elect beforehand which nodes to draw as circles, and which ones as > connecting lines, > but my gut feeling says you cannot visually represent the same node > both as a circle and as a connecting line without duplication. > > Just to be clear - I am not interested in actually drawing graphs - I > am just looking for a way to understand this node/relationship > duality. > > I'd greatly appreciate any pointers / explanations. > > kind regards, > > Check out the Thinkbase app that uses Freebase data to draw the sort of graph diagrams that are talked about in chapter 2. For example, see the graph for the movie Transformers which was directed by Michael Bay. You can see that both the Transformers node and the Michael Bay node are connected with a line and if you mouse over that line it will say "Directed By". As you've correctly read, the Directed By property itself is also stored in the graph and can be represented as its own graph in Thinkbase. To show both of these graphs in the same diagram would certainly be confusing because you would need to show a lot more detail than what is currently being shown.
For example, in the Transformers diagram, each line is actually what Freebase calls a Link, which is a node unto itself. To accurately show the underlying structure of the Freebase graph, each link should really be drawn as its own node with lines connecting to the source, target, and master property nodes. However, in most cases it makes more sense to abstract this detail away from the user to give them a simpler view of the graph that is closer to the object-oriented models that most people are used to working with. Don't worry if this makes your head hurt. It should. It's pretty geeky stuff that probably took some really smart people a long time to design. Shawn -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090628/2bae3a48/attachment.htm From jg at metaweb.com Sun Jun 28 20:38:56 2009 From: jg at metaweb.com (John Giannandrea) Date: Sun, 28 Jun 2009 13:38:56 -0700 Subject: [Data-modeling] graph, nodes, relationships - a few questions In-Reply-To: References: Message-ID: Roland Bouman wrote: > One the one hand, nodes hold no information themselves, rather the > relationships between nodes hold the interesting information, yet at > the same time, relationships are themselves actually nodes. The confusion stems from what we call primitives, from which nodes and links are made. You can read more about this here: http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/ Basically we have a database of primitives, from which we make a directed graph (nodes and links) and then MQL tries hard to hide the graph and expose objects with properties. All the properties are links and all the objects are nodes. But everything is a primitive.
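To make the "everything is a primitive" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how graphd or MQL is actually implemented: the Primitive class, its field names (left, right, prop), the as_object helper, and the sample identifiers are all hypothetical. A plain node is a primitive with no endpoints; a link is a primitive that points at a source, a target, and the node that stands for the property. Because the property is an ordinary node, it can be the source of links of its own.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Primitive:
    """One record in a hypothetical primitive store (field names are assumptions)."""
    guid: str
    left: Optional[str] = None   # guid of the source node; None for a plain node
    right: Optional[str] = None  # guid of the target node
    prop: Optional[str] = None   # guid of the node that represents the property

    @property
    def is_link(self) -> bool:
        # In this sketch a primitive is a link if it has a source endpoint.
        return self.left is not None


def as_object(guid: str, store: List[Primitive]) -> Dict[str, List[str]]:
    """Collapse the outgoing links of one node into an object with properties."""
    obj: Dict[str, List[str]] = {}
    for p in store:
        if p.is_link and p.left == guid:
            obj.setdefault(p.prop, []).append(p.right)
    return obj


STORE = [
    # Plain nodes: they carry no data of their own.
    Primitive("transformers"),
    Primitive("michael_bay"),
    Primitive("directed_by"),    # the property is itself just a node
    Primitive("expected_type"),
    Primitive("film_director"),
    # Links: all the interesting data lives here.
    Primitive("link1", left="transformers", right="michael_bay", prop="directed_by"),
    Primitive("link2", left="directed_by", right="film_director", prop="expected_type"),
]

print(as_object("transformers", STORE))
print(as_object("directed_by", STORE))
```

The same node, directed_by, labels an arrow in the first call and is the object being described in the second, which is the duality asked about: the arrow is a separate link primitive that merely refers to the property node, so nothing has to be drawn twice.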
-jg From spencerkelly86 at gmail.com Mon Jun 29 14:37:54 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Mon, 29 Jun 2009 11:37:54 -0300 Subject: [Data-modeling] the commons / non distinction is a farce Message-ID: an undiscussed part of the new layout is that only commons types and their data are displayed on the browse page. a fair choice, especially considering we have (so far) decided not to delete the duplicate, silly, or otherwise junk types. though, by determining if anyone will see the work we do, the commons distinction has now become political. in all fairness, there are some really thoughtful types outside the commons, and some pretty flakey types within. We need a fair way for our good types to 'rise out of the junk', and a responsible (and scalable!) way to do this is allowing users to vote to promote or demote specific types. is metaweb ready to democratize control of what makes the browse page and what doesn't? if implemented, schema building would instantly become more collaborative, goal-oriented, and users would be better motivated to maintain and defend their ontologies. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090629/cba8b542/attachment.htm From duncan.oliver at gmail.com Mon Jun 29 17:21:14 2009 From: duncan.oliver at gmail.com (Duncan Oliver) Date: Mon, 29 Jun 2009 12:21:14 -0500 Subject: [Data-modeling] Proposed Type (cvg): Computer Game Distribution Channel In-Reply-To: <1840422291.292011246059702854.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> References: <1840422291.292011246059702854.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: K, I've made a type in the sandbox to illustrate the idea, feel free to make suggestions for additions (or subtractions). 
Here's a view of some sample topics: http://www.sandbox-freebase.com/view/cvg/computer_game_distribution_system and here's some games with versions that use the new "Distributed through" property: http://www.sandbox-freebase.com/view/user/drakecaiman/default_domain/views/computer_games_with_distribution_systems - Duncan -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090629/363dd3ac/attachment.htm From spatial.db at gmail.com Mon Jun 29 18:08:12 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Mon, 29 Jun 2009 14:08:12 -0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: Message-ID: +1! I find the current approach to commons prioritization pretty frustrating. It just add extra steps and confusion for me and anyone else using my types. I try to use commons types whenever I can, but I'm increasingly finding commons types and properties that duplicate what I've already modeled. There is therefore an increased burden on me to repeatedly refactor my models and migrate data if I use them. -Ed On Mon, Jun 29, 2009 at 10:37 AM, Spencer Kelly wrote: > an undiscussed part of the new layout is that only commons types and their > data are displayed on the browse page. > a fair choice, especially considering we have (so far) decided not to > delete the duplicate, silly, or otherwise junk types. > though, by determining if anyone will see the work we do, the commons > distinction has now become political. > in all fairness, there are some really thoughtful types outside the > commons, and some pretty flakey types within. > We need a fair way for our good types to 'rise out of the junk', and a > responsible (and scalable!) way to do this is allowing users to vote to > promote or demote specific types. > > is metaweb ready to democratize control of what makes the browse page and > what doesn't? 
> if implemented, schema building would instantly become more collaborative, > goal-oriented, and users would be better motivated to maintain and defend > their ontologies. > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090629/9b19a109/attachment-0001.htm From stefano at metaweb.com Mon Jun 29 18:20:49 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Mon, 29 Jun 2009 11:20:49 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: Message-ID: <4A490601.7090200@metaweb.com> Spencer Kelly wrote: > an undiscussed part of the new layout is that only commons types and > their data are displayed on the browse page. > a fair choice, especially considering we have (so far) decided not to > delete the duplicate, silly, or otherwise junk types. > though, by determining if anyone will see the work we do, the commons > distinction has now become political. > in all fairness, there are some really thoughtful types outside the > commons, and some pretty flakey types within. > We need a fair way for our good types to 'rise out of the junk', and a > responsible (and scalable!) way to do this is allowing users to vote to > promote or demote specific types. > > is metaweb ready to democratize control of what makes the browse page > and what doesn't? Say it were and you were in charge of making it happen: how would you do it? I'm not defensive, I'm honestly curious. > if implemented, schema building would instantly become more > collaborative, goal-oriented, and users would be better motivated to > maintain and defend their ontologies. -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. 
stefano at metaweb.com ------------------------------------------------------------------- From kirrily at metaweb.com Mon Jun 29 18:39:49 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 29 Jun 2009 11:39:49 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <4A490601.7090200@metaweb.com> References: <4A490601.7090200@metaweb.com> Message-ID: <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> On Jun 29, 2009, at 11:20 AM, Stefano Mazzocchi wrote: > > Say it were and you were in charge of making it happen: how would > you do > it? > > I'm not defensive, I'm honestly curious. This was what I was going to ask too :) We've talked about this a LOT internally and it's actually an extremely hard problem (or at least we find it to be so). We would really welcome suggestions for the design of a more collaborative way of doing schema that doesn't break things for app developers. Even just a workflow of "I'm a schema developer, here's what I would like to do, step by step" would help. Meanwhile, there are some things we can do to help with the main annoyances, I think. The email notification system that just came live is a help as it allows us to contact schema admins more easily. There's a new Freebase Suggest in the works which I hope will make it easier to distinguish between multiple types with the same name. Acre code search is likely to be increasingly useful for seeing just how many people are using a certain schema in their apps (it won't catch non-Acre apps, but as the number of apps grows, it will at least give us *some* idea.) K. 
-- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From iainsproat at gmail.com Mon Jun 29 20:19:45 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Tue, 30 Jun 2009 00:19:45 +0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> Message-ID: > > There's a new Freebase Suggest in the works which I hope will make it > easier to distinguish between multiple types with the same name +1, the number of times I've almost typed Politicians as Brazilian Politicians .... The email notification system that just came > live is a help as it allows us to contact schema admins more easily. I've noticed emails making a big difference already. good stuff! Brainstorming, I've come up with the following ideas: 1. A lot of types I create are just drafts. I would rather create a draft type in the sandbox if it wasn't guaranteed that my hard work was scrubbed away each week. Perhaps a way of flagging draft types in the sandbox so they are saved temporarily during scrubbing and get re-inserted. It would reduce one of the biggest hangups of sandbox use. If the sandbox was used more, it would reduce the number of types in the production graph. 2. Making it easier to clone types in the sandbox would help boost its popularity as the place to do schema drafts. Once the draft is finished, a quick way of getting it from sandbox to a base in the production graph would help reduce friction. One click would be great! 3. A more drastic solution-> User properties. Similar to having user types on topics, it might be possible to have user properties on types? Would definitely help with all the requests for reciprocation. And if a user property proves useful, the type admin could click a button to bring it into the fold. 4. 
And more complex; some Digg style voting on types and properties. Coupled with an algorithm which uses those votes and the number of links to types/properties could alter the UI visibility in the client, greying out unpopular and brightening up the popular. Linking that to a leaderboard, rising and falling types/properties could be identified and would help to spot any candidates for commons promotion. Acre code search is likely to be increasingly useful for seeing just how > many people are using a certain schema in their apps Ooh, is this available now? This would be useful for discovering new apps relevant to a domain. Iain On Mon, Jun 29, 2009 at 10:39 PM, Kirrily Robert wrote: > On Jun 29, 2009, at 11:20 AM, Stefano Mazzocchi wrote: > > > > Say it were and you were in charge of making it happen: how would > > you do > > it? > > > > I'm not defensive, I'm honestly curious. > > > This was what I was going to ask too :) We've talked about this a LOT > internally and it's actually an extremely hard problem (or at least we > find it to be so). We would really welcome suggestions for the design > of a more collaborative way of doing schema that doesn't break things > for app developers. Even just a workflow of "I'm a schema developer, > here's what I would like to do, step by step" would help. > > Meanwhile, there are some things we can do to help with the main > annoyances, I think. The email notification system that just came > live is a help as it allows us to contact schema admins more easily. > There's a new Freebase Suggest in the works which I hope will make it > easier to distinguish between multiple types with the same name. Acre > code search is likely to be increasingly useful for seeing just how > many people are using a certain schema in their apps (it won't catch > non-Acre apps, but as the number of apps grows, it will at least give > us *some* idea.) > > K. 
> > -- > Kirrily Robert > Freebase Community Director > kirrily at metaweb.com > http://freebase.com/ > > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090630/96422711/attachment.htm From stefano at metaweb.com Mon Jun 29 20:37:26 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Mon, 29 Jun 2009 13:37:26 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> Message-ID: <4A492606.6040304@metaweb.com> Iain Sproat wrote: > Acre code search is likely to be increasingly useful for seeing just how > many people are using a certain schema in their apps > > Ooh, is this available now? This would be useful for discovering new > apps relevant to a domain. http://codesearch.freebaseapps.com/ -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. stefano at metaweb.com ------------------------------------------------------------------- From tfmorris at gmail.com Mon Jun 29 21:46:17 2009 From: tfmorris at gmail.com (Tom Morris) Date: Mon, 29 Jun 2009 17:46:17 -0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> Message-ID: On Mon, Jun 29, 2009 at 2:39 PM, Kirrily Robert wrote: > The email notification system that just came > live is a help as it allows us to contact schema admins more easily. I wouldn't count on this necessarily helping. 
I was quite surprised recently to discover that I wasn't subscribed to updates for a domain that I created, so I just created https://bugs.freebase.com/browse/FREEBASE-834 suggesting that admins be forced to listen to updates on their domains. > Acre > code search is likely to be increasingly useful for seeing just how > many people are using a certain schema in their apps (it won't catch > non-Acre apps, but as the number of apps grows, it will at least give > us *some* idea.) If this is to distinguish things that need non-breaking refactorings vs those that don't, I think you should assume that every piece of the schema is being used and refactor it in a non-breaking fashion, if at all possible. If it's to determine "popularity" of particular domains/types/properties, I'd suggest that doing it by query count would be a) more realistic in terms of popularity and b) fairer to non-Acre apps. Tom From robert at metaweb.com Mon Jun 29 22:05:50 2009 From: robert at metaweb.com (Robert Cook) Date: Mon, 29 Jun 2009 15:05:50 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> Message-ID: On Jun 29, 2009, at 2:46 PM, Tom Morris wrote: > On Mon, Jun 29, 2009 at 2:39 PM, Kirrily Robert > wrote: > >> The email notification system that just came >> live is a help as it allows us to contact schema admins more easily. > > I wouldn't count on this necessarily helping. I was quite surprised > recently to discover that I wasn't subscribed to updates for a domain > that I created, so I just created > https://bugs.freebase.com/browse/FREEBASE-834 suggesting that admins > be forced to listen to updates on their domains. I wasn't aware of this problem. I just used the Jira voting feature to support this fix. Being aware of discussions in a domain that you administer should be the minimum responsibility. 
> >> Acre >> code search is likely to be increasingly useful for seeing just how >> many people are using a certain schema in their apps (it won't catch >> non-Acre apps, but as the number of apps grows, it will at least give >> us *some* idea.) > > If this is to distinguish things that need non-breaking refactorings > vs those that don't, I think you should assume that every piece of the > schema is being used and refactor it in a non-breaking fashion, if at > all possible. If it's to determine "popularity" of particular > domains/types/properties, I'd suggest that doing it by query count > would be a) more realistic in terms of popularity and b) fairer to > non-Acre apps. Schema stability is a major factor when promoting private domains to the commons. In practice, however, there are some pretty fundamental refactorings happening in the commons, albeit with a lot of warning and discussion. I think the question is whether schemas that aren't stable enough to be in commons should be displayed on the "read-only" topic view. They probably should be, but I'm not sure if it's worth introducing an extra level of complexity. An algorithm of the sort you describe isn't transparently deterministic enough; it might confuse people when information suddenly appears on (or disappears from) topic pages. I believe we should just be more proactive about moving bases into the commons. R -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090629/265d1790/attachment-0001.htm From kirrily at metaweb.com Mon Jun 29 22:14:37 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 29 Jun 2009 15:14:37 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> Message-ID: <6D65FA8D-2A1A-4678-A17A-0CB7837CFDEE@metaweb.com> On Jun 29, 2009, at 2:46 PM, Tom Morris wrote: > On Mon, Jun 29, 2009 at 2:39 PM, Kirrily Robert > wrote: > >> The email notification system that just came >> live is a help as it allows us to contact schema admins more easily. > > I wouldn't count on this necessarily helping. I was quite surprised > recently to discover that I wasn't subscribed to updates for a domain > that I created, so I just created > https://bugs.freebase.com/browse/FREEBASE-834 suggesting that admins > be forced to listen to updates on their domains. This is how it currently works, in fact. You are not "subscribed" in the sense of actively following a domain you admin, but all admins of a domain (base or commons) do get notifications for any discussion posts on their domain's homepage. >> Acre >> code search is likely to be increasingly useful for seeing just how >> many people are using a certain schema in their apps (it won't catch >> non-Acre apps, but as the number of apps grows, it will at least give >> us *some* idea.) > > If this is to distinguish things that need non-breaking refactorings > vs those that don't, I think you should assume that every piece of the > schema is being used and refactor it in a non-breaking fashion, if at > all possible. If it's to determine "popularity" of particular > domains/types/properties, I'd suggest that doing it by query count > would be a) more realistic in terms of popularity and b) fairer to > non-Acre apps. 
There are some things that simply cannot be refactored in a non- breaking way. For instance, changing a property from unique to non- unique. There's no way to do that while guaranteeing backward compatibility. When that sort of thing comes up, we make a judgement call on just how much notice is required, based on how much the types in question appear to be used. We look at both the number of links, how common saved views are against a type, and how common apps are against a type. It's a bit of a black art, but we can judge all these things based on what's in Freebase, *except* for external (non-Acre) applications. The help topic on "Freebase Commons admin guidelines" at http://www.freebase.com/view/guid/9202a8c04000641f800000000b75f213 describes how we usually handle it. Anyway -- there'll never be a way to tell exactly what people are doing with the API or data dumps, but Acre code search may help us guess at the popularity of a type. For instance, if there are many Acre apps, then there are probably many non-Acre apps too. If there are no Acre apps, then while there *might* be a bunch of non-Acre apps that we don't know about, the likelihood is less. K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From tfmorris at gmail.com Mon Jun 29 22:36:41 2009 From: tfmorris at gmail.com (Tom Morris) Date: Mon, 29 Jun 2009 18:36:41 -0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <6D65FA8D-2A1A-4678-A17A-0CB7837CFDEE@metaweb.com> References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> <6D65FA8D-2A1A-4678-A17A-0CB7837CFDEE@metaweb.com> Message-ID: On Mon, Jun 29, 2009 at 6:14 PM, Kirrily Robert wrote: > On Jun 29, 2009, at 2:46 PM, Tom Morris wrote: > >> On Mon, Jun 29, 2009 at 2:39 PM, Kirrily Robert >> wrote: >> >>> The email notification system that just came >>> live is a help as it allows us to contact schema admins more easily. 
>> >> I wouldn't count on this necessarily helping. I was quite surprised >> recently to discover that I wasn't subscribed to updates for a domain >> that I created, so I just created >> https://bugs.freebase.com/browse/FREEBASE-834 suggesting that admins >> be forced to listen to updates on their domains. > > This is how it currently works, in fact. You are not "subscribed" in > the sense of actively following a domain you admin, but all admins of > a domain (base or commons) do get notifications for any discussion > posts on their domain's homepage. Was this a recent change? This happened a couple of weeks ago (pre-email notifications), so I'm talking about notifications on my logged-in Freebase home page that never showed up, but I'm assuming that there's a 1:1 correspondence between email notifications and homepage notifications. Oh wait, I see what might be the difference, you wrote "domain's *homepage*." (emphasis added) Does that mean discussions on a specific type in the domain don't necessarily get notifications posted unless you're subscribed? If so, I think these should be included, because that's where people end up when they click "Suggest a property" on the type schema page. All such suggestions should get admin notifications. Tom From kirrily at metaweb.com Mon Jun 29 23:59:13 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 29 Jun 2009 16:59:13 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> <6D65FA8D-2A1A-4678-A17A-0CB7837CFDEE@metaweb.com> Message-ID: <69C8E96D-E638-4FAE-BF07-3E950CC8EABD@metaweb.com> On Jun 29, 2009, at 3:36 PM, Tom Morris wrote: > Was this a recent change?
This happened a couple of weeks ago > (pre-email notifications), so I'm talking about notifications on my > logged-in Freebase home page that never showed up, but I'm assuming > that there's a 1:1 correspondence between email notifications and > homepage notifications. There is no such correspondence, sorry. The newsfeed on the homepage and the email notifications are completely different subsystems. > Oh wait, I see what might be the difference, you wrote "domain's > *homepage*." (emphasis added) Does that mean discussions on a > specific type in the domain don't necessarily get notifications posted > unless you're subscribed? If so, I think these should be included, > because that's where people end up when they click "Suggest a > property" on the type schema page. All such suggestions should get > admin notifications. Hmm, I thought type discussions were meant to be automatically crossposted to the domain they were part of, but this does not seem to be the case. K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From spatial.db at gmail.com Tue Jun 30 00:00:43 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Mon, 29 Jun 2009 20:00:43 -0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> <000B2B10-B8B0-4BBA-B8B0-7CFD2F6093D3@metaweb.com> <6D65FA8D-2A1A-4678-A17A-0CB7837CFDEE@metaweb.com> Message-ID: > > Oh wait, I see what might be the difference, you wrote "domain's > *homepage*." (emphasis added) Does that mean discussions on a > specific type in the domain don't necessarily get notifications posted > unless you're subscribed? If so, I think these should be included, > because that's where people end up when they click "Suggest a > property" on the type schema page. All such suggestions should get > admin notifications.
> > Tom > See my similar comment from 21 June on https://bugs.freebase.com/browse/CLI-7723 -Ed -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090629/df91a0d1/attachment.htm From spencerkelly86 at gmail.com Tue Jun 30 14:58:02 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Tue, 30 Jun 2009 11:58:02 -0300 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <4A490601.7090200@metaweb.com> References: <4A490601.7090200@metaweb.com> Message-ID: On Mon, Jun 29, 2009 at 3:20 PM, Stefano Mazzocchi wrote: > Say it were and you were in charge of making it happen: how would you do > it? if i were king of freebase, i would decree: 'No types shall be createth in the commons' Metaweb employees would have to make their types in bases, and all types would be evaluated for promotion according to something like http://nominate.freebaseapps.com/ (not working yet) On Mon, Jun 29, 2009 at 7:05 PM, Robert Cook wrote: > I think the question is whether schemas that aren't stable enough to be in > commons should be displayed on the "read-only" topic view. They probably > should be, but I'm not sure if it's worth introducing an extra level of > complexity. agree. From tfmorris at gmail.com Tue Jun 30 15:14:08 2009 From: tfmorris at gmail.com (Tom Morris) Date: Tue, 30 Jun 2009 11:14:08 -0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> Message-ID: My preferred fix for the disparity between sucky Commons domains and excellent private (base) domains is to see the quality of the Commons domains improve. I have no real desire to win fame and glory by maintaining a Sailing domain (which is actually called sails.freebase.com because Jamie grabbed Sailing and then never put anything in it).
I did it because it seemed like too much work to get the Boats domain whipped into shape so it could deal with things which weren't engine powered military ships in a reasonable fashion. Now perhaps a more effective promotion mechanism would encourage me to create Boats2 and lobby for its promotion based on merit, but what I'd really prefer is for the admins of Boats to respond to the half dozen schema suggestions that I and others have made, even if just to say "No way - that's stupid." In some ways this is just another aspect of the same problem, but I wanted to present a slightly different emphasis on ways to solve it. Tom From philip-freebase at shadowmagic.org.uk Tue Jun 30 15:40:12 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Tue, 30 Jun 2009 16:40:12 +0100 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: Message-ID: <20090630154010.GI32383@sphinx.mythic-beasts.com> On Mon, Jun 29, 2009 at 11:37:54AM -0300, Spencer Kelly wrote: > is metaweb ready to democratize control of what makes the browse page and > what doesn't? Let me phrase that question a different way: is the Freebase community ready to take on the responsibility of what makes commons / the browse page? Honestly, I don't think that it is. Look at any type-related discussion on Freebase, either on this list or the site itself and, most of the time, the people that are actually giving critical reviews to a type are Metaweb staff, usually Jeff. While I'm in no way saying that the current mechanisms (or lack thereof) for getting a type into the commons are perfect, one of the most important things (IMO) with regards to Freebase's data is having high quality schemas, and that's not something I'd like to see us compromise on. 
Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From robert at metaweb.com Tue Jun 30 16:53:11 2009 From: robert at metaweb.com (Robert Cook) Date: Tue, 30 Jun 2009 09:53:11 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: References: <4A490601.7090200@metaweb.com> Message-ID: <5F07812D-BD77-4A38-8DF1-E265EC189697@metaweb.com> I agree that the most important thing we could do now is list out the domains which are not up to the community's standards and either fix them in the near term or move them to bases until they have properly gestated. It would be very helpful if people could list the domains they think need serious work and what they'd like to be done, and we'll work together to fix them. I think this mailing list is a good place to air pent-up concerns about domains, and if it becomes too detailed, we'll move it to a different venue. There are a couple of domains that I believe need work: Boats - http://schemas.freebaseapps.com/domain?id=/boats -- I agree with Tom, this is a bit of a mess. (And maybe this should be renamed to "Powered naval vessels".) It was started by Patrick, who made great initial progress 2 years ago, but he no longer works for Metaweb. Anime/manga - http://schemas.freebaseapps.com/domain?id=/anime_manga -- Anime blends media types (film, video, OVA) in a way that forces questions about property delegation. There is a lot of really good data waiting to be loaded from Wikipedia, but the schemas need to be figured out first. Others? R On Jun 30, 2009, at 8:14 AM, Tom Morris wrote: > My preferred fix for the disparity between sucky Commons domains and > excellent private (base) domains is to see the quality of the Commons > domains improve. > > I have no real desire to win fame and glory by maintaining a Sailing > domain (which is actually called sails.freebase.com because Jamie > grabbed Sailing and then never put anything in it).
I did it because > it seemed like too much work to get the Boats domain whipped into > shape so it could deal with things which weren't engine powered > military ships in a reasonable fashion. > > Now perhaps a more effective promotion mechanism would encourage me to > create Boats2 and lobby for its promotion based on merit, but what I'd > really prefer is for the admins of Boats to respond to the half dozen > schema suggestions that I and others have made, even if just to say > "No way - that's stupid." > > In some ways this is just another aspect of the same problem, but I > wanted to present a slightly different emphasis on ways to solve it. > > Tom > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From iainsproat at gmail.com Tue Jun 30 17:22:51 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Tue, 30 Jun 2009 21:22:51 +0400 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <5F07812D-BD77-4A38-8DF1-E265EC189697@metaweb.com> References: <4A490601.7090200@metaweb.com> <5F07812D-BD77-4A38-8DF1-E265EC189697@metaweb.com> Message-ID: soccer - but I've worked out a new schema in a draft, which will be moved across; so there should be some big changes shortly. engineering - my bad, was tasked to look at this and didn't make much progress. Spencer and I threw some ideas in engineering draft, and there's also a lot of good ideas in the material sciencebase. I'll make a resolution to get back round to this.... And there are a few with not very many types/data: atom feeds, bicycles, geology. Iain On Tue, Jun 30, 2009 at 8:53 PM, Robert Cook wrote: > I agree that the most important thing we could do now is list out > the domains which are not up to the community's standards and either > fix them in the near term or move them to bases until they have > properly gestated.
> > > > In some ways this is just another aspect of the same problem, but I > > wanted to present a slightly different emphasis on ways to solve it. > > > > Tom > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090630/e222f8f0/attachment.htm From kirrily at metaweb.com Tue Jun 30 18:39:16 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Tue, 30 Jun 2009 11:39:16 -0700 Subject: [Data-modeling] the commons / non distinction is a farce In-Reply-To: <5F07812D-BD77-4A38-8DF1-E265EC189697@metaweb.com> References: <4A490601.7090200@metaweb.com> <5F07812D-BD77-4A38-8DF1-E265EC189697@metaweb.com> Message-ID: <261B0EB5-AB10-465D-861D-988C8B68D9A5@metaweb.com> On Jun 30, 2009, at 9:53 AM, Robert Cook wrote: > Boats - http://schemas.freebaseapps.com/domain?id=/boats -- I agree > with Tom, this is a bit of a mess. (And maybe this should be renamed > to "Powered naval vessels".) It was started by Patrick, who made great > initial progress 2 years ago, but he no longer works for Metaweb. FYI, I have a draft of a completely reworked version of this at http://www.freebase.com/view/user/skud/boats ... not finished but at least the rough bones are there. K. 
-- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From vishal at metaweb.com Tue Jun 30 18:42:16 2009 From: vishal at metaweb.com (Vishal Talwar) Date: Tue, 30 Jun 2009 11:42:16 -0700 (PDT) Subject: [Data-modeling] Proposed Type (cvg): Computer Game Distribution Channel In-Reply-To: <1545905050.301751246387252017.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <213129060.301831246387336393.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> I like the schema but the client has a bit of a problem making that view useful for adding data. I clicked on "Add More" to add a game being distributed through Steam, but the only property of Computer Game Version I was able to set was "Distributed through" (presumably because you were requiring it in your filters), leading to a very incomplete CVT... I had to go to the edit page of the game to add a platform. Anyhow, the schema itself looks good. Vishal ----- Original Message ----- From: "Duncan Oliver" To: "Freebase data modeling mailing list" Sent: Monday, June 29, 2009 10:21:14 AM GMT -08:00 US/Canada Pacific Subject: Re: [Data-modeling] Proposed Type (cvg): Computer Game Distribution Channel K, I've made a type in the sandbox to illustrate the idea, feel free to make suggestions for additions (or subtractions). 
Here's a view of some sample topics: http://www.sandbox-freebase.com/view/cvg/computer_game_distribution_system and here's some games with versions that use the new "Distributed through" property: http://www.sandbox-freebase.com/view/user/drakecaiman/default_domain/views/computer_games_with_distribution_systems - Duncan _______________________________________________ Data-modeling mailing list Data-modeling at freebase.com http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Tue Jun 30 19:57:16 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Tue, 30 Jun 2009 12:57:16 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <32473E87FA9E488982DE6C26F0EA5194@p4> Message-ID: <40A040FCC3624723B96F4F0A6DCB798B@p4> And one last time: http://www.sandbox-freebase.com/view/user/typelibrarian/default_domain/views/cheerios_ingredients Jeff > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher > Sent: Friday, June 26, 2009 4:38 PM > To: 'Freebase data modeling mailing list' > Subject: Re: [Data-modeling] Products with ingredients > > Here's a revised schema with the derives from/derivative > properties. Lemme know whatcha think: > > lt_domain/view > s/cranberry_almond_crunch_ingredients> > > Jeff > > > -----Original Message----- > > From: data-modeling-bounces at freebase.com > > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris > > Sent: Friday, June 26, 2009 2:24 PM > > To: Freebase data modeling mailing list > > Subject: Re: [Data-modeling] Products with ingredients > > > > +1 on adding the parent-child relationship so that we'd have both > > "variety of" and "derives from" properties, instead of just > the former > > for both purposes.
> > > > -- Faye > > > > > > Jeff Prucher wrote: > > > There are three examples upthread that led to the phylogeny > > pattern, > > > each of which is a slightly different case: > > > > > > (variety) <--> (generalization) > > > Milled corn <--> Corn > > > Sodium lauryl sulfate (from coconut oil) <--> Sodium > lauryl sulfate > > > Enriched flour (foo, bar, bazz, fazz) <--> Enriched flour > > > > > > Faye's division fits this pretty well: > > > Milled corn is derived from corn; SLS (from coconut) is a > > variety of > > > SLS, and is also derived from coconut; enriched flour > > (etc., usw) is a > > > variety of enriched flour. (Reviewing this thread, I note that Ed > > > suggested a Processed Ingredient type way back at the outset.) > > > > > > The big question is, would we be asking for trouble by adding a > > > parent/child relationship to this, in addition to the two > phylogeny > > > patterns? Or should we just punt it for now? > > > > > > Jeff > > > > > > > > > > > >> -----Original Message----- > > >> From: data-modeling-bounces at freebase.com > > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of > Tom Morris > > >> Sent: Wednesday, June 24, 2009 10:37 AM > > >> To: Freebase data modeling mailing list > > >> Subject: Re: [Data-modeling] Products with ingredients > > >> > > >> I'm with Faye. It seems very weird to have rice flour > and rice so > > >> strongly related. I don't consider rice flour to be a > > generalization > > >> of rice at all. About the only places where they would > > potentially > > >> interchangeable would be for nutritional information or > > for allergies. > > >> You might be able to substitute basmati rice for jasmine > > rice if you > > >> didn't care too much about the difference in texture or > > maintaining > > >> cultural authenticity, but if you substituted rice flour (of any > > >> variety), you'd be in a whole heap of trouble. > > >> > > >> > > >> The examples in the schema descriptions (yay for > > >> descriptions!) 
seem to have the same problem. You can get > > lavender > > >> oil out of a lavender plant, but they aren't > > generalizations of each > > >> other. If anything, the generalization would be aromatic oil or > > >> fragrance or something. > > >> > > >> For most applications, it's more useful to have things linked > > >> together because of common properties rather than > because they are > > >> made from the same source material or by the same process. > > >> > > >> -1 for making this even more obscure by linking in Material. > > >> > > >> Tom > > >> > > >> On Tue, Jun 23, 2009 at 9:02 PM, Faye Harris > > wrote: > > >> > > >> > > >> Very cool! > > >> > > >> That rice flour is called a "variety of" rice in the schema is > > >> indeed very odd. > > >> > > >> Based on the sandbox examples, this schema seems to use > > "variety of" > > >> for two types of relationships: > > >> 1) variety of, e.g. brown rice is a "variety of" rice > > >> 2) derived from, e.g. rice flour is "derived from" rice > > >> > > >> The former relationship is categorical, the latter relates to > > >> post-processing. > > >> > > >> -- Faye > > >> > > >> > > >> > > >> Jeff Prucher wrote: > > >> > > >> OK, I've got the double-phylogeny pattern > > working now. Take a look > > >> here: > > >> > > >> http://www.sandbox-freebase.com/type/schema/business/product_i > > >> > > > ngredient > > > > > >> > > >> And here's a table view of the ingredients of a > > breakfast cereal I > > >> found in > > >> the office kitchen: > > >> > > >> http://www.sandbox-freebase.com/view/user/jeff/default_domain/ > > >> > > > views/cranberr > > > > > >> y_almond_crunch_ingredients > > >> > > >> I'm not really happy with the "variety of" and > > "generalization of" > > >> names, > > >> but I'm not coming up with anything better. Any > > suggestions would > > >> be most > > >> welcome. 
> > >> > > >> Jeff > > >> > > >> > > >> > > >> -----Original Message----- > > >> From: data-modeling-bounces at freebase.com > > >> > > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of > > Jeff Prucher > > >> Sent: Friday, June 19, 2009 11:42 AM > > >> To: Freebase data modeling mailing list > > >> Subject: Re: [Data-modeling] Products > > with ingredients > > >> > > >> > > >> ----- "Faye Harris" > > > > >> wrote: > > >> > > >> > > >> > > >> From: "Faye Harris" > > >> > > >> To: "Freebase data modeling > > >> mailing list" > > >> > > >> > > >> > > >> > > >> > > >> > > >> Sent: Thursday, June 18, 2009 > > >> 2:59:19 PM GMT -08:00 > > >> > > >> > > >> US/Canada Pacific > > >> > > >> > > >> Subject: Re: [Data-modeling] > > >> Products with ingredients > > >> > > >> Jeff Prucher wrote: > > >> > > >> > > >> ----- "Robert Cook" > > >> wrote: > > >> > > >> > > >> > > >> > > >> One solution would be > > >> to create a topic with a long name -- enter > > >> > > >> > > >> it > > >> > > >> > > >> exactly as it appears > > >> on the label such as "Enriched flour - > > >> > > >> > > >> (wheat, > > >> > > >> > > >> niacin, iron, baby > > >> powder, sawdust, DDT)". > > >> > > >> > > >> > > >> This would answer. > > >> Anyone else have any comments or > > >> > > >> > > >> thoughts on this > > >> > > >> > > >> before I load the schema? > > >> > > >> > > >> > > >> > > >> > > >> The main problem with this is > > >> you can't arrive at the products that > > >> use enriched flour by clicking > > >> on a property link from a single > > >> "enriched > > >> > > >> flour" topic. Rather, you have > > >> to do a keyword search for products > > >> based on matching all the > > >> various "enriched flour - (foo, bar, bat, > > >> baz)" > > >> ingredient topics with the > > >> words "enriched" and "flour". 
> > >> > > >> > > >> That's quite > > >> > > >> > > >> a loss in queriability.> > > > >> The schema is fine to get us > > >> started, but we're still going > > >> > > >> > > >> to try to > > >> > > >> > > >> put together some phylogeny > > >> pattern in place (in the near future) > > >> right? > > >> > > >> > > >> > > >> > > >> I plan to add a phylogeny pattern > before moving the schema to > > >> freebase.com, which should help > queryability. It doesn't > > >> address the fact that topics named > things like "enriched > > >> flour (that, that, the other thing)" > > >> are exceedingly ugly, > > >> however (no-one said it was called > "prettybase.com", though). > > >> I was going to post a revised schema > to sandbox, with the > > >> double-phylogeny pattern suggested by > Robert, but it got > > >> horribly munged in the process. I'll > try to fix it, but it > > >> might not be till next week. > > >> > > >> Jeff > > >> _______________________________________________ > > >> Data-modeling mailing list > > >> Data-modeling at freebase.com > > >> > > >> http://lists.freebase.com/mailman/listinfo/data-modeling > > >> > > >> > > >> > > >> _______________________________________________ > > >> Data-modeling mailing list > > >> Data-modeling at freebase.com > > >> http://lists.freebase.com/mailman/listinfo/data-modeling > > >> > > >> > > >> > > >> > > >> > > >> _______________________________________________ > > >> Data-modeling mailing list > > >> Data-modeling at freebase.com > > >> http://lists.freebase.com/mailman/listinfo/data-modeling > > >> > > >> > > >> > > >> > > >> > > >> > > > > > > _______________________________________________ > > > Data-modeling mailing list > > > Data-modeling at freebase.com > > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > > > > > > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > 
_______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From faye at metaweb.com Tue Jun 30 20:14:09 2009 From: faye at metaweb.com (Faye Harris) Date: Tue, 30 Jun 2009 13:14:09 -0700 Subject: [Data-modeling] Products with ingredients In-Reply-To: <40A040FCC3624723B96F4F0A6DCB798B@p4> References: <40A040FCC3624723B96F4F0A6DCB798B@p4> Message-ID: <4A4A7211.7010305@metaweb.com> Looks good! I think it's ready for prime time. -- Faye Jeff Prucher wrote: > And one last time: > http://www.sandbox-freebase.com/view/user/typelibrarian/default_domain/views > /cheerios_ingredients > > Jeff > > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher >> Sent: Friday, June 26, 2009 4:38 PM >> To: 'Freebase data modeling mailing list' >> Subject: Re: [Data-modeling] Products with ingredients >> >> Here's a revised schema with the derives from/derivative >> properties. Lemme know whatcha think: >> >> > lt_domain/view >> s/cranberry_almond_crunch_ingredients> >> >> Jeff >> >> >>> -----Original Message----- >>> From: data-modeling-bounces at freebase.com >>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris >>> Sent: Friday, June 26, 2009 2:24 PM >>> To: Freebase data modeling mailing list >>> Subject: Re: [Data-modeling] Products with ingredients >>> >>> +1 on adding the parent-child relationship so that we'd have both >>> "variety of" and "derives from" properties, instead of just >>> >> the former >> >>> for both purposes. 
>>> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling >> >> > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > From tfmorris at gmail.com Tue Jun 30 20:56:00 2009 From: tfmorris at gmail.com (Tom Morris) Date: Tue, 30 Jun 2009 16:56:00 -0400 Subject: [Data-modeling] Products with ingredients In-Reply-To: <40A040FCC3624723B96F4F0A6DCB798B@p4> References: <32473E87FA9E488982DE6C26F0EA5194@p4> <40A040FCC3624723B96F4F0A6DCB798B@p4> Message-ID: Thanks for recreating the sandbox. I tried to follow up on this earlier today and it had been blown away in the refresh. I agree with Faye. Looks good. Tom On Tue, Jun 30, 2009 at 3:57 PM, Jeff Prucher wrote: > And one last time: > http://www.sandbox-freebase.com/view/user/typelibrarian/default_domain/views > /cheerios_ingredients > > Jeff > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher >> Sent: Friday, June 26, 2009 4:38 PM >> To: 'Freebase data modeling mailing list' >> Subject: Re: [Data-modeling] Products with ingredients >> >> Here's a revised schema with the derives from/derivative >> properties. Lemme know whatcha think: >> >> > lt_domain/view >> s/cranberry_almond_crunch_ingredients> >> >> Jeff >> >> > -----Original Message----- >> > From: data-modeling-bounces at freebase.com >> > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris >> > Sent: Friday, June 26, 2009 2:24 PM >> > To: Freebase data modeling mailing list >> > Subject: Re: [Data-modeling] Products with ingredients >> > >> > +1 on adding the parent-child relationship so that we'd have both >> > "variety of" and "derives from" properties, instead of just >> the former >> > for both purposes. 
>> > >> > -- Faye >> > >> > >> > Jeff Prucher wrote: >> > > There are three examples upthread that led to the phylogeny >> > pattern, >> > > each of which is a slightly different case: >> > > >> > > (variety) <--> (generalization) >> > > Milled corn <--> Corn >> > > Sodium lauryl sulfate (from coconut oil) <--> Sodium >> lauryl sulfate >> > > Enriched flour (foo, bar, bazz, fazz) <--> Enriched flour >> > > >> > > Faye's division fits this pretty well: >> > > Milled corn is derived from corn; SLS (from coconut) is a >> > variety of >> > > SLS, and is also derived from coconut; enriched flour >> > (etc., usw) is a >> > > variety of enriched flour. (Reviewing this thread, I note that Ed >> > > suggested a Processed Ingredient type way back at the outset.) >> > > >> > > The big question is, would we be asking for trouble by adding a >> > > parent/child relationship to this, in addition to the two >> phylogeny >> > > patterns? ?Or should we just punt it for now? >> > > >> > > Jeff >> > > >> > > >> > > >> > >> -----Original Message----- >> > >> From: data-modeling-bounces at freebase.com >> > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of >> Tom Morris >> > >> Sent: Wednesday, June 24, 2009 10:37 AM >> > >> To: Freebase data modeling mailing list >> > >> Subject: Re: [Data-modeling] Products with ingredients >> > >> >> > >> I'm with Faye. ?It seems very weird to have rice flour >> and rice so >> > >> strongly related. ?I don't consider rice flour to be a >> > generalization >> > >> of rice at all. ?About the only places where they would >> > potentially >> > >> interchangeable would be for nutritional information or >> > for allergies. >> > >> You might be able to substitute basmati rice for jasmine >> > rice if you >> > >> didn't care too much about the difference in texture or >> > maintaining >> > >> cultural authenticity, but if you substituted rice flour (of any >> > >> variety), you'd be in a whole heap of trouble. 
>> > >> >> > >> >> > >> The examples in the schema descriptions (yay for >> > >> descriptions!) seem to have the same problem. ?You can get >> > lavender >> > >> oil out of a lavender plant, but they aren't >> > generalizations of each >> > >> other. ?If anything, the generalization would be aromatic oil or >> > >> fragrance or something. >> > >> >> > >> For most applications, it's more useful to have things linked >> > >> together because of common properties rather than >> because they are >> > >> made from the same source material or by the same process. >> > >> >> > >> -1 for making this even more obscure by linking in Material. >> > >> >> > >> Tom >> > >> >> > >> On Tue, Jun 23, 2009 at 9:02 PM, Faye Harris >> > wrote: >> > >> >> > >> >> > >> ?Very cool! >> > >> >> > >> ?That rice flour is called a "variety of" rice in the schema is >> > >> indeed very odd. >> > >> >> > >> ?Based on the sandbox examples, this schema seems to use >> > "variety of" >> > >> for two types of relationships: >> > >> ? ? ?1) variety of, e.g. brown rice is a "variety of" rice >> > >> ? ? ?2) derived from, e.g. rice flour is "derived from" rice >> > >> >> > >> ?The former relationship is categorical, the latter relates to >> > >> post-processing. >> > >> >> > >> ?-- Faye >> > >> >> > >> >> > >> >> > >> ?Jeff Prucher wrote: >> > >> >> > >> ? ? ? ? ?OK, I've got the double-phylogeny pattern >> > working now. Take a look >> > >> here: >> > >> >> > >> http://www.sandbox-freebase.com/type/schema/business/product_i >> > >> >> > > ngredient >> > > >> > >> >> > >> ? ? ? ? ?And here's a table view of the ingredients of a >> > breakfast cereal I >> > >> found in >> > >> ? ? ? ? ?the office kitchen: >> > >> >> > >> http://www.sandbox-freebase.com/view/user/jeff/default_domain/ >> > >> >> > > views/cranberr >> > > >> > >> ? ? ? ? ?y_almond_crunch_ingredients >> > >> >> > >> ? ? ? ? ?I'm not really happy with the "variety of" and >> > "generalization of" >> > >> names, >> > >> ? ? ? ? 
but I'm not coming up with anything better. Any suggestions would be most welcome. >> > >> >> > >> Jeff >> > >> >> > >> -----Original Message----- >> > >> From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher >> > >> Sent: Friday, June 19, 2009 11:42 AM >> > >> To: Freebase data modeling mailing list >> > >> Subject: Re: [Data-modeling] Products with ingredients >> > >> >> > >> ----- "Faye Harris" wrote: >> > >> >> > >> From: "Faye Harris" >> > >> To: "Freebase data modeling mailing list" >> > >> Sent: Thursday, June 18, 2009 2:59:19 PM GMT -08:00 US/Canada Pacific >> > >> Subject: Re: [Data-modeling] Products with ingredients >> > >> >> > >> Jeff Prucher wrote: >> > >> >> > >> ----- "Robert Cook" wrote: >> > >> >> > >> One solution would be to create a topic with a long name -- enter it exactly as it appears on the label such as "Enriched flour - (wheat, niacin, iron, baby powder, sawdust, DDT)". >> > >> >> > >> This would answer. Anyone else have any comments or
thoughts on this before I load the schema? >> > >> >> > >> The main problem with this is you can't arrive at the products that use enriched flour by clicking on a property link from a single "enriched flour" topic. Rather, you have to do a keyword search for products based on matching all the various "enriched flour - (foo, bar, bat, baz)" ingredient topics with the words "enriched" and "flour". That's quite a loss in queriability. >> > >> >> > >> The schema is fine to get us started, but we're still going to try to put together some phylogeny pattern in place (in the near future) right? >> > >> >> > >> I plan to add a phylogeny pattern before moving the schema to freebase.com, which should help queryability. It doesn't address the fact that topics named things like "enriched flour (that, that, the other thing)" are exceedingly ugly, however (no-one said it was called "prettybase.com", though). I was going to post a revised schema to sandbox, with the double-phylogeny pattern suggested by Robert, but it got horribly munged in the process. I'll try to fix it, but it might not be till next week.
>> > >> >> > >> Jeff >> > >> _______________________________________________ >> > >> Data-modeling mailing list >> > >> Data-modeling at freebase.com >> > >> http://lists.freebase.com/mailman/listinfo/data-modeling >> > >>
From spatial.db at gmail.com Tue Jun 30 21:19:13 2009
From: spatial.db at gmail.com (Ed Laurent)
Date: Tue, 30 Jun 2009 17:19:13 -0400
Subject: [Data-modeling] Products with ingredients
In-Reply-To:
References: <32473E87FA9E488982DE6C26F0EA5194@p4> <40A040FCC3624723B96F4F0A6DCB798B@p4>
Message-ID:

+1

-Ed

On Tue, Jun 30, 2009 at 4:56 PM, Tom Morris wrote:
> Thanks for recreating the sandbox.
> I tried to follow up on this
> earlier today and it had been blown away in the refresh.
>
> I agree with Faye. Looks good.
>
> Tom
>
> On Tue, Jun 30, 2009 at 3:57 PM, Jeff Prucher wrote:
> > And one last time:
> >
> > http://www.sandbox-freebase.com/view/user/typelibrarian/default_domain/views/cheerios_ingredients
> >
> > Jeff
> >
> >> -----Original Message-----
> >> From: data-modeling-bounces at freebase.com
> >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher
> >> Sent: Friday, June 26, 2009 4:38 PM
> >> To: 'Freebase data modeling mailing list'
> >> Subject: Re: [Data-modeling] Products with ingredients
> >>
> >> Here's a revised schema with the derives from/derivative
> >> properties. Lemme know whatcha think:
> >>
> >> >> lt_domain/view
> >> s/cranberry_almond_crunch_ingredients>
> >>
> >> Jeff
> >>
> >> > -----Original Message-----
> >> > From: data-modeling-bounces at freebase.com
> >> > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris
> >> > Sent: Friday, June 26, 2009 2:24 PM
> >> > To: Freebase data modeling mailing list
> >> > Subject: Re: [Data-modeling] Products with ingredients
> >> >
> >> > +1 on adding the parent-child relationship so that we'd have both
> >> > "variety of" and "derives from" properties, instead of just the former
> >> > for both purposes.
> >> >
> >> > -- Faye
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090630/019440ea/attachment-0001.htm
From spencerkelly86 at gmail.com Tue Jun 30 21:43:55 2009
From: spencerkelly86 at gmail.com (Spencer Kelly)
Date: Tue, 30 Jun 2009 18:43:55 -0300
Subject: [Data-modeling] Products with ingredients
In-Reply-To:
References: <32473E87FA9E488982DE6C26F0EA5194@p4> <40A040FCC3624723B96F4F0A6DCB798B@p4>
Message-ID:

anyone wanna import this?
http://www.seas.upenn.edu/~cse400/CIS401_2008/AvidanAckerson/index.html
looks great. it's a tsv and it seems to be cc-licensed.

there's this too, but it looks licensed, etc.
http://householdproducts.nlm.nih.gov/cgi-bin/household/list?tbl=TblBrands&alpha=A

surprised this doesn't exist already! it really should.
this is gonna be a great type. *pats on back*

From brendan at metaweb.com Tue Jun 30 23:46:36 2009
From: brendan at metaweb.com (brendan)
Date: Tue, 30 Jun 2009 16:46:36 -0700
Subject: [Data-modeling] the commons / non distinction is a farce
In-Reply-To: <20090630154010.GI32383@sphinx.mythic-beasts.com>
References: <20090630154010.GI32383@sphinx.mythic-beasts.com>
Message-ID: <457F3EE5-7083-46BD-8008-28B7CD8335F6@metaweb.com>

I want to address the "having great commons schema" part. I think we should :-) And I think it requires serious work; there should be a group of responsive people (Metawebbies or not) who do it.

I'm the "Architecture" admin and also the software QA manager at Metaweb. Not all the Metaweb employee/admins are as exemplary as Jeff. Mea culpa: I think the architecture domain is one that has had a healthy level of activity in schema proposals but not such an exemplary delivery of actual change. Some of the proposals are no-brainers, and *not* all of those have been executed; they should be. For example, I would like to reconcile the structure type with structure2 (a long-standing issue, for which I apologize).

I think the discussion boards for a type or domain are a good place to hash things out.
Sometimes we may need to take it "offline" via email/IM or, heck, even organize a conference call. I will review the data-modeling list and discussion posts and report back here with a proposal for some real progress in architecture.

I've started to create bugs/tasks for these items. Since the JIRA system provides a voting process, perhaps that could serve as a way to quantify community need. I like the idea of tracking the usage of "competing" schema as an input that guides schema changes, as well.

I just think having the community work toward having great schema moved into the commons and formally "blessing" it has a lot of value to those who want to use Freebase. To put it another way, app developers want great but stable schema.

Brendan

On Jun 30, 2009, at 8:40 AM, Philip Kendall wrote:

> On Mon, Jun 29, 2009 at 11:37:54AM -0300, Spencer Kelly wrote:
>> is metaweb ready to democratize control of what makes the browse page and
>> what doesn't?
>
> Let me phrase that question a different way: is the Freebase community
> ready to take on the responsibility of what makes commons / the browse
> page?
>
> Honestly, I don't think that it is. Look at any type-related discussion
> on Freebase, either on this list or the site itself and, most of the
> time, the people that are actually giving critical reviews to a type
> are Metaweb staff, usually Jeff.
>
> While I'm in no way saying that the current mechanisms (or lack
> thereof) for getting a type into the commons are perfect, one of the
> most important things (IMO) with regards to Freebase's data is having
> high quality schemas, and that's not something I'd like to see us
> compromise on.
>
> Cheers,
>
> Phil
>
> --
> Philip Kendall
> http://www.shadowmagic.org.uk/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090630/88adba65/attachment.htm
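
A note on the "Products with ingredients" thread above: the split Faye describes between "variety of" (categorical) and "derives from" (post-processing) can be sketched in a few lines of Python. This is only a rough illustration under assumed names. `Ingredient`, `variety_of`, `derives_from`, `products_using` and the two product names are hypothetical, not Freebase property ids or API calls. It tries to show why a roll-up link brings back the lookup that is lost when every label string becomes its own topic, and why rice flour should not roll up to rice.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)  # compare topics by identity, like distinct graph nodes
class Ingredient:
    """Hypothetical ingredient topic (not a real Freebase type or API)."""
    name: str
    variety_of: Optional["Ingredient"] = None  # categorical parent
    derives_from: List["Ingredient"] = field(default_factory=list)  # source materials

    def generalizations(self) -> Iterator["Ingredient"]:
        """Yield this topic, then each broader topic reached via variety_of."""
        node: Optional[Ingredient] = self
        while node is not None:
            yield node
            node = node.variety_of


def products_using(target: Ingredient, products: Dict[str, List[Ingredient]]) -> List[str]:
    """Names of products that list `target` or any variety of it."""
    return sorted(
        name
        for name, ingredients in products.items()
        if any(g is target for item in ingredients for g in item.generalizations())
    )


# Assumed sample data, loosely based on the examples in the thread.
rice = Ingredient("Rice")
brown_rice = Ingredient("Brown rice", variety_of=rice)
rice_flour = Ingredient("Rice flour", derives_from=[rice])  # derived, not a variety
enriched_flour = Ingredient("Enriched flour")
label_flour = Ingredient("Enriched flour (wheat, niacin, iron)", variety_of=enriched_flour)

products = {
    "Cereal A": [label_flour, brown_rice],
    "Cracker B": [rice_flour],
}

print(products_using(enriched_flour, products))  # ['Cereal A']
print(products_using(rice, products))            # ['Cereal A']
```

In this sketch only `variety_of` is followed when rolling up, so the long label topic is found from the single "Enriched flour" topic, while "Cracker B" is not returned for rice. The `derives_from` link is still there for applications such as allergy or nutrition lookups that want to follow source materials explicitly.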