[Data-modeling] Products with ingredients

Robert Cook robert at metaweb.com
Wed Jun 17 17:28:29 UTC 2009


On Jun 17, 2009, at 9:55 AM, Jeff Prucher wrote:

>
> ----- "Robert Cook" <robert at metaweb.com> wrote:
>
>>> I don't have a good solution to the ordering problem of
>>> ingredients-within-ingredients, but unless we insert a CVT and ask
>>> people to explicitly enter the ingredient order, I don't know how
>>> much we should rely on the index values, even if the client didn't
>>> have a bug with the indexing.  (Actually, an explicit order property
>>> would work if we asked users to enter 1, 2, 3, 3a, 3b, 3c, 4, etc.
>>> for ingredients within ingredients.)
>>
>> Index values really are semantically relevant (see Film performances,
>> for example, where billing order is a major contractual obligation).
>>
>> Also, I think hand numbered system would lead to madness.  And the
>> client can and should be fixed as well.
>>
>> One solution here could be to add a property to the ingredient type,
>> "Contains ingredients", which allows one to store these
>> sub-ingredients when appropriate.  This makes the simple case of data
>> entry straightforward (enter what you see on the package into the
>> product type), while ultimately supporting the allergy use case you
>> mentioned.  And, for the allergy use case, you could add an
>> additional property to ingredient for the specific allergy it causes.
>
> The trouble there is that it will require users to know *which*  
> "enriched flour" or "chocolate chips" topic to select -- many of  
> these compound ingredients have common names, but it's unreasonable  
> to expect them to be the same from manufacturer to manufacturer (or  
> even from product to product, in many cases).  Unless I completely  
> misunderstand you (which is possible), this actually makes the  
> simple case harder -- rather than simply entering what you see on  
> the box, for all compound ingredients you have to first explore  
> Freebase to determine whether the compound ingredient (with all the  
> same ingredients, in the same order) already exists, and, if it  
> doesn't, to edit the new ingredient topic you've created to add all  
> of its ingredients.  People are far more likely, I think, to grab  
> the first ingredient they find with the right name, and not take the  
> next steps to determine whether it's really the same or not.  (What  
> we really need for this is two-level disambiguation in the client!)

One solution would be to create a topic with a long name: enter it
exactly as it appears on the label, such as "Enriched flour - (wheat,
niacin, iron, baby powder, sawdust, DDT)".  Then it's not necessary to
enter the subingredients right away; they can be added later.  The
subingredients could also be made a disambiguator for good measure.

This also makes data input less painful and defers the structure to  
later, which is probably the best way to actually get data into the  
system.
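
As a rough illustration of deferring the structure, here is a minimal
Python sketch that recovers the subingredients later from a topic name
entered verbatim from the label.  The "Name - (a, b, c)" pattern and the
function name parse_label_topic are assumptions made for this example;
nothing here is a Freebase API, and nested parentheses are not handled.

```python
import re

# Assumed label-style pattern: "Name - (sub1, sub2, ...)".
# This is a hypothetical convention for the sketch, not a Freebase format.
LABEL_RE = re.compile(r"^\s*(?P<name>.+?)\s*-\s*\((?P<subs>.*)\)\s*$")


def parse_label_topic(topic_name):
    """Split a label-style topic name into (base name, ordered subingredients).

    Names that do not match the assumed pattern are returned unchanged
    with an empty subingredient list.
    """
    match = LABEL_RE.match(topic_name)
    if match is None:
        return topic_name.strip(), []
    # Keep label order, since ingredient order is meaningful on packaging.
    subs = [s.strip() for s in match.group("subs").split(",") if s.strip()]
    return match.group("name").strip(), subs


name, subs = parse_label_topic(
    "Enriched flour - (wheat, niacin, iron, baby powder, sawdust, DDT)"
)
print(name)
print(subs)
```

The list preserves the order given on the label, so an index could be
assigned to each subingredient whenever the structure is filled in.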

Of course, this approach will in time create too many topics that  
appear when one types "enriched flour" into Freebase Suggest, but if  
the user types in one of the subingredients it should give more  
precise results.
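
To make the narrowing concrete, here is a small Python sketch of the
kind of filtering I have in mind.  It is not how Freebase Suggest is
implemented; the suggest function, its parameters, and the in-memory
list of (name, subingredients) pairs are all hypothetical.

```python
def suggest(topics, name_prefix, subingredient=None):
    """Return topics whose name starts with name_prefix.

    topics is a list of (name, subingredients) pairs (a hypothetical
    stand-in for the topic store).  If subingredient is given, only
    topics listing it are kept.  Matching is case-insensitive.
    """
    prefix = name_prefix.casefold()
    wanted = subingredient.casefold() if subingredient else None
    results = []
    for name, subs in topics:
        if not name.casefold().startswith(prefix):
            continue
        if wanted is not None and wanted not in (s.casefold() for s in subs):
            continue
        results.append((name, subs))
    return results


# Made-up sample data for illustration only.
topics = [
    ("Enriched flour", ["wheat", "niacin", "iron"]),
    ("Enriched flour", ["wheat", "riboflavin"]),
    ("Chocolate chips", ["sugar", "cocoa butter"]),
]

print(len(suggest(topics, "enriched flour")))
print(suggest(topics, "enriched flour", subingredient="riboflavin"))
```

Typing only the name returns every "enriched flour" topic, while adding
one subingredient cuts the list down to the topics that contain it.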

R


