[Data-modeling] What's the best way to represent acronyms/initialisms?

Robert Cook robert at metaweb.com
Wed Jan 23 17:50:45 UTC 2008


Shawn -- I think this is the right model, although I would use "text"  
instead of "machine readable string" as the expected type.  Text  
literals can have translations and would accommodate situations where  
the abbreviation is different in other languages.  For example, the  
topic "AIDS" would be "SIDA" in several Romance languages.

http://www.freebase.com/view/explore/topic/en/aids (scroll down to  
"outgoing properties")

Сида (/type/text)
AIDS (/type/text)	
에이즈 (/type/text)	
SIDA (/type/text)

These are /type/object/name property values, but could easily be
/common/abbreviated_type/abbreviation values.
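
To illustrate why a translatable text type suits this property better than a plain string, here is a minimal sketch in Python. It is not Freebase code: the Text and Topic classes, the two-letter language codes, and the abbreviation_for helper are all hypothetical names made up for this example. It assumes each abbreviation value carries a language tag and that English is an acceptable fallback.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Text:
    """A language-tagged string, loosely modeled on the idea of /type/text."""
    value: str
    lang: str  # hypothetical language code such as "en" or "es"


@dataclass
class Topic:
    """A topic holding abbreviation values, one or more per language."""
    abbreviations: List[Text] = field(default_factory=list)

    def abbreviation_for(self, lang: str, fallback: str = "en") -> Optional[str]:
        """Return the abbreviation in `lang`, else in `fallback`, else None."""
        by_lang = {}
        for text in self.abbreviations:
            # Keep the first value seen for each language.
            by_lang.setdefault(text.lang, text.value)
        return by_lang.get(lang, by_lang.get(fallback))


# Illustrative data only: the same topic abbreviated differently by language.
aids = Topic(
    abbreviations=[
        Text("AIDS", "en"),
        Text("SIDA", "es"),
        Text("SIDA", "fr"),
    ]
)

print(aids.abbreviation_for("es"))
print(aids.abbreviation_for("de"))  # no German value, so the fallback is used
```

A single untagged string could hold only one of these values, which is the limitation the text type avoids.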

R

On Jan 23, 2008, at 12:23 AM, Shawn Simister wrote:

> Thanks for all the feedback. I had no idea there was so much  
> interest in this topic.
>
> Robert's idea of creating an Abbreviated Topic co-type would be my  
> preference of everything that has been discussed so far. It's  
> simple, easy to use and I also like the idea of expanding its scope  
> to include all abbreviations. The key to this approach would be to  
> have the Freebase autocomplete pick up on these new abbreviation  
> values in the same way that it currently treats the alias property.  
> While I think that a CVT would provide a lot of flexibility, it's  
> just too confusing to attract new abbreviation entries from casual  
> users.
>
> Maybe some of the commonly abbreviated types like Organization could  
> include Abbreviated Topic in their schema to encourage its use. In  
> fact, some types like Unit Profile already have abbreviation  
> properties that could be factored out.
>
> I've published a draft version of the proposed Abbreviated Topic  
> type in my default domain just to make sure we're on the same page  
> about what this would look like.
>
> Shawn
>
> Robert Cook wrote:
>>
>> I've often thought that the "also known as" (alias) field of
>> /common/topic is too broad and there might be other properties for
>> capturing alternate names.  One idea would be to have a property
>> called "abbreviation" that is of type text.  This property could be  
>> on a new type /common/abbreviated_topic and would capture acronyms,  
>> initialisms (that are typically thought of as acronyms) and  
>> abbreviations.
>>
>> Topics like "National Education Association" would be co-typed
>> "Abbreviated Topic", and the abbreviation property would contain
>> "NEA".  For "NASA" it would contain "NASA".  To capture "National
>> Aeronautics and Space Administration" there would be a property
>> /common/abbreviated_topic/complete_name.
>>
>> Of course, this introduces a denormalization.  There would be two  
>> properties with "NASA" - the display name of the topic and the new  
>> abbreviation property.  I personally think this is fine and I
>> anticipate that this pattern will happen elsewhere (perhaps
>> /people/person/given_name and /people/person/family_name; also
>> /common/topic/common_misspellings).
>>
>> Shawn -- would this work for imagined applications?
>>
>> R
>>
>>
>> On Jan 22, 2008, at 10:42 AM, Jeff Prucher wrote:
>>
>>> I agree with Ed that common usage should determine whether a topic
>>> name should be the expanded form or an
>>> acronym/initialism/abbreviation, although this is obviously a
>>> judgement call, and anyone who feels differently is free to rename
>>> the topic.  This ability to rename the topic argues against
>>> something like the "abbreviation" type
>>> (http://www.freebase.com/view/schema/user/skud/default_domain/abbreviation),
>>> since if someone renames the NASA topic to "National
>>> Aeronautics and Space Administration", the topic is no longer an
>>> abbreviation.
>>>
>>> I have two other thoughts. One would be to create a type for  
>>> initialisms and acronyms. It would exist separately from any  
>>> topics that happened to share that acronym, so there would be one  
>>> topic for "NASA", the space agency, and one for "NASA", the  
>>> acronym.  (This would get less weird for shared initialisms like
>>> "SF", which can refer to San Francisco, science fiction, and
>>> who-knows-what-all.)  There could be a property that linked to a CVT
>>> -- one property of the CVT would be the expansion of the acronym  
>>> as a text string, the second property would be an optional link to  
>>> the corresponding Freebase topic.  (It'd be a CVT so that there  
>>> was no confusion about which string went with which topic for  
>>> shared acronyms.)  The main problem with this is that it would be  
>>> confusing, and the acronym topic for NASA (or whatever) would  
>>> start to accrue other types since people would be certain to make  
>>> the wrong selection from autocomplete on occasion.  We'd probably  
>>> also have to deal with a lot of merge requests between the acronym  
>>> topic and the more general topic.
>>>
>>> That said, we'd still want the aliases on the topics that aren't  
>>> of type "acronym" to include the acronyms or expanded names,  
>>> since, as Shawn points out, it affects searches.
>>>
>>> My current take on this (and I'm open to other opinions) is that  
>>> my proposal here is an interesting thought-experiment, but is  
>>> probably not the right way to handle this.
>>>
>>> Jeff
>>>
>>> From: data-modeling-bounces at freebase.com
>>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Ed Laurent
>>> Sent: Monday, January 21, 2008 9:21 PM
>>> To: Freebase data modeling mailing list
>>> Subject: Re: [Data-modeling] What's the best way to
>>> represent acronyms/initialisms?
>>>
>>> It would be nice to have a naming standard and universal acronym  
>>> type that fits all situations. I've been naming the topic as its  
>>> full name or acronym depending on which is more commonly used  
>>> (e.g., ETM+). I always add the full name as a synonym if the
>>> topic name is an acronym. I add the acronym as a synonym if the
>>> topic is named using the full name but the acronym is commonly
>>> used. I try to use Kirrily's acronym type when possible but I've
>>> also listed it as a machine readable string sometimes. Either way,  
>>> I always add an acronym property to the type if one is ever used  
>>> (e.g., satellite sensor).  This approach is not really a standard  
>>> because I flip between Kirrily's type and machine readable string.  
>>> It's also subjective whether or not the user thinks the acronym  
>>> rises to the level of a synonym. However, the info is there for  
>>> when/if a standard is set and this approach seems to work pretty  
>>> well.
>>>
>>> -Ed
>>>
>>>
>
