[Data-modeling] What's the best way to represent acronyms/initialisms?
Faye Li
faye at metaweb.com
Thu Jan 24 20:26:27 UTC 2008
Hi,
I'm late to the discussion, although I've been watching this thread with interest.
Abbreviations/acronyms are inherently ambiguous and the same grouping of
letters can stand for different things in different contexts. It would
be nice to record the complete name for an abbreviation along with the
domain or context where the complete name makes sense.
For example, OOC (out of character) is a term used in both the
role-playing gaming community and the domain of fan fiction, yet it
means different things to the two groups. If I were looking up the term
OOC, it'd be nice to see a hint that constrains each complete
name/definition to the domain where the definition is recognized.
It's not unlike the way Fictional Character is modeled to include a
"Fictional Universe" property, thus clearly differentiating "The Doctor"
character in the "Doctor Who" series from "The Doctor" character (aka
"Emergency Medical Hologram") in the "Star Trek: Voyager" universe.
Use cases I have in mind that will benefit from this modeling approach
include:
1) Looking up the complete name/explanation for a multi-defined
abbreviation/acronym, perhaps with the ability to constrain by context
2) Looking up all abbreviations/acronyms in a domain (like looking up
all characters in a Fictional Universe)
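
As a rough sketch of the two use cases above, here is how a context-scoped abbreviation lookup might behave. This is plain Python, not Freebase schema or MQL; the names Expansion, AbbreviationIndex, lookup and in_domain are made up for illustration, and the domain labels attached to the sample entries are assumptions rather than real Freebase data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Expansion:
    """One meaning of an abbreviation, scoped to the domain where it applies."""
    abbreviation: str
    complete_name: str
    domain: str  # hypothetical context label, analogous to "Fictional Universe"


class AbbreviationIndex:
    """Hypothetical in-memory index; not part of any Freebase API."""

    def __init__(self):
        self._by_abbreviation = defaultdict(list)
        self._by_domain = defaultdict(list)

    def add(self, abbreviation, complete_name, domain):
        entry = Expansion(abbreviation, complete_name, domain)
        # Index case-insensitively so "ooc" and "OOC" find the same entries.
        self._by_abbreviation[abbreviation.upper()].append(entry)
        self._by_domain[domain].append(entry)

    def lookup(self, abbreviation, domain=None):
        """Use case 1: all expansions, optionally constrained by domain."""
        entries = self._by_abbreviation.get(abbreviation.upper(), [])
        if domain is None:
            return list(entries)
        return [e for e in entries if e.domain == domain]

    def in_domain(self, domain):
        """Use case 2: every abbreviation recorded for one domain."""
        return sorted({e.abbreviation for e in self._by_domain.get(domain, [])})


# Sample data: the domain assignments below are illustrative assumptions.
index = AbbreviationIndex()
index.add("OOC", "out of character", "Role-playing games")
index.add("OOC", "out of character", "Fan fiction")
index.add("SF", "science fiction", "Literature")
index.add("SF", "San Francisco", "Geography")

for entry in index.lookup("OOC"):
    print(f"{entry.abbreviation}: {entry.complete_name} [{entry.domain}]")
```

Each expansion carries its own domain, so two entries can share the same letters (and even the same complete name, as with OOC) while staying distinguishable by context.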
-- Faye
Shawn Simister wrote:
> Good point. I've updated the schema accordingly.
>
> Robert Cook wrote:
>> Shawn -- I think this is the right model, although I would use "text"
>> instead of "machine readable string" as the expected type. Text
>> literals can have translations and would accommodate situations where
>> the abbreviation is different in other languages. For example, the
>> topic "AIDS" would be "SIDA" in several Romance languages.
>>
>> http://www.freebase.com/view/explore/topic/en/aids (scroll down to
>> "outgoing properties")
>>
>> Сида (/type/text)
>> AIDS (/type/text)
>> 에이즈 (/type/text)
>> SIDA (/type/text)
>>
>> These are /type/object/name property values, but could easily be
>> /common/abbreviated_topic/abbreviation values.
>>
>> R
>>
>> On Jan 23, 2008, at 12:23 AM, Shawn Simister wrote:
>>
>>> Thanks for all the feedback. I had no idea there was so much
>>> interest in this topic.
>>>
>>> Robert's idea of creating an Abbreviated Topic co-type would be my
>>> preference of everything that has been discussed so far. It's
>>> simple, easy to use and I also like the idea of expanding its scope
>>> to include all abbreviations. The key to this approach would be to
>>> have the Freebase autocomplete pick up on these new abbreviation
>>> values in the same way that it currently treats the alias property.
>>> While I think that a CVT would provide a lot of flexibility, it's
>>> just too confusing to attract new abbreviation entries from casual
>>> users.
>>>
>>> Maybe some of the commonly abbreviated types like Organization could
>>> include Abbreviated Topic in their schema to encourage its use. In
>>> fact, some types like Unit Profile already have abbreviation
>>> properties that could be factored out.
>>>
>>> I've published a draft version of the proposed Abbreviated Topic
>>> <http://freebase.com/view/schema/user/narphorium/default_domain/abbreviated_topic>
>>> type in my default domain just to make sure we're on the same page
>>> about what this would look like.
>>>
>>> Shawn
>>>
>>> Robert Cook wrote:
>>>> I've often thought that the "also known as" (alias) field of
>>>> /common/topic is too broad and there might be other properties for
>>>> capturing alternate names. One idea would be to have a property
>>>> called "abbreviation" that is of type text. This property could be
>>>> on a new type /common/abbreviated_topic and would capture acronyms,
>>>> initialisms (that are typically thought of as acronyms) and
>>>> abbreviations.
>>>>
>>>> Topics like "National Education Association" would be co-typed
>>>> "Abbreviated Topic", and the abbreviation property would contain
>>>> "NEA". For "NASA" it would contain "NASA". To capture "National
>>>> Aeronautics and Space Administration" there would be a property
>>>> /common/abbreviated_topic/complete_name
>>>>
>>>> Of course, this introduces a denormalization. There would be two
>>>> properties with "NASA" - the display name of the topic and the new
>>>> abbreviation property. I personally think this is fine and I
>>>> anticipate that this pattern will happen elsewhere (perhaps
>>>> /people/person/given_name and /people/person/family_name; also
>>>> /common/topic/common_misspellings).
>>>>
>>>> Shawn -- would this work for imagined applications?
>>>>
>>>> R
>>>>
>>>>
>>>> On Jan 22, 2008, at 10:42 AM, Jeff Prucher wrote:
>>>>
>>>>> I agree with Ed that common usage should determine whether a topic
>>>>> name should be the expanded form or an
>>>>> acronym/initialism/abbreviation, although this is obviously a
>>>>> judgement call, and anyone who feels differently is free to rename
>>>>> the topic. This ability to rename the topic argues against
>>>>> something like the "abbreviation" type
>>>>> (http://www.freebase.com/view/schema/user/skud/default_domain/abbreviation),
>>>>> since if someone renames the NASA topic to "National Aeronautics
>>>>> and Space Administration", the topic is no longer an abbreviation.
>>>>>
>>>>> I have two other thoughts. One would be to create a type for
>>>>> initialisms and acronyms. It would exist separately from any
>>>>> topics that happened to share that acronym, so there would be one
>>>>> topic for "NASA", the space agency, and one for "NASA", the
>>>>> acronym. (This would get less weird for shared initialisms like
>>>>> "SF", which can refer to San Francisco, science fiction, and
>>>>> who-knows-what-all.) There could be a property that linked to a
>>>>> CVT -- one property of the CVT would be the expansion of the
>>>>> acronym as a text string, the second property would be an optional
>>>>> link to the corresponding Freebase topic. (It'd be a CVT so that
>>>>> there was no confusion about which string went with which topic
>>>>> for shared acronyms.) The main problem with this is that it would
>>>>> be confusing, and the acronym topic for NASA (or whatever) would
>>>>> start to accrue other types since people would be certain to make
>>>>> the wrong selection from autocomplete on occasion. We'd probably
>>>>> also have to deal with a lot of merge requests between the acronym
>>>>> topic and the more general topic.
>>>>>
>>>>> That said, we'd still want the aliases on the topics that aren't
>>>>> of type "acronym" to include the acronyms or expanded names,
>>>>> since, as Shawn points out, it affects searches.
>>>>>
>>>>> My current take on this (and I'm open to other opinions) is that
>>>>> my proposal here is an interesting thought-experiment, but is
>>>>> probably not the right way to handle this.
>>>>>
>>>>> Jeff
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From:* data-modeling-bounces at freebase.com
>>>>> [mailto:data-modeling-bounces at freebase.com] *On Behalf Of *Ed
>>>>> Laurent
>>>>> *Sent:* Monday, January 21, 2008 9:21 PM
>>>>> *To:* Freebase data modeling mailing list
>>>>> *Subject:* Re: [Data-modeling] What's the best way to
>>>>> represent acronyms/initialisms?
>>>>>
>>>>> It would be nice to have a naming standard and universal
>>>>> acronym type that fits all situations. I've been naming the
>>>>> topic as its full name or acronym depending on which is more
>>>>> commonly used (e.g., ETM+
>>>>> <http://www.freebase.com/view/guid/9202a8c04000641f8000000006acbd1d>).
>>>>> I always add the full name as a synonym if the topic name
>>>>> is an acronym. I add the acronym as a synonym if the topic is
>>>>> named using the full name but the acronym is commonly used.
>>>>> I try to use Kirrily's acronym type when possible but I've
>>>>> also listed it as a machine readable string sometimes. Either
>>>>> way, I always add an acronym property to the type if one is
>>>>> ever used (e.g., satellite sensor
>>>>> <http://www.freebase.com/view/schema/user/spatialed/land_cover/satellite_sensor>).
>>>>> This approach is not really a standard because I flip between
>>>>> Kirrily's type and machine readable string. It's also
>>>>> subjective whether or not the user thinks the acronym rises to
>>>>> the level of a synonym. However, the info is there for when/if
>>>>> a standard is set and this approach seems to work pretty well.
>>>>>
>>>>> -Ed
>>>>>
>>>>>
>>>
>>> _______________________________________________
>>> Data-modeling mailing list
>>> Data-modeling at freebase.com <mailto:Data-modeling at freebase.com>
>>> http://lists.freebase.com/mailman/listinfo/data-modeling