[Data-modeling] What's the best way to represent acronyms/initialisms?
Robert Cook
robert at metaweb.com
Tue Jan 22 23:34:46 UTC 2008
I've often thought that the "also known as" (alias) field of
/common/topic is too broad and there might be other properties for capturing
alternate names. One idea would be to have a property called
"abbreviation" that is of type text. This property could be on a new
type /common/abbreviated_topic and would capture acronyms, initialisms
(that are typically thought of as acronyms) and abbreviations.
Topics like "National Educational Association" would be co-typed
"Abbreviated Topic", and the abbreviation property would contain
"NEA". For "NASA" it would contain "NASA". To capture "National
Aeronautics and Space Administration" there would be a property
/common/abbreviated_topic/complete_name.
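
Roughly, as a Python sketch of the proposed type (the class and field names are illustrative assumptions, not an existing Freebase schema; aliases stay as the current catch-all):

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Topic:
    """Stand-in for /common/topic: a display name plus free-form aliases."""
    name: str
    aliases: List[str] = field(default_factory=list)


@dataclass
class AbbreviatedTopic(Topic):
    """Hypothetical /common/abbreviated_topic co-type."""
    abbreviation: Optional[str] = None   # acronym, initialism or abbreviation
    complete_name: Optional[str] = None  # spelled-out name, if recorded


# Topic named by its full name; the abbreviation lives in its own property.
nea = AbbreviatedTopic(
    name="National Educational Association",
    abbreviation="NEA",
)

# Topic named by its acronym; the full name goes in complete_name.
nasa = AbbreviatedTopic(
    name="NASA",
    abbreviation="NASA",
    complete_name="National Aeronautics and Space Administration",
)
```
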
Of course, this introduces a denormalization. There would be two
properties with "NASA" - the display name of the topic and the new
abbreviation property. I personally think this is fine and I
anticipate that this pattern will happen elsewhere (perhaps
/people/person/given_name and /people/person/family_name; also
/common/topic/common_misspellings).
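
To make the duplication concrete, here is a rough sketch with plain dictionaries. The keys are hypothetical stand-ins for the proposed properties, and the helper names are made up for illustration. Because the abbreviation is stored separately from the display name, a lookup by abbreviation does not depend on what the topic is currently called.

```python
from collections import defaultdict

# Hypothetical records; "abbreviation" and "complete_name" stand in for the
# proposed /common/abbreviated_topic properties.
topics = [
    {"name": "NASA",
     "abbreviation": "NASA",
     "complete_name": "National Aeronautics and Space Administration"},
    {"name": "National Educational Association",
     "abbreviation": "NEA",
     "complete_name": None},
]


def index_by_abbreviation(records):
    """Map each abbreviation to the display names of the topics carrying it."""
    index = defaultdict(list)
    for record in records:
        if record.get("abbreviation"):
            index[record["abbreviation"]].append(record["name"])
    return dict(index)


def duplicated_names(records):
    """Display names that repeat the abbreviation (the denormalized case)."""
    return [r["name"] for r in records if r["name"] == r.get("abbreviation")]
```

Here `duplicated_names(topics)` picks out only the NASA record, which is the case where the same string is stored twice.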
Shawn -- would this work for imagined applications?
R
On Jan 22, 2008, at 10:42 AM, Jeff Prucher wrote:
> I agree with Ed that common usage should determine whether a topic
> name should be the expanded form or an acronym/initialism/abbreviation,
> although this is obviously a judgement call, and
> anyone who feels differently is free to rename the topic. This
> ability to rename the topic argues against something like the
> "abbreviation" type (http://www.freebase.com/view/schema/user/skud/default_domain/abbreviation),
> since if someone renames the NASA topic to "National Aeronautics
> and Space Administration", the topic is no longer an abbreviation.
>
> I have two other thoughts. One would be to create a type for
> initialisms and acronyms. It would exist separately from any topics
> that happened to share that acronym, so there would be one topic for
> "NASA", the space agency, and one for "NASA", the acronym. (This
> would get less weird for shared initialisms like "SF", which can
> refer to San Francisco, science fiction, and who-knows-what-all.)
> There could be a property that linked to a CVT -- one property of
> the CVT would be the expansion of the acronym as a text string, the
> second property would be an optional link to the corresponding
> Freebase topic. (It'd be a CVT so that there was no confusion about
> which string went with which topic for shared acronyms.) The main
> problem with this is that it would be confusing, and the acronym
> topic for NASA (or whatever) would start to accrue other types since
> people would be certain to make the wrong selection from
> autocomplete on occasion. We'd probably also have to deal with a
> lot of merge requests between the acronym topic and the more general
> topic.
>
> That said, we'd still want the aliases on the topics that aren't of
> type "acronym" to include the acronyms or expanded names, since, as
> Shawn points out, it affects searches.
>
> My current take on this (and I'm open to other opinions) is that my
> proposal here is an interesting thought-experiment, but is probably
> not the right way to handle this.
>
> Jeff
>
> From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Ed Laurent
> Sent: Monday, January 21, 2008 9:21 PM
> To: Freebase data modeling mailing list
> Subject: Re: [Data-modeling] What's the best way to represent
> acronyms/initialisms?
>
> It would be nice to have a naming standard and universal acronym
> type that fits all situations. I've been naming the topic as its
> full name or acronym depending on which is more commonly used (e.g.,
> ETM+). I always add the full name as a synonym if the topic name
> is an acronym. I add the acronym as a synonym if the topic is named
> using the full name but the acronym is commonly used. I try to
> use Kirrily's acronym type when possible but I've also listed it as
> a machine readable string sometimes. Either way, I always add an
> acronym property to the type if one is ever used (e.g., satellite
> sensor). This approach is not really a standard because I flip
> between Kirrily's type and machine readable string. It's also
> subjective whether or not the user thinks the acronym rises to the
> level of a synonym. However, the info is there for when/if a
> standard is set and this approach seems to work pretty well.
>
> -Ed
>
>
> On Jan 21, 2008 11:55 PM, Shawn Simister <narphorium at gmail.com> wrote:
> Kirrily Robert wrote:
>>
>> ----- "Shawn Simister" <narphorium at gmail.com> wrote:
>>
>>> I have the finished pipe working if anyone wants to try it out.
>>> It's an acronym lookup tool.
>>> http://pipes.yahoo.com/narphorium/acronym
>>> There aren't many acronyms listed in Freebase yet so only the
>>> major ones give any results right now.
>>>
>> That's very cool! Reminds me I was thinking of creating an FB type
>> for acronyms and initialisms. Might make your pipes a little less
>> maze-like if that existed.
>>
>> K.
>>
> Thanks for the suggestions, I too was thinking that an acronym type
> could help out. As you may have seen, the pipe got a little
> complicated when I tried to factor in topics where the acronym is
> the main title of the topic and the full name is one of the aliases
> (e.g., DVD, NASA). A specialized type would definitely help in those
> cases.
>
> On the other hand, using the alias property might be easier to
> understand for a new user editing a topic and, as I understand it,
> the alias properties also help to improve the overall searchability
> of Freebase.
>
> I've extracted about 8,000 acronym/initialism definitions from
> Wikipedia which could be added to existing Freebase topics. Finding
> a standard way of representing this data would be the logical next
> step. Any thoughts?
>
> Shawn
>
> P.S. Congrats on your new job!
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
>
>