[Data-modeling] Distilled & blended spirits

Christopher R. Maden crism at metaweb.com
Fri Jun 20 19:38:15 UTC 2008


Tom Morris wrote:
> Actually, I realized after reading up a little bit on Freebase i18n
> (there isn't any)

Where did you read that?

Check out <URL: http://blog.freebase.com/2007/12/04/internationalization-in-freebase/ >.
The client just doesn’t expose the internationalization well right now.

> that "poitín" is probably wrong since it's likely
> been automatically tagged as an English word, which it's not.  It also
> offered me an automatic "aka" of "poitin" which I unthinkingly
> accepted.  It's fine to keep this as an alternate search target, but
> it shouldn't be displayed under "also known as" because it's not a
> real word (in any language).
> 
> So what's the current best practice?  Delete "poitín" and promote
> "poteen" to the primary name?  I'm concerned that if I leave it as is,
> it'll be hard to find later (ie there's no query "show me all entries
> tagged as English which contain non-English words").
> 
> On a related note, are entries supposed to be in American English,
> some flavor of "international" English or author's choice?   (Question
> triggered by the use of a naked "Scotch" which I think is an
> Americanism for "Scotch whisky").

The display name should be the one that English-speaking users are 
most likely to recognize.  In American vs. British conflicts (e.g. 
colo[u]r), as on Wikipedia, whoever gets there first wins.  Other 
variations should be present as aliases.

Accented Latin characters, as in poitín, are fine.  Chinese, Greek, 
Hebrew, Arabic, Cyrillic, etc., should generally be avoided in English 
names of topics.

So I would say leave Poitín as the display name (I think most 
English-speaking fans or foes would recognize that name, and the 
Wikipedia article uses that name) and add Poteen as an alias.  But it 
is, to some extent, a matter of subjective opinion; do what you think is 
best there.  The unaccented version Poitin is a somewhat unfortunate 
hack, but until we have nailed down our international searching, it 
should probably be left in place.  (Just don’t take “Ano Nuevo State 
Reserve” too seriously.)
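
To illustrate why the unaccented alias helps a search that does not yet 
fold accents, here is a minimal sketch in Python.  It is not how Freebase 
search works; the function names fold_accents and search_keys are 
hypothetical, and the approach (Unicode NFD decomposition, then dropping 
combining marks) is just one common way to derive an unaccented form.

```python
import unicodedata


def fold_accents(name: str) -> str:
    """Return name with combining marks removed, e.g. 'Poitín' -> 'Poitin'.

    Assumption: NFD decomposition followed by dropping combining
    characters is an acceptable approximation of "unaccented" here.
    """
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_keys(display_name: str, aliases: list[str]) -> set[str]:
    """Build a set of case-insensitive lookup keys for a topic.

    Hypothetical helper: each name contributes both its original form
    and its accent-folded form, so a user typing either one gets a match.
    """
    keys = set()
    for name in [display_name, *aliases]:
        keys.add(name.casefold())
        keys.add(fold_accents(name).casefold())
    return keys


if __name__ == "__main__":
    # Display name with the accent, plus the anglicized alias.
    print(sorted(search_keys("Poitín", ["Poteen"])))
```

If the search index generated folded keys like this itself, the 
hand-entered alias Poitin would no longer be needed, and folded forms 
such as “Ano Nuevo” would never have to appear as names.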

~Chris
-- 
Christopher R. Maden
Data Architect
Metaweb Technologies, Inc.
<URL: http://www.metaweb.com/ >

