[Developers] MQL bug WRT case sensitivity
Scott Meyer
sm at metaweb.com
Mon Mar 30 23:57:58 UTC 2009
Kurt Bollacker wrote:
> On Mon, Mar 30, 2009 at 03:57:54PM -0700, Warren Harris wrote:
>> On Mar 30, 2009, at 4:02 PM, Kurt Bollacker wrote:
>>
>>> Are there other good choices?
>> I see that for the IPA example, you're storing this stuff under the
>> "alias" property. Is there any way you could store it as a new
>> property of type rawstring? That would get around the case-
>> insensitivity issue.
>
> I could, but then the usability in the client would be reduced, and
> I'd be splitting aliases between /common/topic/alias and some new
> property. I think a "Right Model(tm)" would be to have the aliases be
> in their own language (e.g. "/lang/ipa"), which would specifically be
> case sensitive.
I'm a bit unclear on the need for case insensitivity in IPA. Your
original failure/example was a problem with the x-like characters
known as "uvular fricative" and "velar fricative" (χ, 0x03C7 and
ɣ, 0x0263, if your browser supports Unicode).
I don't see any mention of ASCII/Latin x, either upper or lower case,
in the following:
http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm
http://www.linguistlist.org/unicode/ipa.html
All the IPA characters seem to have unique codes.
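
A quick way to check the "unique codes" point is to look the characters up
and see whether ordinary case folding makes any of them equal. This is only
a sketch using Python's standard library; it says nothing about how MQL
itself compares strings, and the helper names (describe,
collide_when_case_folded) are made up for illustration.

```python
import unicodedata

# The two IPA symbols from the thread, plus plain ASCII "x" for comparison.
SYMBOLS = ["\u03c7", "\u0263", "x"]


def describe(ch):
    """Return (code point, Unicode name) for a single character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch))


def collide_when_case_folded(chars):
    """Return True if any two characters become equal after case folding."""
    folded = [c.casefold() for c in chars]
    return len(set(folded)) < len(folded)


for ch in SYMBOLS:
    print(describe(ch))

# Do the three symbols stay distinct under Unicode case folding?
print(collide_when_case_folded(SYMBOLS))
```

If a comparison does conflate these characters, that would suggest something
other than plain Unicode case folding is happening (again, an assumption
about the cause, not a diagnosis).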
Assuming that I'm right about the character encoding issues,
I'm not sure how to represent IPA in Freebase. /lang/ipa
seems wrong since it would make the obvious "pronunciation"
property awkward to work with. Under the current MQL,
you'd have to ask for it with an explicit "lang" : "/lang/ipa"
constraint. OTOH, storing it as some other language, e.g.
"/lang/en", seems just as bad, although technically correct if
you're describing the (or an?) English pronunciation of an
English word.
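
To make the awkwardness concrete, here is roughly what asking for an IPA
value with an explicit "lang" constraint might look like, built as a Python
dict and dumped as JSON. Everything specific here is an assumption for
illustration: "/lang/ipa" does not exist today, and the topic id and the
"/common/topic/pronunciation" property are hypothetical.

```python
import json


def build_text_query(topic_id, prop, lang):
    """Build an MQL-style query asking for text values of `prop` in `lang`."""
    return {
        "id": topic_id,
        # Without the explicit "lang" clause the caller would not be asking
        # for IPA values specifically, hence the awkwardness.
        prop: [{"value": None, "lang": lang}],
    }


# Hypothetical topic id, property, and language, for illustration only.
query = build_text_query(
    "/en/example_topic",
    "/common/topic/pronunciation",
    "/lang/ipa",
)
print(json.dumps(query, indent=2))
```

Every client that wants a pronunciation would have to carry that extra
"lang" clause around.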
Consider the case of a word which is spelled the same
and means the same in two different languages but has
different pronunciations. Pretty typical when one
language borrows a word from another.
-Scott