[Developers] MQL bug WRT case sensitivity

Scott Meyer sm at metaweb.com
Mon Mar 30 23:57:58 UTC 2009


Kurt Bollacker wrote:
> On Mon, Mar 30, 2009 at 03:57:54PM -0700, Warren Harris wrote:
>> On Mar 30, 2009, at 4:02 PM, Kurt Bollacker wrote:
>>
>>> Are there other good choices?
>> I see that for the IPA example, you're storing this stuff under the  
>> "alias" property. Is there any way you could store it as a new  
>> property of type rawstring? That would get around the case- 
>> insensitivity issue. 
> 
> I could, but then the usability in the client would be reduced, and
> I'd be splitting aliases between /common/topic/alias and some new
> property.  I think a "Right Model(tm)" would be to have the aliases be
> in their own language (e.g. "/lang/ipa"), which would specifically be
> case sensitive.

I'm a bit unclear on the need for case insensitivity in IPA.  Your
original failure/example was a problem with the x-like characters
known as "uvular fricative" and "velar fricative" (χ, 0x03C7 and
x, 0x0263, if your browser supports Unicode).

I don't see any mention of ASCII/Latin x, either upper or lower case,
in the following:

http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm
http://www.linguistlist.org/unicode/ipa.html

All the IPA characters seem to have unique codes.
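
A quick way to check the "unique codes" claim is to look the characters
up by code point.  The sketch below uses Python's standard unicodedata
module; the two code points are the ones quoted above, and plain Latin
"x" is added only for comparison.  Python's casefold() is a stand-in
here and is an assumption: it is not necessarily the folding that the
Freebase graph store applies.

```python
import unicodedata

# 0x03C7 and 0x0263 are the code points quoted in this message;
# plain "x" is included only as a point of comparison.
chars = {"0x03C7": "\u03c7", "0x0263": "\u0263", "latin x": "x"}

for label, ch in chars.items():
    # Show the official character name plus its case-folded and upper-case forms.
    print(
        f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}, "
        f"casefold -> U+{ord(ch.casefold()):04X}, "
        f"upper -> U+{ord(ch.upper()):04X}"
    )

# If the folded forms are all different, case-insensitive comparison
# alone cannot make these three characters collide.
folded = {ch.casefold() for ch in chars.values()}
distinct_after_folding = len(folded) == len(chars)
print("distinct after case folding:", distinct_after_folding)
```

If the three stay distinct after folding, then a collision between
them would have to come from some other normalization step, not from
case insensitivity as such.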

Assuming that I'm right about the character encoding issues,
I'm not sure how to represent IPA in Freebase.  /lang/ipa
seems wrong since it would make the obvious "pronunciation"
property awkward to work with.  Under the current MQL,
you'd have to ask for it with an explicit "lang" : "/lang/ipa"
constraint.  OTOH, storing it as some other language, e.g.
"/lang/en", seems just as bad, although technically correct if
you're describing the (er, an?) English pronunciation of an
English word.
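
To make the awkwardness concrete, here is roughly what such a query
might look like, built as a Python dict and serialized to JSON.  This
is only a sketch: "/lang/ipa" is the namespace proposed in this thread
and does not exist, and the topic id is a made-up placeholder.

```python
import json

# Hypothetical MQL read query: fetch the aliases of one topic, restricted
# to the proposed (not existing) "/lang/ipa" language.
query = {
    "id": "/en/example_topic",        # placeholder id, for illustration only
    "/common/topic/alias": [{
        "value": None,                # null in JSON: "fill this in for me"
        "lang": "/lang/ipa",          # the explicit constraint discussed above
    }],
}

serialized = json.dumps(query, indent=2)
print(serialized)
```

Without the explicit "lang" clause the query would not be asking for
the IPA values at all, which is what makes a language-based encoding
clumsy for a "pronunciation" property.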

Consider the case of a word which is spelled the same
and means the same in two different languages but has
different pronunciations.  Pretty typical when one
language borrows a word from another.
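
As a rough illustration (the example word and the transcriptions are
mine, not from the thread, and the structure is hypothetical, not a
Freebase schema), the pronunciation has to be keyed by both the
spelling and the language of the word being pronounced:

```python
# Hypothetical lookup table: (spelling, language of the word) -> IPA string.
# The IPA text itself carries no language tag of its own here.
pronunciations = {
    ("Paris", "/lang/en"): "ˈpærɪs",
    ("Paris", "/lang/fr"): "paʁi",
}

def pronounce(spelling: str, lang: str) -> str:
    """Return the IPA transcription for a spelling in a given language."""
    return pronunciations[(spelling, lang)]

english = pronounce("Paris", "/lang/en")
french = pronounce("Paris", "/lang/fr")
print(english, french)
```

Tagging the IPA string itself as "/lang/ipa" would lose exactly the
piece of information that distinguishes the two entries.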

-Scott


