[Developers] MQL bug WRT case sensitivity
Scott Meyer
sm at metaweb.com
Tue Mar 31 22:32:42 UTC 2009
Kurt Bollacker wrote:
>> All the IPA characters seem to have unique codes.
>
> Google for "ASCII IPA" (e.g. find SAMPA and Kirshenbaum). It's been
> practice for some to use pure ASCII to encode IPA, which treats upper
> and lower case as different. The problem I have here is that I
> seem to have an unholy mix of IPA representations. Don't blame me, the
> linguists made me do it!
And the reason you can't translate automatically from ASCII IPA into
Unicode IPA is that the ASCII IPA is mixed with plain English?
>> Assuming that I'm right about the character encoding issues
>> I'm not sure how to represent IPA in Freebase. /lang/ipa
>> seems wrong since it would make the obvious "pronunciation"
>> property awkward to work with. Under the current MQL,
>> you'd have to ask for it with an explicit "lang" : "/lang/ipa"
>> constraint. OTOH, storing it as some other language, e.g.
>> "/lang/en", seems just as bad, although technically correct if
>> you're describing the (er an?) English pronunciation of an
>> English word.
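
To make the "explicit lang constraint" point concrete, here is a rough
sketch of what such a query might look like, built as a Python dict and
serialized to JSON. The property id /user/example/pronunciation and the
topic id are hypothetical placeholders, not part of any real schema, and
/lang/ipa is only the namespace being floated in this thread:

```python
import json

# Hypothetical property id, used only for illustration.
PRONUNCIATION_PROP = "/user/example/pronunciation"


def pronunciation_query(topic_id, lang="/lang/ipa"):
    """Build an MQL-style read query that asks for a text value in one
    explicit language instead of relying on the default language."""
    return [{
        "id": topic_id,
        # "value": None serializes to null, i.e. "fill this in for me".
        PRONUNCIATION_PROP: [{"value": None, "lang": lang}],
    }]


query = pronunciation_query("/en/example_topic")  # placeholder topic id
print(json.dumps(query, indent=2))
```

Without the "lang" key, the value would be requested in the default
language, which is why a separate IPA "language" makes the property
awkward to query.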
>
> Unfortunately, I have many "Alternate Names" for Languages that are
> mostly (English) words (in ASCII) mixed in with representations of IPA
> for the rest. The most manageable way I know to handle this is to
> make them all aliases in English.
...and there are too many English words which are also valid ASCII IPA?
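
A small sketch of why that would be a problem, assuming an X-SAMPA-style
encoding. The table below is a tiny illustrative subset, and the helper
names are made up for this example:

```python
# A few X-SAMPA symbols whose meaning depends on case (illustrative subset).
XSAMPA_TO_IPA = {
    "S": "\u0283",  # voiceless postalveolar fricative
    "s": "s",
    "T": "\u03b8",  # voiceless dental fricative
    "t": "t",
    "N": "\u014b",  # velar nasal
    "n": "n",
    "I": "\u026a",  # near-close front unrounded vowel
    "i": "i",
    "@": "\u0259",  # schwa
}


def to_unicode_ipa(text):
    """Translate character by character; unknown characters pass through."""
    return "".join(XSAMPA_TO_IPA.get(ch, ch) for ch in text)


def is_valid_ascii_ipa(text):
    """True if every character is a known symbol in the subset above."""
    return all(ch in XSAMPA_TO_IPA for ch in text)


# Case matters: these two strings differ only in case but encode
# different sounds, so case-insensitive matching conflates them.
print(to_unicode_ipa("TIN"))
print(to_unicode_ipa("tin"))

# An ordinary English word can also be a well-formed ASCII IPA string,
# so a translator cannot tell which one it is looking at.
print(is_valid_ascii_ipa("sit"))
```

So "TIN" and "tin" have to stay distinct, and a plain English alias like
"sit" is indistinguishable from a transcription without extra context.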
-Scott