[Developers] mql_escape and UTF-8

Christopher R. Maden crism at metaweb.com
Wed Jul 30 16:43:47 UTC 2008


Shug Boabby wrote:
> Thanks Chris... I think I'd already worked all that out, but I was
> just wondering if anybody had actually written a Java encoder/decoder
> between UTF-8/MW Hex. I realise it should be simple to convert the
> $000 syntax, but it is troublesome to have to write this code myself.
> I really wish you'd decided to just use the URL encoding scheme as
> that would require no additional work on our side of things (despite
> it looking ugly). It's just not standard enough (although, admittedly,
> prettier).

Besides the fact that URL-encoding is often broken, we deliberately 
chose a variant syntax to avoid double escaping.

If we stored Garc%C3%ADa as a key in the Freebase graph, then the URL to 
access it would have to escape each % as %25, giving Garc%25C3%25ADa, 
which is just egregious.
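
To make the double-escaping problem concrete, here is a minimal Java sketch. The class and method names are made up for illustration, and it assumes Java 10 or later for the `URLEncoder.encode(String, Charset)` overload. It percent-encodes a name once (what would have been stored as the key) and then a second time (what a URL containing that key would need).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any Freebase or Metaweb library.
public final class DoubleEscapeDemo {
    private DoubleEscapeDemo() {}

    // Percent-encode once, as if the encoded form were stored as the key.
    public static String escapeOnce(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Percent-encode the already-encoded key so it can be placed in a URL.
    public static String escapeTwice(String s) {
        return escapeOnce(escapeOnce(s));
    }

    public static void main(String[] args) {
        String name = "Garc\u00EDa"; // "García", written with a Unicode escape
        System.out.println(escapeOnce(name));  // Garc%C3%ADa
        System.out.println(escapeTwice(name)); // Garc%25C3%25ADa
    }
}
```

A key that contains no % character survives URL encoding unchanged, so the second layer never appears.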

We could have allowed literal high characters in our keys, but there was 
a feeling that the keys should be as programmatically usable as possible 
in their native form, which meant keeping them to ASCII only.
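
For the Java encoder/decoder asked about above, the following is a rough sketch rather than Metaweb's own code. It rests on assumptions that this message does not spell out: that each escaped character is written as $ followed by four uppercase hex digits of its UTF-16 code unit, that ASCII letters and digits are always left alone, and that _ and - are left alone only in the interior of a key. The class name MwKeyCodec is hypothetical. Java strings are already UTF-16, so UTF-8 input should be decoded to a String first (for example with new String(bytes, StandardCharsets.UTF_8)).

```java
// Hypothetical helper; the escaping rules here are assumptions, see above.
public final class MwKeyCodec {
    private MwKeyCodec() {}

    private static boolean isAsciiAlnum(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9');
    }

    // Escape every UTF-16 code unit that is not assumed safe as $XXXX.
    public static String encode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean interior = i > 0 && i < s.length() - 1;
            if (isAsciiAlnum(c) || (interior && (c == '_' || c == '-'))) {
                out.append(c);
            } else {
                out.append('$').append(String.format("%04X", (int) c));
            }
        }
        return out.toString();
    }

    // Reverse of encode: turn each $XXXX back into one UTF-16 code unit.
    public static String decode(String key) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < key.length()) {
            char c = key.charAt(i);
            if (c == '$') {
                if (i + 5 > key.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                out.append((char) Integer.parseInt(key.substring(i + 1, i + 5), 16));
                i += 5;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String key = encode("Garc\u00EDa");
        System.out.println(key);                               // Garc$00EDa
        System.out.println(decode(key).equals("Garc\u00EDa")); // true
    }
}
```

Because the result contains only ASCII letters, digits, _, - and $, it can be embedded in a URL path without a second round of percent-escaping.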

~Chris
-- 
Christopher R. Maden
Data Architect
Freebase.com: <URL: http://www.freebase.com/ >
Metaweb Technologies, Inc. <URL: http://www.metaweb.com/ >