[Developers] Wikipedia key string encoding
Will Moffat
willmoffat at metaweb.com
Mon Jan 14 14:50:10 UTC 2008
Dear Metaweb,
> Max Lansing wrote:
>> Is there any documentation regarding the type of string encoding
>> used for wikipedia keys in freebase records?
Encoding and decoding keys is painful. Why doesn't the API service
convert the Metaweb encoding to a standard encoding?
(For example, using '\u1234' rather than '$1234'.)
I wrote two JavaScript functions for keys. It would be nice if I could
just throw them away.
(This code is pretty nasty, especially the eval. Does anybody have a
better system?)
regards,
--Will
/* Example encoding and decoding functions in Javascript */
function str2key(key) {
  // key may only contain letters, numbers, underscores, hyphens
  return key.replace(/[^A-Za-z0-9_-]/g, _dollarEscape);
}
// nasty hack to convert freebase Andrew_$0022Test$0022_Martin to
// Andrew_"Test"_Martin
function key2str(str) {
  // match a 4-digit hex number starting with $
  return str.replace(/\$[0-9a-fA-F]{4}/g, function(escaped) {
    // turn $XXXX into the string literal "\uXXXX" and evaluate it
    var unicodeStr = '"' + escaped.replace('$', '\\u') + '"';
    return eval(unicodeStr);
  });
}
function _dollarEscape(aChar) {
  // pad the hex character code to four digits, e.g. '"' -> $0022
  var hex = aChar.charCodeAt(0).toString(16).toUpperCase();
  if (hex.length==1) { return "$000"+hex; }
  if (hex.length==2) { return "$00" +hex; }
  if (hex.length==3) { return "$0"  +hex; }
  if (hex.length==4) { return "$"   +hex; }
  return null;
}
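
One possible way to avoid the eval, as a rough sketch rather than a
definitive implementation: parse the four hex digits directly and build
the character with String.fromCharCode. The names key2strNoEval and
str2keyPadded are made up for illustration, and the code assumes every
escape is '$' followed by exactly four hex digits, as in the example above.

```javascript
/* Hypothetical eval-free variants; function names are illustrative only. */

// Decode: assumes each escape is '$' followed by exactly four hex digits.
function key2strNoEval(key) {
  // Capture the four hex digits and convert them to a UTF-16 code unit.
  return key.replace(/\$([0-9A-Fa-f]{4})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Encode: escape everything except letters, numbers, underscores, hyphens.
function str2keyPadded(str) {
  return str.replace(/[^A-Za-z0-9_-]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    // charCodeAt never exceeds 0xFFFF, so left-padding to four digits suffices.
    return "$" + ("0000" + hex).slice(-4);
  });
}
```

With these, key2strNoEval('Andrew_$0022Test$0022_Martin') gives
Andrew_"Test"_Martin, and str2keyPadded reverses it.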