[Developers] Wikipedia key string encoding

Will Moffat willmoffat at metaweb.com
Mon Jan 14 14:50:10 UTC 2008


Dear Metaweb,

> Max Lansing wrote:
>> Is there any documentation regarding the type of string encoding
>> used for wikipedia keys in freebase records?

Encoding and decoding keys is painful. Why doesn't the API service
convert the Metaweb encoding to a standard one?
(For example, using '\u1234' rather than '$1234'.)

I wrote two JavaScript functions for converting keys. It would be nice
if I could just throw them away.
(This code is pretty nasty, especially the eval. Does anybody have a
better system?)

regards,
--Will




/* Example encoding and decoding functions in Javascript */

function str2key(key) {
     // key may only contain letters, numbers, underscores, hyphens
     return key.replace( /[^A-Za-z0-9_-]/g, _dollarEscape );
}

// nasty hack to convert freebase Andrew_$0022Test$0022_Martin to
// Andrew_"Test"_Martin
function key2str(str) {
     // match a '$' followed by exactly 4 hex digits
     return str.replace(/\$[0-9a-fA-F]{4}/g, function(esc) {
         var unicodeStr = '"' + esc.replace('$', '\\u') + '"';
         return eval(unicodeStr);
     });
}

function _dollarEscape(aChar) {
     var hex = aChar.charCodeAt(0).toString(16).toUpperCase();
     if (hex.length==1) { return "$000"+hex; }
     if (hex.length==2) { return "$00" +hex; }
     if (hex.length==3) { return "$0"  +hex; }
     if (hex.length==4) { return "$"   +hex; }
     return null;
}
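
On the question of a better system: one possible eval-free variant is
sketched below. It assumes the same rules as the functions above (every
character outside letters, digits, underscore and hyphen becomes '$'
plus four uppercase hex digits); the real Freebase key rules may differ.
The names keyToString and stringToKey are made up for this example and
are not part of any Metaweb API.

```javascript
// Sketch only: hypothetical helper names, same escaping rules as the
// str2key/key2str functions above.

// Decode "$XXXX" escapes (exactly four hex digits) without eval.
function keyToString(key) {
    return key.replace(/\$([0-9A-Fa-f]{4})/g, function (match, hex) {
        // parseInt turns the hex digits into a UTF-16 code unit
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// Encode every character outside [A-Za-z0-9_-] as "$" followed by
// four uppercase hex digits, zero-padded on the left.
function stringToKey(str) {
    return str.replace(/[^A-Za-z0-9_-]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16).toUpperCase();
        return "$" + ("0000" + hex).slice(-4);
    });
}
```

With these, keyToString('Andrew_$0022Test$0022_Martin') returns
'Andrew_"Test"_Martin', and stringToKey reverses it. Parsing the hex
digits directly avoids handing key text to eval.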

