I wanted the Free Links page to be nice and clear for everyone to use as a reference, so I moved some clutter here.
Excellent! Now, is there a way to automate the conversion of old-style links (and pages) to new-style links? I seem to remember Cliff saying something about doing this. For example, I would like to convert links that look like LarrysText to Larry's Text; it would also be cool to convert the pages themselves, in this case to Larrys Text (since apostrophes are not allowed in page names). What are the chances of this happening anytime soon? -- Larry Sanger
Fixing the non-ASCII character problem should be relatively easy. A similar problem has already been solved in the IMAP world for the internationalization (i18n) of mailbox names, where the solution was a modified UTF-7 encoding. For a complete explanation, please see RFC 2060, section 5.1.3. The short version is that M-UTF-7 represents printable US-ASCII characters as themselves (except "&", which serves as the shift character and is itself written "&-") and "shifts" into modified BASE64 for all other characters, which are encoded as 16-bit Unicode values.
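To make the shifting rule concrete, here is a minimal sketch of a modified UTF-7 encoder in Python. The function name `encode_modified_utf7` is made up for illustration and is not part of any wiki software; the sketch only covers encoding, and it assumes the rules as described in RFC 2060, section 5.1.3.

```python
import base64


def encode_modified_utf7(name: str) -> str:
    """Encode a string with IMAP-style modified UTF-7 (illustrative sketch)."""
    out = []       # finished pieces of the result
    pending = []   # current run of characters that need BASE64

    def flush() -> None:
        # Emit the pending non-ASCII run as "&" + modified BASE64 + "-".
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified BASE64: no "=" padding, and "," instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")        # a literal ampersand is written "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)          # printable US-ASCII passes through
        else:
            pending.append(ch)      # everything else is shifted
    flush()
    return "".join(out)


if __name__ == "__main__":
    for title in ("Gödel", "Tom & Jerry", "日本語"):
        print(title, "->", encode_modified_utf7(title))
```

With this sketch, "Gödel" becomes "G&APY-del": the same BASE64 payload as plain UTF-7, but shifted with "&" rather than "+".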
We might have to use something like UTF-7 to encode URLs for the Japanese pages or for other languages. But for the more common languages that fit within the normal ISO-8859-1 range (Spanish, French, German, etc.) it might be better to anglicise for the purpose of searching. I can imagine, for example, English-speaking people who might want to read a French page about Paris, or a German page about Kurt Gödel. I can imagine people (perhaps even Germans used to English web sites) searching for "Godel" or "Goedel", but I can't imagine them ever searching for "G+APY-del" (the UTF-7 rendering). Of course, the software could do that in the background, but I think taking advantage of 8-bit ISO-8859-1 would generate more links and hits.
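As a rough illustration of the anglicising idea, the sketch below derives ASCII search keys from a title. The names `strip_accents`, `search_keys`, and the `GERMAN` table are hypothetical, the table handles only lower-case German letters, and this is just one possible approach, not a description of how the wiki software works.

```python
import unicodedata

# Hypothetical transliteration table for the German "oe"-style spelling.
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def strip_accents(text: str) -> str:
    """Drop diacritics by decomposing and removing combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def search_keys(title: str) -> set:
    """Return the title plus the anglicised spellings a reader might type."""
    german = "".join(GERMAN.get(c, c) for c in title)
    return {title, strip_accents(title), strip_accents(german)}


if __name__ == "__main__":
    print(sorted(search_keys("Gödel")))
    # The UTF-7 form, by contrast, is not something anyone would type:
    print("Gödel".encode("utf-7"))
```

Indexing a page under every key from `search_keys` would let a search for either "Godel" or "Goedel" find the Gödel page, while the page itself keeps its 8-bit title.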