
jdavidb (1361)

http://voiceofjohn.blogspot.com/

J. David Blackstone has a Bachelor of Science in Computer Science and Engineering and nine years of experience at a wireless telecommunications company, where he learned Perl and never looked back. J. David has an advantage in that he works really hard, he has a passion for writing good software, and he knows many of the world's best Perl programmers.

Journal of jdavidb (1361)

Monday August 06, 2007
10:08 AM

Stop URI-encoding everything

[ #34011 ]

Wikipedia handles articles for different subjects with the same name by using parenthetical expressions to disambiguate:

Git (software) vs. Git (album).

However, when you go to the disambiguation page that links to both of those, you see that the parentheses have been encoded: http://en.wikipedia.org/wiki/Git_(software).

My problem is that in my browser history, http://en.wikipedia.org/wiki/Git_(software) and http://en.wikipedia.org/wiki/Git_(software) are completely different objects. And one of them is completely unreadable and unwieldy. Firefox is smart enough to display the links correctly when I hover over them. But it does not know that it can convert that link back to the readable form and make my life easier.
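To make the complaint concrete, here is a minimal Python sketch of the comparison a history list could make. The `%28`/`%29` spelling is my assumption about what the encoded link looks like (those are the standard percent-encodings of the two parentheses), and the sketch assumes the site serves the same page for both spellings, which the URI rules do not promise for every character.

```python
from urllib.parse import unquote

# Readable spelling, as it appears in the article title.
readable = "http://en.wikipedia.org/wiki/Git_(software)"
# Assumed encoded spelling: "(" becomes %28 and ")" becomes %29.
encoded = "http://en.wikipedia.org/wiki/Git_%28software%29"

# Compared as raw strings, the two are different history entries.
print(readable == encoded)           # False

# After percent-decoding, they are the same address.
print(unquote(encoded) == readable)  # True
```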

This gets even worse when I try to get links to RSS feeds back out of Google Reader, where I have subscribed to them. Occasionally I want to pass such a link to someone, or pass them a link where they can view the feed's history in Reader, which is done by prefixing http://www.google.com/reader/view/feed/ to the feed URL. Firefox helpfully lets me right-click the links Reader provides, which are exactly that string, and copy them. But Google has URL-encoded them to death, so colons, slashes, and who knows what else have been turned into ugly percent-laden strings that I am scared to send over email or, worse, use as the target of a link in a web forum post. The most convenient fix available to me seems to be to go back and manually de-URI-encode the link, so it doesn't happen much.
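The manual de-encoding step could be scripted. The sketch below is only an illustration: `reader_link` and `feed_from_reader_link` are hypothetical helper names, the feed URL is a made-up example, and I am assuming Reader percent-encodes the whole feed URL (colons and slashes included) after the prefix.

```python
from urllib.parse import quote, unquote

READER_PREFIX = "http://www.google.com/reader/view/feed/"

def reader_link(feed_url):
    """Build a Reader-style link; safe="" encodes even ":" and "/"."""
    return READER_PREFIX + quote(feed_url, safe="")

def feed_from_reader_link(link):
    """Recover the readable feed URL from a Reader-style link."""
    if not link.startswith(READER_PREFIX):
        raise ValueError("not a Reader feed link")
    return unquote(link[len(READER_PREFIX):])

feed = "http://example.com/rss.xml"  # hypothetical feed URL
link = reader_link(feed)

print(link)
# http://www.google.com/reader/view/feed/http%3A%2F%2Fexample.com%2Frss.xml
print(feed_from_reader_link(link))
# http://example.com/rss.xml
```

In this case the encoding is doing real work: the feed URL is being embedded inside another URL's path, so its own slashes and colons cannot be left bare without changing what the outer URL means.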

I'm sure there's some technical reason why this "has" to be done, but it's obvious that in practice (from the links above) it does not really have to be done most of the time, and I would love to hear someone tell me that this is a technical misunderstanding and that this really isn't required and all these people should change their software.
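One plausible explanation, which I have not confirmed for any of these sites, is that general-purpose encoding routines escape everything outside a small safe set by default. Python's standard `urllib.parse.quote` behaves that way, and the caller has to opt in to keep the parentheses readable:

```python
from urllib.parse import quote

title = "Git_(software)"

# Default: letters, digits, "_", ".", "-", "~" and "/" are left alone;
# everything else, parentheses included, is percent-encoded.
by_default = quote(title)

# Declaring the parentheses safe keeps the readable spelling.
kept_readable = quote(title, safe="/()")

print(by_default)     # Git_%28software%29
print(kept_readable)  # Git_(software)
```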

Footnote of supreme irony: while previewing my post, my two example links using parentheses broke because the space gets lost somewhere along the line. :) But the parentheses are just fine.

Footnote of supreme irony #2: I can't post here examples of the URLs I hate because something DE-URI-encodes them to make them look right! They are in the above, but to really see them you'll have to go to the disambiguation page I linked to.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • The right thing to do is to stop not URI-encoding everything. The browser should use the encoded form as a key everywhere and show it in some contexts where it matters (e.g., autocompletion when typing), but show it decoded as an aid when that’s OK.

    (For basis, see also: homographs.)

  • I think of this as a browser bug: for the history, it should treat a URI-encoded URL and its equivalent non-encoded form as the same URL.
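Both comments amount to the same idea: key the history on one canonical spelling and decode only for display. A naive Python sketch of that idea follows. `history_key` and `display_form` are hypothetical names, the `%28`/`%29` URL is an assumed example, and decoding then re-encoding is a simplification that would blur real distinctions (such as `%2F` versus `/`), so this is not how any particular browser works.

```python
from urllib.parse import quote, unquote

def history_key(url):
    """Collapse encoded and readable spellings into one encoded key."""
    # Decode first so both spellings start from the same text,
    # then re-encode everything except basic URL punctuation.
    return quote(unquote(url), safe=":/?&=#")

def display_form(key):
    """Decoded spelling, for showing to a human."""
    return unquote(key)

visits = {}
for url in (
    "http://en.wikipedia.org/wiki/Git_(software)",
    "http://en.wikipedia.org/wiki/Git_%28software%29",
):
    key = history_key(url)
    visits[key] = visits.get(key, 0) + 1

for key, count in visits.items():
    print(display_form(key), count)
# http://en.wikipedia.org/wiki/Git_(software) 2
```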