  • It’s not Perl that’s line noise, it’s the regex syntax. Last night I wrote this: s{/(?!\.\.)[^/]+/+\.\.(?=/|\z)}{}g

    • I love the x modifier. :-)
      s{
          /           # Leading slash.
          (?!\.\.)    # No parent directories of root.
          [^/]+       # Pick out the directory name in root.
          /+\.\.      # Match one or more slashes, then parent directory.
          (?=/|\z)    # Assert that there is a following slash (or end of string).
      }{}gx
      So it cleans up pathnames, reducing /foo/../bar/../baz to just /baz, if I understand that right.
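
      A quick way to check that guess is to run the commented substitution on the example path. This is only a sketch: the helper name strip_parent_refs is made up for illustration, and the comments inside the pattern are reworded.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: applies the commented substitution once, globally.
      sub strip_parent_refs {
          my ($path) = @_;
          $path =~ s{
              /           # Leading slash.
              (?!\.\.)    # The next segment must not start with "..".
              [^/]+       # The directory name.
              /+\.\.      # One or more slashes, then "..".
              (?=/|\z)    # Followed by a slash or the end of the string.
          }{}gx;
          return $path;
      }

      my $cleaned = strip_parent_refs('/foo/../bar/../baz');
      print "$cleaned\n";    # prints "/baz"
      ```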
      • Even /x only really helps because of your profuse comments, though.

        You guessed correctly: I needed to normalize HTTP URIs to compare them, and URI [cpan.org]’s canonical method doesn’t finish the job. To be precise, this runs in a 1 while s/// loop, which is necessary to handle paths like foo/bar/baz/quux/../../../bar.
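
        To illustrate why the loop matters, here is a minimal sketch using the one-line pattern from the top of the thread. The variable names are assumptions chosen for the example. A single /g pass only removes the innermost pair, because the search resumes after the removed text and never revisits the directory that has just become adjacent to the next "..".

        ```perl
        use strict;
        use warnings;

        my $pattern = qr{/(?!\.\.)[^/]+/+\.\.(?=/|\z)};

        # A single global pass: only the innermost "dir/.." pair is removed.
        (my $once = 'foo/bar/baz/quux/../../../bar') =~ s{$pattern}{}g;

        # Repeating until nothing matches collapses every level.
        my $path = 'foo/bar/baz/quux/../../../bar';
        1 while $path =~ s{$pattern}{}g;

        print "$once\n";    # prints "foo/bar/baz/../../bar"
        print "$path\n";    # prints "foo/bar"
        ```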

        • It's not just the comments. I also find that breaking the regex into its component groups makes it easier to understand. It's what I do mentally anyway, so it just means that it's done already for me.

          -Dom

      • Hmm, actually, that has a bug. The negative look-ahead must contain a trailing slash, otherwise the pattern will erroneously fail to match something like foo/..fooledya/../bar.
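
        A small sketch of the failure, assuming that "contain a trailing slash" means changing the look-ahead from (?!\.\.) to (?!\.\./). The variable names are made up for the example.

        ```perl
        use strict;
        use warnings;

        my $path = 'foo/..fooledya/../bar';

        # Original look-ahead: rejects any segment that merely starts with "..".
        (my $original = $path) =~ s{/(?!\.\.)[^/]+/+\.\.(?=/|\z)}{}g;

        # Look-ahead with a trailing slash (assumed reading of the fix):
        # rejects only a ".." segment that is followed by a slash.
        (my $fixed = $path) =~ s{/(?!\.\./)[^/]+/+\.\.(?=/|\z)}{}g;

        print "$original\n";    # prints "foo/..fooledya/../bar" (unchanged)
        print "$fixed\n";       # prints "foo/bar"
        ```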

        At first I thought I needed a more complex assertion than just including a trailing slash in there, so I started rewriting the regex extensively. After I realised that it’s not that complex, I noticed that my comments have a noticeably different focus from yours, so I decided to keep the result for comparison:

        s{