
Robrt (1414)


robert at perl dot org

Journal of Robrt (1414)

Tuesday May 02, 2006
11:37 PM

Line Noise

[ #29497 ]
I wrote this regular expression yesterday: /\._.*/

My thought at that moment? Yes, Perl is line noise.

Anyone want to guess what it's for?

    • How, pray tell, is this particular regex going to backtrack needlessly? Or are you just cargo culting the “death to dot star” line?

      (That said, the .* in Robrt’s pattern is superfluous, as the pattern will match the exact same things with or without it.)

      • Yes, I was invoking it in the "Cargo Cult" context. The .* at the end was superfluous. He could have put them at both ends: /.*\._.*/. So, yes, I was cargo culting it. I was trying to draw attention to the fact that .* is almost never what you want to use.
        • “Death to dot star” is about backtracking. At the end of the pattern, the .* won’t backtrack. If you put another one at the front, though, it will. What is your point?

          You would have made your case much better if you just said “the .* there is a noop” instead of throwing in something entirely unrelated that happens to be about dot star.

          “I was trying to draw attention to the fact that .* is almost never what you want to use.”

          But you’re wrong. It is exactly what you wa
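The claim in this thread that a trailing .* is a no-op for an unanchored match can be checked with a quick sketch. The sample names and the `disagreements` helper below are made up for illustration and do not come from the original post; the sketch only shows that the two patterns accept the same strings while capturing different text.

```perl
use strict;
use warnings;

# Hypothetical sample names; none of them come from the original post.
my @names = ('._notes.txt', 'notes.txt', 'dir/._x', 'a._', 'plain', '.hidden');

# Hypothetical helper: returns the names on which the two patterns disagree
# about whether there is a match at all.
sub disagreements {
    my @list = @_;
    return grep { (/\._.*/ ? 1 : 0) != (/\._/ ? 1 : 0) } @list;
}

my @diff = disagreements(@names);
print "names where the patterns disagree: ", scalar(@diff), "\n";

# What differs is the matched text: the trailing .* extends the match
# to the end of the line, it does not change whether the match succeeds.
my ($with)    = 'dir/._x' =~ /(\._.*)/;   # captures '._x'
my ($without) = 'dir/._x' =~ /(\._)/;     # captures '._'
print "with .*: $with, without .*: $without\n";
```

Because .* can always match the empty string, any string accepted by /\._/ is also accepted by /\._.*/, and the reverse holds trivially.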

  • Let me guess: it has to do with removing those resource fork files that you get on OS X in some circumstances?

  • It’s not Perl that’s line noise, it’s the regex syntax. Last night I wrote this: s{/(?!\.\.)[^/]+/+\.\.(?=/|\z)}{}g

    • I love the x modifier. :-)

      s{
          /           # Leading slash.
          (?!\.\.)    # No parent directories of root.
          [^/]+       # Pick out the directory name in root.
          /+\.\.      # Match any number of slashes, then parent directory.
          (?=/|\z)    # Assert that there is a following slash (or end of string).
      }{}gx

      So it cleans up pathnames to remove /foo/../bar/../baz to be ju

      • Even /x only really helps because of your profuse comments, though.

        You guessed correctly: I needed to normalize HTTP URIs to compare them, and URI [cpan.org]’s canonical method doesn’t finish the job. To be precise, this runs in a 1 while s/// loop, which is necessary to handle paths like foo/bar/baz/quux/../../../bar.

        • It's not just the comments. I also find that breaking the regex into its component groups makes it easier to understand. It's what I do mentally anyway, so it just means that it's done already for me.

          -Dom
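The "1 while s///" loop mentioned above can be sketched roughly as follows. The substitution is the one quoted in the thread; the `collapse_dotdot` wrapper is a hypothetical name added here, and the sketch assumes the path is already in a plain scalar rather than a URI object.

```perl
use strict;
use warnings;

# Hypothetical helper name; the substitution itself is the one from the thread.
sub collapse_dotdot {
    my ($path) = @_;
    # One pass removes every "/name/.." pair it can see. Removing a pair can
    # bring another name next to a "..", so repeat until a pass changes nothing.
    1 while $path =~ s{/(?!\.\.)[^/]+/+\.\.(?=/|\z)}{}g;
    return $path;
}

# Needs three passes: quux, then baz, then bar are removed in turn.
print collapse_dotdot('foo/bar/baz/quux/../../../bar'), "\n";   # foo/bar
```

A single s///g pass is not enough here because the scan continues after each removed pair and never revisits the text to its left, which is exactly where the next removable pair appears.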

      • Hmm, actually, that has a bug. The negative look-ahead must contain a trailing slash, otherwise the pattern will erroneously fail to match something like foo/..fooledya/../bar.
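The bug described here can be reproduced with a small sketch. Both helper names are hypothetical, and the patched pattern is only one reading of "the look-ahead must contain a trailing slash"; it is not the commenter's own rewritten regex.

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern as originally quoted in the thread.
sub collapse_original {
    my ($path) = @_;
    1 while $path =~ s{/(?!\.\.)[^/]+/+\.\.(?=/|\z)}{}g;
    return $path;
}

# Hypothetical helper: same pattern with a slash added to the look-ahead.
sub collapse_patched {
    my ($path) = @_;
    # (?!\.\./) rejects only a segment that is exactly "..", so a name that
    # merely starts with two dots is still eligible for removal.
    1 while $path =~ s{/(?!\.\./)[^/]+/+\.\.(?=/|\z)}{}g;
    return $path;
}

my $path = 'foo/..fooledya/../bar';
print collapse_original($path), "\n";   # foo/..fooledya/../bar (left untouched)
print collapse_patched($path),  "\n";   # foo/bar
```

With the slash in place, a genuine parent reference such as the first segment of "/../.." is still protected, since it is followed by a slash and so trips the look-ahead.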

        At first I thought I needed a more complex assertion than just including a trailing slash in there, so I started rewriting the regex extensively. After I realised that it’s not that complex, I noticed that my comments actually have a noticeably different focus from yours, so I decided to keep the result for comparison:

        s{