All the Perl that's Practical to Extract and Report

  • You are working under the assumption that you won't be able to reuse the existing toolchain to write these tools.

    The approach that everyone seems to be aiming for is that, instead of writing PPI all over again for Perl 6, you get decent enough support from the actual grammar that ships with the compiler to do your own interesting things with it.

    Secondly, since Perl 6 supports separate compilation units, in many ways this is actually much simpler than in Perl 5: there is no more possibility for action at a distance. If a unit is expected to compile independently of the other units, then all the information on how to compile it must be in the files that make up that unit. Granted, you are allowed to twiddle things in BEGIN { } by just running perl, but the sequencing of the operations is less ad hoc. This has a lot of merit in this particular field: in the absence of any BEGIN { } declarations or importing of macros/grammars from other compilation units (this is something you can statically check for if the unit is already compiled), you can statically parse Perl 6 as defined in its core syntax.

    In other cases, if what you're after is the safety of not running the compile-time code, then theoretically you can just use something like Perl 5's Safe on all the macros and grammar extensions.
    • in the absence of any BEGIN { } declarations or importing of macros/grammars from other compilation units (this is something you can statically check for if the unit is already compiled)


      I am probably being thick, but once the code is compiled, haven't you already run the BEGIN block and all its arbitrary contents?
      --
      rjbs
      • The begin block itself has to be fully parsed before it's run.

        Similarly a macro or grammar extension coming from another compilation unit has already been compiled.

        At this point you can examine their code in a manner much like Safe does (existing problems in Safe are an implementation issue, not a conceptual one), and run the code with some resource limitation if necessary (if this wasn't possible then we wouldn't have JavaScript ;-)

        Furthermore, if you deduced by static analysis that these blocks cannot aff
        • > At this point you can examine their code in a manner much like Safe does (existing problems in Safe are an implementation issue, not a conceptual one), and run the code with some resource limitation if necessary (if this wasn't possible then we wouldn't have JavaScript ;-)

          You don't need to run JavaScript in order to parse it, since it has (I think) a static syntax.

          Also, this comes down to practicality.

          "What percentage of CPAN can this parsing strategy handle?"

          As a context-free document parser, PPI can
          • With respect to dependencies, that will indeed fail to work, but for use Foo that's untrue: Perl 6's importing semantics will support real linkage of symbols for the benefit of compilation units. The method 'import' and glob assignments are not supposed to be the only way to actually import symbols anymore. This solves a lot of issues.

            As for reading files etc in BEGIN - that's also handled differently - there is no guarantee that a BEGIN block will run every single time you run the program, it is fair game
    • > In other cases, if what you're after is the safety of not running the compile-time code, then theoretically you can just use something like Perl 5's Safe on all the macros and grammar extensions.

      Only if you can solve the Halting Problem.

      In Perl 5, even trivial Perl examples involve BEGIN blocks (use strict) and grammar modification (operator/operand switching).

      This problem applies to Perl 5 too.

      Simon Cozens has a never-released parser based on the Perl internal parser.

      It works just fine, as long as the code compi
      • Only if you can solve the Halting Problem.

        No version of the Perl compiler or processor for any version of the language attempts to solve the Halting Problem. They tend to do a pretty good job on most reasonably correct code (and plenty of unreasonably incorrect code as well). You don't need to solve the Halting Problem. You only need to decide if it's worth it at any particular point to Halt and say "Sorry, I'm not going to continue processing from here," and you can do that if you control the runloo

        • This pretty much covers my point.

          If the code does anything remotely interesting or unusual, you have to abort parsing the document. (Worse, you may have to do it after already having spent significant CPU time trying.)

          Limiting yourself to documents that compile significantly reduces the types of tools you can use.

          I guess in a way this entire post is something of a challenge to prove that a useful non-executing parser can be written for Perl 6.

          Maybe I should formalize it at some point.
          • If the code does anything remotely interesting or unusual...

            ... and non-declarative, which I think you keep overlooking. While I agree that there are ways to write grammar actions that change parsing in unfortunate ways, grammars themselves look more or less statically decidable in ways that regular expressions aren't.

            I won't suggest that they're quite as static as an EBNF grammar is, but they're much, much closer than the Perl 5 parser. It should be possible to identify arity and precedence without

            • Well, I'm assuming that "interesting and unusual" things will be non-declarative.

              Anything declarative becomes "normal" for Perl 6.

              Grammar changes have two overlapping issues.

              There's the BEGIN problem. Let's assume that isn't a problem because grammar changes are declarative and decidable.

              The secondary problem for grammar changes is how to (and if you can) handle syntactic and semantic modelling for the resulting document in such a way as to allow for stuff like $document->find('comments');
      • The halting problem only applies to code you cannot introspect.

        If you have a function, and that's all, then you can't find out what's in it.

        But given a compiled optree, you have much more information.

        If you parse the BEGIN { } block under the current rules, then you wind up with an optree which you can then examine, to see what it does.

        As for Simon's project - Perl 5's parser was never designed to make this easy, it was designed to emit an interpreter-optimized optree. This is very different from the design
        • We probably need to escalate this to a formally trained mathematician here, but as I understand it, it applies to any case with "arbitrary" code whether introspectable or not.

          You CAN prove something will finish in finite time, you just can't prove how long that finite time is, which may be longer than the heat death of the universe.
          • Errm, you would have done well to follow your own advice and stop at the end of the first sentence. :-)

          • No, that is just plain wrong.

            You can very easily prove that the program 1 + 1 will return in finite, and short, time, and it has nothing to do with the heat death of the universe (very long != infinity).
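
The point made above about examining a compiled tree, rather than running it, can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's standard `ast` module stands in for a Perl optree, and the function name `provably_terminates` and the whitelist `_SAFE_NODES` are hypothetical names chosen for this sketch, not part of any Perl 5 or Perl 6 tool. The checker never executes the code; it only answers "proven to terminate" or "not proven".

```python
import ast

# Node types that cannot loop, call, or otherwise run arbitrary code.
# This whitelist is an assumption made for illustration; it is deliberately
# conservative and is not a complete termination analysis.
_SAFE_NODES = (
    ast.Expression,  # the root of an expression-mode parse
    ast.Constant,    # literals such as 1 or 2.5
    ast.BinOp,       # a + b, a * b, ...
    ast.UnaryOp,     # -a, +a, ...
    ast.operator,    # the Add/Mult/... markers inside BinOp
    ast.unaryop,     # the USub/UAdd/... markers inside UnaryOp
)


def provably_terminates(source: str) -> bool:
    """Return True only if `source` is pure arithmetic on literals.

    A False result means "not proven", not "runs forever": anything
    outside the whitelist (names, calls, statements) is simply refused.
    """
    try:
        # Parsing only builds the tree; nothing in `source` is executed.
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return all(isinstance(node, _SAFE_NODES) for node in ast.walk(tree))
```

With this sketch, `provably_terminates("1 + 1")` is accepted, while a call such as `f(1)` or a statement such as `while True: pass` is refused without being run. Whether a comparable whitelist would be practical for real BEGIN blocks is exactly what the thread is debating.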