(I wanted to title this journal entry "The relationship between a language and its toolchain, and why Perl 6 scares the hell out of me" but it didn't fit)
For the record, this is not an anti-Perl6 rant. It is a warning.
For language designers, one of the foundational concepts is how simple the grammar is going to be. Being a human interface device, languages go beyond math (where there is generally a single truth) and engineering (where there's generally a limited set of widely known best practices) and can get into the realm of personal preference and fashion with few clear guides for the "best" way to do something.
The reason this dimension is so important is that the language imposes limitations on the types of tools it is possible to write. Some types of tools simply are not possible to write at all if the grammar has certain features.
I hypothesize that we can break down the human "language experience" for a general-purpose language as being a combination of the language syntax itself, the tools for it, and the size/quality of available libraries. These are by no means all the success factors, but are the ones that in my opinion make up the influence on the individual user.
(I explicitly ignore languages with features like "proof-carrying" that are essential for specific domains like cryptography and make it worth the pain of learning Haskell or proof-carrying Ada)
According to my hypothesis, the danger here then is to add excessive expressiveness to a language with the intention of it being pro-user, at the cost of fatally crippling your toolchain and hurting the user more than the benefits gained at the language level.
The second risk here is what I'll call the "Personal Language Anti-Pattern". I first heard examples of this from some old-school Lisp hackers. The typical way of describing it goes something like this...
"I can write something in 4 days in Lisp that takes most people 20 in some other language. I just spend 3 days modifying Lisp to solve that type of problem, then 1 day solving the problem".
The anti-pattern here is that you end up going way beyond TMTOWTDI. You don't get two, or three, or five ways to do it. You end up with a different language for every single person and every single project.
Forget about maintaining projects written in crufty Perl. Imagine maintaining code where every single project is written in its own mini-language (although they all look a bit lispish).
In Perl 5 we achieved this depressing state with source filters, but mostly managed to keep it under control with culture. "Source filters == bad" is our cultural norm.
The risk with Perl 6 here is that the ability to safely modify the language is going to be taken as permission to modify it early and often.
I'm not talking about obvious positives like "use physics;" here. I'm talking about My::Project::Lang, in the same vein as the "God Object" anti-pattern... Where I don't want to end up is "BioPerl... the Language!", where a DNA sequence is a known literal (because it starts with a capitalised G, T, C or A).
my $sequence = GATTACA;
This seems like a seductive option, but the cost is you throw away most of your developer tools.
While I had originally thought that "easier to parse" was a part of Perl 6, this has apparently been removed or was never what I thought it was.
What Perl 6 actually is is easier to IMPLEMENT.
That is to say, we won't be in the worst possible situation of having a dynamic grammar that can't be reimplemented at all, because there is no grammar beyond "what the implementation does".
So we will have A grammar, but now it is a grammar that is EXPLICITLY changeable.
So consider this step 1 to toolchain bliss.
It still removes the possibility of implementing most useful tools.
So, as I see it anyway, here are the remaining steps.
Step 2 - Deterministic
The key to the really awesome tools is that you need to have a way to READ the code, without necessarily having to RUN the code.
BEGIN blocks (and everything similar) really screw this up for us.
If you need to execute code to read code, then you need to execute arbitrary code in order to read arbitrary code. And right there the phrase "execute arbitrary code" should be more than enough to explain the problem.
In one hit it creates the limitation that you can only ever create tools that run your OWN code; you can never write tools that run anyone else's code.
As an example of how this can hit in unexpected ways, in Perl 5 if you have Komodo (or anything else that does background linting) installed it's quite easy to create a totally innocent-looking link on a webpage that will delete your home directory.
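To make the linting risk concrete, here is a minimal sketch in Perl 5. It assumes the background linter does the equivalent of "perl -c" on the file being edited; the file names innocent.pl and side_effect.txt are made up for the illustration. The script writes a throwaway program into a temporary directory, syntax-checks it, and then looks for the side effect left behind by the BEGIN block.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory that is removed on exit.
my $dir    = tempdir(CLEANUP => 1);
my $script = File::Spec->catfile($dir, 'innocent.pl');      # hypothetical name
my $marker = File::Spec->catfile($dir, 'side_effect.txt');  # hypothetical name

# Write a program whose BEGIN block has a harmless side effect.
# A hostile file could do anything here instead of creating a marker file.
open my $out, '>', $script or die "Cannot write $script: $!";
print {$out} <<"END_SCRIPT";
BEGIN {
    open my \$fh, '>', q{$marker} or die \$!;
    print {\$fh} "ran at compile time\\n";
    close \$fh;
}
print "this line is never reached under -c\\n";
END_SCRIPT
close $out;

# "perl -c" only checks syntax, but BEGIN blocks still run while it does so.
my $perl      = $^X;    # the perl binary running this sketch
my $check     = `"$perl" -c "$script" 2>&1`;
my $begin_ran = -e $marker ? 1 : 0;

print "syntax check said: $check";
print "BEGIN block executed during the syntax check: ",
    ($begin_ran ? "yes" : "no"), "\n";
```

The main program body is never executed, yet the marker file exists afterwards. Nothing here is specific to Komodo; any tool that compiles a file in order to check it is exposed in the same way.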
So compile time string-eval has to go (and BEGIN blocks or anything else that does it). It's also the death of having a "sub import" that can do anything you like.
In exchange, you get code you KNOW is going to be parsable, if the entire program is valid. None of this BEGIN { exit if rand > 0.5 } stuff.
Since it is so trivial to implement (one line of code in PIL), for the next Parrot release there should be an experimental --xdeterministic flag you can pass to perl6 to forbid compile-time execution.
None of this makes interesting tools more POSSIBLE, it just makes it safe to open up a project from a third party without wondering if it is going to install a root kit or not. Which goes a long way to creating the incentives to write the tools at all.
Step 3 - Finalized Grammar
The other ugly problem is the idea of having a compile-time-morphing grammar AT ALL. It completely blows away the possibility of having a "PPI6" that is sufficiently complete to handle all documents (and along with it kills perlcritic6 and sqlinjectiondetector6 and perltidy6 and other stuff).
If the syntax and semantics of your document are not stable WHILE the document is being parsed, how are you supposed to generate any form of semantic model (a la method-name completion or SOAP API auto-generation) or even a syntactic document model (a la PPI)?
You can't describe anything, simply because you have no idea what you will need to describe in advance.
Perl 5 is horrible in this respect because the grammar contains an "operator/operand" state that flips back and forth every other character (which is the underlying cause of the "/" problem, and 6 other characters).
But at least it's only a boolean flag, and I could fudge my way around the problem by heuristically guessing well enough.
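As a rough illustration of that kind of fudging, here is a small Perl 5 sketch that guesses whether a "/" is a division or the start of a regex by looking only at the token before it. The function guess_slash and its token lists are hypothetical and deliberately incomplete; this is not PPI's actual algorithm, just the general shape of the heuristic.

```perl
use strict;
use warnings;

# Tokens that leave the parser expecting an OPERATOR next:
# a variable, a number, or a closing bracket. (Incomplete on purpose.)
my $OPERAND = qr/\A(?:[\$\@%]\w+|\d+(?:\.\d+)?|[)\]}])\z/;

# Tokens that leave the parser expecting an OPERAND next:
# a few builtins and keywords, binding operators, opening punctuation.
my $OPERATOR = qr/\A(?:split|grep|map|if|unless|and|or|not|return|=~|!~|[=!(,;{])\z/;

# Hypothetical heuristic: classify a "/" from the previous significant token.
sub guess_slash {
    my ($prev) = @_;
    return 'regex'  if !defined $prev || $prev eq '';  # start of a statement
    return 'divide' if $prev =~ $OPERAND;
    return 'regex'  if $prev =~ $OPERATOR;
    return 'unknown';  # e.g. a user-defined sub: depends on its prototype
}

for my $prev ('$x', '42', ')', 'split', '=~', '(', 'frobnicate') {
    printf "%-12s /  => %s\n", $prev, guess_slash($prev);
}
```

The 'unknown' case is the interesting one: after a bareword such as a user-defined sub name, the right answer depends on how that sub was declared, which a document parser that reads a single file cannot always know. With a single boolean state a guess is often good enough, which is the point being made above.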
Once grammar modification is easy, any reasonable likelihood of fudging goes out the window because it can change in so many more ways.
You end up with a language which is expressive as hell, but where the most sophisticated editor you can create is vi (with no syntax highlighting allowed).
Assuming such a thing can be created, my bet for the "use strict;" for Perl 6 would be something like "use v6-static;" which would guarantee the file sticks to the official primary grammar for that document and allow you to safely use source code analysis tools on the file.
This could even be useful without determinism, as it would at least let you prove that the compile-time code wouldn't modify the grammar, and so you could still safely model the source code without having to risk compiling and running the BEGIN block.
It would also mean you solve the other big stepping stone for tools, the ability to do useful things with a document BEFORE it is a legally correct program.
This covers everything from parsing "use Win32;" on Unix to "use Not::Written::Yet". The ideal for a parser is a context-free (only needs one file) semantic parser you can safely fuzz-test without it exploding.
PPI is merely a syntactic parser you can safely fuzz-test.
And be under no illusions that it can be replicated in Perl 6. It was only BARELY possible to implement it for Perl 5 and even I wasn't sure I was going to find a good enough path through all the impossible problems until a few months before it was finished.
Now if we are indeed walking into this trap, there's certainly ways to avoid it.
Enforcing determinism would be useful (and easy to implement) but possibly impractical, as it removes the ability to write platform-adaptive code that picks dependencies at compile-time. Goodbye File::Spec...
Finalizing the grammar by default somehow is far more interesting in my opinion. It enables all syntactic and some of the semantic tools, and will discourage flippant grammar modifications.
If you REALLY need to do something exotic, you should be willing to pay the price for it by doing something like "no determinism;" or "no tools;" so that it's clear to anything parsing your code that it should stay the hell away unless it is either running the code, or willing to take the risk of exploding violently.
Or at least the tool is CONFIGURABLE to act safely, instead of being vulnerable to exploit by default. And before you mention Safe.pm, consider "BEGIN { while(1) { $_++ } }", or see the great talk "Safe isn't" by an Australian university lecturer who teaches a Perl course, in which she explores all the ways her students violate her computer from inside Safe containers.
reusable toolchain? (Score:1)
The approach that everyone seems to be aiming for is that instead of writing PPI all over again for Perl 6, you are supposed to get decent enough support from the actual grammar that ships with the compiler in order to do your own interesting things with it.
Secondly, since Perl 6 supports separate compilation units in many ways this is actually much simpler than Perl 5 - there is no more possibili
Re: (Score:1)
I am probably being thick, but once the code is compiled, haven't you already run the BEGIN block and all its arbitrary contents?
rjbs
Re: (Score:1)
Similarly a macro or grammar extension coming from another compilation unit has already been compiled.
At this point you can examine their code in a manner much like Safe does (existing problems in safe are an implementation issue, not a conceptual one), and run the code with some resource limitation if necessary (if this wasn't possible then we wouldn't have javascript
Furthermore, if you deduced by static analysis that these blocks cannot aff
Re: (Score:1)
You don't need to run Javascript in order to parse it, since it has (I think) a static syntax.
Also, this comes down to practicality.
"What percentage of CPAN can this parsing strategy handle?"
As a context-free document parser, PPI can
Re: (Score:1)
As for reading files etc in BEGIN - that's also handled differently - there is no guarantee that a BEGIN block will run every single time you run the program, it is fair game
Re: (Score:1)
Only if you can solve the Halting Problem.
In Perl 5, even trivial Perl examples involve BEGIN blocks (use strict) and grammar modification (operator/operand switching).
This problem applies to Perl 5 too.
Simon Cozens has a never-released parser based on the Perl internal parser.
It works just fine, as long as the code compi
Re: (Score:1)
No version of the Perl compiler or processor for any version of the language attempts to solve the Halting Problem. They tend to do a pretty good job on most reasonably correct code (and plenty of unreasonably incorrect code as well). You don't need to solve the Halting Problem. You only need to decide if it's worth it at any particular point to Halt and say "Sorry, I'm not going to continue processing from here," and you can do that if you control the runloo
Re: (Score:1)
If you have a function, and that's all, then you can't find out what's in it.
But given a compiled optree, you have much more information.
If you parse the BEGIN { } block under the current rules, then you wind up with an optree which you can then examine, to see what it does.
As for Simon's project - Perl 5's parser was never designed to make this easy, it was designed to emit an interpreter-optimized optree. This is very different from the design
Re: (Score:1)
You CAN prove something will finish in finite time, you just can't prove how long that finite time is, which may be longer than the heat death of the universe.
One True Grammar (Score:2)
In my "Bird's Eye View of Perl", the talk I give to managers, I talk about Perl being a single language that comes from the same source. The idea of multiple implementations looks good on paper, but it doesn't work out in practice. Besides knowing the core language and its libraries, now the mere mortal users have to wrestle with peculiarities of each implementation and grammar. It's the reason I s
Re: (Score:1)
Yet, whenever I raise my biggest objection I have against Perl6 (meaningful whitespace), I always get thrown back "well, you can change the grammar you know...".
Re: (Score:2)
Thankfully I haven't seen it come up as much lately. Maybe folks are starting to realize that easily mutable grammars are a powerful and awesome tool but not the sort of thing you want every kid on the block to use.
Re: (Score:1)
At some point you simply HAVE to be able to have an IronPerl6 and JPerl6 simply for long term language flexibility and health.
Re: (Score:2)
I don't see different implementations as necessary to anything. Some people might like it, but in reality people will code to the implementation's features. It happens in Java, Javascript, Lisp, Smalltalk, and probably a lot of others that I haven't used. The conversations at the pub are about who supports what and what you have to do to make good code on one implementation work on another.
It's not so
Separate grammar from BEGIN blocks? (Score:2)
1) They're safe to execute.
2) The tools can be made aware of grammar changes.
It might not even need to be as restrictive as all that, maybe just that grammar changes happ
Re: (Score:1)
If "=" is mapped to sub equals, you shouldn't need to run equals while parsing, right?
Really? It scares you that badly? Wow. (Score:1)
I think it would be great to be able to have parsers/editors/refactor-ers that can "statically" (whatever that means) analyze Perl 6 code and do neat(tm) things with the output. I will write the majority of my Perl 6+ code in the standard grammar using the future best practices for doing so because I think there will be great tools that will give great insigh
Re: (Score:1)
There is a VAST gap between what anyone thinks is true and what they can prove.
Personally I DO see reasons why, plenty. Because I spent three years wrestling with them in Perl 5's grammar, several of which are based on mathematically provable impossibilities. And these grammar problems remain in Perl 6 unchanged.
You can't just invoke the "standard grammar" as some kind of magic cure-all.
SOMEONE has to eventually write
Smart tools (Score:1)
Re: (Score:1)
Perl 6 breaks a ton of existing tools, while relying on the existence of new tools which everybody assumes will exist but nobody has actually proven can be written.
Re: (Score:1)
Besides the fact that the existing tools were written before Perl 6, there is no guarantee they would work with Perl 6 if grammar modifications were disabled. There is no existing tool today that works with Perl 6 (other than the basic syntax highlighting of various editors). Your entire basis in this thread is about the creation of "new tools which everybody assumes will exist but nobody has actually proven can be written."