Allison:
- almost finished with the tree transformation thing that takes a grammar
- last remaining piece is tricky
- if you use compreg and compile a subroutine on the fly from a string of
PIR text, can you have parameters in that?
- can you accept subroutine parameters?
- do you have to declare it as a named subroutine?
Patrick:
- you have to compile it as a named subroutine
- you can make it anonymous
- give it a name, but put
Allison:
- that's exactly what I need
Patrick:
- you get back a PMC that you can call like any other subroutine in a PMC
Allison:
- I'll have a version of the engine that uses grammars by the end of this
call
- parses the tree grammar file using PGE
- does a tree transformation on the match result
- then loops through the tree (an array of rules, where each rule is a
hash)
- compiles rules for an attribute grammar from that result
- you can compile from a tree grammar file to a working tree grammar
- you use a tree grammar to use this
Luke:
- an abstract syntax?
Allison:
- your attribute grammar
- I can create that from a syntax file
- I do that using attribute grammars
Luke:
- for me, it ended up being less readable than doing it by hand
- but it's fun to bootstrap
Allison:
- it was my way of testing if I can use an attribute grammar
- before I put it into the repository, I want to make sure that I can
parse the output from a PGE match and transform it
- this proves that that is possible
- also Roberta and I finished our final draft of the Artistic Licence 2.0
and the contributor agreement
- the next step is to solicit reviews from a few interested parties,
including the patent lawyer
- then we'll do a general public review
Nick:
- will Perl 5 come under this agreement as well?
Allison:
- we took that into account for the contributor agreement
Nick:
- can you have the Perl 5 cabal review it too?
Patrick:
- PGE has grown by leaps and bounds
- converted it to use a shift-reduce parser internally
- everyone else can use that parser
- probably 25% shorter in parsing code, much faster
- also added support for custom whitespace rules
- create a grammar with its own rule
- now does closures; can embed them in a rule
- can specify what language it'll send to which Parrot compiler
Luke:
- can you give it a lexical pad to start with?
- does Parrot already do that?
Patrick:
- haven't done anything with that yet
- shouldn't be hard
- we can do whatever we want
- I'm still waiting for the dust to settle on the design
Luke:
- suppose I wrote a lexer for a LALR(1) grammar?
- can I have PGE understand that?
Patrick:
- as long as it understands match objects
- you can control how it handles tokens
- "give me a match object"
- "call this subroutine which tells me how much to consume"
- the subroutine can be a rule
- go from recursive-descent to shift-reduce or vice versa at any point
- we don't have to worry about backtracking over Perl comments, for
example
- they're atomic
- pretty straightforward
Luke:
- can you commit yet?
- will that allow us to clean up memory that we don't need?
Patrick:
- not yet
- but I know how to do it
- you can take any match object and call the same rule
- it gives you a match object that starts there
- drop the old object and the GC will clean it up
- continue from where that match started
- I can build that trivially in about 15 minutes now
- I did lookaheads
- I'll do lookbehinds in the near future, with a naive implementation
- particle on IRC has committed around 200 tests for PGE
- I'll clear off a lot of the TODOs there to reward him for his work there
- my next plan is to start parsing Perl 6 expressions
- I'll go as far as possible
- should be able to go quite a ways
- already an expressions parser in the examples directory
- you can get back a parse tree with precedence taken into account
- now that that's a rule, I'll start working on statements
- we'll see where it goes from there
- probably can produce fairly complete parse trees for Perl 6 programs in
two weeks
- may not have all of the esoterics in there, but most will be available
- one question
- I can fairly rapidly build an interpreter based on the match tree
- is that a direction to bother going down?
Allison:
- it would be useful just in a sense of having bits and pieces being
usable right away
Patrick:
- that's the way I want to go
- I'll start building an interpreter
- it may be slow
- my intermediate goal is to get the language interpreter far enough along
that it can run the Pugs test harness
Luke:
- that's what Pugs did for a while
- that sucks
- all of your special cases get into the big, bloated runtime
Allison:
- we're working on the syntax tree at the same time
- we will do that
- it's just nice to have something running
Patrick:
- I think Luke means "don't go too far with it"
Luke:
- you don't want to go too far in that direction
- people could spend their time adding features to a bloated interpreter
rather than building a small, scalable one
- where do you stop?
- where do you throw it out and start with PIL?
Allison:
- when I push the tree transformations two steps further and Patrick can
use that instead of the interpreter
Patrick:
- PIL and the interpreter aren't entirely in conflict
- my goal is the compiler
- I don't want to target PIL directly
- I need something to evaluate my syntax trees relatively interesting
Luke:
- you don't want to go too far, like Pugs in throwing so much away
Allison:
- the best way to check the grammar and PGE is to have something that
actually works while he's at it
Patrick:
- you can help out, Luke
- when you think it's going too far, say so
Luke:
- it goes too far with other contributors, not you
Allison:
- we don't really have other contributors now
Patrick:
- we could get some very quickly
Luke:
- "thanks for your help, but we're throwing it away now"
Patrick:
- we'll just tell them up front
- I'll create a new languages/ directory so it's clear it's not the
official Perl 6 system
- we will watch for that very carefully
- if this system doesn't have a release schedule but the Perl 6 compiler
does, it'll be clearer about our priorities
Luke:
- okay
- one more thing though
- Perl 6 is a complex language
- I think that interpreter will be bigger than you expect it to be
Patrick:
- I don't expect to write a complete interpreter
- I expect to write enough to get me to a compiler in a comfortable
environment
- after writing PGE as a compiler, I know that's how I want to go
Jesse:
- describing it as a development aid and experiment might help
Patrick:
- I'm planning to send out an e-mail later tonight if I can stop coding
long enough to document things
Luke:
- I'm happy that everyone else is making good progress
- I didn't this week mostly because of school
- I was going to write a couple of things, but they're still on my queue
- also going to port Language::AttributeGrammar to Perl 6
Jesse:
- are you blocking on anything else?
Luke:
- only learning my way around the new Parrot
c:
- working on the book
- writing a proof of concept in Perl 5 that I can port to PIR pretty
easily
- requires the OpenGL bindings
Jesse:
- are those maintained?
c:
- I have a proof of concept on my hard drive
- I think it'll be pretty easy
- don't need to wrap them in a nice interface as I did SDL
- should be a quick port
Jesse:
- Autrijus and Liz have been enumerating their desires for S17 and
concurrency in Perl 6
- once they feel like they have a draft they like, should they send it to
the cabal and ask where to go from there?
- maybe the answer will be "no, this isn't what we want!"
Allison:
- yes, that is the right way to go
Nick:
- I remember discussions from Perl 5 several years ago
- someone needs to go over the proposal and see if there are portability
problems and guarantees
- they're talking about STM for this
- seems like a disconnect between what Ponie wants
- the fact that everything in Ponie goes through a vtable means that
there's no speed hit for going through non-shared code
- there'll probably need to be the concept of closing in Parrot
- as soon as you start a thread, you'll have to replace the parent vtables
with shared ones
- this probably needs Chip
- don't know how much Chip has thought about it yet
Allison:
- it sounds like the right thing to do is to pass the draft around to
everyone who's thought about threading in Perl 5
Jesse:
- next week, the call will be at the DST time
- 10 pm UK
- GMT now
Ever since the Great Multimethod Debate, I've been introducing various principles into my arguments. These are my "axioms of semantics", and my hope is to take the hard questions and answer them purely in terms of reasoning from these basic principles. This is the kind of reasoning that makes me love mathematics: to reduce all hard decisions into obvious truths by using proper definitions of terms.
So, anyway, here's the list I have so far. (The other well-known Perl principles like Least Surprise, Waterbed, and Endweight are omitted. Maybe I'll do a writeup about those soon.)
The motivations for the three were: manhattan MMD, the junctive one() type, and manhattan MMD again, respectively. Also note that the first principle precludes the existence of submethods. Now it's clear why I keep asking why we need submethods. It's not entirely clear why I continue not to get any answers[2].
[1] Up to metainformation, such as asking which class the object is in, of course.
[2] To be fair, Damian responded to my query, but he didn't answer my question. He gave more an example of how submethods are used, rather than why they are used.
Larry:
- trying to figure out types
- and everything else
- rewriting the world
- put out new versions of S02 and S06
- already have things I want to change
Luke:
- good thing committing only takes a couple of seconds
Larry:
- also I was in Hungary
- it was thirsty
- they apparently don't believe in caffeine there
- I spent a goodly part of that with a headache
- the conference was pretty okay
- getting ready to give a talk to HP on Thursday
Jesse:
- Perl 6-ish?
Larry:
- variant of my 9.3 State of the Onion talk that I gave at EuroOSCON
- it'll be by teleconference
- my slides will be a Java app running in everyone's browser
- part of the reason I did EuroOSCON with just words is that I wanted it
to still work in case everything blew up
Jesse:
- I'm pleased using the Takahashi method Autrijus found
- it just works
- I can edit them in a text editor
Larry:
- the named arguments are now marked with colon instead of plus
Luke:
- makes many people happy
Larry:
- probably the type sigil is up arrow, rather than a cent sign
- as in caret
- we're trying to iron out all of our terminology for what's a class and
what's an abstract class that isn't really a class
- probably relates to Luke's "a class is not anything"
- think we're making good progress on that
- looks like I'll be home for the next couple of months
- I'll probably accept a position at the place where I'm currently
consulting
- I'll still be spending the majority of my time on Perl 6
Patrick:
- checked in my shift/reduce parser last night
- busy cleaning it up
- it works pretty nicely
- getting ready to parse all sorts of Perl 6 expressions
- then I found a segfault (which I reported earlier)
- I think I'll be able to parse out most Perl 6 expressions now
- including all of the weird whitespace things
Larry:
- can that all be in one spot, like we talked about?
Patrick:
- yes
- register a token and specify if it's whitespace okay or not
- if you need to do fancy things from the bottom-up parser, you can
always call a rule
- it tells the shift/reduce parser "this is what I have, now continue on"
Larry:
- okay
- does the rule vs. token distinction we discussed make sense in this
context?
Patrick:
- for general in Perl 6 rules?
- not at the moment
- I like the way
- pretty nice way of doing it
-
that I needed to parse
(On the cabal list we have been discussing separating the concepts of rule and token for whitespace dwimmery purposes. -L)
- now I'm working on parsing out the ternary operator
- think I have that working in the past five minutes
- next steps are to test and check in some examples
- Perl 6 expressions, Perl 6 rules, general expressions
- parsing gives you a nice match tree with the nodes in the right place
- useful starting place for figuring out the tree transformations we need
to do
- especially true for the grammar engine itself
- once I parse it out to the tree, we want a bunch of transformations and
optimizations
- I'll write PIR code that does that for now
- we can catalog those we need and how we might want our TTL to look
Larry:
- and not box ourselves in so we can't use "use" statements
Patrick:
- also promised to update the milestones document for PGE today or tonight
- just haven't done it yet by being too busy writing code
Allison:
- sent a patch to p6i to update the opcode mismatch error message
c:
- not sure that's the right approach
- I like your second suggestion better
- I can look at it if you like
Allison:
- that's why I sent in the patch, not checking it in directly
- this made my life easier though
Larry:
- are these capturable by something running atop Parrot?
Allison:
- it's an interpreter exception
- working today on finishing the PIR attribute grammar implementation
- coming along nicely
- because of the closures issue, I've made it much more heavily OO
- instead of relying on closures to capture a certain value, I'm passing
around relevant objects
- AttributeGrammar::Rule object, for example
- pass that down to where it needs the information
Luke:
- might be a better approach, given Parrot's architecture
Allison:
- seems to be working better in PIR
Luke:
- which version are you interpreting?
- I released a new one this week
Allison:
- started with the old one
- figure I'll update it later
- the old one was simpler in the model
- which parts did you change?
Luke:
- completely rewrote it
- most of the algorithm's semantics rely on the simple semantics of the
thunk
- used to traverse the tree, find out the attributes, made thunks if I
need them
- new version makes thunks for all the attributes it may need eventually,
starts with the first one, then goes on
c:
- was this to help memory usage?
Luke:
- yes
- it's hard to tell when Perl cleans up memory
- you can't use this technique verbatim in Parrot because of the
difference in how they handle pads
- write destructor code that tells you when things are cleaned up
c:
- it would be nice if destructors worked
- last time I tried, they didn't
- we need some sort of finalization and destruction
- I want to release non-memory resources at a reliable, known point
- I don't really care when Parrot GCs the dead object
Allison:
- block exit, right?
c:
- that's fine with me
Allison:
- you want to know that you have reliable finalization at some point, not
necessarily at the same time as destruction
- I'll check this into my personal repository tonight
Jesse:
- I'll try to come up with an idea for a more public repository for this
sort of thing
Luke:
- mostly school
- lots of tests recently
- rewrote the attribute grammar module
- nice and memory-efficient, at least I think
- algorithmically it has a better bound
- also started the Perl 6 port of Language::AttributeGrammar
- still thinking about how to do that given Perl 6's handling of pads
- much different from Perl 5
- think that's about it
- attribute grammarsr are a great base for the tree transformation
language we're thinking about
Jesse:
- I was in Stockholm for the Nordic Perl Workshop
- presented 205 slides in 23 minutes
- missed the implementation meeting, but it happened
- need to send out mail about the upcoming time change
- that's it from my side
Patrick:
- Allison, are you waiting on anything from PGE on what you're doing?
Allison:
- I'm doing the part that doesn't depend on PGE right now
- after I finish the port, then I'll start trying out transforming PGE
trees
Patrick:
- I'll send you some PGE trees then
Luke:
- I heard that Larry and Autrijus did some work on tuples
- I'd like to see a summary of that
Larry:
- read S02 and S06
- basically the tuple constructor we stole with \()
- we didn't think that we should throw regular parens at that
- the signature constructor is different from that
- it's still
- a tuple is parsed rvalues
- a signature parses differently
- that's how it currently sits
- also there's no special casing about having a comma on the end
- the fact that you are passing it to a function makes it a tuple
- the recognizer for a tuple in a signature is also a backwhacked variable
- Autrijus talked about that part of it
- the slurpy scalar will just slurp one argument at the top level
- I'm sure you'll find other stuff out there, some of which is probably
bogus
Jesse:
- not much conferency coming up
Allison:
- MIT's Little Languages workshop may be the first week of December
- at least, I hope it is
Oh yeah, there was no meeting yesterday because of Euro-OSCON. See you next week.
Larry:
- made it back safely from the cruise
- madly working
- hanging out on the lists a little bit
- haven't had enough time to think through the things I want to think
through
Luke:
- started the rewrite of attribute grammars module
- won't eat up memory as much
- spent hours trying to debug a nasty bug
Larry:
- a Perl 5 bug?
Luke:
- no, in my code somewhere
- also talking with Stevan about the metamodel
- trying to unify that in my head with theories
- not working very well
- but I realize that the theory proposes a type system
- the metamodel proposes an object system
c:
- and a reflection system
Luke:
- Autrijus says he'll talk to Larry in person at EuroOSCON
- I'll follow along on IRC to catch type annotation stuff
Patrick:
- working on PGE some more
- tracked down and fixed a problem with Parrot's coroutine handling
- rewriting PGE captures code so that match objects are true to what they
ought to be
- clean it up, make it faster, make it more accurate
- working on shift/reduce parser component
- this first ought to produce Perl 6 rules
- the recursive descent part is actually the slowest part of the current
process
Larry:
- surprise, surprise
Patrick:
- in the process, I find that match trees look like something I need to do
tree transforms on
- expect to have a real-world example in the next week or so
- we could look at using attribute grammars on it soon
- I'll be publishing the start and the end and the necessary
transformations
- then we can look at them
- looking at a port of Text::Balanced
- turned out to be fairly easy to do in Parrot now
- also posted several questions on the list
Larry:
- haven't had time to go through the whole document yet
Jesse:
- do the docs need updating to reflect that?
Larry:
- undoubtedly
Patrick:
- Damian has drafts of those
- I went through them in some detail
- his latest drafts looked pretty complete to me
- just looking for confirmation on various things
- I can pull out the outstanding issues if you need them
- Damian has another draft I think
- could use an opinion on letting whitespace before a quantifier imply a
rule
Larry:
- in the absence of a better idea, run with it
Patrick:
- if we put a quantifier on a subrule
- the quantifier doesn't include any whitespace between them
- if you put space between the subrule and the quantifier
- currently it's meaningless
- what if it put a whitespace rule between each instance?
Luke:
- maybe a good temporary idea
- maybe we're approaching the
- all other context-free approaches try it upside-down from us
- for quick regexes without many calls to subrules, we're good
- for context-free things, a completely different semantic may be much
handier
Patrick:
- we may need both
Allison:
- had a question for Patrick about my PIR code
- will e-mail him when I remember
Jesse:
- more chatter in the implementation meeting
- Chip's working on the PDDs
- Leo and Nick both seem to be waiting on bits of design
- maybe need someone to put together questions which various design docs
need to say
- Chip would find it easier to write them by answering questions rather
than working in a vacuum
- might need to find some volunteers for them
- been posting transcripts and links to the implementation notes on
use.perl.org
Luke:
- posted a request for comments on annotations to p6l
- the people on the "programming left" responded
- "I don't want any type errors anywhere"
- didn't really generate any interesting thoughts
- mostly a mood-generating thought
- are there any ideas about the integration between annotated and
unannotated code?
Larry:
- seems like one of those things the user deserves to give a knob
- what the default setting is
Luke:
- it's not a trivial knob
- you're working with a complex system
- Autrijus and I talked about this some on IRC
- before we figure out the big issues, we have to figure out the little
issues
c:
- would it be helpful to come up with examples where the syntax,
semantics, and behavior might be a problem?
Luke:
- that was what my thread tried to do, but it didn't go there
Patrick:
- everything gets more complex in the middle ground
- we're not going all type-checked
- some people never want type checking
- most people are probably in the middle
- people need more time to think about that
Jesse:
- is there anything besides free time that are blocking you?
Larry:
- brain cells
Patrick:
- daylight savings time changes are coming
- we should figure out the implication
- the calls will be earlier
Allison:
- last year we postponed the change until December
Jesse:
- I'll send mail about the time change
- we'll figure out something that works
I'm feeling conflicted about theory.pod. While I'm very proud of it for being a great object model and giving precise semantics to things we've been handwaving over for ages, it just isn't Perl. It's too formal and static for Perl; you can't talk it into checking something a different way.
However, it is not possible to add a pure hook system that's reasonable for the everyday user to use. Type inference can't deal with arbitrary code, because it needs to go forwards and backwards and sideways through the annotations to figure things out, whereas arbitrary code can only go forwards. If we had a logic subsystem, we could give them arbitrary code (within that subsystem) because logic programming has the nice property that you can execute a function backwards:
:- append(X, Y, [1,2,3,4,5]).
But maybe instead I need a coproposal, one that really gets down and dirty with Perl 6's internals, for the hook system on top of which theory.pod can be built as a module. If that exists (assuming it is even possible), then I wouldn't feel so uneasy about theories not being arbitrarily flexible.
Now, I've only been a project manager once, so I definitely have no idea what I'm talking about. But two principles occurred to me while I was procrastinating a paper I had to write. They are quite along the same lines as -Ofun, but a little bit more specific:
* Always make people feel like they're making a difference.
* Never make people feel like they're making "the final product".
And the reasoning is: #1 makes them start, #2 makes them finish. Nobody ever finishes "the final product", because it's never quite designed well enough, or quite as flexible, or quite as orthogonal, etc. as they'd like it to be. And of course, before you start writing "the final product", you have to have it all worked out in your head. These emotional responses are extremely counter-XP, and never end up working, of course. So #2 is basically there to shatter those false emotions and get people into XP mode without overengineering.
I'd be interested in hearing from some people with management experience on these thoughts.
There was no Perl 6 meeting today, because of the Perl Whirl. See you next week.
Okay, well, my lazy implementation in Language::AttributeGrammar is really memory hungry. Not only is there the O(mn) overhead usually present in lazy implementations, all data is kept around until the end of the computation. This is silly (but it is still a trade-off).
I have two options for making a reasonable implementation.
1) Use a proper lazy approach that knows how to clean up after itself.
2) Compile the grammar using Kastens's algorithm.
I'll demonstrate option 1 by showing the Branch visitor in the repmin problem:
sub Branch::visit {
my ($self, $globalmin) = @_;
my ($left_min, $left_result) = $self->{left}->visit(thunk { $globalmin->call });
my ($right_min, $right_result) = $self->{right}->visit(thunk { $globalmin->call });
return (thunk { min($left_min->call, $right_min->call) }, # min
thunk { Branch->new($left_result->call, $right_result->call) }); # result
}
Each of the $left_min etc. are thunks. When you ->call a thunk, it runs the closure associated with it and then overwrites the closure with the value that the closure returned. It's the standard lazy evaluation technique.
Now you say, "$left_min? Why not $left->{min}? The code would be clearer." (Also importantly -- and this is the trade-off -- you wouldn't have to know the types of the child nodes). Well, it's about memory. If I put all the thunks in a hash, then the values they return will be cleaned up after I run all the thunks (exhausting the pointers to the lexical pad). However, if I put them in lexicals, then the values will be cleaned up incrementally, as soon as the computation doesn't need them anymore.
For instance, you don't need the local minimum to build the result, so as soon as the local minimum of this branch is returned, there are no more references to $left_min and $right_min and they can be cleaned up. This is more important when you're passing around large data structures, which attribute grammars often do.
The cool thing about this new lazy formulation (which isn't new at all, UUAG does it), is that it's time and memory complexity is exactly the same as Kastens's algorithm. There is linear overhead in the number of nodes, but that has the same complexity as the attribute values themselves. You just allocate them a little earlier. So all that's left are the details.
And that brings me to option 2. The details could be important. Kastens's algorithm is less flexible than the lazy algorithm above because it needs to know more about the input. Kastens's algorithm, though, can compile to an abstractionless representation, where pretty much the only operations are "bsr" and "execute attribute body". In Perl, this will make a little difference, but not one noticable (I may benchmark to back up/invalidate this claim). However, if you are compiling to C or a JIT, this could make a really really huge difference. If Perl 6 gets attribute grammar support (s/If/When/, just a question of whether it's shipped or module), then it's probably a good idea to use Kastens's algorithm, because we can then compile to JITted parrot (or even C).
However, all the lifetime analysis, visitation dependencies, and hard-to-compute bullshit like that in Kastens's algorithm are automatically handled using thunks and garbage collection. That means that a lazy approach is significantly easier to implement. Also, the lazy approach can accept more grammars; that is, grammars that are not "ordered attribute grammars" as in Kastens's paper. But the class of ordered attribute grammars is already quite large, so that's probably not an issue.
I think I'll go with option 1, mainly because it's easier to implement. Funny how big a factor that can be, and how little people understand how big a factor that is.
While reading "Ordered Attribute Grammars, Uwe Kastens, 1980", I came across this description of one of the functions he was using in his example:
include(a,d): "a" is a set of descriptions, "d" is a description; the result is "a U {d}" if a does not contain a description "d'" for the same identifier described by "d", "{a\{d'}} U {d}" otherwise.
(I used U for union and \ for difference)
Stare at that for a little while. You can probably figure out what it's saying within a few minutes. But how about:
include(set,desc): "set" is a set of descriptions, "desc" is a description; if "set" already contains a description with "desc"'s identifier, it returns "set" with that description replaced by "desc". Otherwise it returns "set" with "desc" added.
If the author weren't so attached to keeping everything set-theoretic, you could say that a is a dictionary from names to descriptions, and make it even easier to read.
More recent papers have the same kind of purely mathematical language, formulating everything in terms of sets, using single uppercase letters for set names and single lowercase letters for variables. Anybody who programs that way is scolded by the community for producing unreadable and unmaintainable code. Yet academic papers, which have much more respect than programmers, continue to use this cryptic notation. These papers don't deserve respect.
It's because of this mathematical-notative culture that Haskell uses all the cutting-edge research and the dynamic languages are stuck re-inventing Smalltalk. It's not that the new research is so hard to implement, it's that it's documented incomprehensibly. PhDs in Haskell read papers and write the algorithms they describe just like everybody else, but they're trained in reading this dense academic language.
And of course, when I eventually do my dissertation, if I don't write this way my dissertation will be rejected and I won't get my degree. I won't think of the problem I solve purely in terms of sets and single-letter variables, I'll think of it in terms of computer science concepts, which have English names. I'll have to convolve my ideas into mathematical academic language, only to have other academics untangle them back into everyday computer science concepts as they read.
Computer science academics need to retake Freshman writing.