
Journal of chromatic (983)

Friday February 22, 2008
06:14 PM

Perl 6 Design Minutes for 20 February 2008

[ #35728 ]

The Perl 6 design team met by phone on 20 February 2008. Larry, Allison, Patrick, Jerry, Will, Jesse, and chromatic attended.

Patrick:

  • did a few small things in Rakudo over this week
  • nothing spectacular or amazing
  • will do the Parrot release later this evening
  • expect no problems
  • headed to FOSDEM in Belgium tomorrow to talk about Perl 6
  • I should have some hacking time on airplanes
  • very little chance of being interrupted

Larry:

  • other than having the flu...
  • doing various spec things
  • documented what identifier extensions are
  • straightened out the little mess in role composition as to the difference between role private attributes and the composed class's private attributes
  • something declared with has is intended to be used generically within the class
  • for a role private attribute, you declare it with my
  • goes along with the use of my for giving private attributes (or methods at least) to classes
  • if you declare unary or zero-arity functions with a prototype in Perl 5, that changes the grammar
  • that's no longer the case in Perl 6
  • they're still parsed as list operators
  • you can define them that way if you want, but you have to do it explicitly

Patrick:

  • that's a tremendous amount of sanity

Larry:

  • some of us get there late

Patrick:

  • some of us never get there

Larry:

  • there's a big hole as to how backslash behaves in the interpolation spec
  • what if you backslash something that isn't going to interpolate?
  • the rules are similar to Perl 5, but more generic, of course
  • single-quotes assume you want to leave backslashes in
  • double-quotes assume that all backslashes are meaningful, or they're in error
  • working on how sig space interacts with ** on a separator
  • fixing up some of the spec there
  • came up in IRC
  • the big thing this week is working on a program called gimme5
  • chewing gum and baling wire
  • instead of spitting out Pugs code, it spits out Perl 5 code
  • can now translate STD.pm to Perl 5 and run it under strict and warnings
  • translated cursor.pm to Perl 5
  • have the longest token scanner working for small, uncomplicated rules -- just alternatives
  • need to revamp how longest-token matching works for more complicated rules
  • the match is essentially over a list of alternatives
  • given a rule that expects a term, it's zero or more prefix operators followed by a noun
  • zero-or-more implies that the longest token at that point has to be the union of what prefix and noun match
  • it doesn't handle embedded alternations yet either
  • the next step is doing what I could not make work under Pugs
  • take the current match state and extract the match object to show to the user
  • the current match state continuation has all of that information in it scattered among a bunch of backlinks
  • it's not the form in which the user wants to access it
  • unless the subscript lookups did some sort of tie-based behavior
  • I don't want to do that
  • we can just matchify things lazily when we know we need that for the user
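
To illustrate the simple case Larry describes, longest-token matching over a flat list of alternatives, here is a minimal Python sketch. It is not gimme5 or STD.pm code. The token names and patterns are hypothetical, and breaking ties in favor of the alternative listed first is an assumption made for the example.

```python
import re

# Hypothetical alternatives, chosen only for illustration.
ALTERNATIVES = {
    "if": r"if",
    "identifier": r"[A-Za-z_]\w*",
    "integer": r"\d+",
    "number": r"\d+\.\d+",
}


def longest_token(text, pos=0):
    """Try every alternative at pos and keep the longest match.

    Ties go to the alternative listed first (an assumption of this sketch).
    Returns (name, lexeme), or None if no alternative matches.
    """
    best = None
    for name, pattern in ALTERNATIVES.items():
        m = re.compile(pattern).match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best
```

Every alternative is tried at the same position, so "iffy" comes back as an identifier rather than the keyword "if" followed by leftovers. The harder cases mentioned above (quantified prefixes, embedded alternations) are not handled here.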

Allison:

  • finally done with my insane conference run
  • no travel until the first week of March

c:

  • doing a lot of work closing bugs
  • working on open tickets
  • closed a lot of them
  • fixed a lot of little things for the release
  • think I fixed one Tcl segfault
  • working on the Lua GC bug now

Jerry:

  • doing more work on Rakudo
  • found a weird bug that Patrick thought was GC, but it's not
  • I did find a GC bug earlier today though and submitted a ticket
  • did some other fixups for Parrot while working on the release
  • hope to work on the PDD 17 branch soon
  • still have a Windows segfault there
  • building on Linux so I can help
  • started looking at gettext so we can internationalize Parrot
  • I'll check something in for the config after the release
  • after that, it's just a matter of putting in some macros and starting the conversion

Allison:

  • the Linux kernel has some good hooks for i18n
  • everything gets hidden behind a layer of abstraction, so you never directly print error text

Will:

  • did a bunch of non-user-visible commits
  • coding standard updates
  • have a new contributor, Stephen Weeks
  • he's contributed to several languages

Jerry:

  • it feels like it's time to optimize PGE
  • is that correct?

Patrick:

  • there are lots of little improvements
  • biggest improvement is probably the longest token match

c:

  • can we avoid recursive descent with that?

Patrick:

  • not entirely
  • but we can avoid recursing into subrules that can't possibly match
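
One way to picture the pruning Patrick describes is to give each subrule a set of characters it can start with and to check that set before recursing. This is a Python sketch under that assumption, not PGE code. The subrules and the calls log are hypothetical.

```python
import string

DIGITS = set(string.digits)
LETTERS = set(string.ascii_lowercase)

calls = []  # records which subrules were actually entered


def scan(text, pos, allowed):
    """Consume characters from the allowed set; return the end position or None."""
    end = pos
    while end < len(text) and text[end] in allowed:
        end += 1
    return end if end > pos else None


def parse_number(text, pos):
    calls.append("number")
    return scan(text, pos, DIGITS)


def parse_word(text, pos):
    calls.append("word")
    return scan(text, pos, LETTERS)


# Each subrule is paired with the characters it can possibly start with.
SUBRULES = [(DIGITS, parse_number), (LETTERS, parse_word)]


def parse_term(text, pos=0):
    """Still recursive descent, but skip subrules that cannot match here."""
    if pos >= len(text):
        return None
    for first_chars, subrule in SUBRULES:
        if text[pos] not in first_chars:
            continue  # pruned: never recurse into this subrule
        end = subrule(text, pos)
        if end is not None:
            return end
    return None
```

The descent is still recursive, as Patrick says, but a subrule is entered only when the next character makes a match possible.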

c:

  • I'm all for pruning trees

Patrick:

  • longest token matching isn't trivial
  • as Larry will tell you
  • PGE needs to be able to attach attributes to subroutines
  • I need to be able to ask a rule (or build a DFA) such that we can associate longest tokens with rules
  • ask a rule "What does your DFA look like?"
  • sounds like an attribute on a sub
  • I haven't been able to put together a clean way to do that in Parrot
  • we have subs with attributes or properties
  • but once you take them down to being PIR
  • you need a subroutine that attaches all of the properties once you load that subroutine

Larry:

  • I'm not doing it that way
  • I have an argument that's the state of the predetermination
  • there's an artificial state, ?, that's "tell me your DFA"
  • instead of caching that in the sub/method itself, there's a per-language cache
  • if two different languages call into that rule, they can keep separate caches of their longest token matchers

Patrick:

  • you have a way of calling a method that says "Give me your DFA" instead of "Do a match"?

Larry:

  • yes
  • you don't have to duplicate the dispatch
  • you get the right one
  • we don't try to duplicate the dispatch mechanism
  • we just tell it to do something different than do the match

Patrick:

  • the grammar holds that little bit of DFA
  • keeps track of how they combine together

Larry:

  • when you ask for the DFA for the particular rule, it asks its subrules
  • for each of those, the autolexer automatically knows whether it has already calculated that
  • it's already in the cache
  • it reuses that bit then
  • reincorporates that into a larger pattern
  • off it goes
  • the tricky thing is making sure that that cache is per-language
  • Pugs doesn't do that now
  • if you try to cache that in the sub as a property, you'd have to keep track of which language used it
  • they have different lexers
  • that's some of the motivation for simplifying the language tweaking things
  • don't want the language to tweak for every unary or zero-arity function
  • I didn't want to cache duplicate lexers
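
The scheme Larry outlines, calling a rule through the normal dispatch with an artificial "?" state and caching the result per language, can be sketched in Python as follows. This is not Larry's implementation. The "DFA" is reduced to a set of literal tokens, and the Rule and Language classes are hypothetical.

```python
class Language:
    """Each language keeps its own cache of lexer fragments."""

    def __init__(self, name):
        self.name = name
        self.lexer_cache = {}  # rule name -> frozenset of literal tokens


class Rule:
    def __init__(self, name, literals=(), subrules=()):
        self.name = name
        self.literals = frozenset(literals)
        self.subrules = tuple(subrules)
        self.computed = 0  # how many times the token set was actually built

    def __call__(self, lang, text, state=None):
        # Same call path either way; the "?" state asks for the token set
        # (standing in for the DFA) instead of performing a match.
        if state == "?":
            return self._tokens(lang)
        return any(text.startswith(tok) for tok in self._tokens(lang))

    def _tokens(self, lang):
        cached = lang.lexer_cache.get(self.name)
        if cached is None:
            self.computed += 1
            cached = self.literals
            for sub in self.subrules:
                # Ask each subrule for its fragment and fold it in.
                cached = cached | sub(lang, "", state="?")
            lang.lexer_cache[self.name] = cached
        return cached
```

A rule's fragment is built once per language and reused when a larger rule asks for it, while a second language calling the same rule builds and keeps its own copy.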

Patrick:

  • part B of my answer is get some sort of profiling available in Parrot
  • see what subs get called and where we spend our time
  • part C is writing the Capture PMC in C
  • that's waiting on the PMC changes

Jerry:

  • we have a profile core in Parrot, -p

Will:

  • it's based on the C level

Jerry:

  • you want subroutines, not ops?

Patrick:

  • yes
  • making a native Capture PMC could be a big win
  • it just redispatches to arrays and hashes
  • that deserves to be its own beast

Larry:

  • do you maintain that match object on the fly as a mutable object, and back things out as you backtrack?
  • my passed-around match state is cloned, immutable objects
  • my continuation system just uses the cloned snapshot
  • it may or may not be faster
  • I know where I need to reconstruct that mutable match object that the user sees
  • that's another possible optimization/pessimization
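
The difference between the two approaches can be made concrete with a small Python sketch of the immutable style: each step creates a new state with a backlink, and the user-facing match is built only on request. This is an illustration under assumptions, not the gimme5 code. State, advance, and matchify are hypothetical names (the last borrowed from Larry's wording).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class State:
    """An immutable snapshot of the match; prev is the backlink."""
    pos: int
    name: Optional[str] = None  # capture recorded at this step, if any
    text: Optional[str] = None
    prev: Optional["State"] = None

    def advance(self, name, text):
        # Never mutate: return a new state linked back to this one.
        return State(self.pos + len(text), name, text, self)


def matchify(state):
    """Walk the backlinks and build the capture table the user sees."""
    captures = {}
    node = state
    while node is not None:
        if node.name is not None:
            # Walking backward, so insert at the front to keep text order.
            captures.setdefault(node.name, []).insert(0, node.text)
        node = node.prev
    return captures
```

Backtracking in this style means resuming from an earlier snapshot rather than undoing changes, and the cost of assembling captures is paid only when matchify is called. Whether that is faster than a mutable match object is, as Larry says, an open question.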

Patrick:

  • the match object is a convenient place to keep track of backtrack state
  • backing out is basically a delete optimization
  • I could optimize the backtracking
  • function is more important than speed for what I've done so far
  • another big win is getting strings implemented under Parrot such that we're not using UTF-8
  • that's slow
  • because of ICU and Unicode support, we're stuck with UTF-8 at the moment

Larry:

  • I've done enough of that in my life

Patrick:

  • as far as the code it generates, I'm sure people can come up with improvements there
  • the code itself does a lot of optimizations
  • profiling seems to be the big win

c:

  • I have some ideas there