Perl 6 Design Minutes for 26 March 2008
- Perl 6 is going pretty well
- mainly catching up on mail
- finding out where people are and what they're doing
- I had a lot of mail
- pleased on the progress on Rakudo
- people are adding more features and looking at it
- still need to look at the new PCT tutorial
- my plan is to continue reviewing the new changes to Rakudo
- going to get back on track writing about what's going on
- no spec work this week
- lots of conversations on p6l about various parsing issues
- relationships of unaries with indexes
- is it safe for me to ignore that thread?
- for now
- whatever we come up with for the operator precedence parser in STD will be the standard way of doing it
- mostly fighting bugs in YAML and losing
- giving up on YAML for lexer storage
- it does not like to deal with Unicode
- is this a particular implementation of YAML?
- all of them
- I'm using my own formatted file at the moment
- fighting bugs in tagged regular expressions
- it loves to coredump for no apparent reason, if you have too many alternatives
- in addition to the difficulties with null patterns and null character sets
- it always coredumps with those
- if I have 28 or so statement control alternatives, it'll run if I delete all but two or three of them
- apparently a forward or back pointer is stored in a byte, or something else equally bad
- may have to dive into the C code to fix that
- also finding a few Perl 5 bugs
- I really, really exercise lexical scoping within subroutines in my output
- it nests maps very deeply and expects lexical variables to stay straight within those blocks
- something is leaking somewhere
- mostly working around those problems
- Summer of Code stuff now
- a few applications are coming in
- two look good for Parrot and Rakudo
- need to advertise for more
- also had a contact from someone who wants to port OpenGL to Parrot
- not Geoff Broadwell
- seems like a very serious approach
- pushing him to apply for funding from TPF
- also trying to keep on top of mail and catch up
- haven't had any coding time
- doing a lot more managing and answering questions
- to some extent, that's fine
- there are more people involved in the project
- my brain is full of the strings PDD at the moment
- some substantial changes from Simon's original draft
- he had a good perspective
- I'm looking at overall architecture changes
- still supporting what we need to support
- just in a different way
- started some Perl 6-related arguments online; it's been a while
- made a first patch that gives PIR profiling
- it's not a great approach
- it gives some visibility though, and I've found a few places for optimization
- found a ten-percent speedup in PGE in some cases
- Tcl spends most of its time parsing and re-parsing
- also going to go through the bug tracker again and see if we can clear out more stuff
- are you still thinking about applying Warnocked Perl 6/Rakudo patches?
- unless Jerry or Patrick yell
- if you reply to them, I'll take a look at them
- I had architectural concerns about some of them
- don't want people to cargo-cult things if we check them in
- but I'll respond to them if you find them and bring them up
- not all are in RT, some were just on the list
- spent a couple of days at EclipseCon
- trying to get Perl 5 as a supported language within Eclipse
- working on a spec, and then we'll shop that around
- next week is a day trip to New York on a potential sponsorship call
- could be significant
- ripping out deprecated items
- hope to get everything we've deprecated out before the next release
- I'm in favor of that
- mostly having conversations about making progress this week
- lots of people are burned out
- we're not hitting milestones that make people feel like they've been productive
- I don't know that we have a good set of milestones in Perl 6
- nor that we could lay out a series of good, dated milestones for Rakudo
- I agree
- but you just keep working away and more things become available to more people
- one blocker is IO
- keeps coming up
- also exceptions
- Larry made a comment somewhere that the design is waiting on the implementations to figure out what they need
- we can go where we need to go
- but we can change later
- there's no point in waiting for now
- there's a draft design for IO
- having an implementation would help people do file IO in Rakudo
- I hear Perl is good about that
- what do you need from Parrot?
- how's the IO PDD?
- it's written but not implemented
- the implementation date is June, I think
- how about the basic stuff?
- open a file and stuff
- that mostly stays the same
- you can start using that interface now
- I'll rip out the guts later when we start implementing the new system
- are you comfortable doing an implementation against what's there today?
- that's one of those areas I'd like to delegate
- I worked on the IO design
- I'm comfortable with Parrot's IO
- I'll read up on Perl 6's IO
- it'll make Rakudo more visible where people can use it and make something work
- reading from and writing to files will help
- Haskell went a long time without that and it's pretty popular
- it comes up on the channel regularly
- we can write our own filters and stuff for test suites in Rakudo instead of Perl 5
- eating our own dog food
- do you have a feeling for the strings PDD delivery and implementation?
- due for implementation June 1
- probably ready for rolling in for mid-June
- Rakudo is holding off on reading Perl 6 source as Unicode waiting for that
- you can probably use some of Simon's optimization techniques in the PDD
- he defines a new string type
- you can use that before the full integration into Parrot
- always gives you a fixed-width lookup
- as far as I can tell, that's what's expensive
- if I switched and translated everything over to UCS-2?
- I don't want to implement any C code personally
- what will exist in Parrot and when?
- let's lay on Simon to get something working soon
- I can't guarantee that we'll have something before June 1
- but we can start implementing the new string type right away
- if we can get Simon to do a first draft, that'll help
- I just don't want to switch to a variable-width encoding, which'll make parsing really slow
- if you transcode when something first comes in, you'll take a first hit but not subsequent hits
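
The trade-off between variable-width and fixed-width storage can be sketched roughly as follows. This is an illustration in Python, not Parrot code: the function names `codepoint_at_utf8` and `to_fixed_width` are hypothetical, and the array of integers only stands in for whatever fixed-width string type the strings PDD ends up defining.

```python
from array import array


def codepoint_at_utf8(raw: bytes, index: int) -> int:
    """Find the index-th codepoint in UTF-8 bytes by scanning from the start.

    Every lookup walks the buffer, so the cost grows with the position.
    """
    seen = -1
    for offset, byte in enumerate(raw):
        if (byte & 0xC0) != 0x80:  # not a continuation byte: a character starts here
            seen += 1
            if seen == index:
                if byte < 0x80:
                    width = 1
                elif byte >> 5 == 0b110:
                    width = 2
                elif byte >> 4 == 0b1110:
                    width = 3
                else:
                    width = 4
                return ord(raw[offset:offset + width].decode("utf-8"))
    raise IndexError(index)


def to_fixed_width(raw: bytes, encoding: str = "utf-8") -> array:
    """Transcode once on input; afterwards every lookup is a plain index."""
    return array("L", (ord(ch) for ch in raw.decode(encoding)))


raw = "«a»€".encode("utf-8")
fixed = to_fixed_width(raw)          # pay the transcoding cost once
assert fixed[2] == codepoint_at_utf8(raw, 2)
```

The first hit is the single pass in `to_fixed_width`; a parser that indexes into the text repeatedly then avoids rescanning on every access.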
- the problem with transcoding to UCS-2 right now is that it requires ICU, and we don't have ICU on all platforms right now
- I could potentially add those operations...
- I did that for UTF-8
- you might be able to use the Perl 5 program that spits out Unicode tables into Perl 5 friendly tables
- they turn into bitmaps in a way that you probably don't care about
- could use that to write something based on UCS-4 or UCS-2ish integers
- the UTF-8 code is directly based on those codepoints
- we work only with codepoints at that level
- how much effort do you want to spend, knowing that the new string implementation is coming?
- the lack of Unicode support in Rakudo prevents the French angles
- how much of the Pugs test suite uses those?
- they don't show up much
- it's not a killer feature
- seems like you could go a long way without it being a problem
- which codepoints do you need?
- in a case-insensitive search, we fold everything to a single case
- without ICU, when you hit a codepoint outside of Latin-1, Parrot throws an exception
- we check for downcasing first, which is slow
- or we could trap the exception
- but a downcase on the French quotes is a no-op
- I could catch it
- but it's a bit of painful overhead to add
- with a UTF-16 implementation which matched downcase for Latin-1, would that work?
- or do people expect to use accented characters and have them work?
- short answer: yes
- right now, Parrot downcases ASCII, checks for ICU and downcases, and throws an exception for everything else
- one patch I have is smarter about the non-ASCII codepoint on the ICU part
- if it's Latin-1, then we can figure out how to do it
- that's pretty easy to downcase
- not that many additional codepoints
- if it's outside of that, we can throw the exception
- that range includes the French quotes
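
A minimal sketch of the fallback described above, written in Python rather than Parrot's C, and assuming only what the discussion states: ASCII is handled directly, Latin-1 gets a small hand-written rule, and anything beyond that raises. The names `downcase_codepoint`, `downcase`, and `UnsupportedCodepoint` are hypothetical stand-ins, not Parrot's API, and the ICU branch is left out entirely.

```python
class UnsupportedCodepoint(Exception):
    """Stand-in for the exception thrown when no case mapping is available."""


def downcase_codepoint(cp: int) -> int:
    """Downcase one codepoint using only ASCII and Latin-1 rules."""
    if cp < 0x80:
        # ASCII: only A-Z change.
        return cp + 0x20 if 0x41 <= cp <= 0x5A else cp
    if cp <= 0xFF:
        # Latin-1: U+00C0..U+00DE are uppercase letters, except
        # U+00D7 (multiplication sign); each maps to cp + 0x20.
        if 0xC0 <= cp <= 0xDE and cp != 0xD7:
            return cp + 0x20
        # Everything else in this range, including the French quotes
        # U+00AB and U+00BB, is a no-op.
        return cp
    raise UnsupportedCodepoint(f"U+{cp:04X} needs a fuller Unicode library")


def downcase(text: str) -> str:
    """Downcase a whole string, raising on the first unsupported codepoint."""
    return "".join(chr(downcase_codepoint(ord(ch))) for ch in text)
```

With this rule, a string containing the French quotes downcases without reaching the exception path, while a codepoint outside Latin-1 still raises.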
- let's see if we can get Simon to do an initial implementation
- one of the milestones was documentation for PCT
- is the PCT tutorial close to that?
- it needs to be in PDD form
- Will's talking to him about that
- I'm happy to work with him on that
- have you been struggling alone with those bugs, Larry?
- have you had help from others?
- when you have a bug in TRE from a DFA that's too large and you try to cut it down, and it goes away when you cut it down, I worry that I'll have to solve it on my own
- AEvar wrapped it for Perl 5
- he may be familiar with it
- it's down in the guts
- in the long run, we may abandon TRE and write our own DFA
- just a question if I can work around it right now
- TRE might need modification anyway
- it gives me the longest token, but it won't give me the second longest token if the first one fails
- not sure how to backtrack into that
- a parallel NFA might be more reasonable in that case
- don't you need all the decreasing order of longest?
- you make a list of all candidate token resolutions
- find the longest unique
- call that and hope it succeeds
- if not, and if you're not ratcheting, you need to try something shorter
- all the way down to nothing
- once you know the longest one, it's a lot easier to find the shortest one
- then you know the lookahead
- that assumes you have a hash to look up the shorter keys in
- there's some value in knowing the longest one
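
The candidate-list approach described above can be sketched as follows. This is a rough Python illustration using the standard `re` module, not TRE or STD; the names `candidates`, `match_token`, and the `accept` callback (standing in for "call that and hope it succeeds") are hypothetical.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple


def candidates(patterns: Dict[str, str], text: str, pos: int) -> List[Tuple[str, str]]:
    """Return (name, lexeme) for every pattern matching at pos, longest first."""
    found = []
    for name, pattern in patterns.items():
        m = re.compile(pattern).match(text, pos)
        if m:
            found.append((name, m.group(0)))
    # Stable sort: equal lengths keep their declaration order.
    found.sort(key=lambda pair: len(pair[1]), reverse=True)
    return found


def match_token(
    patterns: Dict[str, str],
    text: str,
    pos: int,
    accept: Callable[[str, str], bool],
    ratchet: bool = False,
) -> Optional[Tuple[str, str]]:
    """Try the longest candidate; unless ratcheting, fall back to shorter ones."""
    for name, lexeme in candidates(patterns, text, pos):
        if accept(name, lexeme):
            return name, lexeme
        if ratchet:
            break  # ratcheting: no backtracking into shorter tokens
    return None


patterns = {"if": r"if", "ident": r"[a-z]+"}
# Pretend the rule behind the longest token fails, forcing a fallback.
result = match_token(patterns, "iffy", 0, lambda name, lexeme: name != "ident")
```

A matcher that reports only the single longest token supports the first iteration of this loop but not the later ones, which is the limitation raised above.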
- it'd be better to have an automaton in this case
- I'd like to have this applicable beyond parsing and lexing
- any regular expression-like thing can automatically do DFA-style matching to the extent that it's reasonable
- and gracefully fall over to the other one
- there are various ways of hacking around it that would work for a lexer
- that's not the direction I want to go
- you're after bigger game
- Perl originally integrated regex matching into the language
- we've ignored DFA-style matching for so long, we're late in integrating it
- but I think we can do it better than anyone else so far
- is that a new motto for Perl 6?
- uh oh, another new motto!
- one thing in Rakudo stops us from writing Perl 6 methods and classes in Perl 6
- it's a bug or limitation in PCT, I think
- when you compile Perl 6 code to PIR to create bytecode you can call as a library, it creates subs with the same name as other subs
- the generator for the sub names starts with the same number in every file
- you'll get
- will HLL and namespaces help that?
- does something else in PCT need modification for that?
- namespaces would do a lot for it
- I don't have a good answer for that
- the name generator needs a better universally-unique identifier for its names
- a UUID generator would go a very long way to solving this problem with Parrot in its current state
- anything we can steal?
- it'd be nice to have a good UUID generator in Parrot itself
- are they generated separately and then included in the same file?
- when I say "universally unique" I mean "I unique"
- needs external tracking or something
- sounds like we're solving the wrong problem there
- if there's a way to make identifiers for subs static, like in C -- file-scoped but don't leak out, we could use that
- I don't know if the :anon flag does that
- that just makes sure that they don't get stored in the namespace
- we're typically talking about nested closures
- all in the same compilation unit
- as long as the PIR compiler can make all of those linkages such that there's no runtime symbol lookup, there's no problem
- can we include the name of the sub the closure is in in the generated name?
- that may be a short-term solution
- if you put in the namespace and the name as part of the generated name, does that help?
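
To illustrate the collision and the short-term fix being proposed, here is a small Python sketch. It does not reproduce PCT's actual name generator: the `_block` prefix, the starting number, and the function `make_namer` are all assumptions made for the example.

```python
import itertools
from typing import Callable, Sequence


def make_namer(prefix: str = "_block", start: int = 10) -> Callable[..., str]:
    """Create a per-file name generator whose counter restarts at `start`."""
    counter = itertools.count(start)

    def next_name(namespace: Sequence[str] = (), enclosing_sub: str = "") -> str:
        # Qualify the generated name with the namespace and enclosing sub, if given.
        parts = [prefix, *namespace]
        if enclosing_sub:
            parts.append(enclosing_sub)
        parts.append(str(next(counter)))
        return "_".join(parts)

    return next_name


# Two separately compiled files each get a fresh generator.
file_a = make_namer()
file_b = make_namer()

# Unqualified names collide because both counters start at the same number.
collides = file_a() == file_b()

# Qualified names differ as long as the namespace or sub name differs.
distinct = file_a(("Foo",), "bar") != file_b(("Baz",), "bar")
```

Qualifying the name still collides if two files generate names in the same namespace and the same sub, which fits the description of this as a short-term solution.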
- does the :anon flag give us what we want?
- it may not be enough
- we can test that easily enough
- I'd like to see this problem solved before conference season
- I want developers to be able to jump in and implement things in Perl 6 by then
- seems like a more urgent problem than French quotes
- if it doesn't work, it's a bug in the current implementation of anonymous subs
- is there a ticket for this?
- if there isn't, I'll add one