
agent (5836)

agent
  agentzh@yahoo.cn
http://agentzh.spaces.live.com/

Agent Zhang (章亦春) is a happy Yahoo! China guy who loves Perl more than anything else.

Journal of agent (5836)

Sunday September 28, 2008
06:36 AM

Now we have Actions!

On behalf of the OpenResty team, I'm happy to announce that OpenResty 0.5.0 has been released to CPAN, which means OpenResty has hit its 5th milestone indicated by a working Action API.

I've found Actions very useful in grouping together concurrent AJAX requests, which will make webpages load much faster. Our blog sites are already taking full advantage of this trick:

    http://blog.agentzh.org
    http://www.eeeeworks.org

Also, Actions ensure cascaded requests run in exactly the expected order and the REST interfaces are called (mostly) in the expected way (e.g. from the end users' web browsers). There used to be a serious security hole in the above blog sites because, before we had Actions, I had to expose PUT /=/model/Post/~/~ to the Public role for updating the "comments" field in the Post model.

The main server for OpenResty, api.openresty.org, has already been upgraded to 0.5.0. If you want to play with OpenResty directly on our servers, feel free to write to me (agentzh at yahoo dot cn) and get an account for free!

Enjoy!

Saturday September 27, 2008
11:23 PM

pod2html.js: Some JavaScript love for POD in a browser

It's fun to do POD (Plain Old Documentation) in a web browser and I've hacked up a JavaScript implementation for the pod2html utility (actually the output is more like Pod::Simple::HTML).

The pod2html.js script is in OpenResty's SVN repository:

   http://svn.openfoundry.org/openapi/trunk/demo/Onccf/js/pod2html.js

The API is straightforward, for instance,

   var pod = "=head1 Blah\n\nI<Hello>, C<world>!\n";
   var html = pod2html(pod);

The following web site is already making use of it:

   http://agentzh.org/misc/onccf/out/

By sniffing the background AJAX requests (e.g. using Firebug), you can see raw POD is retrieved from the OpenResty server and converted to HTML on-the-fly in your browser.

It's worth mentioning that I had a lot of fun combining Test::Base and JavaScript::SpiderMonkey to test this piece of JavaScript code in pure Perl. You can check out the test script here:

   http://svn.openfoundry.org/openapi/trunk/demo/Onccf/t/01-pod2html.t

By looking at the (declarative) test cases, it's trivial to see what it can do (and hopefully what it can't) :)

For the record, as of this writing, the following POD directives are supported:

  =headN, =over, =item *, =item NUM., =item TEXT, =back, =begin html, =end html, =begin ANY, =end ANY, =cut (it's a no-op), =encoding ANY (it's a no-op)

and the following POD markups are implemented:

   C<...>, I<...>, B<...>, L<...>, F<...>

I've also implemented the (non-standard) =image directive for convenience. For example,

   =image gate.jpg

will be converted to

   <p><img src="gate.jpg"/></p>

Have fun!

P.S. This journal was originally posted to my personal blog site: http://blog.agentzh.org/#post-93

Tuesday August 05, 2008
06:27 AM

Filter::QuasiQuote 0.01 is now on CPAN

After reading Audrey's blog post mentioning GHC's upcoming quasiquoting feature (as well as that quasiquoting paper), I quickly hacked up a (simple) quasiquoting mechanism for Perl, hence the Filter::QuasiQuote module already on CPAN:

http://search.cpan.org/perldoc?Filter::QuasiQuote

I'm looking forward to using sensible filters in my production code (e.g. OpenResty) and eliminating ugly Perl code with embedded DSLs. For example, instead of writing

    my $sql = "alter table " . quote_identifer($table) . " drop column " . quote($column) . ";";

I can simply write

    use OpenResty::QuasiQuote::SQL;
    my $sql = [:sql| alter table $table drop column $column; |];

Also, a JSON-like DSL can be used to describe valid Perl data structures and to generate the Perl code doing validation.

Filter::QuasiQuote supports subclassing, so the OpenResty::QuasiQuote::SQL module mentioned above could be derived from it. Also, multiple concrete filter classes could be composed in a single Perl source file. Just put a series of use statements together:

    use MyQuote1;
    use MyQuote2;

and it should work. Because it's required that filters always return Perl source aligned in a single line, line numbers won't get corrupted.
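To make the single-line requirement concrete, here is a minimal sketch of what expanding an SQL quasiquote might look like. It does not use Filter::QuasiQuote's real API: expand_sql_quotes and sql_to_perl are hypothetical names, quote_identifier is an assumed helper that the generated code would call, and treating every interpolated variable as an identifier is a simplification.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Filter::QuasiQuote): turn the text
# inside a [:sql| ... |] block into a single-line Perl expression.
sub sql_to_perl {
    my ($sql) = @_;
    $sql =~ s/^\s+|\s+$//g;    # trim leading/trailing whitespace
    $sql =~ s/\s+/ /g;         # collapse newlines so the result is one line
    my @code;
    # Split into literal chunks and $variables, keeping the variables.
    for my $part (split /(\$\w+)/, $sql) {
        next if $part eq '';
        if ($part =~ /^\$\w+$/) {
            # Assumption: every variable is quoted as an SQL identifier.
            push @code, "quote_identifier($part)";
        }
        else {
            (my $lit = $part) =~ s/(['\\])/\\$1/g;    # escape for '...'
            push @code, "'$lit'";
        }
    }
    return '(' . join(' . ', @code) . ')';
}

# Rewrite every [:sql| ... |] block found in a chunk of Perl source.
sub expand_sql_quotes {
    my ($src) = @_;
    $src =~ s{\[:sql\|(.*?)\|\]}{ sql_to_perl($1) }gse;
    return $src;
}

print expand_sql_quotes(
    'my $sql = [:sql| alter table $table drop column $column; |];'
), "\n";
```

A real filter would of course need to tell identifiers from literal values, which is exactly the kind of knowledge a DSL-specific subclass can carry.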

Of course, lots of nice consequences of the Haskell quasiquotations will be lost in my implementation, such as type safety. But the Perl version is much more flexible and powerful (by some definition) ;)

It's still in alpha and could be buggy. Feel free to report bugs or send wishlist to the CPAN RT site or directly to me ;)

Enjoy!

Thursday June 19, 2008
10:26 PM

UML::Class::Simple 0.10 released

I've just uploaded UML::Class::Simple 0.10 to CPAN with the highlight of the XMI format support. It will appear on the CPAN mirror near you in the next few hours.

Thanks to Maxim Zenin for contributing this feature :) A Japanese user was requesting this in his blog as well. If you're an XMI fanboy, feel free to try it out.

Thursday September 20, 2007
05:47 AM

The SearchAll Firefox Plugin and XUL::App framework

My first $job project is now opensourced. It's a Firefox extension named SearchAll.

SearchAll is a simple side-by-side search engine comparing tool which allows you to search at most 3 different search engines simultaneously and benchmark their performance in the status bar.

With this extension, you can compare 2 search engines or 3 search engines at a time. There's a long list of default search engines that you can choose from (including search.cpan.org!). And you can also enter search engines' URLs which don't appear in the default list yourself.

Currently only the sites' raw HTML pages are shown to the user. We'll add more comprehensive and more intuitive views and graphics for the search results in the near future. Please stay tuned!

This project was initiated and has been regulated by the Yahoo! China ( http://cn.yahoo.com ) company and opensourced under the MIT license.

One of our buzzwords (for extension developers) is that there are 0 lines of XUL/RDF/XML in our project's source tree. The GUI stuff is totally scripted in Perl, thanks to Jesse Vincent's Template::Declare module on CPAN.

You can always get the latest source code of this project from the following SVN repository:

   http://svn.openfoundry.org/searchall/

If you'd like to help, please let us know. We're very willing to hand out commit bits like the Pugs team ;)

The XPI file that can be installed directly into Firefox can also be found here:

   http://svn.openfoundry.org/searchall/trunk/searchall.xpi

There's a XUL application framework named XUL::App sitting in the same repos and SearchAll is already using it. I'd expect to move XUL::App to a separate repos and rename it to a cooler name (maybe Xifty or Xufty?).

Sorry for the lack of documentation. Please see README for some general ideas :)

I've already submitted this extension to addons.mozilla.org and am waiting for the editor's approval.

Enjoy!

Monday October 30, 2006
09:59 AM

Notes for this fortnight (2006-10-18 ~ 2006-10-30)

Oct 18 (to Jack Shen~)

I wrote a UML class diagram generator based on GraphViz. it can parse arbitrary perl OO modules and obtain the inheritance relationships and method/attribute list automatically. it's called UML::Class::Simple. And it's much easier to use than StarUML. you know, dragging the mouse to draw diagrams is really painful. yay for automatic image generation!

(Here is one of the sample outputs: http://svn.berlios.de/svnroot/repos/unisimu/fast.png.)

Oct 18 (to Sal Zhong~)

i'm planning to upload UML::Class::Simple to cpan once it's mature enough. will you test it for me? bug reports and patches are most welcome. :)

it's still undecided how to differentiate perl classes' properties from other ordinary methods. i'm also pondering the idea of adding relationships other than inheritance. i'll be delighted if you have some ideas on these matters.

Note that i'm ignoring the Autodia module on CPAN since i'm not in favor of XML and a quite different approach has been taken in my project. anyway, i have to admit it's wise to talk to Autodia's author and merge these efforts. finally, i must thank Alias for creating PPI and suggesting the use of Class::Inspector. they're invaluable when one wants to extract meta info from the perl world.

Oct 19 (to Jack Shen~)

I've merely finished the slides for recap. they already reach the amount of 44 and the number is still counting. alas, still wondering what to say in the next talk on the design of methods and subroutines. :(

Oct 19 (to Cherry Chu~)

Thanks. the talk went pretty well. it's interesting to see that i had the feeling just before the talk that you would not come. so i was not very surprised by your absence. no problem, there's always ``the next time''. :)

i've been busy making slides for tomorrow's talk. they're still not finished yet. sigh. have to make more slides during the daytime tomorrow. producing so many slides is quickly getting tedious. hehe, you know that feeling, right? ;-)

Oct 22 (to He Shan~)

> hi! I've found a book. IT is so nice that i have been
> reading about it all the afternoon. it is great, just
> like an extended version of "The Practice of
> Programming". it's named "Code Complete".

I've got the feeling that you are currently on the *right* way. you'll definitely become a good hacker if you keep going. hmm, hopefully you'll join us perl camels soon. ;)

Oct 22 (to Jack Shen~)

...LOL. apparently you are not a VB guy. inserting images into ppt slides is straightforward once you know how to record down VBA macros in the PowerPoint environment and browsing the generated code in its VB IDE. Another way to get an answer is searching the web. iirc, the method should be AddPicture or something like that. not sure though, computers are out of my reach right now. :(

...Python is even more powerful than MATLAB, Maple, and Haskell? i doubt that. :)

...I was exclusively hacking on the new tokenizer for Makefile::Parser and completely forgot that i had C# classes tonight. anyway, the next major release of M::P takes precedence over any other things. :)

Oct 23 (to Sal Zhong~)

I've just started to rewrite M::P's codebase (which will hopefully be released as M::P 1.00 soon). Yes, it's long overdue. I've had a pretty good plan for a scalable and extensible gmake implementation based on M::P for long.

The new M::P API will offer parsing results at two different levels:

  • Makefile DOM tree

    It's a syntax-oriented data structure which preserves every single bit of info in the original makefile (including whitespaces and comments). So one can modify some part of the DOM tree, and write the updated makefile back to disk. I think it's useful to some GUI apps which want to edit makefiles via menus and is also beneficial to the gmake => PBS translator.

  • Makefile AST

    The AST desugars the handwaving parts of the DOM tree down to a semantic-oriented data structure for make-like tools to ``run'' it or for some visualizer (e.g. my Makefile::Graphviz) to depict the underlying dependency relations. For the PBS emitter, I think we should work out a special AST for it since the desugaring must be lossless, much like a program correctness proving system.
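As a rough illustration of the "preserves every single bit" property of the DOM level, here is a tiny line-based sketch. It is not the actual MDOM code: the token names and both functions are made up for this example, and real makefile syntax (line continuations, directives, double-colon rules and so on) is ignored.

```perl
use strict;
use warnings;

# Hypothetical, greatly simplified tokenizer: classify each line but keep
# its text verbatim (including whitespace, comments and the newline).
sub tokenize_makefile {
    my ($text) = @_;
    my @tokens;
    for my $line (split /(?<=\n)/, $text) {    # split after each newline
        if    ($line =~ /^\s*$/)         { push @tokens, [ whitespace => $line ] }
        elsif ($line =~ /^\t/)           { push @tokens, [ command    => $line ] }
        elsif ($line =~ /^\s*#/)         { push @tokens, [ comment    => $line ] }
        elsif ($line =~ /^[^:=]+:(?!=)/) { push @tokens, [ rule       => $line ] }
        elsif ($line =~ /^[^=]*=/)       { push @tokens, [ assignment => $line ] }
        else                             { push @tokens, [ unknown    => $line ] }
    }
    return \@tokens;
}

# Writing the tokens back out must reproduce the input byte for byte.
sub serialize {
    my ($tokens) = @_;
    return join '', map { $_->[1] } @$tokens;
}

my $makefile = "# build\nCC := gcc\n\nall: foo.o\n\t\$(CC) -o all foo.o\n";
my $tokens   = tokenize_makefile($makefile);
print join(',', map { $_->[0] } @$tokens), "\n";
print serialize($tokens) eq $makefile ? "round trip ok\n" : "round trip FAILED\n";
```

Because nothing is thrown away, a tool can change a single token (say, the right-hand side of an assignment) and write the file back with everything else untouched. An AST, by contrast, is free to drop the comment and whitespace tokens.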

I'm currently working on the M::P tokenizer and will finish the DOM tree constructor these days. The process should be going pretty fast since it is mostly test-driven.

The first goal is to implement the new M::P APIs and get my pgmake utility to pass most of the gmake tests so that I can kick M::P 1.00 out of the door.

I'm stealing a lot of source code and pod from Alias's PPI module. I've noticed that the basic structure of PDOM trees can also fit my needs very well. it's called MDOM in my M::P though. ;-)

Oct 24 (to Sun Xin~)

Take care. translating may drive you mad some day. just have appropriate amount of fun, dude!

Oct 26 (to Jack Shen and Sal Zhong~)

my gnu Makefile DOM builder now supports most kinds of rules, 2 flavors of variable assignments, macro interpolations, and various command and comment syntax. Now it's trivial to add new node types and extend the DOM parser.

i'll add support for double-colon rules, the define/vpath/include/ifeq/ifneq/ifdef/ifndef/... directives, and other missing structures tomorrow. After these additions, the DOM parser will be quite complete and will serve as the solid ground that we keep standing on. constructing the Makefile AST will be much easier if we keep a DOM tree handy.

yay for test-driven development! without TDD or Alias' PPI, i wouldn't have progressed so rapidly. ;-)

Oct 29 (to Sal Zhong~)

When and where shall we take the Java exam?

...Oops, it seems impossible to release UML::Class::Simple tonight. still have several missing features to implement and the pod needs love too. hmm, christopher may be unhappy since i earlier made the promise to him that i would make the release by *this* weekend. sigh. hopefully i'll get some cycles tomorrow.

...nod nod. but i also gotta review the data mining textbooks for the coming exam. furthermore, i'm planning to hack on two expert systems in the next week. i'll be programming in Prolog, CLIPS, and Perl simultaneously, which must be a lot of fun! yay! :D

Oct 30 (to Sal Zhong~)

I've just talked to Alias, the author of PPI, on #perl. he said that i could borrow as much source code from PPI as i liked for my Makefile::DOM module. PPI::Element, PPI::Node, PPI::Token, and PPI::Dumper can be reused by my MDOM directly without many changes. i also briefly introduced the two-level ASTs to him and expressed my appreciation of PPI. It has given me plenty of inspiration on how to push my Makefile::Parser further.

This journal was originally posted as http://agentzh.spaces.live.com/blog/cns!FF3A735632E41548!128.entry

Tuesday October 17, 2006
09:42 AM

Notes for this fortnight (2006-10-01 ~ 2006-10-16)

Oct 1 (to Sun Xin~)

Please check out your mailbox. i sent one journal and 107 slides to you for proofreading yesterday. remember that i've said i would try my best to keep you relatively busy? :)

Oct 4 (to Sal Zhang~)

I've rewritten your Win32::xul2ppt_mec module using Win32::OLE and stevan's excellent Moose module. Now it's named XUL::Image::PPT and the xul2ppt utility has been divided into two separate tools, xul2img.pl and img2ppt.pl. Please check out http://svn.berlios.de/svnroot/repos/unisimu/Perl/XUL-Image-PPT/ for the source code. :)

Regarding the new xul2img utility, the --count and --title options are required. use --help to see the usage. because the XUL => image part is still based on Win32::GuiTest, the user interface is somewhat fragile and cannot be as nice as that of img2ppt. it's still the user's responsibility to open .xul with firefox and not to enter the full view mode (via F11) before running the xul2img tool.

Delay settings like 0.5 sec should also work now since i've switched to Time::HiRes's sleep function. btw, Moose is so cool that writing perl 5 OO code has been exceedingly enjoyable. you know, perl 5's OO was once a weak or even boring part. Moose has brought me the feeling of using Perl 6 *today*. So don't hesitate and give it a shot! Enjoy~

(agentzh mooses.)

Oct 5 (to Sun Xin~)

Currently i am making slides for my XML talk. the topic is ``XML in the real world''. will send the slides to you for review once they're ready. :)

Oct 6 (to Cherry Chu~)

I will send you a message when i get up tomorrow morning. please keep your phone on, OK? if you get up earlier than i, would you please inform me via a message? thank you. :)

Oct 6 (to Jack Shen~)

The slides for my XML talk are ready now. please check out your mailbox for details. the slides contain a lot of pretty pictures. i've covered hot topics like RSS and AJAX using Google Reader, the Qzone site, and my GetQzone utility as study cases. these topics are extremely exciting! comments on my slides will be appreciated. :) i hope miss zheng will be kind enough to give me more time to explain everything in my slides...hehe.

Oct 7 (to Cherry Chu~)

cherry: moose. :)

cherry: elk! :D

I'm now heading out. :) 7:15 AM. don't be late, cherry.

...yay! cherry++ i'm already waiting for you. :)

...i am home now, cherry! yay! ...I was walking pretty fast. hehe. have a good rest. hopefully you will regain your strength soon. :)

Take care and sleep early, cherry. gotta run to shower and sleep myself. G'night &

Oct 8 (to Sun Xin~)

Cherry and i rode to the yangzhou city yesterday. we favored small roads in the fields over big ones. as a result, we were often followed by barking dogs and blocked by rivers and fields in our way. it was frustrating but also fun. she was amazingly vigorous and charming yesterday...we talked very happily and laughed a lot. you know, it was quite amusing to see she also talked and laughed very loudly, just like me! yay! hooray for cherry's beauty and the enormous parallels between us! hehe.

we've decided to ride to other cities in the next few times. but it's still undecided which city to go first. what's your opinion, man? ;-)

Oct 8 (to Cherry Chu~)

how are you today? i am still a bit tired. sigh.

...wow, nice to hear that. btw, i'm happy to see my friend laye has replied to your journal. he's a talented programmer and now studying in the Fudan university. :) And your ``journal of 70 kilometers'' post reads very well! :)

...nod nod. he was in ujs when he was an undergraduate student. sadly we have never met in person. :(

Oct 9 (to Sal Zhang and Jack Shen~)

Yay! now i can do Java Swing programming in pure perl 5! furthermore, my perl interpreter can now learn new Java libraries all by itself. so i can manipulate *any* Java classes and objects as if they were implemented directly in perl 5. thanks to Inline::Java and Java::Swing. now i'm trying to get them to work with pugs (i.e. perl 6). unfortunately, pugs doesn't do auto-importing for perl 5 modules. sigh. maybe i need to write some perl 5 wrappers and glue code there. oh, well...

Oct 9 (to Sal Zhong~)

huh! google++

i will definitely look into its shiny source code search engine the other day. thanks for the info. :)

Oct 9 (to Sun Xin~)

man, i'll (selectively) translate these notes myself because i don't want to occupy too much of your spare time. anyway, i can do the translation work more easily and more accurately. would you please proofread both my english and chinese transcripts for me? i'll be very grateful for your review! ;)

Oct 10 (to Jack Shen~)

I've nailed down the basic syntax of the SXML language. it looks pretty neat. i'll implement converters for XML <=> SXML and HTML <=> SXML. i believe it's important enough for both XML's human reading and human writing.

Oct 10 (to Cherry Chu~)

Moose. will you come to my class this friday evening? :)

Oct 16 (to Sal Zhong~)

jerry gay (the guy also known as particle) is rewriting my smartlinks.pl using Moose. it's really wonderful! he will commit the code to the parrot repos. He said he would introduce smartlinks to the parrot test suite and link the tests to both the Perl 6 Spec and the parrot PDDs. not sure if he still has the crazy plan to port smartlinks.pl to PIR. anyway, as christopher said, the idea of smartlinking has inspired several add-on hacks. hooray!

btw, pugs 6.2.13 is going to release tomorrow. larry is using pugs for his $work! sweet...

This entry was originally posted to

http://agentzh.spaces.live.com/blog/cns!FF3A735632E41548!125.entry

Sunday October 01, 2006
02:27 AM

Notes for this fortnight (2006-09-21 ~ 2006-09-30)

Sep 21 (to Sun Xin~)

our charming XML instructor today asked me to give one or two talks in her class because she thought i was an expert in this domain. i'm very excited and have decided to make some good-looking slides in both english and chinese. i'll work on the new slides for the coming talks in the next few days. i'll send them to you for review once they are ready.

i'll also send you a bunch of _old_slides_ for the talk given in the last term tomorrow. they're in pure english and most of my classmates liked it. i hope you can proofread the old slides since i'm going to publish them on the web.

Our XML instructor is so beautiful that i dare say a lot of boys in the classroom like her very much. she holds great charm for me. i love to talk to her after the class. talking to her in person is really enjoyable. she is an extremely lovely girl. yay for her beauty and good mood!

Sep 22 (to Sun Xin~)

mails sent. remember to use Firefox to access the .xul URL (i.e. the slides) and don't click too fast while reading the slides since loading images can be slow.

btw, i'll use cherry's qzone blogs as a study case in my new slides for the XML talk. the slide-making process can be fun! stay tuned! :)

Sep 22 (to Jack Shen~)

hey, jack. let me talk about microcosmic stuff, such as interface design for individual classes or small class library and you talk about the handwavy macroscopical things like large OO systems. i'll try my best to put enough basic weapons under the audience's belt before your lectures. what's your opinion?

Sep 23 (to Sun Xin~)

my instructors have asked me to give 12 talks in total this semester. that's really wonderful since i can take a more leisurely pace during my talks. but i definitely need many more slides and pictures. i'll be talking about XML, Regular expressions (regexes) and object-oriented modeling and design. what cool stuff!

Sep 27 (to Sun Xin~)

i've produced 82 slides these two days, and i am about to crash... for my first talk, there are still 20 slides to go... it's really exciting!!! man!!!

man, i've sent a weird english poetry to you for translating. i need the chinese transcript for my slides. please get back to me ASAP. it's quite urgent since the talk is scheduled on *this* Friday evening. thank you in advance. :)

Sep 27 (to Cherry Chu~)

our XML teacher has asked me to give one or two talks to my classmates in her class. and i am going to use *your* Qzone blogs as a study case in my slides. :)

will you mind my introduction to your Qzone home? btw, an even shorter URL is working now: http://perlcabal.org/agent/cherry.html. feel free to give it a shot.

...glad to hear that. i have 16 lectures to give out this semester. i've been busy making slides for my talks. it is a hard job but is also fun. :)

Sep 29 (to Sun Xin~)

the talk was a big success. the audience laughed a lot and i was often stopped by the girl students' ``wow''. my instructor said after the talk that he was exceedingly impressed. he told me that it had given him great inspiration and determination. he said he had even been pondering giving up his C.S. career, but my talk completely changed his mind.

...nah. cherry didn't come because she is in a different department and i had not invited her to my talk.

...sorry, i didn't show your transcript explicitly in my slides. Audrey offered a translation in ancient chinese right before the talk and i used hers. but your work had helped me a lot. without your translation, i couldn't have grokked that poem to such an extent. thank you!

it's worth mentioning that Larry Wall also provided me with an excellent translation in modern english. you know, he is a great linguist. :)

...LOL. lucky indeed. getting so much help and support is like a dream!

Sep 30 (to Cherry Chu~)

Heh, it will rain tomorrow anyway. hopefully the weather will get better when you come back from shanghai. :)

...nope, not that one. i was talking about the art of naming. the talk mentioning your Qzone space will be given on Oct 14, which is about XML in the real world.

Sun Xin asked me last night if cherry had attended my talk. and i explained to him that you are in a different department and sadly i had not sent you an invitation.

The talk was on this Friday evening. and i'll give talks at that time every week from now on. we are at Z101. you're welcome to join us! 9th and 10th classes. :)

...nah, chinese speech mostly. every slide contains two versions of the content, the chinese version and the english version. and yeah, there'll be many students in my class. don't worry. :)

it will simply rock if you can come. for another thing, i really hope you can also attend my XML talk because i'm going to show my classmates your blogs there. he he.

the XML talk has been scheduled at 10504, 7th or 8th class. the concrete date is still undecided. i'll tell you once i have talked to our XML instructor. OK?

...(agent does his happy dance.)

the XML talk should be on Thursday afternoon, btw.

cherry, i have the idea of writing journals based on my cellphone messages. it's a great source of materials, you know. of course, i won't publish any messages that i've received from others. for example, your replies will be excluded while my messages to you will probably be shown. what's your opinion? will you mind?

This journal has also been posted at http://agentzh.spaces.live.com/blog/cns!FF3A735632E41548!124.entry

Saturday April 08, 2006
02:18 AM

The Genetic Algorithm Used by Audrey

I've been reading #perl6 IRC logs for more than a year and it's a very good way to sync with the rapid Perl 6 development.

Sometimes I find something really, really interesting so I'd like to quote them here. After all, I know there're many Perl 5 programmers (like me!) who love to learn more about Perl 6 and the future of Perl 5.

2006-04-06
-----------
[04:55] <arcady> by the way, how much of "the perl 6 grammar" exists at the moment?
[04:58] <TimToady> depends on how you count, I suppose.  rule syntax is fairly well characterized by now.  a lot of it is specced pretty well, for some definition of pretty that ain't pretty.
[04:59] <arcady> so at least we can have something like the grammar grammar
[04:59] <TimToady> Most of the operator precedence is not done with rules at all.
[04:59] <TimToady> Yes, the grammar grammar is already bootstrapped approximately twice.
[04:59] <arcady> it's kinda hard to keep track of all the stuff going on...
[05:00] <TimToady> there'e very little top-down grammar over the bottom-up expression parser.
[05:00] <TimToady> more top-down involved in scanning complex tokens containing subexpressions.
[05:00] <TimToady> but the main complication remaining is just making sure all the grammatical categories work as envisioned.
[05:01] <TimToady> Then there's just little detail of attaching semantics to the parse...  :)
[05:01] <TimToady> s/little/the little/
[05:02] <arcady> that can be left as an exercise to the implementors : )
[05:02] <TimToady> But audreyt says that Perl 6 now fits in her head, so that should be finished a day or two after the grammar is done.
[05:04] <arcady> that would be most awesome
[05:05] <TimToady> well, even a month or two would be awesome.  a year or so is more likely before we have something really, really solid.  Still, I'm very happy with how it's going.
[05:05] <arcady> well, I'm happy that you're happy, and that it's going
[05:05] <arcady> I guess it's not entirely obvious from here
[05:06] <arcady> and all the various bootstrap efforts and targets and so on are confusing
[05:07] <TimToady> Hmm, yes.  Audrey
[05:07] <TimToady> Audrey's development methodology resembles a flooding algorithm at times...
[05:08] <TimToady> Or maybe a genetic algorithm.

__END__

Yeah, last year I was also confused by the JavaScript, Perl5, and all other backends for Pugs. I was asking in my mind, "why can't autrijus focus on parrot, which is believed to be the only VM Perl 6 should be run on? Isn't it possible that the various backends will slow down the roaring speed of Pugs?" Now I finally understand the approach Audrey has been taking -- genetic algorithm or something like that. Given that there's always more than one way to do it, how can we figure out the "best" way if we haven't tried others yet? And furthermore, never forget that -Ofun is always the meta goal of the Pugs project. :)

parrot may not be the only choice and may not be the best choice, as evidenced by the following conversation:

2006-04-06
-----------
[05:11] <arcady> what's going on with parrot, by the way?
[05:12] <arcady> and how does any of that connect to any of this?
[05:13] <TimToady> Parrot is sort of the other end of the world from me, so I just follow along in p6i mostly.  I hear conflicting things, but I think it'll get there eventually, for some definition of "there".
[05:14] <TimToady> Whether it will be "the" Perl platform or "a" Perl platform, or somewhere in between, remains to be seen.

__END__

P.S. I must admit, Audrey's "Genetic Algorithm" is funny and helpful even in a general sense. I've successfully applied that to most of my open source projects. Multiple approaches and multiple perspectives often lead to surprisingly deep insights. That may be the most useful "algorithm" I learned from Audrey++. ;-)

Friday January 20, 2006
10:03 AM

use HTTP::Proxy to log my web accessing history

Yeah, I visit many websites every day. What I've always wanted is a facility to automagically keep a record of the URLs and page titles I've just accessed, so that I can analyse the history some time later to find out the focus of my interest in a particular period of time, for example. And it's very likely that I can come up with even more interesting statistical consequences.

The Mozilla browser doubtlessly gives builtin support for accessing history, but unfortunately exporting that history info is not trivial. what I want is not only the URLs, but also the corresponding page titles (if any!) and the visiting time stamp.

Several weeks ago, I happily found that the CPAN module HTTP::Proxy can come to the rescue. What I need to do is just writing several lines of Perl code using that module, running this script at the background as a local HTTP proxy server, and setting my web browser to simply use that. By doing this, my local proxy has a chance to monitor all the HTTP traffic between my browser and the Internet.

It's fun to see that my local proxy server can also use a remote proxy. so my local one then becomes a secondary proxy, no? ;D

The HTTP::Proxy module also supports logging internally, thus my code is even simpler:

use HTTP::Proxy ':log';
# $home is assumed to hold the directory where the log file should live
my $logfile = "$home/myproxy.log";
open my $log, '>>', $logfile or
        die "Can't open $logfile for appending: $!";
my $proxy = HTTP::Proxy->new(
        logmask => STATUS,
        logfh => $log,
);

The logmask parameter here controls what kind of things the proxy should record. the STATUS constant indicates only basic URL and response code will be logged. What I get in the log file is something like this:

[Fri Jan 13 17:25:42 2006] (1888) REQUEST: GET http://www.perl.com/
[Fri Jan 13 17:25:53 2006] (1888) RESPONSE: 200 OK
[Fri Jan 13 17:25:53 2006] (1888) REQUEST: HEAD http://www.google.com/mozilla/google.src
[Fri Jan 13 17:25:54 2006] (1888) RESPONSE: 200 OK
[Fri Jan 13 17:25:54 2006] (1888) REQUEST: GET http://www.perl.com/styles/main.css
[Fri Jan 13 17:25:55 2006] (1888) RESPONSE: 304 Not Modified ...

Hmm...very cute! However, HTTP::Proxy's builtin logging mechanism doesn't respect HTML titles. Thus I need to provide a user agent of my own:

package MyUA;
use HTTP::Proxy ':log';
use base 'LWP::UserAgent';
sub send_request {
        my ($self, $request) = @_;
        my $response;
        eval {
                $response = $self->SUPER::send_request( $request );
        };
        if ($@ and not $response) {
                return HTTP::Response->new(500, $@);
        }
        if ($response->is_success) {
                my $type = $response->header('content-type');
                if ($type and $type =~ m[text/html]i) {
                        if ($response->content =~ m[<title>\s*(.*?\S)\s*</title>]si) {
                                $proxy->log( STATUS, 'TITLE', $1);
                        }
                }
        }
        return $response;
}

Now we have HTML titles recorded down as well, as witnessed in my log file:

[Tue Jan 17 20:33:47 2006] (2484) REQUEST: GET http://perladvent.org/2004/20th/
[Tue Jan 17 20:33:50 2006] (2484) TITLE: Perl 2004 Advent Calendar: Filesys::Virtual
[Tue Jan 17 20:33:50 2006] (2484) RESPONSE: 200 OK

Then feed the customized user agent to my HTTP::Proxy instance I created earlier:

my $agent = MyUA->new(
        env_proxy => 1,
        timeout => 100,
);
$proxy->agent( $agent );

Finally, we enter an infinite loop, as every HTTP proxy server does:

while (1) {
        eval { $proxy->start(); };
        warn $@ if $@;
}

That's it!
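For the "analyse the history some time later" part, log lines in the format shown above could be parsed along these lines. This is only a sketch: parse_proxy_log is a made-up name, the number in parentheses is assumed to be a process ID, and each TITLE line is assumed to belong to the most recent GET request logged by the same process.

```perl
use strict;
use warnings;

# Parse lines such as
#   [Tue Jan 17 20:33:47 2006] (2484) REQUEST: GET http://...
#   [Tue Jan 17 20:33:50 2006] (2484) TITLE: Some title
# and return one { time, url, title } hash per titled page visit.
sub parse_proxy_log {
    my (@lines) = @_;
    my (%last_get, @visits);
    for my $line (@lines) {
        next unless $line =~ /^\[([^\]]+)\]\s+\((\d+)\)\s+(\w+):\s*(.*?)\s*$/;
        my ($time, $pid, $kind, $rest) = ($1, $2, $3, $4);
        if ($kind eq 'REQUEST' and $rest =~ /^GET\s+(\S+)/) {
            # Remember the latest GET per process until its title shows up.
            $last_get{$pid} = { time => $time, url => $1 };
        }
        elsif ($kind eq 'TITLE') {
            my $req = delete $last_get{$pid} or next;
            push @visits, { %$req, title => $rest };
        }
    }
    return @visits;
}

my @visits = parse_proxy_log(
    '[Tue Jan 17 20:33:47 2006] (2484) REQUEST: GET http://perladvent.org/2004/20th/',
    '[Tue Jan 17 20:33:50 2006] (2484) TITLE: Perl 2004 Advent Calendar: Filesys::Virtual',
    '[Tue Jan 17 20:33:50 2006] (2484) RESPONSE: 200 OK',
);
printf "%s | %s | %s\n", @{$_}{qw(time url title)} for @visits;
```

Requests that never get a TITLE line (images, stylesheets, HEAD requests) are simply dropped, which is probably what one wants for a reading history.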

It already works for me, but there're still several pitfalls in this solution:

  • Images won't display in MS Internet Explorer (Mozilla works fine, however)
  • It seems to me that HTTP::Proxy doesn't support forking by default so it leads to poor performance if I request multiple URLs simultaneously. (BTW, Is there a way to switch to a forking engine? I can't find a word in its POD docs.)
  • SSL connection doesn't work on my box.

Have fun!