jozef (8299)

jozef
http://jozef.kutej.net/
Jabber: jozef@kutej.net

Journal of jozef (8299)

Friday August 28, 2009
02:41 PM

YAPC::EU::2009 feedback - obrigado!

Some time has passed since YAPC::EU::2009. This was my third YAPC visit and the first one where I gave a talk. It was nice, it was fun, and there were a lot of people. :) If you haven't been to a YAPC yet then, yes, it's worth it! It's kind of strange to find out that the CPAN authors are made of flesh and bones and that it's possible to see them, and even touch them, just not too much please! :D

So what was there to see? The first talk that I attended was about AI::CBR from Darko. His module can be used to add some more intelligence to, for example, search results. I liked it as it was a really practical one. As the topic of this YAPC::EU was "Corporate Perl", there was a lot of talk about how Perl is used in corporations and what problems Perl developers face. Among the new things, there were talks about MooseX::Declare (by pdcawley), KiokuDB (by nothingmuch), Regexp::Grammars (by Damian), Perl 5.10.1 (by rgs), ... What else did I learn? That it's possible to do speech recognition (by Thomas Netousek), how to write reusable code - 1st rule: don't write it but reuse (from domm), how to Test::Regexp (by Abigail), how to remotely control volunteers (by Karen) and that it's not so bad to give a talk at YAPC...

Regarding my talk, I was afraid of three things - the camera, Damian and the audience. The cameras were missing, Damian didn't show up, not too many people showed up and, well, the audience was great!

As I said, I was worried about the recording of my talk: if the talk was not good, it would be out there on the net forever. And the internet never forgets... On the other hand, after I found out that there were no recordings, I was sad that I would not be able to watch my talk later on to see how it went.

Damian doesn't bite ..., but it was a surprise to see that he had marked my talk in his personal schedule.

There were ~20 people at my talk. At the same time there was a talk from Paul Fenwick, a professional trainer and speaker, with ~120 attendees... Thanks to "my" audience, I enjoyed my talk. I got some interesting questions and direct feedback after the talk. Special thanks to andy.sh for his dependencies question and the suggestion to generate a Makefile. I'm keeping this in the back of my mind and I think I'll really use this idea.

What was the talk about? It was about how to do statically generated content and what can be done with it. I once read the Samba documentation for developers, which says that a major part of the C code is generated, and that once they started to do code generation they just could not stop. I have fallen into the same trap with web content generation. I've tried it and I cannot stop until I find out where the limits are. :) The idea is not new, it has been here since the beginning of the web. MT does it, WebGUI does it, my friend Emmanuel does it :) but I still think it's interesting to talk about it.

During my 40-minute talk I kind of ran out of time. I tried to show too many things in too much detail. My 40 YAPC::EU minutes are gone, but there is enough time and space on this blog. So I've decided to create a blog series, and here are the future titles:

1. make mehappy   # make for fun and profit
2. a hook for you
3. scraping my self
4. less can be more
5. dôveruj ale preveruj   # trust, but verify
6. feed us back
7. XSLT hammer
8. one Apache child must be enough for everyone
9. 12MB XML -> 2x5k HTML pages   # the way of elephant

11:05 AM

use feature 'state'; # for caching

What about using the fact that a variable declared with state is never reinitialized for caching? Let's say the path, filename and file content never change, no matter how many calls are made to the function that needs them. Then:

state $filename = File::Spec->catfile('path', 'path', 'filename.txt');

will cache the File::Spec call. Unfortunately this doesn't work:

state @content = read_file($filename);
# => Initialization of state variables in list context currently forbidden

In this case the working "cached version" is more verbose:

state @content; @content = read_file($filename) if not @content;

The only problem with this persistent state is that it is persistent :) so the memory of the variable is freed only at the end of the program (or if undef is assigned, but then the reinitialization doesn't work, and there is probably a better way to do caching than with state). Should be OK for short-running scripts.
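One way around the list-context restriction, sketched here as an assumption rather than as the recommended idiom, is to keep the list behind a scalar state variable holding an array reference. `slow_read` is a hypothetical stand-in for `read_file`, and the counter only exists to show that the expensive call runs once. (Perl 5.28 and later allow `state @content = ...` directly.)

```perl
use strict;
use warnings;
use feature 'state';

my $reads = 0;    # counts how often the expensive call really runs

# Hypothetical stand-in for read_file(); returns the "lines" of a file.
sub slow_read {
    $reads++;
    return ("first line\n", "second line\n");
}

sub content {
    # Scalar state initialization is allowed, so the list is wrapped in
    # an array reference; the right-hand side runs only on the first call.
    state $content = [ slow_read() ];
    return @$content;
}

my @first  = content();
my @second = content();
print "lines: ", scalar(@second), ", reads: $reads\n";
```

Unlike the `if not @content` variant, this does not re-read the file on every call when the file happens to be empty.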

Thursday July 30, 2009
09:44 AM

>500k lines of Perl code and >2.4k packages

Once upon a time there were >500k lines of Perl code and >2.4k non-CPAN packages in a huge SVN repository. Sounds like fun? Yes, especially for a newcomer :-)

Now? Even more lines and even more packages in an even huger SVN repository ;-) but also a tag and daily build system that uses CPAN::Mini::Inject to insert tarballs into a local CPAN::Mini mirror, so that they are installable via the CPAN shell. In addition, it's possible to browse them using CPAN::Mini::Webserver.

How? http://github.com/jozef/HTTP-DAV-Browse/, which allows walking through SVN WebDAV (check examples/hdb-build-tarballs for the complete script), http://github.com/jozef/Build-Daily/ to allow daily/svn_revision based versions, and http://github.com/AndyA/CPAN--Mini--Inject/, which now has an option to index the *.pm files of tarballs (--discover-packages).

Future? Feeds for everyone about failed builds, smoke testing, automated Debian package builds or ??? Let's see where the fantasy will take us...

Sunday July 19, 2009
04:59 PM

patching Debian /usr/share/perl/5.10.0/CPAN.pm

apt-get install libtest-exception-perl libfile-find-rule-perl libcarp-clan-perl
apt-get install libjson-xs-perl libclass-accessor-perl libwww-perl
cpan -i dbedia::Debian
mkdir /var/cache/apt/apt-pm
apt-pm update
# try:
# apt-pm find Test::More
# will output:
# libtest-simple-perl/0.62-1: /usr/share/perl5/Test/More.pm 0.62
# libtest-simple-perl/0.88+dfsg-1: /usr/share/perl5/Test/More.pm 0.88
# libtest-simple-perl/0.90-1: /usr/share/perl5/Test/More.pm 0.90
# perl-modules/5.10.0-24: /usr/share/perl/5.10.0/Test/More.pm 0.72
# libtest-simple-perl/0.80-1: /usr/share/perl5/Test/More.pm 0.80
cd /usr/share/perl/5.10.0/
wget http://cpansearch.perl.org/src/JKUTEJ/dbedia-Debian-0.02/examples/CPAN.pm.patch
patch -p1 < CPAN.pm.patch

So what? With this patch the CPAN shell will install the Debian package of a module prerequisite if one is available, also taking the required version into account. Then it is possible to do `cpan -i Xacobeo` and most of the dependencies (all except Gtk2::SourceView, which is not packaged) will be installed directly from Debian packages.

`apt-pm` has all the Debian .pm files indexed, so feel free to ask `apt-pm find SVN::Core`, which will return 'libsvn-perl/1.6.1dfsg-1', or `apt-pm find Pidgin`, which will tell you 'pidgin/2.4.3-4lenny2'. :)
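To illustrate what "taking the required version into account" can mean, here is a minimal sketch that filters `apt-pm find` style output by a minimum module version. It is not how dbedia::Debian or the patch do it; `candidates` is a hypothetical helper, and the plain numeric comparison is an assumption that only holds for simple decimal versions.

```perl
use strict;
use warnings;

# Sample lines copied from the `apt-pm find Test::More` output above.
my @lines = (
    'libtest-simple-perl/0.62-1: /usr/share/perl5/Test/More.pm 0.62',
    'libtest-simple-perl/0.88+dfsg-1: /usr/share/perl5/Test/More.pm 0.88',
    'libtest-simple-perl/0.90-1: /usr/share/perl5/Test/More.pm 0.90',
    'perl-modules/5.10.0-24: /usr/share/perl/5.10.0/Test/More.pm 0.72',
    'libtest-simple-perl/0.80-1: /usr/share/perl5/Test/More.pm 0.80',
);

# Hypothetical helper: return the "package/debversion" entries whose
# module version is at least $min (plain numeric comparison).
sub candidates {
    my ($min, @out) = @_;
    my @found;
    for my $line (@out) {
        # line format: package/debversion: /path/to/Module.pm version
        my ($pkg, $path, $version) = $line =~ m{^(\S+):\s+(\S+)\s+(\S+)$}
            or next;
        push @found, $pkg if $version >= $min;
    }
    return @found;
}

my @ok = candidates(0.88, @lines);
print "$_\n" for @ok;
```

Asking for Test::More 0.88 or newer leaves only the two libtest-simple-perl packages that ship 0.88 and 0.90.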

09:24 AM

i18n of wikipedia links

The day before yesterday (Friday) we went for some beer (ba.pm social meeting) and we spoke a lot. ;-)

Some time ago I asked potyl to make some French translations for me. The translations also contained Wikipedia links, so that it's possible to point to the French Wikipedia (instead of the English one for English translations). So on Friday potyl told me that link i18n is a task for a program and not for a human. For sure it IS MACHINE WORK! I should have known, but sometimes "people" don't see the obvious. ;-)

update wikipedia links script

I wrote that script on the train on my way back to Vienna, and it took no more than 1h. It works for en to any language.

Basically it scrapes wikipedia.org. For my ~70 links that should be fine, but before you do the same, read "Why not just retrieve data from wikipedia.org at runtime?" - robots have a rate limit of 1 req/s. Wikipedia also offers an untransformed raw database format or database dumps for the users with the "most interest".
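The core of such a script can be sketched as follows. This is not the script linked above, just a minimal illustration: `translated_link` is a hypothetical helper, the sample HTML is made up, and the markup the regex expects is an assumption about how the interlanguage links look in the page.

```perl
use strict;
use warnings;

# Hypothetical helper: find the link to the $lang edition of an article
# in the HTML of its English page. The markup matched here is assumed.
sub translated_link {
    my ($html, $lang) = @_;
    my ($url) = $html
        =~ m{href="((?:https?:)?//\Q$lang\E\.wikipedia\.org/wiki/[^"]+)"};
    return $url;    # undef when there is no link for that language
}

# Made-up sample of an interlanguage link, not real Wikipedia output.
my $html = '<li><a href="http://fr.wikipedia.org/wiki/Exemple">French</a></li>';

my $fr = translated_link($html, 'fr');
my $de = translated_link($html, 'de');
print defined $fr ? "$fr\n" : "no fr link\n";
print defined $de ? "$de\n" : "no de link\n";
```

When the pages are really fetched from wikipedia.org, a `sleep 1` between requests keeps the loop within the 1 req/s limit mentioned above.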

Wednesday July 08, 2009
04:51 AM

my $YAPC::EU::2009::talk;

my $YAPC::EU::2009::talk = 'tested yesterday';
Slides=49, Tests=1, 40 wallclock minutes

Yesterday the Vienna.pm meeting was really full of talks. 3 YAPC talks from Aurum, Pepl and me + a lightning talk made by Daxim. Quite an exhausting evening.

I'll also use this blog entry as an invitation for you to come and listen to my YAPC talk about static content. Yesterday I found out that I had prepared too many slides and will have to remove the less interesting ones. Then the talk will be full of practical examples about how things are made. The examples will show all the features of Bratislava.pm.org. The page is open source, so feel free to browse the repository in the meantime, and see you on Wednesday, 5 August 2009 11:55 in Lisbon. :-)

Tuesday July 07, 2009
09:40 AM

simple Debian repository

The folder structure is as follows:

debian-simple-repo
|-- Makefile
|-- Release.conf
`-- unstable # files here generated
   |-- Packages
   |-- Packages.bz2
   |-- Packages.gz
   |-- Release
   |-- Release.gpg
   |-- Sources
   |-- Sources.bz2
   `-- Sources.gz

Makefile

# recipe lines must start with a tab, not spaces
.PHONY: all repository clean

all: repository

repository:
	dpkg-scanpackages unstable /dev/null > unstable/Packages
	dpkg-scansources unstable /dev/null > unstable/Sources

	bzip2 -c9 unstable/Packages > unstable/Packages.bz2
	gzip -c9 unstable/Packages > unstable/Packages.gz
	bzip2 -c9 unstable/Sources > unstable/Sources.bz2
	gzip -c9 unstable/Sources > unstable/Sources.gz

	apt-ftparchive -c=Release.conf release unstable > unstable/Release
	-rm unstable/Release.gpg
	gpg -abs -o unstable/Release.gpg unstable/Release

clean:
	rm -f unstable/Packages* unstable/Sources* unstable/Release*

Release.conf

APT::FTPArchive::Release::Origin "your@email";
APT::FTPArchive::Release::Label "Test repository";
APT::FTPArchive::Release::Suite "unstable";
APT::FTPArchive::Release::Architectures "i386 source";
APT::FTPArchive::Release::Components "main";

generation

All *.deb, *.dsc, *.diff.gz, *.changes and *.orig.tar.gz files have to be copied to the unstable/ folder and `make` executed. This will generate the unstable/Packages*, unstable/Sources* and unstable/Release* files. Then just copy the whole folder as-is to a web/ftp server.

usage

The /etc/apt/sources.list file on a Debian machine has to be updated and the gpg key has to be added through `apt-key add`.

sources.list

deb http://your.hostname/some/folder/ unstable/
deb-src http://your.hostname/some/folder/ unstable/

beyond simple

For a more advanced and more distribution-like repository use reprepro. "Setting up your own APT repository with upload support" is a good introduction to it.

Sunday July 05, 2009
08:59 AM

Perl6 spec in PDF

If anyone is interested, here (1.3M) is the Perl6 specification in PDF (a one-page HTML version is also available). I've generated those from the http://svn.pugscode.org/pugs/docs/Perl6/Spec/ Pod files.

The PDF has 407 pages, which seems quite huge to me. How many pages will the documentation have if the spec is 400+?

Wednesday June 10, 2009
04:21 AM

Need new versions? Use Debian!

This is not a joke, even though for many people Debian stands for stable and working, but quite old, versions. The important thing is that Debian stable stays stable for a few years. Ageing versions are the trade-off for the least amount of surprises. But actually we all need new/recent versions. Not for everything, but for the SW/packages that we really use. So what does Debian have to offer in this case?

  1. install from source
  2. back-porting
  3. packaging on-your-own
  4. mixing releases

1. install from source
Well, there is not too much to discuss here: everyone can use the CPAN shell, compile Apache or use `./configure && make && make install` like in any other Linux distribution.

2. back-porting
There are ready-made backports just for the taking. There are some tutorials on the web on how to do it yourself, and plenty more info; just use a search engine. Basically the process involves adding a sources line to /etc/apt/sources.list, something like "deb-src http://ftp.cz.debian.org/debian/ testing main non-free contrib", and running `apt-get source libmoose-perl`. If you are lucky, `cd libmoose-perl-0.74 && debuild` will create a .deb with Moose version 0.74. If not, you will have to back-port some other dependency that is required.

3. packaging on-your-own
There are a couple of ways to package new things. If the CPAN module is already packaged in some older version, a good start is to get the source of that old version and reuse its debian/ folder by copying it to the extracted recent version. In most cases, updating debian/changelog and the new dependencies in debian/control will be all the steps needed for packaging. If the module is not packaged, you can try `dh-make-perl --cpan Moose`. This script will download the recent Moose and will do the best it can to prepare all the files in the debian/ folder. An even better tool is offered by CPANPLUS + CPANPLUS::Dist::Deb. The command `cpan2dist --verbose --format CPANPLUS::Dist::Deb Moose` will create cpan-libmoose-perl with recent dependencies "compatible" with the system Perl. More info and an automated repository can be found at http://debian.pkgs.cpan.org/.

4. mixing releases
Mixing releases is a way to install only a minimal subset of testing/unstable packages on the stable release, as everyone has different needs and tastes. How? Set up /etc/apt/sources.list like this:

# lenny (stable)
deb http://ftp.cz.debian.org/debian/ lenny main non-free contrib
deb-src http://ftp.cz.debian.org/debian/ lenny main non-free contrib
deb http://security.debian.org/ lenny/updates main contrib non-free
deb-src http://security.debian.org/ lenny/updates main contrib non-free
# squeeze (testing)
deb http://ftp.cz.debian.org/debian/ testing main non-free contrib
deb-src http://ftp.cz.debian.org/debian/ testing main non-free contrib
# sid (unstable)
deb http://ftp.cz.debian.org/debian/ unstable main non-free contrib
#deb-src http://ftp.cz.debian.org/debian/ unstable main non-free contrib

and /etc/apt/preferences like:

Package: *
Pin: release a=stable
Pin-Priority: 800

Package: *
Pin: release a=testing
Pin-Priority: 700

Package: *
Pin: release a=unstable
Pin-Priority: 600

Now even when running `apt-get update && apt-get dist-upgrade` no new versions from testing will be installed. To install a new version from testing || unstable use `apt-get install -t testing libmoose-perl`. Here is what Moose will bring along with it:

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
libalgorithm-c3-perl libclass-c3-perl libclass-mop-perl libdata-optlist-perl libdevel-globaldestruction-perl liblist-moreutils-perl
libmro-compat-perl libparams-util-perl libscope-guard-perl libsub-exporter-perl libsub-install-perl libsub-name-perl
libsub-uplevel-perl libtest-exception-perl perl perl-base perl-modules
Suggested packages:
perl-doc libterm-readline-gnu-perl libterm-readline-perl-perl
Recommended packages:
libclass-c3-xs-perl netbase
The following NEW packages will be installed:
libalgorithm-c3-perl libclass-c3-perl libclass-mop-perl libdata-optlist-perl libdevel-globaldestruction-perl liblist-moreutils-perl
libmoose-perl libmro-compat-perl libparams-util-perl libscope-guard-perl libsub-exporter-perl libsub-install-perl libsub-name-perl
libsub-uplevel-perl libtest-exception-perl
The following packages will be upgraded:
perl perl-base perl-modules
3 upgraded, 15 newly installed, 0 to remove and 62 not upgraded.
Need to get 9617kB of archives.
After this operation, 4431kB of additional disk space will be used.
Do you want to continue [Y/n]?
...
Setting up perl-modules (5.10.0-22) ...
Setting up perl (5.10.0-22) ...
Setting up libalgorithm-c3-perl (0.08-1) ...
Setting up libsub-uplevel-perl (0.2002-1) ...
Setting up libtest-exception-perl (0.27-2) ...
Setting up libclass-c3-perl (0.21-1) ...
Setting up libsub-name-perl (0.04-1) ...
Setting up libmro-compat-perl (0.10-1) ...
Setting up libscope-guard-perl (0.03-2) ...
Setting up libsub-install-perl (0.924-2) ...
Setting up libparams-util-perl (0.38-2) ...
Setting up libdata-optlist-perl (0.104-1) ...
Setting up libsub-exporter-perl (0.981-1) ...
Setting up libdevel-globaldestruction-perl (0.02-1) ...
Setting up libclass-mop-perl (0.81-1) ...
Setting up liblist-moreutils-perl (0.22-1+b1) ...
Setting up libmoose-perl (0.74-1) ...

If you want to go even further then just do `apt-get install -t unstable libmoose-perl`:

...
Setting up libclass-mop-perl (0.85-1) ...
Setting up libmoose-perl (0.80-1) ...

Cool, isn't it? And what if something goes wrong? (What could possibly go wrong?) You can check what in the system comes from testing/unstable simply via `apt-show-versions | grep -E '/(testing|unstable)'`:

libalgorithm-c3-perl/testing uptodate 0.08-1
libclass-c3-perl/testing uptodate 0.21-1
libclass-mop-perl/unstable uptodate 0.85-1
libdata-optlist-perl/testing uptodate 0.104-1
libdevel-globaldestruction-perl/testing uptodate 0.02-1
libmoose-perl/unstable uptodate 0.80-1
libmro-compat-perl/testing uptodate 0.10-1
libparams-util-perl/testing uptodate 0.38-2
libsub-exporter-perl/testing uptodate 0.981-1
libsub-uplevel-perl/testing uptodate 0.2002-1
libtest-exception-perl/testing uptodate 0.27-2
perl/testing uptodate 5.10.0-22
perl-base/testing uptodate 5.10.0-22
perl-modules/testing uptodate 5.10.0-22

Do you want your Moose 0.74 back? Just do `apt-get install libmoose-perl/testing`. And it is possible to exercise this sort of "fun" on your own internal company repositories, having stable/testing/unstable for your own projects, with a possible way back.

Does anyone still think that the CPAN shell must be enough for everyone? ;-) It's good for many, but when it comes to bigger projects and the maintenance of many servers, it's no longer enough.
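With many packages the grep output gets long, so it can help to group it by release. A minimal sketch, assuming the `package/release status version` line format shown above; `by_release` is a hypothetical helper and the sample lines are copied from that output.

```perl
use strict;
use warnings;

# A few lines copied from the `apt-show-versions` output above.
my @lines = (
    'libalgorithm-c3-perl/testing uptodate 0.08-1',
    'libclass-mop-perl/unstable uptodate 0.85-1',
    'libmoose-perl/unstable uptodate 0.80-1',
    'perl/testing uptodate 5.10.0-22',
);

# Hypothetical helper: map release name => list of "package version".
sub by_release {
    my %seen;
    for my $line (@_) {
        my ($pkg, $release, $version) =
            $line =~ m{^(\S+)/(\S+)\s+\S+\s+(\S+)$} or next;
        push @{ $seen{$release} }, "$pkg $version";
    }
    return %seen;
}

my %from = by_release(@lines);
for my $release (sort keys %from) {
    print "$release:\n";
    print "  $_\n" for @{ $from{$release} };
}
```

In real use the lines would come from `apt-show-versions` on standard input instead of the hard-coded sample.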

Saturday June 06, 2009
05:15 PM

opl-perl

opl-perl => wtf? well, it should stand for opt perl perl :-)

  • Perl 5.10.0-22 for Lenny
  • can be found in /opt/perl/bin/perl
  • /usr/share/t/* experiment

The goal is to have Perl+modules from Squeeze (Debian/testing) available in Lenny (Debian/stable) without touching the system /usr/bin/perl.

Ingredients:

  • 1 Debian Lenny
  • working internet connection
  • piece of curiosity
  • ~5 minutes

Procedure:

gpg --recv-key F80BD927
gpg --fingerprint --list-key F80BD927
# check
# pub 1024D/F80BD927 2008-09-02 [expires: 2018-08-31]
# Key fingerprint = 9F84 0B8D 193E 2052 9343 A470 0B43 A050 F80B D927
gpg --armor --export F80BD927 | sudo apt-key add -

sudo tee -a /etc/apt/sources.list > /dev/null <<__END__
deb http://bratislava.pm.org/debrepo/opl-perl/ unstable/
deb-src http://bratislava.pm.org/debrepo/opl-perl/ unstable/
# needed only when packaging
#deb http://bratislava.pm.org/debrepo/opl-pkg/ unstable/
#deb-src http://bratislava.pm.org/debrepo/opl-pkg/ unstable/
__END__

sudo apt-get update
apt-cache search opl- | perl -lane 'print $F[0]' | xargs sudo apt-get install -f
/opt/perl/bin/perl -V

Comments:

Did I say it's an experiment? Or a proof that it's possible, and a playground. :-) For example, check out the /usr/share/t/ folder after installation. I've set up debhelper to copy the whole content of the t/ folder of each package there. It seems that it's not a good idea, as many tests depend on having files outside the t/ folder, look for t/some.pm, or look for .pm files, as for example the pod.t test that is used everywhere. Besides, there is Module::Build with its dependencies, so it's a solid base for installing all the rest of CPAN via the CPAN shell ;-)