NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report


Journal of dpavlin (4910)

Tuesday March 11, 2008
06:35 PM

irc-logger - memory augmentation for #irc

Initially created in 2006, this handy tool is best described by its original commit message:

IRC bot which replace human memory

Here is a quick run-down of the available features:

  • web archive with search
  • irc commands: last, grep/search, stat, poll/count
  • tags// in normal irc messages (tagcloud, filter by tag, export as RSS feed)
  • announce /me messages to Twitter (yes, lame, but that was a year ago)
  • tags are available as html links for embedding (in wikis)
  • RSS feed from messages with tags (also nice for embedding)
  • irssi log import (useful for recovery in case of failure of machine or service :-)
  • announce new messages from RSS feeds (nice for wiki changes, blog entries or commits)

It has grown quite a bit from the initial vision of recalling the last few messages on the web (and it does jump through some hoops to produce a nice web archive). Adding tags allowed easy recall of interesting topics, and in a way it now provides a central hub for different content connected to IRC.
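As an illustration of the tags// feature, here is a minimal sketch of how trailing tags could be split out of a message. The marker syntax and the function name are my assumptions for illustration, not the actual irc-logger code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch: split an IRC message into text and trailing tags,
# assuming tags follow a '//' marker at the end of the message.
sub extract_tags {
    my ($msg) = @_;
    my ( $text, $tags ) = split /\s*\/\/\s*/, $msg, 2;
    my @tags = defined $tags ? split( /\s+/, $tags ) : ();
    return ( $text, @tags );
}

my ( $text, @tags ) = extract_tags('new release is out // perl release');
print "text: $text\n";    # text: new release is out
print "tags: @tags\n";    # tags: perl release
```

Each tag can then be turned into an HTML link or fed into the tag cloud.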

It's written in Perl using POE, and it's probably not the best example of POE usage. It is also somewhat PostgreSQL-specific, but it works well for our small community on the #razmjenavjestina IRC channel. Since I have seen some interest in it, this blog post might serve as an announcement of its existence.

I will probably add some documentation to its wiki page and add real multi-channel support (most of the code is there, but the web archive needs filtering by channel). If you would like to /invite it to your channel, drop me a note.

Originally I wrote this a couple of days ago, but I have made some more progress since, and the code is no longer so ugly that I can't share it with other perlers, so here it is...

Sunday November 18, 2007
08:37 AM

CWMP and MDAP servers

After nine months of playing with Thomson ADSL modems I have two projects, both written in Perl, and both of which are, as far as I know, the first open source GPL implementations of their respective protocols.

MDAP

MDAP is a protocol used to issue commands to Thomson CPE devices (called ants) over multicast address 224.0.0.103 and port 3235, registered with IANA.

It's a very cool idea: you can connect as many devices as you have network ports or bandwidth for, and although they will all boot with the same IP address (which creates conflicts on an IP network), you can still send commands to each individual device using multicast.
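For illustration, sending a datagram to that multicast group from Perl could look like the sketch below. The payload is a made-up placeholder, not a real MDAP message:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::Multicast;    # from CPAN

# Placeholder payload -- the real MDAP message format is not shown here.
my $payload = "INFO\r\n";

my $sock = IO::Socket::Multicast->new( Proto => 'udp' )
    or die "socket: $!";
$sock->mcast_ttl(1);    # stay on the local network segment
$sock->mcast_send( $payload, '224.0.0.103:3235' )
    or die "send: $!";
```

Every device listening on the group receives the datagram, regardless of its (possibly conflicting) unicast IP address.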

Originally I developed it to flash multiple modems at once, but since then I have also added simple rules to change IP addresses and issue commands to devices. This essentially enables you to flash devices to some firmware version, then change the configuration a bit, and have your test lab ready in a few moments.

This project doesn't have a real project page yet, but you can take a look at source if you are interested...

This project also includes (but doesn't use yet) simple Perl BOOTP and TFTP servers, so in the end it will probably be a Perl-only solution for MDAP. If you just use the included scripts and documentation for setup, it will use the binary bootp and tftp servers, since this configuration has been in use for at least half a year and I consider it stable.

CWMP

The Perl CWMP server is essentially low-level support for the broken idea of communicating with devices over persistent-connection SOAP with invalid XML (the namespaces in some responses are simply invalid), known as TR-069.

This is a work in progress, and right now it's stable enough to work with multiple devices at once. In essence, it's a protocol-violating SOAP server implementing the persistent connection handling described in the TR-069 documentation (an empty POST, without even headers, as the first request, ehhh...).

The idea is to enable you, the user, to write Perl rules against CPE devices.

It's half-way there: it has a disk-based command queue for each device (which is also NFS-safe, which is nice if you want to run multiple servers) and persistent storage for each CPE's internal data tree, implemented using DBM::Deep or YAML. When used with YAML, it's a great way to understand the protocol. Not all methods are implemented yet, but I hope to have a full implementation by the end of January 2008.
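To show what the YAML-backed variant boils down to, here is a hedged sketch. The file layout, the serial number, and the parameter name are invented for illustration; the real perl-cwmp code is organized differently:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(DumpFile LoadFile);

# Hypothetical per-CPE data tree stored as one YAML file per device,
# keyed on the device serial number.
mkdir 'state' unless -d 'state';

sub store_path { my $serial = shift; return "state/$serial.yml" }

sub save_tree {
    my ( $serial, $tree ) = @_;
    DumpFile( store_path($serial), $tree );
}

sub load_tree {
    my $serial = shift;
    my $path   = store_path($serial);
    return -e $path ? LoadFile($path) : {};
}

my $tree = load_tree('CP0623AB1234');
$tree->{'InternetGatewayDevice.DeviceInfo.SoftwareVersion'} = '6.2.15.5';
save_tree( 'CP0623AB1234', $tree );
```

Because the state lives in plain YAML files, you can simply open one in an editor and watch the CPE's parameter tree grow as the protocol exchanges happen.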

Friday August 24, 2007
04:12 AM

Exhibit facet browsing

We have a few mp3 players which no longer work but are still under warranty. So the idea was to pick another device (which will hopefully work longer). However, on-line shops leave a lot to be desired if you just want to do quick filtering of data.

As a very fortunate accident, I stumbled upon Exhibit from the SIMILE project at MIT, which brought us such nice tools as Timeline and Potluck.

So, I scraped the web, converted the data to CSV, and tried to do something with it. In the process I revisited the problem of semi-structured data: while the data is separated into columns, a single column can hold a generic description, the player name, and all the characteristics.

So, what did I do? Well, I started with CPAN, and a few hours later I had a script which is rather good at parsing semi-structured CSV files. It supports the following:

  • guess the CSV delimiter on its own (using Text::CSV::Separator)
  • recognize 10 Kb and similar sizes and normalize them (using Number::Bytes::Human)
  • split comma (,) separated values within a single field
  • strip a common prefix from all values in one column
  • group values and produce additional properties in the data
  • generate a specified number of groups for numeric data, useful for price ranges
  • produce JSON output for Exhibit using JSON::Syck
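A rough sketch of how the first two bullets and the JSON output fit together. The real script's interface differs; the file name and column names here are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV::Separator qw(get_separator);    # guesses the delimiter
use Number::Bytes::Human qw(parse_bytes);      # '10 Kb' -> bytes
use JSON::Syck;

my $file = 'players.csv';    # hypothetical scraped data

# take the most likely delimiter candidate
my ($sep) = get_separator( path => $file );

open my $fh, '<', $file or die "$file: $!";
chomp( my $head = <$fh> );
my @header = split /\Q$sep\E/, $head;

my @items;
while (<$fh>) {
    chomp;
    my %row;
    @row{@header} = split /\Q$sep\E/, $_;

    # normalize human-readable sizes into plain numbers
    $row{size_bytes} = parse_bytes( $row{size} ) if $row{size};
    push @items, \%row;
}

# Exhibit consumes a JSON object with an 'items' array
print JSON::Syck::Dump( { items => \@items } );
```

The extra derived properties (size_bytes, price groups, and so on) are what make Exhibit's facet browsing useful on otherwise messy data.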

So how does it look?

In the end, it is very similar to the way Dabble DB parses your input. But I never actually had any luck importing data into Dabble DB, so this one works better for me :-)

This will probably evolve into a universal munger from CSV to an arbitrary hash structure. What would be a good name? Text::CSV::Mungler?

This is the first post in a series which will cover one hack a week on my blog. This will (hopefully) force me to write at least one post a week, and provide some historic trace of my work for later.

Sunday August 06, 2006
02:28 PM

Search::Estraier 0.07 now available on CPAN

After several months of testing and several new releases of Hyper Estraier with new features (the newest one is masking [excluding] of linked databases when searching), version 0.07 is finally ready.

I would really suggest that all current users upgrade to the latest version. It fixes problems with set_skip and vectors and has, well, set_mask.

Require an explicit version of this module when using new features, just to be safe that you are not running against an older version (it's the only sane behavior).
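In code, that advice is just:

```perl
# Dies at compile time if an older Search::Estraier is installed,
# before any call to a missing method (like set_mask) can bite at run time.
use Search::Estraier 0.07;
```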

Wednesday May 10, 2006
12:26 PM

Search::Estraier 0.06 covers whole P2P API

I'm somewhat proud to announce that the new version of Search::Estraier now supports master node commands, thus bringing this implementation in sync with the P2P documentation of the current Hyper Estraier version 1.2.5.
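For anyone new to the module, talking to a node over the P2P API follows the module's documented synopsis roughly like this. The node URL and credentials below are placeholders:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Search::Estraier;

# Connect to a (hypothetical) local node master
my $node = new Search::Estraier::Node;
$node->set_url('http://localhost:1978/node/test');
$node->set_auth( 'admin', 'admin' );

# Build a search condition and run it against the node
my $cond = new Search::Estraier::Condition;
$cond->set_phrase('rainbow AND lullaby');

my $nres = $node->search( $cond, 0 );
if ( defined $nres ) {
    for my $i ( 0 .. $nres->doc_num - 1 ) {
        my $rdoc = $nres->get_doc($i);
        print $rdoc->attr('@uri'), "\n";
    }
}
```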

Saturday January 22, 2005
01:20 PM

Biblio::Isis

As I have figured out by now, people don't read journals with a single post in them.

I was too lazy to move my Perl-related writing here (mostly because I write it off-line), but I should note that I actually wrote a Perl-only replacement for OpenIsis called Biblio::Isis. It's on CPAN.

Thursday March 18, 2004
01:41 PM

Strict(ly) perl journal

So, I have gathered enough courage to actually write my first post on use Perl;. I will try to post only Perl-related items here (and reference them from my primary weblog).
For a start, I'm writing a Perl-only module to read ISIS. It's a fun project which started out of frustration with the OpenIsis Perl bindings. It reads Isis files correctly, but not IsisMarc, so I'm not releasing it yet. Is there any interest in it, and what would be a proper CPAN namespace for it?