Journal of Kake (3534)

Wednesday February 26, 2003
07:03 PM

Plugin fever.

Plugins are cool. File::Find::Rule has them. Jerakeen likes them so much he wrote Bot::BasicBot::Pluggable.

CGI::Wiki is going to have plugins, too. 0.20 (which just escaped, some minutes ago) has (intentionally naive) metadata support, like so:

$wiki->write_node( "Reun Thai", "A restaurant", undef,
                   { postcode => "W6 9PL",
                     category => [ "Thai Food", "Restaurant", "Hammersmith" ] } );
@nodes = $wiki->list_nodes_by_metadata( metadata_type  => "category",
                                        metadata_value => "Pub" );

(Earle has been an absolute star, testing out pre-releases and encouraging me to get around to adding this stuff.)

Plugin thoughts, as posted to the grubstreet list:

use CGI::Wiki;
use CGI::Wiki::Plugin::Location;

my $locator = CGI::Wiki::Plugin::Location->new;
my $wiki = CGI::Wiki->new;
$wiki->register_plugin($locator);

$wiki->write_node( "Jerusalem Tavern", "A good pub", $checksum,
                   { os_x => 531674, os_y => 181950 } );

# Just retrieve the co-ordinates.
my ( $x, $y ) = $locator->coordinates( node => "Jerusalem Tavern" );

# Find the distance between two nodes.
my $distance = $locator->distance( from => "Jerusalem Tavern",
                                   to   => "Calthorpe Arms" );

# Find the nearest five other things that our wiki knows about.
my @others = $locator->find_nearest( node   => "Jerusalem Tavern",
                                     number => 5 );

The way the plugins will get hold of this data is by providing an on_write method that'll get called every time $wiki->write_node is called, with arguments like so:

$plugin->on_write( node     => $node_name,
                   version  => $version_number,
                   content  => $content,
                   checksum => $checksum,
                   metadata => \%user_defined_metadata );

This will happen after the node data is all written, but before any lock is released. The user-defined metadata will already have been stored in the backend but it is available here for you to do what you will with it.
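A minimal sketch of how that hook could fit together, assuming nothing about the real CGI::Wiki internals. My::Wiki and My::Plugin::Location are made-up stand-ins that keep everything in memory, there is no locking, and the Calthorpe Arms co-ordinates are invented so that the arithmetic is easy to check.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a location plugin; not the real plugin API.
package My::Plugin::Location;

sub new {
    my $class = shift;
    return bless { coords => {} }, $class;
}

# Called by the wiki after each write, with the named arguments listed above.
sub on_write {
    my ( $self, %args ) = @_;
    my $meta = $args{metadata} || {};
    return unless defined $meta->{os_x} && defined $meta->{os_y};
    $self->{coords}{ $args{node} } = [ $meta->{os_x}, $meta->{os_y} ];
    return 1;
}

sub coordinates {
    my ( $self, %args ) = @_;
    my $pair = $self->{coords}{ $args{node} } or return;
    return @$pair;
}

# Straight-line distance, in the same units as the co-ordinates.
sub distance {
    my ( $self, %args ) = @_;
    my @from = $self->coordinates( node => $args{from} );
    my @to   = $self->coordinates( node => $args{to} );
    return unless @from && @to;
    return sqrt( ( $from[0] - $to[0] )**2 + ( $from[1] - $to[1] )**2 );
}

# Hypothetical in-memory wiki that only knows how to call plugins.
package My::Wiki;

sub new {
    my $class = shift;
    return bless { plugins => [], version => {} }, $class;
}

sub register_plugin {
    my ( $self, $plugin ) = @_;
    push @{ $self->{plugins} }, $plugin;
    return 1;
}

sub write_node {
    my ( $self, $node, $content, $checksum, $metadata ) = @_;
    my $version = ++$self->{version}{$node};
    # A real store would write the node and its metadata to the backend
    # here, holding a lock until the plugins have run.
    for my $plugin ( @{ $self->{plugins} } ) {
        next unless $plugin->can('on_write');
        $plugin->on_write(
            node     => $node,
            version  => $version,
            content  => $content,
            checksum => $checksum,
            metadata => $metadata || {},
        );
    }
    return 1;
}

package main;

my $locator = My::Plugin::Location->new;
my $wiki    = My::Wiki->new;
$wiki->register_plugin($locator);

$wiki->write_node( "Jerusalem Tavern", "A good pub", undef,
                   { os_x => 531674, os_y => 181950 } );
# Invented co-ordinates: 300 east and 400 north of the Jerusalem Tavern.
$wiki->write_node( "Calthorpe Arms", "Another pub", undef,
                   { os_x => 531974, os_y => 182350 } );

my $distance = $locator->distance( from => "Jerusalem Tavern",
                                   to   => "Calthorpe Arms" );
printf "%.0f\n", $distance;
```

The plugin never asks the wiki for anything. It just keeps whatever it is handed in on_write, which is the point of the design.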

A plugin named, for example, CGI::Wiki::Plugin::Foo::Bar, will have free read-write access to any/all tables in the wiki's storage backend named like plugin_foo_bar*. For non-database backends, if anyone ever writes one, there can be a similar namespace protection. So the CGI::Wiki::Plugin::Location plugin might store all its doo-dads in the 'plugin_location_os_coords' table, for example. This does mean that plugin authors have to care about differences between databases, but I'm strongly disinclined to try and invent a database-independent layer to sit on top of CGI::Wiki::Store::Database and cope with all possible imaginable circumstances.
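The naming rule can be sketched in a few lines. The function names plugin_table_prefix and owns_table are made up for illustration, and the check is a plain prefix match, which is only one possible reading of "named like plugin_foo_bar*".

```perl
use strict;
use warnings;

# Turn a plugin class name into the table prefix it would own,
# e.g. CGI::Wiki::Plugin::Foo::Bar => plugin_foo_bar.
sub plugin_table_prefix {
    my ($class) = @_;
    ( my $suffix = $class ) =~ s/^CGI::Wiki::Plugin:://
        or die "$class is not in the CGI::Wiki::Plugin namespace\n";
    $suffix =~ s/::/_/g;
    return "plugin_" . lc $suffix;
}

# True if $table falls inside the namespace belonging to $class.
sub owns_table {
    my ( $class, $table ) = @_;
    my $prefix = plugin_table_prefix($class);
    return index( $table, $prefix ) == 0 ? 1 : 0;
}

print plugin_table_prefix("CGI::Wiki::Plugin::Foo::Bar"), "\n";
print owns_table( "CGI::Wiki::Plugin::Location",
                  "plugin_location_os_coords" ), "\n";
```

One wrinkle with a plain prefix match: a plugin called CGI::Wiki::Plugin::Foo would also match tables belonging to CGI::Wiki::Plugin::Foo::Bar. The scheme as described doesn't say how that would be resolved.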

Another plugin that I've put a small amount of thought into is one to cope with a hierarchy of categories.

And can someone tell me which tags to use to put code in so I don't have to put <br /> everywhere please? :)

Tuesday February 04, 2003
07:31 AM

Writing the code was the easy part

Now I have to work out what to call it.

I've done a fair bit of work on a usemod-style formatter for CGI::Wiki, and I've got something that would make a decent 0.01 release (when/if I get a couple of patches applied (no, this is not a nag, just an excuse for being so slow about getting it out)). As part of that, I needed to do something like URI::Find, but which was prepared for the URIs in the text to be wrapped in delimiters and to contain an optional title, since usemod formatting allows marking up external URLs like so:

[http://use.perl.org/ the use.perl site]

which will be turned into a link to http://use.perl.org/ with the text "the use.perl site".
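For a rough idea of the transformation, here is a hand-rolled regex sketch. It is not the module's code or interface: the function name link_delimited_uris is invented, the pattern only recognises http, https and ftp URIs inside square brackets, and it does no HTML escaping, which real code would need.

```perl
use strict;
use warnings;

# Replace [URI optional title] with an HTML link.
# With no title, the URI itself is used as the link text.
sub link_delimited_uris {
    my ($text) = @_;
    $text =~ s{
        \[                               # opening delimiter
        ( (?:https?|ftp)://[^\s\]]+ )    # the URI, up to whitespace or ]
        (?: \s+ ( [^\]]+ ) )?            # optional title
        \]                               # closing delimiter
    }{
        my ( $uri, $title ) = ( $1, $2 );
        $title = $uri unless defined $title;
        qq{<a href="$uri">$title</a>};
    }gex;
    return $text;
}

print link_delimited_uris(
    "See [http://use.perl.org/ the use.perl site] for more."
), "\n";
```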

So, the code exists — URI::Find::Wrapped 0.01 — but I hate the name. It already confused a couple of people on IRC who assumed it had something to do with the way certain MUAs, newsreaders, etc, linewrap URIs if they're too long. It's got to go. But what do I call it instead?

Suggestions so far include:

  • URI::Find::Delimited — possibly my favourite so far.
  • URI::Find::Bracketed — also nice. Mark pointed out that people might want to use things other than brackets as delimiters, but maybe that should be discouraged, in which case using this name would be a good thing.
  • URI::Find::Encapsulated — I'm still thinking about this one.
  • URI::Find::Engated — it's a lovely word, but maybe a little obscure.
  • URI::Find::Wiki — I don't like this, because first of all it doesn't mean anything to people who don't know what a wiki is, and who's to say they won't have use for this code; and secondly the other thing I plan to do with this module (part of a content management thing at work) isn't a wiki.
  • URI::Find::Markedup — does what it says on the tin, but oh, if only “markedup” was actually a single word.
  • URI::Find::Markup — to me this sounds like something focused on finding markup rather than on finding URIs, and my module is definitely focused on the latter.
  • URI::Find::Embedded — I can see where this one is coming from, since the URIs are embedded within their delimiters, but I think “embedded” has the same problem as “wrapped” in that it means too many things.

Help?

Sunday January 05, 2003
06:54 PM

Backwards (in)compatibility

I should have known this was going to bite me in the arse at some point. I needed to add another table to the database schema for CGI::Wiki, to store details of which nodes link to other nodes, so we can do backlinks properly (ie, find out all nodes which link to a given node).

This means anyone upgrading from a pre-0.15 version needs to re-run the setup script on their database, so they have the extra table. If they don't, then their code will complain and die when they try to run it on an existing database.

Unfortunately the setup scripts distributed with pre-0.15 versions of CGI::Wiki have the nasty side-effect of deleting all your data if run on a pre-existing database. This is fixed in 0.15, but it means you really have to be sure which versions of the modules you're running the scripts with, if you're trying to fix up your existing database to work with the latest CGI::Wiki.
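The non-destructive approach boils down to "create what is missing, touch nothing else". A minimal sketch of that idea follows. The table definitions are invented for illustration and are not the real CGI::Wiki schema, and missing_table_statements is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical schema: table name => statement that creates it.
my %schema = (
    node           => "CREATE TABLE node ( name VARCHAR(200) NOT NULL )",
    content        => "CREATE TABLE content ( name VARCHAR(200) NOT NULL, body TEXT )",
    internal_links => "CREATE TABLE internal_links ( link_from VARCHAR(200), link_to VARCHAR(200) )",
);

# Return only the CREATE statements for tables that do not exist yet.
# Nothing is ever dropped, so existing data is left alone.
sub missing_table_statements {
    my ( $schema, @existing ) = @_;
    my %have = map { $_ => 1 } @existing;
    my @missing = grep { !$have{$_} } sort keys %$schema;
    return map { $schema->{$_} } @missing;
}

# A database set up by an older version: it lacks the link table.
my @todo = missing_table_statements( \%schema, qw( node content ) );
print "$_\n" for @todo;
```

In a real setup script the list of existing tables would come from the database, and each statement would then be run through something like DBI's `$dbh->do`.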

This is documented in README, INSTALL, Changes and the pod of Wiki.pm — I hope that's going to save anyone from getting annoyed with me.

On the bright side, we do get real backlinks, which is very cool (blair christensen's idea so thank him if you like it).
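To make the backlinks idea concrete, here is a small in-memory sketch. The array of [ from, to ] pairs stands in for the new link table, and the node names on the "from" side are invented.

```perl
use strict;
use warnings;

# One [ from, to ] row per link, as a link table might store them.
my @links = (
    [ "Home", "Jerusalem Tavern" ],
    [ "Pubs", "Jerusalem Tavern" ],
    [ "Pubs", "Calthorpe Arms" ],
    [ "Pubs", "Jerusalem Tavern" ],    # duplicate link on the same page
);

# All nodes that link to $node, de-duplicated and sorted.
sub backlinks {
    my ( $links, $node ) = @_;
    my %seen;
    my @from = grep { !$seen{$_}++ }
               map  { $_->[0] }
               grep { $_->[1] eq $node } @$links;
    my @sorted = sort @from;
    return @sorted;
}

print join( ", ", backlinks( \@links, "Jerusalem Tavern" ) ), "\n";
```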

This links in to something I've been worrying about for a while — how do you pick the right balance between:

  • releasing early enough that people can use what you're doing, avoid reinventing wheels, and give you good ideas
  • not releasing so early that you have an absolute nightmare trying to keep backwards compatibility when you add new features that you didn't have time to get in before your first public release

The first release of CGI::Wiki took two months to escape from when I first thought of it, and I purposely delayed it until I'd decided how I was going to put version information in the tables. I hadn't even thought of backlinks at that point; it was only after I released it and blair mailed me that I realised they'd be useful. So even if I had delayed it longer, I'd still have the backwards compatibility problem now.

I'm actually less worried about changing the database schema than I am about changing the interface. At some point soon I'm going to start adding the capability to store metadata about the nodes, and right now I'm really feeling that the nodes should have been objects. Which one's neater:

$node = $wiki->retrieve_node("Penderel's Oak");
$node->add_metadata( type => "Categories", data => "Holborn" );
$node->write;

or

%node_data = $wiki->retrieve_node("Penderel's Oak");
$wiki->write_node( "Penderel's Oak", $node_data{content}, $node_data{checksum},
                   { type => "Categories", data => "Holborn" } );

Now guess which one I've saddled myself with.

However, reasons to be cheerful:

Current to-do list:

  • Get phrase searching working with Search::InvertedIndex
  • Look at the code Roger sent me ages ago (sorry Roger, I'm slack)
  • Release CGI::Wiki::Formatter::Usemod
  • Think about that damned metadata problem
  • Write CGI::Wiki::Formatter::Pod
  • Get my head round Jo's RDF/grubstreet ideas
  • (very speculative) Think more on the recipe DTD/semantic web stuff I was talking about with Earle

Thursday December 19, 2002
07:37 PM

Release in the middle of the night, don't release very often

I've just released CGI::Wiki 0.10 - the observant will notice there are a few versions missing since the last released one, 0.05. This wasn't just to save myself the trouble of uploading a few mistakes before finally getting it right, but to flag a small interface change. I have documented the change in the pod, the example, README and Changes, and added some temporary code to warn if it's called with obsolete parameters, but I still worry that this is going to annoy people. Oh well, I doubt anyone's using this live yet, anyway.
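A warning shim of that kind might look roughly like this. The parameter names are purely hypothetical ("storage" as the old name, "store" as the new one) and are not taken from the CGI::Wiki changelog. Carp is a core module.

```perl
use strict;
use warnings;
use Carp qw( carp );

# Hypothetical renames: old parameter name => new parameter name.
my %renamed = ( storage => 'store' );

# Accept old-style arguments, warn about them, and hand back new-style ones.
sub normalise_args {
    my (%args) = @_;
    for my $old ( sort keys %renamed ) {
        next unless exists $args{$old};
        my $new = $renamed{$old};
        carp "Parameter '$old' is obsolete; please use '$new' instead";
        my $value = delete $args{$old};
        # If both are given, the new-style parameter wins.
        $args{$new} = $value unless exists $args{$new};
    }
    return %args;
}

my %args = normalise_args( storage => "my_backend" );
print join( ", ", map { "$_ => $args{$_}" } sort keys %args ), "\n";
```

Old code keeps working but gets nagged on every call, and the shim can be deleted in a later release once the warning has been out for a while.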

Next thing to do is finish the SQLite backend to Search::InvertedIndex — it's passing about two-thirds of its tests now. Then write a wrapper around Search::InvertedIndex to cope with phrase searching (already have an idea of how to do this, planted there by its author via email).

And at some point I need to prod chromatic again about my suggested changes to Text::WikiFormat -- I need to be able to support usemod-style formatting.

Finally there are my silly modules to attend to. I was foolish enough to write a screen-scraper for a wiki which, although really quite useful for finding food and drink in London, does require quite a lot of babysitting as people change the site around me :) (Code is in the WWW::Grubstreet modules, not CPANned for obvious reasons, but available on my website.)

Sunday December 01, 2002
07:51 PM

Search::InvertedIndex in action

A couple of hours ago I wrote about Search::InvertedIndex. I wasn't expecting to get it done this quickly, but I've just finished a first attempt at plugging it into CGI::Wiki. There's a pre-release tarball on my website.

It's pre-release because:

  • It only supports the MySQL backend of Search::InvertedIndex at the moment, and this is kinda pointless as anything other than an experiment, since DBIx::FullTextSearch works fine with MySQL, and can do phrase searching to boot;
  • Search::InvertedIndex is raising warnings — I think the uninitialised value ones are its fault rather than mine, since it's supposed to default to localhost if you don't give it a hostname for the database host; but I'm not so sure about the cleanup warnings.

Tomorrow is soon enough to make it better, though.

04:34 PM

Search::InvertedIndex and related things

(See also the London.pm version of this post.)

On Sun 17 Nov 2002, Kate L Pugh <kake@earth.li> wrote:

Is there anyone here who's used Search::InvertedIndex and can point me to some working example code? I'm having great trouble getting my head around it.

Just for completeness:

I seem to have finally figured it out. Here's an example script if anyone else is interested. Note that the script does give a warning about the database not being open when it comes to cleanup time. I've yet to look into that problem properly.

(Before anyone suggests it — yes, I will be sending a documentation patch to the maintainer as soon as I manage to get hold of him.)

All this poking about has also led me to notice that this module doesn't care about the position of keys within the data, so it won't be able to do proper phrase searching. (I would love to be proved wrong here.) What are people using for that kind of thing? DBIx::FullTextSearch can do it, but that's limited to MySQL. I can't make head or tail of WAIT; does anyone know if that might suit my needs? And if so, could they point me to some idiot-proof documentation and/or examples?
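To show why positions matter, here is a toy inverted index in plain Perl that does record them. It has nothing to do with how Search::InvertedIndex or DBIx::FullTextSearch work internally. The function names and the two sample documents are invented.

```perl
use strict;
use warnings;

# Split text into lower-cased words.
sub words {
    my ($text) = @_;
    my @words = lc($text) =~ /\w+/g;
    return @words;
}

# Build word => { doc => [ positions ] } from doc => text pairs.
sub build_index {
    my (%docs) = @_;
    my %index;
    for my $doc ( sort keys %docs ) {
        my $pos = 0;
        push @{ $index{$_}{$doc} }, $pos++ for words( $docs{$doc} );
    }
    return \%index;
}

# Return the docs in which the words of $phrase appear consecutively.
sub phrase_search {
    my ( $index, $phrase ) = @_;
    my @words = words($phrase);
    return unless @words;
    my $first = $index->{ $words[0] } or return;
    my @hits;
    DOC: for my $doc ( sort keys %$first ) {
        START: for my $start ( @{ $first->{$doc} } ) {
            for my $i ( 1 .. $#words ) {
                my $entry     = $index->{ $words[$i] } or next START;
                my $positions = $entry->{$doc}         or next START;
                next START unless grep { $_ == $start + $i } @$positions;
            }
            push @hits, $doc;
            next DOC;
        }
    }
    return @hits;
}

my $index = build_index(
    pub  => "a good pub near the river",
    food => "good food and a pub",
);
print join( ", ", phrase_search( $index, "good pub" ) ), "\n";
```

Both documents contain "good" and "pub", so an index that only records which keys occur in which document can't tell them apart. Only the stored positions show that the two words sit next to each other in just one of them.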

04:24 PM

$self->_init;

Having resisted writing a journal for quite some time, I finally figured out a reason to cave in. I've been posting to the London.pm mailing list pretty much since I joined it, and seen lots of useful and interesting discussion. London.pm is however a somewhat limited community, so this is my chance to branch out a bit.

I'm very unlikely to post much here that I haven't also posted to places like London.pm or earthlings, but I will try to cross-pollinate where useful. If the duplication annoys anyone then, well, they don't have to read this. Oh, and I'm also very unlikely to post anything that isn't about Perl.