Journal of jkondo (7642)

Tuesday February 27, 2007
04:58 PM

Text::Hatena 0.20 released.

[ #32524 ]
I uploaded Text::Hatena 0.20. It is a substantially new version of Text::Hatena.

I rewrote the entire codebase using Parse::RecDescent and Regexp::Assemble. The distribution shrank from 47 files to 2 modules, and the code went from 2600 lines to 600 lines. In my benchmark, it showed 300-400% higher performance than version 0.16.

I also removed some syntax that was specific to Hatena Diary.

The API for parsing text has changed too, so please be careful when upgrading
Text::Hatena to version 0.20 or later.

The simplest way to use Text::Hatena is as follows.

use Text::Hatena;
my $html = Text::Hatena->parse($text);

You can also extend the parser by subclassing it, like this. This makes it easy to write your own parser that handles some other format.

package MyParser;
use strict;
use warnings;
use base qw(Text::Hatena);

__PACKAGE__->syntax(q|
    h3 : "\n*" timestamp(?) inline(s)
    timestamp : /\d{9,10}/ '*'
|);

sub h3 {
    my $class = shift;
    my $items = shift->{items};
    my $title = $class->expand($items->[2]);
    return if $title =~ /^\*/;
    my $ret = "<h3>$title";
    if (my $time = $items->[1]->[0]) {
        $ret .= qq|<span class="timestamp">$time</span>|;
    }
    $ret .= "</h3>\n";
    return $ret;
}

sub timestamp {
    my $class = shift;
    my $items = shift->{items};
    return $items->[0];
}

1;
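
As a rough illustration of what the h3 and timestamp rules above accept, here is a minimal sketch in plain Perl that uses a single regex instead of a Parse::RecDescent grammar. It is not how Text::Hatena works internally. render_h3 is a hypothetical helper, and the sketch assumes one heading per line with no inline markup to expand.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Hatena): a plain-regex stand-in
# for the h3 and timestamp grammar rules shown above.
sub render_h3 {
    my ($line) = @_;

    # A leading "*", an optional "<9-10 digits>*" timestamp, then the title.
    my ($time, $title) = $line =~ /\A\*(?:(\d{9,10})\*)?(.*)\z/
        or return;

    # A title that still starts with "*" belongs to a deeper heading level.
    return if $title =~ /^\*/;

    my $ret = "<h3>$title";
    if (defined $time) {
        $ret .= qq|<span class="timestamp">$time</span>|;
    }
    $ret .= "</h3>\n";
    return $ret;
}

print render_h3("*1172563080*Release notes");
```

The early return for titles that begin with "*" mirrors the check in the h3 method above.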

You can also extend inline elements like this.

Text::Hatena::AutoLink->syntax({
    'id:([\w-]+)' => sub {
        my $mvar = shift;
        my $name = $mvar->[1];
        return qq|<a href="/$name/">id:$name</a>|;
    },
    'd:id:([\w-]+)' => sub {
        my $mvar = shift;
        my $name = $mvar->[1];
        return qq|<a href="http://d.hatena.ne.jp/$name/">d:id:$name</a>|;
    },
});
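
To make the pattern-to-callback idea concrete, here is a self-contained sketch in core Perl of one way such a table could be applied to text. It is only an illustration and not Text::Hatena::AutoLink's implementation (which, as noted above, builds on Regexp::Assemble). The @rules table and the autolink function are hypothetical names. The sketch also assumes each callback receives an array reference whose element 0 is the whole match and element 1 is the first capture, which matches how $mvar->[1] is used above.

```perl
use strict;
use warnings;

# Hypothetical rule table: [ pattern, callback ] pairs, tried in order
# at each position of the input.
my @rules = (
    [ qr/d:id:([\w-]+)/ => sub {
        my $mvar = shift;
        my $name = $mvar->[1];
        return qq|<a href="http://d.hatena.ne.jp/$name/">d:id:$name</a>|;
    } ],
    [ qr/id:([\w-]+)/ => sub {
        my $mvar = shift;
        my $name = $mvar->[1];
        return qq|<a href="/$name/">id:$name</a>|;
    } ],
);

sub autolink {
    my ($text) = @_;
    my $out = '';
    pos($text) = 0;
    CHAR: while (pos($text) < length($text)) {
        for my $rule (@rules) {
            my ($re, $callback) = @$rule;
            # \G anchors the match at the current position;
            # /c keeps pos() unchanged when the match fails.
            if ($text =~ /\G($re)/gc) {
                # $1 is the whole match, $2 the rule's own first capture.
                $out .= $callback->([ $1, $2 ]);
                next CHAR;
            }
        }
        # No rule matched here: copy one character through unchanged.
        $out .= substr($text, pos($text), 1);
        pos($text) = pos($text) + 1;
    }
    return $out;
}

print autolink("see id:jkondo and d:id:jkondo"), "\n";
```

Because the output is built separately from the input, the HTML produced by one callback is never rescanned by another rule.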

I'd like to get your feedback.