Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • I’m not sure I understand what your code is doing, so I’ll code to your specification instead…

    sub compress_body {
        my ( $html ) = @_;
        my $parser = HTML::TokeParser::Simple->new( \$html );
        my $text = '';
        my $in_body;

        1 while $_ = $parser->get_token and not $_->is_start_tag( 'body' );

        while ( my $token = $parser->get_token ) {
            if( $token->is_text ) {
         

    • The problem is, it gets passed an entire HTML document and has to preserve everything up to the first body tag (inclusive) and after the last body tag (also inclusive).

      • Ah, now suddenly all your contortions make sense.

        sub compress_body {
            my ( $html ) = @_;
            my $parser = HTML::TokeParser::Simple->new(\$html);
            my $out = '';
            my $body = '';

            while ( my $token = $parser->get_token ) {
                if( $token->is_start_tag('body') .. $token->is_end_tag('body') ) {
                    $out .= $token->as_is if $token->is_tag( 'body' );