All the Perl that's Practical to Extract and Report

use Perl

Mark Leighton Fisher (4252)

http://mark-fisher.home.mindspring.com/

I am a Systems Engineer at Regenstrief Institute [regenstrief.org]. I also own Fisher's Creek Consulting [fisherscreek.com].
Friday February 24, 2006
04:44 PM

fog - Discover the "Fog Index" of Your Writing

[ #28796 ]

I was writing an email at work one day when I got to thinking: is this too wordy? I finally remembered that prose readability is often measured with something called a "fog index" (how foggy is your writing?).

A search on Google turned up Merlyn's article "Discovering incomprehensible documentation" from his Nov 2000 Linux Magazine column. His article describes "fog indexing" his manpages using Lingua::EN::Fathom (one of the wonders of CPAN), a module for computing the fog index of a piece of text. But I wanted something more like a simple, general-purpose utility.
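For readers who have not met the measure before, here is a minimal sketch of the Gunning fog formula: 0.4 times the sum of the average words per sentence and the percentage of "complex" words (three or more syllables). The helper names `estimate_syllables` and `fog_index` are made up for this illustration, and the vowel-run syllable count is a crude assumption of mine, not how Lingua::EN::Fathom counts syllables, so the numbers will not match the module's report exactly.

```perl
use strict;
use warnings;

# Rough syllable estimate: count runs of vowels.
# (An assumed heuristic for illustration, not a dictionary lookup.)
sub estimate_syllables {
    my ($word) = @_;
    my $count = () = lc($word) =~ /[aeiouy]+/g;
    return $count > 0 ? $count : 1;
}

# Gunning fog index:
#   0.4 * (words per sentence + percentage of complex words)
# A "complex" word is taken here to be one with three or more syllables.
sub fog_index {
    my ($text) = @_;
    my @words     = $text =~ /[A-Za-z']+/g;
    my @sentences = grep { /\w/ } split /[.!?]+/, $text;
    my $num_words     = scalar @words;
    my $num_sentences = scalar @sentences;
    return 0 unless $num_words && $num_sentences;

    my $num_complex = grep { estimate_syllables($_) >= 3 } @words;
    return 0.4 * ( $num_words / $num_sentences
                 + 100 * $num_complex / $num_words );
}

printf "%.2f\n", fog_index("The cat sat on the mat. It was happy.");
```

Short sentences made of short words score low; long sentences full of polysyllabic words score high, which is the "fog" the index is named for.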

So, here is "fog", a simple command-line program for discovering the fog index of text. fog (a thin, tasty wrapper around Lingua::EN::Fathom) can work on a string, standard input, or a set of files. It is a little rough around the edges, so any suggestions are welcome.

Here is the source for fog:

#!/usr/bin/perl -w
# List various measures of readability -- "fog indexes".
#
# Just a thin, tasty wrapper around Lingua::EN::Fathom.

# ------ pragmas
use warnings;
use strict;
use Lingua::EN::Fathom;

# ------ variables
my $file_count = 0;                     # count of files to analyze
my $file_name  = "";                    # current filename
my $fog_index  = "";                    # fog indexing object (a Lingua::EN::Fathom)
my $ifh        = undef;                 # input file handle
my $input      = "";                    # string to analyze
my @input      = ();                    # lines read from the current input
my $more_files = 0;                     # TRUE when more than one file to analyze

# ------ process command-line arguments
if (@ARGV < 1) {
    die "usage: fog -|-f FILE1 ...|STRING\n"
}
if ($ARGV[0] eq "-") {
    $ifh = \*STDIN;
    @input = <$ifh>;
    $input = join("", @input);          # lines already end in newlines
    $file_count = 1;
} elsif ($ARGV[0] eq "-f" && @ARGV >= 2) {
    $file_count = scalar(@ARGV) - 1;
    if ($file_count > 1) {
        $more_files++;
    }
    shift;
    open($ifh, '<', $ARGV[0]) || die "cannot open $ARGV[0]: $!\n";
    @input = <$ifh>;
    close($ifh);
    $input = join("", @input);          # lines already end in newlines
    $file_name = $ARGV[0];
} else {
    $input = $ARGV[0];
    $file_count = 1;
}

while ($file_count > 0) {
    $fog_index = Lingua::EN::Fathom->new;
    $fog_index->analyse_block($input, 1);
    if ($more_files > 0) {
        print $file_name, ":\n";
    }
    print $fog_index->report, "\n";

    $file_count--;
    if ($file_count > 0) {
        shift;
        $file_name = $ARGV[0];
        open($ifh, '<', $file_name) || die "cannot open $file_name: $!\n";
        @input = <$ifh>;
        close($ifh);
        $input = join("", @input);      # lines already end in newlines
    }
}

  • Your code could be compacted quite a bit if you change the calling syntax slightly and rely on the while (<>) construct - and the surrounding perl magic :)

    Something like the following:

    if (@ARGV < 1) {
        die "usage: fog -|FILE1 ...|-s STRING\n"
    }

    my $fog_index = new Lingua::EN::Fathom;

    if ($ARGV[0] =~ /^-s$/) {
        # string argument
        $fog_index->analyse_block($ARGV[1], 1);
        print $fog_index->report, "\n";
        exit;
    }

    local $/; # slurp whole files
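
The comment's code is cut off at this point. As a rough sketch of the technique it names (not the commenter's actual code), slurp mode combined with the `<>` operator reads each file listed in `@ARGV` as a single string, while `$ARGV` holds the name of the file currently being read. The `slurp_each` and `word_count` helpers are hypothetical; `word_count` merely stands in for the Lingua::EN::Fathom calls so that the sketch runs without the module installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the Fathom analysis: just counts words.
sub word_count {
    my ($text) = @_;
    my $n = () = $text =~ /\S+/g;
    return $n;
}

# Read every named file as one string; returns a list of [name, text] pairs.
sub slurp_each {
    my (@files) = @_;
    local @ARGV = @files;   # <> reads the files listed in @ARGV
    local $/;               # undefined record separator: one read per file
    my @results;
    while (my $text = <>) {
        push @results, [ $ARGV, $text ];   # $ARGV is the file being read
    }
    return @results;
}

# Demo with two temporary files.
my ($fh1, $name1) = tempfile(UNLINK => 1);
print {$fh1} "one two three\nfour\n";
close $fh1;
my ($fh2, $name2) = tempfile(UNLINK => 1);
print {$fh2} "five six\n";
close $fh2;

for my $pair (slurp_each($name1, $name2)) {
    printf "%s: %d words\n", $pair->[0], word_count($pair->[1]);
}
```

Because `<>` also treats a filename of "-" as standard input, an interface built this way would behave as a filter without needing a separate code path.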

    • Just curious--what was it in the diction/style package that you wanted to improve?
      • Well, there were a couple of points; some of them are probably a matter of personal taste/preferences ...

        • Consistency - I wanted to stick to Mark's command line arguments and allow the string argument. Personally I would have chosen a "clean" 'while (<>)' type of interface, meaning if I wanted to know the fog index of a string, I would pass it in via the shell with echo "string" | fog. This way the script behaves like a normal Unix command line tool in that it can work as a filter for piping or be used to wo
  • Thanks for noticing Lingua::EN::Fathom ;-)

    As for that tiny wrapper, it doesn't seem to be trying to prove anything (more) about its capabilities (compared to Merlyn's article), everything besides Fathom's "report" being just "bloat".

    Some simple one-liners could have been enough to demonstrate Lingua::EN::Fathom's usefulness:

    # fog report for english_textfile:
    perl -MLingua::EN::Fathom -le 'print Lingua::EN::Fathom->new->analyse_file(shift)->report;' english_textfile

    # fog report, "pipe" style:
    p