Mark Leighton Fisher (4252)
http://mark-fisher.home.mindspring.com/

I am a Systems Engineer at Regenstrief Institute [regenstrief.org]. I also own Fisher's Creek Consulting [comcast.net].
Friday February 24, 2006
03:44 PM

fog - Discover the "Fog Index" of Your Writing

[ #28796 ]

I was writing an email at work one day when I got to thinking: is this too wordy? I finally remembered that prose readability is often measured with something called a "fog index" (how foggy is your writing?).

A search on Google turned up Merlyn's article "Discovering incomprehensible documentation" for his Nov 2000 Linux Magazine column. His article describes "fog indexing" his manpages using Lingua::EN::Fathom (one of the wonders of CPAN), a module for computing the fog index of a piece of text. But I wanted something more like a simple, general-purpose utility.

So, here is "fog", a simple command-line program for discovering the fog index of text. fog (a thin, tasty wrapper around Lingua::EN::Fathom) can work on a string, standard input, or a set of files. It is a little rough around the edges, so any suggestions are welcome.
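
For a rough idea of what a fog index measures, here is a minimal, self-contained sketch of the Gunning fog formula: 0.4 times the sum of the average words per sentence and the percentage of "complex" words (three or more syllables). The function names count_syllables and fog_index are made up for this illustration, and the vowel-group syllable count and punctuation-based sentence count are simplifying assumptions. Lingua::EN::Fathom does its own counting, so its numbers will differ.

```perl
#!/usr/bin/perl
# A rough sketch of the Gunning fog formula.
# The word, sentence, and syllable heuristics are simplifying assumptions,
# not the rules Lingua::EN::Fathom applies.
use strict;
use warnings;

# Estimate syllables by counting groups of consecutive vowels (naive).
sub count_syllables {
    my ($word) = @_;
    my $count = () = lc($word) =~ /[aeiouy]+/g;
    return $count > 0 ? $count : 1;
}

# fog = 0.4 * (words per sentence + percentage of complex words),
# where a "complex" word has three or more (estimated) syllables.
sub fog_index {
    my ($text) = @_;
    my @words     = $text =~ /[A-Za-z']+/g;
    my $sentences = () = $text =~ /[.!?]+/g;
    $sentences = 1 if $sentences < 1;   # unterminated text counts as one sentence
    return 0 unless @words;
    my $complex = grep { count_syllables($_) >= 3 } @words;
    return 0.4 * ( @words / $sentences + 100 * $complex / @words );
}

printf "%.2f\n", fog_index("The cat sat on the mat. It was happy.");
```

Short sentences made of short words score low (the sample above prints 1.80), while a phrase like "Incomprehensible documentation." scores far higher because both of its words count as complex.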

Here is the source for fog:

#!/usr/bin/perl -w
# List various measures of readability -- "fog indexes".
#
# Just a thin, tasty wrapper around Lingua::EN::Fathom.

# ------ pragmas
use warnings;
use strict;
use Lingua::EN::Fathom;

# ------ variables
my $file_count = 0;                     # count of files to analyze
my $file_name  = "";                    # current filename
my $fog_index  = "";                    # fog indexing object (a Lingua::EN::Fathom)
my $ifh        = undef;                 # input file handle
my $input      = "";                    # string to analyze
my @input      = ();                    # array of lines to analyze
my $more_files = 0;                     # TRUE when more than one file to analyze

# ------ process command-line arguments
if (@ARGV < 1) {
    die "usage: fog -|-f FILE1 ...|STRING\n";
}
if ($ARGV[0] eq "-") {
    # read the text to analyze from standard input
    $ifh = \*STDIN;
    @input = <$ifh>;
    $input = join("", @input);          # lines already end in newlines
    $file_count = 1;
} elsif ($ARGV[0] eq "-f" && @ARGV >= 2) {
    # analyze one or more files; drop the "-f" and read the first file
    $file_count = scalar(@ARGV) - 1;
    if ($file_count > 1) {
        $more_files++;
    }
    shift;
    open($ifh, "<", $ARGV[0]) || die "cannot open $ARGV[0]: $!\n";
    @input = <$ifh>;
    close($ifh);
    $input = join("", @input);
    $file_name = $ARGV[0];
} else {
    # analyze the first argument as a literal string
    $input = $ARGV[0];
    $file_count = 1;
}

while ($file_count > 0) {
    # a fresh object per input keeps each report separate
    $fog_index = Lingua::EN::Fathom->new;
    $fog_index->analyse_block($input, 1);
    if ($more_files > 0) {
        print $file_name, ":\n";
    }
    print $fog_index->report, "\n";

    $file_count--;
    if ($file_count > 0) {
        # move on to the next file named on the command line
        shift;
        $file_name = $ARGV[0];
        open($ifh, "<", $file_name) || die "cannot open $file_name: $!\n";
        @input = <$ifh>;
        close($ifh);
        $input = join("", @input);
    }
}
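
The script repeats the same open, read, close, join sequence for the first file and again for each later file. One way to tidy that up, sketched here under the assumption that a small helper is acceptable, is to read a whole file in a single step by undefining the input record separator. The name slurp_file is hypothetical and not part of fog as posted.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original fog): return the entire
# contents of a file as one string.
sub slurp_file {
    my ($file_name) = @_;
    open(my $ifh, "<", $file_name) or die "cannot open $file_name: $!\n";
    local $/;                           # no record separator: read everything at once
    my $text = <$ifh>;
    close($ifh);
    return defined $text ? $text : "";  # defensive: never return undef
}
```

With a helper like this, each read in the script could become a single line such as $input = slurp_file($file_name), and the @input array would no longer be needed.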

  • Your code could be compacted quite a bit, if you change the calling syntax slightly and rely on the while (<>) construct - and the surrounding perl magic :)

    Something like the following

    if (@ARGV < 1) {
        die "usage: fog -|FILE1 ...|-s STRING\n"
    }

    my $fog_index = new Lingua::EN::Fathom;

    if ($ARGV[0] =~ /^-s$/) {
        # string argument
        $fog_index->analyse_block($ARGV[1], 1);
        print $fog_index->report, "\n";
        exit;
    }

    local $/; # slurp whole files

    • Just curious--what was it in the diction/style package that you wanted to improve?
      • Well, there were a couple of points, some of them are probably a matter of personal taste/preferences ...

        • Consistency - I wanted to stick to Mark's command line arguments and allow the string argument. Personally I would have chosen a "clean" 'while (<>)' type of interface, meaning if I wanted to know the fog index of a string, I would pass it in via the shell with echo "string" | fog. This way the script behaves like a normal unix command line tool in that it can work as a filter for piping or be used to wo
  • Thanks for noticing Lingua::EN::Fathom ;-)

    As for that tiny wrapper, it doesn't look like it is trying to prove anything (more) about its capabilities (compared to Merlyn's article), everything besides Fathom's "report" being just "bloat".

    Some simple one-liners could have been enough to demonstrate Lingua::EN::Fathom's usefulness:

    # fog report for english_textfile:
    perl -MLingua::EN::Fathom -le 'print Lingua::EN::Fathom->new->analyse_file(shift)->report;' english_textfile

    # fog report, "pipe" style:
    p