
Mark Leighton Fisher (4252)

http://mark-fisher.home.mindspring.com/

I am a Systems Engineer at Regenstrief Institute [regenstrief.org]. I also own Fisher's Creek Consulting [comcast.net].
Thursday August 19, 2010
11:43 AM

Consistent GUIs; Or, Using WPF for Good and Not Evil

Using WPF for Good and Not Evil is a nice little write-up on how we, as developers, need to consider why and how we might change the user interface of programs developed in WPF. My take on it is that "Just because you can do something does not mean you SHOULD do something."

(Ob.Perl: Perlesque should let you program directly in WPF by using the .NET libraries.)

Thursday July 29, 2010
11:13 AM

Stupid Lucene Tricks: Document Frequencies and NOT

  1. You can get the document frequency of a term (i.e., how many documents have that term) through Lucene.Net.Index.IndexReader.DocFreq(t As Term) As Integer.
  2. You can get the IndexReader for a Lucene.Net.Search.IndexSearcher through IndexSearcher.GetIndexReader().
  3. If you want to display the document frequencies for the individual keywords of a search, and a piece is a NOT phrase (like -antibiotic in antimicrobial -antibiotic), you cannot use DocFreq() directly. In that case, the document frequency can be computed as:

          DOCFREQ = count of all documents - DocFreq(TERM_NO_NOT)

    as in:

          DOCFREQ = 60227 - DocFreq(New Term("all", "antibiotic"))

    where the NOT piece was -antibiotic and all is the Lucene document field in question.
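
The arithmetic above can be sketched without Lucene at all. The following is a minimal Python illustration, not Lucene.NET code: the toy documents, the `index` dictionary, and the `doc_freq` and `keyword_doc_freq` helpers are all hypothetical stand-ins (with `doc_freq` playing the role of DocFreq()).

```python
# Toy inverted index: term -> set of ids of the documents containing it.
# The documents below are made up purely for illustration.
docs = {
    1: "antimicrobial antibiotic resistance",
    2: "antimicrobial peptide",
    3: "antibiotic stewardship",
    4: "hand washing",
    5: "soap",
}

index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)


def doc_freq(term):
    """Number of documents containing term (the role DocFreq() plays)."""
    return len(index.get(term, ()))


def keyword_doc_freq(keyword, num_docs):
    """Document frequency for one search keyword; a leading '-' means NOT."""
    if keyword.startswith("-"):
        # DOCFREQ = count of all documents - DocFreq(TERM_NO_NOT)
        return num_docs - doc_freq(keyword[1:])
    return doc_freq(keyword)


num_docs = len(docs)
frequencies = {kw: keyword_doc_freq(kw, num_docs)
               for kw in "antimicrobial -antibiotic".split()}
print(frequencies)
```

Here 2 of the 5 toy documents contain "antibiotic", so the NOT piece is reported as 5 - 2 = 3 documents.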

(Ob. Perl: Although PLucene is now 5 years out of date, Perlesque should eventually let you get at Lucene.NET via a strongly-typed Perl 6.)

Tuesday July 27, 2010
12:25 PM

Desperate Perl; or A Tale of Two Languages

Piers Cawley's A tale of two languages (if you haven't already seen it) speaks to the public perception that Perl remains a desperation language ("Desperate Perl") suited only for gluing things together when nothing else will do.

Meanwhile, elsewhere in the real world, there is plenty (possibly a majority, IMHO) of maintainable, understandable, well-written, efficient Perl code ("Large Scale Perl" as described by Piers). Worth a read.

(Although I like the name "Desperate Perl" a lot, I think that the names "Scripting Perl" and "Programming Perl" also describe these separate Perl programming styles in a less emotional fashion, which is occasionally useful.)

Friday July 16, 2010
11:09 AM

Stupid Lucene Tricks: Hierarchies

You can search on hierarchies in Lucene if your hierarchy can be represented as a path enumeration (a Dewey-Decimal-like style of encoding a path, like "001.014.003" for the 3rd grandchild of the 14th child of the 1st branch).

For example, a search phrase like:

    hierarchy:001

would return only the direct children of the 1st branch, while:

    hierarchy:001*

would return all descendants of the 1st branch.

  1. To get only the children of a particular node, you specify only that node, like:

        hierarchy:001.014.003

  2. To get all of the descendants, you specify everything that starts with that node:

        hierarchy:001.014.003*

  3. To get only the descendants after the children (grandchildren, etc.), you specify:

        hierarchy:001.014.003.*
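
The three query shapes above can be tried out with plain string matching. This is a minimal Python sketch, not Lucene code, and it rests on two assumptions that are not stated in the list: each document's hierarchy field holds the path of its parent node, and the whole path is matched as a single term. The document names and the `search` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical documents: name -> value of the "hierarchy" field.
# Assumption for this sketch: the field stores the path of the PARENT node.
documents = {
    "child-a": "001.014.003",
    "child-b": "001.014.003",
    "grandchild": "001.014.003.001",
    "great-grandchild": "001.014.003.001.002",
    "sibling-branch": "001.014.004",
}


def search(pattern):
    """Return the names of documents whose hierarchy value matches pattern."""
    return sorted(name for name, path in documents.items()
                  if fnmatchcase(path, pattern))


children = search("001.014.003")       # exact match: children only
descendants = search("001.014.003*")   # prefix match: all descendants
deeper = search("001.014.003.*")       # prefix plus dot: grandchildren and below

print(children)
print(descendants)
print(deeper)
```

Fixed-width segments matter here: because every segment is three digits, the prefix 001.014.003* cannot accidentally match a sibling such as 001.014.0031.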

Friday July 02, 2010
11:40 AM

pmtools-perl6-0.01

I am pleased to announce version 0.01 of pmtools-perl6, a suite of module tools for Perl 6. (Not quite up on CPAN yet as I write this.)

pmdirs is the only tool in pmtools-perl6 v0.01, as it was the simplest to port (more tools to come...)

On Cygwin (my testing environment), I cannot get the #! to work -- you will need to invoke pmdirs something like this under Cygwin:

    c:/parrot-2.2.0/bin/perl6 d:/cygwin/home/pmtools-perl6-0.01/pmdirs

(If you want to contribute Perl 6 ports of the other pmtools, please let me know.)

The source to pmdirs:

# pmdirs -- print the perl module path, newline separated
# tchrist@perl.com
# mark-fisher@comcast.net

# TODO: use warnings;
use v6;

for (@*INC) {
    say $_;
}

=begin

=head1 NAME

pmdirs - print out module directories

=head1 DESCRIPTION

This just prints out the current @INC path, one directory per line.
This is for people who don't want to parse through C<perl -V> output or
hack up their own calls to C<perl -e>.

=head1 EXAMPLES

    $ pmdirs
    /home/tchrist/perllib/i686-linux
    /home/tchrist/perllib
    /usr/local/devperl/lib/5.00554/i686-linux
    /usr/local/devperl/lib/5.00554
    /usr/local/devperl/lib/site_perl/5.00554/i686-linux
    /usr/local/devperl/lib/site_perl/5.00554
    .

This also works for alternate versions of Perl:

    $ filsperl -S pmdirs
    /home/tchrist/perllib
    /usr/local/filsperl/lib/5.00554/i686-linux-thread
    /usr/local/filsperl/lib/5.00554
    /usr/local/filsperl/lib/site_perl/5.00554/i686-linux-thread
    /usr/local/filsperl/lib/site_perl/5.00554
    .

=head1 SEE ALSO

perlrun(1), perlvar(1), lib(3)

=head1 AUTHORS and COPYRIGHTS

Copyright (C) 1999 Tom Christiansen.

Copyright (C) 2006-2010 Mark Leighton Fisher.

This is free software; you can redistribute it and/or modify it
under the terms of either:
(a) the GNU General Public License as published by the Free
Software Foundation; either version 1, or (at your option) any
later version, or
(b) the Perl "Artistic License".
(This is the Perl 5 licensing scheme.)

Please note this is a change from the
original pmtools-1.00 (still available on CPAN),
as pmtools-1.00 were licensed only under the
Perl "Artistic License".

=end

Wednesday June 23, 2010
11:32 AM

Business: Execution vs. Ideas

If you want to start your own business, you need:

  • A product people want to buy; and
  • The willingness to work amazingly hard to get the business going.

These were my major take-aways from Top ten geek business myths, based on the article and my own experiences.

Ideas? Ha!

Don't worry about people stealing an idea; if it's original, you'll have to shove it down their throats. - Howard Aiken

What matters more is execution. In my chosen industry, Microsoft has been a good example of this. There were other, better OSes, but Microsoft made sure to get their OSes out on everyone's desktops, rather than limiting the user's choice of PC. Although Linux has made great strides, it is still more likely that you will find a reasonable driver for an arbitrary piece of PC hardware for Windows than for Linux. Microsoft has had better execution in getting Windows out to as many people as possible. (Heresy, I know.)

Even if you revile their products, many of the largest retailers have worked impressively hard getting their products out to everyone, not just a chosen few.

All of what you know is just a tool (a rather large and handsome tool, admittedly) in the process of getting your own business going. Unless your goal is to be a very small, boutique seller, you want to reach as many people as possible, and brains alone won't get you there.

Read the article, and tell me what you think.

Thursday June 17, 2010
05:58 AM

Stupid Lucene Tricks: Search case-insensitive, Retrieve ca

Sometimes when you build an index in Lucene, you want to structure the index so that people can search without worrying about case (case-insensitive search), but you want the display to contain the original mixed-case data (case-sensitive display). The trick is to split each Lucene field into 2 versions:

  1. A case-insensitive field that is indexed but not stored (Lucene.Net.Documents.Field.Index.ANALYZED and Lucene.Net.Documents.Field.Store.NO).
  2. A case-sensitive field that is stored but not indexed, preferably with a field name similar to that of its case-insensitive cousin field, like "Display_Title" and "Title" (Lucene.Net.Documents.Field.Index.NO and Lucene.Net.Documents.Field.Store.YES).

Storing only the case-sensitive version reduces the index storage requirement relative to storing both versions (even so, I have seen around a 40% increase in index size with this trick as compared to both storing and indexing one field).
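
The split between a searchable field and a display field can be illustrated with two dictionaries. This is a minimal Python sketch of the idea, not Lucene.NET code: the sample titles and the names `search_index`, `display_store`, and `search` are hypothetical, and lowercasing plus whitespace splitting stands in for whatever analyzer the real index uses.

```python
# Hypothetical records: document id -> original mixed-case title.
titles = {
    1: "Practical Perl Programming",
    2: "Lucene in Action",
    3: "PERL Best Practices",
}

search_index = {}   # lowercased term -> doc ids ("indexed but not stored")
display_store = {}  # doc id -> original text ("stored but not indexed")

for doc_id, title in titles.items():
    display_store[doc_id] = title
    for term in title.lower().split():
        search_index.setdefault(term, set()).add(doc_id)


def search(query):
    """Match case-insensitively, but return the original mixed-case titles."""
    doc_ids = search_index.get(query.lower(), set())
    return [display_store[doc_id] for doc_id in sorted(doc_ids)]


results = search("pErL")
print(results)
```

The query is folded to lowercase before lookup, while the results come only from the stored copy, so the user never sees the case-folded text.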

Friday June 11, 2010
03:07 PM

ZeroMQ: Fastest. Messaging. Ever.

ZeroMQ (or 0MQ) appears to be a fast (8M+ messages/second), Open Source message-passing engine. I don't have a use for it now, but it does look interesting.

(There is no Perl interface for ZeroMQ, but it sounds (without my actually researching the task) like it shouldn't be too hard to clone the Ruby FFI interface for use with Perl.)

Friday June 04, 2010
11:18 AM

Technical Debt and the Stakeholders

Technical Debt (and Technical Debt Decision Making) are a good take on using the concept of technical debt to ensure that your stakeholders understand why you must spend time fixing their system even though it may seem to be working perfectly fine right now. (An example of incurring technical debt is using SQLite when you know that in the long term the system needs to store its data in Postgres or Oracle.)

(The author of these essays, Steve McConnell, and his team at Construx really know their stuff -- if you have a chance to take one of their classes, grab it.)

Wednesday December 23, 2009
12:40 PM

Not All User Stories Have Happy Endings

Sometimes, despite your best efforts as a developer, you end up with unhappy users. And that's OK.

In "Consultants: It's not the theory, it's the execution", Chip Camden makes this point:

Sometimes you need to say no to user requests. (Unfortunately, not all user stories have happy endings.)

Whatever you do, there will be times when someone is unhappy with you. It matters not that you are the most talented developer ever known, or the most gifted designer that will ever be seen; someone will not like what you have done. It may be your politics, it may be your attitude, it may have no relation to reality -- you will run into people that you just can't please.

Because defining requirements is so fiendishly difficult, software developers have a special problem in this regard, especially when the users do not themselves know what they want but will "know it when I see it."

Often, if you have one customer, you can completely satisfy them (but not always). When you have numerous customers, you will never satisfy all of their wants, even if given infinite resources; those wants may very well even be contradictory. (There are people with contradictory wants -- "the software should be so simple that I can modify it if necessary" and "the software should just know what it is that I want at that moment.")

Once you have absorbed this idea (you will not please everyone all of the time), you can then concentrate on writing code, without that fear of displeasing a customer blocking your progress.

(The rest of life is best served by learning this lesson, too.)