Mark Leighton Fisher's Journal Mark Leighton Fisher's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-02-09T01:31:29+00:00 pudge Technology hourly 1 1970-01-01T00:00+00:00 Mark Leighton Fisher's Journal Consistent GUIs; Or, Using WPF for Good and Not Evil <p> <a href="">Using WPF for Good and Not Evil</a> is a nice little write-up on how we, as developers, need to consider why and how we might change the user interface of programs developed in WPF. My take on it is that "Just because you can do something does not mean you SHOULD do something."</p><p> <i>(Ob.Perl: Perlesque should let you program directly in WPF by using the<nobr> <wbr></nobr>.NET libraries.)</i> </p> Mark Leighton Fisher 2010-08-19T16:43:59+00:00 others Stupid Lucene Tricks: Document Frequencies and NOT <ol> <li>You can get the document frequency of a term (i.e. how many documents have that term) through <b>Lucene.Index.IndexReader.DocFreq(t As Term) As Integer</b>.</li><li>You can get the <b>IndexReader</b> for a <b>Lucene.Search.IndexSearcher</b> through <b>IndexSearcher.GetIndexReader()</b>.</li><li>If you want to display the document frequencies for the individual keywords of a search, and a piece is a NOT phrase (like <i>-antibiotic</i> in <i>antimicrobial -antibiotic</i>), you cannot use <b>DocFreq()</b> directly. In that case, the document frequency can be computed as:<blockquote><div><p> <tt>&nbsp; &nbsp; &nbsp; DOCFREQ = count of all documents - DocFreq(TERM_NO_NOT)</tt></p></div> </blockquote><p>as in:</p><blockquote><div><p> <tt>&nbsp; &nbsp; &nbsp; DOCFREQ = 60227 - DocFreq(New Term("all", "antibiotic"))</tt></p></div> </blockquote><p> where the NOT piece was <i>-antibiotic</i> and <b>all</b> is the Lucene document field in question.</p></li></ol><p> <i>(Ob. 
Perl: Although PLucene is now 5 years out of date, <a href="">Perlesque</a> should eventually let you get at Lucene.NET via a strongly-typed Perl 6.)</i> </p> Mark Leighton Fisher 2010-07-29T16:13:30+00:00 others Desperate Perl; or A Tale of Two Languages <p>Piers Cawley's <a href="">A tale of two languages</a> (if you haven't already seen it) speaks to the public perception that Perl remains a desperation language ("Desperate Perl") suited only for gluing things together when nothing else will do.</p><p>Meanwhile elsewhere in the real world, there is plenty (possibly a majority IMHO) of maintainable, understandable, well-written, efficient Perl code ("Large Scale Perl" as described by Piers). Worth a read.</p><p> <i>(Although I like the name "Desperate Perl" a lot, I think that the names "Scripting Perl" and "Programming Perl" also describe these separate Perl programming styles in a less-emotional fashion (which is occasionally useful.))</i> </p> Mark Leighton Fisher 2010-07-27T17:25:47+00:00 links Stupid Lucene Tricks: Hierarchies <p>You can search on hierarchies in Lucene if your hierarchy can be represented as a path enumeration (a Dewey-Decimal-like style of encoding a path, like "001.014.003" for the 3rd grandchild of the 14th child of the 1st branch).</p><p>For example, a search phrase like:</p><blockquote><div><p> <tt>&nbsp; &nbsp; hierarchy:001</tt></p></div> </blockquote><p>would return only the direct children of the 1st branch, while:</p><blockquote><div><p> <tt>&nbsp; &nbsp; hierarchy:001*</tt></p></div> </blockquote><p>would return all descendents of the 1st branch.</p><ol> <li>To get only the children of a particular node, you specify only that node, like:<blockquote><div><p> <tt>&nbsp; &nbsp; hierarchy:001.014.003</tt></p></div> </blockquote></li><li>To get all of the descendents you specify everything that starts with that node:<blockquote><div><p> <tt>&nbsp; &nbsp; hierarchy:001.014.003*</tt></p></div> </blockquote></li><li>To get only the 
descendents after the children (grandchildren, etc.), you specify:<blockquote><div><p> <tt>&nbsp; &nbsp; hierarchy:001.014.003.*</tt></p></div> </blockquote></li></ol> Mark Leighton Fisher 2010-07-16T16:09:09+00:00 others pmtools-perl6-0.01 <p>I am pleased to announce version 0.01 of <b>pmtools-perl6</b>, a suite of module tools for Perl 6. (Not quite up on CPAN yet as I write this.)</p><p> <b>pmdirs</b> is the only tool in pmtools-perl6 v0.01, as it was the simplest to port (more tools to come...)</p><p>On Cygwin (my testing environment), I cannot get the <b>#!</b> to work -- you will need to invoke pmdirs something like this under Cygwin:</p><blockquote><div><p> <tt>&nbsp; &nbsp; c:/parrot-2.2.0/bin/perl6 d:/cygwin/home/pmtools-perl6-0.01/pmdirs</tt></p></div> </blockquote><p> <i>(If you want to contribute Perl 6 ports of the other pmtools, please let me know.)</i> </p><p>The source to pmdirs:</p><blockquote><div><p> <tt># pmdirs -- print the perl module path, newline separated<br>#<br>#<br> <br># TODO: use warnings;<br>use v6;<br> <br>for (@*INC) {<br>&nbsp; &nbsp; say $_;<br>}<br> <br>=begin<br> <br>=head1 NAME<br> <br>pmdirs - print out module directories<br> <br>=head1 DESCRIPTION<br> <br>This just prints out the current @INC path, one directory per line.<br>This is for people who don't want to parse through C&lt;perl -V&gt; output or<br>hack up their own calls to C&lt;perl -e&gt;.<br> <br>=head1 EXAMPLES<br> <br>&nbsp; &nbsp; $ pmdirs<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/home/tchrist/perllib/i686-linux<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/home/tchrist/perllib<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/devperl/lib/5.00554/i686-linux<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/devperl/lib/5.00554<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/devperl/lib/site_perl/5.00554/i686-linux<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/devperl/lib/site_perl/5.00554<br>&nbsp; &nbsp;<nobr> <wbr></nobr>.<br> <br>This also works for alternate version of Perl:<br> 
<br>&nbsp; &nbsp; $ filsperl -S pmdirs<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/home/tchrist/perllib<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/filsperl/lib/5.00554/i686-linux-thread<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/filsperl/lib/5.00554<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/filsperl/lib/site_perl/5.00554/i686-linux-thread<br>&nbsp; &nbsp;<nobr> <wbr></nobr>/usr/local/filsperl/lib/site_perl/5.00554<br>&nbsp; &nbsp;<nobr> <wbr></nobr>.<br> <br>=head1 SEE ALSO<br> <br>perlrun(1), perlvar(1), lib(3)<br> <br>=head1 AUTHORS and COPYRIGHTS<br> <br>Copyright (C) 1999 Tom Christiansen.<br> <br>Copyright (C) 2006-2010 Mark Leighton Fisher.<br> <br>This is free software; you can redistribute it and/or modify it<br>under the terms of either:<br>(a) the GNU General Public License as published by the Free<br>Software Foundation; either version 1, or (at your option) any<br>later version, or<br>(b) the Perl "Artistic License".<br>(This is the Perl 5 licensing scheme.)<br> <br>Please note this is a change from the<br>original pmtools-1.00 (still available on CPAN),<br>as pmtools-1.00 were licensed only under the<br>Perl "Artistic License".<br> <br>=end</tt></p></div> </blockquote> Mark Leighton Fisher 2010-07-02T16:40:28+00:00 perl6 Business: Execution vs. Ideas <p>If you want to start your own business, you need:</p><ul> <li>A product people want to buy; and</li><li>The willingness to work amazingly hard to get the business going.</li></ul><p>These were my major take-aways from <a href="">Top ten geek business myths</a>, based on the article and my own experiences.</p><p>Ideas? Ha!</p><blockquote><div><p>Don't worry about people stealing an idea; if it's original, you'll have to shove it down their throats. <i>- Howard Aiken</i></p></div> </blockquote><p>What matters more is execution. In my chosen industry, Microsoft has been a good example of this. 
There were other, better OSes, but Microsoft made sure to get their OSes out on everyone's desktops, rather than limiting the user's choice of PC. Although Linux has made great strides, it is still more likely that you will find a reasonable driver for an arbitrary piece of PC hardware for Windows than for Linux. Microsoft has had better execution in getting Windows out to as many people as possible. (Heresy, I know.)</p><p>Even if you revile their products, many of the largest retailers have worked impressively hard getting their products out to everyone, not just a chosen few.</p><p>All of what you know is just a tool (a rather large and handsome tool, admittedly) in the process of getting your own business going. Unless your goal is to be a very small, boutique seller, you want to reach as many people as possible, and brains alone won't get you there.</p><p>Read the article, and tell me what you think.</p> Mark Leighton Fisher 2010-06-23T16:32:29+00:00 others Stupid Lucene Tricks: Search case-insensitive, Retrieve ca <p>Sometimes when you build an index in Lucene, you want to structure the index so that people can search without worrying about case (case-insensitive search), but you want the display to contain the original mixed-case data (case-sensitive display). 
The trick is to split each Lucene field into 2 versions:</p><ol> <li>A case-insensitive field that is indexed but not stored (Lucene.Net.Documents.Field.Index.ANALYZED and Lucene.Net.Documents.Field.Store.NO).</li><li>A case-sensitive field that is stored but not indexed, preferably with a field name similar to that of its case-insensitive cousin field like "Display_Title" and "Title" (Lucene.Net.Documents.Field.Index.NOT_ANALYZED and Lucene.Net.Documents.Field.Store.YES).</li></ol><p>Storing only the case-sensitive version reduces the index storage requirement (I have seen around a 40% increase in index size with this trick as compared to both storing and indexing one field).</p> Mark Leighton Fisher 2010-06-17T10:58:39+00:00 others ZeroMQ: Fastest. Messaging. Ever. <a href="">ZeroMQ</a> (or 0MQ) appears to be a fast (8M+ messages/second), Open Source message-passing engine. I don't have a use for it now, but it does look interesting.<p> <i>(There is no Perl interface for ZeroMQ, but it sounds (without my actually researching the task) like it shouldn't be too hard to clone the Ruby <a href="">FFI</a> interface for use with Perl.)</i> </p> Mark Leighton Fisher 2010-06-11T20:07:57+00:00 others Technical Debt and the Stakeholders <p> <a href="">Technical Debt</a> (and <a href="">Technical Debt Decision Making</a>) are a good take on using the concept of technical debt to ensure that your stakeholders understand why you must spend time fixing their system even though it may seem to be working perfectly fine right now. 
(An example of incurring technical debt is using SQLite when you know that in the long term the system needs to store its data in Postgres or Oracle.)</p><p> <i>(The author of these essays, Steve McConnell, and his team at Construx really know their stuff -- if you have a chance to take one of their classes, grab it.)</i> </p> Mark Leighton Fisher 2010-06-04T16:18:33+00:00 others Not All User Stories Have Happy Endings <p>Sometimes, despite your best efforts as a developer, you end up with unhappy users. And that's OK.</p><p>In "<a href=";tag=nl.e101">Consultants: It's not the theory, it's the execution</a>", Chip Camden makes this point:</p><blockquote><div><p>Sometimes you need to say no to user requests. (Unfortunately, not all user stories have happy endings.)</p></div></blockquote><p>Whatever you do, there will be times when someone is unhappy with you. It matters not that you are the most talented developer ever known, or the most gifted designer that will ever be seen; someone will not like what you have done. It may be your politics, it may be your attitude, it may have no relation to reality -- you will run into people that you just can't please.</p><p>Because defining requirements is so fiendishly difficult, software developers have a special problem in this regard -- especially when the user does not themselves know what they want, but they will "know it when I see it."</p><p>Often if you have one customer, you can completely satisfy them (but not always). When you have numerous customers, you will never satisfy all of their wants, even if given infinite resources; those wants may very well even be contradictory. 
(There are people with contradictory wants -- "the software should be so simple that I can modify it if necessary" and "the software should just know what it is that I want at that moment.")</p><p>Once you have absorbed this idea (you will not please everyone all of the time), you can then concentrate on writing code, without that fear of displeasing a customer blocking your progress.</p><p> <i>(The rest of life is best served by learning this lesson, too.)</i> </p> Mark Leighton Fisher 2009-12-23T17:40:17+00:00 others 10,000+ Exceptions/Hour <p>Although the details are<nobr> <wbr></nobr>.NET-specific, <a href="">Are you aware that you have thrown over 40,000 exceptions in the last 3 hours? </a> is a good overview of what happens when you use exceptions for non-exceptional circumstances...</p><p> <i>(Just Say No to using exceptions for flow control.)</i> </p> Mark Leighton Fisher 2009-11-20T17:11:51+00:00 others Checklists, Recipes and Algorithms <p> <a href="">Checklists, Recipes and Algorithms</a> draws from medicine, cooking, and programming to make the point that sometimes you just need to write down what you are going to do. You don't want to impose too much structure (that way lies spending a week to write a 2-line program (see <a href="">analysis paralysis</a>)), but you do need structure -- and explicit structure is easier to get right because it is easier to analyze.</p><p> <i>(The Cardiac Arrest algorithm in the article is especially nicely presented.)</i> </p> Mark Leighton Fisher 2009-11-13T16:56:41+00:00 others Regular Expressions: The Ultimate in Lack of Redundancy <p>The concepts in regular expressions are simple -- one of anything, a numeric character, a class of characters -- so why do so many people have problems with regular expressions? I think it is the lack of redundancy.</p><p>Each concept in regular expressions is expressed in 1 or 2 characters -- "." is one of anything, "*" is zero or more of the preceding thing, and so on. 
Compare this with C, where matching against a 'b' in a string could be coded compactly as:</p><blockquote><div><p> <tt>match = 0;<br>for (; *c != '\0'; c++) {<br>&nbsp; &nbsp; if (*c == 'b') {<br>&nbsp; &nbsp; &nbsp; &nbsp; match = 1;<br>&nbsp; &nbsp; &nbsp; &nbsp; break;<br>&nbsp; &nbsp; }<br>}</tt></p></div> </blockquote><p>Although a modern language would cut down the size of that code, it still wouldn't come close to the one character of the corresponding regular expression. And therein lies the problem &#8211; we humans rely on redundancy when interpreting information. In theory, you would never need more than one character to represent any concept in a computer language. Yet, the programming languages that gain general popularity are languages with some amount of redundancy built in -- Java, Perl, Python, C# -- the list goes on and on. Given that we&#8217;ve had minimally-redundant programming languages since programming languages were first conceived of in the 1950s (<a href="&amp;%238221;;%238221;">APL</a>, anyone?), if minimally-redundant programming languages were going to take over the world, it would have happened by now -- and it hasn't happened.</p><p>Another example of our need for redundancy is in driving directions. The best driving directions always contain some redundancy -- "you turn at Capitol, which is between Senate and Illinois" -- instead of "you turn at Capitol", as "you turn at Capitol" gives you no idea of where Capitol actually is -- it could be several miles down the road, or 1 block after the previous turn. (I have wondered if giving good directions is a skill similar to that of programming.)</p><p>Reading may be another example -- you can usually get the gist of a paragraph of English text even when only the first and last letters of each word are in their right places (thereby demonstrating that the other letters are mostly redundant).</p><p>Music is possibly another example of the human need for redundancy. 
Whether it is the de-de-de-dah motif of Beethoven's 5th symphony, the distinctive drum line of Led Zeppelin's "Immigrant Song", or the chorus of Green Day's "21 Guns", music relies on redundancy through repetition. In theory, you should only need to hear each part of a song once to derive full musical enjoyment from the song. But instead, in Beethoven's 5th symphony (where there are no words to require musical backing) Beethoven repeats the motif over and over again. And Beethoven's 5th symphony is widely regarded as one of the crowning achievements of music -- yet it is filled with redundancy through repetition, although there is no theoretical reason for that level of repetition. Or is there?</p><p>Truth is, we humans need a certain level of redundancy in our information before a concept is firmly planted in our heads, whether it is a popular song or the clauses of our national constitution. The reason that I and so many others have found such success with the <a href="">Head First book series</a> is because Head First's use of redundancy (presenting each piece of information in several different ways) helps to ensure that you retain the information in the Head First books.</p><p>Perl's "/x" modifier may turn out as one of the most significant advances in regular expression syntax, because<nobr> <wbr></nobr>/x enables the splitting-up and commenting of your regular expressions -- operations that increase the readability and redundancy of your regular expressions (redundancy because the parts of your regular expressions are now represented by a whitespace-bounded line of text instead of just the regular expression characters (in the common case)).</p><p> <i>(Why we humans need all this redundancy is better left to another day, although I will give you a hint: why do humans still have appendixes?)</i> </p> Mark Leighton Fisher 2009-09-23T18:12:58+00:00 others iMacros: Automation for Firefox <p> <a href="">iMacros for Firefox</a> is the Web automation solution I 
have been looking for. Why iMacros? Because:</p><ol> <li>You record what you actually do, rather than trying to reconstruct what you did from memory;</li><li>The end product is a vanilla ASCII macro language editable by vi (or the editor of your choice);</li><li>The macro language works at a high level -- for example, HTML element absolute and relative positioning are among the macro language features included;</li><li>You can extract data from a page as text or HTML;</li><li>An iMacro can be used as a Firefox bookmark; and</li><li>The Firefox add-on version is freeware.</li></ol><p>To give an idea of how exciting I found iMacros for Firefox, immediately after installing it I wrote 2 iMacros that I had been wanting to write forever (a downloader for the titles of my blog posts and a front end to our local real estate browser to jump right to the properties in our county).</p><p>I have only scratched the surface of what can be done with the freeware version of iMacros -- if you need web automation, iMacros is worth a look.</p><p> <i>[Flash and Java can be supported through the DirectScreen command (as yet untested by me, as DirectScreen is only available in the paid editions).]</i> </p><p> <i>[Ob.Perl: Once designed, something like iMacros would be relatively easy to write in a Firefox-embedded Perl.]</i> </p> Mark Leighton Fisher 2009-07-17T17:16:19+00:00 journal Test-Driven Development: Some Hard Numbers <p> <a href="">Realizing quality improvement through test driven development: results and experiences of four industrial teams</a> analyzes the TDD experiences of 4 teams at IBM and Microsoft. 
Nothing surprising here to those who have already experimented with test-driven development (pre-release defect density decrease of 40%-90% combined with a 15&#8211;35% increase in initial development time), but it is good to have some hard numbers on TDD rather than relying solely on anecdotes and hearsay.</p> Mark Leighton Fisher 2009-07-02T16:23:33+00:00 others Why Big Software Projects Fail: The 12 Key Questions <p> <a href="">Why Big Software Projects Fail: The 12 Key Questions</a> by Watts Humphrey clearly talks about how:</p><ol> <li>Requirements management is hard; and that</li><li>When those who perform the work (programmers, graphic designers, etc.) tell you that a task will take X amount of time, you need to listen to them.</li></ol><p>Agile has become popular because finding all requirements at the beginning of a project is nearly impossible for large software projects, especially green-field projects (projects that are the first of their kind). Agile, properly executed, lets you discover requirements by using a running system as a usable prototype for the eventual finished system.</p><p>Agile also requires everyone's involvement in the scheduling, rather than having an arbitrary schedule handed down from on high -- a tactic which I think has contributed to the majority of project failures (real Project Management classes will tell you just how big a no-no schedule imposition is).</p><p>(Mr. Humphrey is worth listening to -- while he led the OS/360 development team, they met their schedule for all 19 releases he oversaw.)</p> Mark Leighton Fisher 2009-06-26T17:02:25+00:00 journal Pretend Project Management <p>Pretend Project Management is when management of the project ignores reality -- from the making of the schedule, to the tracking of actual vs. planned time/money spent, all the way down to the project post-mortem (and mortem it usually is, as the project is often D.O.A.). 
<a href="">Cargo Cult Methodology: How Agile Can Go Terribly, Terribly Wrong</a> is a real-life example of Pretend Project Management, one worth examining in more detail.</p><p>The first red flag is not hiring a system administrator, while simultaneously not allocating the time for system administration in the schedule. The Iron Triangle of scheduling cannot be violated without doing violence to the schedule -- you cannot have system administration work to do without scheduling time for someone to do that work. So, without a system administrator, the schedule has to change so that other people will get that work done. Generally (IMHO), you cannot take an FTE's amount of work in a schedule and say, "Oh, we'll just do that work during slack times." This always comes back to bite you, usually towards the end of the project when you can least afford it. The Project Manager should have modified the schedule to allow time for system administration by whatever means: cutting back features (scope), adding a system administrator (cost), or stretching out the schedule (time). If management does not let you modify the schedule, then Pretend Project Management is what is actually being practiced.</p><p>Another red flag was "Agile Development" but no time in the schedule for quick incremental deliveries (intervals measured in weeks). Perhaps the essence of Agile Development is quick iterations with immediate responses by the customer. If the iterations are not quick, or the responses are not immediate, then the development process is not Agile, despite protestations to the contrary. (And calling quick iterations "silly" as management did shows a serious misunderstanding of the Agile Nature.) Quick iterations maximize the value delivered to the customer, as the feedback from quick iterations keeps development on the correct path. 
If the feedback loop is too long because of lengthy iterations, management introduces the risk that development will produce a product not needed or wanted by the customer. Agile Development without quick iterations is Pretend Agile Development, and Project Management of Pretend Agile Development is Pretend Project Management, as time has not been allocated in the schedule for real Agile Development. (Hint: 4 weeks rather than 4 months is closer to a useful iteration length.)</p><p>The lack of continuous integration is yet another red flag. Agile development should proceed at a fairly steady pace. Continuous integration helps steady the pace, by preventing small, relatively simple build problems from growing into huge, intractable build problems. In the pre-Ethernet days, I once had to integrate months of changes -- trust me, you really don't want to have to do that.</p><p>Those of you with exposure to Project Management training will notice another, massive red flag -- no flexibility in scope, time, or resources. When I have managed projects in the past, resources were fixed, time was fairly-well fixed, while scope was the most variable part of the project. I suspect (without proof) that this is a common pattern, as you sacrifice features to get the project "done" (by some measure).</p><p>From what I have read, Agile Development seems to plan for a fixed number of people (resources) while varying the time and scope of the project. Usually, changing the project scope is the topic for those writing about how Agile differs from other development styles. (This may be because Agile developers have the attitude, "It will be done when it is done, and not a moment before.")</p><p>What conclusions can we draw from this example?</p><ol> <li>Half of the problems had nothing to do with Agile development. 
No flexibility along any axis in the schedule (scope, time, resources) and lack of (system administration) time in the schedule are just varieties of problems that project managers ran into 50+ years ago.</li><li>Unfortunately, some in management do not grasp that all schedules are approximations -- the Pyramid blocks are not delivered on time, the 1942 warship review for the good ship "X" is cancelled because there is no longer a good ship "X" (or "Y", or "Z"...), Microsoft delays a service pack to fix a security bug, and so on and so on. If you get far enough into using Microsoft Project (as an example tool), you will see multiple start and end times for each task (phrases like early start, late start, scheduled start, actual start). Only when you drive the project risk down to zero can you be certain that each task will start and end on time. Not accounting for project risks leads to deciding there will be no flexibility in the schedule, and inflexible schedules lead to project problems (even if your requirements are up-front perfect).</li><li>The other killer classic scheduling mistake (one that was undoubtedly seen back in Roman times and before) is "well, we can't hire someone to do this -- we'll just do it during our slack times". As that work usually tends to pile up, the effects are usually felt at the end of the project when you can least afford it (as mentioned before).</li><li>If you cannot do weekly-to-monthly incremental deliveries, IMHO you are developing in another way -- not in the Agile Way.</li><li>In Extreme Programming (an Agile style), they talk about "Sustainable Pace". This means a pace that team members can comfortably keep up for months or years at a time. Continuous Integration is an essential lubricant for sustainable pace -- without CI, you will waste a lot of time catching up on huge blocks of changes that also change the build process (adding a module/assembly/DLL, etc.). 
Whether or not you are doing Extreme Programming, "Sustainable Pace" is a goal worth shooting for.</li></ol> Mark Leighton Fisher 2009-04-24T18:23:06+00:00 others A Pattern of Troubleshooting <p> <a href="">Troubleshooting</a> works through an example troubleshooting situation (possible cardiac problem), then extracts the pattern to follow when you are troubleshooting, whether you are responding to a medical emergency or debugging a Perl program.</p><p>Worth a look.</p> Mark Leighton Fisher 2009-04-02T16:52:18+00:00 others Grabbing With Your Presentations <p>If you have ever wondered why some presentations grab you while others leave you cold, this TechRepublic download of a chapter from <a href="">Cliff Atkinson's Beyond Bullet Points</a> may provide insight. (The chapter feels like a <a href="">Head First book</a> chapter to me.)</p> Mark Leighton Fisher 2009-03-27T17:59:25+00:00 others Code Contracts for .NET <p> <a href="">Code Contracts for<nobr> <wbr></nobr>.NET</a> "provide a language-agnostic way to express coding assumptions in<nobr> <wbr></nobr>.NET programs" (from their website). This lets<nobr> <wbr></nobr>.NET programmers -- whether C#, VB.NET, IronPython, or whatever -- verify coding assumptions both statically and dynamically. (This is similar to <a href="">Design by Contract</a> in Eiffel.)</p><p>Code Contracts includes a static checker program for verifying both explicit and implicit (null references, array bounds, etc.) code contracts. 
Runtime (dynamic) contract checking can use marked series of If-Then-Throw guard clauses as in:</p><blockquote><div><p> <tt>if ( x == null )<br>&nbsp; &nbsp; throw new ArgumentNullException("x");<br>if ( y &lt; 0 )<br>&nbsp; &nbsp; throw new ArgumentOutOfRangeException(...);<br>Contract.EndContractBlock();</tt></p></div> </blockquote><p>(so you don't waste perfectly good guard clauses) as well as the standard, explicit code contracts like:</p><blockquote><div><p> <tt>&nbsp; &nbsp; Contract.Invariant(this.y &gt;= 0);<br>&nbsp; &nbsp; Contract.Assert(this.x == 3,<br>&nbsp; &nbsp; &nbsp;"Why isn't the value of x 3?");<br>&nbsp; &nbsp; Contract.Requires(x != null,<br>&nbsp; &nbsp; &nbsp;"DANGER -- missiles fired!");</tt></p></div> </blockquote><p>Code Contracts defaults at runtime to throwing an exception (<b>System.Diagnostics.Contracts.ContractException</b>) when a contract is violated (this behavior is configurable).</p><p>I have not tried Code Contracts (or any code contract mechanism) yet, but the idea is intriguing because it lets the computer do something it does well (exhaustive examination of your code in tedious detail) thereby freeing you to work on the higher-level aspects of your program, just as C freed us from assembly language bookkeeping and Perl/Java/VB.NET etc. free us from C language bookkeeping.</p><p>If anyone has experience with code contracts for any language (positive or negative), please comment.</p><p> <i>(Ob. Perl ref. 
-- see <a href="">Moose</a> and <a href="">Class::Contract</a> among others...)</i> </p> Mark Leighton Fisher 2009-03-20T20:36:11+00:00 others Windows Vista and Multi-Level Security <p>Mark Russinovich's <a href="">Inside Windows Vista User Account Control</a> includes many interesting tidbits for those like me who develop for Microsoft Windows, but to me Windows Vista Integrity Levels are DoD-style <a href="">Multi-Level Security</a> by another name.</p><p>This is ironic, as the Department of Defense seems to be moving away from MLS systems, instead going towards PCs where each PC is at one level of security. (DoD developers, feel free to speak up at this point.)</p><p>Worth a look for Windows developers and OS enthusiasts.</p> Mark Leighton Fisher 2009-03-06T18:49:11+00:00 others Dispose, Finalization, and Resource Management <p>As Perl moves into Garbage Collection territory with Perl 6, <a href="">Dispose, Finalization, and Resource Management</a> -- even though it was written about the<nobr> <wbr></nobr>.NET GC -- is worth a look because all garbage-collected languages must deal with these issues.</p><p>If you ignore these issues, you will spend your time debugging memory allocation/release problems instead of delivering functionality to your customers -- and your customers pay you for the functions you deliver (they only put up with your debugging to get those functions).</p><p> <i>(One way to think about Garbage Collection is that GC is like Perl for memory allocation/release -- GC makes easy memory alloc/release easy, and makes hard memory alloc/release possible.)</i> </p> Mark Leighton Fisher 2009-02-27T17:34:13+00:00 others Top 25 Most Dangerous Programming Errors <p>For those who have not seen this -- <a href="">Top 25 Most Dangerous Programming Errors</a>.</p> Mark Leighton Fisher 2009-02-27T17:19:21+00:00 others Zotero -- Open Source Super-EndNote in your Firefox A comment by <a href="">Jon Duke</a> here at Regenstrief led me to Zotero, an Open Source 
Super-<a href="">EndNote</a> in your Firefox (EndNote helps you collect and manage citations). Although Zotero was originally developed for humanities researchers, it is useful for anyone who researches on the Web (whether for publication or software development), as it provides easy collecting, organizing, and searching of your personal citation list (think "bookmarks on anabolic steroids"). Zotero provides more functionality if the web page is designed for Zotero (see <a href="">Make your site zotero ready</a>), but any web page can be linked or captured into Zotero for later use. Some Zotero features: <ul> <li>Zotero will automatically gather citation information if it is present on the page. You can collect any page with Zotero, but you may need to fill in some information if the automatic citation info is not present. Note that among other popular websites provides Zotero-compatible citation info.</li> <li>You can collect either just the link to the page, or a snapshot (copy) of the page.</li> <li>Notes let you annotate your citations to any level of detail. Notes can also stand alone (i.e. notes not attached to a citation).</li> <li>A Zotero citation can have zero or more attachments.</li> <li>Collections let you gather related items together. A citation can exist in more than one collection.</li> <li>Tagging lets you group your citations in arbitrary ways. Zotero may automatically grab the LC subject headings for book citations and keywords for article citations.</li> <li>You can work with Zotero when off-line (airline travel, anyone?) although linked citations will of course be unavailable except for their Zotero metadata.</li> </ul><p> As a personal example, <a href="">RELMA</a> is moving from VB6 (1998 technology) to VB.NET (2008 technology), so there are lots of good additional features in VB.NET to learn about. 
I have started using Zotero to track my .NET Web citation links for RELMA so those links are ready for when I need them (as in <a href="">the ability to use ASP.NET as a text template engine outside of IIS</a> (handy for internationalizing RELMA's HTML output)). Try Zotero -- you may like it!</p> Mark Leighton Fisher 2009-02-17T17:38:45+00:00 internet The Important Numbers of Testing: 0, 1, and Many <p>Although there is an infinity of numbers to use in software testing, the 3 important numbers are 0, 1, and Many.</p><p>0, the number of nothingness, comes into play when you don't have anything. C enshrined 0 as the null pointer, though other languages and systems had represented nothing by a memory address of 0 before C. (There were other representations of null -- they make for interesting reading.) Customers without orders, Webpages without links, forests without oak trees -- all of these are most easily represented by a 0 inside a computer. Even inside your computer, your programs are not a closed system. You can run out of memory (although Perl eliminates the silly cases of this), you might forget and make a directory unreadable (<b>0</b> files), a compiler error could skip an allocation statement (<b>0</b> elves) -- the list can go on and on. If you don't consider the case of 0, eventually your software will fail. (Conversely, I once <a href="">wrote a server in Perl 4</a> that ran for months at a time because I did extensively consider and test the case of 0.)</p><p>1, the first number, is seen when you only have one of something, an idea so common that it becomes the <a href="">Singleton pattern</a> in languages that need a special representation of one and only one instantiation of a class. With 1 of something, everything has to be instantiated, but you don't have the problems of multiple copies of the item in question. 
If the item is part of a collection of items, I have occasionally seen defects where the collection is not allocated if there is only 1 item. It is probably an artifact of my coding style, but I don't see many defects in my code specific to 1 and only 1 item. When the common cases are 0 and many, I have seen code that fails to work on all of the edge cases of 1 and only 1 item.</p><p>"Many" often just means "more than 1". Usually, your code does nothing different for the 472nd item than it does for the 2nd item. There are times, though, when code(2nd) != code(472nd) (3-column display code comes to mind here). Handling many items involves sizing their containers appropriately. Almost any allocation algorithm can give you space for 1 item -- only correctly constructed allocation algorithms will always yield the right number of places to contain your items. The familiar fence-post error of array management is but one example of a failed allocation algorithm (and failure to properly test for many items).</p><p> <b>defined()</b> is the special case of accessing an item before it is initialized. A real-world case is a restaurant without any customers. Attempts to access any customer data will only find undefined values. Undefined values can occur when you grab large blocks of data for performance reasons -- the example restaurant and all its customers from before -- where you grab so much data in one fell swoop that not all of it is initialized. Incomplete data is not defined. A customer without a cellphone (or without a landline phone) would have an undefined value for that phone field. A broken tire pressure sensor could yield an undefined value when read. Sometimes you can just ignore undefined values, but other times you have to explicitly handle them (think running sums or some statistical operations).</p><p>Although you may have other special numbers to test, you will likely have to test at least 0, 1, and many. 
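</p><p>As a minimal sketch of testing 0, 1, and many (plus undefined values), here is a small Python example. The helper names <b>chunk_into_rows</b> and <b>sum_defined</b> are hypothetical, invented purely for illustration; the 3-column layout is an assumption that echoes the display-code remark above, and Python's <b>None</b> stands in for an undefined value.</p>

```python
def chunk_into_rows(items, columns=3):
    """Split items into rows of at most `columns` entries (hypothetical helper)."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    return [items[i:i + columns] for i in range(0, len(items), columns)]


def sum_defined(values):
    """Total the readings, skipping undefined (None) entries (hypothetical helper)."""
    return sum(v for v in values if v is not None)


# 0: nothing to display must yield no rows, not an error.
assert chunk_into_rows([]) == []
# 1: a single item still gets its own (partial) row.
assert chunk_into_rows(["a"]) == [["a"]]
# Many: an exact multiple of the column count, then one past it (the fence post).
assert chunk_into_rows(list(range(6))) == [[0, 1, 2], [3, 4, 5]]
assert chunk_into_rows(list(range(7))) == [[0, 1, 2], [3, 4, 5], [6]]

# Undefined values: a running sum has to handle them explicitly.
assert sum_defined([]) == 0
assert sum_defined([None]) == 0
assert sum_defined([1, None, 2]) == 3
```

<p>The seventh item is the interesting "many" case here: it is the first one that lands in a partial final row, which is where an off-by-one sizing mistake would show up.</p><p>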
The multiples of many, the somethingness of 1, and the nothingness of 0 will need to be tested to ensure adequate test coverage of your code.</p> Mark Leighton Fisher 2008-06-18T10:58:43+00:00 others The Golden Rule of Data Manipulation <p> <a href="">The Golden Rule of Data Manipulation</a> can be summed up as "Concatenation is Easy, But Parsing Is Hard". But we are talking really, really hard here -- not just lifting a dining room hutch hard, but lifting the Empire State Building hard (in the end game). That hardness has been a large barrier in natural language communication for computers, as parsing an arbitrary sentence is ludicrously hard. AIML systems have worked around the problem by restricting both the domain of discourse and the variety of sentences recognized, but they have only worked around the problem, not solved it. If you start from a point of concatenating simple, nearly atomic data, your programming task will be much easier (and much more likely to lend itself to later parsing, rather than starting at arbitrary parsing of your data). Anyway, read the article!</p> Mark Leighton Fisher 2008-06-18T10:57:43+00:00 others PageRank is Precomputed Relevancy Ranking <p>Google's PageRank is precomputed relevancy ranking, where the heavy lifting of actual relevancy ranking is done by us humans. Why is this important? I was re-reading <a href="">A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART)</a>, which lays out how computerized indexing can beat the best manual indexing by:</p><ul> <li>Using a stop-word list;</li><li>Using a thesaurus (synonyms); and</li><li>Relevancy ranking.</li></ul><p>(It's more complicated than that, but you get the idea.) Relevancy ranking is the hardest part of the indexing job, as there are no clear-cut algorithms for relevancy ranking with both excellent precision and excellent recall (getting all of the documents you want and none of the documents you don't want). 
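</p><p>To make the two measures concrete, here is a minimal Python sketch of the usual set-based definitions: recall is the fraction of the documents you want that were actually returned, and precision is the fraction of returned documents that you wanted. The function name <b>precision_recall</b> and the document IDs are hypothetical, and returning 0.0 for an empty set is an assumed convention, not something taken from the paper.</p>

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # wanted documents that were returned
    # Assumed convention: an empty denominator yields 0.0 rather than an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents returned, three documents wanted, two in common.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 6])
assert precision == 0.5            # 2 of the 4 returned were wanted
assert abs(recall - 2 / 3) < 1e-9  # 2 of the 3 wanted were returned
```

<p>Returning every document in the index would push recall to 1.0 while wrecking precision, which is why a ranking has to be judged on both numbers at once.</p><p>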
Google's PageRank works around the difficulty of relevancy ranking by handing the hardest part -- the ranking of individual documents -- to us humans. You can get good results from proper metadata, but metadata is useful only in environments where no one has an interest in gaming the metadata (I wonder if it should be called "The Semantic Intranet"? That's where Semantic Web technologies really make sense to me.)</p><p>The original paper is worth a read, especially if you work on software that incorporates search -- and these days, I suspect that almost any non-embedded program could grow to a point where it incorporates a search mechanism (and an email client, and a web browser -- you get the point).</p> Mark Leighton Fisher 2008-05-30T17:04:26+00:00 others Good Inheritance and Bad Inheritance <p> <a href="">Inheritance is evil, and must be destroyed</a> is the slightly overwrought title of an article by BernieCode that, nonetheless, expresses an idea that I've long held: that most use of inheritance is better represented by either composition (HAS-A rather than IS-A) or by interface implementation/Perl 6 roles (ACT-AS rather than IS-A).</p><p>Inheritance works well for classes that are actually closely related (the canonical example of classes that represent the relationship of various species springs to mind here). What you often want (in my experience) are classes that can act in a certain way -- for example, a <i>horse</i> and a <i>dog</i> that can act like a <i>pet</i>. The EventManager example in the article above is a particularly good example of where a Perl 6 role/Java interface/etc. 
solves a problem much more neatly and clearly than inheritance does.</p><p>By the way, <a href="">Solving compositional problems with Perl 6 roles</a> (which I just discovered) also looks like a pretty good resource on this topic, especially for us Perl users.</p> Mark Leighton Fisher 2008-05-23T17:04:52+00:00 others pmtools-1.10 Release <p>Now at a CPAN mirror site near you -- <a href="">pmtools-1.10</a>. Tom "spot" Callaway of Fedora Core let me know that the Fedora folks were concerned about the fact that pmtools was only licensed under the Perl 5 Artistic License (they were concerned about how well the Artistic License 1.0 would stand up in court). So, pmtools (starting with v1.10) is now dual-licensed like Perl (Artistic and GPL). (My other public Perl stuff is also dual-licensed.) I also added my copyright to pmtools, as I had not added my name to the copyright when I took it over.</p><p>Off-hand, I don't recall why Tom Christiansen used only the Artistic License for pmtools. Anyone with a clue, please drop me a line. (That of course includes you, Tom.)</p> Mark Leighton Fisher 2008-03-07T18:09:54+00:00 cpan Navigational Spaghetti -- What are your thoughts? <p> <a href="">Navigational Spaghetti -- What are your thoughts?</a> presents the dilemma of making program navigation both easy and flexible. Somehow <a href="">MVC</a>/<a href="">MVP</a> come to mind here...</p> Mark Leighton Fisher 2008-01-25T17:17:10+00:00 others