Ovid (2709)

http://publius-ovidius.livejournal.com/
AOL IM: ovidperl

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Thursday June 16, 2005
12:26 PM

Hungarian Notation Rocks!

[ #25237 ]

Yeah, I'll bet that's a journal title you weren't expecting from me. Most Perl programmers object to Hungarian notation for a very simple reason: if we wanted a stronger type system, we'd use a language that provides it. We, for some strange reason, demand the ability to add apples to IP addresses, never mind that we wouldn't actually do such a thing. Thus, you're not going to see this anytime soon:

$strFoo .= $strBar;

Of course, the counter-argument by Hungarian Notation proponents is this:

// this looks fine
result = bar + foo

// this does not
intResult = intBar + strFoo

That just looks wrong and it probably is. Hungarian Notation, therefore, can potentially allow one to spot bugs without having to go back to the type declarations, or worse, scan through every assignment. There are, of course, a variety of problems. If you have to change the type of a variable, finding all instances of it can be miserable. Not only do you have to find everywhere it's used and ensure that it still does what you mean, you also have to change the variable name everywhere. Further, adding all of those prefixes was just makework that really didn't give you a good sense of what the variables were for. Just because two numbers were floats didn't mean you could add one to another.

I want a stronger type system, but not one based around data types. I want one based around the domain of values that a variable can have and how it is to be used. To a certain extent, I think many people appreciate the value of this. If a number must be prime, we write &is_prime and throw the number at it. If it must be negative, we write if ($num < 0) { ... }. Of course, this can be very error-prone: we're testing the data after the fact. While this is sometimes necessary, if a particular data domain is required and we forget the test, bugs tend to be the result. Objects, being custom data types, allow us to have a bit more control.

# perl 6
my Prime    $prime .= new(5);
my Negative $neg   .= new(-2);

# fails if Prime is correct
try { $prime.val = 57; }

# later
multi method quux(Prime $val)    {...}
multi method quux(Negative $val) {...}
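
For comparison, here is a rough Perl 5 sketch of the same idea. Treat it as an illustration under assumptions: the Prime package, its val accessor, and the is_prime helper are hypothetical names invented here, not taken from any CPAN module, and the primality test is naive trial division.

```perl
use strict;
use warnings;

# Hypothetical Perl 5 analogue of the Prime type above.
package Prime;

# Naive trial division: fine for a sketch, slow for large numbers.
sub is_prime {
    my ($n) = @_;
    return 0 if $n !~ /^\d+$/ || $n < 2;
    for my $d ( 2 .. int( sqrt($n) ) ) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

sub new {
    my ( $class, $value ) = @_;
    my $self = bless {}, $class;
    $self->val($value);    # validate through the same path as later updates
    return $self;
}

# Combined getter/setter that refuses values outside the domain.
sub val {
    my $self = shift;
    if (@_) {
        my $value = shift;
        die "$value is not prime\n" unless is_prime($value);
        $self->{val} = $value;
    }
    return $self->{val};
}

package main;

my $prime = Prime->new(5);

# 57 = 3 * 19, so the assignment dies and the old value survives.
eval { $prime->val(57) };
print "rejected: $@" if $@;
my $current = $prime->val;
print "$current\n";    # prints 5
```

The check lives in one place, so nobody has to remember to call is_prime before using the value.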

And with that, we can establish a proper domain for a variable and have somewhat less worry about using it incorrectly. It still doesn't tell us how the variable should be used, but it's still a nice, generic way of specifying a useful data type. Which brings us back to Hungarian notation. I can't name every variable $prime or $neg. However, what would happen if we saw the following?

$prmResult = $pntFoo + $prmBar;

Assuming that we understood that "pnt" represented a Point and "prm" represented a Prime, we might get the idea that adding them might be useless. We get a hint that something is wrong by looking at a single line of code. In this case, the notation is not telling us the type, but giving us a clue as to the domain and, possibly, the usage. In fact, the variable $prmBar is probably poorly named (for the "prm", not the typical "Bar"). "prm" doesn't tell me how that variable is to be used, but "pnt" is more useful (ignoring that we don't know how many dimensions there are.) Further, if we decide to later change the data type of a variable but not its usage, the prefix could still be meaningful and provide useful information about how it should be used. (Heck, I'm constantly changing data structures and types without changing the names.) And, according to Joel Spolsky, this was the original intention of Hungarian Notation.
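
As a rough sketch of that point about changing representations, assume a hypothetical "pnt" prefix for 2-D points and two helpers, pnt_xy and pnt_distance (both names invented for illustration). One point is stored as an array reference and the other as a hash reference; the prefix describes usage rather than storage, so it stays accurate for both:

```perl
use strict;
use warnings;

# Hypothetical helper: normalizes either representation of a 2-D point
# to an (x, y) list, so callers only care that the value is a "pnt".
sub pnt_xy {
    my ($pnt) = @_;
    return ref($pnt) eq 'HASH' ? @{$pnt}{qw(x y)} : @{$pnt};
}

# Distance between two points, whatever their underlying storage.
sub pnt_distance {
    my ( $pntA, $pntB ) = @_;
    my ( $ax, $ay ) = pnt_xy($pntA);
    my ( $bx, $by ) = pnt_xy($pntB);
    return sqrt( ( $ax - $bx )**2 + ( $ay - $by )**2 );
}

my $pntOrigin = [ 0, 0 ];              # original representation
my $pntCorner = { x => 3, y => 4 };    # later representation

my $distance = pnt_distance( $pntOrigin, $pntCorner );
print "$distance\n";                   # prints 5
```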

--

A smattering of extra thoughts that should probably be footnotes if I weren't lazy and really wrote this up properly instead of just tossing it off quickly with long, run-on sentences that annoy the heck out of you:

  • It's interesting that, years ago, there was a foreshadowing of the power of dynamic typing coming from, of all places, Microsoft.
  • This is also why I object to some OO purists who insist that accessors and mutators are always evil and must be avoided. I often want objects that are nothing more than glorified data types. And yes, I'll cheerfully set and get values on them, thank you.
  • I also don't want to suggest that Hungarian Notation should be used everywhere. Certainly not! But the wider the scope of a variable, the more a programmer will appreciate a sane name that gives a good indication of what it's used for.
  • And if you object, read the Spolsky article, first :)
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • It's for users of primitive languages without sigils on their variables. Poor sods. They never know what a variable is without hunting down the declaration.
    • Did you read the article? It talks about two types of Hungarian notation, "Apps" (what the author originally intended and the more useful type) and "Systems" (what others misinterpreted the original type as and more like what you are referring to, but having equally useless information like "integer", "long", "char"). Sigils don't tell you if it's an index to an array, a 3-D point object, or the width of a bitmap.
    • Knowing that a variable is a scalar hardly tells you what it is, other than a scalar. Is it a string? A hash reference? A quadruply nested array of hashes of arrays of hashes? I guess you'll have to hunt down the declaration anyway, even in Perl.
  • One reason why RHN-the-lesser raises the hackles of many a Perl hacker is because we already have RHN built into the language.

    At the heart of it all, Perl has only five types that matter: scalars, arrays, hashes, filehandles and coderefs. (Yes, I skipped two. If you know what they are, you know why I skipped them. ;-) Yes, there are tied variables, but what matters about them is not how their internals differ, but how they automagically hide those internals. And objects are another kettle of fish, but

    • The sigil is only half the battle. Joel's article has a great example of pulling in unsafe data from a Web form. Perl's taint mode aside, we can't always know when it's safe to use a variable. When I was coding CGI apps, I did something like this:

      my $_name = $cgi->param('name');
      my ($name) = $_name =~ /($untainting_regex)/;

      In this case, the sigil tells me nothing about whether or not it's safe to use that variable. If everything untaints, I can shove that data in the database. If it doesn't, I h

      • Crud, I hit 'Submit' instead of 'Preview'. Damn. Someone needs to read The Design of Everyday Things [mit.edu]. (Including me, to be fair.)

      • In this case, the sigil tells me nothing about whether or not it's safe to use that variable.

        Right. Because Perl embeds RHN-the-lesser (the kind of stuff in Petzold's book and the Windows API). The entire point of RHN-as-intended is to use common prefixes and principles of composition to describe the meaning of a variable: taintedness, Primality, object behavior, worksafe content, etc.

        The one huge problem with "Reverse Hungarian Notation", as Joel says, is that virtually everyone thinks RHN is RH

  • Assuming that we understood that "pnt" represented a Point and "prm" represented a Prime, we might get the idea that adding them might be useless.

    I'd much rather use a language where I can sensibly declare points and primes as different types and get a nice compile time or runtime error if I try and add them.

    Getting humans to do things that compilers can do seems counterproductive now we have the chance to use vaguely decent languages ;-)

    • Getting humans to do things that compilers can do seems counterproductive now we have the chance to use vaguely decent languages ;-)

      +1

    • I'd much rather use a language where I can sensibly declare points and primes as different types and get a nice compile time or runtime error if I try and add them.

      I'd much rather use a language that handles these kinds of strict typing issues for me and provides sensible error messages when a type error is found. ;-)

      (Yes, I'm using Haskell these days. And whenever I haven't seen a type error in a couple of days, it takes a while to figure out what ghc is trying to tell me.)