Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

Ovid (2709)

Ovid
  (email not shown publicly)
http://publius-ovidius.livejournal.com/
AOL IM: ovidperl (Add Buddy, Send Message)

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Tuesday July 29, 2008
08:20 AM

Static and Dynamic Types: A Definition

[ #37044 ]

Note that the following was hastily written on my lunch break. It doubtless contains some serious errors. Feel free to correct any of my misconceptions.

I've read a fascinating online article about type systems and it's worth reading a few times. While I'm a far cry from an expert in type theory, I'm quite impressed by the article. The author is clearly well-versed in this topic and has clarified quite a bit about the debate, though I'm disappointed that references are generally not cited (though examples abound).

First, here's the author's definition of static and dynamic types:

A static type system is a mechanism by which a compiler examines source code and assigns labels (called "types") to pieces of the syntax, and then uses them to infer something about the program's behavior. A dynamic type system is a mechanism by which a compiler generates code to keep track of the sort of data (coincidentally, also called its "type") used by the program.

In other words, static typing is not about the kinds of data we have, but about the behavior of a system. Of course, this is often tightly related to the kinds of data. For example, asking for the factorial of an IP address doesn't make a lot of sense and for a proper program, won't be allowed in a static system, but it's trivial to generate a failure like this in a dynamic system. Of course, merely because we have safety (often compile-time) about certain failure modes, this does not guarantee the correctness of the program, but certain kinds of bugs are less common.

Dynamic typing, on the other hand, is primary concerned with tracking the kinds of data used and behavior is up to the programmer. It's not about the syntax and the kinds of data generally cannot be statically inferred. This has an interesting consequence that your type system cannot reason up front about the behavior of any given piece of code. For example, consider this pseudo-code:

total = price + tax;

If $tax is the string "fifteen percent", the above code will not perform as expected in a dynamically typed system but the type system has no way of knowing that this condition will not occur and thus cannot warn you about this failure. However, depending about how the dynamic typing was implemented, that code might continue to run or it might crash.

As a more insidious example, imagine trying to do type inferencing in Perl. Does that array reference contain only integers? You can toss a hashref in there and things will crash badly, but an inferencing system probably can't know this will happen. As a result (and as the author points out), testing tends to be more widely adopted in dynamic languages. Many types of tests are useless in solid static languages, but merely because there is some behavioral safety does not guarantee the correctness of the programs.

Moving on, consider three kinds of programs:

  1. Software is correct and runs without failure.
  2. Software is incorrect and crashes.
  3. Software is incorrect and doesn't crash.

Clearly correct software is preferable, but incorrect software which crashes is far preferable to incorrect software which doesn't crash. This perfectly exemplifies a terrible failure mode in SQL:

SELECT name
FROM   customer
WHERE  age > id

Not only is that completely useless, but it's likely to not crash. Thus, we have reasoning about type systems forced onto the developer when it should be handled by the type system itself. What we want to do is define the kinds of data we handle, the operations allowed with them and the results of those operations. This could allow the database to properly reason about a statement, something they frequently don't do. Thus, since SQL without user-defined types is more concerned about the kinds of data rather than the uses of the data, we can consider this is a very poorly implemented type system (but I've not had enough experience with user-defined types in SQL to know how solid they are).

Another interesting distinction was structural versus nominal typing. In structural typing, we assume a type whenever it is safe to do this (e.g., does it provide what our code needs?). Nominal typing requires this to be explicit. This appears to be very similar to one distinction between mixins and roles (traits). The distinction in question is that a role is named and mixins merely shove methods into your class. As a result, mixins rely on duck typing and hope that the "bark" method really refers to a dog and not a tree. With traits, you can explicitly check the name to have a better sense of safety that the method in question does what you want (it's not perfect, but it's better than mixins). Note that this makes traits similar to static typing, but it's still pushed on the developer, but Perl 6 might alleviate some of that.

Another interesting bit is the discussion of testing. Testing, as we know, doesn't prove the absence of bugs. It merely asserts that under a given set of conditions, certain behaviors hold.

A proper type system, however, can prove that certain types of bugs don't exist (but a given proof can merely show that the conclusion follows from the premises, not that the conclusion is desirable or that the premises are correct). Mark Jason Dominus once pointed out how a type system caught an infinite loop in his program. This is not a well-known feature of good type systems, but it's not unexpected. However, the author goes on to cite examples of type systems used to prove databases are normalized or that a program contains no "index out of bounds" exceptions on arrays, things which most programmers generally don't consider to be part of type systems.

At its core, the article generally defines a type system as a set of syntactic rules which verifies the absence of certain behaviors in a program (meditate about taint checking for a while). The complexity really comes in when we acknowledge that the needed "absent behaviors" vary greatly from program to program. Needless to say, dynamically typed languages leave much of this up to the programmer.

This paper has taught me a lot and reinforced many of my suspicions about type systems. What I think I ultimately want is a sound, implicitly typed language with nominal types, but what I want even more is a paycheck, so I'll stick with Perl for the time being.

It's also worth noting that taint checking and pragrams such as fatal and strict also alter how Perl's type system behaves. Interesting food for thought.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • Perl's static type system cares more about containers than values. Ponder that for a while, if you really want to confuse the debate.

    • That's actually a point explicitly raised in the article. Dynamic systems focus tracking what kinds of data are used in a system, not on syntactic evaluation of expressions. See the "Problems with Common Definitions" section which begins with this:

      One common saying, quoted often in an attempt to reconcile static and dynamic typing, goes something like this: Statically typed languages assign types to variables, while dynamically typed languages assign types to values. Of course, this doesn't actually defin

      • That's not quite the same. Perl 5's built-in static types are scalar, array, and hash (and Maybe reference, if you pardon the Haskellish pun). You do get compile-time checks on using them appropriately.

        • The important point is that Perl's type system is still primarily dynamic in nature, though as the article points out, static languages all have dynamic elements and vice versa. Perl's compiler is incapable of arbitrary reasoning about the adding two scalars together. They might be overloaded. They might be overloaded at runtime. They might have strings, they might have numbers, etc.

          Perl's extremely limited type system in fact, must be explicitly requested:

          #!/usr/bin/perl

          print "foo ($test[1]) bar";
          __E

        • While those three are what most people are aware of using, Perl 5 has more static types than that. For example typeglobs. Perl 5 also has a parallel static type system in its attribute system.

          That said, the article did not draw the usual simplistic divisions. And made a point of saying that many languages have both static and dynamic systems, but usually one is more fully developed. Meaning that there are classes of problems that can be solved either through a dynamic type system or a static type system

          • Inside interface boundaries, quite likely, but I suspect that the use of static types will be quite common for published APIs at least in certain kinds of domains. At the point where client code might lob parameters at you and you can’t be certain that you’ll be called correctly, you need checks on the values anyway, and it’s much easier to do that by declaring signatures than by hand-rolling validation code (or even writing validation profiles for Params::Validate or whatever). Even so I

          • However the complex things that you'd want from a type system are mostly done with dynamic typing in Perl.

            Wouldn't it be more fair to that that the things we think we want from a type system are done with dynamic typing in Perl? Since, as the author points out, you can capture and verify domain specific knowledge via a static type system (see the paper on using static types in Haskell to verify that a database is normalized), it appears that what people are typically doing with static types is rather generic, but their advanced expressiveness (such as catching infinite loops) is not possible with Perl 5, Pytho

            • I didn't say all, I said mostly.

              It is true that an advanced static typing system can do things you simply can't do within Perl 5. On the other hand duck typing allows things that can't be easily done within a static system. Think Test::MockObject and Test::MockObject::Extends.

              However the majority of the day to day utility that most people want from a type system can be delivered either way. It is possible that if more people knew what was possible then our wants could change and that would no longer be t

              • Actually, the paper appears to start with Portuguese for the introduction, but the paper itself is in English. That threw me off, too. Aside from that, your points are well taken.