All the Perl that's Practical to Extract and Report

  • So... what's COBOL like? I've tried skimming a book and a tutorial or two on it, but somehow the syntax is too alien to me (not "Algol-like", as they used to say), and it sort of just washed over me. I just couldn't make sense of it.
    • COBOL was revolutionary for its time. If memory serves me correctly, it was an initiative, supported by the US military (though not a US military initiative, despite common misperception), to develop a common language. At the time different computers, if they had a high-level language, had a proprietary one. Several different computer manufacturers and government agencies collaborated on a two-step program (pun not intended) to create a common programming language for businesses. An interim language was to be created (it became COBOL) and a successor language was to follow. The conference produced COBOL, but the successor language never followed. Grace Hopper, widely regarded as the creator of the language, stated that if she had known that COBOL was the final product, she would have done things differently.

      The following is necessarily a brief and incomplete overview. You can also read this [perlmonks.org], that [perlmonks.org], and the other thing [perl.com].

      The main aim of COBOL was ease of maintenance. Programs take much longer to write, but are (in theory) easy to maintain. Every COBOL program has four sections, called "divisions".

      1. IDENTIFICATION DIVISION.

        Who am I? (program name, author, etc. Largely optional)

      2. ENVIRONMENT DIVISION.

        Where am I? (what files can I access, how are they formatted, etc.)

      3. DATA DIVISION.

        What can I play with? (declare variables)

      4. PROCEDURE DIVISION.

        How do I actually play? (the program)

      The main difficulty in developing programs comes from the DATA DIVISION and the "WORKING-STORAGE SECTION." (See all those periods? Yes, they are significant). If I want, in COBOL, to develop a record describing a person and some personal information, I might do the following (formatting is funky due to the limitations of the textarea):

      01  CUSTOMER.
          05  CUSTOMER-NAME.
              10  LAST-NAME PIC X(40).
              10  FIRST-NAME PIC X(20).
              10  MIDDLE-INIT PIC X(1).
          05  TELEPHONE.
              10  AREA-CODE PIC 9(3).
              10  EXCHANGE PIC 9(3).
              10  NUMBER PIC 9(4).

      Of course, I'd list more than that, but you can see how cumbersome it becomes to create an individual record. Also, note the numbers after the PIC (picture clause). Variables are generally fixed-width, because records in mainframe data files tended to be fixed-width (that can vary, but it usually holds true), and each number is the number of bytes that field occupies. This isn't the maximum size. It's the size, period. There were no newlines or carriage returns: if I was working with a file that had 80-character records, I just read in 80 characters at a time. If I saw a file listed as having 133-character records, I knew it was probably a report file, because we used 132-character printers and the first character was reserved for what was known as carriage control.
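
      For readers who don't speak COBOL, here is a minimal Python sketch (not COBOL, and not anything from the original programs) of what "the size, period" means in practice. The field names and widths are copied from the CUSTOMER record above; the sample data and the parse_record helper are made up for illustration.

```python
# Field names and widths copied from the CUSTOMER record in the comment.
LAYOUT = [
    ("LAST-NAME", 40),
    ("FIRST-NAME", 20),
    ("MIDDLE-INIT", 1),
    ("AREA-CODE", 3),
    ("EXCHANGE", 3),
    ("NUMBER", 4),
]
RECORD_LENGTH = sum(width for _, width in LAYOUT)


def parse_record(record):
    """Slice one fixed-width record into named fields; no delimiters involved."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(
            f"expected {RECORD_LENGTH} characters, got {len(record)}"
        )
    fields = {}
    offset = 0
    for name, width in LAYOUT:
        # Every field starts exactly where the previous one ended.
        fields[name] = record[offset:offset + width]
        offset += width
    return fields


# Made-up sample: short values are padded with spaces to the full width.
sample = "Smith".ljust(40) + "Jane".ljust(20) + "Q" + "212" + "555" + "0100"
customer = parse_record(sample)
print(repr(customer["FIRST-NAME"]))
print(customer["AREA-CODE"], customer["EXCHANGE"], customer["NUMBER"])
```

      Note that the padding comes back with the data: FIRST-NAME is always 20 characters, whether the name is "Jane" or something that fills the whole field.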

      Fixed-width layout allows for interesting things. Because we have aggregate names for variables, if I have a slot on a report that is exactly 61 characters wide and named "RPT-CUSTNAME", I can do this:

      MOVE CUSTOMER-NAME TO RPT-CUSTNAME.

      That single statement moves the last name, first name, and middle initial.

      Note some problems with the variable declarations above. They are very inflexible. What if I have a foreign phone number that doesn't fit the US area code, exchange, number format? Well, if it's 10 digits or fewer, I could fudge it:

      MOVE FOREIGN-NUMBER TO TELEPHONE.

      If the number has more than 10 digits, though, you're out of luck.
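
      Both MOVE statements above can be mimicked in a few lines of Python. This is only a rough sketch: the move helper is hypothetical, and it assumes the simple case of a character-for-character copy that pads with spaces or cuts off on the right, with FOREIGN-NUMBER treated as a plain string of digits.

```python
def move(source, width):
    """Rough stand-in for a character MOVE into a fixed-width field:
    left-justify, then pad with spaces or truncate on the right."""
    return source.ljust(width)[:width]


# The group item CUSTOMER-NAME is just its three children laid end to end.
customer_name = move("Smith", 40) + move("Jane", 20) + move("Q", 1)

# MOVE CUSTOMER-NAME TO RPT-CUSTNAME: one 61-character copy.
rpt_custname = move(customer_name, 61)

# MOVE FOREIGN-NUMBER TO TELEPHONE: ten digits fit, extra digits are lost.
telephone_fits = move("0123456789", 10)
telephone_cut = move("012345678901", 10)

print(len(rpt_custname))
print(repr(telephone_fits))
print(repr(telephone_cut))
```

      Under those assumptions the 12-digit number comes out as its first 10 digits with no complaint, which is the "out of luck" case.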

      Another design goal of COBOL was to be very human readable:

      DO-THIS.
          MOVE 2 TO MY-VALUE.
          PERFORM DISPLAY-MY-VALUE.
          STOP RUN.
      DISPLAY-MY-VALUE.
          DISPLAY MY-VALUE.

      Anyone, even your pointy-haired boss, should be able to figure out what that does. Statements like

          COMPUTE TOTAL = COST + TAX.

      are so ridiculously easy to read that maintenance should be a snap. Except...

      Did you notice anything strange about the first "human readable" code snippet? Yes, all variables are global. Truly global. There is no concept of namespaces or anything funky like that. This is why conventions arise in different shops to name variables in very specific ways. Often, the variables embed the name of the program they are in (WS-GLJ0900R-INIT-TAX) to simulate a namespace. This ensures that a program consisting of many files linked together won't have variable collisions.

      Other problems include file handling. COBOL programs are typically not executed directly, but called from JCL (Job Control Language) that tells the program what files are available, what the format is, what disk space is available, etc. As I mentioned earlier, much of this information is also included in the ENVIRONMENT DIVISION. Thus, if you change one, it's easy to forget to change the other. Whoops!

      One of the most difficult tasks in COBOL is munging free-form text (which is why COBOLScript is so ridiculous, given that "free-form text" is virtually a perfect description of the Web). On one program I was working with, the programmer was trying to split apart a CSV file sent from an NT box and force those records into typical mainframe fixed-width fields. The relevant section of code was 150 (!) lines long and broken. I found that the programmer didn't quite know how to use COBOL's "UNSTRING" statement (similar to Perl's "split"), and in fixing the code, I got it down to about 80 lines. Playing around, I got it down to 10 lines of Perl code, and that was with error checking. That was one of the incidents that convinced me that I didn't need to be programming mainframes any more.
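
      The original COBOL and Perl are not shown, so the following is only a guess at the shape of the task, sketched in Python: split each CSV row and force the values into fixed-width fields, with basic error checking. The layout, the sample rows, and the csv_to_fixed name are all hypothetical.

```python
import csv
import io

# Hypothetical target layout: (field name, width in characters).
LAYOUT = [("LAST-NAME", 40), ("FIRST-NAME", 20), ("MIDDLE-INIT", 1)]


def csv_to_fixed(lines):
    """Yield one fixed-width record per CSV row, rejecting rows that do not fit."""
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if len(row) != len(LAYOUT):
            raise ValueError(
                f"row {row_number}: expected {len(LAYOUT)} fields, got {len(row)}"
            )
        record = ""
        for value, (name, width) in zip(row, LAYOUT):
            if len(value) > width:
                raise ValueError(
                    f"row {row_number}: {name} exceeds {width} characters"
                )
            # Pad each value with spaces to the field's full width.
            record += value.ljust(width)
        yield record


# Made-up input; the second row has a quoted field containing a comma.
sample = io.StringIO('Smith,Jane,Q\r\n"Smith, Jr.",Pat,X\r\n')
records = list(csv_to_fixed(sample))
for record in records:
    print(repr(record))
```

      Most of the work here is done by the csv module's handling of quotes and embedded commas, which is the kind of free-form parsing that is painful to write by hand against fixed-width fields.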

      Oh, and you can obfuscate COBOL, too. In one program, I discovered that the programmer named the variables after different types of alcohol.

          COMPUTE MARTINI = GIN + VODKA.