Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • I haven't got to his criticism of NULLs in the book, yet, but I have my own ideas about why they suck. First of all, a NULL is supposed to mean "unknown." The problem comes when you need to represent "known, but empty." This is fine for character fields, where you can set an empty string (Larry calls it a "null string"), and it not null but has no real value, either. But how do you do that for, say, integers? For dates? You could use 0 for integer, but that's an actual value. Y

    • I just finished reading how to handle missing info without nulls [onetel.com]. It looks really long, but there's a goof up in the PDF and the slides have been duplicated. The actual talk is very short and then there's a step-by-step walkthrough of the D-like query they use to get the answer.

      One potentially unsatisfactory approch is to use an extra column to denote the type of null information of another column.

      id   name    salary  salary_reason
      1    Mary    20,000 

      • What's the difference between an "atomic operation" and a "transaction"?
        • by Ovid (2709) on 2005.12.06 14:43 (#45013) Homepage Journal

          A transaction may contain many atomic operations and becomes, itself, an atomic operation. However, the database itself may be in an inconsistent state during a transaction. The relational theory outlined by Date requires that no atomic operations allow the database to be in an inconsistent state. For example, let's say you have an employee relation (table) and salary values are moved into separate tables defining them as either "unknown" or "unemployed" (for previously null values) or "salary_amount" for known values.

          For the "unknown", "unemployed" and "salary_amount" tables, they have a foreign key constraint against the employee id. However, the employee id has a distributed foreign key against the "unknown", "unemployed" and "salary_amount" tables. The way this is traditionally handled is to start a transaction, add an employee and then add an entry to one of the "unknown", "unemployed" or "salary_amount" tables. Then you have to validate that you only have one of those tables with an entry for a given employee. The transaction this is wrapped in requires at least two inserts.

          If multiple inserts were allowed in a single statement, you could make this atomic with a single statement (note the comma):

          INSERT_ROW INTO employee (1234, 'Bob'),
          INSERT_ROW INTO unemployed (1234);

          There are a few problems with this approach. First, DBMS generally (not universally) don't allow this type of insert. Second, SQL is a rather limited language and does not allow for easily or efficiently manipulating data in 6NF (whee! I didn't know about 6NF until yesterday). Also, implementing distributed foreign key constraints generally requires writing rules or triggers to manage them as SQL does not provide this functionality. If you read the PDF I link to, it explains it in more detail.

          I also confess that I might have gotten some details wrong as I'm just learning about some new concepts. However, so many "problems" I've seen with relational databases are not problems with the databases but arbitrary constraints placed on us by SQL. It appears that my biggest objections (no lists in fields and lack of recursion) have nothing to do with relational data. Had DBMS designers worked to create a query language which truly supports relational data I suspect that databases would be much further advanced today. We wouldn't see these horrible XML or object databases as these concepts are natively supported by the relational module but not by popular DBMS.