Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

perrin (4270)

perrin
  (email not shown publicly)

Perrin is a contributor to various Perl-related projects like mod_perl, Template Toolkit, and Class::DBI. He is a frequent speaker at OSCON, YAPC, and ApacheCon, and a contributor to several perl-related books.

Journal of perrin (4270)

Monday August 20, 2007
03:03 PM

the limits of object-relational mappers

[ #34162 ]

At Plus Three, we built a large project using Class::DBI. When we started the project, Class::DBI was the ORM that best met our needs. I applied some patches from the mailing list and added a couple of CPAN modules and some custom code in order to get these features:

- LIMIT support for MySQL on all search queries.
- Ability to retrieve all records from one table with a sort ("retrieve_all_sorted").
- Ability to run any search query as a count instead of returning records.
- A safe version of find_and_create. The existing one was not atomic.

Surprisingly, after these enhancements, Class::DBI 0.96 proved up to the task for the entire project.

Since then Rose::DB::Object and DBIx::Class have matured, and some other interesting things have come along. These have much more complete querying abilities than Class::DBI. They generate multi-table joins with no trouble. They can fetch related objects in one query in order to avoid multiple trips to the db.

The obvious question is, how much different would the code be if it had been written with one of these instead. And the answer? It would be reduced, but not as much as you might imagine.

See, the great thing about Class::DBI is that it's very easy to add custom SQL to your classes. You just set up a SQL statement that returns the fields you want from the current class and give it a name and you have a custom search. You can even generate SQL programmatically at run-time and use it to select objects. It is limited by the fact that it can only result in a list of objects from one class, but in practice that is rarely an issue.

We used this feature extensively. We had a complex, normalized db, and many carefully tuned SQL queries. Looking at the SQL we wrote, I would guess that about half of it is relatively simple JOIN and LEFT JOIN queries that would be eliminated (or automated) by a more capable ORM. The rest though is beyond the capabilities of existing ORMs.

What's in it? Subqueries, both as derived tables (in the FROM clause) and as NOT EXISTS queries. Transaction control, with SELECT...FOR UPDATE. Database-specific extensions like INTERVAL and GROUP_CONCAT. Import/export commands like LOAD DATA INFILE. UPDATE statements that use joins. Temporary table creation. Not all applications require this level of sophistication with SQL, but I suspect all of the ones with large amounts of data and moderately complex schemas do.

In a few cases, an ORM could get the same results by writing the query in a simpler way. However, performance would suffer. You can get on your soapbox and cite some C.J. Date stuff you read about the relational model and how the phrasing of the query shouldn't matter, but in the real world it matters a great deal. And this is as true with Oracle as it with MySQL.

Could ORMs learn this? They could probably learn some of it. They could expand their coverage of SQL to include many of these operations. They could embed some common wisdom about how to optimize certain types of queries for one database or another, although this would still not always be correct.

In the end though, what's the point of an ORM that has all the complexity of SQL? It doesn't gain you anything unless it makes things simpler, which means it has to ignore a large amount of the capabilities of SQL. An ORM is mostly about making the easy work of simple fetches and saves as automated as possible, not about creating an impenetrable shield between your programmers and SQL.

What all this means to me in practical terms is that an easy way to use custom SQL is the most important feature to look for in any ORM. With a simplistic one like Class::DBI you have to go to custom SQL too soon, but even with a more powerful one you will eventually have to go there.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • I just don't find SQL that hard, and if you create a neatly packaged data access layer it doesn't get in the way of building your application too much, plus you get all the benefits of writing your own code, without the frustration of having to figure out how to do that stuff in an ORM.
    • Theoretically, a good ORM provides a neatly packaged data access layer. It can certainly get in your way though if you're trying to do things that are mostly reports and don't have any use for turning the results into objects.
    • I tend to put my SQL into a package and call it from that. Neat and tidy and no SQL intermixed with my code. I am sure there are better ways of doing it and I might like some of the ORM stuff but at this point in time I haven't missed it.
      • That's exactly what I do. I've built some pretty high performance apps that way (like in the order of 10m hits/day) and it's worked well for me.
  • When people talk about ORMs, I think they're generally looking at them the wrong way. The more I work with them, the more I think that trying to marry an class and a table is a terrible idea. I recently worked on a system where I could do something like this:

    $server->disable;

    That mapped to a table representing dedicated server, but that class had several other ORM classes it needed to interact with. That particular method hid tons of complexity behind it, including logging, deallocating IP addres

    • I disagree, kind of.

      I agree that a direct marrying of a class to a table can be a bad idea.

      I disagree that the abstraction cannot be a positive thing.

      My non-CPAN ORM ignores the class to table link and models the data at an entity-relationship level.

      From there, it derives BOTH the classes and the tables.

      Some of those classes don't have a table ("Transient Entities", imagine doing FormMail with all the structure and support of an ORM, but not ever needing to actually store the data) or don't write (I have "V
    • That sounds very much like how I think of DBIx::Class. Have you looked at it? I don’t think of DBIC as an ORM so much as I think of it as an OO API for SQL. Its basic unit is the result set – in other words, an OO wrapper around an SQL query. The main win for me is that I can accumulate conditions into queries (including composing result sets into one another) and let DBIC build the resulting SQL for me, instead of having to manually write template-ish SQL-generating logic for highly parametrisa

      • I've looked at DBIx::Class and it looks very nice, but I chose Rose::DB::Object. The project in question was an attempt to apply some discipline to a system where the code was rather typical of "legacy" systems. It was basically a bunch of CGIs and some of it already was a tad slow. One of our developers already had some performance issues with DBIx::Class on another project and I read some experiences others had with its performance. Since Rose is known for how incredibly fast it is (and I love how it

      • The main win for me is that I can accumulate conditions into queries (including composing result sets into one another) and let DBIC build the resulting SQL for me, instead of having to manually write template-ish SQL-generating logic for highly parametrisable queries.

        Any chance you could show an example? This is frequently cited as an interesting feature of DBIC, but there's not much in the docs about it. How is accumulating conditions and letting it build the SQL different from what the other SQL builders
        • It’s sort of a higher-order SQL.

          I can pass a ResultSet to someone else, and they can add constraints, joins, a group clause or such, as they like. This makes it much easier to decouple (and sometimes reuse) code. F.ex. I can build a ResultSet piecemeal along a chain of controllers in URI dispatch. I can even change some aspect of the query from within a template, eg. add a LIMIT clause to a passed-in ResultSet, without automatically having to tie the template to that particular query.

          Such a thing

          • Thanks, that's the best explanation I've heard of it.

            I've mostly used the data structure approach. RDBO includes a query builder which is much more powerful than SQL::Abstract and is able to represent joins and complex conditions reasonably well. I've done work with that where the conditions are built up similar to what you describe and then passed to the query builder to be turned into SQL.
  • Class::DBI does get on my nerves at times, but the thing I absolutely love about it is how easy it is to drop into SQL and back out again. Any SELECT query can return C::DBI based objects. And most of the other annoyances you took care of for us with your patches :)
  • What I really want an ORM for is to take care of the mind-numbing SQL. The stuff you write repeatedly to "get a row with this primary key so I can look at those four columns". That's going to be 80% of your SQL coding in a moderate-sized app.

    But I agree with you... the other 20% can be split into "stuff that can be represented with convoluted Perl" vs "stuff that is better represented as real SQL". The line between those will vary from app to app, but the important thing is that that last category be re

    --
    • Randal L. Schwartz
    • Stonehenge
  • My job is reporting. So while most Perl programmers seem to be hurrying as fast as they can to keep from having to write SQL, I'm going the other way - I don't want the necessary bits of Perl to obscure the big chunks of SQL.

    When it makes sense, I'm happy to recommend an ORM. Most transactional applications fall in that boat. But when I see convoluted APIs to try to build complex SQL in Perl, I always am left thinking, "Why should I learn an API that is as complicated as SQL to get less capability than r