Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

scrottie (4167)

scrottie
  scott@slowass.net
http://slowass.net/

My email address is scott@slowass.net. Spam me harder! *moan*

Journal of scrottie (4167)

Friday August 06, 2010
04:53 PM

My reply to Removing database abstraction

[ #40482 ]

My longwinded response to http://blogs.perl.org/users/carey_tilden/2010/08/removing-database-abstraction.h tml :

Don't get trapped in the mindset of "I have to use X because if I don't, I'm doing it wrong". Quite often, if you don't use X, it's entirely too easy to do it wrong if you don't know what you're doing. You probably don't want to re-implement CGI parameter parsing, for example. But that's not the same thing as saying that you should always use CGI because it's a solved problem so never do anything else. Nothing is a solved problem. mod_perl is a valid contender to CGI and now Plack is a valid contender to mod_perl. FastCGI was and is a valid contender to mod_perl and servers like nginx do awesome thing. Yet, tirelessly, the fans of one explain that the competing ideas are somehow not valid.

Sorry, I'm trying to do proof by analogy here. It isn't valid except to the degree that it is. I'll get to databases in a minute.

Quick recap: there are lots of valid approaches; using an alternative is not the same as re-inventing the wheel.

Furthermore, the heaviest technology is seldom the long term winner. Witness the return to lighter HTTP pipelines. For ages, Apache boasted being a bit faster than IIS, in response to which I could only wonder why Apache was so slow.

Okay, back to databases. DBIx::Class to a relational database is a valid option. It's also very heavy. It alo doesn't really let you scale your web app out unless the database in question is DB2, Oracle, or one of a few of those that runs on a cluster with a lot of processors rather than just one computer. Otherwise you've just added a new bottleneck. DBIx::Class makes it harder to do real relational work -- subqueries, having, or anything hairy. At the very least, you have to create a class file with a lot of boilerplate, reference that file from other files that you made or generated, and stuff the same SQL into there. Abstracting the querying away in simple cases makes it easier to query the database without thinking about it. This leads you to query the database without thinking about it. That's a double edged sword. In some cases, that's fantastic.

Lego blocks make it easy to build things but you seldom buy home appliances built out of Legos. Even more so for Duplo blocks. Some times easy tools are in order; some times, low level engineering with an RPN HP calculator is absolutely in order.

Okay, I'll get back to databases in a minute here, but I want to talk about something outrageous for a moment -- not using a relational database at all.

I wrote and use Acme::State for simple command line and daemonized Web apps. It works well with Continuity and should work with the various Coro based Plack servers for the reason that the application stays entirely in one process. All it does is restore state of variables on startup and save them on exit or when explicitly requested. It kind of goes overboard on restoring state and does a good enough job that it breaks lots of modules if not confined to your own namespace, hence the Acme:: designation.

Similarly, people have used Data::Dumper or Storable directly (Acme::State uses Storable under the hood) to serialize datastructures on startup and exit. In AnyEvent frameworks, it's easy to set a timer that, on expiration, saves a snapshot. Marc Lehmann, the man who created the excellent Coro module, has patches to Storable to make it reenterant and incremental, so that the process (which might also be servicing network requests for some protocol) doesn't get starved for CPU while a large dump is made. His Perl/Coro based multiplayer RPG is based on this idea. With hundreds of users issuing commands a few times a second, this is the only realistic option. If you tried to create this level of performance with a database, you'd find that you had to have the entire working set in RAM not once but several times over in copies in parallel database slaves. That's silly.

You can be very high tech and not use a database. If you're not actually using the relational capabilities (normalized tables, joining tables, filtering and aggregating, etc), then a relational database is a slow and verbose to use (even with DBIx::Class) replacement for dbmopen() (perldoc -f dbmopen, but use the module instead). You're not gaining performance, elegance or scalability, in most cases. People use databases automatically and mindlessly now days to the point where they feel they have to, and by virtue of having to use a database, they have to ease the pain with an ORM.

Anytime someone says "always", be skeptical. You're probably talking to someone who doesn't understand or doesn't care about the tradeoffs involved.

Okay, back to databases. Right now, it's trendy to create domain specific languages. The Ruby guys are doing it and the Perl guys have been doing it for ages. Forth was created around the idea -- the Forth parser is written in Forth and is extensible. Perl 5.12 lets you hijack syntax parsing with Perl in a very Forth-ish style. Devel::Declare was used to create proper argument lists for functions inside of Perl. There's a MooseX for it and a standalone one, Method::Signatures. That idea got moved into core. XS::APItest::KeywordRPN is a demo. Besides that, regular expressions and POD are two other entire syntaxes that exist in .pl and .pm files. It's hypocritical to say that it's somehow "bad" to mix languages. It's true that you don't want your front end graphic designer editing your .pl files if he/she doesn't know Perl. If your designer does know Perl, and the code is small and doesn't need to be factored apart yet, what's the harm? It's possible to write extremely expressive code mixing SQL and Perl. Lots of people have written a lot of little wrappers. Here's one I sometimes use:

http://slowass.net/~scott/tmp/query.pm.txt

Finally, part of Ruby's appeal -- or any new language's appeal -- is lightness. It's easy to keep adding cruft, indirection, and abstraction and not realize that you're slowly boiling yourself to death in it until you go to a new language and get away from it for a while. The Ruby guys, like the Python guys before them, have done a good job of building up simple but versatile APIs that combine well with each other and keep the language in charge rather than any monolithic "framework". Well, except for Rails, but maybe that was a counter example that motivated better behavior. Look at Scrapi (and Web::Scraper in Perl) for an example.

Too much abstraction not only makes your code slow but it makes it hard to change development direction in the future when something cooler, faster, lighter and more flexible comes out. Just as the whole Perl universe spent ten years mired down in and entrenched in mod_perl, so is there a danger that DBIx::Class and Moose will outlive their welcome. POE, which was once a golden child, has already outlived its welcome as Coro and AnyEvent stuff has taken off. Now there are a lot of Perl programs broken up into tiny non-blocking chunks that are five times as long as they could be, and the effort to move them away from POE is just too great. The utility of a package should be weighed against the commitment you have to make to it. The commitment you have to make to it is simply how much you have to alter your programming style. With Moose as with POE, this degree is huge. DBIx::Class is more reasonable. Still, it's a cost, and things have costs.

Thank you for your article.

Regards,
-scott

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • "Inside every large program, there is a small program trying to get out." --Tony Hoare

    Abstractions incur overhead, yep. Fortunately, POE's higher-level abstractions (wheels, components) are entirely optional. The low-level abstract event loop primitives are still there: select_read(), alarm(), sig(), etc. There's a smaller POE inside. It's already out, but it's been a bit neglected.

    History has a bigger role to play than abstractions in the obstruction of change. A long-established, stable module like P

    • The context I was talking about POE in was this: a program written to use it versus one written to do something else looks very different. This means that it's hard to switch styles. A lot of changes have to happen to a lot of the program.

      Everything goes out of fashion eventually.

      That was part of the point of evaluating how entrenched any abstraction will make you. Being entrenched by an abstraction is a mark against it.

      Too much and leaky abstraction is another matter and I whipped on other things there

  • I created ORLite to address precisely the problem you are talking about.

    ORLite tries to use as little abstraction as possible, and in particular there is no abstraction of where clauses.

    my @objects = ORLite::TableName->select( 'END_SQL', 'Bob' );
    where name in (
            select name from people where husband in (
                    select name from people where name like ?
            )
    )
    END_SQL

  • Thank you for the insights, and thank you for turning me on to Plack.

    --
    J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers
  • Someone once wrote "I wrote you a long letter because I did not have time to write a short one" and it sounds like a paradox - but communicating clearly usually means communicating shortly and it takes time to work out what really is your core message. This is the same case with programming libraries - it takes time to realize what are the highly cohesive components and what parts can be decoupled into independent modules. It is a bit unfortunate how the big libraries have advantage in gaining popularity