Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

Ovid (2709)

Ovid
  (email not shown publicly)
http://publius-ovidius.livejournal.com/
AOL IM: ovidperl (Add Buddy, Send Message)

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Wednesday May 14, 2008
05:58 AM

Time to Speed Up The Tests Again

[ #36414 ]

After a lot of aggressive work improving our test suite performance, adding 50% more tests has tremendously slowed us down.

All tests successful.
Files=48, Tests=14025, 1386 wallclock secs ( 3.06 usr  0.46 sys + 1035.71 cusr 42.77 csys = 1082.00 CPU)
Result: PASS
Number of test programs: 48
Total runtime approximately 23 minutes 5 seconds

Ten slowest tests:
+---------+-----------------------------------+
| Time    | Test                              |
+---------+-----------------------------------+
| 13m 25s | t/acceptance.t                    |
| 5m 36s  | t/aggregate.t                     |
| 0m 36s  | t/standards/use.t                 |
| 0m 24s  | t/system/both/import/log/pager.t  |
| 0m 20s  | t/system/both/import/log/search.t |
| 0m 11s  | t/system/both/import/log/log.t    |
| 0m 11s  | t/unit/db/migrations.t            |
| 0m 11s  | t/unit/fip/record-changes.t       |
| 0m 10s  | t/unit/piptest/pprove/testdb.t    |
| 0m 9s   | t/test_class_tests.t              |
+---------+-----------------------------------+

(Note that the acceptance and aggregate tests both emcompass the vast majority of those 14000 tests and simple aggregation is no longer a win.)

Since we're running at roughly 10 tests a second, I started paying attention to where we can gain wins and it looks like XML::XPath is a likely candidate for performance improvements. In fact, the code has a number of areas where it could be improved substantially. It's not been updated in years and I've contacted Matt Sergeant about having a colleague take it over. This could also be an excellent opportunity to pay attention to its RT queue.

$ dprofpp
Total Elapsed Time = 35.55858 Seconds
  User+System Time = 13.44858 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
3.32   0.446  0.655  39692   0.0000 0.0000  <anon>:/home/poec01/trunk/deps/lib
                                             /perl5//XML/XPath/XMLParser.pm:63
2.94   0.395  1.373   5544   0.0001 0.0002  XML::XPath::XMLParser::parse_start
2.35   0.316  0.385  20180   0.0000 0.0000  <anon>:/home/poec01/trunk/deps/lib
                                             /perl5//XML/XPath/Node.pm:236
2.04   0.275  3.025  30768   0.0000 0.0001  <anon>:/opt/csw/lib/perl/site_perl
                                             /XML/Parser.pm:187
1.84   0.248  1.229  19680   0.0000 0.0001  XML::XPath::XMLParser::parse_char
1.84   0.247  0.247  48278   0.0000 0.0000  Class::Accessor::Grouped::get_simp
                                             le
1.83   0.246  0.598  20196   0.0000 0.0000  <anon>:/home/poec01/trunk/deps_pat
                                             ched/lib/Test/XML/XPath.pm:67
1.80   0.242  0.337  19252   0.0000 0.0000  Class::Accessor::Grouped::get_inhe
                                             rited
1.50   0.202  0.202  11760   0.0000 0.0000  XML::XPath::XMLParser::_namespace
1.42   0.191  1.113  11136   0.0000 0.0001  XML::XPath::Node::ElementImpl::DES
                                             TROY
1.32   0.178  0.178  11200   0.0000 0.0000  XML::XPath::Node::ElementImpl::get
                                             Attributes
1.28   0.172  0.418   1361   0.0001 0.0003  DBIx::Class::ResultSet::new
1.23   0.166  0.869    549   0.0003 0.0016  base::import
1.22   0.164  0.362  13872   0.0000 0.0000  XML::XPath::Node::ElementImpl::app
                                             endChild
1.20   0.162  3.187     24   0.0068 0.1328  XML::Parser::Expat::ParseString

Just that list above shows that almost 16% of our time is spent in XML::XPath.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • Is there any reason why you could not switch from XML::XPath to XML::LibXML? The code should be portable, except for a few constants that need to be renamed, I think.

    In any case, if you have spotted improvements that can be made to the XPath engine of XML::XPath (as opposed to its DOM implementation) I would be interested to hear about it for XML::XPathEngine [cpan.org], which is a fork of that engine.

    Thanks

    --
    mirod
    • A quick switch of using XML::LibXML for Test::XML::XPath doubled the test time for one of our tests. Of course, that was a single test run and there could have been other issues involved.

      The major issues I saw for XML::XPath were getting rid of a lot of "shift" statements in favor of list assignments directly from @_. There is at least one wrapper function which should be an alias and a rather curious looking AUTOLOAD. Also, some of the C-style for loops can be worked on.

      Other than that, I haven't lo

      • [Sorry about the tardiness of this reply.]

        If XML::LibXML takes a lot longer than XML::XPath, there’s probably something bad going on.

        In this case, I’d guess that the “something bad” is that XML::LibXML (or rather, the underlying libxml2 library) is loading external DTDs over the network. I can parse an 18-megabyte XML instance from a file in a little under a second on my laptop, once I turn off DTD loading.

        Turning off the ”run very slowly” flag can be done like thi

        • You are correct that something was wrong. Turns out it was a completely unrelated error. Regrettably, we've discovered that we can't just switch over to XML::LibXML because we have two different namespaces and XML::XPath was letting us ignore this issue (this was in place long before I started). We have about 237 YAML documents driving our acceptance tests and they have multiple xpaths embedded which would presumably have to be changed to deal with this :/

          • In the interest of correctness it should be pointed out that XML::XPath is in violation of the XPath spec here, and that matching namespaced elements in XPath always requires binding a prefix to the namespace.

            • We are aware of this. We're just in a bad position with some legacy code and since it hadn't been addressed previously, we need to figure out the easiest way to move forward.