Today I got an interesting report from a Web::Scraper user: his script runs really quickly (less than 1 second) on a MacBook but very slowly (about 50 seconds) on a dual-CPU AMD machine. Here's the dprof report:
Total Elapsed Time = 47.32165 Seconds
User+System Time = 31.07165 Seconds
%Time ExclSec CumulS #Calls sec/call Csec/c Name
51.6 16.03 16.033 6922 0.0023 0.0023 XML::XPathEngine::NodeSet::new
13.5 4.208 4.208 1777 0.0024 0.0024 XML::XPathEngine::Boolean::True
13.0 4.048 4.048 1723 0.0023 0.0023 XML::XPathEngine::Literal::new
11.3 3.518 3.518 1666 0.0021 0.0021 XML::XPathEngine::Boolean::False
We initially thought it was due to some XS module or library issue on dual-CPU machines, but it turned out he was using the perl that comes with Fedora, and the rpm version he had was 5.8.8-10.
As addressed in the RH/Fedora bugzilla, perl 5.8.8 rpms prior to 5.8.8-22 carry a nasty patch that makes every new() (or rather bless) call in classes with overloaded methods really slow. HTML::TreeBuilder::XPath (and hence Web::Scraper) creates a lot of node objects for HTML pages, and XML::XPathEngine::NodeSet definitely has an overloaded function.
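To see whether a given perl shows this behaviour, a rough benchmark along these lines might help. It is only a sketch: the Plain and Overloaded classes are made-up stand-ins (not the real XML::XPathEngine classes), and a micro-benchmark like this may not reproduce the full slowdown seen in a real scraper run. On a healthy perl the two rates should be in the same ballpark.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical class without any operator overloading.
package Plain;
sub new { my $class = shift; return bless [], $class }

# Hypothetical class with an overloaded operator, standing in for
# something like XML::XPathEngine::NodeSet.
package Overloaded;
use overload '""' => sub { 'node-set' }, fallback => 1;
sub new { my $class = shift; return bless [], $class }

package main;

# Run each constructor for at least 1 CPU second and print a
# comparison table of calls per second.
cmpthese(-1, {
    plain      => sub { Plain->new },
    overloaded => sub { Overloaded->new },
});
```

If the "overloaded" row is dramatically slower than the "plain" row, the perl binary itself is a likely suspect rather than the scraper code.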
So this is really due to the patch in Fedora's perl. If you run into the same issue on Fedora, check your rpm version and upgrade to the latest, or build your own perl, which is always a good thing.
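Checking the rpm version can be scripted. The following is a minimal sketch, assuming the rpm command is on the PATH and the package is simply named perl; is_affected_release is a hypothetical helper written for this post, and the only threshold it knows is the 5.8.8-22 release mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a "version-release" string such as
# "5.8.8-10" names a perl 5.8.8 package older than release 22.
sub is_affected_release {
    my ($version_release) = @_;
    my ($version, $release) = $version_release =~ /^(\d+\.\d+\.\d+)-(\d+)/
        or return 0;    # unrecognized format: make no claim
    return ($version eq '5.8.8' && $release < 22) ? 1 : 0;
}

# Ask rpm for the installed perl package. The result is empty or
# non-numeric if rpm is missing or the package is not installed.
my $installed = `rpm -q --qf '%{VERSION}-%{RELEASE}' perl 2>/dev/null` || '';

if ($installed =~ /^\d/) {
    print "perl rpm $installed: ",
        (is_affected_release($installed) ? "older than 5.8.8-22, consider upgrading"
                                         : "not in the affected range"),
        "\n";
}
else {
    print "could not determine the perl rpm version\n";
}
```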