The problem is that they're measuring the wrong thing. Is your program taking too much CPU, or is it taking too much time? If you're running on a computer built in the last decade, it's probably time that's the issue. So why do we profile CPU cycles?
Nearly every program outside of the scientific community (or other math-heavy situations) is I/O bound. You don't see the I/O when you profile CPU time. It looks like it's not even there.
I've seen this happen so many times on the Class::DBI, Template Toolkit, and mod_perl lists. People show up all desperate to eliminate every method call or rewrite their templating module in C based on a profile they did. Then you tell them to do one by wall time and they find out that 99.9% of the run time is actually waiting for database calls to come back or something similar.