Would you move software into production if it failed to perform to spec five percent of the time? Of course you would. In fact, you probably have. This sort of thing happens all the time and we call it "a bug" (or "a feature" if we want to alienate our user base). So let me rephrase the question: would you knowingly move such software into production? Probably not. Admittedly, there's a certain ill-defined level of complexity beyond which bugs are inescapable, but a five percent failure rate is ridiculously high, particularly if the feature that fails is critical to the overall correct operation of the software.
Here's a hint to all developers doing TDD (test driven development): tests are software, too. While we admittedly do some strange things in tests (I override function definitions left and right), for the most part, good software practices apply to tests because tests are software. It doesn't matter if you're not shipping these tests to your customers (though I think you probably should). What matters is that software is software and if you write perfect code for your customers and lousy tests, you still have a substandard product.
Right now, I'm dealing with intermittent failures in tests. As it turns out, the developers who wrote these tests knew that they would likely fail at some point, but rather than make sure the tests always worked, they accepted that they would usually work. Maybe that's fine for Windows users who accept the occasional BSOD as the price of creating PowerPoint presentations, but as a developer, I am the customer for the tests they're writing, and I can't pay that price: I shouldn't be expected to keep track of which tests might fail and which might succeed when I'm dealing with thousands of tests. It's like having an error log full of "uninitialized" warnings. After a while, you learn to ignore the error log.
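A common way tests end up "usually working" is asserting on a nondeterministic source, such as unseeded randomness. This is a hedged Python sketch (the `pick_server` function is invented for illustration) showing the flaky pattern and the deterministic fix of injecting a seeded RNG:

```python
import random

def pick_server(servers, rng=random):
    """Hypothetical load-balancer stub: picks a server at random."""
    return rng.choice(servers)

# Flaky: this assertion passes on most runs and fails on the rest,
# even though pick_server is behaving exactly as designed.
#     assert pick_server(["a", "b"]) == "a"

# Deterministic: inject a seeded RNG so the same inputs always
# produce the same answer, run after run.
first = pick_server(["a", "b"], rng=random.Random(42))
again = pick_server(["a", "b"], rng=random.Random(42))
assert first == again
```

The same idea applies to clocks, network calls, and hash ordering: make the nondeterministic dependency injectable, and the test stops being a coin flip.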
Fragile code is also bad, regardless of whether or not it's in a test. We have tests that compare stack traces in error logs, and they assume the trace will be an exact match. If someone fixes a module in a completely different section of code, the test suite breaks because they may have altered a stack trace in an apparently unrelated set of tests. Right now, I'm testing the stack trace functionality directly, and then the stack trace tests will use regular expressions rather than expect an exact match. Once that's done, we'll have much more robust tests.
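The exact-match versus regular-expression distinction can be sketched as follows; the captured trace line and module name here are invented for illustration, not taken from the actual suite:

```python
import re

# Hypothetical captured error-log line. The line number shifts
# whenever anyone edits the file, even in unrelated functions.
trace = "Died at lib/Billing/Invoice.pm line 214."

# Fragile: an exact match breaks the moment line 214 becomes 217.
#     assert trace == "Died at lib/Billing/Invoice.pm line 214."

# Robust: pin down what the test actually cares about (which module
# died) and let the incidental detail (the line number) float.
assert re.search(r"Died at lib/Billing/Invoice\.pm line \d+\.", trace)
```

The pattern still fails if the error moves to a different module, which is the behavior the test is meant to catch; it just stops failing for reasons nobody cares about.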
We also have several helper functions that were cut and pasted into many different test programs. These should have been refactored but weren't, so if there's a problem, I have to grep through the codebase and fix every copy. I don't know if anyone at this company has ever done a code fix as large-scale as the one I'm doing, but a test suite where the rules of good software development apply would have made this job much easier.
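The refactoring being described is just the ordinary one: give the duplicated helper a single canonical home that every test program imports. A minimal Python sketch, with a wholly hypothetical helper name and module layout, assuming a helper that scrubs timestamps from log lines before comparison:

```python
import re

# Hypothetical shared test-utility module (e.g. t/lib/testutil.py),
# replacing the copies that were pasted into each test program.
def normalize_log_line(line):
    """Strip a leading [YYYY-MM-DD ...] timestamp and trailing
    whitespace so tests can compare log output without caring
    when it was written."""
    return re.sub(r"^\[\d{4}-\d{2}-\d{2}[^\]]*\]\s*", "", line).rstrip()

# Each test program now does:
#     from testutil import normalize_log_line
# so a bug fix lands in one place instead of requiring a grep.
assert normalize_log_line("[2024-01-02 03:04:05] disk full  ") == "disk full"
```

With one canonical copy, fixing the helper is a one-file change and every test picks it up, which is exactly the property the duplicated versions lack.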