There are a number of different aspects of scalability. It always starts with performance, which is what we will cover in this article. But it also covers issues such as code maintainability, fault tolerance, and the availability of programming staff.
Is this what scalability means? It's certainly not my definition. Do code maintainability, fault tolerance, and the availability of programming staff have something to do with scalability? They can if your definition of scalability takes human resources into account, which seems reasonable.
The definition I find on Dictionary.com describes scalability as:
How well a solution to some problem will work when the size of the problem increases.
This seems like a better definition. A textbook definition would be something to the effect of, "the ability to scale." This is probably a starting point that everyone can agree on. So why do some people argue that certain technologies (PHP, mod_perl, Java, etc.) don't scale? I have always assumed that these people define scalability as the ability of something to scale well and that they rely on their own subjective opinions to decide what scales well and what doesn't. This is where things go wrong. More and more people also seem to use scalability as a measure of performance, which it is not. Something that performs very poorly can still scale very well. Scalability is a relative measurement.
Before I say more, I should describe what I think it means for something to scale well. Consider the following three figures:
The first figure represents a case where the amount of required resources grows exponentially compared to the number of users. This is bad. In the second figure, the amount of required resources grows linearly. This is typical (the rate of growth can vary). In the third figure, the amount of required resources grows logarithmically. This is very nice. In my opinion, both the second and third figures represent something that scales well. Because I am a Web developer, the "size of the problem" typically increases for me when the number of users grows. The term "resources" refers to many things, but most people are concerned with cost. Things that cost money include hardware, software, human resources, and time.
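To make the three shapes concrete, here is a small Python sketch. The three cost functions and their constants are my own invention for illustration, not measurements of any real system; the only thing that matters is how the required resources change each time the number of users doubles.

```python
import math

# Hypothetical cost models: resources required for a given number of users.
# The constants are arbitrary; only the shape of each curve matters.
def exponential(users):
    return 2 ** (users / 1000)

def linear(users):
    return users / 100

def logarithmic(users):
    return 10 * math.log2(users)

def growth_factor(model, users):
    """How many times more resources are needed when the user count doubles."""
    return model(2 * users) / model(users)

for model in (exponential, linear, logarithmic):
    factors = [round(growth_factor(model, n), 2) for n in (1000, 2000, 4000)]
    print(f"{model.__name__:12} {factors}")
```

With the exponential model, each doubling of users costs more than the last. With the linear model, doubling the users always doubles the resources. With the logarithmic model, each doubling costs proportionally less than the one before.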
Lastly, let's look at an example. Consider two hypothetical technologies, Technology A and Technology B:
Resources required to build an application that supports 100,000 users a day:
Technology A: 10 servers, 5 developers, and 6 months of development time
Technology B: 40 servers, 10 developers, and 3 months of development time
Resources required to build an application that supports 250,000 users a day:
Technology A: 25 servers, 5 developers, and 9 months of development time
Technology B: 50 servers, 10 developers, and 6 months of development time
Which technology do you think scales better? Which appears to be the better choice when no more than 250,000 users a day need to be supported? Should things like maintainability and robustness be taken into consideration? How do you measure these things? If you are making decisions based on your assumptions about the scalability of certain technologies without asking these types of questions, you need to stop making such decisions.
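One way to start answering these questions is to work through the numbers. The Python sketch below uses the figures from the example; the `Requirement` class, the `growth` helper, and the "developer-months" measure (developers multiplied by months) are my own assumptions for illustration, and other measures of cost are just as defensible.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Requirement:
    users_per_day: int
    servers: int
    developers: int
    months: int

    @property
    def developer_months(self):
        # Assumed effort measure: head count times calendar time.
        return self.developers * self.months

# Figures copied from the example above (small deployment, large deployment).
TECH_A = (Requirement(100_000, 10, 5, 6), Requirement(250_000, 25, 5, 9))
TECH_B = (Requirement(100_000, 40, 10, 3), Requirement(250_000, 50, 10, 6))

def growth(small, large):
    """Ratio of each resource at the larger size to the smaller size."""
    return {
        "users": large.users_per_day / small.users_per_day,
        "servers": large.servers / small.servers,
        "developer_months": large.developer_months / small.developer_months,
    }

for name, (small, large) in (("A", TECH_A), ("B", TECH_B)):
    print(name, growth(small, large))
```

Users grow by a factor of 2.5 in both cases. Technology A's servers also grow by 2.5 while Technology B's grow by only 1.25, yet Technology B needs more servers at both sizes. Measured in developer-months, Technology A grows by 1.5 and Technology B by 2. Which technology "scales better" depends on which resource you measure and what it costs you.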
My article about XSS and CSRF was published today (technically yesterday, since it is after midnight) in php|a. When reading through it, I couldn't help my perfectionist tendencies, and I found myself noticing a few minor errors. None of these exist in the original manuscript, but the complexities of the editorial process can sometimes introduce a few problems. I have found this to be true with both book publishers and magazine publishers. Just as with writing code, any step that involves a change can introduce bugs.
The reason I decided to write about this is that php|a offers some nice forums, and each article they publish is given its own forum. This provides a convenient place for follow-up questions and discussions about the article. It also provides a home for article errata.
I have found many articles on the Web with serious errors, and given how easily misinformation can mislead people, it would be nice if there were an easy way for people to find article errata (in cases where the article itself cannot be corrected). I have tried to contact the original author in a few cases, but it seems that almost every email address I find for this purpose is outdated.
Would a single source for such article errata be the best solution, or should each publisher/Web site provide its own? I'm not sure, but I may give it some more thought.
NYPHP has announced RAMP training courses. Lasting only three hours each, these courses are intended for people who are already experienced but are looking for advanced instruction on very specific topics. The hope is that you can take a class in the morning and be applying what you have learned the same afternoon.
I will be teaching HTTP and State Management, which I hope will help people use PHP sessions more effectively. Once I cover some fundamental topics, I plan to focus on debugging techniques and methods of improving the security of your sessions.
These courses are scheduled for November 10 and 11 and are being taught in some nice training facilities in New York City.
I wrote a quick PHP script that produces an RSS feed of my blog. You can find it at http://shiflett.org/rss. I'm not positive that it is valid RSS, but it satisfies a few RSS validators that I was able to find online.
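For anyone curious what such a script involves, here is a rough sketch in Python. This is not the PHP script mentioned above; the `build_rss` function, the entry fields, and the example.org URLs are all hypothetical. It emits only the elements RSS 2.0 requires for a channel (title, link, description) plus a title, link, and pubDate for each item.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def build_rss(title, link, description, entries):
    """Return an RSS 2.0 document as a string (hypothetical helper)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        # RSS dates use the RFC 822 style, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"])
    return ET.tostring(rss, encoding="unicode")

# Made-up sample entry, purely for illustration.
posts = [
    {
        "title": "Hello",
        "link": "http://example.org/blog/1",
        "published": datetime(2003, 10, 7, tzinfo=timezone.utc),
    }
]
feed = build_rss("Example Blog", "http://example.org/", "A hypothetical blog", posts)
print(feed)
```

Building the document with an XML library rather than string concatenation means titles containing characters such as `&` or `<` are escaped automatically, which is one of the more common reasons a feed fails validation.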
Slashdot featured a review of the HTTP Developer's Handbook today. The review itself was short on details and not very flattering, but an overwhelming majority of the comments posted were very complimentary, so that was nice. Here's hoping that the exposure proves to be helpful.
You may have already read on David's blog that he, Adam, and I are speaking at the NYSIA Open Source SIG on Tue, 07 Oct 2003. That's the New York Software Industry Association Open Source Special Interest Group. Acronyms aren't so bad, are they?
I also spent some time converting another chapter of my book from PDF to HTML, and Chapter 11 is almost complete (it lacks only the figures). This chapter explains cookies, and it might at least give people a URL to point to when they answer the thousands of cookie questions that are posted on various mailing lists every day.
Thanks as always to Sams Publishing for allowing me to provide a few chapters as free samples.
I went to the US Open yesterday to watch the Men's Semis. Andre Agassi lost his first two sets and was unable to come back. Andy Roddick had the same start, so it looked like it was going to be a bad day for US fans. Luckily, Roddick made an amazing comeback (made possible by his incredible serve). Overall, it was a fun day.
Andy went on to beat Juan Carlos Ferrero in straight sets for the Men's Championship. His serve reached 141 mph, the tournament high. Amazing.
LV946 Scheduled PHP Attacks and Defense
LV947 Declined Securing PHP Sessions
That sessions talk keeps getting declined. Maybe I can talk Nat into including it in OSCON next year.
On a side note, I'm wondering what the last accepted entry number is. I bet 946 is in the running, as I submitted the talk within minutes of the deadline.
Feeling like the last to do so, I now have a blog on my Web site.