Monday, December 3, 2007

Web Applications and Memory Abuse

As a performance engineer, part of my job is to make our software run really, really fast. Another part is to figure out how to make sure it runs as fast with 1,000 users as it does with 10. That's the harder part. You can usually squeeze enough gains out of each transaction to make the single-user case perform fine. Jeff Atwood describes it very well in this blog post. Says Atwood:
"Everything is fast for small n. Don't fall into this trap. It's an easy enough mistake to make. Modern apps are incredibly complex, with dozens of dependencies...The only way to truly know if you've accidentally slipped an algorithmic big O bottleneck into your app somewhere is to test it with a reasonably large volume of data."
As an example:

A big reason for poor scalability is the abuse of memory. Everything is faster when loaded from memory, right? That's always good, right? Matter of fact, why not put everything the user sees in their session? That way all they have to do is go straight to their own little section of memory, so that everything they do is fast fast fast. Hmm. Good when the n is small. Bad when the n is big. Let's say, for kicks, that every user session grows to 2 MB. That's not unrealistic considering some of the data-driven applications you use on a day-to-day basis. Not so bad, right? After all, memory is cheap! Now, let's say your application is running on 32-bit Windows. For reasons explained here, you can only fire up a JVM with about a 1.5 GB heap. So let's say you have 500 active users on your system. Do the math:
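That back-of-the-envelope math can be sketched in a few lines. The figures below (heap size, bootstrap overhead, user count, session size) are the hypothetical numbers from the scenario above, not measurements:

```java
public class HeapBudget {
    public static void main(String[] args) {
        // Hypothetical figures from the scenario above: a 32-bit Windows JVM
        // capped at roughly a 1.5 GB heap, 500 active users, 2 MB per session.
        int totalHeapMb = 1500;   // max usable heap
        int bootstrapMb = 250;    // server/framework overhead at startup
        int activeUsers = 500;
        int sessionMb   = 2;      // per-user session footprint

        int sessionTotalMb = activeUsers * sessionMb;               // 1000 MB
        int leftForWorkMb  = totalHeapMb - bootstrapMb - sessionTotalMb;

        // prints: Heap left for request processing: 250 MB
        System.out.println("Heap left for request processing: " + leftForWorkMb + " MB");
    }
}
```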

1500 MB total heap - 250 MB server bootstrap - (500 users * 2 MB session) = 250 MB left for processing. Now things aren't looking so peachy, eh?

However! We can just cluster, right? Multiple JVMs running, to take advantage of all that memory! Ha! In the words of Lee Corso, "Not so fast, my friend!" The default method of session propagation is simply to multicast the session changes across cluster nodes. Now you have 1 GB of user-specific crap clogging up each of your JVMs, leaving little heap for actual system processing needs.

Now factor in that most users don't proactively log out of systems, instead letting their sessions lapse to timeout, and that many (non-financial) systems have rather long timeout intervals. You could have ALL THAT MEMORY just sitting there for users who may not even be in the same room as the computer associated with that session. Bad. Smokey says that "Only you can prevent forest fires." Well, I am saying "Only you can prevent OutOfMemoryErrors!" You could also put it like this: "Friends don't let friends bloat their sessions."
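One way to prevent the bloat is to keep only keys in the session and re-fetch the heavyweight data on demand. Here's a minimal sketch of that idea; the plain `Map`s stand in for an `HttpSession` and a database, and all the names (`ordersById`, `currentOrderId`) are illustrative, not a real API:

```java
import java.util.HashMap;
import java.util.Map;

public class LeanSession {
    // Pretend database: order id -> big result object
    static Map<Long, String> ordersById = new HashMap<>();

    public static void main(String[] args) {
        ordersById.put(42L, "big fat order object, rows, line items...");

        // Stands in for HttpSession
        Map<String, Object> session = new HashMap<>();

        // Bloated approach: cache the whole object per user.
        // At 2 MB a pop, 500 users clog a gigabyte of heap.
        // session.put("currentOrder", ordersById.get(42L));

        // Lean approach: keep only the key, re-fetch when needed.
        session.put("currentOrderId", 42L);

        Long id = (Long) session.get("currentOrderId");
        String order = ordersById.get(id); // costs a database hit, saves the heap
        System.out.println(order);
    }
}
```

The trade is explicit: the lean version pays a repeated database lookup per use, but the heap cost per user drops from megabytes to a handful of bytes.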

However you say it, you need to understand the trade-off between saving processing time on the database and ruining your application server's ability to cope with an increasing number of sessions and requests.
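And if long timeout intervals are part of the problem, the servlet spec gives you a dial for it. This is the standard `session-config` element from `web.xml`; the 15-minute value is just an example, tune it to your application:

```xml
<!-- web.xml: expire idle sessions after 15 minutes instead of the
     server default, so abandoned sessions release their heap sooner -->
<session-config>
    <session-timeout>15</session-timeout>
</session-config>
```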