The global Internet is truly a fantastic tool – an ever-changing amalgam of people and ideas. Unfortunately, not all of the information out there in cyberspace is worthy of the screen space it takes up. I recently came across the personal blog of Marcelo Calbucci, the CEO of Sampa Corp, a startup company focusing on providing web/blog tools for beginners. A recent article, entitled “Web Developers: Speed up your pages!”, piqued my interest. I am always on the lookout for ways to introduce efficiencies, and I figured if I could pick up a tip or two to add to my arsenal, then it was worth my time to read the piece. Boy, was I wrong.
I’m sure all of you are capable of reading the article for yourselves, as well as the various comments that have been posted by his visitors at the end of it, so I am not going to duplicate all of that here. In addition to offering very little in the way of real efficiencies, some of the information he provides is in violation of a host of W3C standards, and will invariably create a wealth of maintenance nightmares, sure to keep your development team up until the wee hours should even the simplest problem arise. If web or application performance issues could be solved simply by stripping whitespace and avoiding XHTML, then I would probably have a good 3-4 years of my life back.
There are a few things in his article that have merit, such as image and HTTP compression. However, there are valid instances where you would want to do some of the other things he tells you to avoid.
But I am not here to complain. One of my top management rules states that “thou shalt not come before me with a problem, lest ye have at least one idea towards a possible solution”. I come bearing gifts, not simple complaints. Without further ado, here is my list of things web developers can do to realistically “speed up” their pages. These are from experience, and will be presented in a fairly technology-agnostic manner:
Perform a hardware review
We’ll start with the obvious. Make sure that your web servers, database servers, and application servers are all tuned and not stressed or bottlenecked. Running your database server and your web server on the same system? Split ’em up and put them on two separate physical hosts. Application or web servers overloaded? Add more and distribute the load. Not enough RAM in your database server? Running 2 GB worth of processes on a system that only has 1 GB of RAM? Upgrade time. Still running old 5K RPM IDE-2 drives on your application server? Still think that single 1 GHz CPU is enough to get the job done? It may be time to open the wallet.
Perform an OS configuration/environmental review
Are you running the latest versions of your operating system and core tools? New versions often bring better efficiencies (though not always). Are your processes in tune and in synch? Do you have your swap space configured properly? Do you have unnecessary processes running, or processes that are hogging CPU or system bus cycles? Do you have backup jobs running during the middle of peak traffic? I once had an employee who was running a file sharing program on one of our servers – suffice it to say that his appetite for the latest music and MP3s was quickly eclipsed by the demand for better response time by our customers.
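As a rough illustration of the "processes hogging CPU" question above, here is a minimal sketch that compares the system load average to the number of CPUs. The function names and the threshold of 1.0 per CPU are my own assumptions for the example, not a universal rule, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def looks_overloaded(load_avg, cpu_count, threshold=1.0):
    """Return True when the per-CPU load exceeds the (assumed) threshold."""
    return load_per_cpu(load_avg, cpu_count) > threshold


if __name__ == "__main__":
    # os.getloadavg() does not exist on every platform, so check first.
    if hasattr(os, "getloadavg"):
        one_min, five_min, fifteen_min = os.getloadavg()
        cpus = os.cpu_count() or 1
        print("1-minute load per CPU:", round(load_per_cpu(one_min, cpus), 2))
        print("Looks overloaded:", looks_overloaded(one_min, cpus))
```

Keep in mind that a low load average does not prove the box is healthy; it only rules out one class of problem.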
Perform a core application review
Are your core applications up-to-date and tuned? Check the installation and configuration of your web server software (Apache, IIS, etc.), your database server software (MSSQL, MySQL, Postgres, etc.), and other ancillary applications (ImageMagick, PHP, Perl, etc.). For example, if you are running Apache, explore MaxClients and other such directives. If you are running MySQL, double check the settings in your /etc/my.cnf file, in particular things such as query_cache_size and other such directives. A finely tuned MySQL server will sing beautiful music for your customers. An ill-configured or ill-tuned MySQL server will bring a quad-CPU mega-beast to its knees after about 5 seconds of live uptime.
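One common rule of thumb for a starting MaxClients value is to divide the memory you can spare for Apache by the average size of one Apache process. The sketch below just does that arithmetic. The function name and the sample numbers are hypothetical, and the result is a starting point to test under load, not a definitive setting.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, avg_process_mb):
    """Estimate how many Apache processes fit in memory without swapping.

    total_ram_mb:   physical RAM on the box
    reserved_mb:    RAM set aside for the OS, database, and other processes
    avg_process_mb: average resident size of one Apache process (measure it)
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0
    return int(available_mb // avg_process_mb)


# Hypothetical box: 1 GB of RAM, 256 MB reserved, 20 MB per Apache process.
print(estimate_max_clients(1024, 256, 20))
```

If the estimate comes out lower than the number of simultaneous visitors you expect, that is a hint to revisit the hardware review above rather than to simply raise the number.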
As one example, consider this anecdote. After implementing a new look and feel for one of our sites (a big online community with over 20K members at the time), the thing began running like a dog. Server loads were very low (less than 1.0 on the 1 minute load average, even with 200-400 members on at any time). Couldn’t figure it out. Sooooo ….
1) We optimized our new graphics and achieved some space savings there. But, no dice. Site still ran slower than hell.
2) We tweaked our Apache configuration until we were blue in the face. Couldn’t get any love.
As it turned out, our images were not being cached because we had enabled Apache’s “cookie logging” capabilities (mod_usertrack). When you do this, each and every response from your web server will be accompanied by a cookie. In most browsers, this will automatically render the object (image, page, etc.) as un-cacheable. So every image request (and this site was image-heavy) was being served in a non-cached mode. We disabled cookie logging and solved the problem. Page loads went from upwards of 20 seconds per page to less than 2 seconds. Problem solved, life was good.
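Had we looked at the response headers for a single image early on, we might have saved ourselves a few weeks. Here is a minimal sketch of that kind of check. It works on a plain dictionary of headers (however you capture them), and both the function name and the list of "red flags" are my own assumptions. Whether a cookie actually prevents caching depends on the browser or proxy in question.

```python
def caching_red_flags(headers):
    """Return a list of reasons a static object might not be cached.

    headers: dict of HTTP response header names to values.
    """
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    flags = []

    if "set-cookie" in normalized:
        flags.append("response sets a cookie")

    cache_control = normalized.get("cache-control", "").lower()
    if "no-store" in cache_control or "no-cache" in cache_control:
        flags.append("Cache-Control restricts caching")

    caching_headers = ("cache-control", "expires", "etag", "last-modified")
    if not any(name in normalized for name in caching_headers):
        flags.append("no caching or validation headers present")

    return flags


# Hypothetical headers for an image served with cookie tracking enabled.
image_headers = {"Content-Type": "image/png", "Set-Cookie": "Apache=abc123; path=/"}
for flag in caching_red_flags(image_headers):
    print(flag)
```

Running a check like this against a handful of images, stylesheets, and scripts takes minutes and points you at the configuration instead of the hardware.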
Perform a network review
Are you running on a busy or overloaded segment of the data center’s network? You are getting regular (raw) bandwidth reports from your hosting provider, right? Are you sure that the network adapters (NICs) on your servers are running at 100 Mbps, and not 10 Mbps?
Here’s another anecdote for you. One of our sites was running on a dual AMD 2U rackmounted server, with 3GB of RAM. The MySQL database server for this site ran on an identical box. We were experiencing slowdowns, and spent several weeks frantically trying to figure out the cause. With 500-700 users on at any given time, sometimes it would be lightning fast, the next minute tortoise slow. We optimized these boxes until we were having dreams about optimizing servers. It turns out that our service provider had capped our bandwidth to that IP address at 3 Mbps, without telling us. And we were peaking, causing significant delays in page load times (15-60 seconds in some cases). Our fixed costs went up slightly each month, as we had to move to a 5 Mbps plan, but having them raise our limit solved the problem.
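A little back-of-the-envelope arithmetic would have flagged this sooner. The sketch below estimates the bandwidth needed for a given page weight and request rate, and compares it to a cap. I am assuming the cap is expressed in megabits per second, and the page size and request rate are made-up numbers for illustration, not figures from our site.

```python
def required_mbps(page_kilobytes, pages_per_second):
    """Bandwidth (megabits per second) needed to serve the given load."""
    bits_per_page = page_kilobytes * 1000 * 8  # kilobytes -> bits
    return bits_per_page * pages_per_second / 1_000_000


def exceeds_cap(page_kilobytes, pages_per_second, cap_mbps):
    """Return True when the estimated demand is above the bandwidth cap."""
    return required_mbps(page_kilobytes, pages_per_second) > cap_mbps


# Hypothetical load: 100 KB per page (images included), 4 page views a second.
demand = required_mbps(100, 4)
print("Estimated demand (Mbps):", demand)
print("Over a 3 Mbps cap:", exceeds_cap(100, 4, 3))
print("Over a 5 Mbps cap:", exceeds_cap(100, 4, 5))
```

Compare the estimate against the raw bandwidth reports from your provider. If the reported peak sits flat at a suspiciously round number, ask them about caps.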
Perform a security review
How confident are you in the integrity of your systems? Are you being victimized by “denial of service” (DoS) attacks? Are you the victim of an exploit? Has a hacker commandeered your system and is using it to share illegal copies of Terminator XI with all of his cyber-companions? If all of your available bandwidth is being siphoned off for non-value-adding purposes, then that is clearly a problem. If you do not have the expertise in-house to perform routine security audits, I would recommend bringing in an outside provider of such a service. There is a wealth of security tools, bulletins, and updates out there. Staying on top of security issues is a mission-critical task.
Perform code reviews of custom applications
Are you writing efficient code on your dynamically generated pages? Nothing screams “streamlining opportunity” quite like inefficient code. I see this quite often in situations where the core applications were written under the gun or off the cuff, and not a lot of forethought was put into future traffic patterns. It is pretty easy to kludge your way around damn near anything if you only have 10 people using the application. Funny, though, how many applications fall to pieces when hundreds or thousands of people begin pounding away on them.
Any further discussion on this topic here would be out-of-scope given the nature of this article, not to mention the sheer number of different programming languages and tools that are out there. However, historically, some things to look at are:
Perform database/SQL query reviews
Not every programmer is a database analyst, despite the fact that they know how to log into the database as an administrator and create new tables. Are you indexing properly? Are you doing complicated INNER JOINs without the proper indexes? Are you fetching entire rows when all you need are 2 columns?
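To make the indexing question concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in for whatever database you actually run. The table, column, and index names are hypothetical. It asks the database for its query plan before and after adding an index, and it fetches only the two columns it needs rather than entire rows. The exact wording of the plan varies between SQLite versions, and MySQL and friends have their own EXPLAIN output.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the database's plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, member_id INTEGER, title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (member_id, title, body) VALUES (?, ?, ?)",
    [(i % 50, "title %d" % i, "x" * 200) for i in range(1000)],
)

# Ask only for the columns the page needs, not SELECT *.
sql = "SELECT id, title FROM posts WHERE member_id = ?"

plan_before = query_plan(conn, sql, (7,))  # expect a full table scan
conn.execute("CREATE INDEX idx_posts_member ON posts (member_id)")
plan_after = query_plan(conn, sql, (7,))   # expect a search using the index

print("Before index:", plan_before)
print("After index: ", plan_after)
```

The habit matters more than the tool: before a query ships, ask the database how it intends to run it.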
I recall this one guy who wrote a web application that had 10 different queries on the page, all nested in a series of loops. In the end, each load of this particular page resulted in over 500MB of database traffic, and page load times in excess of 35 seconds. After some rework of his logic, query structure, and table design, the page resulted in only 50K of database traffic, and loaded in 1-2 seconds.
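I no longer have that code, so the following is only a sketch of the general pattern, again using sqlite3 as a stand-in with hypothetical table names. The first function runs one query per member inside a loop. The second gets the same answer from a single joined query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, member_id INTEGER, title TEXT);
    """
)
conn.executemany(
    "INSERT INTO members (id, name) VALUES (?, ?)",
    [(i, "member%d" % i) for i in range(1, 11)],
)
conn.executemany(
    "INSERT INTO posts (member_id, title) VALUES (?, ?)",
    [(1 + i % 10, "post %d" % i) for i in range(100)],
)


def post_counts_looped(conn):
    """One query for the members, then one more query per member."""
    queries = 1
    counts = {}
    members = conn.execute("SELECT id, name FROM members").fetchall()
    for member_id, name in members:
        row = conn.execute(
            "SELECT COUNT(*) FROM posts WHERE member_id = ?", (member_id,)
        ).fetchone()
        queries += 1
        counts[name] = row[0]
    return counts, queries


def post_counts_joined(conn):
    """The same result from a single query with a join."""
    rows = conn.execute(
        """
        SELECT m.name, COUNT(p.id)
        FROM members AS m
        LEFT JOIN posts AS p ON p.member_id = m.id
        GROUP BY m.id, m.name
        """
    ).fetchall()
    return dict(rows), 1


looped_counts, looped_queries = post_counts_looped(conn)
joined_counts, joined_queries = post_counts_joined(conn)
print("Looped queries:", looped_queries)
print("Joined queries:", joined_queries)
print("Same result:", looped_counts == joined_counts)
```

With 10 members the difference is trivial. With thousands of members and several levels of nesting, it is the difference between a page that loads and one that does not.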
Now, I don’t profess to have all the answers. That’s the wonderful thing about technology – you can never have all the answers. There isn’t a day that goes by that at some point I don’t go “Hmm. I didn’t know that.” However, I am hopeful that these tips will provide some real-world ideas for those of you faced with a performance issue.