The final solution: nginx+apache2 and memcached
Been a while!
But I'll make up for the huge time gap: this post will probably be one of the most useful I'll ever post.
I run a high-traffic website and have been running it for about 5 years now. Over the past few years, though, traffic has grown so much that my servers were regularly overloaded and the website inaccessible. My website profile: an Invision Power Board based website (heavily modded though), running under PHP 5 and MySQL 5. Servers are hosted in France at OVH.com.
At first, my reasoning was quite simple: spend more money on a more powerful server. I went through 5 or 6 server upgrades over the years. I must say it worked at first, since I was running low-end servers. But over the last couple of months the traffic became way too high: the website was completely inaccessible from part of the world (for visitors in remote countries such as Canada, connections frequently timed out) and just plain slow for everyone else. At that time the traffic was nearly 60K unique visitors/day, about 10 million visits/month.
My server setup: a quad-core with 8GB of RAM for Apache, and a quad-core with 4GB of RAM for MySQL, both using SATA2 RAID0 HDDs, connected to each other with a 1 Gb link.
Well, I've finally settled on a solution that seems to be working great. The website's fast for everyone, even for me over here in China.
1. Running PHP with Fast-CGI
My website is a community-based website, which means the site is strongly dynamic. Every page served is PHP, strictly no HTML. The "default" option for serving PHP is to use PHP as an Apache module. The problem with this solution is that the PHP interpreter is embedded in every Apache/httpd process, so each process serving a request carries PHP's memory footprint, even when it only handles an image or a stylesheet. With a high-traffic website this isn't necessarily a good solution, especially if your server doesn't have much RAM.
So the first thing I did was to switch PHP from an Apache module to FastCGI.
These tutorials should help you set up PHP as a FastCGI module for Apache:
[english] http://www.fastcgi.com/drupal/node/5?q=node/10
[french] http://www.sos-dedie.com/2009/01/15/apache-2-worker-et-php-fastcgi/
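Once the switch is made, it helps to confirm how PHP is actually being run. The snippet below is a minimal sketch, not taken from the tutorials above: describe_sapi() is a hypothetical helper name, and it relies on php_sapi_name(), which typically reports "cgi-fcgi" under FastCGI and "apache2handler" when PHP runs as an Apache module. Drop it in a test page and load that page through the web server.

```php
<?php
// Hypothetical helper (not from the original post): turns the value returned
// by php_sapi_name() into a readable description of how PHP is running.
function describe_sapi($sapi)
{
    if ($sapi === 'cgi-fcgi') {
        return 'FastCGI';
    }
    // mod_php reports names starting with "apache" (e.g. "apache2handler").
    if (strpos($sapi, 'apache') === 0) {
        return 'Apache module';
    }
    return 'other (' . $sapi . ')';
}

echo 'PHP is running as: ' . describe_sapi(php_sapi_name()) . "\n";
```

If the page still reports "Apache module" after the change, the old module is probably still loaded in the Apache configuration.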
2. Setting up nginx for serving static content
Apache will be serving dynamic content via PHP as a FastCGI module. But in front of Apache, we'll be using another web server, a very lightweight one, for serving static content. Basically, this means PHP pages will be served by Apache, but other content (images, JavaScript, CSS, static HTML...) is served by nginx, which is extremely fast and reliable for such things.
How do you put such a setup in production? Simple: nginx listens on port 80, Apache on port 8080 (or another), and nginx is configured to forward all requests for dynamic content to Apache. This is called "using nginx as a reverse proxy".
Here are a few articles about it:
[english] http://kovyrin.net/2006/05/18/nginx-as-reverse-proxy/
[english] http://wiki.joyent.com/accelerators:nginx_apache_proxy
[french] http://www.papygeek.com/software/optimiser-son-serveur-web-avec-nginx/
Neither of the English articles covers vhost issues, so if I get comments asking how to proceed, I'll post a new article about it.
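To make the port 80 / port 8080 split concrete, here is a rough sketch of what the nginx side of such a setup might look like. It is not the configuration from the articles above: the server name, the document root and the list of file extensions are placeholders to adapt, and Apache is assumed to listen on 127.0.0.1:8080.

```nginx
# Sketch only: server_name, root and the extension list are assumptions.
server {
    listen 80;
    server_name example.com;

    # Static files are served straight from disk by nginx.
    location ~* \.(jpg|jpeg|gif|png|ico|css|js|html)$ {
        root /var/www/example;
        expires 30d;
    }

    # Everything else (the PHP pages) is handed over to Apache.
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

The proxy_set_header lines pass the original host name and visitor IP along to Apache, which otherwise only sees connections coming from nginx itself.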
3. Memcached and memcache
I had a bit of trouble figuring this one out, as I couldn't really find any article explaining the difference between memcache and memcached. So here's the deal.
- "memcached" (note the trailing "d", which stands for "daemon") is a process that runs on your machine and lets you easily cache data in memory (RAM). It's basically a simple and efficient cache manager. It listens on a given port and you can connect to it via...
- memcache: this is the name of the PHP module that allows you to make use of memcached. You're going to need to install it because it doesn't come with PHP! Memcached and memcache can be found in the usual repositories (e.g. rpmforge).
[english] http://www.danga.com/memcached/ < Memcached official website
[english] http://www.php.net/memcache < Memcache (PHP module) official website
Installing and configuring both isn't the only thing you have to do. You're going to have to make use of the memcache PHP module functions. That's the trick! But I'll guide you through it.
Here are a couple of handy functions:
memcache_connect ($host, $port, $timeout) : connects to the memcached server you've set up on your machine and returns a connection object, which you pass as the first argument to the functions below.
memcache_get ($memcache, $key) : gets a value from the cache. Returns false if the key was not found.
memcache_set ($memcache, $key, $data, $flag, $ttl) : saves a value into the cache.
memcache_delete ($memcache, $key) : deletes the value from the cache.
Now how to use these functions: this couldn't be any simpler.
- Begin by connecting to the memcached server using memcache_connect()
- Before running any SQL query, ask yourself: can this query be cached? In theory, most queries can be. In my case, I used the memcache functions to optimize my portal page (index.php) which is basically a simple news article display. In other words, the content almost never changes, so this kind of query can definitely be cached.
Here is a simple code example:
// Connect to the memcached server (localhost is an assumption; 11211 is memcached's default port)
$memcache = memcache_connect("localhost", 11211);
// Retrieve data from cache
$articles = memcache_get($memcache, "news_articles");
// memcache_get returns false when the key isn't in the cache store
if ($articles === false) {
    $articles = array();
    $result = mysql_query("SELECT * FROM news ORDER BY news_id DESC LIMIT 0,10");
    while ($row = mysql_fetch_assoc($result)) {
        $articles[] = $row;
    }
    // Save the data into the cache.
    // The array is serialized to a string before being stored
    // (the memcache module would also serialize non-string values on its own).
    // The data is saved for 1 week, as defined by the last parameter (in seconds).
    memcache_set($memcache, "news_articles", serialize($articles), MEMCACHE_COMPRESSED, 60*60*24*7);
}
// $articles was found in the cache
else {
    $articles = unserialize($articles); // unserialize the data
}
// You now have a fully loaded $articles array, ready for display!
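The same get / query / set pattern tends to repeat for every cacheable query, so it can be wrapped in a small helper. What follows is only a sketch under a few assumptions: cache_remember(), cache_forget() and load_latest_news() are hypothetical names, not part of the memcache module, and the helper uses the object-style methods (get, set, delete) of the connection returned by memcache_connect(). It lets the module serialize the array itself, and since a miss is signalled by false, a cached value of false cannot be told apart from a miss.

```php
<?php
// Hypothetical helper: return the cached value for $key, or call $producer
// to build it and keep it in the cache for $ttl seconds.
function cache_remember($memcache, $key, $ttl, $producer)
{
    $cached = $memcache->get($key);
    if ($cached !== false) {
        return $cached;                      // cache hit
    }
    $value = call_user_func($producer);      // cache miss: run the real query
    $memcache->set($key, $value, 0, $ttl);   // flag 0 = no compression
    return $value;
}

// Hypothetical helper: drop a key so the next request rebuilds it,
// e.g. right after a new news article is published.
function cache_forget($memcache, $key)
{
    return $memcache->delete($key);
}

// Usage sketch (assumes memcached on localhost:11211 and a function
// load_latest_news() that runs the SQL query and returns an array):
//
//   $memcache = memcache_connect("localhost", 11211);
//   $articles = cache_remember($memcache, "news_articles", 60*60*24*7, "load_latest_news");
//   cache_forget($memcache, "news_articles"); // after posting a new article
```

Calling cache_forget() whenever the underlying data changes keeps a long TTL like one week from serving stale articles.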
Conclusion
I managed to reduce the number of SQL queries on my main page from an average of 15 to... 5. Every single page of my website loads nearly instantly, even during a high influx of visitors. Lately I had about 3000 users online simultaneously, and I didn't notice any slowdowns.
So I can safely say that these 3 points described above actually solved all my issues.
What a relief!
Clem