You need to move this site to a host that can handle the load today. I'm getting timeouts; I finally got the home page to load, but I can't get any of the other pages to load.
I told Trafton and company about this because I'm working on-site at WPEngine this week and this looked very relevant to their interests. They specifically want to rescue your site (gratis) so that they can use it as an example of "WordPress hosting is hard to do right. You don't want it to blow up on launch day. We're very experienced with not blowing up on launch day."
I.e., he doesn't want to sell you on WPEngine, he wants to save your bacon and use the bacon-saving to sell other people on WPEngine ;)
Jason Cohen (their CEO) says "We'd handle the migration immediately and give him a year of free hosting to get a case study out of this."
You should add more info to your marketing about how you are good at handling traffic and, most importantly, why, in a way that makes sense to both tech and non-tech types (non-tech people have no idea what AWS or a CDN is, for example). I've looked at your site a few times (even right now) and that isn't the thing that stuck in my mind from your marketing. The takeaway for me was "expert at WP", not necessarily "expert handling of traffic and here's why". I'm not saying it's not mentioned. I just think that point has to be driven home better.
We're (literally) working on a website refresh right now, which hits that and other points. Sample factoid: One of the customers was on 20/20 recently and sustained 2,500 requests per second for 15 minutes.
Boring technical details: Varnish caching, automatic load balancing, redundant servers (beefy physical hardware to avoid poor disk performance on virtualized systems), "Death to KeepAlive", etc.
I would add an additional bullet to the home page adapting a key statement from that page where they said "sit back and be happy you're not having to do all this yourself."
Why our engine never stops
-----------------------------
(Hey, why are we so fast, secure and scalable?)
Sit back and be happy that you're not having to do all this yourself
Thanks for the suggestions. We're still in the process of updating things, but we'll make "Compelling for non-technical users who just want things to not break, comprehensive for technical people who want to understand we're not snow-jobbing them." a priority.
I just grepped the log and saw that in the three hours since I posted the screen grab there were 1610 unique IPs. So obviously many more people were not able to reach the site.
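A count like the one above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes the client address is the first whitespace-separated field on each line (as in Apache's common log format), and the function name and sample lines are made up for illustration.

```python
def count_unique_ips(log_lines):
    """Count distinct client addresses, assuming the address is the first field."""
    ips = set()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            ips.add(fields[0])
    return len(ips)


# Made-up sample lines in Apache's common log style, for illustration only.
sample = [
    '203.0.113.5 - - [01/Jan/2000:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '203.0.113.5 - - [01/Jan/2000:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512',
    '198.51.100.7 - - [01/Jan/2000:13:56:01 +0000] "GET / HTTP/1.1" 200 2326',
]
print(count_unique_ips(sample))  # 2
```

To run it against a real log, pass an open file object instead of the sample list.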
I got about 12k people through to my blog post "Thanks Louis C.K now here's my dad" the other day, and then about 4,000 of them clicked through to nickdooley.com (which is on the same server).
I had already increased my MaxClients setting to 120 a few weeks earlier, when I got about the same number through to my "Your templating engine sucks and everything you've ever written is spaghetti code" post. But I forgot that with people downloading those mp3 files, connections would be held open for much longer, so I had to put MaxClients up again to 220 and restart Apache at one point (which I'm sure kicked a bunch of people off, some of whom would have given up on loading the site at that point).
With MaxClients set at 220 the site was able to serve pages quickly and reliably the rest of the day as well as allowing roughly 5500 people to download my Dad's music files.
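The reason long downloads eat up MaxClients slots can be sketched with Little's law: the average number of connections in flight is the arrival rate multiplied by how long each one stays open. The arrival rate and durations below are hypothetical figures chosen for illustration, not measurements from this site.

```python
def concurrent_connections(arrivals_per_second, seconds_held_open):
    """Little's law: average number in flight = arrival rate x time in system."""
    return arrivals_per_second * seconds_held_open


# Hypothetical figures, not measurements from this site.
page_views = concurrent_connections(arrivals_per_second=5, seconds_held_open=0.5)
mp3_downloads = concurrent_connections(arrivals_per_second=5, seconds_held_open=30)

print(page_views)     # 2.5
print(mp3_downloads)  # 150
```

At the same request rate, the slow transfers need far more worker slots than the quick page views, which is roughly what happened here.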
I'm running the site on Apache 2.x on a CentOS VM with PHP 5.2.x and 2GB of RAM. We don't serve static assets with nginx, so everything comes out of the same Apache instance (yes, I know I really need to get around to sorting that out).
We have a crappy cache in Decal CMS (which we make and which runs all our sites) right now which still serves pages from PHP but does so from cached content.
At some point we're going to move to using Varnish in front of nginx for static assets, so that the only requests that get through to Apache are those that generate pages after a site publish and those where a cookie is present (i.e. when someone is actually editing their site). For now we're able to manage with just the shitty hack we've got in place to serve cached documents.
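The routing plan described above can be sketched as a small decision function. This is an illustration only: a real setup would express the rules in Varnish's own configuration, and the function name, the suffix list and the three return labels are all assumptions, not part of Decal CMS.

```python
# Hypothetical list of file types treated as static assets.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".mp3")


def choose_backend(path, has_cookie, in_cache):
    """Decide which tier should answer a request under the planned setup."""
    if path.lower().endswith(STATIC_SUFFIXES):
        return "nginx"   # static assets never reach Apache
    if has_cookie:
        return "apache"  # someone is editing their site, so skip the cache
    if in_cache:
        return "cache"   # published page already generated
    return "apache"      # first request after a publish regenerates the page


print(choose_backend("/music/track.mp3", has_cookie=False, in_cache=False))  # nginx
print(choose_backend("/about", has_cookie=False, in_cache=True))             # cache
print(choose_backend("/about", has_cookie=True, in_cache=True))              # apache
```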
Setting your max clients to 220 on that setup is a terrible idea. You will probably cause the machine to start swapping as there is insufficient memory for each PHP process. It will grind to a slow halt. You should calculate your MaxClients based on the amount of load the machine can actually handle, not the amount you wish it could handle :-)
A conservative estimate of 20MB per PHP process would already put the requirements for 220 of them at over 4GB (twice what you have) and that's not allowing anything for the OS and anything else running on the machine. It's not an exact science figuring out the appropriate MaxClients, but you should find out how much the rest of the machine needs, look up the average Apache process size (ps), divide the available memory by that and then reduce it a little. Use this as your starting point for figuring out the MaxClients. Keep an eye on the available free memory for a while (vmstat) and gradually up that limit until you reach a comfortable working level for the setting.
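The procedure above can be written down as a rough calculation. This is a sketch under stated assumptions: the 2GB total and the 20MB per process come from this thread, while the 512MB reserved for the OS and other services and the 10% safety reduction are assumed figures you would replace with your own measurements.

```python
def estimate_max_clients(total_mb, reserved_mb, per_process_mb, headroom=0.9):
    """Starting value for MaxClients: memory left for Apache divided by the
    average process size, reduced a little for safety."""
    available_mb = total_mb - reserved_mb
    if available_mb <= 0 or per_process_mb <= 0:
        return 0
    return int(available_mb / per_process_mb * headroom)


# 2GB machine and 20MB per process come from the thread;
# the 512MB reserved for the OS and other services is an assumed figure.
print(estimate_max_clients(total_mb=2048, reserved_mb=512, per_process_mb=20))  # 69

# Memory that 220 processes would need at 20MB each, in MB.
print(220 * 20)  # 4400
```

The result is only a starting point; the advice to watch free memory and raise the limit gradually still applies.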
Yeah - what's weird is that it didn't start swapping. I haven't really looked into it, but maybe because so many of the processes were downloading those MP3 files they didn't take up as much memory (although that sounds wrong as I'm pretty sure as far as Apache is concerned, a process is a process regardless of what's being downloaded).
At any rate it worked fine and was maxed at one point so hell knows why. I increased it gradually and was looking at the free memory etc. but it never exhausted it - but ps ax | grep httpd showed circa 200 processes.
I've put it back down to something safer now ;) when I have some spare time I'm going to try and figure out if/why it didn't appear to run out of memory when it looked like it should have.
But then again when I have some spare time I really just want to set up a better hosting environment that doesn't rely so much on apache.
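For the memory question above, one starting point is to measure what the httpd processes actually hold resident rather than guessing. The sketch below parses one resident-set size per line, in KB, as printed by something like `ps -o rss= -C httpd` on Linux. The sample values are made up, and the function name is hypothetical. Note that RSS counts pages shared between processes once per process, so the total overstates real usage; that may be part of why the machine coped, though this thread doesn't establish the cause.

```python
def summarize_rss(ps_output):
    """Return (process_count, total_mb, average_mb) from RSS values in KB."""
    sizes_kb = [int(token) for token in ps_output.split()]
    if not sizes_kb:
        return 0, 0.0, 0.0
    total_mb = sum(sizes_kb) / 1024
    return len(sizes_kb), total_mb, total_mb / len(sizes_kb)


# Made-up sample of what `ps -o rss= -C httpd` might print (values in KB).
sample = "20480\n10240\n 5120\n 5120\n"
count, total_mb, average_mb = summarize_rss(sample)
print(count, total_mb, average_mb)  # 4 40.0 10.0
```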
I never got 10K from Hacker News alone, but on my blog I got 10K from HN combined with Reddit, with the links being submitted at approx. the same time, for 3 articles I wrote.
My website is static, with no MySQL or PHP to speak of (by means of Jekyll), right now hosted on an AWS micro instance, served by Nginx. Prior to this it was hosted on Heroku's free plan, with offloading of static assets to GAE. For free and it didn't even blink.
Seeing how people are talking about caching, load-balancing, clusters, beefy servers and so on, just makes me think how extremely awful and bloated WordPress is.
I don't think it's fair to conclude that WordPress is awful and bloated based on it performing worse than your setup. Static files, a CDN and no dynamic scripts are the ultimate optimization. WordPress compares unfavourably because anything involving dynamic features would compare unfavourably.