"Curiosity is the very basis of education and if you tell me that curiosity killed the cat, I say only the cat died nobly." - Arnold Edinborough

Recently a post of mine was linked on yCombinator and for some reason a lot of the comments talked about the efficiency of WordPress. While it’s technically not related to the subject of the linked post I just want to point out that the performance of WordPress is pretty horrible regardless of whether you use Apache or nginx.

My friend Karl Blessing and I recently talked about WordPress caching plugins. He uses WP SuperCache and I use W3 Total Cache and he subsequently decided to do some WordPress caching benchmarks on the different methods. He’s done an awesome job and generated some pretty graphs for you to look at.

What I took away from the whole thing is that W3 Total Cache and WP SuperCache can offer similar performance if you’re willing to do static file caching, however, W3 Total Cache can offer a cleaner solution with caching in Memcached if you’re willing to sacrifice a bit of performance. The benefit to this, and why I use this method, is that you don’t need complicated rules in your nginx (or Apache) configuration files.

The Big Picture

So you’ve finally decided to make the switch from Apache to nginx. You most likely did this for performance reasons; perhaps all those blogs have been writing about how fast nginx is or perhaps your webmaster friends have been raving about how they can now handle a lot more traffic without spending money on hardware.

This is usually all true, but why exactly is nginx so much faster than the typical Apache setup of the prefork MPM and mod_php? The technical explanation is that nginx is a non-blocking event based architecture while Apache is a blocking process based architecture. To simplify it heavily the theory is like this:

Apache Prefork Processes:

  • Receive PHP request, send it to a process.
  • Process receives the request and pass it to PHP.
  • Receive an image request, see process is busy.
  • Process finishes PHP request, returns output.
  • Process gets image requests and returns the image.

While the process is handling the request it is not capable of serving another request, this means the amount of requests you can do simultaneously is directly proportional to the amount of processes you have running. Now, if a process took up just a small bit of memory that would not be too big of an issue as you could run a lot of processes. However, the way a typical Apache + PHP setup has the PHP binary embedded directly into the Apache processes. This means Apache can talk to PHP incredibly fast and without much overhead, but it also means that the Apache process is going to be 25-50MB in size. Not just for requests for PHP requests, but also for all static file requests. This is because the processes keep PHP embedded at all times due to cost of spawning new processes. This effectively means you will be limited by the amount of memory you have as you can only run a small amount of processes and a lot of image requests can quickly make you hit your maximum amount of processes.

Compare this to the nginx event based method.

Read More

“No input file specified” or “Primary script unknown” in the error log is one of the most frequently encountered issues in nginx+PHP. People on serverfault and in the #nginx IRC channel asks for help with this so often that this post is mostly to allow me to be lazy and not have to type up the same answer every time.

This is actually an error from PHP and due to display_errors being 0ff people will often just get a blank page with no output. In a typical setup PHP will then send the error to stderr or stdout and nginx will pick up on it and log it in the nginx error log file. Thus people spend a ton of time trying to figure out why nginx isn’t working.

The root cause of the error is that PHP cannot find the file nginx is telling it to look for, and there are a few common cases that causes this.

Read More

Edit: Part 2 is now available.

This is the first entry in a short series I’ll do on caching in PHP. During this series I’ll explore some of the options that exist when caching PHP code and provide a unique (I think) solution that I feel works well to gain high performance without sacrificing real-time data.

Caching in PHP is usually done on a per-object basis, people will cache a query or some CPU intensive calculations to prevent redoing these CPU intensive operations. This can get you a long way. I have an old site which uses this method and gets 105 requests per second on really old hardware.

An alternative that is used, for example in the Super Cache WordPress plug-in, is to cache the full-page data. This essentially mean that you create a page only once. This introduces the problem of stale data which people usually solve by checking whether data is still valid or by using a TTL caching mechanism and accepting stale data.

The method I propose is a spin on full-page caching. I’m a big fan of nginx and I tend to use it to solve a lot of my problems, this case is no exception. Nginx has a built-in Memcached module, with this we can store a page in Memcached and have nginx serve it – thus never touching PHP at all. This essentially turns this:

Read More

I love PHP, I really do… just, sometimes you fall into these situations where you just can’t help bang your head against the table until even the table starts bleeding.

I’m currently working on a blog series about intelligent caching with PHP. During the preparation for this I had to create a framework which could demonstrate the caching concepts I was going to be discussing, and what better than using namespaces for it. Namespaces makes auto loading easy as pie, they give you an incredible freedom in class naming due to the extra encapsulation layer and in general they’re just brilliant. Except for this.

$controller = '\Evil\Controller\News';
$data = $controller::dataKeyInvalidates($invalidate);

It simply wouldn’t work. I was certain it worked fine without namespaces so for kicks I tried this:

$data = \Evil\Controller\News::dataKeyInvalidates($invalidate);

And it ran without problems. Fun. What eventually let me figure out what was going on was by echoing the variable passed to my auto loader. When I used a variable to reference the class name the auto loader was passed “\Evil\Controller\News” when I didn’t the auto loader was passed “Evil\Controller\News”. Obviously some magic goes on inside PHP that translates \Evil\Controller\News into “Evil\Controller\News”. Setting $controller to ‘Evil\Controller\News’ made it work perfectly fine.

So to recap this:

// Does not work.
$controller = '\Evil\Controller\News';
$data = $controller::dataKeyInvalidates($invalidate); 

// Works.
$controller = 'Evil\Controller\News';
$data = $controller::dataKeyInvalidates($invalidate); 

// Works.
$data = \Evil\Controller\News::dataKeyInvalidates($invalidate); 

// Does not work.
$data = Evil\Controller\News::dataKeyInvalidates($invalidate);

Disclaimer

I have to admit up front that I know the author of this book and originally introduced him to nginx, that said, the friendship between us will not affect this review.

Conclusion

I always skip to this part when I read reviews, so I figure I might as well start with it, the details will follow after the conclusion.

Nginx HTTP Server is a new book by Clément Nedelcu and published by Packt Publishing, it’s the first English book about nginx to be available for purchase. Previously users had to depend on the wiki and various blogs for documentation; before that IRC and the mailing list. All aspects of nginx have undergone huge growth, from the feature set to the user base and finally to the documentation. This book fills an important area in nginx documentation that has been missing, a hard copy of coherent logical steps that documents the nginx HTTP server. The wiki is a great source of information, but it’s difficult to print out easily readable material to bring on the train or bus to work, or even if you just want to sit down with a cup of tea and read.

Read More