
Running PHP on NFS: huge performance problems and one simple solution.

Drupal, Joomla and other popular PHP applications use the PHP language constructs include_once() and require_once(). These constructs include other PHP files, but only once. To do so, PHP distinguishes files by their full path, which it resolves using the lstat system call.

Unfortunately, this leads to a lot of lstat calls, which are not cached on NFS. This in turn can slow down larger PHP applications by as much as a factor of ten.

Lstat and network latency problem

Using the strace utility to count lstat calls during one Apache + WordPress request gives results like this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 86.47    0.036700          38       957        58 lstat
  7.85    0.003333         476         7           chmod
  5.52    0.002345         156        15           recvfrom
  0.16    0.000066           1        96           munmap
  0.00    0.000000           0         3           read
  0.00    0.000000           0        25           write
  0.00    0.000000           0        12           open
  0.00    0.000000           0        14           close
  0.00    0.000000           0         1           stat
  0.00    0.000000           0        13           fstat
  0.00    0.000000           0        13           poll
  0.00    0.000000           0         7           lseek
  0.00    0.000000           0         6           mmap
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         2           access
  0.00    0.000000           0         1           setitimer
  0.00    0.000000           0         9           sendto
  0.00    0.000000           0         1           shutdown
  0.00    0.000000           0        14           flock
  0.00    0.000000           0        22        22 readlink
  0.00    0.000000           0         6         1 futex
------ ----------- ----------- --------- --------- ----------------
100.00    0.042444                  1226        81 total

As you can see, over 86% of the time spent in system calls was consumed by lstat. In effect, the page load time on the tested site was 12 seconds, while on a local file system it was just 1.5 seconds.
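A summary like the one above can be captured and read with a few lines of shell. This is only a sketch under assumptions: PID is a placeholder for one Apache worker process, and the report path and the helper name lstat_share are hypothetical. On newer systems the same checks may be reported under a different syscall name (for example newfstatat), so the filter may need adjusting.

```shell
# Capture a per-syscall summary first (usually needs root; stop with Ctrl-C):
#   strace -c -f -p PID -o /tmp/strace-summary.txt
#
# Then print the lstat row of that summary in a readable form.
lstat_share() {
    # $1 = file holding the `strace -c` report
    # Columns: % time, seconds, usecs/call, calls, [errors], syscall
    awk '$NF == "lstat" {
        printf "lstat: %s calls, %s%% of syscall time\n", $4, $1
    }' "$1"
}
```

Calling lstat_share /tmp/strace-summary.txt prints one line with the call count and the share of system call time.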

Step #1: increase the size of the realpath cache

The solution is to increase realpath_cache_size in php.ini. The realpath cache stores the resolved real paths of PHP files, so the checks that include_once() performs can be answered from the cache.

Unfortunately, the default php.ini file sets this to 16K, which makes it close to useless. You can solve this problem by changing realpath_cache_size in php.ini, like so:

; ...
; Determines the size of the realpath cache to be used by PHP. This value should
; be increased on systems where PHP opens many files to reflect the quantity of
; the file operations performed.
realpath_cache_size=1M
 
; Duration of time, in seconds for which to cache realpath information for a given
; file or directory. For systems with rarely changing files, consider increasing this
; value.
realpath_cache_ttl=300
; ...

In this case, increasing the size of the realpath cache resulted in a 600%-1000% performance improvement.

Unfortunately, there is a long-standing bug in PHP which disables the realpath cache when open_basedir or safe_mode is in use. To work around it, you can disable these two settings, or read step #2 below.
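Since a server may have several php.ini files (the CLI and the Apache module often use different ones), it can help to confirm what a given file actually sets. The helper below is a minimal sketch, not part of PHP: the name ini_value is hypothetical, and it assumes simple "name=value" lines whose values contain no "=" character.

```shell
# Print the last uncommented value of a directive in a php.ini-style file.
ini_value() {
    # $1 = path to the ini file, $2 = directive name
    awk -F= -v key="$2" '
        /^[ \t]*;/ { next }                      # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)              # strip blanks around the name
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = value                    # later lines override earlier ones
            }
        }
        END { if (found != "") print found }
    ' "$1"
}
```

For example, ini_value /etc/php.ini realpath_cache_size prints the value set in that file, or nothing if the directive is missing or commented out (the path is just an example).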

Step #2: install an additional PHP extension

This step is necessary only if you want to use the realpath cache with the open_basedir setting enabled.

I created a PHP extension which bypasses the open_basedir security checks implemented in the PHP core. You can download this module from this link.

In order to use this extension, you have to compile it first:

unzip realpath_turbo.zip
cd realpath_turbo
phpize
./configure
make
cp modules/turbo_realpath.so /usr/lib/php/modules

Please remember to change /usr/lib/php/modules to the path used by your PHP installation.

Next, you have to configure this PHP extension in the php.ini file, like so:
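If you are not sure which directory your installation uses, the PHP tools on your PATH can usually report it. This is a sketch that assumes php-config (shipped with the PHP development package) or the php CLI is installed. The function name php_extension_dir is hypothetical, and the CLI may be configured differently from the Apache module.

```shell
# Print the extension directory reported by the PHP found on PATH.
php_extension_dir() {
    if command -v php-config >/dev/null 2>&1; then
        php-config --extension-dir
    elif command -v php >/dev/null 2>&1; then
        # Fall back to asking the CLI for its extension_dir setting
        php -r 'echo ini_get("extension_dir"), PHP_EOL;'
    else
        echo "neither php-config nor php found on PATH" >&2
        return 1
    fi
}
```

The result can then be used as the copy target, for example: cp modules/turbo_realpath.so "$(php_extension_dir)"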

; you have to load the extension first
extension=turbo_realpath.so
; then copy the value of open_basedir into realpath_cache_basedir parameter
realpath_cache_basedir = /var/www/html/drupal
; and finally DISABLE the open_basedir setting,
; it will be changed automatically to the value of a realpath_cache_basedir setting.
; open_basedir=""

As you can see, to use this extension you have to move the value of the open_basedir setting into realpath_cache_basedir and then disable open_basedir itself. After this, PHP will re-enable the open_basedir restrictions automatically.

Finally, reload the HTTP server. If everything’s okay, you should:

  1. see the turbo_realpath extension listed in the phpinfo() output
  2. be unable to access files outside the allowed paths set in realpath_cache_basedir
  3. see open_basedir set automatically to the value of realpath_cache_basedir in the phpinfo() report
  4. see a non-empty list of cached paths when calling the realpath_cache_get() function (this function is available in PHP 5.3.2+ only!).

Please note that there is a new version of the turbo_realpath extension; click here for details.

Warning! The realpath cache has one security flaw! Files whose paths have been cached can easily be replaced by symlinks pointing to other, protected files. As a result, a potential attacker can access files which could not be opened otherwise, because PHP assumes that cached paths are safe and unchanged. There is only one solution to this problem: disable the PHP functions responsible for creating symlinks in the php.ini file and delete all existing symlinks from the web directories.
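To act on this warning you first need to know which symlinks exist. The sketch below only lists them so they can be reviewed before deletion. The function name list_symlinks is hypothetical, and the web root in the usage line is simply the example path used earlier in this article.

```shell
# In php.ini, symlink creation from PHP code can be blocked with:
#   disable_functions = symlink
#
# List every symbolic link below a web root so it can be reviewed and removed.
list_symlinks() {
    # $1 = directory to scan, e.g. /var/www/html/drupal
    find "$1" -type l -print
}
```

Running list_symlinks /var/www/html/drupal prints one path per line; an empty result means no symlinks were found under that directory.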


7 Responses to “Running PHP on NFS: huge performance problems and one simple solution.”

  1. Stu says:

    Hi

    Just tried to compile this but getting the following error:

    [root@earth realpath_turbo]# make
    /bin/sh /root/realpath_turbo/libtool --mode=compile cc -I. -I/root/realpath_turbo -DPHP_ATOM_INC -I/root/realpath_turbo/include -I/root/realpath_turbo/main -I/root/realpath_turbo -I/hsphere/shared/php5-2/include/php -I/hsphere/shared/php5-2/include/php/main -I/hsphere/shared/php5-2/include/php/TSRM -I/hsphere/shared/php5-2/include/php/Zend -I/hsphere/shared/php5-2/include/php/ext -I/hsphere/shared/php5-2/include/php/ext/date/lib -DHAVE_CONFIG_H -g -O2 -c /root/realpath_turbo/turbo_realpath.c -o turbo_realpath.lo
    cc -I. -I/root/realpath_turbo -DPHP_ATOM_INC -I/root/realpath_turbo/include -I/root/realpath_turbo/main -I/root/realpath_turbo -I/hsphere/shared/php5-2/include/php -I/hsphere/shared/php5-2/include/php/main -I/hsphere/shared/php5-2/include/php/TSRM -I/hsphere/shared/php5-2/include/php/Zend -I/hsphere/shared/php5-2/include/php/ext -I/hsphere/shared/php5-2/include/php/ext/date/lib -DHAVE_CONFIG_H -g -O2 -c /root/realpath_turbo/turbo_realpath.c -fPIC -DPIC -o .libs/turbo_realpath.o
    /root/realpath_turbo/turbo_realpath.c: In function ‘PHP_INI_BEGIN’:
    /root/realpath_turbo/turbo_realpath.c:35: error: expected declaration specifiers before ‘PHP_INI_ENTRY’
    /root/realpath_turbo/turbo_realpath.c:48: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token
    /root/realpath_turbo/turbo_realpath.c:54: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token
    /root/realpath_turbo/turbo_realpath.c:60: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token
    /root/realpath_turbo/turbo_realpath.c:68: error: expected ‘{’ at end of input
    make: *** [turbo_realpath.lo] Error 1

    Any ideas what's wrong?
    Running on Centos 5 with 2.6.18-194.el5

  2. Stu says:

    Hi

    After a bit of Google searching I found that the following needs to be added to the top of turbo_realpath.c:

    #include "php_ini.h"

    This made Centos and php 5.2 happy ;)

  3. Artur Graniszewski says:

    Hi,

    first of all, sorry for the late reply, I was on vacation. You’re right, there should be #include "php_ini.h" in the PHP 5.2 extension. Unfortunately I tested my code only on PHP 5.3, so I did not know about this issue before.

    I’ll update the source code. Thank you very much for your help :)

  4. grubi says:

    Wouldn’t the best solution be to cache only non-symlinks when open_basedir/safe_mode is activated? That way you get the best performance for “normal” paths on the one hand, and the bad performance for symlinks on the other… without any security flaws

  5. Artur Graniszewski says:

    The problem would still exist. Imagine that a potential hacker creates a regular file, opens it with PHP so that it gets cached, then quickly removes it and creates a symlink in its place. In that case PHP couldn’t tell whether the cached real path is still valid without using another fstat.

  6. Mark says:

    Think this would be useful for iSCSI users? Also, would you mind providing an example command for novices such as myself for checking these “lstat” results? Thanks for providing this script, I am just trying to research things as much as possible before trying it out.

  7. Artur Graniszewski says:

    iSCSI is much more efficient than NFS, so it shouldn’t be such a huge problem for PHP.

    Nevertheless, every extra system call slows the script down (even on a local file system). This can sometimes be a problem even for the fstats performed on .htaccess files by the Apache HTTPD.
