An Introduction to Apache’s MPM-Prefork or MPM-Worker Mode December 15, 2012

Posted by Tournas Dimitrios in Apache-Tips.

Anyone involved in web programming should have a broad range of skills. If we think of a web application as a chain, knowing the basic concepts behind each link is the best way to tackle “weird” symptoms if and when they occur. Moreover, when hosting websites on a VPS (or in the cloud), knowing what each link does helps us find the best combination of all the pieces in that chain. The web server is certainly an important link, and the most popular web server on the Internet today is Apache. This article looks at some basic concepts related to its internal functionality.

Load only the required modules:

Apache is a modular program: the administrator customizes its functionality by selecting a set of modules. These modules can be compiled either statically as part of the ‘httpd’ binary, or as Dynamic Shared Objects (DSO). DSO modules can either be compiled when the server is built, or added later to extend Apache’s functionality. Installing Apache on your favorite *nix distribution is best done through a package manager (for instance, CentOS uses Yum). Of course, the installation can also be done by downloading and compiling the source code, but this task is best left to those who really know what they are doing. To find out which modules Apache loads, run the following commands (the example was taken from a CentOS 6 box):

[root@aws-server]# apachectl -l
Compiled in modules:
[root@aws-server]# apachectl -t -D DUMP_MODULES
Loaded Modules:
 core_module (static)
 mpm_prefork_module (static)
 http_module (static)
 so_module (static)
 auth_basic_module (shared)
 auth_digest_module (shared)
 authn_file_module (shared)
 authn_alias_module (shared)
 authn_anon_module (shared)
 authn_dbm_module (shared)
 authn_default_module (shared)
 authz_host_module (shared)
 authz_user_module (shared)
 authz_owner_module (shared)
 authz_groupfile_module (shared)
 authz_dbm_module (shared)
 authz_default_module (shared)
 ldap_module (shared)
 authnz_ldap_module (shared)
 include_module (shared)
 log_config_module (shared)
 logio_module (shared)
 env_module (shared)
 ext_filter_module (shared)
 mime_magic_module (shared)
 expires_module (shared)
 deflate_module (shared)
 headers_module (shared)
 usertrack_module (shared)
 setenvif_module (shared)
 mime_module (shared)
 dav_module (shared)
 status_module (shared)
 autoindex_module (shared)
 info_module (shared)
 dav_fs_module (shared)
 vhost_alias_module (shared)
 negotiation_module (shared)
 dir_module (shared)
 actions_module (shared)
 speling_module (shared)
 userdir_module (shared)
 alias_module (shared)
 substitute_module (shared)
 rewrite_module (shared)
 proxy_module (shared)
 proxy_balancer_module (shared)
 proxy_ftp_module (shared)
 proxy_http_module (shared)
 proxy_ajp_module (shared)
 proxy_connect_module (shared)
 cache_module (shared)
 suexec_module (shared)
 disk_cache_module (shared)
 cgi_module (shared)
 version_module (shared)
 php5_module (shared)
Syntax OK

As the last example shows, only a few modules are statically compiled into the binary; all the others are loaded dynamically as shared objects, which means they can be enabled or disabled without rebuilding the server. The mod_so module, which is statically compiled into the Apache core, is responsible for loading every DSO module (each one is named by a ‘LoadModule’ directive in the ‘httpd.conf’ file). The example also shows that a fresh httpd installation loads many modules by default. Every loaded module adds to the memory footprint of each Apache process, so it’s wise to disable the modules our web application doesn’t need.
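To get a quick feel for how much there is to trim, the module listing can be summarized. The snippet below is a minimal sketch: `count_modules` is a hypothetical helper name (it is not part of Apache) that reads a `DUMP_MODULES` listing on standard input and counts the static and shared entries.

```shell
# count_modules: hypothetical helper, not shipped with Apache.
# Reads a DUMP_MODULES listing on stdin and counts the entries by type.
count_modules() {
    awk '
        /\(static\)/ { s++ }
        /\(shared\)/ { d++ }
        END { printf "static=%d shared=%d\n", s, d }
    '
}

# Typical use on a live server (2>&1 in case part of the output goes to stderr):
#   apachectl -t -D DUMP_MODULES 2>&1 | count_modules
```

For the listing shown earlier this would report 4 static and 53 shared modules. Each shared module can then be reviewed and, if unneeded, disabled by commenting out its ‘LoadModule’ line.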

Choose an appropriate MPM:

Each web server implements a different technique (model) for handling incoming HTTP requests in parallel. Apache offers several such models, called MPMs (Multi-Processing Modules), which are designed to be powerful and flexible and to work on a very wide variety of platforms. The popular MPMs are Prefork, Worker and Event; there are also completely different concurrency models, though (using asynchronous sockets/IO, as well as ones that mix two or even three models together). The Worker MPM uses threads, while the other popular model, Prefork (the default on CentOS), uses processes.

A basic explanation of multithreading and multiprocessing:

Multiprocessing is the coordinated execution of multiple instances of a program (processes) at the same time. The same program is loaded into memory multiple times, and each copy is assigned a unique ID (PID). Multiprocessing systems are much more complicated than single-process systems because the operating system must allocate resources to competing processes in a reasonable manner. UNIX is one of the most widely used multiprocessing systems, but there are many others, including OS/2 for high-end PCs. Apache’s Prefork mode is a multiprocessing model, as shown below:

[root@aws-server]# top -p $(pgrep -d',' http)
2520 root      20   0  304m  10m 5704 S  0.0  1.8   0:00.03 httpd
2522 apache    20   0  304m 5944  680 S  0.0  1.0   0:00.00 httpd
2523 apache    20   0  304m 5944  680 S  0.0  1.0   0:00.00 httpd
2524 apache    20   0  304m 5944  680 S  0.0  1.0   0:00.00 httpd
2525 apache    20   0  304m 5944  680 S  0.0  1.0   0:00.00 httpd
[root@aws-server]# pstree -p |grep httpd
|             |-httpd(2523)
|             |-httpd(2524)
|             |-httpd(2525)
[root@aws-server]# ps -ef |grep -i httpd
root    2520     1  0 18:08 ?     /usr/sbin/httpd
apache  2522  2520   0 18:08 ?   /usr/sbin/httpd
apache  2523  2520   0 18:08 ?   /usr/sbin/httpd
apache  2524  2520   0 18:08 ?   /usr/sbin/httpd
apache  2525  2520   0 18:08 ?    /usr/sbin/httpd

The Prefork MPM uses multiple child processes, and each child handles one connection at a time. Prefork is well suited to single- or dual-CPU systems; its speed is comparable to that of Worker, and it is highly tolerant of faulty modules and crashing children. The price is high memory usage: more traffic means more child processes, and therefore more memory.
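Because every Prefork child is a full process, memory sets a practical ceiling on how many connections can be served at once. A common rule of thumb (it does not come from the measurements in this article) divides the memory you are willing to give Apache by the resident size of one child. The sketch below does that arithmetic; `estimate_max_children` and the sample figures are hypothetical, and real children usually grow well beyond their idle size once they run application code.

```shell
# estimate_max_children: hypothetical helper illustrating a rule of thumb.
#   $1 = memory reserved for Apache, in KB
#   $2 = resident size (RES column in top) of one child, in KB
estimate_max_children() {
    avail_kb=$1
    per_child_kb=$2
    # Integer division: how many children of that size fit into the budget.
    echo $(( avail_kb / per_child_kb ))
}

# Example with assumed figures: a 256 MB budget and 20 MB per busy child.
estimate_max_children $((256 * 1024)) $((20 * 1024))    # prints 12
```

The result is only a starting point for the ‘MaxClients’ directive; it should be confirmed by watching the server under realistic load.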
Multithreading is the ability of a program (or an operating system process) to serve multiple requests at once without needing multiple copies of itself in memory. Each request is tracked by a thread with its own identity, and all the threads of a process share that process as their common parent. Apache’s Worker mode is a multithreaded model, as shown below:

[root@aws-server]# top -p $(pgrep -d',' http)
2561 root    	20   0  209m 8028 3688 S  0.0  1.3    httpd.worker
2563 apache    20   0  395m 5812 1352 S  0.0  1.0    httpd.worker
2564 apache    20   0  395m 5816 1356 S  0.0  1.0   httpd.worker
2565 apache    20   0  395m 5816 1356 S  0.0  1.0   httpd.worker
2602 apache    20   0  395m 5820 1360 S  0.0  1.0   httpd.worker

[root@aws-server]# pstree -p |grep httpd
|           |                    +-{httpd.worker}(2569)
|           |                    |-{httpd.worker}(2570)
|           |                    |-{httpd.worker}(2571)
|           |                    |-{httpd.worker}(2572)
|           |                    |-{httpd.worker}(2573)
|           |                    |-{httpd.worker}(2574)
|           |                    |-{httpd.worker}(2575)
|           |                    |-{httpd.worker}(2576)
|           |                    |-{httpd.worker}(2577)
|           |                    |-{httpd.worker}(2578)
|           |                    `-{httpd.worker}(2579)
|           |-httpd.worker(2564)-
|           |                    |-{httpd.worker}(2592)
|           |                    |-{httpd.worker}(2593)
|           |                    |-{httpd.worker}(2594)
|           |                    |-{httpd.worker}(2595)
|           |                    |-{httpd.worker}(2596)
|           |                    |-{httpd.worker}(2597)
|           |                    |-{httpd.worker}(2598)
|           |                    |-{httpd.worker}(2599)
|           |                    |-{httpd.worker}(2600)
|           |                    `-{httpd.worker}(2601)
|           |-httpd.worker(2565)-
|           |                    |-{httpd.worker}(2581)
|           |                    |-{httpd.worker}(2582)
|           |                    |-{httpd.worker}(2583)
|           |                    |-{httpd.worker}(2584)
|           |                    |-{httpd.worker}(2585)
|           |                    |-{httpd.worker}(2586)
|           |                    |-{httpd.worker}(2587)
|           |                    |-{httpd.worker}(2588)
|           |                    |-{httpd.worker}(2589)
|           |                    `-{httpd.worker}(2590)
|            `-httpd.worker(2602)-
|           |                   |-{httpd.worker}(2605)
|           |                   |-{httpd.worker}(2606)
|           |                   |-{httpd.worker}(2607)
|           |                   |-{httpd.worker}(2608)
|           |                   |-{httpd.worker}(2609)
|           |                   |-{httpd.worker}(2610)
|           |                   |-{httpd.worker}(2611)
|           |                   |-{httpd.worker}(2612)
|           |                   |-{httpd.worker}(2613)
|                               `-{httpd.worker}(2614)
[root@aws-server]# ps -ef |grep -i httpd
root   2561     1  0 17:23 ?    /usr/sbin/httpd.worker
apache  2563  2561  0 17:23 ?   /usr/sbin/httpd.worker
apache  2564  2561  0 17:23 ?   /usr/sbin/httpd.worker
apache  2565  2561  0 17:23 ?   /usr/sbin/httpd.worker
apache  2602  2561  0 17:23 ?    /usr/sbin/httpd.worker

As the above example shows, Apache was configured to spawn four child processes (in Worker MPM mode); each child is multithreaded, and each thread handles a single connection. Worker is fast and highly scalable, and its memory footprint is comparatively low. This mode of operation is well suited to multiprocessor systems and high-traffic websites. On the other hand, Worker is less tolerant of faulty modules, and a faulty thread can affect all the threads in its child process.
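Notice the first Apache process: it is owned by the root user and its parent ID is “1”. The explanation is simple. Our server is configured to listen on a privileged (“well-known”) port, 80, and only the “root” user is allowed to bind to ports below 1024, so the parent process has to be started as root. It runs as a daemon, so its own parent is init (PID “1”), while the children that actually serve requests run as the unprivileged “apache” user.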
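The thread counts visible in the pstree listing can also be read directly from the kernel. The sketch below assumes a Linux system with /proc mounted; `thread_count` is a hypothetical helper name. It prints the number of threads in a given process, taken from the `Threads:` field of /proc/PID/status.

```shell
# thread_count: hypothetical helper; Linux-only, relies on /proc.
#   $1 = process ID
thread_count() {
    # /proc/<pid>/status contains a line of the form "Threads: <number>".
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# On the server from the examples (PIDs taken from the listings above):
#   thread_count 2563    # a Worker child: more than one thread
#   thread_count 2522    # a Prefork child: typically a single thread
# Demonstration on the current shell, which is normally single-threaded:
thread_count $$
```

Comparing the value for a Prefork child with that of a Worker child makes the difference between the two models concrete.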

Defining Apache’s MPM model (Prefork vs Worker):

Each MPM is statically compiled into its own Apache binary: on CentOS 6, /usr/sbin/httpd contains Prefork and /usr/sbin/httpd.worker contains Worker, so only one MPM can be active at any time. Finding out which MPM the current binary uses is as simple as running the following command:

[root@aws-server]# apachectl -l
Compiled in modules:
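An alternative that reports the MPM by name is `httpd -V`, which prints the build settings, including a `Server MPM:` line. The sketch below wraps the parsing in a hypothetical helper, `mpm_name`, that reads that output on standard input; the sample line fed to it is illustrative and was not captured from the server used in this article.

```shell
# mpm_name: hypothetical helper that extracts the MPM from `httpd -V` output.
mpm_name() {
    # Split on the colon and any spaces after it, then lower-case the value.
    awk -F': *' '/^Server MPM:/ { print tolower($2) }'
}

# Illustrative sample line in the format printed by `httpd -V`:
printf 'Server MPM:     Prefork\n' | mpm_name    # prints "prefork"

# On a live server:
#   httpd -V | mpm_name
#   /usr/sbin/httpd.worker -V | mpm_name
```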

Changing Apache’s mode is as simple as opening the proper file (/etc/sysconfig/httpd) and uncommenting one line (HTTPD=/usr/sbin/httpd.worker). As the comments in that file point out, the service must be stopped before the variable is changed, and started again afterwards.

[root@aws-server]# vi /etc/sysconfig/httpd
# Configuration file for the httpd service.
#
# The default processing model (MPM) is the process-based
# 'prefork' model.  A thread-based model, 'worker', is also
# available, but does not work with some modules (such as PHP).
# The service must be stopped before changing this variable.
#
#HTTPD=/usr/sbin/httpd.worker
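The same change can be scripted. The sketch below assumes the file contains the commented-out line exactly as shown above; `enable_worker` is a hypothetical helper name, and the `service` commands in the comments assume a SysV-init system such as CentOS 6. A backup copy of the file is kept with a .bak suffix.

```shell
# enable_worker: hypothetical helper that uncomments the HTTPD line.
#   $1 = path to the sysconfig file (normally /etc/sysconfig/httpd)
enable_worker() {
    # Drop the leading '#' from the HTTPD=/usr/sbin/httpd.worker line only;
    # -i.bak edits in place and leaves the original as <file>.bak.
    sed -i.bak 's|^#\(HTTPD=/usr/sbin/httpd\.worker\)|\1|' "$1"
}

# Intended sequence on the server (run as root):
#   service httpd stop
#   enable_worker /etc/sysconfig/httpd
#   service httpd start
```

Switching back to Prefork means restoring the backup (or re-adding the ‘#’) with the service stopped.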

How PHP integrates with Apache:

PHP itself does not respond to the actual HTTP requests; this is the job of the web server. Requests for content that has to be generated dynamically are handed by the server to the PHP engine, and Apache then receives the result and sends it back to the browser. There are several ways to connect Apache with PHP, and the most popular is “mod_php”. This module is actually PHP itself, compiled as a web server module and loaded right inside Apache. The alternatives (CGI, FastCGI) run PHP outside the Apache process.
Here comes the hard part. If Apache is configured to handle concurrency with its Worker MPM (using threads), then PHP must be able to operate within the same multithreaded environment; in other words, PHP has to be thread-safe to play correctly with Apache. The default mod_php build is not thread-safe, so it can’t be used with Apache in this mode. There are thread-safe PHP builds (for instance, php-zts), but PHP is also extensible, so every extension you are going to use has to be thread-safe as well, and not every third-party PHP module is guaranteed to be. So think twice before you change Apache’s default MPM (Prefork).
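Whether a given PHP build is thread-safe can be checked from its own configuration report: `php -i` (like phpinfo()) includes a `Thread Safety` line. The sketch below parses that line with a hypothetical helper, `php_thread_safety`; the sample input is illustrative. Keep in mind that the command-line `php` binary and the module loaded into Apache can be different builds, so for the module a phpinfo() page served through Apache is the more reliable source.

```shell
# php_thread_safety: hypothetical helper; reads `php -i` output on stdin
# and prints the value of the "Thread Safety" line (enabled or disabled).
php_thread_safety() {
    awk -F' => ' '/^Thread Safety/ { print $2 }'
}

# Illustrative sample line in the format printed by `php -i`:
printf 'Thread Safety => disabled\n' | php_thread_safety    # prints "disabled"

# On a live server:
#   php -i | php_thread_safety
```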

Final thoughts:

This article has only scratched the surface of this huge topic and has left out important details (benchmarking, performance). My intention was to give a general overview of this important subject. So which version should you use, thread-safe or non-thread-safe? I don’t have a definitive answer. I’d guess that the non-thread-safe version (which is the default on most Linux distributions) should be our first choice, and if we really know what we are doing, the Worker MPM is a good alternative. But as always, run extensive tests on a development box before applying those changes to a live server.
