In my first take at this stuff I gave a sample on how to run django projects behind lighttpd with simple FCGI scripts integrated with the server. I will elaborate a bit on this stuff, with a way to combine lighttpd and Django that gives much more flexibility in distributing Django applications over machines. This is especially important if you expect high loads on your servers. Of course you should make use of the Django caching middleware, but there are times when even that is not enough and the only solution is to throw more hardware at the problem.
Update: I maintain my descriptions now in my trac system. See the lighty+FCGI description for Django.
Caveat: since Django is very new software, I don't have production experiences with it. So this is more from a theoretical standpoint, incorporating knowledge I gained with running production systems for several larger portals. In the end it doesn't matter much what your software is - it only matters how you can distribute it over your server farm.
To follow this documentation, you will need the following packages and files installed on your system:
- [Django] itself - currently fetched from SVN. Follow the setup instructions or use
python setup.py install.
- [Flup] - a package of different ways to run WSGI applications. I use the threaded WSGIServer in this documentation.
- [lighttpd] itself of course. You need to compile at least the fastcgi, the rewrite and the accesslog module, usually they are compiled with the system.
- [Eunuchs] - only needed if you are using Python 2.3, because Flup uses socketpair in the preforked servers and that is only available starting with Python 2.4
- [django-fcgi.py] - my FCGI server script, might some day be part of the Django distribution, but for now just fetch it here. Put this script somewhere in your $PATH, for example
/usr/local/binand make it executable.
- If the above doesn't work for any reason (maybe your system doesn't support socketpair and so can't use the preforked server), you can fetch [django-fcgi-threaded.py] - an alternative that uses the threading server with all it's problems. I use it for example on Mac OS X for development.
Before we start, let's talk a bit about server architecture, python and heavy load. The still preferred Installation of Django is behind Apache2 with modpython2. modpython2 is a quite powerfull extension to Apache that integrates a full Python interpreter (or even many interpreters with distinguished namespaces) into the Apache process. This allows Python to control many aspects of the server. But it has a drawback: if the only use is to pass on requests from users to the application, it's quite an overkill: every Apache process or thread will incorporate a full python interpreter with stack, heap and all loaded modules. Apache processes get a bit fat that way.
Another drawback: Apache is one of the most flexible servers out there, but it's a resource hog when compared to small servers like lighttpd. And - due to the architecture of Apache modules - mod_python will run the full application in the security context of the web server. Two things you don't often like with production environments.
So a natural approach is to use lighter HTTP servers and put your application behind those - using the HTTP server itself only for media serving, and using FastCGI to pass on requests from the user to your application. Sometimes you put that small HTTP server behind an Apache front that only uses modproxy (either directly or via modrewrite) to proxy requests to your applications webserver - and believe it or not, this is actually a lot faster than serving the application with Apache directly!
The second pitfall is Python itself. Python has a quite nice threading library. So it would be ideal to build your application as a threaded server - because threads use much less resources than processes. But this will bite you, because of one special feature of Python: the GIL. The dreaded global interpreter lock. This isn't an issue if your application is 100% Python - the GIL only kicks in when internal functions are used, or when C extensions are used. Too bad that allmost all DBAPI libraries use at least some database client code that makes use of a C extension - you start a SQL command and the threading will be disabled until the call returns. No multiple queries running ...
So the better option is to use some forking server, because that way the GIL won't kick in. This allows a forking server to make efficient use of multiple processors in your machine - and so be much faster in the long run, despite the overhead of processes vs. threads.
For this documentation I take a three-layer-approach for distributing the software: the front will be your trusted Apache, just proxying all stuff out to your project specific lighttpd. The lighttpd will have access to your projects document root and wil pass on special requests to your FCGI server. The FCGI server itself will be able to run on a different machine, if that's needed for load distribution. It will use a preforked server because of the threading problem in Python and will be able to make use of multiprocessor machines.
I won't talk much about the first layer, because you can easily set that up yourself. Just proxy stuff out to the machine where your lighttpd is running (in my case usually the Apache runs on different machines than the applications). Look it up in the mod_proxy documentation, usually it's just ProxyPass and ProxyPassReverse.
The second layer is more interesting. lighttpd is a bit weird in the configuration of FCGI stuff - you need FCGI scripts in the filesystem and need to hook those up to your FCGI server process. The FCGI scripts actually don't need to contain any content - they just need to be in the file system.
So we start with your Django project directory. Just put a directory publichtml in there. That's the place where you put your media files, for example the adminmedia directory. This directory will be the document root for your project server. Be sure only to put files in there that don't contain private data - private data like configs and modules better stay in places not accessible by the webserver.
Next set up a lighttpd config file. You only will use the rewrite and the fastcgi modules. No need to keep an access log, that one will be written by your first layer, your apache server. In my case the project is in
/home/gb/work/myproject - you will need to change that to your own situation. Store the following content as
server.modules = ( "mod_rewrite", "mod_fastcgi" ) server.document-root = "/home/gb/work/myproject/public_html" server.indexfiles = ( "index.html", "index.htm" ) server.port = 8000 server.bind = "127.0.0.1" server.errorlog = "/home/gb/work/myproject/error.log"
fastcgi.server = (
"/main.fcgi" => (
"main" => (
"socket" => "/home/gb/work/myproject/main.socket" ) ),
"/admin.fcgi" => (
"admin" => (
"socket" => "/home/gb/work/myproject/admin.socket" ) ) )
url.rewrite = (
mimetype.assign = (
I bind the lighttpd only to the localhost interface because in my test setting the lighttpd runs on the same host as the Apache server. In multi server settings you will bind to the public interface of your lighttpd servers, of course. The FCGI scripts communicate via sockets in this setting, because in this test setting I only use one server for everything. If your machines would be distributed, you would use the "host" and "port" settings instead of the "socket" setting to connect to FCGI servers on different machines. And you would add multiple entries for the "main" stuff, to distribute the load of the application over several machines. Look it up in the lighttpd documentation what options you will have.
I set up two FCGI servers for this - one for the admin settings and one for the main settings. All applications will be redirected through the main settings FCGI and all admin requests will be routed to the admin server. That's done with the two rewrite rules - you will need to add a rewrite rule for every application you are using.
Since lighttpd needs the FCGI scripts to exist to pass along the PATH_INFO to the FastCGI, you will need to touch the following files:
They don't need to contain any code, they just need to be listed in the directory.
Starting with lighttpd 1.3.16 (at the time of this writing only in svn) you will be able to run without the stub files for the .fcgi - you just add
"check-local" => "disable" to the two FCGI settings. Then the local files are not needed.
So if you want to extend this config file, you just have to keep some very basic rules in mind:
- every settings file needs it's own .fcgi handler
- every .fcgi needs to be touched in the filesystem - this might go away in a future version of lighttpd, but for now it is needed
- load distribution is done on .fcgi level - add multiple servers or sockets to distribute the load over several FCGI servers
- every application needs a rewrite rule that connects the application with the .fcgi handler
Now we have to start the FCGI servers. That's actually quite simple, just use the provided django-fcgi.py script as follows:
django-fcgi.py --settings=myproject.work.main --socket=/home/gb/work/myproject/main.socket --minspare=5 --maxspare=10 --maxchildren=100 --daemon
django-fcgi.py --settings=myproject.work.admin --socket=/home/gb/work/myproject/admin.socket --maxspare=2 --daemon
Those two commands will start two FCGI server processes that use the given sockets to communicate. The admin server will only use two processes - this is because often the admin server isn't the server with the many hits, that's the main server. So the main server get's a higher-than-default setting for spare processes and maximum child processes. Of course this is just an example - tune it to your needs.
The last step is to start your lighttpd with your configuration file:
lighttpd -f /home/gb/work/myproject/lighttpd.conf
That's it. If you now access either the lighttpd directly at http://localhost:8000/polls/ or through your front apache, you should see your application output. At least if everything went right and I didn't make too much errors.