Apache httpd is the most popular HTTP server, and having Apache httpd on a large installation is a must, just like panettone on Christmas day in Italy. Like the panettone, Apache comes in many flavors and with different fillings. You have to find the one you like.
In this recipe, we configure Apache with mod_proxy, and refine it through mod_rewrite rules. This is a simple but robust solution. It can be used to increase web2py scalability, throughput, security, and flexibility. These rules should satisfy both the connoisseur and the beginner.
This recipe will show you how to make a web2py installation on a host appear as part of a website, even when hosted somewhere else. We will also show how Apache can be used to improve the performance of your web2py application, without touching web2py.
You should have the following:
web2py installed and running on localhost with the built-in Rocket webserver (port 8000)
Apache HTTP server (httpd) version 2.2.x or later
mod_proxy and mod_rewrite (included in the standard Apache distribution)
On Ubuntu or other Debian-based servers, you can install Apache with:
apt-get install apache2
On CentOS or other Fedora-based Linux distributions, you can install Apache with:
yum install httpd
For most other systems you can download Apache from the website http://httpd.apache.org/, and install it yourself with the provided instructions.
Now that we have Apache HTTP server (from now on we will refer to it simply as Apache) and web2py both running locally, we must configure Apache.
Apache is configured by placing directives in plain text configuration files. The main configuration file is usually called httpd.conf. The default location of this file is set at compile time, but may be overridden with the -f command line flag. httpd.conf may include other configuration files. Additional directives may be placed in any of these configuration files.
The configuration files may be located in /etc/apache2, in /etc/apache, or in /etc/httpd, depending on the details of the OS and the Apache version.
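To find out which of these directories is present on a given machine, a small check such as the following may help. It is only a sketch: the function name find_apache_config_dirs is made up for this example, and the list contains just the three locations named above, so a custom build may keep its files elsewhere.

```python
import os

# The three locations named in the text; a custom build may use another one.
CANDIDATE_DIRS = ["/etc/apache2", "/etc/apache", "/etc/httpd"]


def find_apache_config_dirs(candidates=CANDIDATE_DIRS):
    """Return the candidate directories that exist on this machine."""
    return [path for path in candidates if os.path.isdir(path)]


# Prints a list, possibly empty if Apache is not installed in a known place.
print(find_apache_config_dirs())
```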
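Before editing any of the files, make sure that the required modules are enabled. On Debian-based systems, type the following from the command-line shell (bash):
a2enmod proxy
a2enmod proxy_http
a2enmod rewrite
The proxy_http module provides the http:// support that mod_proxy needs in order to reach web2py.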
With mod_proxy and mod_rewrite enabled, we are now ready to set up a simple rewrite rule to proxy forward HTTP requests received by Apache to any other HTTP server we wish. Apache supports multiple VirtualHosts, that is, it has the ability to handle different virtual host names and ports within a single Apache instance. The default VirtualHost configuration is in a file called /etc/<apache>/sites-available/default, where <apache> is apache, apache2, or httpd.
In this file each VirtualHost is defined by creating an entry as follows:
<VirtualHost *:80>
    ...
</VirtualHost>
You can read the in-depth VirtualHost documentation at http://httpd.apache.org/docs/2.2/vhosts/.
To use RewriteRules, we need to activate the Rewrite Engine inside the VirtualHost:
<VirtualHost *:80>
    RewriteEngine on
    ...
</VirtualHost>
Then we can configure the rewrite rule:
<VirtualHost *:80>
    RewriteEngine on
    # make sure we handle the case with no / at the end of URL
    RewriteRule ^/web2py$ /web2py/ [R,L]
    # when matching a path starting with /web2py/ do use a reverse
    # proxy
    RewriteRule ^/web2py/(.*) http://localhost:8000/$1 [P,L]
    ...
</VirtualHost>
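To get a feel for what these two rules do to an incoming path, the following Python sketch mimics them with the re module. It is an illustration only, not how Apache implements rewriting: the function name rewrite and the action labels "redirect", "proxy", and "pass" are invented for this example.

```python
import re

BACKEND = "http://localhost:8000/"  # where Rocket listens in this recipe


def rewrite(path):
    """Mimic the two RewriteRules; return an (action, target) pair."""
    # Rule 1: ^/web2py$ -> /web2py/ with flag R (external redirect)
    if re.search(r"^/web2py$", path):
        return ("redirect", "/web2py/")
    # Rule 2: ^/web2py/(.*) -> backend + $1 with flag P (proxy)
    match = re.search(r"^/web2py/(.*)", path)
    if match:
        return ("proxy", BACKEND + match.group(1))
    # No rule matched: Apache would handle the request itself
    return ("pass", path)


for sample in ("/web2py", "/web2py/welcome", "/index.html"):
    print(sample, "->", rewrite(sample))
```

The captured group plays the role of $1, so everything after /web2py/ is handed to the backend unchanged.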
The second rule tells Apache to do a reverse proxy connection to http://localhost:8000, passing all the path components of the URL called by the user, except for the first, web2py. The syntax used for rules is based on regular expressions (regex), where the first expression is compared to the incoming URL (the one requested by the user).
If there is a match, the second expression is used to build a new URL. The flags inside [ and ] determine how the resulting URL is to be handled. The previous example matches any incoming request on the default VirtualHost with a path that begins with /web2py, and generates a new URL by prepending http://localhost:8000/ to the remainder of the matched path; the part of the incoming URL that matches the expression .* replaces $1 in the second expression.
The flag P tells Apache to use its proxy to retrieve the content pointed to by the URL, before passing it back to the requesting browser.
Suppose that the Apache server responds at the domain www.example.com; then if the user's browser requests http://www.example.com/web2py/welcome, it will receive a response with the contents from the scaffolding application of web2py. That is, it would be as if the browser had requested http://localhost:8000/welcome.
There is a catch: web2py could send an HTTP redirect, for instance to point the user's browser to the default page. The problem is that the redirect is relative to web2py's application layout, the one that the Apache proxy is trying to hide, so the redirect is probably going to point the browser to the wrong location. To avoid this, we must configure Apache to intercept redirects and correct them.
<VirtualHost *:80>
    ...
    # make sure that HTTP redirects generated by web2py are reverted
    # / -> /web2py/
    ProxyPassReverse /web2py/ http://localhost:8000/
    ProxyPassReverse /web2py/ /
    # transform cookies also
    ProxyPassReverseCookieDomain localhost localhost
    ProxyPassReverseCookiePath / /web2py/
    ...
</VirtualHost>
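The effect of these directives on a redirect target or a cookie path can be pictured with a short Python sketch. This is a simplification under stated assumptions: Apache itself builds an absolute URL with the public host name, while the sketch shows only the path mapping, and the function names reverse_location and reverse_cookie_path are hypothetical.

```python
BACKEND = "http://localhost:8000/"  # internal address of web2py
PREFIX = "/web2py/"                 # public path exposed by Apache


def reverse_location(location):
    """Map a redirect target issued by web2py back under the public prefix."""
    # Corresponds to: ProxyPassReverse /web2py/ http://localhost:8000/
    if location.startswith(BACKEND):
        return PREFIX + location[len(BACKEND):]
    # Corresponds to: ProxyPassReverse /web2py/ /
    if location.startswith("/"):
        return PREFIX + location[1:]
    # Anything else (for example another site) is left untouched
    return location


def reverse_cookie_path(path):
    """Corresponds to: ProxyPassReverseCookiePath / /web2py/"""
    if path.startswith("/"):
        return PREFIX + path[1:]
    return path


print(reverse_location("http://localhost:8000/welcome/default/index"))
print(reverse_location("/welcome/default/index"))
print(reverse_cookie_path("/"))
```

Both redirect forms end up under /web2py/, which is the location the browser actually knows about.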
There is yet another issue. Many URLs generated by web2py are also relative to the web2py context. These include the URLs of images or CSS style sheets. We have to instruct web2py how to write the correct URL, and of course, since it is web2py, it is simple and we do not have to modify any of our application code. We need to define a file routes.py in the root of web2py's installation, as follows:
routes_out=((r'^/(?P<any>.*)', r'/web2py/\g<any>'),)
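The pattern and replacement in routes_out are ordinary Python regular expressions, so their effect can be tried outside web2py. The sketch below applies the same pair with re.sub; the helper apply_routes_out is invented for this illustration, and web2py's own route matching may add details that are not modeled here.

```python
import re

# The same (pattern, replacement) pair as in routes.py
routes_out = ((r'^/(?P<any>.*)', r'/web2py/\g<any>'),)


def apply_routes_out(url):
    """Apply the first matching outgoing route to a URL generated by web2py."""
    for pattern, replacement in routes_out:
        if re.match(pattern, url):
            return re.sub(pattern, replacement, url)
    return url


print(apply_routes_out("/welcome/static/base.css"))
print(apply_routes_out("/welcome/default/index"))
```

Every URL that web2py writes into a page is prefixed with /web2py, so links, images, and style sheets go back through the Apache proxy.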
Apache can, at this point, transform the received content before sending it back to the client. We have the opportunity to improve website speed in several ways. For example, we can compress all content before sending it back to the browser, if the browser accepts compressed content.
# Enable content compression on the fly,
# speeding up the net transfer on the reverse proxy.
<Location /web2py/>
    # Insert filter
    SetOutputFilter DEFLATE
    # Netscape 4.x has some problems...
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    # Netscape 4.06-4.08 have some more problems
    BrowserMatch ^Mozilla/4\.0[678] no-gzip
    # MSIE masquerades as Netscape, but it is fine
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
    # Don't compress images
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
    # Make sure proxies don't deliver the wrong content
    Header append Vary User-Agent env=!dont-vary
</Location>
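The two ideas in this block, skipping images and compressing everything else, can be tried in Python. This is a rough sketch: zlib only illustrates the size saving and does not model the wire format or negotiation that Apache performs, and the names should_compress and deflate are made up for the example.

```python
import re
import zlib

# Same extension test as the SetEnvIfNoCase line (case-insensitive)
IMAGE_RE = re.compile(r"\.(?:gif|jpe?g|png)$", re.IGNORECASE)


def should_compress(request_uri):
    """Images are already compressed, so they are skipped."""
    return IMAGE_RE.search(request_uri) is None


def deflate(body):
    """Compress a response body given as bytes."""
    return zlib.compress(body)


# Repetitive markup, typical of HTML pages, compresses well
html = b"<li>web2py behind Apache</li>\n" * 200
compressed = deflate(html)
print(len(html), "bytes before,", len(compressed), "bytes after")
print(should_compress("/web2py/welcome/static/logo.PNG"))
```

Text content such as HTML, CSS, and JavaScript usually shrinks considerably, which is why the filter is worth enabling on the proxy.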
It is possible in the same way, just by configuring Apache, to do other interesting tasks, such as SSL encryption, load balancing, acceleration by content caching, and many other things. You can find information for those and many other setups at http://httpd.apache.org.
Here is the complete configuration for the default VirtualHost as used in the following recipe:
<VirtualHost *:80>
    ServerName localhost
    # ServerAdmin: Your address, where problems with the server should
    # be e-mailed. This address appears on some server-generated pages,
    # such as error documents.
    ServerAdmin root@localhost
    # DocumentRoot: The directory out of which you will serve your
    # documents. By default, all requests are taken from this directory,
    # but symbolic links and aliases may be used to point to other
    # locations.
    # If you change this to something that isn't under /var/www then
    # suexec will no longer work.
    DocumentRoot "/var/www/localhost/htdocs"
    # This should be changed to whatever you set DocumentRoot to.
    <Directory "/var/www/localhost/htdocs">
        # Possible values for the Options directive are "None", "All",
        # or any combination of:
        #   Indexes Includes FollowSymLinks
        #   SymLinksifOwnerMatch ExecCGI MultiViews
        #
        # Note that "MultiViews" must be named *explicitly* ---
        # "Options All" doesn't give it to you.
        #
        # The Options directive is both complicated and important.
        # Please see http://httpd.apache.org/docs/2.2/mod/core.html#options
        # for more information.
        Options Indexes FollowSymLinks
        # AllowOverride controls what directives may be placed in
        # .htaccess
        # It can be "All", "None", or any combination of the keywords:
        #   Options FileInfo AuthConfig Limit
        AllowOverride All
        # Controls who can get stuff from this server.
        Order allow,deny
        Allow from all
    </Directory>
    ### WEB2PY EXAMPLE PROXY REWRITE RULES
    RewriteEngine on
    # make sure we handle when there is no / at the end of URL
    RewriteRule ^/web2py$ /web2py/ [R,L]
    # when matching a path starting with /web2py/ do a reverse proxy
    RewriteRule ^/web2py/(.*) http://localhost:8000/$1 [P,L]
    # make sure that HTTP redirects generated by web2py are reverted
    # / -> /web2py/
    ProxyPassReverse /web2py/ http://localhost:8000/
    ProxyPassReverse /web2py/ /
    # transform cookies also
    ProxyPassReverseCookieDomain localhost localhost
    ProxyPassReverseCookiePath / /web2py/
    # Enable content compression on the fly speeding up the net
    # transfer on the reverse proxy.
    <Location /web2py/>
        # Insert filter
        SetOutputFilter DEFLATE
        # Netscape 4.x has some problems...
        BrowserMatch ^Mozilla/4 gzip-only-text/html
        # Netscape 4.06-4.08 have some more problems
        BrowserMatch ^Mozilla/4\.0[678] no-gzip
        # MSIE masquerades as Netscape, but it is fine
        BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
        # Don't compress images
        SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
        # Make sure proxies don't deliver the wrong content
        Header append Vary User-Agent env=!dont-vary
    </Location>
</VirtualHost>
You must restart Apache for any change to take effect. To do so, run:
apachectl restart