Nginx Essentials

By: Valery Kholodkov

Configuring Nginx


Now that you know how to install Nginx and the structure of its installation, we can study how to configure Nginx. Simplicity of configuration is one of the reasons Nginx is popular among webmasters, because this saves them a lot of time.

In a nutshell, Nginx configuration files are simply sequences of directives that can take up to eight space-separated arguments, for example:

gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;

In the configuration file, directives are separated from one another by a semicolon (;). Some directives take a block instead of a terminating semicolon. A block is delimited by curly brackets ({}). A block can contain arbitrary text data, for example:

types {
    text/html                 html htm shtml;
    text/css                  css;
    text/xml                  xml;
    image/gif                 gif;
    image/jpeg                jpeg jpg;
    application/x-javascript  js;
    application/atom+xml      atom;
    application/rss+xml       rss;
}

A block can also contain a list of other directives. In this case, the block is called a section. A section can enclose other sections, thus establishing a hierarchy of sections.
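As a rough sketch of such a hierarchy, the fragment below nests a location section inside a server section, which in turn sits inside the http section. The host name and the path are placeholders chosen for illustration only.

```nginx
# http is a section: its block holds directives and other sections.
http {
    gzip on;                      # a simple directive, terminated by a semicolon

    server {                      # a section nested in http
        server_name example.org;  # placeholder host name

        location /static {        # a section nested in server
            root /var/www;        # hypothetical path
        }
    }
}
```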

Most important directives have short names; this reduces the effort required to maintain the configuration file.

Value types

In general, a directive can have arbitrary quoted or unquoted strings as arguments, but many directives take arguments of a few common value types. To help you get familiar with these value types quickly, I have listed them in the following table:

Value type      Format                       Example of a value
Flag            (on|off)                     on, off
Signed integer  -?[0-9]+                     1024
Size            [0-9]+([mM]|[kK])?           23M, 12348k
Offset          [0-9]+([mM]|[kK]|[gG])?      43G, 256M
Milliseconds    [0-9]+(ms|[yMwdhms])?        30s, 60m
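To make the value types more concrete, here is a small fragment that uses one directive per type. The directives are standard ones from the HTTP core and gzip modules, and the mapping of each directive to a value type reflects my reading of stock Nginx. The values themselves are arbitrary and serve only as an illustration.

```nginx
http {
    gzip                     on;    # flag
    gzip_comp_level          5;     # integer
    client_body_buffer_size  16k;   # size
    client_max_body_size     1G;    # offset (accepts the g/G suffix)
    send_timeout             30s;   # time value
}
```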

Variables

Variables are named objects that can be assigned a textual value. Variables can only appear inside the http section. A variable is referred to by its name, prefixed by the dollar ($) symbol. Alternatively, a variable reference can enclose a variable name in curly brackets to prevent merging with surrounding text.

Variables can be used in any directive that accepts them, as shown here:

proxy_set_header Host $http_host;

This directive sets the Host header in the forwarded request to the value of the Host header from the original request. This is equivalent to the following:

proxy_set_header Host ${http_host};

The curly brackets become necessary when other text immediately follows the variable reference:

proxy_set_header Host ${http_host}_squirrel;

The preceding directive appends the string _squirrel to the value of the original host name. Without curly brackets, the string _squirrel would have been interpreted as a part of the variable name, and the reference would have pointed to a variable named http_host_squirrel rather than http_host.

There are also special variable names:

  • Variables from $1 to $9 refer to the capture arguments in the regular expressions, as shown here:

            location ~ /(.+)\.php$ {
                [...]
                proxy_set_header X-Script-Name $1;
            }

    The preceding configuration will set the HTTP header X-Script-Name in the forwarded request to the name of the PHP script in the request URI. The captures are specified in a regular expression using round brackets.

  • Variables that start with $arg_ refer to the corresponding query argument in the original HTTP request, as shown here:

            proxy_set_header X-Version-Name $arg_ver;

    The preceding configuration will set the HTTP header X-Version-Name in the forwarded request to the value of the ver query argument in the original request.

  • Variables that start with $http_ refer to the corresponding HTTP header line in the original request.

  • Variables that start with $sent_http_ refer to the corresponding HTTP header line in the response sent to the client.

  • Variables that start with $upstream_http_ refer to the corresponding HTTP header line in the response received from an upstream.

  • Variables that start with $cookie_ refer to the corresponding cookie in the original request.

  • Variables that start with $upstream_cookie_ refer to the corresponding cookie in the response received from an upstream.
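The prefixes listed above can be combined freely. As a minimal sketch, the following fragment defines a log format that records one value from each family. The format name headers, the log file path, and the cookie name sessionid are assumptions made for the sake of the example.

```nginx
http {
    # Hypothetical log format that mixes several variable families.
    log_format headers '$remote_addr ver=$arg_ver '
                       'agent="$http_user_agent" '
                       'sent_type=$sent_http_content_type '
                       'upstream_server=$upstream_http_server '
                       'session=$cookie_sessionid';

    access_log logs/access.log headers;
}
```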

Variables must be declared by Nginx modules before they can be used in the configuration. Built-in Nginx modules provide a set of core variables that allow you to operate with the data from HTTP requests and responses. Refer to the Nginx documentation for the complete list of core variables and their functions.

Third-party modules can provide extra variables. These variables have to be described in the third-party module's documentation.

Inclusions

Any Nginx configuration section can contain inclusions of other files via the include directive. This directive takes a single argument containing a path to a file to be included, as shown here:

# A simple relative inclusion. The target file's path is
# relative to the directory of the main configuration file.
include mime.types;

# A simple inclusion using an absolute path.
include /etc/nginx/conf/site-defaults.conf;

Once specified, the include directive instructs Nginx to process the contents of the file or files specified by the argument of this directive as if they were specified in place of the include directive.

Note

Relative paths are resolved with respect to the directory that contains the main configuration file, not the file in which the include directive appears. This is good to keep in mind when the include directive is specified in another included file, such as when a virtual host configuration file contains a relative include directive.

The include directive can also contain a globbed path with wildcards, either relative or absolute. In this case, the globbed path is expanded and all files matching the specified pattern are included in no particular order. Take a look at the following code:

# A simple glob inclusion. This will include all files
# ending in ".conf" located in /etc/nginx/sites-enabled
include /etc/nginx/sites-enabled/*.conf;

The include directive with wildcards is an obvious solution for including site configurations, as their number can vary greatly. Using the include directive, you can properly structure the configuration file or reuse certain parts multiple times.
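Reusing a part of the configuration might look like the following sketch, in which two virtual hosts pull in the same file of shared settings. The host names are placeholders, and the contents of site-defaults.conf are assumed to be valid inside a server section.

```nginx
http {
    server {
        server_name a.example.org;                   # placeholder name
        include /etc/nginx/conf/site-defaults.conf;  # shared settings
    }

    server {
        server_name b.example.org;                   # placeholder name
        include /etc/nginx/conf/site-defaults.conf;  # same file, reused
    }
}
```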

Sections

A section is a directive that encloses other directives in its block. Each section's delimiters must be located in the same file, while the content of a section can span multiple files via the include directive.

It is not possible to describe every configuration directive in this chapter. Refer to the Nginx documentation for more information. However, I will quickly go over the Nginx configuration section types so that you can orient yourself in the structure of the Nginx configuration files.

The http section

The http section enables and configures the HTTP service in Nginx. It contains the server and upstream sections. As far as individual directives are concerned, the http section usually contains those that specify defaults for the entire HTTP service.

The http section must contain at least one server section in order to process HTTP requests. Here is a typical layout of the http section:

 http {
     [...]
     server {
         [...]
     }
 }

Here and in other examples in this book, we use [...] to refer to omitted irrelevant parts of the configuration.

The server section

The server section configures an HTTP or HTTPS virtual host and specifies its listening addresses using the listen directive. At the end of the configuration stage, the listening addresses of all virtual hosts are grouped together, and they are activated at startup.

The server section contains the location sections, as well as sections that can be enclosed by the location section (see the descriptions of the other section types for details). Directives that are specified in the server section itself go into the so-called default location. In that regard, the server section acts as a location section itself.

When a request comes in via one of the listening addresses, it is routed to the server section that matches a virtual host pattern specified by the server_name directive. The request is then routed further to the location that matches the path of the request URI or processed by the default location if there is no match.
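The routing described above can be sketched with two virtual hosts that share one listening address. The host names and directories are illustrative assumptions. A request whose Host header is example.org is handled by the first server section, and one for static.example.org by the second.

```nginx
http {
    server {
        listen      80;
        server_name example.org www.example.org;

        # Directives placed here apply to requests that no
        # location section below picks up.
        root /var/www/example.org;          # hypothetical path

        location /downloads {
            root /var/www/files;            # hypothetical path
        }
    }

    server {
        listen      80;                     # same address, different names
        server_name static.example.org;
        root /var/www/static;               # hypothetical path
    }
}
```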

The upstream section

The upstream section configures a logical server that Nginx can pass requests to for further processing. This logical server can be configured to be backed by one or more physical servers external to Nginx with concrete domain names or IP addresses.

An upstream can be referred to by name from any place in the configuration file where a reference to a physical server can appear. In this way, your configuration can be made independent of the underlying structure of the upstream, while the upstream structure can be changed without changing the rest of your configuration.
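A minimal sketch of an upstream and a reference to it could look as follows. The upstream name app_backend and the two addresses (taken from a documentation-only address range) are assumptions, so substitute the servers you actually run.

```nginx
http {
    # A logical server backed by two physical servers.
    upstream app_backend {
        server 192.0.2.10:8080;
        server 192.0.2.11:8080;
    }

    server {
        listen 80;

        location / {
            # The upstream is referred to by name where a
            # physical server address could appear.
            proxy_pass http://app_backend;
        }
    }
}
```

Adding or removing a server line inside the upstream section changes the set of back ends without touching the location that refers to it.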

The location section

The location section is one of the workhorses in Nginx. The location directive takes parameters that specify a pattern that is matched against the path of the request URI. When a request is routed to a location, Nginx activates configuration that is enclosed by that location section.

There are three types of location patterns: simple, exact, and regular expression location patterns.

Simple

A simple location has a string as the first argument. When this string matches the initial part of the request URI, the request is routed to that location. Here is an example of a simple location:

        location /images {
            root /usr/local/html/images;
        }

Any request with a URI that starts with /images, such as /images/powerlogo.png, /images/calendar.png, or /images/social/github-icon.png, will be routed to this location. A URI with a path equal to /images will be routed to this location as well.

Exact

Exact locations are designated with an equals (=) character as the first argument and have a string as the second argument, just like simple locations do. Essentially, exact locations work just like simple locations, except that the path in the request URI has to match the second argument of the location directive exactly in order to be routed to that location:

        location = /images/empty.gif {
            empty_gif;
        }

The preceding configuration will return an empty GIF file if and only if the URI /images/empty.gif is requested.

Regular expression locations

Regular expression locations are designated with a tilde (~) character or ~* (for case-insensitive matches) as the first argument and have a regular expression as the second argument. Regular expression locations are processed after both simple and exact locations. The path in the request URI has to match the regular expression in the second argument of the location directive in order to be routed to that location. A typical example is as follows:

        location ~ \.php$ {
            [...]
        }

According to the preceding configuration, requests with URIs that end with .php will be routed to this location.

The location sections can be nested. For that, you just need to specify a location section inside another location section.
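The three pattern types can be combined in one server section. The following sketch, with illustrative paths, shows which location I would expect to handle a few sample URIs, based on the rule that regular expression locations are processed after simple and exact ones.

```nginx
server {
    listen 80;

    # Exact: used only for the URI /images/empty.gif.
    location = /images/empty.gif {
        empty_gif;
    }

    # Simple: a candidate for any URI starting with /images,
    # for example /images/photo.jpg.
    location /images {
        root /usr/local/html;               # hypothetical path
    }

    # Regular expression: checked after the simple locations, so it
    # handles /images/logo.png even though /images matches as well.
    location ~ \.png$ {
        expires 30d;
    }
}
```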

The if section

The if section encloses a configuration that becomes active once a condition specified by the if directive is satisfied. The if section can be enclosed by the server and location sections, and is only available if the rewrite module is present.

A condition of an if directive is specified in round brackets and can take the following forms:

  • A plain variable, as shown here:

    if ($file_present) {
        limit_rate 256k;
    }

    If the variable evaluates to a true value at run time, that is, to anything other than an empty string or "0", the configuration section activates.

  • A unary expression that consists of an operator and a string with variables, as shown here:

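    if ( -d "${path}" ) {
        # try_files is not permitted inside an if section, so this
        # example sets a variable instead. $path is assumed to be
        # defined elsewhere in the configuration.
        set $default_image "${path}/default.png";
    }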

    The following unary operators are supported:

    Operator  Description
    -f        True if the specified file exists
    !-f       True if the specified file does not exist
    -d        True if the specified directory exists
    !-d       True if the specified directory does not exist
    -e        True if the specified file, directory, or symbolic link exists
    !-e       True if the specified file, directory, or symbolic link does not exist
    -x        True if the specified file exists and is executable
    !-x       True if the specified file does not exist or is not executable

  • A binary expression that consists of a variable name, an operator, and a string with variables. The following binary operators are supported:

    Operator  Description
    =         True if a variable is equal to a string
    !=        True if a variable is not equal to a string
    ~         True if a regular expression matches the value of a variable
    !~        True if a regular expression does not match the value of a variable
    ~*        True if a case-insensitive regular expression matches the value of a variable
    !~*       True if a case-insensitive regular expression does not match the value of a variable

Let's discuss some examples of the if directive.

This one adds the prefix /msie/ to the URL of any request that contains MSIE in the User-Agent header field:

if ($http_user_agent ~ MSIE) {
    rewrite ^(.*)$ /msie/$1 break;
}

The next example sets the variable $id to the value of the cookie named id, if it is present:

if ($http_cookie ~* "id=([^;]+)(?:;|$)") {
    set $id $1;
}

The next one returns HTTP status 405 ("Method Not Allowed") for every request with the method POST:

if ($request_method = POST) {
    return 405;
}

Finally, the configuration in the following example limits the response rate to 10 KB per second whenever the variable $slow evaluates to true:

if ($slow) {
    limit_rate 10k;
}

The if directive seems like a powerful instrument, but it must be used with caution. This is because the configuration inside the if section is not imperative, that is, it does not alter the request processing flow according to the order of the if directives.

Note

Because of the nonintuitive behavior of the if directive, its use is discouraged.

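Only the directives of the rewrite module itself, such as set, rewrite, and return, are executed in the order in which they appear. Settings from other modules inside if sections are not applied step by step: when several conditions are satisfied, the configuration of only one of those sections, normally the last one, ends up being applied to the request, which is often not what the author of the configuration intended.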

The limit_except section

The limit_except section activates the configuration that it encloses if the request method does not match any from the list of methods specified by this directive. Specifying the GET method in the list of methods automatically assumes the HEAD method. This section can only appear inside the location section, as shown here:

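limit_except GET {
    deny all;
}

The return directive is not permitted inside limit_except, so this example uses the access control directive deny instead. The preceding configuration will respond with HTTP status 403 ("Forbidden") to every request that is not made using the GET or HEAD method.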

Other section types

Nginx configuration can contain other section types, such as mail and server in the mail section, as well as section types provided by third-party modules. In this book, we will not pay close attention to them.

Refer to the documentation of the corresponding modules for information about these types of configuration sections.

Configuration settings' inheritance rules

Many Nginx configuration settings can be inherited from a section of outer level to a section of inner level. This saves a lot of time when you configure Nginx.

The following figure illustrates how inheritance rules work:

All settings can be attributed to three categories:

  • Those that make sense only in the entire HTTP service (marked red)

  • Those that make sense in the virtual host configuration (marked blue)

  • Those that make sense on all levels of configuration (marked green)

The settings from the first category do not have any inheritance rules, because they cannot inherit values from anywhere. They can be specified in the http section only and can be applied to the entire HTTP service. These are settings set by directives, such as variables_hash_max_size, variables_hash_bucket_size, server_names_hash_max_size, and server_names_hash_bucket_size.

The settings from the second category can inherit values only from the http section. They can be specified both in the http and server sections, but the settings applied to a given virtual host are determined by inheritance rules. These are settings set by directives, such as client_header_timeout, client_header_buffer_size, and large_client_header_buffers.

Finally, the settings from the third category can inherit values from any section up to http. They can be specified in any section inside the HTTP service configuration, and the settings applied to a given context are determined by inheritance rules.

The arrows on the figure illustrate value propagation paths. The colors of the arrows specify the scope of the setting. The propagation rules along a path are as follows:

When you specify a value for a parameter at a certain level of the configuration, it overrides the value of the same parameter at the outer levels if it is set, and automatically propagates to the inner levels of the configuration. Let's take a look at the following example:

location / {
    # The outer section
    root /var/www/example.com;
    gzip on;

    location ~ \.js$ {
        # Inner section 1
        gzip off;

    }
    location ~ \.css$ {
        # Inner section 2
    }
    [...]
}

The value of the root directive will propagate to the inner sections, so there is no need to specify it again. The value of the gzip directive in the outer section will propagate to the inner sections, but will be overridden by the value of the gzip directive inside the first inner section. The overall effect is that gzip compression will be enabled everywhere in the outer section, except for the first inner section.

When a value for some parameter is not specified in a given configuration section, it is inherited from a section that encloses the current configuration section. If the enclosing section does not have this parameter set, the search goes to the outer level and so on. If a value for a certain parameter is not specified at all, a built-in default value is used.
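The three categories of settings described earlier can be put side by side in one fragment. This is only a sketch: the directives are real, while the values, the host name, and the location path are arbitrary assumptions.

```nginx
http {
    # First category: meaningful for the entire HTTP service only.
    server_names_hash_bucket_size 64;

    # Second category: a default for all virtual hosts.
    client_header_timeout 60s;

    # Third category: a default for every level below.
    gzip on;

    server {
        server_name example.org;        # placeholder name

        # Overrides the value inherited from http for this host only.
        client_header_timeout 30s;

        location /downloads {
            # Overrides the inherited value for this location only.
            gzip off;
        }

        location / {
            # Nothing is set here, so gzip stays on, as set in http.
        }
    }
}
```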

The first sample configuration

By this point in the chapter, you might have accumulated a lot of knowledge without having an idea of what a complete working configuration looks like. We will study a short but functioning configuration that will give you an idea of what a complete configuration file might look like:

error_log logs/error.log;

events {
    use epoll;
    worker_connections  1024;
}

http {
    include           mime.types;
    default_type      application/octet-stream;

    server {
        listen      80;
        server_name example.org www.example.org;

        location / {
            proxy_pass http://localhost:8080;
            include proxy_params;
        }

        location ~ ^(/images|/js|/css) {
            root html;
            expires 30d;
        }
    }
}

This configuration first instructs Nginx to write the error log to logs/error.log. Then, it sets up Nginx to use the epoll event processing method (use epoll) and allocates memory for 1024 connections per worker (worker_connections 1024). After that, it enables the HTTP service and configures certain default settings for the HTTP service (include mime.types, default_type application/octet-stream). It creates a virtual host and sets its names to example.org and www.example.org (server_name example.org www.example.org). The virtual host is made available at the default listening address 0.0.0.0 and port 80 (listen 80).

We then configure two locations. The first location passes every request routed to it to a web application server running at http://localhost:8080 (proxy_pass http://localhost:8080). The second location is a regular expression location. By specifying it, we effectively exclude a set of paths from the first location. We use this location to return static data such as images, JavaScript files, and CSS files. We set the base directory for our media files to html (root html). For all media files, we set the expiration time to 30 days (expires 30d).

To try out this configuration, back up your default configuration file and replace the content of the default configuration file with the preceding configuration.

Then, restart Nginx for the settings to take effect. After this is done, you can navigate to the URL http://localhost/ to check out your new configuration.

Configuration best practices

Now that you know more about the elements and structure of the Nginx configuration file, you might be curious about what best practices exist in this area. Here is a list of recommendations that will help you to maintain your configuration more efficiently and make it more robust and manageable:

  • Structure your configuration well. Observe which parts of the configuration are used most often, move them to separate files, and reuse them with the include directive. In addition to that, try to keep each file in your configuration file hierarchy to a reasonable length, ideally no more than two screens. This will help you read your files more quickly and navigate them efficiently.

    Note

    It is important to know exactly how your configuration works to successfully manage it. If the configuration doesn't work the way you expect, you might run into issues due to wrong settings being applied, for example, unavailability of arbitrary URIs, unexpected outages, and security loopholes.

  • Minimize use of the if directive. The if directive has a nonintuitive behavior. Try to avoid using it whenever possible to make sure configuration settings are applied to the incoming requests as you expect.

  • Use good defaults. Experiment with inheritance rules and try to come up with defaults for your settings so that the smallest possible number of directives has to be configured. This includes moving common settings from the location level to the server level and further to the http level.