Mastering Chef

As we learned earlier while understanding terminology, a chef-client is an agent that runs on machines that are meant to be configured using Chef. The chef-client agent is meant to be executed in an environment where we are using Chef in a client-server architecture.

Upon the invocation of a chef-client, the following things happen:

Ohai is executed and automatic attributes are collected, which are eventually used to build a node object
Authentication with a chef-server
Synchronization of cookbooks
Loading of cookbooks and convergence
Checking for the status of chef-client run, reporting, and exception handling.

The chef-client, by default, looks for a configuration file named client.rb. On Linux/Unix-based machines this file is located at /etc/chef/client.rb. On Windows, this file is located at C:\chef\client.rb.

The chef-client command supports many options. The following option indicates which configuration file to use. By default, /etc/chef/client.rb is used for the purpose of a Chef run:

-c CONFIG, --config CONFIG

The following option indicates that chef-client will be executed as a daemon and not as a foreground process. This option is only available on Linux/Unix. To run chef-client as a service in a Windows environment, use the chef-client::service recipe in the chef-client cookbook:

-d, --daemonize

The following option specifies the name of the environment:

-E ENVIRONMENT, --environment ENVIRONMENT

By default, a chef-client run forks a child process in which the cookbooks are executed. This helps prevent issues such as memory leaks and lets Chef code run with a steady amount of memory:

-f, --fork
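The fork-and-wait idea can be illustrated in plain Ruby. This is only a sketch of the general pattern, not chef-client's own code: `run_in_child` is a hypothetical helper, and it assumes a Unix-like system where `fork` is available.

```ruby
# Sketch of the fork-and-wait pattern (Unix only; fork is unavailable on Windows).
# run_in_child is a hypothetical helper, not part of Chef.
def run_in_child
  pid = fork do
    begin
      yield        # the memory-hungry work happens only in the child process
      exit!(0)     # exit! skips at_exit hooks inherited from the parent
    rescue StandardError
      exit!(1)     # signal failure to the parent through the exit status
    end
  end
  _, status = Process.wait2(pid)  # the parent blocks until the child finishes
  status.success?                 # the child's memory is released when it exits
end

converged = run_in_child { Array.new(100_000) { |i| i.to_s } }
puts "child succeeded: #{converged}"
```

Because the work is done in a short-lived child, anything it allocates disappears with it, and the long-running parent keeps a steady footprint.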

The following option specifies the output format, either doc (the default) or min:

-F FORMAT, --format FORMAT

The following option indicates that the formatter output will be used instead of the logger output:

--force-formatter

The following option indicates that the logger output will be used instead of the formatter output:

--force-logger

The following option specifies a path to a JSON file, which is used to override attributes and, optionally, to specify the run_list as well:

-j PATH, --json-attributes PATH

The following option specifies the location of a file containing a client key. The default location is /etc/chef/client.pem:

-k KEYFILE, --client_key KEYFILE

When a chef-client first registers a new machine with a chef-server, it doesn't have /etc/chef/client.pem. It contacts the chef-server with a key called validation_key (default location: /etc/chef/validation.pem). Upon contacting the chef-server, the chef-server responds with a new client key, which is stored in /etc/chef/client.pem. Going forward, every communication with a chef-server is authenticated with /etc/chef/client.pem:

-K KEYFILE, --validation_key KEYFILE

The following option sets the name with which a machine is registered with a chef-server. The default name of the node is the machine's FQDN:

-N NODENAME, --node-name NODENAME

The following option replaces the current run list with the specified items:

-o RUN_LIST_ITEM, --override-runlist RUN_LIST_ITEM

The following option specifies a number of seconds; a random value between zero and that number is added to the interval that determines how frequently a chef-client is executed. This option is useful when a chef-client is executed in daemon mode, because it spreads out the runs of many nodes:

-s SECONDS, --splay SECONDS
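As a rough illustration of how a splay spreads runs out, the sketch below adds a random delay to a fixed interval. `next_sleep` is a hypothetical helper and the exact formula inside chef-client may differ. The interval itself is set with a separate option, -i SECONDS, --interval SECONDS, which is not covered in the list above.

```ruby
# Sketch: combine a fixed interval with a random splay to get a sleep time.
# next_sleep is a hypothetical helper; chef-client's internal formula may differ.
def next_sleep(interval, splay, rng: Random.new)
  jitter = splay > 0 ? rng.rand(0..splay) : 0  # random delay in 0..splay seconds
  interval + jitter
end

# Roughly what a daemon started with "-i 1800 -s 300" would wait between runs.
delays = Array.new(5) { next_sleep(1800, 300) }
puts delays.inspect
```

With many nodes sharing the same interval, the random component keeps them from contacting the chef-server at the same moment.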

The following option indicates that the chef-client executable will be run in why-run mode. It's a dry-run mode in which a chef-client run does everything except modify the system:

-W, --why-run
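The dry-run idea can be sketched in a few lines of plain Ruby. `write_file` below is a hypothetical helper rather than a Chef resource; it only shows how an action can report what it would do instead of doing it.

```ruby
require "tmpdir"

# Sketch of a dry-run switch. write_file is a hypothetical helper, not a Chef resource.
def write_file(path, content, why_run: false)
  return "up to date: #{path}" if File.exist?(path) && File.read(path) == content
  return "would write #{path}" if why_run  # report only, leave the system untouched
  File.write(path, content)
  "wrote #{path}"
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "motd")
  puts write_file(target, "hello\n", why_run: true)  # nothing is created
  puts write_file(target, "hello\n")                 # the file is written
  puts write_file(target, "hello\n")                 # already in the desired state
end
```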

The following option specifies the location in which the process identification number (PID) is saved. This is useful for managing a Chef daemon via a process management system such as Monit:

-P PID_FILE, --pid PID_FILE

Let's presume we've already written a cookbook to install and configure a popular web server called Nginx.

We will create two files on our target machine:

client.rb: For our setup, the location will be /etc/chef/client.rb. It is a default configuration that will be used by a chef-client executable:
```
log_level        :info
log_location     "/var/log/chef.log"
chef_server_url  "http://chef-server:4000"
environment      "production"
```
As you can see, we've mentioned in our configuration that log_level is INFO, the log file is stored at /var/log/chef.log, chef-client will connect to a Chef server hosted at a machine accessible by the name chef-server, and finally we have our setup distributed across different environments and this machine is in the production environment.
roles.json: For our setup, the location will be /etc/chef/roles.json. This is a JSON file that defines attributes and a run_list, which will be used to fetch the relevant cookbooks from a chef-server and bootstrap the machine:
```
{
  "run_list":["role[webserver]"],
  "app_user": "www-data",
  "log_dir": "/var/log"
}
```
As you can see, we've defined a run_list that consists of a role called webserver. Along with this, we've specified two attributes: app_user and log_dir.
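To make the structure of this file concrete, here is a small sketch that separates the run_list from the remaining attributes using Ruby's standard JSON library. `split_json_attributes` is a hypothetical helper and not how chef-client itself is implemented.

```ruby
require "json"

# Sketch: split a -j style JSON document into a run_list and plain attributes.
# split_json_attributes is a hypothetical helper, not part of Chef.
def split_json_attributes(text)
  data = JSON.parse(text)                  # raises JSON::ParserError on invalid JSON
  run_list = data.delete("run_list") || [] # everything else is treated as attributes
  [run_list, data]
end

json_text = <<~JSON
  {
    "run_list": ["role[webserver]"],
    "app_user": "www-data",
    "log_dir": "/var/log"
  }
JSON

run_list, attributes = split_json_attributes(json_text)
puts "run_list:   #{run_list.inspect}"
puts "attributes: #{attributes.inspect}"
```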

With client.rb and roles.json in place, now you can run chef-client as follows:

# chef-client -j /etc/chef/roles.json

The following image describes the steps as they happen during the chef-client run:

Let's look at each step closely.

Step 1 – Building a node object

As a first step, a chef-client will build the node object. To do this, the system is profiled first by Ohai.

Ohai returns a large amount of information about the system in JSON format. The following is an output from the Ohai run on our chef-eg01 instance:

# ohai
{
  "languages": {
    "ruby": {
      "platform": "x86_64-linux",
      "version": "2.1.0",
      "release_date": "2013-12-25",
      . . .
    },
    "python": {
      "version": "2.6.6",
      "builddate": "Jun 18 2012, 14:18:47"
    },
    "perl": {
      "version": "5.10.1",
      "archname": "x86_64-linux-thread-multi"
    },
    "lua": {
      "version": "5.1.4"
    },
    "java": {
      "version": "1.7.0_09",
      "runtime": {
        "name": "Java(TM) SE Runtime Environment",
        "build": "1.7.0_09-b05"
      },
      "hotspot": {
        "name": "Java HotSpot(TM) 64-Bit Server VM",
        "build": "23.5-b02, mixed mode"
      }
    }
  },
  "kernel": {
    "name": "Linux",
    "release": "2.6.32-220.23.1.el6.x86_64",
    "version": "#1 SMP Mon Jun 18 18:58:52 BST 2012",
    "machine": "x86_64",
    "os": "GNU/Linux"
  },
  "os": "linux",
  "os_version": "2.6.32-220.23.1.el6.x86_64",
  "lsb": {
    "id": "CentOS",
    "description": "CentOS release 6.2 (Final)",
    "release": "6.2",
    "codename": "Final"
  },
  . . .
  "chef_packages": {
    "ohai": {
      "version": "6.14.0",
      "ohai_root": "/usr/local/rvm/gems/ruby-2.1.0/gems/ohai-6.14.0/lib/ohai"
    },
    "chef": {
      "version": "11.10.4",
      "chef_root": "/usr/local/rvm/gems/ruby-2.1.0/gems/chef-11.10.4/lib"
    }
  },
  "hostname": "chef-eg01",
  "fqdn": "chef-eg01.sychonet.com",
  "domain": "sychonet.com",
  "network": {
    "interfaces": {
      "lo": {
      . . .
      },
      "eth0": {
      . . .
      }
    }
  },
  "ipaddress": "10.0.0.42",
  "macaddress": "0A:F8:4C:7A:C3:B2",
  "ohai_time": 1397945435.3669002,
  "dmi": {
    "dmidecode_version": "2.11"
  },
  "keys": {
    "ssh": {
      "host_dsa_public":"XXXXXXX",
      "host_rsa_public":"XXXXXXX"
    }
  },
  . . .
}

As we can see, Ohai gave us plenty of useful information about our machine, such as the different language interpreters installed on the system, kernel version, OS platform and release, network, SSH keys, disks, RAM, and so on. All this information (the automatic attributes), along with the node name, is used to build and register a node object with a chef-server. The default name of the node object is the FQDN, as returned by Ohai. However, we can always override the node name in the client.rb configuration file.
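The naming rule can be sketched as follows. `build_node` is a hypothetical helper and the resulting hash is a simplification of a real node object; the keys mirror the Ohai output shown above, and "web01" is an illustrative override of the kind the node_name setting in client.rb would supply.

```ruby
# Sketch: name a node from Ohai-style data unless a node name is configured.
# build_node is a hypothetical helper; a real node object carries much more data.
def build_node(automatic, node_name: nil)
  {
    "name"      => node_name || automatic.fetch("fqdn"), # the FQDN is the fallback
    "automatic" => automatic                             # Ohai data, stored as-is
  }
end

ohai = { "hostname" => "chef-eg01", "fqdn" => "chef-eg01.sychonet.com", "os" => "linux" }

puts build_node(ohai)["name"]
puts build_node(ohai, node_name: "web01")["name"]
```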

Step 2 – Authenticate

We don't want our private chef-server to respond to requests made by just anyone. To accomplish this, each request to the Chef server is accompanied by headers signed using the client's private key (client.pem).

As part of this step, a chef-client checks the presence of the /etc/chef/client.pem file, which is used for the purpose of authentication.

If no client.pem is present, a chef-client looks for a /etc/chef/validation.pem file, which is a private key assigned to the chef-validator. Once the chef-validator has authenticated itself to a chef-server, a chef-server creates a public/private key pair. The chef-server keeps a public key with itself, while a private key is sent back to a chef-client. After this step, our node object built in step 1 is registered with the chef-server.

Note

After the initial chef-client run is over, the chef-validator key is no longer required and can (ideally should) be deleted from the machine.
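The split between a private key on the node and a public key on the server can be illustrated with Ruby's OpenSSL bindings. This is a simplified sketch of signing and verification in general, not the actual Chef wire protocol: the request string, the choice of SHA-256, and the helper names are all assumptions made for the example.

```ruby
require "openssl"

# Simplified illustration of private-key signing and public-key verification.
# This is NOT the exact Chef protocol; header names and canonicalization are omitted.
def sign_request(private_key, request_line)
  private_key.sign(OpenSSL::Digest.new("SHA256"), request_line)
end

def valid_signature?(public_key, request_line, signature)
  public_key.verify(OpenSSL::Digest.new("SHA256"), signature, request_line)
end

client_key  = OpenSSL::PKey::RSA.new(2048)  # stands in for client.pem on the node
server_copy = client_key.public_key         # the only part the server needs to keep

request   = "GET /nodes/chef-eg01.sychonet.com"
signature = sign_request(client_key, request)

puts valid_signature?(server_copy, request, signature)             # untampered request
puts valid_signature?(server_copy, "GET /nodes/other", signature)  # altered request
```

Since only the holder of the private key can produce a signature that the stored public key accepts, the server can reject requests from unknown clients without ever seeing the private key again.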

Step 3 – Synchronization of cookbooks

Now, since we are authenticated, we can go about fetching cookbooks from a chef-server. However, to send cookbooks to the relevant instance, a chef-server has to know which cookbooks to send across.

In this step, a chef-client fetches a node object from the chef-server. A node object defines what is in run_list and what attributes are associated with the node. The run_list defines what cookbooks will be downloaded from a chef-server.

The following is what we have in our run_list:

"run_list":["role[webserver]"]

Our run_list consists of one element called role[webserver]. A role is the way in which the Chef world groups recipes and attributes together under one name. Here is what our role looks like:

webserver.rb
# Role Name:: webserver
# Copyright 2014, Sychonet
# Author: [email protected]

name "webserver"
description "This role configures nginx webserver"

run_list  "recipe[nginx]","recipe[base]"
override_attributes(
  :app => {
    :base => "/apps",
    :user => "ubuntu",
    :group => "ubuntu",
    :log => "/var/log/nginx",
    :data => "/data"
  }
)

Our role has a run_list, which consists of two elements: recipe[nginx] and recipe[base]. These recipes contain code that will be used to bootstrap a machine using Chef. Along with this, we have a few attributes:

node[:app][:base] = "/apps"
node[:app][:user] = "ubuntu"
node[:app][:group] = "ubuntu"
node[:app][:log] = "/var/log/nginx"
node[:app][:data] = "/data"

We will be using these attributes in our recipes to set up a machine according to our requirements. These attributes may already be defined in our cookbook and if they are, then they are overridden here.
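A minimal sketch of this overriding behaviour, using nested hashes in place of real node attributes, might look like the following. `deep_merge` is a hypothetical helper, the `:port` key and the cookbook default values are invented for illustration, and Chef's real precedence rules have more levels than the two shown here.

```ruby
# Sketch: role override attributes win over cookbook defaults, key by key.
# deep_merge is a hypothetical helper; Chef's real precedence has more levels.
def deep_merge(base, winner)
  base.merge(winner) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical cookbook defaults; :port is invented for this example.
cookbook_defaults = { app: { user: "www-data", log: "/var/log/app", port: 80 } }
role_overrides    = { app: { user: "ubuntu", log: "/var/log/nginx" } }

effective = deep_merge(cookbook_defaults, role_overrides)
puts effective[:app].inspect
```

Keys that the role does not mention keep their cookbook values, while keys present in both take the role's value.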

Here is what a typical node JSON object looks like:

{
  "name": "chef-eg01.sychonet.com",
  "json_class": "Chef::Node",
  "chef_type": "node",
  "chef_environment": "production",
  "automatic": { . . . },
  "default": { . . . },
  "normal": { . . . },
  "override": { . . . },
  "run_list": [ . . . ]
}

Once the chef-client has obtained the node JSON object from the chef-server, it expands the run_list. The run_list defined in a node object contains roles and recipes, and each role contains a run_list of its own that may in turn contain further roles and recipes. During the execution of a chef-client, the run_list is expanded until only recipes remain.
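The expansion is essentially a recursive walk. The sketch below is a plain-Ruby approximation: the `roles` hash stands in for role data held on the chef-server, and the frontend role and monitoring recipe are hypothetical additions used to show nesting.

```ruby
# Sketch: expand a run_list of roles and recipes down to recipes only.
# The roles hash is a hypothetical in-memory stand-in for data on the chef-server.
def expand_run_list(items, roles, seen = [])
  expanded = items.flat_map do |item|
    match = item.match(/\Arole\[(.+)\]\z/)
    if match
      name = match[1]
      next [] if seen.include?(name)  # guard against roles that include each other
      expand_run_list(roles.fetch(name), roles, seen + [name])
    else
      [item]                          # recipes are kept as they are
    end
  end
  expanded.uniq                       # a recipe is listed once, at its first position
end

roles = {
  "webserver" => ["recipe[nginx]", "recipe[base]"],
  "frontend"  => ["role[webserver]", "recipe[monitoring]"]  # hypothetical nested role
}

puts expand_run_list(["role[webserver]"], roles).inspect
puts expand_run_list(["role[frontend]"], roles).inspect
```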

Now, with a list of recipes to be executed on the machine, a chef-client downloads all the cookbooks mentioned in the expanded run_list from the chef server. Some cookbooks might not really be defined in run_list, but might be part of a dependency and those cookbooks are also downloaded as part of this event. A chef server maintains different versions of cookbooks and hence, if we want, we can request a specific version of a cookbook by specifying it as part of run_list, as follows:

{"run_list":["recipe[nginx@1.4.2]"]}

This will set up version 1.4.2 of the nginx cookbook. We can also pin a version in a cookbook's dependencies (in metadata.rb) as follows:

depends "nginx", "= 1.4.2"

Alternatively, we can pin the version in an environment using the following code:

cookbook "nginx", "= 1.4.2"
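To get a feel for how such constraints select a version, the sketch below evaluates them with RubyGems' `Gem::Requirement`. This class is used here only as a stand-in with similar operators; Chef ships its own constraint handling, and `pick_version` and the list of available versions are hypothetical.

```ruby
require "rubygems"

# Sketch: evaluate cookbook-style version constraints with RubyGems classes.
# Gem::Requirement is a stand-in here; Chef has its own constraint classes.
def pick_version(available, constraint)
  requirement = Gem::Requirement.new(constraint)
  candidates  = available.map { |v| Gem::Version.new(v) }
                         .select { |v| requirement.satisfied_by?(v) }
  best = candidates.max          # the newest version that satisfies the constraint
  best && best.to_s              # nil when nothing matches
end

available = ["1.2.0", "1.4.2", "1.6.0"]  # hypothetical versions held by the server

puts pick_version(available, "= 1.4.2")
puts pick_version(available, "~> 1.4")
puts pick_version(available, "> 2.0").inspect
```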

Downloaded cookbooks are saved in a local filesystem on a machine at the location specified by file_cache_path, defined in client.rb (defaults to /var/chef/cache).

Upon subsequent chef-client runs, the cookbooks that haven't changed since the last run aren't downloaded and only the changed cookbooks are resynced.
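One common way to detect what has changed is to compare checksums, and the sketch below shows that idea in isolation. It assumes the server publishes a checksum per file; the use of SHA-256, the manifest layout, and the `files_to_sync` helper are assumptions for illustration rather than a description of chef-client's internals.

```ruby
require "digest"

# Sketch: decide which files need downloading by comparing checksums.
# files_to_sync, the manifest layout, and SHA-256 are assumptions for illustration.
def files_to_sync(remote_manifest, local_files)
  stale = remote_manifest.reject do |path, checksum|
    local = local_files[path]
    local && Digest::SHA256.hexdigest(local) == checksum  # unchanged, so skip it
  end
  stale.keys
end

local_cache = {
  "nginx/recipes/default.rb"   => "package 'nginx'\n",
  "nginx/templates/nginx.conf" => "worker_processes 1;\n"
}

remote_manifest = {
  "nginx/recipes/default.rb"   => Digest::SHA256.hexdigest("package 'nginx'\n"),
  "nginx/templates/nginx.conf" => Digest::SHA256.hexdigest("worker_processes 4;\n"),
  "base/recipes/default.rb"    => Digest::SHA256.hexdigest("user 'www-data'\n")
}

puts files_to_sync(remote_manifest, local_cache).inspect
```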

Step 4 – Loading of cookbooks and convergence

Now, with all the cookbooks synchronized, a chef-client loads the components in the following order:

Libraries: These are loaded first so that all language extensions and Ruby classes are available.
Attributes: Attribute files are loaded next; they set the node attributes that recipes later read.
Definitions: These must be loaded before recipes because they create new pseudo-resources.
Recipes: At this point, recipes are evaluated. No action is taken yet on any resource defined in a recipe; the resources are only collected.

Recipes are loaded in the order they are specified in run_list. This is a very important concept to grasp because it can be a deal breaker if not understood properly. Let's look at our run_list in /etc/chef/roles.json:

"run_list":["role[webserver]"]

The webserver role in turn defines the following run_list:

run_list  "recipe[nginx]","recipe[base]"

This implies that the expanded run_list will look something like the following:

run_list  "recipe[nginx]","recipe[base]"

Now, if recipe[nginx] requires something that is set up in recipe[base], our Chef run will fail. For example, say recipe[base] creates a www-data user and recipe[nginx] starts Nginx as a service running as www-data. This won't work, because the www-data user isn't created until the base recipe is executed, and that happens only after recipe[nginx] has been executed.

At this point, every resource evaluated in the recipes is placed in the resource collection, which is an ordered array of the evaluated resources. Any plain Ruby code in the recipes is also executed at this point.

Now, with resource collection ready for use, a Chef run reaches a stage of execution.

Chef iterates through the resource collection in order. For each resource:

It runs the actions specified for the resource
The resource's provider, which knows how to perform those actions on the current platform, does the actual work
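The two phases, and the ordering problem described earlier, can be sketched in plain Ruby. Everything here is a hypothetical stand-in: recipes are lambdas, resources are hashes, and `compile` and `converge` are illustrative helpers rather than Chef APIs.

```ruby
# Sketch of the two phases: compile recipes into a resource collection, then converge.
# Recipes are lambdas and resources are hashes; none of this is Chef's real API.
def compile(recipes)
  collection = []
  recipes.each { |recipe| recipe.call(collection) }  # recipes only declare resources
  collection
end

def converge(collection, system_state)
  collection.map do |res|                             # resources run in declared order
    case res[:type]
    when :user
      system_state[:users] << res[:name]
      "created user #{res[:name]}"
    when :service
      unless system_state[:users].include?(res[:run_as])
        raise "user #{res[:run_as]} does not exist yet"
      end
      "started #{res[:name]} as #{res[:run_as]}"
    end
  end
end

nginx = ->(c) { c << { type: :service, name: "nginx", run_as: "www-data" } }
base  = ->(c) { c << { type: :user, name: "www-data" } }

begin
  converge(compile([nginx, base]), { users: [] })
rescue RuntimeError => e
  puts "nginx before base fails: #{e.message}"
end

puts converge(compile([base, nginx]), { users: [] }).inspect
```

Swapping the two recipes is enough to turn a failing run into a successful one, which is why the order of run_list matters.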

Step 5 – Reporting and exception handling

Once a chef-client run has ended, the status of the run is checked. If there has been an error, Chef exits with an unhandled exception, and we can write exception handlers to deal with such situations. For example, we might want to notify a system administrator about an issue with the chef-client run.

In the event of success as well, we might want to do certain things and this is handled via report handlers. For example, we might want to push a message to a queue saying that a machine has been bootstrapped successfully.
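The dispatch between the two kinds of handlers can be sketched as follows. In Chef itself, handlers are classes that subclass Chef::Handler; the lambdas, the status hash, and `finish_run` below are hypothetical simplifications that only show which handlers fire in which case.

```ruby
# Sketch: dispatch to report or exception handlers based on the run status.
# finish_run, the status hash, and the lambdas are hypothetical simplifications.
def finish_run(status, report_handlers:, exception_handlers:)
  handlers = status[:exception] ? exception_handlers : report_handlers
  handlers.map { |handler| handler.call(status) }
end

notify_admin = ->(status) { "ALERT #{status[:node]}: #{status[:exception]}" }
enqueue_ok   = ->(status) { "queued: #{status[:node]} converged" }

ok     = { node: "chef-eg01", exception: nil }
failed = { node: "chef-eg01", exception: "package nginx not found" }

puts finish_run(ok, report_handlers: [enqueue_ok], exception_handlers: [notify_admin])
puts finish_run(failed, report_handlers: [enqueue_ok], exception_handlers: [notify_admin])
```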

Using chef-solo

chef-solo is another executable that can be used to bootstrap any machine using cookbooks.

There are times when the need for a chef-server just isn't there, for example, when testing a newly written Chef cookbook on a virtual machine. During these times, we can't make use of a chef-client, as a chef-client requires a chef-server to communicate with.

chef-solo allows cookbooks to be used with nodes without requiring a chef-server. It runs locally and requires the cookbooks (along with their dependencies) to be present locally on the machine too.

Beyond this difference, chef-solo doesn't provide support for the following features:

Search
Authentication or authorization
Centralized distribution of cookbooks
Centralized API to interact with different infrastructure components.

chef-solo can pick up cookbooks from either a local directory or a URL where a tar.gz archive of the cookbooks is present.

The chef-solo command uses the /etc/chef/solo.rb configuration file, or we can specify an alternate path for this configuration file using the --config option during the chef-solo execution.

By default, chef-solo looks for data bags at /var/chef/data_bags. However, this location can be changed by specifying an alternate path in the data_bag_path setting defined in solo.rb. chef-solo picks up roles from the /var/chef/roles folder, but this location can likewise be changed by specifying an alternate path in the role_path setting in solo.rb.

Other than the options supported by a chef-client, the chef-solo executable supports the following option:

-r RECIPE_URL, --recipe-url RECIPE_URL

A URL from where a remote cookbook's tar.gz will be downloaded.

For example:

# chef-solo -c ~/solo.rb -j ~/node.json -r http://repo.sychonet.com/chef-solo.tar.gz

The tar.gz file is first saved into file_cache_path and then extracted to cookbook_path.

Now that we understand how the Chef run happens, let's get our hands dirty and go about setting up our developer workstation.

Mastering Chef

By : Mayank Joshi
