Book Image

Puppet 5 Essentials - Third Edition

By : Felix Frank
Book Image

Puppet 5 Essentials - Third Edition

By: Felix Frank

Overview of this book

Puppet is a configuration management tool that allows you to automate all your IT configurations, giving you control over what you do to each Puppet Agent in a network, and when and how you do it. In this age of digital delivery and ubiquitous Internet presence, it's becoming increasingly important to implement scaleable and portable solutions, not only in terms of software, but also the system that runs it. This book gets you started quickly with Puppet and its tools in the right way. It highlights improvements in Puppet and provides solutions for upgrading. It starts with a quick introduction to Puppet in order to quickly get your IT automation platform in place. Then you learn about the Puppet Agent and its installation and configuration along with Puppet Server and its scaling options. The book adopts an innovative structure and approach, and Puppet is explained with flexible use cases that empower you to manage complex infrastructures easily. Finally, the book will take readers through Puppet and its companion tools such as Facter, Hiera, and R10k and how to make use of tool chains.
Table of Contents (10 chapters)

Controlling the order of execution

With what you've seen this far, you might have got the impression that Puppet's DSL is a specialized scripting language. That is actually quite far from the truth. A manifest is not a script or program. The language is a tool to model a system state through a set of resources, including files, packages, and cron jobs, among others.

The whole paradigm is different from that of scripting languages. Ruby or Perl are imperative languages that are based around statements that will be evaluated in a strict order. The Puppet DSL is declarative, which means that the manifest declares a set of resources that are expected to have certain properties. These resources are put into a catalog, and Puppet then tries to build a path through all declared resources. The compiler parses the manifests in order, but the configurer applies resources in a very different way.

In other words, the manifests should always describe what you expect to be the end result. The specifics of what actions need to be taken to get there are decided by Puppet.

To make this distinction more clear, let's look at an example:

package { 'haproxy':
ensure => 'installed',
}
file {'/etc/haproxy/haproxy.cfg':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/haproxy/etc/haproxy/haproxy.cfg',
}
service { 'haproxy':
ensure => 'running',
}

With this manifest, Puppet will make sure that the following state is reached:

  1. The HAproxy package is installed.
  2. The haproxy.cfg file has specific content, which has been prepared in a file in /etc/puppet/modules/.
  3. HAproxy is started.

To make this work, it is important that the necessary steps are performed in order:

  • A configuration file cannot usually be installed before the package because there is not yet a directory to contain it
  • The service cannot start before installation either. If it becomes active before the configuration is in place, it will use the default settings from the package instead

This point is being stressed because the preceding manifest does not, in fact, contain cues for Puppet to indicate such a strict ordering. Without explicit dependencies, Puppet is free to put the resources in any order it sees fit.

The recent versions of Puppet allow a form of local manifest-based ordering, so the presented example will actually work as is. The manifest-based ordering can be configured in the puppet.conf configuration file as follows:

ordering = manifest

This setting is default for Puppet 4. It is still important to be aware of the ordering principles because the implicit order is difficult to determine in more complex manifests, and as you will learn soon, there are other factors that will influence the order.

Declaring dependencies

The easiest way to bring order to such a straightforward manifest is resource chaining. The syntax for this is a simple ASCII arrow between two resources:

package { 'haproxy':
ensure => 'installed',
}
->
file { '/etc/haproxy/haproxy.cfg':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/haproxy/etc/haproxy/haproxy.cfg',
}
->
service {'haproxy':
ensure => 'running',
}

This is only viable if all the related resources can be written next to each other. In other words, if the graphic representation of the dependencies does not form a straight chain, but more of a tree, star, or any other shape, this syntax is not sufficient.

Internally, Puppet will construct an ordered graph of resources and synchronize them during a traversal of that graph.

A more generic and flexible way to declare dependencies is through special metaparameters-parameters that are eligible for use with any resource type. There are different metaparameters, most of which have nothing to do with ordering (you have seen provider in an earlier example). For resource ordering, Puppet offers the metaparameters require and before.
Both take one or more references to a declared resource as their value. As was previously mentioned, Puppet references have a special syntax:

Type['title']
e.g.
Package['haproxy']
You can only build references to resources that are declared in the catalog. You cannot build and use references to something that is not managed by Puppet, even when it exists on the managed system.

Here is the HAproxy manifest, ordered using the require metaparameter:

package { 'haproxy':
ensure => 'installed',
}
file {'/etc/haproxy/haproxy.cfg':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/haproxy/etc/haproxy/haproxy.cfg',
require => Package['haproxy'],
}
service {'haproxy':
ensure => 'running',
require => File['/etc/haproxy/haproxy.cfg'],
}

The following manifest is semantically identical, but relies on the before metaparameter rather than require:

package { 'haproxy':
ensure => 'installed',
before => File['/etc/haproxy/haproxy.cfg'],
}
file { '/etc/haproxy/haproxy.cfg':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/haproxy/etc/haproxy/haproxy.cfg',
before => Service['haproxy'],
}
service { 'haproxy':
ensure => 'running',
}
The manifest can also mix both styles of notation, of course. This is left as a reader exercise with no dedicated depiction.

The require metaparameter usually leads to more understandable code because it expresses the dependency of the annotated resource on another resource. The before parameter, on the other hand, implies a dependency that a referenced resource forms upon the current resource. This can be counter-intuitive, especially for frequent users of packaging systems (which usually implement a require-style dependency declaration).

Sometimes, it might be difficult to decide whether to use require or before. In simple cases, most people prefer require. In some cases, it is easier to use before. Think of services that have multiple configuration files. Keeping information about the configuration file and the requirement in a single place reduces errors caused by forgetting to also adopt changes to the service when adding or removing additional configuration files. Take a look at the following example code:

file { '/etc/apache2/apache2.conf':
ensure => file,
before => Service['apache2'],
}
file { '/etc/apache2/httpd.conf':
ensure => file,
before => Service['apache2'],
}
service { 'apache2':
ensure => running,
enable => true,
}

In the example, all dependencies are declared within the file resource declarations. If you use the require parameter instead, you will always need to touch at least two resources in case of changes:

file { '/etc/apache2/apache2.conf':
ensure => file,
}
file { '/etc/apache2/httpd.conf':
ensure => file,
}
service { 'apache2':
ensure => running,
enable => true,
require => [
File['/etc/apache2/apache2.conf'],
File['/etc/apache2/httpd.conf'],
],
}

Will you remember to update the service resource declaration whenever you add a new file to be managed by Puppet? Consider another, simpler example:

if $os_family == 'Debian' {
file { '/etc/apt/preferences.d/example.net.prefs':
content => '...',
before => Package['apache2'],
}
}
package { 'apache2':
ensure => 'installed',
}

The file in the preferences.d directory only makes sense for Debian-like systems; that's why the package cannot safely require it. If the manifest is applied on a different OS, such as CentOS, the apt preferences file will not appear in the catalog thanks to the if clause. If the package had it as a requirement regardless, the resulting catalog would be inconsistent, and Puppet would not apply it. Specifying before in the file resource is safe, and semantically equivalent.

The before metaparameter is outright necessary in situations like this one, and can make the manifest code more elegant and straightforward in other scenarios. Familiarity with both before and require is advisable.

Error propagation

Defining requirements serves another important purpose. References on declared resources will only be validated as successful references if the depended-upon resource was finished successfully. This can be seen as a kind of stop point inside Puppet DSL code, when a required resource is not synchronized successfully.

For example, a file resource will fail if the URL of the source file is broken:

file { '/etc/haproxy/haproxy.cfg':
ensure => file,
source => 'puppet:///modules/haproxy/etc/haproxy.cfg',
}

One path segment is missing here. Puppet will report that the file resource could not be synchronized:

root@puppetmaster:~# puppet apply typo.pp
Notice: Compiled catalog for puppetmaster.example.net in environment production in 0.62 seconds
Error: /Stage[main]/Main/File[/etc/haproxy/haproxy.cfg]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/haproxy/etc/haproxy.cfg
Notice: /Stage[main]/Main/Service[haproxy]: Dependency File[/etc/haproxy/haproxy.cfg] has failures: true
Warning: /Stage[main]/Main/Service[haproxy]: Skipping because of failed dependencies
Notice: Applied catalog in 0.06 seconds

In this example, the Error line describes the error caused by the broken URL. The error propagation is represented by the Notice and Warning lines below it.

Puppet failed to apply changes to the configuration file; it cannot compare the current state to the nonexistent source. As the service depends on the configuration file, Puppet will not even try to start it. This is for safety: if any dependencies cannot be put into the defined state, Puppet must assume that the system is not fit for the application of the dependent resource.

This is another important reason to make use of resource dependencies. Remember that both the chaining arrow and the before metaparameter imply error propagation as well.

Avoiding circular dependencies

Before you learn about another way in which resources can interrelate, there is an issue that you should be aware of: dependencies must not form circles. Let's visualize this in an example:

file { '/etc/haproxy':
ensure => 'directory',
owner => 'root',
group => 'root',
mode => '0644',
}
file { '/etc/haproxy/haproxy.cfg':
ensure => file,
owner => 'root',
group => 'root',
mode => '0644',
source => 'puppet:///modules/haproxy/etc/haproxy/haproxy.cfg',
}
service { 'haproxy':
ensure => 'running',
require => File['/etc/haproxy/haproxy.cfg'],
before => File['/etc/haproxy'],
}

The dependency circle in this manifest is somewhat hidden (as will likely be the case for many such circles that you will encounter during regular use of Puppet).
It is formed by the following relations:

  • The File['/etc/haproxy/haproxy.cfg'] auto-requires the parent directory, File['/etc/haproxy']. This is an implicit, built-in dependency
  • The parent directory, File['/etc/haproxy'], requires Service['haproxy'] due to its before metaparameter
  • The Service['haproxy'] service requires the File['/etc/haproxy/haproxy.cfg'] config

Implicit dependencies exist for the following resource combinations, among others:

  • If a directory and a file inside the directory are declared, Puppet will first create the directory and then the file
  • If a user and his/her primary group is declared, Puppet will first create the group and then the user
  • If a file and the owner (user) are declared, Puppet will first create the user and then the file

Granted, the preceding example is contrived-it will not make sense to manage the service before the configuration directory. Nevertheless, even a manifest design that is apparently sound can result in circular dependencies. This is how Puppet will react to such a design:

root@puppetmaster:~# puppet apply circle.pp
Notice: Compiled catalog for puppetmaster.example.net in environment production in 0.62 seconds
Error: Failed to apply catalog: Found 1 dependency cycle:
(File[/etc/haproxy/haproxy.cfg] =>
Service[haproxy] =>
File[/etc/haproxy] => File[/etc/haproxy/haproxy.cfg])

Try the '--graph' option and opening the resulting '.dot' file in OmniGraffle or GraphViz

The output helps you locate the offending relation(s). For very wide dependency circles with lots of involved resources, the textual rendering is difficult to analyze. Therefore, Puppet also gives you the opportunity to get a graphical representation of the dependency graph through the --graph option.

If you do this, Puppet will include the full path to the newly created .dot file in its output. Its content looks similar to Puppet's output:

digraph Resource_Cycles {
label = "Resource Cycles"
"File[/etc/haproxy/haproxy.cfg]" ->"Service[haproxy]" ->"File[/etc/haproxy]" ->"File[/etc/haproxy/haproxy.cfg]"
}

This is not helpful by itself, but it can be fed directly into tools such as dotty to produce an actual diagram.

To summarize, resource dependencies are helpful in keeping Puppet from acting upon resources in unexpected or uncontrolled situations. They are also useful in restricting the order of resource evaluation.