Book Image

Extending Puppet - Second Edition

By : Alessandro Franceschi, Jaime Soriano Pastor
Book Image

Extending Puppet - Second Edition

By: Alessandro Franceschi, Jaime Soriano Pastor

Overview of this book

Puppet has changed the way we manage our systems, but Puppet itself is changing and evolving, and so are the ways we are using it. To tackle our IT infrastructure challenges and avoid common errors when designing our architectures, an up-to-date, practical, and focused view of the current and future Puppet evolution is what we need. With Puppet, you define the state of your IT infrastructure, and it automatically enforces the desired state. This book will be your guide to designing and deploying your Puppet architecture. It will help you utilize Puppet to manage your IT infrastructure. Get to grips with Hiera and learn how to install and configure it, before learning best practices for writing reusable and maintainable code. You will also be able to explore the latest features of Puppet 4, before executing, testing, and deploying Puppet across your systems. As you progress, Extending Puppet takes you through higher abstraction modules, along with tips for effective code workflow management. Finally, you will learn how to develop plugins for Puppet - as well as some useful techniques that can help you to avoid common errors and overcome everyday challenges.
Table of Contents (19 chapters)
Extending Puppet Second Edition
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Preface
Index

Puppet in action


Client-server communication is done using REST-like API calls on a SSL socket, basically it's all HTTPS traffic from clients to the server's port 8140/TCP.

The first time we execute Puppet on a node, its x509 certificates are created and placed in ssldir, and then the Puppet Master is contacted in order to retrieve the node's catalog.

On the Puppet Master, unless we have autosign enabled, we must manually sign the clients' certificates using the cert subcommand:

puppet cert list # List the unsigned clients certificates
puppet cert list --all # List all certificates
puppet cert sign <certname> # Sign the given certificate

Once the node's certificate has been recognized as valid and been signed, a trust relationship is created and a secure client-server communication can be established.

If we happen to recreate a new machine with an existing certname, we have to remove the certificate of the old client from the server:

puppet cert clean  <certname> # Remove a signed certificate

At times, we may also need to remove the certificates on the client; a simple move command is safe enough:

mv /etc/puppetlabs/puppet/ssl /etc/puppetlabs/puppet/ssl.bak

After that, the whole directory will be recreated with new certificates when Puppet is run again (never do this on the server—it'll remove all client certificates previously signed and the server's certificate, whose public key has been copied to all clients).

A typical Puppet run is composed of different phases. It's important to know them in order to troubleshoot problems:

  1. Execute Puppet on the client. On a root shell, run:

    puppet agent -t
    
  2. If pluginsync = true (default from Puppet 3.0), then client retrieves any extra plugin (facts, types, and providers) present in the modules on the Master's $modulepath client output with the following command:

    Info: Retrieving pluginfacts
    Info: Retriving plugin
    
  3. The client runs facter and sends its facts to the server client output:

    Info: Loading facts in /var/lib/puppet/lib/facter/... [...]
    
  4. The server looks for the client's certname in its nodes list.

  5. The server compiles the catalog for the client using its facts. Server logs as follows:

    Compiled catalog for <client> in environment production in 8.22 seconds
    
  6. If there are syntax errors in the processed Puppet code, they are exposed here and the process terminates; otherwise, the server sends the catalog to the client in the PSON format. Client output is as follows:

    Info: Caching catalog for <client>
    
  7. The client receives the catalog and starts to apply it locally. If there are dependency loops, the catalog can't be applied and the whole run fails. Client output is as follows:

    Info: Applying configuration version '1355353107'
    
  8. All changes to the system are shown on stdout or in logs. If there are errors (in red or pink, depending on Puppet versions), they are relevant to specific resources but do not block the application of the other resources (unless they depend on the failed ones). At the end of the Puppet run, the client sends a report of what has been changed to the server. Client output is as follows:

    Notice: Applied catalog in 13.78 seconds
    
  9. The server sends the report to a report collector if enabled.

Resources

When dealing with Puppet's DSL, most of the time, we use resources as they are single units of configuration that express the properties of objects on the system. A resource declaration is always composed of the following parts:

  • type: This includes package, service, file, user, mount, exec, and so on

  • title: This is how it is called and may be referred to in other parts of the code

  • Zero or more attributes:

    type { 'title':
      attribute  => value,
      other_attribute => value,
    }

Inside a catalog, for a given type, there can be only one title; there cannot be multiple resources of the same type with the same title, otherwise we get an error like this:

Error: Duplicate declaration: <Type>[<name>] is already declared in file <manifest_file> at line <line_number>; cannot redeclare on node <node_name>.

Resources can be native (written in Ruby), or defined by users in Puppet DSL.

These are examples of common native resources; what they do should be quite obvious:

  file { 'motd':
    path    => '/etc/motd',
    content => "Tomorrow is another day\n",
  }

  package { 'openssh':
    ensure => present,
  }

  service { 'httpd':
    ensure => running, # Service must be running
    enable => true,    # Service must be enabled at boot time
  }

For inline documentation about a resource, use the describe subcommand, for example:

puppet describe file

Note

For a complete reference of the native resource types and their arguments check: http://docs.puppetlabs.com/references/latest/type.html

The resource abstraction layer

From the previous resource examples, we can deduce that the Puppet DSL allows us to concentrate on the types of objects (resources) to manage and doesn't bother us on how these resources may be applied on different operating systems.

This is one of Puppet's strong points, resources are abstracted from the underlying OS, we don't have to care or specify how, for example, to install a package on Red Hat Linux, Debian, Solaris, or Mac OS, we just have to provide a valid package name. This is possible thanks to Puppet's Resource Abstraction Layer (RAL), which is engineered around the concept of types and providers.

Types, as we have seen, map to an object on the system. There are more than 50 native types in Puppet (some of them applicable only to specific OSes), the most common and used are augeas, cron, exec, file, group, host, mount, package, service, and user. To have a look at their Ruby code, and learn how to make custom types, check these files:

ls -l $(facter rubysitedir)/puppet/type

For each type, there is at least one provider, which is the component that enables that type on a specific OS. For example, the package type is known for having a large number of providers that manage the installation of packages on many OSes, which are aix, appdmg, apple, aptitude, apt, aptrpm, blastwave, dpkg, fink, freebsd, gem, hpux, macports, msi, nim, openbsd, pacman, pip, pkgdmg, pkg, pkgutil, portage, ports, rpm, rug, sunfreeware, sun, up2date, urpmi, yum, and zypper.

We can find them here:

ls -l $(facter rubysitedir)/puppet/provider/package/

The Puppet executable offers a powerful subcommand to interrogate and operate with the RAL: puppet resource.

For a list of all the users present on the system, type:

puppet resource user

For a specific user, type:

puppet resource user root

Other examples that might give glimpses of the power of RAL to map systems' resources are:

puppet resource package
puppet resource mount
puppet resource host
puppet resource file /etc/hosts
puppet resource service

The output is in the Puppet DSL format; we can use it in our manifests to reproduce that resource wherever we want.

The Puppet resource subcommand can also be used to modify the properties of a resource directly from the command line, and, since it uses the Puppet RAL, we don't have to know how to do that on a specific OS, for example, to enable the httpd service:

puppet resource service httpd ensure=running enable=true

Nodes

We can place the preceding resources in our first manifest file (/etc/puppetlabs/code/environments/production/manifests/site.pp) or in the form included there and they will be applied to all our Puppet managed nodes. This is okay for quick samples out of books, but in real life things are very different. We have hundreds of different resources to manage, and dozens, hundreds, or thousands of different systems to apply different logic and properties to.

To help organize our Puppet code, there are two different language elements: with node, we can confine resources to a given host and apply them only to it; with class, we can group different resources (or other classes), which generally have a common function or task.

Whatever is declared in a node, definition is included only in the catalog compiled for that node. The general syntax is:

   node $name [inherits $parent_node] {
  [ Puppet code, resources and classes applied to the node ]
}

$name is the certname of the client (by default its FQDN) or a regular expression; it's possible to inherit, in a node, whatever is defined in the parent node, and, inside the curly braces, we can place any kind of Puppet code: resources declarations, classes inclusions, and variable definitions. An example is given as follows:

node 'mysql.example.com' {

  package { 'mysql-server':
    ensure => present,
  }
  service { 'mysql':

    ensure => 'running',
  }
}

But generally, in nodes we just include classes, so a better real life example would be:

node 'mysql.example.com' {
  include common
  include mysql
}

The preceding include statements that do what we might expect; they include all the resources declared in the referred class.

Note that there are alternatives to the usage of the node statement; we can use an External Node Classifier (ENC) to define which variables and classes assign to nodes or we can have a nodeless setup, where resources applied to nodes are defined in a case statement based on the hostname or a similar fact that identifies a node.

Classes and defines

A class can be defined (resources provided by the class are defined for later usage but are not yet included in the catalog) with this syntax:

class mysql {
  $mysql_service_name = $::osfamily ? {
    'RedHat' => 'mysqld',
    default  => 'mysql',
  }
  package { 'mysql-server':
    ensure => present,
  }
  service { 'mysql':
    name => $mysql_service_name,
    ensure => 'running',
  }
  […]
}

Once defined, a class can be declared (the resources provided by the class are actually included in the catalog) in multiple ways:

  • Just by including it (we can include the same class many times, but it's evaluated only once):

    include mysql
  • By requiring it—what makes all resources in current scope require the included class:

    require mysql
  • Containing it—what makes all resources requiring the parent class also require the contained class. In the next example, all resources in mysql and in mysql::service will be resolved before exec:

    class mysql {
      contain mysql::service
      ...
    }
    
    include mysql
    exec { 'revoke_default_grants.sh':
      require => Class['mysql'],
    }
  • Using the parameterized style (available since Puppet 2.6), where we can optionally pass parameters to the class, if available (we can declare a class with this syntax only once for each node in our catalog):

    class { 'mysql':
      root_password => 's3cr3t',}

A parameterized class has a syntax like this:

class mysql (
  $root_password,
  $config_file_template = undef,
  ...
) {
  […]
}

Here, we can see the expected parameters defined between parentheses. Parameters with an assigned value have it as their default, as it is here. The case of undef for the $config_file_template parameter.

The declaration of a parameterized class has exactly the same syntax of a normal resource:

class { 'mysql':
  $root_password => 's3cr3t',
}

Puppet 3.0 introduced a feature called data binding; if we don't pass a value for a given parameter, as in the preceding example, before using the default value, if present, Puppet does an automatic lookup to a Hiera variable with the name $class::$parameter. In this example, it would be mysql::root_password.

This is an important feature that radically changes the approach of how to manage data in Puppet architectures. We will come back to this topic in the following chapters.

Besides classes, Puppet also has defines, which can be considered classes that can be used multiple times on the same host (with a different title). Defines are also called defined types, since they are types that can be defined using Puppet DSL, contrary to the native types written in Ruby.

They have a similar syntax to this:

define mysql::user (
  $password,                # Mandatory parameter, no defaults set
  $host      = 'localhost', # Parameter with a default value
  [...]
 ) {
  # Here all the resources
}

They are used in a similar way:

mysql::user { 'al':
  password => 'secret',
}

Note that defines (also called user defined types, defined resource type, or definitions) like the preceding one, even if written in Puppet DSL, have exactly the same usage pattern as native types, written in Ruby (such as package, service, file, and so on).

In types, besides the parameters explicitly exposed, there are two variables that are automatically set. They are $title (which is the defined title) and $name (which defaults to the value of $title) and can be set to an alternative value.

Since a define can be declared more than once inside a catalog (with different titles), it's important to avoid to declare resources with a static title inside a define. For example, this is wrong:

define mysql::user ( ...) {
  exec { 'create_mysql_user':
    [ … ]
  }
}

Because, when there are two different mysql::user declarations, it will generate an error like:

Duplicate definition: Exec[create_mysql_user] is already defined in file /etc/puppet/modules/mysql/manifests/user.pp at line 2; cannot redefine at /etc/puppet/modules/mysql/manifests/user.pp:2 on node test.example42.com 

A correct version could use the $title variable which is inherently different each time:

define mysql::user ( ...) {
  exec { "create_mysql_user_${title}":
    [ … ]
  }
}

Class inheritance

We have seen that in Puppet classes are just containers of resources that have nothing to do with Object Oriented Programming classes so the meaning of class inheritance is somehow limited to a few specific cases.

When using class inheritance, the parent class (puppet in the sample below) is always evaluated first and all the variables and resource defaults sets are available in the scope of the child class (puppet::server).

Moreover, the child class can override the arguments of a resource defined in the parent class:

class puppet {
  file { '/etc/puppet/puppet.conf':
    content => template('puppet/client/puppet.conf'),
  }
}
class puppet::server inherits puppet {
  File['/etc/puppet/puppet.conf'] {
    content => template('puppet/server/puppet.conf'),
  }
}

Note the syntax used; when declaring a resource, we use a syntax such as file { '/etc/puppet/puppet.conf': [...] }; when referring to it the syntax is File['/etc/puppet/puppet.conf'].

Even when possible, class inheritance is usually discouraged in Puppet style guides except for some design patterns that we'll see later in the book.

Resource defaults

It is possible to set default argument values for a resource type in order to reduce code duplication. The general syntax to define a resource default is:

Type {
  argument => default_value,
}

Some common examples are:

Exec {
  path => '/sbin:/bin:/usr/sbin:/usr/bin',
}
File {
  mode  => 0644,
  owner => 'root',
  group => 'root',
}

Resource defaults can be overridden when declaring a specific resource of the same type.

It is worth noting that the area of effect of resource defaults might bring unexpected results. The general suggestion is as follows:

  • Place the global resource defaults in site.pp outside any node definition

  • Place the local resource defaults at the beginning of a class that uses them (mostly for clarity's sake, as they are parse-order independent)

We cannot expect a resource default defined in a class to be working in another class, unless it is a child class, with an inheritance relationship.

Resource references

In Puppet, any resource is uniquely identified by its type and its name. We cannot have two resources of the same type with the same name in a node's catalog.

We have seen that we declare resources with a syntax such as:

type { 'name':
  arguments => values,
}

When we need to reference them (typically when we define dependencies between resources) in our code, the syntax is (note the square brackets and the capital letter):

Type['name']

Some examples are as follows:

file { 'motd': ... }
apache::virtualhost { 'example42.com': .... }
exec { 'download_myapp': .... }

These examples are referenced, respectively, with the following code:

File['motd']
Apache::Virtualhost['example42.com']
Exec['download_myapp']