Nagios Core Administration Cookbook Second Edition

By: Tom Ryder

Overview of this book

Nagios Core is an open source monitoring framework suitable for any network. It ensures that both internal and customer-facing services are running correctly, and manages notification and reporting behavior so that outages can be diagnosed and fixed promptly. It allows very fine configuration of exactly when, where, what, and how to check network services, to meet both the uptime goals of your network and systems team and the needs of your users. This book shows system and network administrators how to use Nagios Core to its fullest as a monitoring framework for checks on any kind of network service, from the smallest home network to much larger multi-site production services. You will discover that Nagios Core is capable of much more than pinging a host or checking whether websites respond. The recipes in this book demonstrate how to leverage Nagios Core's advanced configuration, scripting hooks, reports, data retrieval, and extensibility to integrate it with your existing systems, and to make it the rock-solid center of your network monitoring world.

Creating a new e-mail contact

In this recipe, we'll create a new contact that hosts and services can refer to, chiefly so that Nagios Core can inform the contact when the state of a host or service changes. We'll use the simplest example: setting up an e-mail contact and configuring an existing host so that this contact receives an e-mail message when Nagios Core's host checks fail and the host is apparently unreachable. In this instance, we'll arrange for [email protected] to receive an e-mail message whenever the sparta.example.net host goes from the DOWN state to the UP state, or vice versa.

Getting ready

You should have a working Nagios Core 4.0 or newer server running with a web interface and at least one host to check. If you need to set up a host first, refer to the Creating a new network host recipe in this chapter.

For this particular kind of contact, you'll also need a working SMTP daemon, such as Exim or Postfix, running on the monitoring server. You should verify that you're able to send messages to the target address and that they're successfully delivered to the destination you expect.
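One quick way to run that delivery check is to send a test message from the monitoring server's shell. The following is only a sketch: it assumes a mail(1) or mailx client is installed and on the PATH, and the helper name send_test_mail is hypothetical rather than anything shipped with Nagios Core.

```shell
# Hypothetical helper: send a one-line test message to the address given as $1.
# Assumes a mail(1)/mailx client is installed and the local SMTP daemon is running.
send_test_mail() {
    recipient="$1"
    printf 'Test message from the Nagios Core monitoring server\n' |
        mail -s "Nagios Core mail test" "$recipient"
}
```

Calling send_test_mail with the contact's address as its only argument should result in a message arriving in that mailbox; if it doesn't, the SMTP daemon's own log is the first place to look.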

How to do it...

We can add a simple new contact to the Nagios Core configuration as follows:

Change to Nagios Core's object configuration directory, which ideally contains a file devoted to contacts (contacts.cfg in this example), and edit that file:
```
# cd /usr/local/nagios/etc/objects
# vi contacts.cfg
```

Add the following contact definition to the end of the file, substituting your own values as needed:

define contact {
    contact_name                   spartaadmin
    alias                          Administrator of sparta.example.net
    email                          [email protected]
    host_notification_commands     notify-host-by-email
    host_notification_options      d,u,r
    host_notification_period       24x7
    service_notification_commands  notify-service-by-email
    service_notification_options   w,u,c,r
    service_notification_period    24x7
}

Edit the definition for the sparta.example.net host, and add or replace its contacts directive so that it refers to our new spartaadmin contact:

define host {
    host_name              sparta.example.net
    alias                  sparta
    address                192.0.2.21
    max_check_attempts     3
    check_period           24x7
    check_command          check-host-alive
    contacts               spartaadmin
    notification_interval  60
    notification_period    24x7
}

Reload the configuration:
```
# /etc/init.d/nagios reload
```
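Before reloading, it can be worth running Nagios Core's built-in configuration check, so that a typo in the new contact definition doesn't interrupt monitoring. The sketch below wraps the check and the reload in one hypothetical function, check_and_reload. The binary and nagios.cfg paths are assumptions based on a source install under /usr/local/nagios, and can be overridden through the variables shown.

```shell
# Paths assume the default source-install layout under /usr/local/nagios;
# override these variables for packaged installs.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
RELOAD_CMD="${RELOAD_CMD:-/etc/init.d/nagios reload}"

# Reload only if the pre-flight check (nagios -v) exits successfully.
check_and_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        $RELOAD_CMD    # intentionally unquoted so the command and its argument split
    else
        echo "Configuration check failed; not reloading" >&2
        return 1
    fi
}
```

Running check_and_reload as root then replaces the plain reload command. On failure, running the same nagios -v command by hand prints the full error report.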

When we are done with the preceding steps, the next time our host changes its state, we should receive an e-mail message notifying us of the change. When the host becomes available again, we should receive a recovery message.

If possible, it's worth testing this setup with a test host that we can safely bring down and then up again to verify that we receive appropriate notifications.
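If a test message doesn't arrive, it helps to find out whether Nagios Core generated a notification at all, or whether the problem lies with mail delivery. Nagios Core records each notification it sends in its log file as a HOST NOTIFICATION or SERVICE NOTIFICATION entry that begins with the contact name. The following sketch searches for those entries; the function name show_notifications is hypothetical, and the default log path is an assumption based on a source install under /usr/local/nagios.

```shell
# Print the notification log entries for one contact.
# $1 = contact name, $2 = path to nagios.log (the default is an assumption
# based on a source install under /usr/local/nagios).
show_notifications() {
    contact="$1"
    logfile="${2:-/usr/local/nagios/var/nagios.log}"
    grep -E "(HOST|SERVICE) NOTIFICATION: ${contact};" "$logfile"
}
```

For example, show_notifications spartaadmin lists every notification logged for our new contact. An entry in the log with no matching e-mail points to the notification command or the SMTP daemon rather than the contact definition.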

How it works...

This configuration adds a new contact to the Nagios Core configuration and references it in one of the hosts as the appropriate contact to be used when the host has problems.

We've defined the required directives for the contact and a couple of others:

contact_name: This defines a unique name for the contact so that we can refer to it in host and service definitions, or anywhere else we might need to do so in the Nagios Core configuration.
alias: This defines a human-friendly name for the contact, perhaps a brief explanation of who the person or group is and/or what they're responsible for.
email: This defines the e-mail address of the contact, since we're going to be sending messages by e-mail.
host_notification_commands: This defines the command or commands to be run when a state change on a host prompts a notification for this contact. In this case, we're going to send an e-mail to the contact about the results with a predefined command called notify-host-by-email.
host_notification_options: This specifies different kinds of host events for which this contact should be notified. Here, we're using d,u,r, which means that this contact will receive notifications for a host going down, becoming unreachable, or coming back up.
host_notification_period: This defines the time period in which this contact can be notified of host events. If a host notification is generated and defined to be sent to this contact, but falls outside this time period, the notification will not be sent.
service_notification_commands: This defines the command or commands that are to be run when a state change on a service prompts a notification for this contact. In this case, we're going to send an e-mail to the contact about the results with a predefined command called notify-service-by-email.
service_notification_options: This specifies different kinds of service events for which this contact should be notified. Here, we're using w,u,c,r, which means that we want to receive notifications about services entering WARNING, UNKNOWN, or CRITICAL states, and also when they recover and go back to the OK state.
service_notification_period: This is the same as for host_notification_period, except that this directive refers to notifications about services, not hosts.
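The option letters are terse, and u means something different for hosts than for services, so it can help to spell them out. The following sketch does that with a hypothetical function, explain_notification_options. The letters d, u, w, c, and r are the ones used in this recipe; f, s, and n are further values accepted by the same directives in Nagios Core, included here for completeness.

```shell
# Expand a comma-separated notification options string into readable names.
# $1 = "host" or "service", $2 = options string such as "d,u,r".
explain_notification_options() {
    kind="$1"
    for opt in $(printf '%s' "$2" | tr ',' ' '); do
        case "$kind:$opt" in
            host:d)    echo "DOWN" ;;
            host:u)    echo "UNREACHABLE" ;;
            service:w) echo "WARNING" ;;
            service:u) echo "UNKNOWN" ;;
            service:c) echo "CRITICAL" ;;
            *:r)       echo "RECOVERY" ;;
            *:f)       echo "FLAPPING" ;;
            *:s)       echo "DOWNTIME" ;;
            *:n)       echo "NONE" ;;
            *)         echo "unrecognized option: $opt" ;;
        esac
    done
}
```

Calling explain_notification_options host d,u,r prints one state name per line for the host options we chose, and passing service with w,u,c,r does the same for the service options.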

Note that we placed the definition for the contact in contacts.cfg, which is a reasonably sensible place. However, we can place the contact definition in any file that Nagios Core will read as part of its configuration; we can organize our hosts, services, and contacts any way we like, but it helps to choose some sort of system, so we can easily identify where definitions are likely to be when we need to add, change, or remove them.
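Which files Nagios Core reads is decided by the cfg_file and cfg_dir directives in its main configuration file. As a rough sketch, the hypothetical function below lists those directives so that we can confirm the file holding our contact is actually included. The default path to nagios.cfg is an assumption based on a source install under /usr/local/nagios.

```shell
# List the object files and directories named in the main configuration file.
# $1 = path to nagios.cfg (default assumes a source install under /usr/local/nagios).
list_object_sources() {
    main_cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Commented-out directives start with '#', so they are not matched.
    grep -E '^[[:space:]]*cfg_(file|dir)[[:space:]]*=' "$main_cfg"
}
```

If contacts.cfg, or the directory containing it, doesn't appear in the output of list_object_sources, the new contact will not be loaded no matter how correct its definition is.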

There's more...

If we define a lot of contacts with similar options, it may be appropriate to have individual contacts extend contact templates, so they can inherit those common settings. The default Nagios Core configuration includes such a template, called generic-contact. We could instead define our new contact as an extension of this template as follows:

define contact {
    use           generic-contact
    contact_name  spartaadmin
    alias         Administrator of sparta.example.net
    email         [email protected]
}

To see the directives defined for generic-contact, you can inspect its definition in the /usr/local/nagios/etc/objects/templates.cfg file.
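Rather than paging through the whole file, we could print just the template's definition. The following is a minimal sketch using awk; the function name show_contact_template is hypothetical, and it assumes each definition closes with a } on its own line, as in the sample configuration.

```shell
# Print the contact definition whose template name matches $1.
# $2 = path to the templates file (default assumes the sample configuration
# installed under /usr/local/nagios).
show_contact_template() {
    template="$1"
    file="${2:-/usr/local/nagios/etc/objects/templates.cfg}"
    awk -v name="$template" '
        # Start buffering at the opening line of each contact definition.
        /^[ \t]*define[ \t]+contact[ \t]*[{]/ { buf = ""; inblock = 1; found = 0 }
        inblock {
            buf = buf $0 "\n"
            if ($1 == "name" && $2 == name) found = 1
            # A closing brace ends the definition; print it only if it matched.
            if ($0 ~ /^[ \t]*[}]/) {
                if (found) printf "%s", buf
                inblock = 0
            }
        }
    ' "$file"
}
```

Running show_contact_template generic-contact shows the directives that spartaadmin inherits through its use line.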
