Learning Nagios - Third Edition

By: Wojciech Kocjan, Piotr Beltowski

Overview of this book

Nagios is a powerful and widely used IT monitoring and management tool for problem-solving. It detects problems in your organization's infrastructure and helps resolve them before they impact the business. Following the success of the previous edition, this book continues to help you monitor the status of network devices and notify system administrators of network problems. Starting with the fundamentals, the book teaches you how to install and configure Nagios for your environment. It then shows you how to end downtimes, add comments, and generate reports using the built-in web interface of Nagios. Moving on, you will be introduced to third-party web interfaces and applications for checking status and reporting specific information. As you progress further in Learning Nagios, you will focus on the standard set of Nagios plugins and learn how to efficiently manage large configurations and use templates. Once you are up to speed with this, you will get to know the concept and workings of notifications and events in Nagios. The book then covers the concept of passive checks and shows how to use NRDP (Nagios Remote Data Processor). The focus then shifts to how Nagios checks can be run on remote machines and how SNMP (Simple Network Management Protocol) can be used from Nagios. Lastly, the book demonstrates how to extend Nagios by creating custom check commands and custom ways of notifying users, and shows how passive checks and NRDP can be used to integrate your solutions with Nagios. By the end of the book, you will be a competent system administrator who can monitor mid-size businesses or even large-scale enterprises.

Understanding escalations

A common obstacle to resolving problems is that a host or a service may have blurred ownership. Often no single person is responsible for it, which makes things harder. It is also typical for a service to have subtle dependencies on other components that are too small to be monitored by Nagios on their own. In such cases, it is good to include lower management in the escalations so that they can focus on problems that haven't been resolved in a timely manner.

Here is a good example: a database server might fail because a small Perl script, which runs before the server actually starts in order to clean things up, has entered an infinite loop. The owner of the machine gets notified, but who should fix it: the script owner, or perhaps the database administrator? This often ends with each team assuming the other should resolve it, with programmers waiting on database administrators and vice versa.
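
A scenario like this could be expressed in Nagios configuration roughly as follows. This is only a minimal sketch: the host name, service description, and contact group names are hypothetical and do not come from the book, and the notification numbers are arbitrary. It uses the standard `serviceescalation` object so that, from the third notification onward, a management group is notified alongside both technical teams.

```
# Hypothetical escalation for the database example above.
# All names below are assumptions made for illustration.
define serviceescalation {
    host_name               dbserver1                        ; hypothetical host
    service_description     Database                         ; hypothetical service
    first_notification      3                                ; start escalating at the 3rd notification
    last_notification       0                                ; 0 = keep escalating until the problem is resolved
    notification_interval   30                               ; in time units (minutes with the default interval_length)
    contact_groups          db-admins,developers,it-managers ; hypothetical groups
}
```

The technical groups are listed again next to the managers because, while an escalation is in effect, notifications go to the contacts named in the escalation, so leaving the teams out would stop their notifications.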

In such cases, escalations...