Learning Nagios 3.0

There are many reasons to make sure that all of your resources are working as expected. If you're still not convinced after reading the introduction to this chapter, here are the main reasons why monitoring your infrastructure is important.

The main advantage is the improvement in quality. If your IT staff can notice failures more quickly, they will also be able to respond to them much faster. Sometimes, it takes hours or days to get the first report of a failure even if many users are bumping into errors. Nagios will make sure that if something is not working, you know about it.

It is also possible to make Nagios perform recovery actions automatically. This is done using event handlers, which are commands that run after the status of a host or service has changed. For example, when a primary router goes down, an event handler can switch to a backup solution until the primary one is fixed. A typical case would be to start a dial-up connection as a fallback when the VPN is down.
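As a rough illustration of the dial-up fallback case, the decision an event handler script has to make can be sketched in Python. This is a minimal sketch, not the book's own handler: the function name `choose_action` and the action labels are hypothetical, and it assumes the handler is passed the service state and state type (Nagios exposes these through macros such as `$SERVICESTATE$` and `$SERVICESTATETYPE$`).

```python
def choose_action(state, state_type):
    """Decide what a VPN event handler should do (illustrative only).

    state: "OK", "WARNING", "UNKNOWN" or "CRITICAL"
    state_type: "SOFT" (check is still being retried) or "HARD" (confirmed)
    The returned labels are hypothetical; a real handler would run the
    commands that bring the fallback link up or down.
    """
    # Act only on a confirmed failure, not while Nagios is still retrying.
    if state == "CRITICAL" and state_type == "HARD":
        return "start-fallback"
    # Once the VPN check is OK again, the fallback is no longer needed.
    if state == "OK":
        return "stop-fallback"
    return "none"
```

Waiting for a hard state is a design choice in this sketch: it avoids dialing out because of a single failed check.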

Another advantage is much better problem determination. Very often, what users report as a failure is far from the root cause of the problem; for example, an email system may be down because the LDAP service is not working correctly. If you define dependencies between hosts correctly, Nagios will point out that the POP3 email server is assumed to be not working because the LDAP service, which it depends upon, has a problem. Nagios will start checking the email server again as soon as the problem with LDAP has been resolved.
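The idea behind dependency-based problem determination can be sketched in a few lines of Python. This is not how Nagios implements it internally; it is a simplified model that looks only at direct dependencies, and the function name `classify_failures` and the service names are hypothetical.

```python
def classify_failures(depends_on, failing):
    """Split failing services into root causes and dependent failures.

    depends_on maps a service name to the services it relies on directly.
    failing is the set of services currently reported as not working.
    """
    roots, dependents = set(), set()
    for service in failing:
        # A failure is "dependent" if something it relies on is also failing.
        if any(dep in failing for dep in depends_on.get(service, ())):
            dependents.add(service)
        else:
            roots.add(service)
    return roots, dependents


# Hypothetical setup matching the email example: POP3 relies on LDAP.
depends_on = {"pop3": ["ldap"], "webmail": ["pop3"]}
roots, dependents = classify_failures(depends_on, {"pop3", "ldap", "webmail"})
```

Here `roots` contains only `ldap`, while `pop3` and `webmail` are reported as consequences of that failure.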

Nagios is also very flexible when it comes to notifying people about what isn't functioning correctly. You can set it up to send emails to different people depending on what has failed. In most cases, your company has a large IT team or multiple teams, and usually you want some people to handle servers and others to handle network switches, routers, and modems. You can even use the Nagios web interface to manage who is working on which issue. You can also configure how Nagios sends notifications: by email, pager, Jabber, MSN, or your own scripts.

Monitoring resources is not only useful for identifying problems; it can also save you from running into them. Nagios handles warnings and critical situations differently. This means that it's possible to recognize potentially problematic situations quickly. For example, if your disk storage on an email server is running out, it's better to be aware of this situation before it becomes a critical issue.
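The distinction between warnings and critical situations shows up directly in how a check reports its result. The sketch below maps a disk-usage percentage to the standard Nagios plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function names and the 80% and 90% thresholds are assumptions chosen for illustration.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_usage(used_percent, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to a Nagios status code.

    The default thresholds are illustrative assumptions.
    """
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Print a one-line status for the given path and return its code."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    status = classify_usage(used_percent)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    print(f"DISK {label} - {used_percent:.1f}% used on {path}")
    return status
```

A wrapper script would pass the return value of `check_disk` to `sys.exit`, so that the process exit code tells Nagios which state to record. A disk at 85% then raises a warning well before it reaches the critical threshold.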

Monitoring can also be set up on multiple machines across various locations that can communicate all their results to a central Nagios server. This way, information on all hosts and services in your system can be accessed from a single machine. This gives you a more complete picture of your IT infrastructure, and also allows for testing of more complex things such as firewalls.
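One common way for a remote machine to report a result to a central server is to submit it as a passive check result through the Nagios external command interface. Whether a given setup uses this mechanism is an assumption here. The sketch below only formats the command line; the function name and the host and service names are hypothetical.

```python
def format_passive_result(timestamp, host, service, return_code, output):
    """Build a PROCESS_SERVICE_CHECK_RESULT external command line.

    timestamp is a Unix time in seconds; return_code is a plugin exit
    code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
    """
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


# Hypothetical host and service names, for illustration only.
line = format_passive_result(1200000000, "mail1", "POP3", 2, "Connection refused")
```

The resulting line would then be delivered to the central server, for example by writing it to that server's external command file.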
