Wednesday, February 13, 2013

Something More Than Monitoring



Modify post-installations :

After you get Nagios installed and running properly, you'll no doubt want to start monitoring more than just your local machine (your monitoring host). One way of monitoring a remote Linux/UNIX™ host is to use the NRPE addon that allows you to monitor disk usage, CPU load, memory usage, and other local resources/attributes on the remote host. See Resources for a list of monitoring links.

You'll most likely want to monitor Windows® machines, Netware servers, routers/switches, network printers, and publicly available services (HTTP, FTP, SSH, and so on).

Monitor redundancy and failover :

With redundant hosts, you can maintain the ability to monitor your network when the primary host that runs Nagios fails, or when portions of your network become unreachable, either of which could impact SLA guarantees. Before you implement redundant monitoring, make sure you know how to implement event handlers for hosts and services, issue external commands to Nagios, execute plug-ins on remote hosts with the NRPE addon, and check the status of the Nagios process with the check_nagios plug-in. You will also need to modify the sample scripts in the eventhandlers subdirectory of the Nagios distribution.

Scenario #1

In one redundancy implementation scenario, the master and slave hosts monitor the same hosts and services on the network. Under normal circumstances, only the master host sends out notifications to contacts about problems. The slave host running Nagios takes over the job of notifying contacts about problems if the master host goes down or the Nagios process on the master host stops running.

Just make sure the lag time between the master host failing and the slave host taking over is minimal. You can do this by having, for example, the slave host recheck the master host frequently, so that a failure on the master is detected quickly.

Scenario #2

The basic goal of failover monitoring is to have the Nagios process on the slave host sit idle while the Nagios process on the master host is running. If the process on the master host stops running (or if the host goes down), the Nagios process on the slave host starts monitoring everything.
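
As a rough illustration of the failover idea, the sketch below shows the decision an event handler on the slave host might make once it learns the state of the Nagios process on the master. The function names are hypothetical, and the choice to react only to HARD CRITICAL states is an assumption made for this example. The command names (ENABLE_NOTIFICATIONS, START_EXECUTING_SVC_CHECKS, and their counterparts) are standard Nagios external commands, and each line follows the external command file format of a bracketed timestamp followed by the command.

```python
import time
from typing import List, Optional


def failover_commands(master_state: str, state_type: str) -> List[str]:
    """Pick the external commands the slave should issue (hypothetical helper).

    Assumption for this sketch: the slave takes over only on a HARD
    CRITICAL state of the master's Nagios process, and steps back as
    soon as the master reports OK again.
    """
    if master_state == "CRITICAL" and state_type == "HARD":
        return ["ENABLE_NOTIFICATIONS", "START_EXECUTING_SVC_CHECKS"]
    if master_state == "OK":
        return ["DISABLE_NOTIFICATIONS", "STOP_EXECUTING_SVC_CHECKS"]
    # SOFT problem states, WARNING, UNKNOWN: leave things as they are.
    return []


def format_external_command(name: str, timestamp: Optional[int] = None) -> str:
    """Format one line for the Nagios external command file."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"[{ts}] {name}\n"
```

A real event handler would append each formatted line to the external command file of the slave's Nagios instance. Host checks can be switched the same way with START_EXECUTING_HOST_CHECKS and STOP_EXECUTING_HOST_CHECKS.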

Detect and handle state flapping :

Flapping occurs when a service or host changes state too frequently, resulting in a storm of problem and recovery notifications. Flapping can be indicative of configuration problems (such as thresholds set too low), troublesome services, or real network problems impacting SLA guarantees.

A host or service is determined to have started flapping when its percent state change first exceeds a high flapping threshold. A host or service is determined to have stopped flapping when its percent state change goes below a low flapping threshold (assuming that it was previously flapping).

For both hosts and services, there are global high and low thresholds and host- or service-specific thresholds that you can configure. Nagios will use the global thresholds for flap detection if you do not specify host- or service-specific thresholds. To enable flap detection, set the enable_flap_detection directive to 1 in the main configuration file and the flap_detection_enabled directive to 1 in the host and service definitions you want covered.
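
The two-threshold logic described above can be sketched in a few lines. This is a simplified model, not Nagios's actual implementation: it counts state changes over a window of recent check results and ignores the extra weight Nagios gives to more recent changes. The function names and the threshold values of 5% and 20% are assumptions chosen for the example.

```python
from typing import Sequence


def percent_state_change(states: Sequence[str]) -> float:
    """Unweighted percent state change over a history of check results.

    With N stored results there are N - 1 possible state changes.
    """
    if len(states) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return 100.0 * changes / (len(states) - 1)


def update_flapping(is_flapping: bool, pct: float,
                    low: float = 5.0, high: float = 20.0) -> bool:
    """Apply the high/low thresholds (example values, not recommendations)."""
    if not is_flapping and pct > high:
        return True       # started flapping
    if is_flapping and pct < low:
        return False      # stopped flapping
    return is_flapping    # between the thresholds: keep the current status


# 21 results with 6 state changes near the end of the window
history = ["OK"] * 14 + ["CRITICAL", "OK", "CRITICAL", "OK", "CRITICAL", "OK", "OK"]
pct = percent_state_change(history)
```

Using two thresholds rather than one gives hysteresis: a service whose percent state change sits between the low and high values keeps whatever flapping status it already had, which prevents the flap detection itself from toggling back and forth.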

Consider security :

Some security measures to consider: install Nagios on a dedicated monitoring server for your Ajax applications, and make sure only the Nagios user can read or write in the check result directory. Do not run Nagios as root.

If you are using external commands, make sure you set proper permissions on the /usr/local/nagios/var/rw directory. You should also require authentication for the CGIs and use full paths in command definitions.
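
One way to audit such a directory is to check that its mode grants nothing to "other" users. The sketch below is illustrative only: the helper names are hypothetical, and the rule that the directory should be closed to everyone outside its owner and group is an assumption about what "proper permissions" means here.

```python
import os
import stat


def mode_allows_others(mode: int) -> bool:
    """True if the mode grants read, write, or execute to 'other' users."""
    return bool(mode & stat.S_IRWXO)


def directory_is_locked_down(path: str) -> bool:
    """True if users outside the owner and group cannot access the path."""
    return not mode_allows_others(os.stat(path).st_mode)
```

For example, a directory with mode 2770 passes this check, while one with mode 777 does not. Calling directory_is_locked_down on the external command directory from a periodic audit script would flag a permission change early.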

Don't forget to hide sensitive information with $USERn$ macros, and secure access to remote agents. Encrypt communication channels between Nagios installations and between Nagios servers and your monitoring agents. Also important is the stripping of dangerous characters from macros before they are used in notifications.
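
Nagios itself strips characters from macros according to its illegal_macro_output_chars directive. The sketch below only mimics that idea so you can see the effect. The character set shown is the one commonly found in sample configurations and is an assumption here, and strip_illegal_chars is a hypothetical helper, not part of Nagios.

```python
# Assumed character set, modeled on the sample illegal_macro_output_chars value.
ILLEGAL_MACRO_OUTPUT_CHARS = set("`~$&|'\"<>")


def strip_illegal_chars(value: str) -> str:
    """Remove shell-sensitive characters from a macro value."""
    return "".join(ch for ch in value if ch not in ILLEGAL_MACRO_OUTPUT_CHARS)
```

Without this step, plug-in output containing backticks or a pipe character could be interpreted by the shell when a notification command is run.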

Optimize Nagios :

This section discusses some things to consider when you attempt to optimize Nagios to improve server performance. First, disable environment macros, adjust buffer slots, and check service latencies to determine the best value for maximum concurrent checks. Use compiled—not interpreted—plug-ins, schedule regular host checks, and enable cached host checks.

Next, optimize hardware for maximum performance, and set the maximum time that the Nagios daemon can spend processing the results of host and service checks. Most important of all, graph performance statistics with the Multi Router Traffic Grapher (MRTG; see Resources for a link) to keep track of how well your Nagios installation handles the load over time and how your configuration changes affect it.

Get Nagios addons :

Nagios comes with three core addons: NRPE, NDOUtils, and NSCA. While they give you the basic command-line options, you can add other options as listed in the Nagios Plugin Manual. See Resources for links to both the addons and manual.

NRPE

The NRPE addon is designed to let you execute Nagios plug-ins on remote Linux/UNIX machines. NRPE can also check remote services on other hosts, such as FTP and HTTP. From the monitoring host, Nagios can monitor the CPU, disk usage, memory usage, and other local resources on remote machines.

Because these local resources are not usually exposed to external machines, an agent such as NRPE must be installed on the remote machines. Scripts and metric checks can also be run on remote Windows machines, provided an NRPE-compatible agent is installed there.
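
To make the idea of a plug-in concrete, here is a minimal sketch of the kind of check NRPE might run on a remote host. It follows the standard Nagios plug-in convention of return codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN). The function names, the message format, and the 80% and 90% thresholds are assumptions made for this example.

```python
import shutil
from typing import Tuple

# Standard Nagios plug-in return codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_disk(percent_used: float, warn: float = 80.0,
                  crit: float = 90.0) -> Tuple[int, str]:
    """Map a disk usage percentage to a return code and a status line."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_path(path: str = "/") -> Tuple[int, str]:
    """Measure usage of the filesystem holding `path` and evaluate it."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - filesystem reports zero size"
    return evaluate_disk(100.0 * usage.used / usage.total)
```

A complete plug-in would print the status line and exit with the return code, which is what the NRPE daemon passes back to the monitoring host. Given the performance advice earlier in this article about compiled plug-ins, treat this Python version as an illustration of the convention rather than a recommendation.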

While using SSH is more secure than the NRPE addon, SSH imposes a larger (CPU) overhead on both the monitoring and remote machines. This can become an issue when you start monitoring hundreds or thousands of machines. Many Nagios administrators opt for using the NRPE addon because of the lower load it imposes.

NDOUtils

The NDOUtils addon lets you export current and historical data of configurations and events from one or more Nagios instances to a MySQL database. Storing information from Nagios in a database will allow for quicker retrieval.

NSCA

The NSCA addon is installed on the monitoring host, and lets you integrate passive alerts and checks from remote machines and applications with Nagios. This is useful for processing security alerts as well as redundant and distributed Nagios setups.
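
On the sending side, a passive service check result is submitted to the NSCA client as a single tab-delimited line containing the host name, service description, return code, and plug-in output. The sketch below builds such a line. The helper name and the sample host and service names are hypothetical.

```python
def format_passive_result(host: str, service: str,
                          return_code: int, output: str) -> str:
    """Build one tab-delimited passive service check result line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2, or 3")
    # Tabs and newlines inside the output would break the field layout.
    clean_output = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean_output}\n"


line = format_passive_result("web01", "HTTP", 2, "Connection refused")
```

A remote application would pipe lines like this to the NSCA client program, which forwards them to the NSCA daemon on the monitoring host.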

Conclusion
This article helps you plan ahead to improve the monitoring and performance of your Ajax applications with Nagios, an open source program for monitoring hosts, services, and networks, including resources on remote servers. Because network performance is critical not only to developers, but also to testers, system administrators, and potential users, being aware of and resolving potential performance and environmental monitoring issues can make your development team's and users' experiences trouble-free.
