Effective Server Monitoring using Nagios NRPE

Monitoring and security have always been concerns of the web hosting industry. A web host is always on the lookout for services that guarantee minimal downtime. Similarly, a web hosting client needs a service with which he can promise maximum uptime to his customers and gain the trust of future prospects.

Monitoring a web server essentially means alerting the server owner about the status of a service. It can be done internally, where software on the server checks the service status and notifies the owner if a service goes down, or externally, where a web server monitoring company checks the status of the services at a set frequency.

There are many server monitoring and website monitoring tools and services available. Ideally, every server you have should be monitored, along with every service you use and every domain name you own. If a server goes down for more than a certain interval of time, you should receive a message about the problem. However, relying on external monitoring companies can be very expensive, and service fees can quickly exceed your income. You may end up monitoring only the most important servers or the websites of your most important clients.

In addition, you will have to answer some of the major questions that arise in the minds of customers:

a) How effectively are my servers monitored?
b) How will I monitor my servers without any external help?
c) Am I alerted if something goes wrong on my servers?

and the list goes on…

Achieving these goals effectively is a pretty daunting task for system administrators. To see why monitoring servers is so important, let's look at it through the eyes of customers. Think of a customer who runs his business online, one who is in the middle of an ad campaign, or one who travels around the world and whose work depends heavily on e-mail. When critical systems like the web server or mail server are down, it directly affects their business, and there is nothing wrong with customers getting frustrated with their web hosts.

The only way this can be achieved is through effective server monitoring. As I said, there are lots of tools available in the market, but this article concentrates on Nagios, one of the best open source tools and one preferred by much of the web hosting industry. I am not going to tell you how to install and configure Nagios, because that is already well documented on their website; I am here to show how efficiently you can use Nagios to monitor remote servers.

Let's take it step by step and answer those questions.

1) How effectively are my servers monitored?

The Nagios plugins package installs about 50 plugins, and only a few of them monitor remote services, i.e. services that run on public ports. The remaining plugins run only on the local machine, unless the client machine that is to be monitored by Nagios is configured properly.

Apart from the services running on the server, the customer is always curious about disk space, load, memory and so on, and wants to be alerted before they can cause any serious damage. This is where the system administrator comes in, configuring the client machine so that this data can be retrieved. In Nagios, this can be done in three ways:

1) SSH
2) SNMP
3) NRPE

It is possible to execute Nagios plugins on remote Linux/Unix machines through SSH, using the check_by_ssh plugin. SSH is not an ideal option, as you have to create a user on the client side, install keys, and possibly grant that user full privileges to run the Nagios commands. It also imposes a larger CPU overhead on both the monitoring and remote machines, which can become an issue when you start monitoring a large number of machines.

SNMP is a network management protocol used by more experienced system administrators. Using SNMP, you can read statistics, alarms and status messages from a wide range of devices, including the routers and temperature sensors in your server room.

The third option is NRPE, and of the three it is the easiest to set up and manage. NRPE is designed to allow Nagios to monitor “local” resources (like CPU load, memory usage, etc.) on remote machines. Its main advantages include:

1) Installation is very easy.
2) You can write your own scripts in bash or perl to monitor processes on the client servers through NRPE.
3) NRPE uses the Nagios plugins to monitor the client servers, so make sure you install them on each client.
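
As an illustration of the second point, here is a minimal sketch of a custom plugin written in shell. The name check_proc_count and its thresholds are hypothetical and not part of the Nagios plugins; the only convention it relies on is that a plugin prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```shell
#!/bin/sh
# check_proc_count: hypothetical NRPE plugin, not shipped with Nagios.
# Plugin convention: exit 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.

# Classify a process count: CRITICAL when nothing is running,
# WARNING when more than <max> processes are running, OK otherwise.
evaluate_count() {
    name=$1
    count=$2
    max=$3
    if [ "$count" -eq 0 ]; then
        echo "CRITICAL - $name is not running"
        return 2
    elif [ "$count" -gt "$max" ]; then
        echo "WARNING - $count $name processes running (limit $max)"
        return 1
    fi
    echo "OK - $count $name processes running"
    return 0
}

# Count the processes whose name matches exactly, then classify the count.
check_proc_count() {
    if [ $# -lt 2 ]; then
        echo "UNKNOWN - usage: check_proc_count <process> <max>"
        return 3
    fi
    count=$(pgrep -x "$1" | wc -l | tr -d ' ')
    evaluate_count "$1" "$count" "$2"
}

# Run the check only when this file is executed under its plugin name,
# so the functions can also be loaded and tried out from another script.
if [ "${0##*/}" = "check_proc_count" ]; then
    check_proc_count "$@"
    exit $?
fi
```

Assuming the script is saved as check_proc_count in the plugin directory of the client and made executable, it could be registered in nrpe.cfg with a line such as command[check_proc_count]=/usr/local/nagios/libexec/check_proc_count httpd 50, where both the path and the httpd example are assumptions about your setup.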

It uses the check_nrpe plugin, which resides on the monitoring machine, and the NRPE daemon, which runs on the remote machine. Installation of NRPE is pretty simple; it is so well documented that you just need to follow the steps exactly. NRPE can be run as a standalone daemon or as an xinetd service, whichever you are more comfortable with.

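To enable it via xinetd, edit the xinetd service file for NRPE so that it reads as follows. Besides 127.0.0.1, only_from must list the IP address of the Nagios server (x.x.x.x here), otherwise xinetd will refuse its connections:

service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagiosb
        group           = nagiosb
        server          = /usr/bin/nrpe
        server_args     = -c /etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 x.x.x.x
}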

If you are running it standalone, just run the command:

nrpe -c /etc/nrpe.cfg -d

As you can see, /etc/nrpe.cfg is the main configuration file. Make sure that it contains:

server_port=5666

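Make sure the IP addresses that will connect to NRPE are listed in allowed_hosts. If the Nagios server reaches the client through a firewall or router, list the address the connection actually arrives from. For example:

allowed_hosts=x.x.x.x,127.0.0.1

To allow arguments for plugins, set: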

dont_blame_nrpe=1

You also need to define the commands that take arguments; examples are given in the configuration file itself.
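
The definitions in question look roughly like the fragment below, which follows the commented examples in the sample nrpe.cfg. The plugin path /usr/local/nagios/libexec is an assumption based on a default source install, so adjust it to wherever the plugins live on the client. Passing arguments also requires NRPE to have been built with command argument support (the --enable-command-args configure option).

```
# nrpe.cfg on the client: thresholds are supplied by the Nagios server
command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
```

On the monitoring machine the values are passed after -a, for example check_nrpe -H x.x.x.x -c check_load -a 15,10,5 30,25,20, where the thresholds are only illustrative.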

To check whether NRPE is working properly, you can run this command from the console on the monitoring machine:

/usr/local/nagios/libexec/check_nrpe -H <IP of the client machine> -c check_load
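
When testing several clients by hand, it can help to see the result by name rather than as a bare exit status. The sketch below wraps the command above; the function names status_name and probe_client and the CHECK_NRPE override variable are hypothetical helpers for illustration, and the plugin path is the one used in this article.

```shell
#!/bin/sh
# Map a plugin exit status to its conventional Nagios name.
# 3 is the documented UNKNOWN; any other value is treated as UNKNOWN here.
status_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run check_nrpe against one client and print "<STATUS>: <plugin output>".
# $1 = client address, $2 = NRPE command name (defaults to check_load).
probe_client() {
    host=$1
    check=${2:-check_load}
    # CHECK_NRPE is a hypothetical override for non-default install paths.
    plugin=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}
    rc=0
    output=$("$plugin" -H "$host" -c "$check" 2>&1) || rc=$?
    echo "$(status_name "$rc"): $output"
    return "$rc"
}
```

After loading these functions, probe_client x.x.x.x check_load prints the status name followed by whatever the remote plugin reported, and returns the same exit status as check_nrpe did.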

The check_nrpe plugin is built when you install NRPE from source. Once it is built, copy it to /usr/local/nagios/libexec.

Once installed, you can use it in combination with all the Nagios plugins and monitor almost everything on the client servers.
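
On the Nagios server, this usually comes down to one command definition for check_nrpe plus one service definition per remote check. The fragment below is a sketch along the lines of the NRPE documentation; the host name remotehost and the generic-service template are assumptions that must match objects already defined in your own configuration.

```
# Nagios object configuration on the monitoring machine
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

define service{
        use                     generic-service
        host_name               remotehost
        service_description     CPU Load
        check_command           check_nrpe!check_load
        }
```

The name after the exclamation mark is the command name defined in nrpe.cfg on the client, so every additional remote check is just another service definition of the same shape.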

2) How will I monitor my servers without any external help?

Yes, you can monitor your servers on your own. Individual user accounts can be created in Nagios and given to customers so that they can always keep an eye on their servers. Of course, the web hosting service has to provide you with such privileges.

3) Am I alerted if something goes wrong on my servers?

Yes, this is the most important feature, and customers get their alerts through e-mail.

If a web hosting service does not offer you these basic features, you are most probably choosing the wrong host.

Monitoring the servers can solve many issues that cause headaches for both sysadmins and customers. By regularly monitoring the servers, you make sure that service failures are caught quickly, and as a result there will be far fewer problems. As someone working in the support industry, I have seen many problems recur due to service failures, load surges and system outages, all of which can be avoided by implementing effective monitoring systems. Try implementing a monitoring system and see the difference for yourself. You will have just cut your workload in half!



