You will create a simple network management configuration which checks that certain programs are running on your network and are performing as expected, and that certain system details are performing within normal parameters.
In order to simplify configuration the manager will poll (initiate) all communication.
You should add at least two extra features to the program. These extra features should be clearly documented.
Deliverable 1:
A Perl script which polls SNMP agents to check that certain services/processes are running, and that the system is operating as expected.
The script should read two configuration files:
One which specifies for each application to check:
Which host to query, the community string to use, and the maximum resources the application should be using
And another which specifies for each host:
The community string to use, and the maximum resources the system should be using
Each configuration file should also specify how alerts are sent to an administrator when things are not running as specified. An alert is sent if a process is not running, if a process is using more resources than expected, or if the system load is above the level specified. Warnings should be sent via email and/or syslog, as specified in the config file.
Example config files are supplied which your script should be capable of accepting.
The program will be invoked as follows:
./snmp_check.pl [OPTIONS]
OPTIONS
The optional command-line arguments modify how the program behaves and can appear in any order:
-p [prog-config-file-path] | --prog-config [prog-config-file-path]
Where prog-config-file-path is the path to the config file containing the processes that the script will check are running on the hosts (which are also specified in that file).
If this option is not specified the script should try to read
/etc/snmp_check/prog_check.conf
-s [system-config-file-path] | --sys-config [system-config-file-path]
Where system-config-file-path is the path to the system config file. This file specifies system-wide attributes to check on specific hosts.
If this option is not specified the script should try to read
/etc/snmp_check/sys_check.conf
-h | --help
Displays information about the script and how it is used.
Any additional features you have implemented should be clearly described.
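The option handling described above can be sketched as follows. The deliverable itself must be written in Perl (where the core Getopt::Long module covers the same ground), so this Python sketch only illustrates the expected behaviour: both options may appear in any order, and each falls back to its default path when omitted. The function name parse_options is hypothetical.

```python
import argparse

DEFAULT_PROG_CONFIG = "/etc/snmp_check/prog_check.conf"
DEFAULT_SYS_CONFIG = "/etc/snmp_check/sys_check.conf"


def parse_options(argv):
    """Parse the command-line options; -h/--help is provided by argparse."""
    parser = argparse.ArgumentParser(
        prog="snmp_check.pl",
        description="Poll SNMP agents and warn when processes or hosts "
                    "are outside their configured limits.",
    )
    parser.add_argument(
        "-p", "--prog-config",
        metavar="prog-config-file-path",
        default=DEFAULT_PROG_CONFIG,
        help="config file listing the processes to check",
    )
    parser.add_argument(
        "-s", "--sys-config",
        metavar="system-config-file-path",
        default=DEFAULT_SYS_CONFIG,
        help="config file listing system-wide limits per host",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    import sys
    options = parse_options(sys.argv[1:])
    print(options.prog_config, options.sys_config)
```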
Here are the contents of an example prog_check.conf, which your script should be able to process:
# an example config file for snmp_prog_check.pl
# this config file specifies programs which should be running
rsync host:localhost community:public email syslog
# troll is on port 1161
# troll runs apache and should warn if it is not running
# or if it is using too many resources
apache2 host:TCP:troll.murdoch.edu.au:1161 community:ICT338 maxmem:500 email syslog
# troll also acts as a log server
syslog-ng host:TCP:troll.murdoch.edu.au:1161 community:ICT338 email
# moriah is on port 1162 of troll (via port forwarding)
# moriah is a gateway, running asterisk telephone services
asterisk host:TCP:troll.murdoch.edu.au:1162 community:ICT338 maxmem:15000 email
# myth1 is on port 1164 of troll (via port forwarding)
# myth1 is a MythTV backend so mythbackend should be running
mythbackend host:TCP:troll.murdoch.edu.au:1164 community:ICT338 syslog
Here are the contents of an example sys_check.conf:
# an example config file for snmp_prog_check.pl
# this config file specifies normal operating conditions
# Warn if localhost is under high load
localhost MaxProcessorLoad:80 community:public community:public email syslog
# Warn if troll is under high load
TCP:troll.murdoch.edu.au:1161 MaxProcessorLoad:90 community:public community:public email syslog
# moriah is on port 1162 of troll (via port forwarding)
# Warn if moriah is under high load
TCP:troll.murdoch.edu.au:1162 MaxProcessorLoad:90 community:public community:public email syslog
# rattus is on port 1163 of troll (via port forwarding)
# Warn if rattus is under high load
TCP:troll.murdoch.edu.au:1163 MaxProcessorLoad:90 community:public community:public email syslog
# myth1 is on port 1164 of troll (via port forwarding)
# Warn if myth1 is under high load
TCP:troll.murdoch.edu.au:1164 MaxProcessorLoad:90 community:public community:public email syslog
# gruff is on port 1165 of troll (via port forwarding)
# Warn if gruff is under high load
TCP:troll.murdoch.edu.au:1165 MaxProcessorLoad:90 community:public community:public email syslog
# goblin is on port 1166 of troll (via port forwarding)
# Warn if goblin is under high load
TCP:troll.murdoch.edu.au:1166 MaxProcessorLoad:90 community:public community:public email syslog
# myth2 is on port 1167 of troll (via port forwarding)
# Warn if myth2 is under high load
TCP:troll.murdoch.edu.au:1167 MaxProcessorLoad:90 community:public community:public email syslog
Configuration file format:
Any lines in the config file starting with # should be ignored.
host:hostname
Defines the agent to connect to for the record on the same line, where hostname is the name or IP of the machine. The port and/or TCP/UDP may also be specified, as shown in the examples.
community:name
Defines the community to use for the record on the same line, where name is the community string.
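As an illustration of the agent forms used in the examples (localhost, TCP:troll.murdoch.edu.au:1161), the value can be split into transport, hostname and port. This is a sketch in Python for clarity only, since the deliverable must be Perl. The function name parse_agent is hypothetical, and the fallbacks to UDP and port 161 are assumptions based on the standard SNMP defaults rather than anything this specification states. IPv6 literals are not handled.

```python
def parse_agent(spec):
    """Split an agent value such as 'TCP:host:1161' into its parts.

    Returns a (transport, hostname, port) tuple.
    Assumption: UDP and port 161 are used when not given.
    """
    parts = spec.split(":")
    transport = "udp"
    port = 161

    # An optional leading TCP/UDP selects the transport.
    if parts[0].lower() in ("tcp", "udp"):
        transport = parts.pop(0).lower()

    # An optional trailing number is the port.
    if len(parts) > 1 and parts[-1].isdigit():
        port = int(parts.pop())

    # Exactly one non-empty piece, the hostname or IP, must remain.
    if len(parts) != 1 or not parts[0]:
        raise ValueError(f"cannot parse agent {spec!r}")

    return transport, parts[0], port
```

A Perl script built on Net-SNMP can often pass the original string through unchanged; splitting it is mainly useful for validation and for readable alert messages.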
prog_check.conf format:
The first word on a line is the name of a process which should be running.
Following the name of a program are directives which describe details for the program. They are separated by whitespace and should be able to appear in any order.
maxmem:num
Defines the maximum amount of memory the program is expected to consume, where num is the maximum number of KBytes the program should be consuming.
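The record format above can be parsed one line at a time. The sketch below is in Python purely to show the logic (the deliverable must be Perl), and the function name parse_prog_line and the dictionary keys are hypothetical. It assumes email and syslog are bare flags, as in the example file, and splits each directive on its first colon only so that a host value such as TCP:troll.murdoch.edu.au:1161 stays intact.

```python
def parse_prog_line(line):
    """Parse one prog_check.conf line into a dict, or None if ignorable."""
    line = line.strip()
    # Comment lines are ignored; blank lines are skipped as well (assumption).
    if not line or line.startswith("#"):
        return None

    # The first word is the process name; the rest are directives.
    name, *directives = line.split()
    record = {
        "process": name,
        "host": None,
        "community": None,
        "maxmem": None,   # KBytes, None when no limit is given
        "email": False,
        "syslog": False,
    }

    for token in directives:
        # Split on the first colon only: host values contain colons.
        key, _, value = token.partition(":")
        if token in ("email", "syslog"):
            record[token] = True
        elif key == "maxmem":
            record["maxmem"] = int(value)
        elif key in ("host", "community"):
            record[key] = value
        else:
            raise ValueError(f"unknown directive {token!r}")

    return record
```

Because the loop does not depend on position, the directives are accepted in any order, which is what the format requires.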
sys_check.conf format:
The first word on a line is the hostname or IP of the agent to connect to.
Following the name of the system are directives which describe details for the system. They are separated by whitespace and should be able to appear in any order.
MaxProcessorLoad:num
Defines the maximum processor load which any single CPU is expected to reach, where num is the highest load any one CPU should be under.
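A system record and its per-CPU check might be sketched as below. This is Python for illustration only (the deliverable must be Perl); parse_sys_line and overloaded_cpus are hypothetical names, and the per-CPU load values are assumed to have already been fetched from the agent. A repeated directive, such as the doubled community:public in the example file, simply overwrites the earlier value. Treating "exceeds" as strictly greater than the limit is an assumption.

```python
def parse_sys_line(line):
    """Parse one sys_check.conf line into a dict, or None if ignorable."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    # The first word is the agent; the rest are directives in any order.
    host, *directives = line.split()
    record = {"host": host, "community": None, "max_load": None,
              "email": False, "syslog": False}

    for token in directives:
        key, _, value = token.partition(":")
        if token in ("email", "syslog"):
            record[token] = True
        elif key == "MaxProcessorLoad":
            record["max_load"] = int(value)
        elif key == "community":
            record["community"] = value   # a repeat overwrites the earlier value
        else:
            raise ValueError(f"unknown directive {token!r}")
    return record


def overloaded_cpus(record, cpu_loads):
    """Return (cpu_index, load) for every CPU above the configured limit."""
    limit = record["max_load"]
    if limit is None:
        return []
    # Assumption: a load equal to the limit does not trigger a warning.
    return [(i, load) for i, load in enumerate(cpu_loads) if load > limit]
```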
You should add at least two extra features to the program. These should be clearly documented in the code, comments section and help option.
Script Output
Output should include the date and time and a meaningful description of the event which has occurred (e.g. "01/05/10 16:44:55 The system 'localhost' has exceeded the CPU load which is expected. Current load: 90, Expected < 80").
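One way to build such a line is sketched below, in Python for illustration only (the deliverable must be Perl). The function name format_load_alert is hypothetical, and reading the example date as day/month/year is an assumption, since the specification does not say which order is intended.

```python
from datetime import datetime


def format_load_alert(host, current, limit, when=None):
    """Return a timestamped CPU-load warning in the style of the example."""
    when = when or datetime.now()
    # Assumption: the example timestamp is day/month/two-digit year.
    stamp = when.strftime("%d/%m/%y %H:%M:%S")
    return (
        f"{stamp} The system '{host}' has exceeded the CPU load which is "
        f"expected. Current load: {current}, Expected < {limit}"
    )


if __name__ == "__main__":
    print(format_load_alert("localhost", 90, 80))
```

The same string can then be handed to whichever channels the record enables (email, syslog, or both), so the wording stays identical across them.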
The script should deal gracefully with the event that one of the hosts is unresponsive. Some of these systems will not be available at certain times of the day.
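A possible shape for this is to isolate each host's query so that a timeout is recorded and the loop carries on. The sketch below is Python for illustration only (the deliverable must be Perl). The names poll_all and poll are hypothetical, and poll stands in for whatever routine performs the actual SNMP request; it is assumed to raise OSError (of which TimeoutError is a subclass) when the agent cannot be reached.

```python
def poll_all(records, poll):
    """Query every record, collecting failures instead of aborting.

    `poll` is a caller-supplied function (hypothetical) that takes one
    record and returns its result, raising OSError on network failure.
    """
    results = {}
    unreachable = []
    for record in records:
        host = record["host"]
        try:
            results[host] = poll(record)
        except OSError as exc:
            # Note the failure and move on to the next host.
            unreachable.append((host, str(exc)))
    return results, unreachable
```

Whether an unreachable host should itself raise an alert, or only be noted in the output, is left open by the specification; either choice should be documented.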
Deliverable 2:
SNMP config files for two situations:
a) an SNMP config file for a computer which acts as an agent and manager. In this case only localhost should be allowed access to SNMP objects.