nagios (84) Versions 6.0.4

Installs and configures Nagios server

Policyfile: cookbook 'nagios', '= 6.0.4', :supermarket
Berkshelf: cookbook 'nagios', '= 6.0.4'
Knife: knife supermarket install nagios, or knife supermarket download nagios

nagios cookbook

Installs and configures Nagios server. Chef nodes are automatically discovered using search, and Nagios host groups are created based on Chef roles and optionally environments as well.

Requirements

Chef

Chef version 0.10.10+ and Ohai 0.6.12+ are required.

Because of the heavy use of search, this recipe will not work with Chef Solo, as it cannot do any searches without a server.

This cookbook relies heavily on multiple data bags. See Data Bag below.

The system running this cookbook should have a role named 'monitoring' so that NRPE clients can authorize monitoring from that system. This role name is configurable via an attribute. See Attributes below.

The functionality that was previously in the nagios::client recipe has been moved to its own NRPE cookbook at https://github.com/tas50/chef-nrpe

Platform

  • Debian 6.X, 7.X
  • Ubuntu 10.04, 12.04, 13.04
  • Red Hat Enterprise Linux (CentOS/Amazon/Scientific/Oracle) 5.X, 6.X

Notes: This cookbook has been tested on the listed platforms. It may work on other platforms with or without modification.

Cookbooks

  • apache2 2.0 or greater
  • build-essential
  • nginx
  • nginx_simplecgi
  • php
  • yum-epel (note: this requires yum cookbook v3.0, which breaks compatibility with many other cookbooks)

Attributes

default

  • node['nagios']['user'] - Nagios user, default 'nagios'.
  • node['nagios']['group'] - Nagios group, default 'nagios'.
  • node['nagios']['plugin_dir'] - location where Nagios plugins go, default '/usr/lib/nagios/plugins'.
  • node['nagios']['multi_environment_monitoring'] - if 'true', the Nagios server will monitor hosts in all environments, not just its own, default 'false'.
  • node['nagios']['monitored_environments'] - when multi_environment_monitoring is 'true', limits monitoring to the listed environments. For example, ['prod', 'beta'] monitors only hosts in the 'prod' and 'beta' chef_environments. Defaults to '[]', in which case all Chef environments are monitored.
  • node['nagios']['monitoring_interface'] - If set, will use the specified interface for all nagios monitoring network traffic. Defaults to nil

  • node['nagios']['server']['install_method'] - whether to install from package or source. Default chosen by platform based on known packages available for Nagios: debian/ubuntu 'package', redhat/centos/fedora/scientific: source

  • node['nagios']['server']['service_name'] - name of the service used for Nagios, default chosen by platform, debian/ubuntu "nagios3", redhat family "nagios", all others, "nagios"

  • node['nagios']['home'] - Nagios main home directory, default "/usr/lib/nagios3"

  • node['nagios']['conf_dir'] - location where main Nagios config lives, default "/etc/nagios3"

  • node['nagios']['resource_dir'] - location for resources, default "/etc/nagios3"

  • node['nagios']['config_dir'] - location where included configuration files live, default "/etc/nagios3/conf.d"

  • node['nagios']['log_dir'] - location of Nagios logs, default "/var/log/nagios3"

  • node['nagios']['cache_dir'] - location of cached data, default "/var/cache/nagios3"

  • node['nagios']['state_dir'] - Nagios runtime state information, default "/var/lib/nagios3"

  • node['nagios']['run_dir'] - where pidfiles are stored, default "/var/run/nagios3"

  • node['nagios']['docroot'] - Nagios webui docroot, default "/usr/share/nagios3/htdocs"

  • node['nagios']['timezone'] - Nagios timezone, defaults to UTC

  • node['nagios']['enable_ssl'] - boolean for whether Nagios web server should be https, default false

  • node['nagios']['ssl_cert_file'] - location of SSL Certificate File, default "/etc/nagios3/certificates/nagios-server.pem"

  • node['nagios']['ssl_cert_chain_file'] - optional location of SSL Intermediate Certificate File. No default.

  • node['nagios']['ssl_cert_key'] - location of SSL Certificate Key, default "/etc/nagios3/certificates/nagios-server.pem"

  • node['nagios']['http_port'] - port that the Apache/Nginx virtual site should listen on, determined by whether SSL is enabled (443 if so, otherwise 80). Note: You will also need to configure the listening port for either Nginx or Apache within those cookbooks.

  • node['nagios']['server_name'] - common name to use in a server cert, default "nagios"

  • node['nagios']['ssl_req'] - info to use in a cert, default /C=US/ST=Several/L=Locality/O=Example/OU=Operations/CN=#{node['nagios']['server_name']}/emailAddress=ops@#{node['nagios']['server_name']}

  • node['nagios']['server']['url'] - url to download the server source from if installing from source

  • node['nagios']['server']['version'] - version of the server source to download

  • node['nagios']['server']['checksum'] - checksum of the source files

  • node['nagios']['server']['patch_url'] - url to download patches from if installing from source

  • node['nagios']['server']['patches'] - array of patch filenames to apply if installing from source

  • node['nagios']['url'] - URL to host Nagios from - defaults to nil and instead uses FQDN

  • node['nagios']['notifications_enabled'] - set to 1 to enable notification.

  • node['nagios']['check_external_commands']

  • node['nagios']['default_contact_groups']

  • node['nagios']['sysadmin_email'] - default notification email.

  • node['nagios']['sysadmin_sms_email'] - default notification sms.

  • node['nagios']['server_auth_method'] - authentication with the server can be done with openid (using apache2::mod_auth_openid), cas (using apache2::mod_auth_cas), ldap (using apache2::mod_authnz_ldap), or htauth (basic). The default is htauth. "openid" will utilize openid authentication, "cas" will utilize cas authentication, "ldap" will utilize LDAP authentication, and any other value will use htauth (basic).

  • node['nagios']['cas_login_url'] - login url for cas if using cas authentication.

  • node['nagios']['cas_validate_url'] - validation url for cas if using cas authentication.

  • node['nagios']['cas_validate_server'] - whether to validate the server cert. Defaults to off.

  • node['nagios']['cas_root_proxy_url'] - if set, sets the url that the cas server redirects to after auth.

  • node['nagios']['ldap_bind_dn'] - DN used to bind to the server when searching for ldap entries.

  • node['nagios']['ldap_bind_password'] - bind password used with the DN provided for searching ldap.

  • node['nagios']['ldap_url'] - ldap url and search parameters.

  • node['nagios']['ldap_authoritative'] - accepts "on" or "off". Controls whether other authentication modules are prevented from authenticating the user if this one fails.

  • node['nagios']['users_databag'] - the databag containing users to search for. defaults to users

  • node['nagios']['users_databag_group'] - users databag group considered Nagios admins. defaults to sysadmin

  • node['nagios']['services_databag'] - the databag containing services to search for. defaults to nagios_services

  • node['nagios']['servicegroups_databag'] - the databag containing servicegroups to search for. defaults to nagios_servicegroups

  • node['nagios']['templates_databag'] - the databag containing templates to search for. defaults to nagios_templates

  • node['nagios']['hosttemplates_databag'] - the databag containing host templates to search for. defaults to nagios_hosttemplates

  • node['nagios']['eventhandlers_databag'] - the databag containing eventhandlers to search for. defaults to nagios_eventhandlers

  • node['nagios']['unmanaged_hosts_databag'] - the databag containing unmanagedhosts to search for. defaults to nagios_unmanagedhosts

  • node['nagios']['serviceescalations_databag'] - the databag containing serviceescalations to search for. defaults to nagios_serviceescalations

  • node['nagios']['hostescalations_databag'] - the databag containing hostescalations to search for. defaults to nagios_hostescalations

  • node['nagios']['contacts_databag'] - the databag containing contacts to search for. defaults to nagios_contacts

  • node['nagios']['contactgroups_databag'] - the databag containing contactgroups to search for. defaults to nagios_contactgroups

  • node['nagios']['servicedependencies_databag'] - the databag containing servicedependencies to search for. defaults to nagios_servicedependencies

  • node['nagios']['host_name_attribute'] - node attribute to use for naming the host. Must be unique across monitored nodes. Defaults to hostname

  • node['nagios']['regexp_matching'] - Attribute to enable regexp matching. Defaults to 0.

  • node['nagios']['large_installation_tweaks'] - Attribute to enable large installation tweaks. Defaults to 0.

  • node['nagios']['templates'] - These set directives in the default host template. Unless explicitly overridden, they will be inherited by the host definitions for each discovered node and nagios_unmanagedhosts data bag. For more information about these directives, see the Nagios documentation for host definitions.

  • node['nagios']['host_template'] - Host template you want to inherit properties/variables from, default 'server'. For more information, see the Nagios documentation on Object Inheritance.

  • node['nagios']['interval_length'] - minimum interval.

  • node['nagios']['brokers'] - Hash of broker modules to include in the config. Hash key is the path to the broker module, the value is any parameters to pass to it.

  • node['nagios']['default_host']['flap_detection'] - Defaults to true.

  • node['nagios']['default_host']['process_perf_data'] - Defaults to false.

  • node['nagios']['default_host']['check_period'] - Defaults to '24x7'.

  • node['nagios']['default_host']['check_interval'] - In seconds. Must be divisible by node['nagios']['interval_length']. Defaults to 15.

  • node['nagios']['default_host']['retry_interval'] - In seconds. Must be divisible by node['nagios']['interval_length']. Defaults to 15.

  • node['nagios']['default_host']['max_check_attempts'] - Defaults to 1.

  • node['nagios']['default_host']['check_command'] - Defaults to the pre-defined command 'check-host-alive'.

  • node['nagios']['default_host']['notification_interval'] - In seconds. Must be divisible by node['nagios']['interval_length']. Defaults to 300.

  • node['nagios']['default_host']['notification_options'] - Defaults to 'd,u,r'.

  • node['nagios']['default_host']['action_url'] - Defines an action URL. Defaults to nil.

  • node['nagios']['default_service']['process_perf_data'] - Defaults to false.

  • node['nagios']['default_service']['action_url'] - Defines an action URL. Defaults to nil.

  • node['nagios']['server']['web_server'] - web server to use. supports Apache or Nginx, default "apache"

  • node['nagios']['server']['nginx_dispatch'] - nginx dispatch method. supports cgi or php, default "cgi"

  • node['nagios']['server']['stop_apache'] - stop apache service if using nginx, default false

  • node['nagios']['server']['redirect_root'] - if using Apache, should http://server/ redirect to http://server/nagios3 automatically, default false

  • node['nagios']['server']['normalize_hostname'] - If set to true, normalize all hostnames in hosts.cfg to lowercase. Defaults to false.
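
The divisibility requirement on check_interval, retry_interval and notification_interval can be checked before attributes are uploaded. The following is a minimal sketch in plain Ruby, not code from the cookbook: interval_errors is a hypothetical helper, and the interval_length values passed to it are illustrative assumptions rather than documented defaults.

```ruby
# Hypothetical helper (not part of the cookbook): report which default_host
# intervals are not evenly divisible by interval_length.
INTERVAL_KEYS = %w[check_interval retry_interval notification_interval].freeze

def interval_errors(default_host, interval_length)
  INTERVAL_KEYS.select do |key|
    value = default_host[key]
    # Skip unset values; flag any value that leaves a remainder.
    value && (value % interval_length != 0)
  end
end

# Values documented above as the defaults for default_host.
defaults = {
  'check_interval' => 15,
  'retry_interval' => 15,
  'notification_interval' => 300
}

# The interval_length values below are illustrative assumptions.
puts interval_errors(defaults, 5).inspect   # every value is a multiple of 5
puts interval_errors(defaults, 60).inspect  # 15 is not a multiple of 60
```

An empty result means every interval satisfies the constraint for the chosen interval_length.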

These are additional nagios.cfg options.

  • node['nagios']['conf']['max_service_check_spread'] - Defaults to 5
  • node['nagios']['conf']['max_host_check_spread'] - Defaults to 5
  • node['nagios']['conf']['service_check_timeout'] - Defaults to 60
  • node['nagios']['conf']['host_check_timeout'] - Defaults to 30
  • node['nagios']['conf']['process_performance_data'] - Defaults to 0
  • node['nagios']['conf']['host_perfdata_command'] - Defaults to nil
  • node['nagios']['conf']['host_perfdata_file'] - Defaults to nil
  • node['nagios']['conf']['host_perfdata_file_template'] - Defaults to nil
  • node['nagios']['conf']['host_perfdata_file_mode'] - Defaults to nil
  • node['nagios']['conf']['host_perfdata_file_processing_interval'] - Defaults to nil
  • node['nagios']['conf']['host_perfdata_file_processing_command'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_command'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_file'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_file_template'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_file_mode'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_file_processing_interval'] - Defaults to nil
  • node['nagios']['conf']['service_perfdata_file_processing_command'] - Defaults to nil
  • node['nagios']['conf']['date_format'] - Defaults to 'iso8601'
  • node['nagios']['conf']['p1_file'] - Defaults to #{node['nagios']['home']}/p1.pl
  • node['nagios']['conf']['debug_level'] - Defaults to 0
  • node['nagios']['conf']['debug_verbosity'] - Defaults to 1
  • node['nagios']['conf']['debug_file'] - Defaults to #{node['nagios']['state_dir']}/#{node['nagios']['server']['name']}.debug

These are Nagios cgi.cfg options.

  • node['nagios']['cgi']['show_context_help'] - Defaults to 1
  • node['nagios']['cgi']['authorized_for_system_information'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_configuration_information'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_system_commands'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_all_services'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_all_hosts'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_all_service_commands'] - Defaults to '*'
  • node['nagios']['cgi']['authorized_for_all_host_commands'] - Defaults to '*'
  • node['nagios']['cgi']['default_statusmap_layout'] - Defaults to 5
  • node['nagios']['cgi']['default_statuswrl_layout'] - Defaults to 4
  • node['nagios']['cgi']['escape_html_tags'] - Defaults to 0
  • node['nagios']['cgi']['action_url_target'] - Defaults to '_blank'
  • node['nagios']['cgi']['notes_url_target'] - Defaults to '_blank'
  • node['nagios']['cgi']['lock_author_names'] - Defaults to 1

Recipes

default

Includes the correct server installation recipe based on platform, either nagios::server_package or nagios::server_source.

The server recipe sets up Apache as the web front end by default. This recipe also does a number of searches to dynamically build the hostgroups to monitor, hosts that belong to them and admins to notify of events/alerts.

Searches are confined to the node's chef_environment unless multi-environment monitoring is enabled.
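
The environment scoping described here, together with the multi_environment_monitoring and monitored_environments attributes, can be sketched as a small predicate. This is a plain Ruby illustration of the documented semantics only; monitored? is a hypothetical name and the cookbook's actual search logic may differ.

```ruby
# Hypothetical helper mirroring the documented attribute semantics.
# node_env:       chef_environment of the candidate node
# server_env:     chef_environment of the Nagios server
# multi_env:      value of multi_environment_monitoring
# monitored_envs: value of monitored_environments
def monitored?(node_env, server_env, multi_env: false, monitored_envs: [])
  # Single-environment mode: only nodes in the server's own environment.
  return node_env == server_env unless multi_env
  # Multi-environment mode with no list: every environment is monitored.
  return true if monitored_envs.empty?

  # Multi-environment mode with a list: only the listed environments.
  monitored_envs.include?(node_env)
end

puts monitored?('beta', 'prod')
puts monitored?('beta', 'prod', multi_env: true)
puts monitored?('dev', 'prod', multi_env: true, monitored_envs: %w[prod beta])
```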

The recipe does the following:

  1. Searches for users in 'users' databag belonging to a 'sysadmin' group, and authorizes them to access the Nagios web UI and also to receive notification e-mails.
  2. Searches all available roles/environments and builds a list which will become the Nagios hostgroups.
  3. Places nodes in Nagios hostgroups by role / environment membership.
  4. Installs various packages required for the server.
  5. Sets up configuration directories.
  6. Moves the package-installed Nagios configuration to a 'dist' directory.
  7. Disables the 000-default VirtualHost present on Debian/Ubuntu Apache2 package installations.
  8. Templates configuration files for services, contacts, contact groups, templates, hostgroups and hosts.
  9. Enables the Nagios web UI.
  10. Starts the Nagios server service
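
Steps 2 and 3 above can be illustrated with a short sketch that groups nodes by role and environment. It uses hard-coded, hypothetical node hashes in place of Chef search results, and build_hostgroups is an illustrative name rather than a function from the cookbook.

```ruby
# Hypothetical node data; in the cookbook this comes from Chef search.
nodes = [
  { 'name' => 'web1', 'roles' => %w[webserver], 'environment' => 'prod' },
  { 'name' => 'db1',  'roles' => %w[database],  'environment' => 'prod' },
  { 'name' => 'web2', 'roles' => %w[webserver], 'environment' => 'beta' }
]

# Map each hostgroup name to the list of node names that belong to it.
def build_hostgroups(nodes)
  groups = Hash.new { |hash, key| hash[key] = [] }
  nodes.each do |node|
    # A node joins one hostgroup per role plus one for its environment.
    # uniq avoids a duplicate entry when a role and environment share a name.
    (node['roles'] + [node['environment']]).uniq.each do |group|
      groups[group] << node['name']
    end
  end
  groups
end

hostgroups = build_hostgroups(nodes)
hostgroups.each { |name, members| puts "#{name}: #{members.join(',')}" }
```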

server_package

Installs the Nagios server from packages. Default for Debian / Ubuntu systems.

server_source

Installs the Nagios server from source. Default for Red Hat / Fedora based systems as native packages for Nagios are not available in the default repositories.

pagerduty

Installs pagerduty plugin for nagios. If you only have a single pagerduty key, you can simply set a node['nagios']['pagerduty_key'] attribute on your server. For multiple pagerduty key configuration see Pager Duty under Data Bags.

This recipe was written based on the Nagios Integration Guide from PagerDuty which explains how to get an API key for your Nagios server.

Data Bags

Users

Create a users data bag that will contain the users that will be able to log into the Nagios webui. Each user can use htauth with a specified password, or an openid. Users that should be able to log in should be in the sysadmin group. Example user data bag item:

{
  "id": "nagiosadmin",
  "groups": "sysadmin",
  "htpasswd": "hashed_htpassword",
  "openid": "http://nagiosadmin.myopenid.com/",
  "nagios": {
    "pager": "nagiosadmin_pager@example.com",
    "email": "nagiosadmin@example.com"
  }
}

When using server_auth_method 'openid' (default), use the openid in the data bag item. Any other value for this attribute (e.g., "htauth", "htpasswd", etc) will use the htpasswd value as the password in /etc/nagios3/htpasswd.users.

The openid must have the http:// and trailing /. The htpasswd must be the hashed value. Get this value with htpasswd:

% htpasswd -n -s nagiosadmin
New password:
Re-type new password:
nagiosadmin:{SHA}oCagzV4lMZyS7jl2Z0WlmLxEkt4=

For example use the {SHA}oCagzV4lMZyS7jl2Z0WlmLxEkt4= value in the data bag.
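
If htpasswd is not available, the same kind of value can be produced in Ruby. The {SHA} scheme used by htpasswd -s is the literal prefix {SHA} followed by the Base64-encoded SHA-1 digest of the password. The sketch below assumes only the Ruby standard library; htpasswd_sha is a hypothetical helper name and the password shown is a placeholder.

```ruby
require 'digest'

# Build the value that `htpasswd -n -s` prints after the colon:
# "{SHA}" followed by the Base64-encoded raw SHA-1 digest of the password.
def htpasswd_sha(password)
  digest = Digest::SHA1.digest(password)
  # pack('m0') Base64-encodes without a trailing newline.
  "{SHA}#{[digest].pack('m0')}"
end

# Placeholder password for illustration only.
puts htpasswd_sha('example-password')
```

The resulting string goes in the htpasswd field of the user data bag item.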

Contacts and Contact Groups

To send alert notifications to contacts that aren't authorized to log in to Nagios via the 'users' data bag, create nagios_contacts and nagios_contactgroups data bags.

Example nagios_contacts data bag item

{
  "id": "devs",
  "alias": "Developers",
  "use": "default-contact",
  "email": "devs@company.com",
  "pager": "page_the_devs@company.com"
}

Example nagios_contactgroups data bag item

{
  "id": "non_admins",
  "alias": "Non-Administrator Contacts",
  "members": "devs,helpdesk,managers"
}

Services

To add service checks to Nagios create a nagios_services data bag containing definitions for services to be monitored. This allows you to add monitoring rules without directly editing the services and commands templates in the cookbook. Each service will be named based on the id of the data bag item and the command will be named using the same id prepended with "check_". Just make sure the id in your data bag doesn't conflict with a service or command already defined in the templates.

Here's an example of a service check for sshd that applies to the linux hostgroup:

{
  "id": "ssh",
  "hostgroup_name": "linux",
  "command_line": "$USER1$/check_ssh $HOSTADDRESS$"
}

You may optionally define the service template for your service by including service_template and a valid template name.

Example:

"service_template": "special_service_template"

You may also optionally add a service description that will be displayed in the Nagios UI using "description": "My Service Name". If this is not present, the data bag item ID will be used as the description. You can attach a defined escalation to the service with 'use_escalation'. See Service Escalations for more information.

You may also use an already defined command definition by omitting the command_line parameter and using use_existing_command parameter instead:

{
  "id": "pingme",
  "hostgroup_name": "all",
  "use_existing_command": "check-host-alive"
}
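
The naming rules in this section (the service takes the item id, the command is the id prefixed with "check_", and use_existing_command replaces the generated command) can be sketched as follows. This is an illustration in plain Ruby; service_names is a hypothetical helper, and the cookbook's templates remain the source of truth for the generated configuration.

```ruby
require 'json'

# Hypothetical helper: derive the Nagios service name and command name from
# a nagios_services data bag item, following the rules described above.
def service_names(item)
  command = item['use_existing_command'] || "check_#{item['id']}"
  { service: item['id'], command: command }
end

# Non-interpolating heredoc so the Nagios $MACRO$ names are kept verbatim.
ssh = JSON.parse(<<~'ITEM')
  {
    "id": "ssh",
    "hostgroup_name": "linux",
    "command_line": "$USER1$/check_ssh $HOSTADDRESS$"
  }
ITEM

pingme = { 'id' => 'pingme', 'hostgroup_name' => 'all',
           'use_existing_command' => 'check-host-alive' }

p service_names(ssh)
p service_names(pingme)
```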

You may also specify that a check only be run if the nagios server is in a specific environment. This is useful if you have nagios servers in several environments but you would like a service check to only apply in one particular environment:

{
  "id": "ssh",
  "hostgroup_name": "linux",
  "activate_check_in_environment": "staging",
  "command_line": "$USER1$/check_ssh $HOSTADDRESS$"
}
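
The effect of activate_check_in_environment can be pictured as a filter applied on each Nagios server. The sketch below is a plain Ruby illustration under that assumption; active_services is a hypothetical name and the data bag items are represented as ordinary hashes.

```ruby
# Hypothetical filter: keep services with no environment restriction, plus
# those whose restriction matches the Nagios server's own environment.
def active_services(services, server_environment)
  services.select do |service|
    required = service['activate_check_in_environment']
    required.nil? || required == server_environment
  end
end

services = [
  { 'id' => 'ssh', 'activate_check_in_environment' => 'staging' },
  { 'id' => 'pingme' }
]

puts active_services(services, 'staging').map { |s| s['id'] }.inspect
puts active_services(services, 'production').map { |s| s['id'] }.inspect
```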

Service Groups

Create a nagios_servicegroups data bag that will contain definitions for service groups. Each service group will be named based on the id of the data bag item.

{
  "id": "ops",
  "alias": "Ops",
  "notes": "Services for ops"
}

You can group your services by using the "servicegroups" keyword in your services data bags. For example, to have your ssh checks show up under the ops service group, you could define it like this:

{
  "id": "ssh",
  "hostgroup_name": "all",
  "command_line": "$USER1$/check_ssh $HOSTADDRESS$",
  "servicegroups": "ops"
}

Service Dependencies

Create a nagios_servicedependencies data bag that will contain definitions for service dependencies. Each service dependency will be named based on the id of the data bag. Each service dependency requires a dependent host name and/or hostgroup name, dependent service description, host name and/or hostgroup name, and service description.

{
  "id": "Service_X_depends_on_Service_Y",
  "dependent_host_name": "ServerX",
  "dependent_service_description": "Service X",
  "host_name": "ServerY",
  "service_description": "Service Y",
  "notification_failure_criteria": "u, c"
}

Additional directives can be defined as described in the Nagios documentation.

Time Periods

Create a data bag for time periods, nagios_timeperiods by default, for timeperiod definitions. Time periods are named based on the id of the data bag item, and the id and alias are required.

Here is an example timeperiod definition:

{
  "id": "time_period_name",
  "alias": "This time period goes from now to then",
  "times": [
    "sunday 09:00-17:00",
    "monday 09:00-17:00",
    "tuesday 09:00-17:00",
    "wednesday 09:00-17:00",
    "thursday 09:00-17:00",
    "friday 09:00-17:00",
    "saturday 09:00-17:00"
  ]
}
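
To show how such an item relates to a Nagios object definition, here is a rough sketch that renders a timeperiod block from a hash and enforces the required id and alias. render_timeperiod and the 'workhours' item are hypothetical, and the exact layout produced by the cookbook's template may differ.

```ruby
# Hypothetical renderer: turn a time period data bag item into a Nagios
# "define timeperiod" block. id and alias are required, as noted above.
def render_timeperiod(item)
  raise ArgumentError, 'id and alias are required' unless item['id'] && item['alias']

  lines = ['define timeperiod {']
  lines << "  timeperiod_name #{item['id']}"
  lines << "  alias #{item['alias']}"
  # Each entry already has the "<day> <hours>" shape Nagios expects.
  Array(item['times']).each { |entry| lines << "  #{entry}" }
  lines << '}'
  lines.join("\n")
end

item = {
  'id' => 'workhours',
  'alias' => 'Weekday working hours',
  'times' => ['monday 09:00-17:00', 'tuesday 09:00-17:00']
}

puts render_timeperiod(item)
```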

Additional information on defining time periods can be found in the Nagios Documentation.

Host Templates

Host templates are optional, but allow you to specify combinations of attributes to apply to a host. Create a nagios_hosttemplates data bag that will contain definitions for host templates to be used. Each host template need only specify id and whichever parameters you want to override.

Here's an example of a host template that overrides the check command for Windows hosts.

{
  "id": "windows-host",
  "check_command": "check-host-alive-windows"
}

You then use the host template by setting the node['nagios']['host_template'] attribute for a node. You could apply this with a role as follows:

role 'windows'

default_attributes(
  nagios: {
    host_template: 'windows-host'
  }
)

Additional directives can be defined as described in the Nagios documentation for Host Definitions.

Templates

Templates are optional, but allow you to specify combinations of attributes to apply to a service. Create a nagios_templates data bag that will contain definitions for templates to be used. Each template need only specify id and whichever parameters you want to override.

Here's an example of a template that reduces the check frequency to once per day and changes the retry interval to 1 hour.

{
  "id": "dailychecks",
  "check_interval": "86400",
  "retry_interval": "3600"
}

You then use the template in your service data bag as follows:

{
  "id": "expensive_service_check",
  "hostgroup_name": "linux",
  "command_line": "$USER1$/check_example $HOSTADDRESS$",
  "service_template": "dailychecks"
}

Search Defined Hostgroups

Create a nagios_hostgroups data bag that will contain definitions for Nagios hostgroups populated via search. These data bags include a Chef node search query that will populate the Nagios hostgroup with nodes based on the search.

Here's an example to find all HP hardware systems for an "hp_systems" hostgroup:

{
  "search_query": "dmi_system_manufacturer:HP",
  "hostgroup_name": "hp_systems",
  "id": "hp_systems"
}

Monitoring Systems Not In Chef

Create a nagios_unmanagedhosts data bag that will contain definitions for hosts not in Chef that you would like to manage. "hostgroups" can be an existing Chef role (every Chef role gets a Nagios hostgroup) or a new hostgroup. Note that "hostgroups" must be an array of hostgroups even if it contains just a single hostgroup. host_template defaults to 'server', but you can override it to use a custom template.

Here's an example host definition:

{
  "address": "webserver1.mydmz.dmz",
  "hostgroups": ["web_servers","production_servers"],
  "id": "webserver1",
  "notifications": 1,
  "host_template": "unpingable-host"
}
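
The two rules stated above (hostgroups must be an array, and host_template falls back to 'server') can be expressed as a small validation step. This is a sketch in plain Ruby; normalize_unmanaged_host is a hypothetical helper, not something the cookbook provides.

```ruby
# Hypothetical helper: validate an unmanaged host item and apply the
# documented default for host_template.
def normalize_unmanaged_host(item)
  unless item['hostgroups'].is_a?(Array)
    raise ArgumentError, "hostgroups must be an array for #{item['id']}"
  end

  # Keys present in the item override the default supplied here.
  { 'host_template' => 'server' }.merge(item)
end

host = normalize_unmanaged_host(
  'id' => 'webserver1',
  'address' => 'webserver1.mydmz.dmz',
  'hostgroups' => ['web_servers']
)
puts host['host_template']
```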

Similar to services, you may also filter unmanaged hosts by environment. This is useful if you have nagios servers in several environments but you would like to monitor an unmanaged host that only exists in a particular environment:

{
  "address": "webserver1.mydmz.dmz",
  "hostgroups": ["web_servers","production_servers"],
  "id": "webserver1",
  "environment": "production",
  "notifications": 1
}

Service Escalations

You can optionally define service escalations for the data bag defined services. Doing so involves two steps - creating the nagios_serviceescalations data bag and invoking it from the service. For example, to create an escalation to page managers on a 15 minute period after the 3rd page:

{
  "id": "15-minute-escalation",
  "contact_groups": "managers",
  "first_notification": "3",
  "last_notification": "0",
  "escalation_period": "24x7",
  "notification_interval": "900"
}

Then, in the service data bag,

{
  "id": "my-service",
  // ...
  "use_escalation": "15-minute-escalation"
}

You can also define escalations using wildcards, like so:

{
  "id": "first-warning",
  "contact_groups": "sysadmin",
  "hostgroup_name": "*",
  "first_notification": "1",
  "last_notification": "0",
  "notification_interval": "21600",
  "escalation_period": "24x7",
  "escalation_options": "w",
  "service_description": "*",
  "register": 1
}

This configures notifications for all warnings to repeat on a given interval (under the default config, every 6 hours). (Note that you must register this kind of escalation, as it is not a template.)
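
The interval arithmetic behind the two escalation examples can be checked with a short calculation. The sketch assumes interval_length is 1 second, which is what "under the default config" implies here; with a different interval_length the same notification_interval values would map to different durations. interval_hours is a hypothetical helper.

```ruby
# Convert a Nagios interval (counted in units of interval_length seconds)
# to hours. interval_length: 1 is an assumption about the default config.
def interval_hours(interval, interval_length: 1)
  interval * interval_length / 3600.0
end

puts interval_hours(21_600)     # 6.0 hours, the wildcard escalation
puts interval_hours(900) * 60   # 15.0 minutes, the earlier escalation
```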

Event Handlers

You can optionally define event handlers to trigger on service alerts by creating a nagios_eventhandlers data bag that will contain definitions of event handlers for services monitored via Nagios.

This example event handler data bag item restarts chef-client. Note: This assumes you have already defined an NRPE job restart_chef-client on the host where this command will run. You can use the NRPE LWRP to add commands to your local NRPE configs from within your cookbooks.

{
  "command_line": "$USER1$/check_nrpe -H $HOSTADDRESS$ -t 45 -c restart_chef-client",
  "id": "restart_chef-client"
}

Once you've defined an event handler you will need to add the event handler to a service definition in order to trigger the action. See the example service definition below.

{
  "command_line": "$USER1$/check_nrpe -H $HOSTADDRESS$ -t 45 -c check_chef_client",
  "hostgroup_name": "linux",
  "id": "chef-client",
  "event_handler": "restart_chef-client"
}

Pager Duty

You can define pagerduty contacts and keys by creating nagios_pagerduty data bags that contain the contact and
the relevant key. Setting admin_contactgroup to "true" will add this pagerduty contact to the admin contact group
created by this cookbook.

{
  "id": "pagerduty_critical",
  "admin_contactgroup": "true",
  "key": "a33e5ef0ac96772fbd771ddcccd3ccd0"
}

You can add these contacts to any contactgroups you create.

Monitoring Role

Create a role to use for the monitoring server. The role name should match the value of the attribute "node['nagios']['server_role']". By default, this is 'monitoring'. For example:

# roles/monitoring.rb
name 'monitoring'
description 'Monitoring server'
run_list(
  'recipe[nagios::default]'
)

default_attributes(
  'nagios' => {
    'server_auth_method' => 'htauth'
  }
)

$ knife role from file monitoring.rb

Usage

server setup

Create a role named 'monitoring', and add the nagios server recipe to the run_list. See Monitoring Role above for an example.

Apply the nrpe cookbook to nodes in order to install the NRPE client

By default the Nagios server will only monitor systems in its own environment. To change this, set the multi_environment_monitoring attribute. See Attributes.

Create data bag items in the users data bag for each administrator you would like to be able to log in to the Nagios server UI. Pay special attention to the method you would like to use to authorize users (openid or htauth). See Users and Attributes.

At this point you have a minimally functional Nagios server; however, the server will lack any service checks beyond the single Nagios Server health check.

defining checks

NRPE commands are defined in recipes using the nrpe_check LWRP provider in the nrpe cookbook. For base system monitoring such as load, ssh, memory, etc., you may want to create a cookbook in your environment that defines each monitoring command via the LWRP.

With NRPE commands created using the LWRP you will need to define Nagios services to use those commands. These services are defined using the nagios_services data bag and applied to roles and/or environments. See Services

enabling notifications

You need to set the default['nagios']['notifications_enabled'] = 1 attribute on your Nagios server to enable email notifications.

For email notifications to work an appropriate mail program package and local MTA need to be installed so that /usr/bin/mail or /bin/mail is available on the system.

Example:

Include the postfix cookbook so that Postfix is installed on your Nagios server node.

Add override_attributes to your monitoring role:

# roles/monitoring.rb
name 'monitoring'
description 'Monitoring Server'
run_list(
  'recipe[nagios::default]',
  'recipe[postfix]'
)

override_attributes(
  'nagios' => { 'notifications_enabled' => '1' },
  'postfix' => { 'myhostname' => 'your_hostname', 'mydomain' => 'example.com' }
)

default_attributes(
  'nagios' => { 'server_auth_method' => 'htauth' }
)

$ knife role from file monitoring.rb

License & Authors

Copyright 2009, 37signals
Copyright 2009-2013, Chef Software, Inc
Copyright 2012, Webtrends Inc.
Copyright 2013-2014, Limelight Networks, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

nagios Cookbook CHANGELOG

This file is used to list changes made in each version of the nagios cookbook.

v6.0.4

Bug

  • Fix normalized hostnames not normalizing the hostgroups
  • Don't register the service templates so that Nagios will start properly
  • Require Apache2 cookbook version 2.0 or greater due to breaking changes with how site.conf files are handled

Improvement

  • Added additional options for perfdata

New Feature

  • Added the ability to specify a URL to download patches that will be applied to the source install prior to compilation

v6.0.2

Bug

  • Remove .DS_Store files in the supermarket file that caused failures on older versions of Berkshelf

v6.0.0

Breaking changes

  • NRPE is no longer installed by the nagios cookbook. This is handled by the NRPE cookbook. Moving this logic allows for more fine-grained control of how the two services are installed and configured
  • Previously the Nagios server was monitored out of the box using an NRPE check. This is no longer the case since the cookbooks are split. You'll need to add a services data bag to return this functionality
  • RHEL now defaults to installing via packages. If you would like to continue installing via source make sure to set the install_method attribute
  • node['nagios']['additional_contacts'] attribute has been removed. This was previously used for Pagerduty integration
  • Server setup is now handled in the nagios::default recipe vs. the nagios::server recipe. You will need to update roles / nodes referencing the old recipe
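
As an illustration of the last breaking change above, the sketch below rewrites run-list entries that still reference nagios::server so they point at nagios::default. The helper migrate_run_list and the sample run list are hypothetical and not part of the cookbook; in practice you would edit your role and node definitions directly.

```ruby
# Hypothetical helper, not part of the cookbook: replace references to the
# pre-6.0.0 server recipe with the recipe that now performs server setup.
OLD_RECIPE = 'recipe[nagios::server]'
NEW_RECIPE = 'recipe[nagios::default]'

def migrate_run_list(run_list)
  # Swap the old entry for the new one, then drop any duplicate that
  # results when both recipes were already listed.
  run_list.map { |item| item == OLD_RECIPE ? NEW_RECIPE : item }.uniq
end

# Sample run list, made up for this example.
old_run_list = ['role[base]', 'recipe[nagios::server]', 'recipe[postfix]']
new_run_list = migrate_run_list(old_run_list)
puts new_run_list.inspect
```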

Bug

  • htpasswd file should be set up after Nagios has been installed to ensure the user has been created
  • Ensure that the Linux hostgroup still gets created even if the Nagios server is the first to come up in the environment
  • Correctly set the vname on RHEL/Fedora platforms for source/package installs
  • Set resource_dir in nagios.cfg on RHEL platforms with a new attribute
  • Create the archives dir in the log on source installs
  • Properly create the Nagios user/group on source installs
  • Properly set the path for the p1.pl file on RHEL platforms
  • Ensure that the hostgroups array doesn't include duplicates in the event that an environment and role have the same name
  • Only template nagios.cfg once
  • Fix ocsp-command typo in nagios.cfg
  • Fix bug that prevented Apache2 recipe from completing

Improvement

  • Readme cleanup
  • Created a new users_helper library to abstract much of the Ruby logic for building user lists out of the recipe
  • Avoid writing out empty comments in templates for data bag driven configs
  • Add a full chefignore file to help with Berkshelf
  • Better documented host_perfdata_command and service_perfdata_command in the README
  • Add possibility to configure default_service with options process_perf_data & action_url
  • Add possibility to configure default_host with options process_perf_data & action_url
  • Allow freshness_threshold and active_checks_enabled to be specified in templates
  • Added a generic service-template w/min req. params

New Feature

  • New attribute node['nagios']['monitored_environments'] for specifying multiple environments you'd like to monitor
  • Allow using the exclusion hostgroup format used by Nagios when defining the hostgroup for a check
  • Host templates can now be defined via a new host_templates data bag.

Development

  • Vagrantfile updated for Vagrant 1.5 format changes
  • Updated Rubocop / Foodcritic / Chefspec / Berkshelf gems to the latest for Travis testing
  • Updated Berkshelf file to the 3.0 format
  • Updated Test Kitchen / Kitchen Vagrant gems to the latest for local testing
  • Test Kitchen suite added for source installs
  • Ubuntu 13.04 swapped for 14.04 in Test Kitchen
  • Added a large number of data bags to be used by Test Kitchen to handle several scenarios
  • Set up port forwarding in Test Kitchen so you can converge the nodes and load the Web UI
  • Added additional Test Kitchen and Chef Spec tests

v5.3.4

Bug

  • Fixed two bugs that prevented Apache/NGINX web server setups from configuring correctly

v5.3.2

Bug

  • Remove a development file that was accidentally added to the community site release

v5.3.0

Breaking changes

  • Directories for RHEL installations have been updated to use correct RHEL directories vs. Debian directories. You may need to override these directories with the existing directories to not break existing installations on RHEL. Proceed with caution.

Bug

  • Cookbook no longer fails the run if a node has no roles
  • Cookbook no longer fails if there are no users defined in the data bag
  • Cookbook no longer fails if a node has no hostname
  • Cookbook no longer fails if the node does not have a defined OS
  • Fix incorrect Pagerduty key usage
  • Allowed NRPE hosts were not being properly determined due to bad logic and a typo

Improvement

  • Improve Test-Kitchen support with newer RHEL point releases, Ubuntu 13.04, and Debian 6/7
  • Simplified logic in web server detection for determining public domain and switched from symbols to strings throughout

New Feature

  • Support for Nagios host escalations via a new data bag. See the readme for additional details
  • New attribute node['nagios']['monitoring_interface'] to allow specifying a specific network interface's IP to monitor
  • You can now define the values for execute_service_checks, accept_passive_service_checks, execute_host_checks, and accept_passive_host_checks via attributes
  • You can now define the values for obsess_over_services and obsess_over_hosts settings via attributes

v5.2.0

Breaking changes

  • This release requires yum-epel, which requires the yum v3.0 cookbook. This may break other cookbooks in your environment

Bug

  • Change yum cookbook dependency to yum-epel dependency as yum cookbook v3.0 removed epel repo setup functionality
  • Several fixes to the Readme examples

Improvement

  • Use the new monitoring-plugins.org address for the Nagios Plugins during source installs
  • The version of apt defined in the Berksfile is no longer constrained
  • Find all nodes by searching by node not hostname to work around failures in ohai determining the hostname

New Feature

  • Allow defining of time periods via new data bag nagios_timeperiods. See the Readme for additional details

v5.1.0

Bug

  • COOK-3210 Contacts are now only written out if the contact has Nagios keys defined, which prevents e-mail-less contacts from being written out
  • COOK-4098 Fixed an incorrect example for using templates in the readme
  • Fixed a typo in the servicedependencies.cfg.erb template that resulted in hostgroup_name always being blank

Improvement

  • The Yum cookbook dependency has been pinned to < 3.0 to prevent breakage when the 3.0 cookbook is released
  • COOK-2389 The logic used to determine what IP to identify the monitored host by has been moved into the default library to simplify the hosts.cfg.erb template
  • A Vagrantfile has been added to allow for testing on Ubuntu 10.04/12.04 and CentOS 5.9/6.4 in multi-node setups
  • Chef spec tests have been added for the server
  • Gemfile updated to use Rubocop 0.15 and TestKitchen 1.0
  • COOK-3913 / COOK-3914 Source based installations now use Nagios 3.5.1 and the Nagios Plugins 1.5.0

New Feature

  • The names of the various data bags used in the cookbook can now be controlled with new attributes found in the server.rb attribute file
  • All configuration options in the cgi.cfg and nrpe.cfg files can now be controlled via attributes
  • COOK-3690 An intermediate SSL certificate can now be used on the web server as defined in the new attribute node['nagios']['ssl_cert_chain_file']
  • COOK-2732 A service can now be applied to multiple hostgroups via the data bag definition
  • COOK-3781 Service escalations can now be written using wildcards. See the readme for an example of this feature.
  • COOK-3702 Multiple PagerDuty keys for different contacts can be defined via a new nagios_pagerduty data bag. See the readme for more information on the new data bag and attributes for this feature.
  • COOK-3774 Services can be limited to run on nagios servers in specific chef environments by adding a new "activate_check_in_environment" key to the services data bag. See the Services section of the readme for an example.
  • CHEF-4702 Chef solo users can now use solo-search for data bag searches (https://github.com/edelight/chef-solo-search)

v5.0.2

Improvement

  • COOK-3777 - Update NRPE in nagios cookbook to 2.15
  • COOK-3021 - NRPE LWRP updates files every run
  • Fixing up to pass rubocop

v5.0.0

Bug

  • COOK-3778 - Fix missing customization points for Icinga
  • COOK-3731 - Remove range searches in Nagios cookbook that break chef-zero
  • COOK-3729 - Update Nagios Plugin download URL
  • COOK-3579 - Stop shipping icons files that aren't used
  • COOK-3332 - Fix nagios::client failures on Chef Solo

Improvement

  • COOK-3730 - Change the default authentication method
  • COOK-3696 - Sort hostgroups so they don't get updated on each run
  • COOK-3670 - Add Travis support
  • COOK-3583 - Update Nagios source to 3.5.1
  • COOK-3577 - Cleanup code style
  • COOK-3287 - Provide more customization points to make it possible to use Icinga
  • COOK-1725 - Add configurable notification options for nagios::pagerduty

New Feature

  • COOK-3723 - Support regexp_matching in Nagios
  • COOK-3695 - Add more tunables for default host template

v4.2.0

New Feature

  • COOK-3445 - Allow setting service dependencies from data bags
  • COOK-3429 - Allow setting timezone from attribute
  • COOK-3422 - Enable large installation tweaks by attribute

Improvement

  • COOK-3440 - Permit additional pagerduty-like integrations
  • COOK-3136 - Fix nagios::client_source under Gentoo
  • COOK-3111 - Add support for alternate users databag to Nagios cookbook
  • COOK-2891 - Improve RHEL 5 detection in Nagios cookbook to catch all versions
  • COOK-2721 - Add Chef Solo support

Bug

  • COOK-3405 - Fix NRPE source install on Ubuntu
  • COOK-3404 - Fix htpasswd file references (Chef 11 fix)
  • COOK-3282 - Use host_name attribute when used in conjunction with a search-defined hostgroup
  • COOK-3162 - Allow setting port
  • COOK-3140 - No longer import databag users even if they don't have an htpasswd value set
  • COOK-3068 - Use nagios_conf definition in nagios::pagerduty

v4.1.4

Bug

  • [COOK-3014]: Nagios cookbook imports data bag users even if they have action :remove

Improvement

  • [COOK-2826]: Allow Nagios cookbook to configure location of SSL files

v4.1.2

Bug

  • [COOK-2967]: nagios cookbook has foodcritic failure

Improvement

  • [COOK-2630]: Improvements to Readme and Services.cfg.erb template

New Feature

  • [COOK-2460]: create attribute for allowed_hosts

v4.1.0

  • [COOK-2257] - Nagios incorrectly tries to use cloud IPs due to a OHAI bug
  • [COOK-2474] - hosts.cfg.erb assumes if nagios server node has the cloud attributes all nodes have the cloud attributes
  • [COOK-1068] - Nagios::client should support CentOS/RHEL NRPE installs via package
  • [COOK-2565] - nginx don't send AUTH_USER & REMOTE_USER to nagios
  • [COOK-2546] - nrpe config files should not be world readable
  • [COOK-2558] - Services that are attached to hostgroups created from the nagios_hostgroups databag are not created
  • [COOK-2612] - Nagios can't start if search can't find hosts defined in nagios_hostgroups
  • [COOK-2473] - Install Nagios 3.4.4 for source installs
  • [COOK-2541] - Nagios cookbook should use node.roles instead of node.run_list.roles when calculating hostgroups
  • [COOK-2543] - Adds the ability to normalize hostnames to lowercase
  • [COOK-2450] - Add ability to define service groups through data bags.
  • [COOK-2642] - With multiple nagios servers, they can't use NRPE to check each other
  • [COOK-2613] - Install Nagios 3.5.0 when installing from source

v4.0.0

This is a major release that refactors a significant amount of the service configuration to use data bags rather than hardcoding specific checks in the templates. The README describes how to create services via data bags.

The main breaking change is that the set of default services monitored by Nagios is reduced to only the "check-nagios" service. This means that existing installations will need to start converting checks over to the new data bag entries.
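
To give a sense of what converting a check involves, here is a rough sketch that builds one service data bag item as a Ruby hash and prints it as JSON. The key names (id, hostgroup_name, command_line), the linux hostgroup and the check_ssh command line are assumptions for illustration. Confirm them against the Services section of the README for your cookbook version before relying on them.

```ruby
require 'json'

# Assumed key names (id, hostgroup_name, command_line); verify them against
# the README's Services section for the cookbook version you run.
ssh_check = {
  'id' => 'ssh',
  'hostgroup_name' => 'linux',
  # $USER1$ and $HOSTADDRESS$ are standard Nagios macros, kept literal here
  # by using single quotes.
  'command_line' => '$USER1$/check_ssh $HOSTADDRESS$'
}

item_json = JSON.pretty_generate(ssh_check)
puts item_json
```

The resulting JSON could be saved to a file and loaded with knife data bag from file, using whatever data bag name your installation has configured for services.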

  • [COOK-1553] - Nagios: check_nagios command does not work if Nagios is installed from source
  • [COOK-1554] - Nagios: The nagios server should be added to all relevant host groups
  • [COOK-1746] - nagios should provide more flexibility for server aliases
  • [COOK-2006] - Extract default checks out of nagios
  • [COOK-2129] - If a host is in the _default environment it should go into the _default hostgroup
  • [COOK-2130] - Chef needs to use the correct nagios plugin path on 64bit CentOS systems
  • [COOK-2131] - gd development packages are not necessary for NRPE installs from source
  • [COOK-2132] - Update NRPE installs to 2.14 from 2.13
  • [COOK-2134] - Handle nagios-nrpe-server and nrpe names for NRPE in the init scripts and cookbook
  • [COOK-2135] - Use with-nagios-user and group options source NRPE installs
  • [COOK-2136] - Nagios will not pass config check when multiple machines in different domains have the same hostname
  • [COOK-2150] - hostgroups data bag search doesn't respect the multi_environment_monitoring attribute
  • [COOK-2186] - add service escalation to nagios
  • [COOK-2188] - A notification interval of zero is valid but prohibited by the cookbook
  • [COOK-2200] - Templates and Services from data bags don't specify intervals in the same way as the rest of the cookbook
  • [COOK-2216] - Nagios cookbook readme needs improvement
  • [COOK-2240] - Nagios server setup needs to gracefully fail when users data bag is not present
  • [COOK-2241] - Stylesheets fail to load on a fresh Nagios install
  • [COOK-2242] - Remove unused checks in the NRPE config file
  • [COOK-2245] - nagios::server writes openid apache configs before including apache2::mod_auth_openid
  • [COOK-2246] - Most of the commands in the Nagios cookbook don't work
  • [COOK-2247] - nagios::client_source sets pkgs to a string, then tries to pkgs.each do {|pkg| package pkg }
  • [COOK-2257] - Nagios incorrectly tries to use cloud IPs due to a OHAI bug
  • [COOK-2275] - The Nagios3 download URL attribute is unused
  • [COOK-2285] - Refactor data bag searches into library
  • [COOK-2294] - Add cas authentication to nagios cookbook
  • [COOK-2295] - nagios: chef tries to start nagios-nrpe-server on every run
  • [COOK-2300] - You should be able to define a nagios_service into the "all" host group
  • [COOK-2341] - pagerduty_nagios.pl URL changed
  • [COOK-2350] - Nagios server fails to start when installed via source on Ubuntu/Debian
  • [COOK-2369] - Add LDAP support in the nagios cookbook.
  • [COOK-2374] - Setting an unmanaged host to a string returns 'no method error'
  • [COOK-2375] - Allows adding a service that utilizes a pre-existing command
  • [COOK-2433] - Nagios: ldap authentication needs to handle anonymous binding ldap servers

v3.1.0

  • [COOK-2032] - Use public IP address for inter-cloud checks and private for intra-cloud checks
  • [COOK-2081] - add support for notes_url to nagios_services data bags

v3.0.0

This is a major release due to some dramatic refactoring to the service check configuration which may not be compatible with existing implementations of this cookbook.

  • [COOK-1544] - Nagios cookbook needs to support event handlers
  • [COOK-1785] - Template causes service restart every time
  • [COOK-1879] - Nagios: add configuration to automatically redirect http://myserver/ to http://myserver/nagios3/
  • [COOK-1880] - Extra attribute was left over after the multi_environment_monitoring update
  • [COOK-1881] - Oracle should be added to the metadata for Nagios
  • [COOK-1891] - README says to modify the nrpe.cfg template, but the cookbook exports a resource for nrpe checks.
  • [COOK-1947] - Nagios: Pager duty portions of Nagios cookbook not using nagios user/group attributes
  • [COOK-1949] - Nagios: A bad role on a node shouldn't cause the cookbook to fail
  • [COOK-1950] - Nagios: Simplify hostgroup building and cookbook code
  • [COOK-1995] - Nagios: Update source install to use Nagios 3.4.3 not 3.4.1
  • [COOK-2005] - Remove unusable check commands from nagios
  • [COOK-2031] - Adding templates as a data bag, extending service data bag to take arbitrary config items
  • [COOK-2032] - Use public IP address for intra-cloud checks
  • [COOK-2034] - Nagios cookbook calls search more often than necessary
  • [COOK-2054] - Use service description in the nagios_services databag items
  • [COOK-2061] - template.erb refers to a service variable when it should reference template.

v2.0.0

  • [COOK-1543] - Nagios cookbook needs to be able to monitor environments
  • [COOK-1556] - Nagios: Add ability to define service template to be used in the nagios_services data bag
  • [COOK-1618] - Users data bag group allowed to log into Nagios should be configurable
  • [COOK-1696] - Nagios: Support defining non-Chef managed hosts via data bag items
  • [COOK-1697] - nagios: Source installs should install the latest NRPE and Nagios plugins
  • [COOK-1717] - Nagios: nagios server web page under Apache2 fails to load out of the box
  • [COOK-1723] - Amazon missing as a supported OS in the Nagios metadata
  • [COOK-1732] - nagios::client_source includes duplicate resources
  • [COOK-1815] - Switch Nagios to use platform_family not platform
  • [COOK-1816] - Nagios: mod ssl shouldn't get installed if SSL isn't being used
  • [COOK-1887] - value_for_platform_family use in Nagios cookbook is broken

v1.3.0

  • [COOK-715] - don't source /etc/sysconfig/network on non-RHEL platforms
  • [COOK-769] - don't use nagios specific values in users data bag items if they don't exist
  • [COOK-1206] - add nginx support
  • [COOK-1225] - corrected inconsistencies (mode, user/group, template headers)
  • [COOK-1281] - add support for amazon linux
  • [COOK-1365] - nagios_conf does not use nagios user/group attributes
  • [COOK-1410] - remove deprecated package resource
  • [COOK-1411] - Nagios server source installs should not necessarily install the NRPE client from source
  • [COOK-1412] - Nagios installs from source do not install a mail client so notifications fail
  • [COOK-1413] - install nagios 3.4.1 instead of 3.2.3
  • [COOK-1518] - missing sysadmins variable in apache recipe
  • [COOK-1541] - support environments that have windows systems
  • [COOK-1542] - allow setting flap detection via attribute
  • [COOK-1545] - add support for defining host groups using search in data bags
  • [COOK-1553] - check_nagios command doesn't work from source install
  • [COOK-1555] - include service template for monitoring logs
  • [COOK-1557] - check-nagios command only works in environments with single nagios server
  • [COOK-1587] - use default attributes instead of normal in cookbook attributes files

v1.2.6

  • [COOK-860] - set mail command with an attribute by platform

v1.2.4

  • [COOK-1119] - attributes for command_timeout / dont_blame_nrpe options
  • [COOK-1120] - allow monitoring from servers in multiple chef_environments

v1.2.2

  • [COOK-991] - NRPE LWRP No Longer Requires a Template
  • [COOK-955] - Nagios Service Checks Defined by Data Bags

v1.2.0

  • [COOK-837] - Adding a Recipe for PagerDuty integration
  • [COOK-868] - use node, not @node in template
  • [COOK-869] - corrected NRPE PID path
  • [COOK-907] - LWRP for defining NRPE checks
  • [COOK-917] - changes to mod_auth_openid module

v1.0.4

  • [COOK-838] - Add HTTPS Option to Nagios Cookbook

v1.0.2

  • [COOK-636] - Nagios server recipe attempts to start too soon
  • [COOK-815] - Nagios Config Changes Kill Nagios If Config Goes Bad

v1.0.0

  • Use Chef 0.10's node.chef_environment instead of node['app_environment'].
  • source installation support on both client and server sides
  • initial RHEL/CentOS/Fedora support

Foodcritic Metric
            

6.0.4 failed this metric

FC002: Avoid string interpolation where not required: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/apache.rb:47
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:108
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:110
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:123
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:132
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:173
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:177
FC003: Check whether you are running with chef server before using server-specific features: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:181
FC015: Consider converting definition to a LWRP: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/definitions/nagios_conf.rb:1
FC023: Prefer conditional attributes: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/default.rb:197
FC023: Prefer conditional attributes: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/nginx.rb:19
FC023: Prefer conditional attributes: /tmp/cook/b9241efa4231f7743e7ee4ad/nagios/recipes/server_source.rb:170