Now and again, Debian package repositories change their signing key. When that happens, apt-get update usually starts failing. The problem, however, is that Icinga's check_apt will not notice.
The technical reason is that apt-get update needs root rights in order to update its index. check_apt cannot do that, since the Icinga check runs as the nagios (or icinga) user, which doesn't have those rights.
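The permission problem can be made visible with a small test. This is only a sketch: the function name can_write_apt_lists is made up for illustration, and it assumes apt keeps its index in /var/lib/apt/lists, which is the Debian default. A writable directory is a rough stand-in for what apt-get update actually needs (it also takes a lock in that directory).

```shell
#!/bin/sh
# Sketch: report whether the current user could write apt's index directory.
# can_write_apt_lists is a hypothetical helper, not part of apt or Icinga.
# The directory argument defaults to /var/lib/apt/lists (Debian's default).
can_write_apt_lists() {
    dir=${1:-/var/lib/apt/lists}
    # succeeds only if the directory exists and is writable by this user
    [ -d "$dir" ] && [ -w "$dir" ]
}
```

Run as the nagios user on a stock Debian system, `can_write_apt_lists || echo "no write access"` would be expected to print the message, while as root the test succeeds.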
Here are the components needed to monitor whether apt-get update works. They get deployed by an Ansible role, but can also be used standalone.
$ cd roles/icinga2/check-apt-update
$ find
files
files/icinga2
files/icinga2/zones.d
files/icinga2/zones.d/global
files/icinga2/zones.d/global/apt-check-update.conf
files/check_apt_update
files/sudo
files/sudo/check-apt-update
tasks
tasks/main.yml
$ cat files/icinga2/zones.d/global/apt-check-update.conf
object CheckCommand "apt_update" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_apt_update" ]
}

template Service "apt-update-service" {
  import "generic-service"
  # only check once a day if `apt-get update` works
  # since this is an expensive operation
  check_interval = 24h
  check_command = "apt_update"
}

# check should be active in all zones that do not
# explicitly disable it
apply Service "apt-update" {
  import "apt-update-service"
  assign where !regex("apt-update", host.vars.exclude_services)
}
$ cat ./files/check_apt_update
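#!/bin/sh
# Icinga/Nagios plugin: CRITICAL if `apt-get update` fails or complains
tmp=$( mktemp )
sudo apt-get update > "$tmp" 2>&1
rc=$?
# W: and E: are apt's warning and error summaries; Err marks a failed
# download ("Err:1 ..." in newer apt releases, "Err ..." in older ones).
# The exit status is checked too, so a failing sudo is not reported as OK.
if [ "$rc" -ne 0 ] || grep -E '^(W|E): |^Err' "$tmp"; then
    echo "CRITICAL - there was a problem with apt-get update"
    EXIT=2 # CRITICAL
else
    echo "OK - apt-get update executed successfully"
    EXIT=0 # OK
fi
rm -f "$tmp"
exit $EXIT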
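The detection logic of the plugin can be tried out without root rights or network access by feeding it captured apt-get update output. The following is a minimal sketch, not part of the original plugin: the helper name apt_update_status is made up, the UNKNOWN (3) branch for an unreadable file is an addition, and the pattern assumes that apt reports problems on lines starting with "W: ", "E: " or "Err".

```shell
#!/bin/sh
# Sketch: classify captured `apt-get update` output without running apt.
# apt_update_status FILE (hypothetical helper)
#   prints a one-line plugin status and returns the matching plugin
#   exit code: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN (FILE not readable).
apt_update_status() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "UNKNOWN - cannot read $log"
        return 3
    fi
    # assumption: warnings/errors start with "W: " or "E: ",
    # failed downloads with "Err"
    if grep -Eq '^(W|E): |^Err' "$log"; then
        echo "CRITICAL - there was a problem with apt-get update"
        return 2
    fi
    echo "OK - apt-get update executed successfully"
    return 0
}
```

Saving the output of a failing run to a file and calling `apt_update_status that-file` is a quick way to check that a new kind of apt message is actually caught before relying on the monitoring.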
$ cat ./files/sudo/check-apt-update
# Allow the nagios user to execute `apt-get update`
nagios ALL=NOPASSWD: /usr/bin/apt-get update
$ cat ./tasks/main.yml
- block:
    - name: install check_apt_update
      copy:
        src: check_apt_update
        dest: /usr/lib/nagios/plugins
        mode: +x

    - name: allow check_apt_update to run apt-get update
      copy:
        src: sudo/check-apt-update
        dest: /etc/sudoers.d
        mode: "0440"

    # we are assuming here that the icinga master host is called `sysmon`
    - name: add icinga2 definition for check_apt_update to sysmon
      copy:
        src: icinga2/zones.d/global/apt-check-update.conf
        dest: /etc/icinga2/zones.d/global/
      delegate_to: sysmon
      register: check_apt_update_install

    - name: reload icinga2
      service: name=icinga2 state=reloaded
      delegate_to: sysmon
      when: check_apt_update_install.changed

  when: ansible_os_family == 'Debian'
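For standalone use without Ansible, the two files that belong on the monitored host could be installed by hand roughly like this. It is a sketch under assumptions: the destination paths are the ones from the tasks above, while the function name install_check_apt_update and its optional second argument (a staging prefix for trying things out) are made up for illustration.

```shell
#!/bin/sh
# Sketch: manual equivalent of the first two tasks of the role.
# install_check_apt_update SRCDIR [PREFIX]   (hypothetical helper)
#   SRCDIR is the role's files/ directory; PREFIX is an optional staging
#   directory, empty means the live system (which requires root).
install_check_apt_update() {
    src=$1
    prefix=${2:-}
    mkdir -p "$prefix/usr/lib/nagios/plugins" "$prefix/etc/sudoers.d" || return 1
    # the plugin must be executable by the monitoring user
    install -m 0755 "$src/check_apt_update" \
        "$prefix/usr/lib/nagios/plugins/check_apt_update" || return 1
    # files in sudoers.d are conventionally mode 0440
    install -m 0440 "$src/sudo/check-apt-update" \
        "$prefix/etc/sudoers.d/check-apt-update"
}
```

Called as `install_check_apt_update files` by root, this covers the monitored host only. The Icinga definition still has to be copied to the master and icinga2 reloaded there. Checking the installed rule with `visudo -cf /etc/sudoers.d/check-apt-update` is a sensible extra step.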
If you feel like creating an Ansible collection for this and putting it on GitHub and Ansible Galaxy, then please feel welcome to do so. It'd be kind if you let me know and if the repo noted the origin of the code in the credits. The code is under the GPL. If you think that should be different, then let me know why.
Happy sysadmin’ing.