Builtin Services used by Watchers

Default service

class watchghost.services.Service(name)

A Service is responsible for checking something.

A Watcher is a Service applied to a server or a group, with custom attributes.

Any watcher must have the following attributes:

  • service: the service class name.
  • server or group: the server name or group name.
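
As a rough illustration of the two required attributes, the sketch below checks a watcher definition before it is used. The name `validate_watcher` is hypothetical and not part of WatchGhost, and the sketch assumes that supplying either `server` or `group` is sufficient.

```python
def validate_watcher(watcher):
    """Return a list of problems found in a watcher definition (empty if none).

    Hypothetical helper, not part of WatchGhost.
    """
    problems = []
    if not watcher.get("service"):
        problems.append("missing 'service'")
    # Assumption: at least one of 'server' or 'group' must be present.
    if not (watcher.get("server") or watcher.get("group")):
        problems.append("missing 'server' or 'group'")
    return problems


print(validate_watcher({"service": "network.Ping", "group": "postgres"}))  # []
print(validate_watcher({"description": "Ping IPv4"}))
```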

Any watcher can have the following attributes:

  • description: a string representing the watcher (default: None).
  • repeat: the time period between two checks, in seconds (default: 3600).
  • after: the time of day at which the checks must start (default: “00:00:00”).
  • before: the time of day at which the checks must stop (default: “23:59:59”).
  • retry: the number of checks that must give the same result before the state is declared hard (default: 2).
  • retry_interval: the time period between two checks when the state is not hard, in seconds (default: 15).
  • status: a mapping between statuses and filters that trigger these statuses.
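
One way to read `repeat`, `retry`, `retry_interval`, `after` and `before` together is sketched below. It assumes that the state becomes hard once the last `retry` results are identical, that the next check is scheduled after `retry_interval` seconds while the state is not hard and after `repeat` seconds otherwise, and that the `after`/`before` window does not cross midnight. The function names are illustrative, not WatchGhost's API.

```python
from datetime import time


def is_hard(results, retry=2):
    """True when the last `retry` check results are all identical."""
    if len(results) < retry:
        return False
    return len(set(results[-retry:])) == 1


def next_delay(results, repeat=3600, retry=2, retry_interval=15):
    """Seconds to wait before the next check (assumed scheduling rule)."""
    return repeat if is_hard(results, retry) else retry_interval


def in_window(now, after="00:00:00", before="23:59:59"):
    """True when `now` (a datetime.time) lies between `after` and `before`."""
    return time.fromisoformat(after) <= now <= time.fromisoformat(before)


print(next_delay(["info", "error"]))           # 15
print(next_delay(["info", "error", "error"]))  # 3600
print(in_window(time(8, 30), after="09:00:00", before="18:00:00"))  # False
```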

Example:

[
  {
    "service": "network.Ping",
    "group": "postgres",
    "description": "Ping IPv4",
    "ip_version": 4
  },
  {
    "service": "network.HTTP",
    "server": "ceres",
    "description": "HTTP",
    "url": "http://test.org:8888/",
    "status": {"warning": [{"code": 404}]}
  }
]

This example defines two watchers. The first one pings the IPv4 address of every server in the postgres group. The second one fetches the “http://test.org:8888/” page on the “ceres” server and reports a warning status when the HTTP status code is 404; any other response is handled by the HTTP service’s default configuration.
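
The fallback to the default configuration can be pictured as the watcher's own `status` filters being tried first, followed by the service defaults. The sketch below is an assumption about that behaviour, not WatchGhost's actual code: `matches` and `resolve_status` are hypothetical names, a filter is taken to match when all of its keys equal the corresponding values of the check result, and the defaults are abbreviated from the HTTP section.

```python
def matches(filter_, result):
    """A filter matches when each of its key/value pairs appears in the result.

    An empty filter ({}) therefore matches any result.
    """
    return all(result.get(key) == value for key, value in filter_.items())


def resolve_status(result, *rule_sets):
    """Return the first status whose filters match, trying rule sets in order."""
    for rules in rule_sets:
        for status, filters in rules:
            if any(matches(f, result) for f in filters):
                return status
    return None


# Filters from the second watcher of the example above.
watcher_rules = [("warning", [{"code": 404}])]
# Abbreviated service defaults (see the HTTP section for the full list).
default_rules = [
    ("error", [{"code": i} for i in range(400, 432)]),
    ("info", [{"code": i} for i in range(200, 226)]),
    ("critical", [{}]),
]

print(resolve_status({"code": 404}, watcher_rules, default_rules))  # warning
print(resolve_status({"code": 403}, watcher_rules, default_rules))  # error
```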

Ping

Default configuration:

{
    "repeat": 60,
    "timeout": 3,
    "ip_version": 4,
}

  • timeout: time allowed before throwing a timeout
  • ip_version: version of the IP protocol (4 or 6)

HTTP

Default configuration:

{
    "repeat": 60,
    "timeout": 5,
    "url": "",
    "ip_version": 4,
    "status": {[
        ("error", [{"code": i} for i in range(400, 432)]),
        ("warning", [{"code": i} for i in range(300, 308)]),
        ("info", [{"code": i} for i in range(200, 226)]),
        ("critical", [{}]),
    ]},
}

  • timeout: time allowed before throwing a timeout
  • url: the URL checked by the watcher
  • ip_version: version of the IP protocol (4 or 6)
  • status: defines the status to report based on the HTTP status code of the response
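
To make the default ranges concrete: because the upper bound of `range` is exclusive, the defaults cover codes 400 to 431 (error), 300 to 307 (warning) and 200 to 225 (info). The sketch below assumes that the entries are evaluated in the order listed and that the empty filter `{}` matches anything, so every other code ends up as critical. `classify` is a hypothetical helper, not a WatchGhost function.

```python
HTTP_DEFAULT_STATUS = [
    ("error", [{"code": i} for i in range(400, 432)]),
    ("warning", [{"code": i} for i in range(300, 308)]),
    ("info", [{"code": i} for i in range(200, 226)]),
    ("critical", [{}]),
]


def classify(code, rules=HTTP_DEFAULT_STATUS):
    """Return the first status with a filter matching the response code."""
    response = {"code": code}
    for status, filters in rules:
        for filter_ in filters:
            # Every key of the filter must match; {} matches everything.
            if all(response.get(k) == v for k, v in filter_.items()):
                return status
    return None


for code in (200, 301, 404, 500):
    print(code, classify(code))
```

Under these assumptions, 200 maps to info, 301 to warning, 404 to error and 500 to critical.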

FTP

Default configuration:

{
    "repeat": 60,
    "timeout": 5,
    "ip_version": 4,
    "url": "",
}

  • timeout: time allowed before throwing a timeout
  • url: the URL checked by the watcher
  • ip_version: version of the IP protocol (4 or 6)

SecuredSocket

Verifies the validity of a certificate.

Default configuration:

{
    "repeat": 60,
    "ip_version": 4,
    "hostname": "",
    "port": 443,
    "minimum_days_left": 30,
    "status": {[
        ["error", [
            {"hostname_verified": False},
            {"in_period": False},
            {"connected": False},
        ]],
        ["warning", [{"enough_days_left": False}]],
        ["info", [{}]],
    ]},
}

  • ip_version: version of the IP protocol (4 or 6)
  • hostname: name of the host the certificate belongs to
  • port: port serving the certificate
  • minimum_days_left: number of days of validity left in the certificate before an alert
  • status: error, warning or info can be raised depending on the following checks:
      • hostname_verified: true if the hostname is actually verified by the certificate
      • in_period: true if the certificate is within its period of validity
      • connected: true if the certificate can be reached
      • enough_days_left: true if there are more days of validity left than minimum_days_left
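
The two date-based flags can be sketched as follows, assuming the certificate's validity bounds are already available as timezone-aware datetimes. `certificate_flags` is a hypothetical helper, not WatchGhost's implementation; `hostname_verified` and `connected` depend on the TLS connection itself and are left out.

```python
from datetime import datetime, timedelta, timezone


def certificate_flags(not_before, not_after, now, minimum_days_left=30):
    """Compute the date-based flags for a certificate validity period."""
    days_left = (not_after - now).days
    return {
        "in_period": not_before <= now <= not_after,
        # Strictly more days left than the configured minimum.
        "enough_days_left": days_left > minimum_days_left,
    }


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
flags = certificate_flags(
    not_before=now - timedelta(days=60),
    not_after=now + timedelta(days=10),
    now=now,
)
print(flags)  # {'in_period': True, 'enough_days_left': False}
```

With the default status mapping, this result would match the warning filter, since the certificate is still valid but has fewer than `minimum_days_left` days remaining.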

SSH

Default configuration:

{
    "command": [],
    "status": {[
        ("error", [{"exit_code": 2}]),
        ("warning", [{"exit_code": 1}]),
        ("info", [{"exit_code": 0}]),
    ]},
}

  • command: the command and the parameters to be executed on the remote server
  • exit_code: the exit code returned by the command

Shell

Default configuration:

{
    "timeout": 10,
    "command": "",
    "status": {[
        ("error", [{"return_code": 2}]),
        ("warning", [{"return_code": 1}]),
        ("info", [{"return_code": 0}]),
        ("critical", [{}]),
    ]},
}

  • timeout: time allowed before throwing a timeout
  • command: the command executed on the watchghost server
  • return_code: the code returned by the command
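
A minimal sketch of such a check, using only the standard library, is shown below. It is not WatchGhost's implementation: `run_check` is a hypothetical name, the command is assumed to run through the system shell, the filters are assumed to be evaluated in the order listed, and a timeout is assumed to produce no return code so that only the catch-all `{}` filter matches.

```python
import subprocess

SHELL_DEFAULT_STATUS = [
    ("error", [{"return_code": 2}]),
    ("warning", [{"return_code": 1}]),
    ("info", [{"return_code": 0}]),
    ("critical", [{}]),
]


def run_check(command, timeout=10, rules=SHELL_DEFAULT_STATUS):
    """Run `command` in a shell and map its return code to a status."""
    try:
        completed = subprocess.run(command, shell=True, timeout=timeout)
        response = {"return_code": completed.returncode}
    except subprocess.TimeoutExpired:
        # Assumption: a timeout yields no return code, so only {} can match.
        response = {"return_code": None}
    for status, filters in rules:
        for filter_ in filters:
            if all(response.get(k) == v for k, v in filter_.items()):
                return status
    return None


print(run_check("exit 1"))  # warning
```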