Bosun SSL Certs Expiring


Example

This data is collected by httpunit and scollector. The alert warns when a certificate is going to expire within a certain number of days, and goes critical once the certificate has passed its expiration date. This follows the recommended default usage of warn and crit in Bosun (warn: something is going to fail, crit: something has failed).
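The warn/crit convention can be sketched in a few lines of Python. This is only an illustration of the idea, not Bosun code: the function name `cert_status` and the default 50-day warning window are assumptions chosen to match the alert definition further down.

```python
def cert_status(days_left, warn_days=50):
    """Classify a certificate by the number of days left until it expires.

    crit: the cert has already expired (something has failed).
    warn: the cert expires within warn_days (something is going to fail).
    """
    if days_left <= 0:
        return "critical"
    if days_left <= warn_days:
        return "warning"
    return "normal"


# Hypothetical sample values: far from expiry, close to expiry, already expired.
for days in (120, 14.5, -2):
    print(days, cert_status(days))
```

The critical check comes first because an expired cert also satisfies the warning condition; Bosun likewise reports the more severe state when both expressions are true.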

Template Definition

template ssl.cert.expiring {
    subject = {{.Last.Status}}: SSL Cert Expiring in {{.Eval .Alert.Vars.daysLeft | printf "%.2f"}} Days for {{.Group.url_host}}
    body = `
    {{ template "header" . }}
    <table>
       <tr>
            <td>Url</td>
            <td>{{.Group.url_host}}</td>
       </tr>
       <tr>
            <td>IP Address Used for Test</td>
            <td>{{.Group.ip}}</td>
       </tr>
       <tr>
            <td>Days Remaining</td>
            <td>{{.Eval .Alert.Vars.daysLeft | printf "%.2f"}}</td>
       </tr>
       <tr>
            <td>Expiration Date</td>
            <td>{{.Last.Time.Add (parseDuration (.Eval .Alert.Vars.hoursLeft | printf "%vh")) }}</td>
       </tr>
    </table>
    `
}
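The "Expiration Date" row rebuilds an absolute date by adding the remaining hours to the timestamp of the last event (`.Last.Time`). A minimal Python sketch of that same arithmetic follows; the function name and the sample values are made up for illustration, and the result is only as precise as the moment at which the remaining hours were evaluated.

```python
from datetime import datetime, timedelta, timezone


def expiration_date(last_time, hours_left):
    """Add the remaining hours to the last event time, as the template does."""
    return last_time + timedelta(hours=hours_left)


# Hypothetical sample: last event at noon UTC with 36.5 hours remaining.
last_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expiration_date(last_time, 36.5))  # 2024-01-03 00:30:00+00:00
```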

Alert Definition

alert ssl.cert.expiring {
    template = ssl.cert.expiring
    ignoreUnknown = true
    $notes = This alert exists to notify us of any SSL certs that will be expiring for hosts monitored by our httpunit test cases defined in the scollector configuration file.
    $expireEpoch = last(q("min:hu.cert.expires{host=ny-bosun01,url_host=*,ip=*}", "1h", ""))
    $hoursLeft = ($expireEpoch - epoch()) / d("1h")
    $daysLeft = $hoursLeft / 24
    warn = $daysLeft <= 50
    crit = $daysLeft <= 0
    warnNotification = default
    critNotification = default
}

Alert Explanation

  • q(..) (func doc) queries OpenTSDB, one of Bosun's supported backends. It returns a type called a seriesSet (a set of time series, each identified by its tags).
  • last() (func doc) takes the last value of each series in the seriesSet and returns a numberSet.
  • The metric hu.cert.expires returns the Unix timestamp of when the cert will expire.
  • epoch() (func doc) returns the current Unix timestamp, so subtracting it from the expiration epoch gives the remaining time in seconds.
  • d() (func doc) returns the number of seconds represented by the duration string, which uses the same units as OpenTSDB. Dividing the remaining seconds by d("1h") therefore converts them to hours.
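The steps above can be traced by hand with a small Python sketch. It only models the data flow, not Bosun's implementation: the dictionary layout for the seriesSet, the helper names, and the sample hosts and timestamps are all assumptions made for illustration.

```python
SECONDS_PER_HOUR = 3600  # the number of seconds that d("1h") stands for


def last(series_set):
    """Reduce each series to its most recent value (seriesSet -> numberSet)."""
    return {
        tags: max(points, key=lambda point: point[0])[1]
        for tags, points in series_set.items()
    }


def time_left(expire_epoch, now_epoch):
    """Return (hours_left, days_left) using the same arithmetic as the alert."""
    hours_left = (expire_epoch - now_epoch) / SECONDS_PER_HOUR
    return hours_left, hours_left / 24


# Hypothetical "current" Unix time, standing in for epoch().
now = 1_700_000_000

# One series per (url_host, ip) tag pair; each point is (timestamp, value),
# where the value is the cert's expiration time as a Unix timestamp.
series_set = {
    ("example.com", "192.0.2.10"): [
        (now - 120, now + 30 * 86400),
        (now - 60, now + 30 * 86400),
    ],
    ("expired.example.com", "192.0.2.20"): [
        (now - 60, now - 12 * 3600),
    ],
}

for tags, expire_epoch in last(series_set).items():
    print(tags, time_left(expire_epoch, now))
```

With these sample values the first host has 720 hours (30 days) left, and the second has -12 hours (-0.5 days), meaning its cert expired half a day ago. Because the result is computed per tag pair, each url_host and ip combination gets its own warn or crit state.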

Notification Preview

[Image: preview of the rendered notification]

Example Section of scollector.toml referencing the config for httpunit test cases:

[[HTTPUnit]]
  TOML = "/opt/httpunit/data/httpunit.toml"