Auto Alert when a Kubernetes Job is Complete/Failed (with examples)

Updated: January 31, 2024 By: Guest Contributor

Introduction

Kubernetes has become the de facto standard for orchestrating containerized applications. While it manages the lifecycle of applications for you, teams frequently need to know the outcome of their Kubernetes Jobs: whether they completed successfully or failed. This tutorial walks through setting up automatic alerts for Job completions and failures, giving your operations and development teams a tighter feedback loop.

Prerequisites

  • A Kubernetes cluster with kubectl access
  • An understanding of Kubernetes concepts such as Jobs, Pods, and Events
  • An alerting mechanism (e.g., email, Slack bot, etc.)

Basic Notification with kubectl

Let’s start with the basics. You can manually check the status of a job using kubectl:

kubectl get jobs my-job -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'
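The one-liner above only reports the Complete condition, so it prints nothing for a Job that has failed. A minimal sketch that checks both conditions might look like the following. The function names job_condition and job_state are hypothetical, and the sketch assumes kubectl is already pointed at the right cluster.

```shell
#!/bin/bash
# Hypothetical helper: print the status ("True", "False", or empty)
# of one condition type on a Job.
job_condition() {
    local job="$1" namespace="$2" type="$3"
    kubectl get job "$job" -n "$namespace" \
        -o jsonpath="{.status.conditions[?(@.type==\"$type\")].status}"
}

# Print "succeeded", "failed", or "running" for a Job.
# "running" here means neither condition is set yet.
job_state() {
    local job="$1" namespace="${2:-default}"
    if [ "$(job_condition "$job" "$namespace" Complete)" = "True" ]; then
        echo "succeeded"
    elif [ "$(job_condition "$job" "$namespace" Failed)" = "True" ]; then
        echo "failed"
    else
        echo "running"
    fi
}
```

Calling job_state my-job default prints a single word, which is easier to branch on in a script than the raw condition status.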

To automate this, you can run a simple shell script that watches your job and sends an alert when the job completes or fails:

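#!/bin/bash

JOB_NAME="my-job"
NAMESPACE="default"
ALERT_EMAIL="alerts@example.com"  # placeholder; set to your own address

# Print the status of one condition type ("Complete" or "Failed") on the Job
condition_status() {
    kubectl get job "$JOB_NAME" -n "$NAMESPACE" -o jsonpath="{.status.conditions[?(@.type==\"$1\")].status}"
}

while true; do
    if [ "$(condition_status Complete)" = "True" ]; then
        echo "Job $JOB_NAME completed successfully" | mail -s "Job $JOB_NAME Completed" "$ALERT_EMAIL"
        break
    elif [ "$(condition_status Failed)" = "True" ]; then
        echo "Job $JOB_NAME failed" | mail -s "Job $JOB_NAME Failed" "$ALERT_EMAIL"
        break
    fi
    sleep 10
done

This script checks every 10 seconds until the Job reports either a Complete or a Failed condition, then sends one email and exits. Set ALERT_EMAIL to the contact point of your actual alert mechanism.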
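If your alert mechanism is Slack rather than email, the notification step can be swapped for a webhook call. The sketch below assumes a Slack incoming webhook whose URL is stored in an environment variable named SLACK_WEBHOOK_URL. That variable and the function names slack_payload and notify_slack are illustrative choices, not part of any existing tool.

```shell
#!/bin/bash
# Hypothetical helper: build the JSON body for a Slack incoming webhook.
# Escapes backslashes and double quotes; assumes a single-line message.
slack_payload() {
    local text
    text=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"text":"%s"}' "$text"
}

# Post a message to the webhook URL in SLACK_WEBHOOK_URL (assumed to be set).
notify_slack() {
    curl -sS -X POST -H 'Content-Type: application/json' \
        --data "$(slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}
```

In the polling script, a call such as notify_slack "Job my-job failed" would then take the place of the mail command.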

Event-driven Notification with Kubernetes Events

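Next, we can make use of Kubernetes Events to trigger notifications. The Job controller records a Normal event when a Job completes and Warning events when it fails, so do not filter on type if you want to see both outcomes:

kubectl get events --field-selector involvedObject.kind=Job,involvedObject.name=my-job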
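To turn the event list into something an alert can act on, the event reason can be mapped to an outcome. The sketch below is one way to do that under stated assumptions: Completed, BackoffLimitExceeded, and DeadlineExceeded are among the reasons the Job controller emits, while the function names and the three level strings are hypothetical.

```shell
#!/bin/bash
# Map a Job event reason to an alert level. Any reason not listed
# here is treated as informational.
classify_job_event() {
    case "$1" in
        Completed) echo "success" ;;
        BackoffLimitExceeded|DeadlineExceeded) echo "failure" ;;
        *) echo "info" ;;
    esac
}

# Hypothetical wrapper: print "<level> <reason>" for each event on a Job.
job_event_levels() {
    local job="$1" namespace="${2:-default}" reason
    kubectl get events -n "$namespace" \
        --field-selector "involvedObject.kind=Job,involvedObject.name=$job" \
        -o jsonpath='{range .items[*]}{.reason}{"\n"}{end}' |
    while read -r reason; do
        if [ -n "$reason" ]; then
            echo "$(classify_job_event "$reason") $reason"
        fi
    done
}
```

Running job_event_levels my-job prints one line per event, so a caller can grep for lines that start with failure.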

You can pair this mechanism with event-watching tools such as kubewatch to send notifications to Slack, SMTP endpoints, Webhooks, and more. Install kubewatch and configure it to watch for job events:

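# Install kubewatch using Helm (Helm 3 requires a release name; the legacy
# "stable" chart repository is archived, so substitute a repository that
# currently hosts the kubewatch chart if needed)
helm install kubewatch stable/kubewatch --set resourcesToWatch.job=true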

# Configure kubewatch to send notifications to Slack
kubectl edit configmap kubewatch -n default

Add your Slack webhook URL to the kubewatch configuration under the slack field. With this setup, you’ll receive a Slack notification whenever kubewatch observes a Job event.

Advanced Monitoring with Prometheus and Alertmanager

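If you are using Prometheus to monitor your Kubernetes cluster, you can use Alertmanager for more complex alerting rules. The kube_job_status_* metrics used below are exposed by kube-state-metrics, so first make sure it is deployed and that Prometheus scrapes it (Prometheus service discovery has no job role):

# Scrape kube-state-metrics; the Service name below is an assumption
- job_name: 'kube-state-metrics'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name]
    regex: kube-state-metrics
    action: keep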

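Next, define alert rules for job completion and failure. Note that kube_job_status_failed and kube_job_status_succeeded count failed and succeeded Pods, so JobFailed also fires for a Job that retried and eventually succeeded:

groups:
- name: job-status
  rules:
  - alert: JobFailed
    expr: kube_job_status_failed > 0
    for: 1m
    annotations:
      summary: Job {{ $labels.namespace }}/{{ $labels.job_name }} has failed
  - alert: JobCompleted
    expr: kube_job_status_succeeded > 0
    for: 1m
    annotations:
      summary: Job {{ $labels.namespace }}/{{ $labels.job_name }} has completed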

Then configure Alertmanager to send notifications through your chosen method:

global:
  resolve_timeout: 5m
route:
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://example.com'

Replace ‘http://example.com’ with the actual webhook that will handle the alert.
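Before relying on the route, it can help to push a synthetic alert through Alertmanager and confirm that the webhook receives it. The sketch below posts to Alertmanager's v2 alerts endpoint. The ALERTMANAGER_URL variable, the function names, and the label values are assumptions made for illustration.

```shell
#!/bin/bash
# Build a one-element alert list for Alertmanager's v2 API.
# Label and annotation values are illustrative only.
test_alert_payload() {
    local alertname="$1" job_name="$2"
    printf '[{"labels":{"alertname":"%s","job_name":"%s"},"annotations":{"summary":"Synthetic test alert"}}]' \
        "$alertname" "$job_name"
}

# Post the synthetic alert. ALERTMANAGER_URL is an assumption,
# for example http://localhost:9093 after a port-forward.
send_test_alert() {
    curl -sS -X POST -H 'Content-Type: application/json' \
        --data "$(test_alert_payload "$1" "$2")" \
        "$ALERTMANAGER_URL/api/v2/alerts"
}
```

Running send_test_alert JobFailed my-job should cause the configured receiver to get a notification if routing is set up as intended.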

Conclusion

In this tutorial, we explored several methods to automatically alert stakeholders when a Kubernetes Job completes or fails: a polling script built on kubectl, event-driven notifications with kubewatch, and metric-based alerts with Prometheus and Alertmanager. By integrating these solutions into your workflow, operations and development teams can collaborate better and address issues quickly, streamlining deployment and monitoring.