Advanced Kubernetes Health Checking with Edge Stack

January 9, 2024 | 9 min read

Active monitoring to make sure your services are healthy and prevent your users from seeing errors

Edge Stack is a powerful Kubernetes-native API Gateway powered by Envoy Proxy that focuses on providing an ingress solution that is easy to use while still providing robust configuration. One of the many areas where Edge Stack can help eliminate friction for you and your users is through its advanced health-checking configuration options.

With Edge Stack as your API Gateway, you can configure health checks to run directly against your Kubernetes service's pods. These intelligent health checks track your application's responses to determine which pods are healthy and which ones need a break from incoming traffic.

Native Options in Kubernetes

Before getting into how to configure health checking in Edge Stack, let's review the tools Kubernetes provides to ensure your services are available and healthy.

The most basic setup in Kubernetes occurs when you create a new Deployment. Kubernetes marks Pods that are starting up as being in the 'Pending' phase. Once a Pod's containers have started without crashing, Kubernetes sets the phase to 'Running' and, if no probes are configured, considers the Pod ready and begins routing traffic to it. Kubernetes will also restart any containers that have shut down or crashed in an attempt to keep every Pod up and ready to handle traffic.

There are already a couple of flaws with this. The first one many developers run into is that Kubernetes only ensures that the containers inside the Pod are created and started successfully. It does not inherently understand anything about the state of the application running inside the container, so it might send traffic to a Pod before the application inside is ready. When applications are starting up, they might need to handle internal setup or initialize connections to databases and other services. Without extra configuration, Kubernetes won't know when the application is ready to start handling requests. This can be solved by adding probes.

Kubernetes Probes

There are three types of probes that you can configure on your Deployment to let Kubernetes know more about the state of your application within the Pod(s). Without any of these probes, Kubernetes will treat every Pod as ready as soon as its containers are running and start routing traffic to it.

Readiness Probe

Lets Kubernetes know when your application is ready. Kubernetes will make requests against the configured readiness probe until it passes before routing any traffic to that Pod. You can configure a readiness endpoint on your application to keep track of the startup/initialization state and only return a 200 OK response when everything is ready.
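As a rough illustration, a readiness probe on a Deployment might look like the sketch below. The Deployment name, image, port, and `/ready` path are placeholders chosen for this example, not values from Edge Stack or from any real application.

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-http            # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-http
  template:
    metadata:
      labels:
        app: example-http
    spec:
      containers:
      - name: app
        image: registry.example.com/example-http:1.0   # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready         # assumed endpoint that returns 200 OK once initialization is done
            port: 8080
          periodSeconds: 5       # probe every 5 seconds
          failureThreshold: 3    # mark the Pod not ready after 3 consecutive failures
```

Until the probe passes, the Pod is left out of the Service's endpoints, so it receives no traffic through that Service.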

Liveness Probe

Makes requests against your Pods to inform Kubernetes about whether or not their containers are considered to be 'Alive.' If a container fails the probe, Kubernetes will kill it and restart it according to the `restartPolicy` configured in the Pod spec (which is always `Always` for Pods managed by a Deployment).
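A liveness probe is declared on the container in the same way. The fragment below is a minimal sketch of the `containers` section of a Pod template; the image, port, and `/healthz` path are assumptions made for illustration.

```yaml
# Fragment of a Pod template's spec (placeholder names and paths)
containers:
- name: app
  image: registry.example.com/example-http:1.0
  livenessProbe:
    httpGet:
      path: /healthz           # assumed endpoint that fails when the app is wedged
      port: 8080
    initialDelaySeconds: 10    # wait 10 seconds after the container starts before probing
    periodSeconds: 10          # probe every 10 seconds
    failureThreshold: 3        # restart the container after 3 consecutive failures
```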

Startup Probe

Helps Kubernetes know when the application in a container has started properly. When a startup probe is configured, liveness and readiness probes are held back until the startup probe succeeds so that they don't interrupt the application's startup. Startup probes are mainly used when your application takes a while to start, since they keep Kubernetes from restarting the container too soon if a liveness check fails before the application is ready to respond to it.
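For a slow-starting application, a startup probe can be paired with a liveness probe along the lines of the following sketch. The image, port, path, and timing values are illustrative assumptions and should be tuned to how long your application really takes to start.

```yaml
# Fragment of a Pod template's spec (placeholder names and paths)
containers:
- name: app
  image: registry.example.com/example-http:1.0
  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 10
    failureThreshold: 30       # allows up to 30 x 10s = 300 seconds for startup
  livenessProbe:               # only begins once the startup probe has succeeded
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 10
    failureThreshold: 3
```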

The startup and readiness probes are great options in your Kubernetes operator toolbelt, but like us, you might think: "Wow, the liveness probe seems way too overzealous and not very useful." This is precisely where more intelligent health-checking configurations, like the ones Edge Stack offers, come in.

Health Checks in Edge Stack

Just like Kubernetes probes, active health checking in Edge Stack is something you have to configure explicitly, since not all services need health checks or want to be prodded all the time.

Edge Stack's active health checking supports gRPC and HTTP applications and operates by making requests against the configured endpoint. If the checks fail, the Pod(s) that failed the check will be temporarily taken out of the pool of Pods serving requests to users until they begin passing the health checks again. Sometimes, a Pod might get overwhelmed or temporarily run into issues. When this happens, active health checks give the Pod a chance to recover while keeping it from returning errors to users in the meantime.

One of the ways this is particularly effective is during the shutdown phase of a Pod. You can configure your Pod to start failing the configured active health checks as soon as it needs to start shutting down. Doing so lets the Pod have as much time as it needs to finish any ongoing processing and connection cleanup without getting any new requests or getting killed immediately by Kubernetes.
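One possible way to wire this up on the Kubernetes side is sketched below. It is only an illustration under several assumptions: the application is assumed to start failing its health check endpoint once the marker file `/tmp/draining` exists, the image is assumed to include `/bin/sh`, and the sleep and grace period values are arbitrary.

```yaml
# Fragment of a Pod template (placeholder names; the drain behavior is assumed, not built in)
spec:
  terminationGracePeriodSeconds: 45   # total time Kubernetes waits before force-killing the Pod
  containers:
  - name: app
    image: registry.example.com/example-http:1.0
    lifecycle:
      preStop:
        exec:
          # Runs before the container receives SIGTERM: create the marker file so the
          # app begins failing health checks, then wait for in-flight requests to drain.
          command: ["/bin/sh", "-c", "touch /tmp/draining && sleep 20"]
```

The sleep should be long enough to cover the number of failed checks needed to mark the Pod unhealthy, and the time spent in the `preStop` hook counts against `terminationGracePeriodSeconds`.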

Configuring Health Checks in Edge Stack

Edge Stack primarily uses custom resources as the configuration method. While the Kubernetes `Ingress` resource is supported by Edge Stack, it lacks the fields necessary to configure many of Edge Stack's advanced features, such as active health checking. Thankfully, Edge Stack's custom resources are straightforward to use. The following `Mapping` resource is an example of how easy it is to configure traffic routing to a specific service, and we'll modify it to perform active health checking next.

---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: example-http-service
spec:
  hostname: "*"
  prefix: /test/
  service: example-http

The above `Mapping` routes all traffic that hits Edge Stack, from any hostname, with a path starting with '/test/' to the example-http service. The easiest way to set up active health checking for this service is to specify the path the service uses to respond to health check requests and let Edge Stack take care of all the defaults and other configurations.

---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: example-http-service
spec:
  hostname: "*"
  prefix: /test/
  service: example-http
  resolver: endpoint
  health_checks:
  - health_check:
      http:
        path: /health-check

We added `resolver: endpoint` to the Mapping so that Edge Stack has full control over which Pods it routes to instead of relying on Kubernetes' default service routing. This allows Edge Stack to be aware of all the pods in your Deployment individually, enabling more advanced load balancing configuration options such as our active health checking. By default, Edge Stack will make a request to this service on the /health-check endpoint configured above every five seconds to ensure it is healthy. It will consider a specific pod to be unhealthy when that Pod fails the check two times in a row, and the Pod will start getting traffic again as soon as it passes the next health check. You have complete control over these options, and they can be configured like so:

---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: example-http-service
spec:
  hostname: "*"
  prefix: /test/
  service: example-http
  resolver: endpoint
  health_checks:
  - unhealthy_threshold: 3
    healthy_threshold: 3
    interval: 10s
    timeout: 2s
    health_check:
      http:
        path: /health-check

With the above changes, we've configured Edge Stack to allow a pod to fail the health check three times before removing it from the traffic pool. We've also increased the number of times it needs to pass the health check to three before it starts getting traffic again, decreased the frequency with which Edge Stack sends health check requests, and configured a timeout of 2s for the health checks.

Since the `health_checks` field takes a list of health checks, you can easily configure multiple health check endpoints for the same service if needed. Configuring health checks for a gRPC service is also very easy with Edge Stack, as showcased in the example below.

---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: example-grpc-service
spec:
  hostname: "*"
  prefix: /example-grpc.Test/
  rewrite: /example-grpc.Test/
  grpc: true
  service: example-grpc
  resolver: endpoint
  health_checks:
  - health_check:
      grpc:
        upstream_name: example-grpc
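Because `health_checks` is a list, a single `Mapping` can carry more than one check. The sketch below reuses the HTTP service from the earlier examples and adds a second entry; the `/dependencies-check` path is a hypothetical endpoint invented for this illustration.

```yaml
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: example-http-service
spec:
  hostname: "*"
  prefix: /test/
  service: example-http
  resolver: endpoint
  health_checks:
  # First check: the endpoint from the earlier examples, using the default settings
  - health_check:
      http:
        path: /health-check
  # Second check: a hypothetical endpoint, probed less often and with more tolerance
  - unhealthy_threshold: 3
    interval: 10s
    health_check:
      http:
        path: /dependencies-check
```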

Level Up your Ingress with Edge Stack

Configuring active health checking using Edge Stack is a simple way to ensure users experience as few interruptions as possible when accessing your applications. The above examples showcase only a portion of the health-checking configuration options that Edge Stack provides, enough to demonstrate how easy it is to get started while still allowing you to be more specific if you need to. Whether you've never tried Edge Stack before or are learning more about its many features, there are plenty of examples and guides in the Edge Stack documentation. Get set up with Edge Stack today and stop worrying about issues with traffic going into and out of your Kubernetes cluster!