8 min • read

Debugging

If you’re experiencing issues with the Ambassador Edge Stack and cannot diagnose the issue through the /ambassador/v0/diag/ diagnostics endpoint, this document covers various approaches and advanced use cases for debugging Ambassador Edge Stack issues.

We assume that you already have a running Ambassador Edge Stack installation in the following sections.

A Note on TLS

TLS can appear intractable if you haven't set up certificates correctly. If you're having trouble with TLS, always check the logs of your Ambassador Edge Stack Pods and look for certificate errors.

Check Ambassador Edge Stack status

  1. First, check the Ambassador Edge Stack Deployment with the following: kubectl get -n ambassador deployments

    After a brief period, the terminal will print something similar to the following:

    Terminal
    $ kubectl get -n ambassador deployments
    NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
    edge-stack 3 3 3 3 1m
    edge-stack-apiext 3 3 3 3 1m
  2. Check that the “desired” number of Pods matches the “current” and “available” number of Pods.

  3. If they are not equal, check the status of the associated Pods with the following command: kubectl get pods -n ambassador.

    The terminal should print something similar to the following:

    Terminal
    $ kubectl get pods -n ambassador
    NAME READY STATUS RESTARTS AGE
    edge-stack-85c4cf67b-4pfj2 1/1 Running 0 1m
    edge-stack-85c4cf67b-fqp9g 1/1 Running 0 1m
    edge-stack-85c4cf67b-vg6p5 1/1 Running 0 1m
    edge-stack-apiext-736f8497d-j34pf 1/1 Running 0 1m
    edge-stack-apiext-736f8497d-9gfpq 1/1 Running 0 1m
    edge-stack-apiext-736f8497d-p5wgx 1/1 Running 0 1m

    The actual names of the Pods will vary. All the Pods should indicate Running, and all should show 1/1 containers ready.

  4. If the Pods do not seem reasonable, use the following command for details about the history of the Deployment: kubectl describe -n ambassador deployment edge-stack

    • Look for data in the “Replicas” field near the top of the output. For example: Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable

    • Look for data in the “Events” log field near the bottom of the output, which often displays data such as a fail image pull, RBAC issues, or a lack of cluster resources. For example:

      shell
      Events:
      Type Reason Age From Message
      ---- ------ ---- ---- -------
      Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set edge-stack-85c4cf67b to 3
  5. Additionally, use the following command to “describe” the individual Pods: kubectl describe pods -n ambassador <edge-stack-pod-name>

    • Look for data in the “Status” field near the top of the output. For example, Status: Running

    • Look for data in the “Events” field near the bottom of the output, as it will often show issues such as image pull failures, volume mount issues, and container crash loops. For example:

      shell
      Events:
      Type Reason Age From Message
      ---- ------ ---- ---- -------
      Normal Scheduled 4m default-scheduler Successfully assigned edge-stack-85c4cf67b-4pfj2 to gke-ambassador-demo-default-pool-912378e5-dkxc
      Normal SuccessfulMountVolume 4m kubelet, gke-ambassador-demo-default-pool-912378e5-dkxc MountVolume.SetUp succeeded for volume "edge-stack-token-tmk94"
      Normal Pulling 4m kubelet, gke-ambassador-demo-default-pool-912378e5-dkxc pulling image "docker.io/datawire/ambassador:0.40.0"
      Normal Pulled 4m kubelet, gke-edge-stack-demo-default-pool-912378e5-dkxc Successfully pulled image "docker.io/datawire/ambassador:0.40.0"
      Normal Created 4m kubelet, gke-edge-stack-demo-default-pool-912378e5-dkxc Created container
      Normal Started 4m kubelet, gke-edge-stack-demo-default-pool-912378e5-dkxc Started container

In both the Deployment Pod and the individual Pods, take the necessary action to address any discovered issues.

Review Ambassador Edge Stack logs

Ambassador Edge Stack logging can provide information on anything that might be abnormal or malfunctioning. While there may be a large amount of data to sort through, look for key errors such as the Ambassador Edge Stack process restarting unexpectedly, or a malformed Envoy configuration.

Ambassador Edge Stack has two major log mechanisms: Ambassador Edge Stack logging and Envoy logging. Both appear in the normal kubectl logs output, and both can have additional debug-level logging enabled.

Ambassador Edge Stack debug logging

Much of Ambassador Edge Stack's logging is concerned with the business of noticing changes to Kubernetes resources that specify the Ambassador Edge Stack configuration, and generating new Envoy configuration in response to those changes. Ambassador Edge Stack also logs information about its built-in authentication, rate limiting, developer portal, ACME, etc. There are multiple environment variables controlling debug logging; which is required depends on which aspect of the system you want to debug:

  • Set AES_LOG_LEVEL=debug to debug the early boot sequence, Ambassador Edge Stack's interactions with the Kubernetes cluster (finding changed resources, etc.), and the built-in services (auth, rate limiting, etc.).
  • Set AMBASSADOR_DEBUG=diagd to debug the process of generating an Envoy configuration from the input resources.

Ambassador Edge Stack Envoy logging

Envoy logging is concerned with the actions Envoy is taking for incoming requests. Typically, Envoy will only output access logs, and certain errors, but enabling Envoy debug logging will show very verbose information about the actions Envoy is actually taking. It can be useful for understanding why connections are being closed, or whether an error status is coming from Envoy or from the upstream service.

It is possible to enable Envoy logging at boot, but for the most part, it's safer to enable it at runtime, right before sending a request that is known to have problems. To enable Envoy debug logging, use kubectl exec to get a shell on the Ambassador Edge Stack pod, then:

shell
```
curl -XPOST http://localhost:8001/logging?level=trace && \
sleep 10 && \
curl -XPOST http://localhost:8001/logging?level=warning
```

This will turn on Envoy debug logging for ten seconds, then turn it off again.

Viewing logs

To view the logs from Ambassador Edge Stack:

  1. Use the following command to target an individual Ambassador Edge Stack Pod: kubectl get pods -n ambassador

    The terminal will print something similar to the following:

    Terminal
    $ kubectl get pods -n ambassador
    NAME READY STATUS RESTARTS AGE
    edge-stack-85c4cf67b-4pfj2 1/1 Running 0 3m
  2. Then, run the following: kubectl logs -n ambassador <edge-stack-pod-name>

The terminal will print something similar to the following:

shell
```
$ kubectl logs -n ambassador edge-stack-85c4cf67b-4pfj2
2018-10-10 12:26:50 kubewatch 0.40.0 INFO: generating config with gencount 1 (0 changes)
/usr/lib/python3.6/site-packages/pkg_resources/__init__.py:1235: UserWarning: /ambassador is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
warnings.warn(msg, UserWarning)
2018-10-10 12:26:51 kubewatch 0.40.0 INFO: Scout reports {"latest_version": "0.40.0", "application": "ambassador", "notices": [], "cached": false, "timestamp": 1539606411.061929}
2018-10-10 12:26:54 diagd 0.40.0 [P15TMainThread] INFO: thread count 3, listening on 0.0.0.0:8877
[2018-10-10 12:26:54 +0000] [15] [INFO] Starting gunicorn 19.8.1
[2018-10-10 12:26:54 +0000] [15] [INFO] Listening at: http://0.0.0.0:8877 (15)
[2018-10-10 12:26:54 +0000] [15] [INFO] Using worker: threads
[2018-10-10 12:26:54 +0000] [42] [INFO] Booting worker with pid: 42
2018-10-10 12:26:54 diagd 0.40.0 [P42TMainThread] INFO: Starting periodic updates
[2018-10-10 12:27:01.977][21][info][main] source/server/drain_manager_impl.cc:63] shutting down parent after drain
```

Note that many deployments will have multiple logs, and the logs are independent for each Pod.

Examine Pod and container contents

You can examine the contents of the Ambassador Edge Stack Pod for issues, such as if volume mounts are correct and TLS certificates are present in the required directory, to determine if the Pod has the latest Ambassador Edge Stack configuration, or if the generated Envoy configuration is correct or as expected. In these instructions, we will look for problems related to the Envoy configuration.

  1. To look into an Ambassador Edge Stack Pod, get a shell on the Pod using kubectl exec. For example,

    shell
    kubectl exec -it -n ambassador <edge-stack-pod-name> -- bash
  2. Determine the latest configuration. If you haven't overridden the configuration directory, the latest configuration will be in /ambassador/snapshots. If you have overridden it, Ambassador Edge Stack saves configurations in $AMBASSADOR_CONFIG_BASE_DIR/snapshots.

    In the snapshots directory:

    • snapshot.yaml contains the full input configuration that Ambassador Edge Stack has found;
    • aconf.json contains the Ambassador Edge Stack configuration extracted from the snapshot;
    • ir.json contains the IR constructed from the Ambassador Edge Stack configuration; and
    • econf.json contains the Envoy configuration generated from the IR.

    In the snapshots directory, the current configuration will be stored in files with no digit suffix, and older configurations have increasing numbers. For example, ir.json is current, ir-1.json is the next oldest, then ir-2.json, etc.

  3. If something is wrong with snapshot or aconf, there is an issue with your configuration. If something is wrong with ir or econf, you should open an issue on Github.

  4. The actual input provided to Envoy is split into $AMBASSADOR_CONFIG_BASE_DIR/bootstrap-ads.json and $AMBASSADOR_CONFIG_BASE_DIR/envoy/envoy.json.

    • The bootstrap-ads.json file contains details about Envoy statistics, logging, authentication, etc.
    • The envoy.json file contains information about request routing.
    • You may generally find it simplest to just look at the econf.json files in the snapshot directory, which includes both kinds of configuration.