Advanced rate limiting
Ambassador Edge Stack features a built-in Rate Limit Service (RLS). The RLS uses a decentralized configuration model that lets individual teams manage their rate limits independently.
All of the examples on this page use the backend service of the quote sample application to illustrate how to perform the rate limiting functions.
Rate Limiting in Ambassador Edge Stack
In Ambassador Edge Stack, the RateLimit resource defines the policy for rate limiting. The rate limit policy is applied to individual requests according to the labels you add to the Mapping resource. This allows you to assign labels based on the particular needs of your rate limiting policies and apply the RateLimit policies only to the domains in the related Mapping resource.
You can apply the RateLimit policy globally to all requests with matching labels from the Module resource. This can be used in conjunction with the Mapping resource to have a global rate limit with more granular rate limiting for specific requests that go through that specific Mapping resource.
In order for you to enact rate limiting policies:
- Each domain you target needs to have labels.
- For individual requests, the service's Mapping resource needs to contain the labels related to the domains you want to apply the rate limiting policy to.
- For global requests, the service's Module resource needs to contain the labels related to the policy you want to apply.
- The RateLimit resource needs to set the rate limit policy for the labels in the Mapping resource.
Rate limiting for availability
Global rate limiting applies to the entire Kubernetes service mesh. This example shows how to limit the quote service to 3 requests per minute.
First, add a request label to the request_label_group of the quote service's Mapping resource. This example uses backend for the label.
Apply the mapping configuration changes with kubectl apply -f quote-backend.yaml.
Next, configure the RateLimit resource for the service. Create a new YAML file named backend-ratelimit.yaml and define the rate limit details in it. In the RateLimit pattern, generic_key is a hard-coded value that is used when you add a single string label to a request.
Deploy the rate limit with kubectl apply -f backend-ratelimit.yaml.
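As a rough illustration of the two files described in the steps above, the manifests might look like the sketch below. It assumes the getambassador.io/v3alpha1 API version and a label domain named ambassador; the resource names, hostname, and prefix are illustrative rather than taken from this page, so check them against the CRDs installed in your cluster.

```yaml
---
# quote-backend.yaml: Mapping that labels each request with the string "backend".
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: Mapping
metadata:
  name: quote-backend                   # illustrative name
spec:
  hostname: "*"                         # assumed: match any host
  prefix: /backend/                     # assumed route prefix
  service: quote
  labels:
    ambassador:                         # label domain (assumed)
      - request_label_group:
          - generic_key:
              value: backend
---
# backend-ratelimit.yaml: limit requests labeled "backend" to 3 per minute.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: backend-rate-limit              # illustrative name
spec:
  domain: ambassador                    # assumed to match the label domain above
  limits:
    - pattern: [{generic_key: backend}]
      rate: 3
      unit: minute
```

Older API versions use a different label syntax, so treat the field names here as a starting point, not a reference.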
Per user rate limiting
Per user rate limiting enables you to apply the defined rate limit to specific IP addresses. To allow per user rate limits, you need to make sure you've properly configured Ambassador Edge Stack to propagate your original client IP address.
This example shows how to use the remote_address special value in the mapping to target specific IP addresses:
Add a request label to the request_label_group of the quote service's Mapping resource. This example uses remote_address for the label.
Update the rate limit amounts for the RateLimit service and add remote_address to the pattern.
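The per user configuration could be sketched as follows. This assumes the getambassador.io/v3alpha1 API version and the ambassador label domain; the resource names, hostname, prefix, and the rate of 3 per minute are illustrative values, not ones stated on this page.

```yaml
---
# Mapping that labels each request with the client's remote address.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: Mapping
metadata:
  name: quote-backend                   # illustrative name
spec:
  hostname: "*"                         # assumed: match any host
  prefix: /backend/                     # assumed route prefix
  service: quote
  labels:
    ambassador:                         # label domain (assumed)
      - request_label_group:
          - remote_address:
              key: remote_address
---
# RateLimit that tracks each distinct remote address separately.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: backend-rate-limit              # illustrative name
spec:
  domain: ambassador
  limits:
    - pattern: [{remote_address: "*"}]  # "*" stands for any client IP
      rate: 3                           # illustrative amount
      unit: minute
```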
Load shedding
Another technique for rate limiting involves load shedding. With load shedding, you can define which HTTP request method to allow or deny.
This example shows how to implement per user rate limiting along with load shedding on GET requests.
To allow per user rate limits, you need to make sure you've properly configured Ambassador Edge Stack to propagate your original client IP address.
Add request labels to the request_label_group of the quote service's Mapping resource. This example uses remote_address for the per user limit, and backend_http_method for load shedding. The load shedding label uses ":method" to indicate that the RateLimit will use an HTTP request method in its pattern.
Update the rate limit amounts for the RateLimit service. For the rate limit pattern, include the remote_address IP address and the backend_http_method.
When a pattern has multiple criteria, the rate limit runs when any of the rules of the pattern match. In this example, this means either a remote_address or backend_http_method pattern triggers the rate limiting.
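A sketch of the load shedding configuration is shown below. It assumes the getambassador.io/v3alpha1 API version, where a header-based label is written with request_headers and header_name; the resource names, hostname, prefix, and rate are illustrative and should be adapted to your setup.

```yaml
---
# Mapping that labels each request with the client IP and the HTTP method.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: Mapping
metadata:
  name: quote-backend                   # illustrative name
spec:
  hostname: "*"                         # assumed: match any host
  prefix: /backend/                     # assumed route prefix
  service: quote
  labels:
    ambassador:                         # label domain (assumed)
      - request_label_group:
          - remote_address:
              key: remote_address
          - request_headers:
              key: backend_http_method
              header_name: ":method"    # pseudo-header carrying the HTTP method
---
# RateLimit whose pattern lists both labels in the order the Mapping adds them.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: backend-rate-limit              # illustrative name
spec:
  domain: ambassador
  limits:
    - pattern: [{remote_address: "*"}, {backend_http_method: GET}]
      rate: 3                           # illustrative amount
      unit: minute
```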
Global rate limiting
Similar to per user rate limiting, you can use global rate limiting to apply a rate limit to calls from each unique IP address to your services. Unlike the previous examples, you need to add your labels to the Module resource rather than the Mapping resource. This is because the Module resource applies the labels to all the requests in Ambassador Edge Stack, whereas the labels in a Mapping only apply to the requests that use that Mapping resource.
Add a request label to the request_label_group of the quote service's Module resource. This example uses the remote_address special value.
Update the rate limit amounts for the RateLimit service and add remote_address to the pattern.
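One way the global configuration might look is sketched below. It assumes the getambassador.io/v3alpha1 API version, in which Module-level labels are set under default_labels; the label group name defaults, the namespace, the RateLimit name, and the rate of 10 per minute are illustrative assumptions, not values from this page.

```yaml
---
# Module that attaches the client's remote address as a label to every request.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: Module
metadata:
  name: ambassador
  namespace: ambassador                 # assumed install namespace
spec:
  config:
    default_label_domain: ambassador
    default_labels:
      ambassador:                       # label domain (assumed)
        defaults:                       # illustrative label group name
          - remote_address:
              key: remote_address
---
# RateLimit applied to every request, tracked per remote address.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: global-rate-limit               # illustrative name
spec:
  domain: ambassador
  limits:
    - pattern: [{remote_address: "*"}]
      rate: 10                          # illustrative amount
      unit: minute
```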
Bypassing a global rate limit
Sometimes, you may have an API that cannot handle as much load as others in your cluster. In this case, a global rate limit may not be enough to ensure this API is not overloaded with requests from a user. To protect this API, you can create a label that tells Ambassador Edge Stack to apply a stricter limit on requests.
In the example above, the global rate limit is defined in the Module resource. This applies the limit to all requests. In conjunction with the global limit defined in the Module resource, you can add more granular rate limiting to a Mapping resource, which will only apply to requests that use that Mapping.
In addition to the configurations applied in the global rate limit example above, add another label to the request_label_group of the Mapping resource. This example uses backend for the label.
Now, the request_label_group contains both the generic_key: backend and the remote_address key applied from the global rate limit. This creates a separate RateLimit object for this route.
Requests to /backend/ are now limited after 3 requests. All other requests use the global rate limit policy.
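The stricter per-route limit could be sketched as follows, on top of a Module that already adds remote_address globally. It assumes the getambassador.io/v3alpha1 API version, that the Module's labels come before the Mapping's labels in the resulting label group, and a unit of minute; the resource names, hostname, and prefix are illustrative.

```yaml
---
# Mapping that adds the "backend" label on top of the globally applied labels.
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: Mapping
metadata:
  name: quote-backend                   # illustrative name
spec:
  hostname: "*"                         # assumed: match any host
  prefix: /backend/
  service: quote
  labels:
    ambassador:                         # label domain (assumed)
      - request_label_group:
          - generic_key:
              value: backend
---
# Separate RateLimit for this route: remote_address (global) plus backend (Mapping).
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: backend-strict-rate-limit       # illustrative name
spec:
  domain: ambassador
  limits:
    # Label order is an assumption: global labels first, then Mapping labels.
    - pattern: [{remote_address: "*"}, {generic_key: backend}]
      rate: 3
      unit: minute                      # assumed unit
```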
Rate limit matching rules
The following rules apply to the rate limit patterns:
- Patterns are order-sensitive and must be entered in the same order in which a request is labeled.
- Every label in a label group must exist in the pattern in order for matching to occur.
- By default, any type of failure lets the request pass through (fail open).
- Ambassador Edge Stack sets a hard timeout of 20ms on the rate limiting service. If the rate limit service does not respond within the timeout period, the request passes through.
- If a pattern does not match, the request passes through.
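To make the ordering rule concrete, the sketch below contrasts two patterns for a request labeled first with remote_address and then with backend_http_method. The API version, resource name, and rates are illustrative assumptions; the comments restate the rules listed above and do not describe additional behavior.

```yaml
---
# Assumed request labels, in the order they were added:
#   1. remote_address
#   2. backend_http_method
apiVersion: getambassador.io/v3alpha1   # assumed API version
kind: RateLimit
metadata:
  name: ordered-pattern-example         # illustrative name
spec:
  domain: ambassador
  limits:
    # Same order as the labels on the request: eligible to match.
    - pattern: [{remote_address: "*"}, {backend_http_method: GET}]
      rate: 3
      unit: minute
    # Reversed order: per the ordering rule, this does not match,
    # so a request relying on it alone would pass through.
    - pattern: [{backend_http_method: GET}, {remote_address: "*"}]
      rate: 3
      unit: minute
```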
Troubleshooting rate limiting
The most common cause of rate limiting failures is a mismatch between the labels generated by Ambassador Edge Stack and the rate limiting pattern. By default, the rate limiting service logs all incoming labels from Ambassador Edge Stack. Use a tool such as Stern to watch the rate limiting logs from Ambassador Edge Stack and ensure the labels match your descriptor.
More
For more on rate limiting, see the rate limit guide.