Envoy vs NGINX vs HAProxy: Why the Edge Stack API Gateway chose Envoy

Kay James

June 21, 2018


NGINX, HAProxy, and Envoy are all battle-tested L4 and L7 proxies. So why did we choose Envoy as the core proxy as we developed Edge Stack API Gateway for applications deployed into Kubernetes?

It’s an L7 world

In today’s cloud-centric world, business logic is commonly distributed into ephemeral microservices. These services need to communicate with each other over the network. The core network protocols that are used by these services are so-called “Layer 7” protocols, e.g., HTTP, HTTP/2, gRPC, Kafka, MongoDB, and so forth. These protocols build on top of your typical transport layer protocols such as TCP. Managing and observing L7 is crucial to any cloud application, since a large part of application semantics and resiliency are dependent on L7 traffic.
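To make the L4/L7 distinction concrete, here is a minimal sketch of the kind of decision an L7 proxy can make and an L4 proxy cannot: choosing an upstream from the HTTP request path. The route table, service names, and function names are hypothetical and for illustration only; a real proxy parses the protocol far more rigorously.

```python
from typing import List, Optional, Tuple

# Hypothetical route table: (path prefix, upstream service name).
ROUTES: List[Tuple[str, str]] = [
    ("/api/users", "user-service"),
    ("/api/orders", "order-service"),
    ("/", "web-frontend"),
]


def parse_request_line(raw: bytes) -> Tuple[str, str]:
    """Return (method, path) from the first line of an HTTP/1.1 request."""
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = first_line.split(" ", 2)
    path = target.split("?", 1)[0]  # drop the query string
    return method, path


def pick_upstream(raw: bytes) -> Optional[str]:
    """Pick the upstream whose path prefix is the longest match."""
    _method, path = parse_request_line(raw)
    matches = [(prefix, name) for prefix, name in ROUTES if path.startswith(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda m: len(m[0]))[1]
```

An L4 proxy would see only a TCP connection to a port. The routing above is possible only because the proxy understands the request at the application layer.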

The Proxy Battle

Ambassador was designed from the beginning for this L7, services-oriented world, and we decided early on to build only for Kubernetes. We knew we wanted to avoid writing our own proxy, so we considered HAProxy, NGINX, and Envoy as possibilities. All three are highly reliable, proven proxies, with Envoy being the newest kid on the block.

Evaluating Proxies

We started by evaluating the different feature sets of the three proxies. We soon realized that L7 proxies are in many ways commodity infrastructure. All three proxies do an outstanding job of routing L7 traffic reliably and efficiently, with a minimum of fuss. And while they weren’t at feature parity, we felt that we could, if we had to, implement any critical missing features in the proxy itself. After all, they’re all open source!

We took a step back and reconsidered our evaluation criteria. Given the rough functional parity among these solutions, we refocused our efforts on evaluating each project through a more qualitative lens. Specifically, we looked at each project’s community, velocity, and philosophy. We focused on community because we wanted a vibrant community where we could easily contribute. Related to community, we wanted to see that a project had good forward velocity, as that would show the project would evolve quickly as customer needs evolved. And finally, we wanted a project that would align as closely as possible with our view of an L7-centric, microservices world.

HAProxy

Several years ago, some of us at Ambassador Labs had worked on Baker Street, an HAProxy-based client-side load balancer inspired by Airbnb’s SmartStack. HAProxy is a very reliable, fast, and proven proxy. While we were happy with HAProxy, we had some longer-term concerns about it. HAProxy was initially released in 2001, when the Internet operated very differently than today. The velocity of the HAProxy community didn’t seem to be very high. For example, v1.5 added SSL after four years. We ourselves had experienced the challenges of hitless reloads (being able to reload your configuration without restarting your proxy), which were not fully addressed until the end of 2017 despite epic hacks from folks like Joey at Yelp. With v1.8, the HAProxy team has started to catch up to the minimum set of features needed for microservices, but 1.8 didn’t ship until November 2017.

NGINX

NGINX is a high-performance web server that does support hitless reloads. NGINX was designed initially as a web server, and over time has evolved to support more traditional proxy use cases. NGINX has two variants: NGINX Plus, a commercial offering, and NGINX open source. Per NGINX, NGINX Plus “extend[s] NGINX into the role of a frontend load balancer and application delivery controller.” While that sounds perfect, we wanted to make Ambassador open source, so NGINX Plus was not an option for us.

NGINX open source has a variety of limitations, including limited observability and health checks. To circumvent the limitations of NGINX open source, our friends at Yelp actually deployed HAProxy and NGINX together.

More generally, while NGINX had more forward velocity than HAProxy, we were concerned that many of the desirable features would be locked away in NGINX Plus. The NGINX business model creates an inherent tension between the open source and Plus products, and we weren’t sure how this dynamic would play out if we contributed upstream. (Note that HAProxy has a similar tension with Enterprise Edition, but there seems to be less divergence in the feature set between EE and CE in HAProxy.)

Envoy Proxy

Envoy is the newest proxy on the list, but has been deployed in production at Lyft, Apple, Salesforce, Google, and others. In many ways, the release of Envoy Proxy in September 2016 triggered a round of furious innovation and competition in the proxy space.

Envoy was designed from the ground up for microservices, with features such as hitless reloads (called hot restart), observability, resilience, and advanced load balancing. Envoy also embraced distributed architectures, adopting eventual consistency as a core design principle and exposing dynamic APIs for configuration. Traditionally, proxies have been configured using static configuration files. Envoy, while supporting a static configuration model, also allows configuration via gRPC/protobuf APIs. This simplifies management at scale, and also allows Envoy to work better in environments with ephemeral services.
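The difference between a static file and API-driven configuration can be illustrated with a small sketch. This is not Envoy's actual API or wire protocol: the `Snapshot` and `ProxyConfig` classes are hypothetical, and the integer version is a simplifying assumption. It only shows the general idea of a control plane pushing versioned updates that a proxy applies at runtime, converging on the newest state even if updates arrive late or out of order.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Snapshot:
    """A versioned view of the world as pushed by a (hypothetical) control plane."""
    version: int
    endpoints: Dict[str, List[str]]  # service name -> list of "host:port"


@dataclass
class ProxyConfig:
    """In-memory configuration held by a (hypothetical) proxy instance."""
    version: int = 0
    endpoints: Dict[str, List[str]] = field(default_factory=dict)

    def apply(self, snapshot: Snapshot) -> bool:
        """Apply the snapshot if it is newer; return True when accepted."""
        if snapshot.version <= self.version:
            return False  # stale or duplicate update, keep the current state
        # Copy the data so later changes to the snapshot's lists do not leak in.
        self.endpoints = {name: list(hosts) for name, hosts in snapshot.endpoints.items()}
        self.version = snapshot.version
        return True
```

For example, a proxy that receives version 2 and then a delayed version 1 keeps the version 2 endpoints, with no file rewrite and no restart. That property matters when service instances come and go every few seconds.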

We loved the feature set of Envoy and the forward-thinking vision of the product. We also discovered the community around Envoy is unique, relative to HAProxy and NGINX. Unlike the other two proxies, Envoy is not owned by any single commercial entity. Envoy was originally created by Lyft, and as such, there is no need for Lyft to make money directly on Envoy. Matt Klein, creator of Envoy, explicitly decided that he would not start an Envoy platform company. There is no commercial pressure for a proprietary Envoy Plus or Envoy Enterprise Edition. As such, the community focuses only on the right features with the best code, without any commercial considerations. Finally, Lyft has donated the Envoy project to the Cloud Native Computing Foundation. The CNCF provides an independent home to Envoy, ensuring that the focus on building the best possible L7 proxy will remain unchanged.

A year later …

We couldn’t be happier with our decision to build Edge Stack on Envoy. The rich feature set has allowed us to quickly add support for gRPC, rate limiting, shadowing, canary routing, and observability, to name a few. And in the cases where Envoy’s feature set hasn’t met our requirements (e.g., authentication), we’ve been able to work with the Envoy community to implement the necessary features.
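As an illustration of one of these features, canary routing amounts to splitting traffic between a stable and a new version of a service by weight. The sketch below is a conceptual model only, not how Envoy or Edge Stack implements it; the backend names and the 90/10 split are assumptions made for the example.

```python
import random
from typing import Dict


def choose_backend(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick one backend at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical split: send roughly 10% of requests to the canary.
CANARY_WEIGHTS = {"orders-v1": 90, "orders-v2-canary": 10}
```

Raising the canary's weight step by step, while watching error rates and latency, lets a new version take on production traffic gradually.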

With hundreds of developers now working on Envoy, the Envoy code base is moving forward at an unbelievable pace, and we’re excited to continue taking advantage of Envoy in Ambassador. We wrote about some of the Envoy updates we’re most excited for in 2019 on our blog.

… The Journey

Both the Ambassador and Envoy Proxy communities have continued to grow. As we look at the evolution of Envoy Proxy, two additional themes are worth mentioning: the xDS API and the ecosystem around Envoy Proxy. As discussed earlier in this article, Envoy was designed for dynamic management from the get-go, and exposed APIs for managing fleets of Envoy proxies. Today, the xDS API is evolving towards a universal data plane API. With every release of Ambassador, we’re taking advantage of more capabilities of the API (and this is hard, because this API is changing at a high rate!). The popularity of Envoy and the xDS API is also driving a broader ecosystem of projects around Envoy itself. Projects such as Cilium, Envoy Mobile, Consul, and Curiefense have all embraced Envoy as a core part of their technology stack. This vibrant ecosystem is continuing to push the Envoy project forward.

We’re looking forward to the continued evolution of Envoy, and seeing how we can continue to collaborate with the larger Envoy community.