Use Your Own Tools
An Ambassador Principal Engineer's Perspective
Drinking your own champagne is about the fermentation of using internal tools and sharing best practices, and the culture that evolves from both
Alex Gervais (@alex_gervais), Principal Software Engineer at Ambassador Labs, talked about the transition from being a user and contributor to an open source product to becoming one of the maintainers of the commercial version of the product.
In drinking our own champagne, Alex has discovered that great tools like Telepresence become so embedded in your daily workflow that you barely notice them, yet they continually surprise you with the extent of their functionality. At the same time, Alex argued, drinking your own champagne is about the fermentation of using internal tools and sharing best practices, and the culture that evolves from both.
Moving from external user to internal champagne champion
Alex Gervais's experience with Ambassador tools, especially Ambassador Edge Stack API Gateway (AES), began well before he joined Ambassador. Coming from a DevOps platform engineering background, Alex focused on being an enabler for engineering: discovering and implementing solutions, tools and platforms to "pave the path" for engineers in his company so that they could ship business value into the world. His responsibilities included the expected: moving to Kubernetes, building CI/CD pipelines, and providing the internal enablement tools that would empower engineers to work safely and efficiently.
After learning about and adopting AES during the transition to Kubernetes and microservices, Alex began attending events, contributing PRs and getting involved. His journey took him from an internal DevOps platform and engineering enablement role, which looked outward for the best tools to support developer productivity, to Ambassador, where he enables external engineering teams in a variety of organizations to adopt and make the most of the Ambassador Labs toolkit.
"Moving to Ambassador made sense to me in terms of scale," Alex shared. "The company where I worked was growing into an enterprise-grade company that started to get tied up in a lot of red tape, and I couldn’t move as fast as I wanted to. My impact was becoming more limited. Ambassador offered a lot of possibilities, a lot of ownership, and I could definitely influence the business and the direction of the tools and technology internally while reaching many more people, enabling vastly more engineers than the hundreds I had supported internally in the previous organization. That was a big factor, trying to have a broader impact."
Bringing external cloud experience internal
When Alex joined Ambassador, the learning and impact were a two-way street. Alex admits that he had a bit of a hard time in the beginning, moving away from cloud-native, super quick, super automated workflows back to shipping binaries. Even so, he hit the ground running, bringing his external experience in-house and influencing the cultivation of the champagne everyone would soon be drinking.
"When I joined Ambassador, we had a monthly release cadence. But putting binaries out into the world, you can't roll them back. The software is just out there, running. Based on my own SaaS experience, I knew we needed to shift the way we were developing software," Alex explained. "We needed to address and align on the nature of shipping binaries instead of running software in a cloud and eventually shift to a cloud model.
"Now we have Ambassador Cloud. My previous experience was valuable in that we enabled fast releases and fast pipelines for Ambassador Cloud – now we are looking at adopting some of these practices for our other internal tools, such as Telepresence. That is, we can run everything internally before it is actually released to clients, and proactively provide patches, bug fixes, and so on within a day rather than in a month."
In Alex's case, drinking the champagne meant not only using internal Ambassador tools but also improving them and helping others within (and outside of) the organization rethink how the tools were used. In practice, that meant moving from shipping binaries to a cloud approach in order to move much faster and achieve better visibility, insight and incident preparedness.
Drink your own champagne; avoid an incident hangover
"When I think of drinking our own champagne, I don't always think it is about the tools. And I think you can bring champagne in the form of processes and best practices along with you in your professional journey," Alex shared. "In the later days of my previous job, I was super into observability – distributed tracing was a thing of beauty. I brought a lot of the instrumentation, some of the observability and some operational best practices from my previous role. From the first line of code, everything was monitored, we had metrics all over the place. We have great engineering practices and incident management, things like Game Days and runbooks, in order to provide the best platform out there in uptime and performance. It doesn't happen by accident."
The Ambassador engineering team, including VP of Engineering Katie Wilde, has extensive knowledge of robust incident management, drawn both from previous experience and from their work at Ambassador. What every team member brings to the table ensures that the Ambassador champagne is infused with fresh approaches, best practices and behaviors that define the company culture and the product alike.
Incidents happen unpredictably, but Alex and the rest of the Ambassador engineering team prepare for them as a regular part of doing business to avoid the worst consequences and "hangovers" from incidents. "From the moment we released Ambassador Cloud in the wild," Alex explained, "we had runbooks, checklists, an internal tool called Rosie that we use to follow our incident response process. Every engineer is on the PagerDuty rotation. We have a clear escalation policy that lets any engineer be the '911 operator' who can assess the situation and assemble the right team of experts to resolve and mitigate the incident."
Game Days: Everyone plays
Once every couple of months, Ambassador has a fire drill of sorts: upper management mandates that somebody from engineering wreak controlled havoc on a system, according to a pre-existing plan. The objective of these exercises is to see how the team responds, learn from it, and take action to improve our operations. These exercises are collectively known as Game Days.
Game Days in particular allow internal teams, even outside of engineering, to test these incident management systems and give tools a run for their money. These events also give everyone the opportunity to be involved in the incident management process. "Incident response is not just about engineers fixing problems," Alex reasoned. "It can also be about our marketing team, for example, communicating externally about an issue we are having and working to fix. Incidents affect everyone in the company. An outage might not even be software related. It might be something in our process, or even an external tool. A recent Slack outage did not impact customers, but it really affected our ability to work as a distributed team. Incident response is about how we systematically cope with something like this, and how this information becomes a part of our working culture."
Making sure internal tools go down smooth
As much as processes and culture cultivate an environment conducive to using one's own tools, the tools still have to work and serve their purpose. When Alex arrived at Ambassador, he had never used Telepresence in any of his workflows, so learning it was at the top of his to-do list. Once Telepresence became a part of Alex's developer toolkit, many things in his workflow started to shift.
"Although I use Telepresence on a daily basis, I almost don’t see it because it is embedded in my other tools. I will use Telepresence in integration tests, but Telepresence is just sitting there, as a binary that’s being used and bootstrapped by my test suite. I will use deploy previews on my PRs, but I don’t use the Telepresence command line. But I use different features of Telepresence almost invisibly in different workflows, whether it is just connecting to a remote cluster to have my laptop sit and use the same DNS as my cluster," Alex explained. "I keep discovering different Telepresence features all the time. It is such a powerful tool, but it is hard to identify one particular feature that I use because it is so embedded in my workflow now and in the way we build and ship things."
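To make the idea of a test suite bootstrapping Telepresence more concrete, here is a minimal Python sketch. It is not Ambassador's actual test harness: the helper name `telepresence_session` and its parameters are hypothetical, and it assumes a Telepresence 2 binary on the PATH that provides the `connect` and `quit` subcommands. The command runner is injectable so the wrapper can be exercised without a cluster.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def telepresence_session(run=subprocess.run, binary="telepresence"):
    """Hypothetical helper: connect to the cluster around a block of tests.

    Assumes the `telepresence` binary exposes `connect` and `quit`.
    `run` defaults to subprocess.run and can be replaced with a fake.
    """
    # Connect the local machine to the remote cluster's network and DNS.
    run([binary, "connect"], check=True)
    try:
        yield
    finally:
        # Always disconnect, even if a test inside the block fails.
        run([binary, "quit"], check=True)
```

A test suite could then wrap its integration tests in `with telepresence_session():`, so that engineers never type the Telepresence command line themselves, which matches the "almost invisible" usage Alex describes.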
Conclusion: Every grape matters - Harvesting, producing and drinking your own champagne
The idea of "drinking your own champagne" is much more about creating a culture in which all the ingredients – the people, their experience, their expertise – meld to make it what it is. In that sense, it is probably more about producing the champagne than drinking it. But that's why culture is so important – it's the "mash" from which well-functioning tools and processes are created.
Alex concluded, "Once we have built and live in a strong culture, we have room for experimentation and sharing. We like to share what we are working on across the org, and feedback is always more than welcome. Engineers share demos, and we get feedback from sales, marketing, devrel and support that helps us adjust and better meet demand. We have been doing this with demo videos for as long as I have been with the company and then started doing the same thing with documentation, ahead of releasing features. Once more, drinking our own champagne is a lot about sharing and collaboration. We learn from each other and each other's experience, which informs how tools are created and improved."