From Dev to Ops: Infrastructure is Hard, but Tooling and Visibility are the Key for Full Cycle Developers

Cheryl Hung, former VP, Ecosystems at the Cloud Native Computing Foundation (CNCF) and former Google software engineer, talked to Ambassador about the developer experience and the ownership of the software life cycle.

Cheryl Hung, whom many readers will recognize from her work within the CNCF community, spoke recently with Ambassador Labs' Daniel Bryant about her views on the cloud-native developer experience. The conversation touched on the challenges of developers owning the full development life cycle, empowering developers in the cloud-native paradigm, and the difficulties of managing infrastructure.

Full Cycle Development: Infrastructure is hard

"It's a controversial take, but with a developer bias, I find that infrastructure can be unreliable; it fails; it is unpredictable. Compared to software that runs pretty much the same way every time, infrastructure is really, really hard." -Cheryl Hung, former VP Ecosystems, CNCF

Developer ownership of the full software life cycle is often cited as a prerequisite for cloud-native development on Kubernetes, but it is not necessarily a given, according to Cheryl. Having worked on infrastructure for a number of years after many years as a software developer at Google, Cheryl holds that the developer's primary focus should still be building things that work and run, and the infrastructure that makes this happen just has to work. The developer does not care, and does not need to care, what makes up that infrastructure. "My personal take: let the application developer focus on what they are good at and let the infrastructure teams focus on what they do."

Cheryl expressed skepticism about the idea of full developer ownership. As she explained, "infrastructure is very, very hard" and it can take years to figure out the best way to do fundamental things such as storage and monitoring. However, developers are not completely let off the hook. It's unrealistic to expect a developer and a DevOps engineer to be "interchangeable", but some understanding of each other's roles and responsibilities is essential to collaborating effectively. And collaboration is the key to delivering software with the speed and safety the business requires.

Shift-left? How much infrastructure do developers need to know?

How much of the software life cycle a developer owns depends largely on the kind of organization the developer works in. Cheryl described her own time as a developer at Google, where the developer experience was paramount and infrastructure concerns were invisible. Developers could get on with their work with ease and convenience, knowing that the underlying platforms were taken care of. At the same time, Cheryl explained, as nice as this is for the developer who just wants to code, this setup is not conducive to understanding the bigger deployment and delivery picture.

In an environment like Google, where resources felt unlimited, this might not have been a problem. However, for developers in leaner, smaller organizations, questions about provisioning and managing infrastructure, and about the monetary cost of building for scale, require more knowledge and care. This kind of developer experience is fundamentally different: these developers need to understand the full process. With "you build it, you run it" driving cloud-native developer experiences, shifting responsibilities left has become mainstream, and how much ownership developers take on rests largely on how mature the specific company is.

Despite new cloud-native complexities, Cheryl highlighted benefits: "The reason I like containers and everyone likes the cloud-native paradigm... it makes it easier. It lets you pretend that you’ve got a cluster that can scale infinitely, that has no problems, that never goes down." In this paradigm, given the right tools and visibility, developers do gain an opportunity to move toward as much or as little ownership as they (or their company) want. The evolving developer experience, then, is about flexibility.

Powered by the right tools and visibility: A centralized developer control plane

What are the right tools and visibility to empower developers to take on full lifecycle ownership?

Cheryl shared that this, too, can vary. Like Kasper Nissen of Lunar, who described providing a "paved path" that gives developers a set of recommended tools and the visibility they need to do their work, Cheryl stressed the importance of providing the right level of abstraction between developers and the platform, and noted the extremes this can take. Some companies overwhelm developers with access to everything, while others hide everything away behind a custom abstraction. Either extreme can be problematic, yet it is difficult to establish proper guardrails between the two.

By finding a happy medium, for example by creating a developer portal or control plane, it's possible to set a baseline experience without tying a developer's hands if and when they want to dive deeper into the underlying infrastructure or swap tooling.

"Most of the time, you don't need the depths of everything to code. You need something like Backstage- a developer portal that lets you do what you need to from a dashboard. Give developers a UI from which they can do 95% of the things they need to do. This lessens the learning curve and creates a good developer experience. At the same time, if they need to access anything beyond that, empower them to learn the platform tools, to 'break the glass' to escape the default dev environment they've been given."

Conclusion: Self-service to empower developers with just enough freedom and responsibility

Balancing freedom with responsibility is central to empowering developers to adopt an ownership mindset toward the software they develop. Previous interviews in this series with Lunar's Kasper Nissen and CartaX's Mario Loria called out the need for platform and SRE teams to guide cloud-native developers toward the code-ship-run model, and toward greater independence, by providing the fundamental platform components and tools to support them. That is, lay the groundwork for success and make the developer freedom/responsibility equation clear from the outset.

Cheryl mirrored these experiences: "The more self-service you can provide from the platform to the application developers, the better. It saves time on both sides, and empowers both the developer and the platform team to focus on their core focus areas." The platform team is best placed to build out the kinds of tools and functions developers request, helping to create a self-service culture.