Reduce complexity with standardization but make room for exceptions

When practitioners in cloud-native software development hear the word "platform" or the term "platform engineering", no shared definition springs to mind. To understand more about what the platform concept means to people working in different roles, Ambassador Labs talked to Humanitec's CEO, Kaspar von Grünberg, nesto's Director of DevOps, Mathieu Frenette, and Syntasso's Chief Operating Officer, Paula Kennedy.

Developer platforms: Different things to different people

No single, industry-wide definition of a developer platform exists to date and, according to our discussion, that may be the point. The idea of building a platform is not to offer an out-of-the-box platform-as-a-service (PaaS) but to drive standardization within the organization building it and to make life easier for its developers. That will mean different foundations and tools for every organization.

Humanitec's CEO, Kaspar von Grünberg, described platforms from his point of view, "For me a platform is the sum of all tech and tools that a platform engineering team binds together to pave a golden path or paths. Developers leverage this path to self-serve with low cognitive load. That is important to me because it drives standardization by design for the respective organization. What a platform is will remain a blurry definition - because it lies in the eye of the beholder."

Mathieu Frenette from nesto echoed similar sentiments, explaining, "A platform is everything that we can provide as tools, components and foundations to help developers do their job more easily, faster, and ideally with more robustness, predictability, repeatability and confidence. I would like to emphasize confidence because confidence is key to and prerequisite for innovation. If we want developers to be at ease trying out new things - they need to be confident that they have a solid platform underneath them."

Looking at platforms from the C-level, Syntasso's COO, Paula Kennedy, agreed that platforms are different things to different people and are built to satisfy the needs of the organization building them and to help create a curated developer experience.

Informed by the Team Topologies approach to platforms, Paula shared, "Team Topologies highlights the importance of the thinnest viable platform, providing just enough to give the developers what they need to get the job done – enough capabilities that developers can go fast without worrying about things like infrastructure or security. Providing a good developer experience via a developer platform is about never giving a developer too many choices or adding to cognitive load."

Think big picture on developer platforms

What does a developer platform need to deliver and who drives the process of building a platform in an organization? Once again, one size does not fit all, as every organization has different needs and different levels of cloud-native maturity. Nevertheless, the panelists acknowledged several similarities in both the motivations and challenges of building and launching a platform.

Paula shared, "Everyone wants to get value in the hands of customers, but in developing platforms to support this goal, we have seen different product teams across the same org duplicating efforts, each of them building a solution to the same problem, everyone building their own path to production with multiple platforms. Sometimes you may also have an ops team trying to centralize.

But none of these efforts succeed if those driving the development are not talking to product teams, gathering user research or meeting customers' needs – they are building a solution that may never be used, and wasting resources through both duplication of effort and not understanding the underlying need for the platform."

Sharing similar experiences, Mathieu was keen to paraphrase a quote from Albert Einstein, "Einstein said, 'You can't solve a problem with the same thinking that created the problem'. With platforms, people are too often absorbed in their daily work and immediate problems, losing the bigger picture.

One key to building and making a developer platform successful is having a platform team that can take a step back from the daily noise and figure out the value stream, bottlenecks and solve at that level, routinely bringing in a fresh perspective."

Both Mathieu and Paula described something akin to the creation of an "unintentional platform", where a team (often a developer team) creates something to solve a discrete problem, but they keep adding to it until they have a platform they never planned to build.

While this approach can work and meet developer needs early on, according to Mathieu, as things scale, the reality changes. "While the impetus to create a platform may start with developers, somewhere along the line, you need the platform team. As things scale, developer teams start to get pressure to deliver features.

Naturally there is less focus on infrastructure and foundation. Technical debt accumulates, and even the management rarely acknowledges the cost of technical debt or necessity to pay it off. It often happens that the initial platform becomes obsolete or no longer up to the task. This is why we need the big picture view on platforms."

Developer experience: A move to the right from shifting left

Extending this line of thinking, Kaspar has observed similar trends, with platforms "building themselves". He recognizes that while requests for platforms often come from operations, it is developers – who are, in Kaspar's experience, reluctant to "shift left" – who benefit from the self-service nature of platforms and from the ops time that platform adoption can free up.

"If I am a developer, 80% of the time, it is business as usual: Git push, updating business logic, etc. With a platform and standardized tooling, the ops team is freed up to focus on the 20% of other cases with the developer. For example, in a large enterprise org, developers can be confronted with extreme waiting times for ops assistance.

A platform can turn the tide, and developers would become more demanding about 'we need more self service, more automation,' which gives developers a better experience, lets ops and platform teams focus on the big picture and exceptional cases, and keeps developers from having to shift left completely."

ROI to justify platform engineering

No organization will be prepared to invest in platforms without a full understanding of the metrics justifying the need for a platform. Each of the participants had slightly different takes on the question of ROI.

Because, as Kaspar explained, the platform engineering space is immature and constantly evolving, it is difficult to measure the success, or return on investment, of adopting a platform. However, he is seeing a significant increase in the willingness of businesses to "put their money where their mouth is", as long as there are ways to articulate the value of the platform they are building and deploying.

Kaspar continued, "To calculate ROI, I recommend that platform engineering teams write down the things that go beyond the simple update of an image, then normalize it. How often do you do that against 100 deployments? How much time does that involve for developers and ops teams? Answering these questions can build your own ROI case. Every good product manager is looking at their contribution to the business. Without these figures to justify what you're doing, you cannot blame anyone because they are making decisions with the data that they have. If you don't provide that data, they cannot make a decision and fund your platform."
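The normalization Kaspar describes can be sketched as simple arithmetic. The Python below is a minimal illustration only: the task names, frequencies, hours and hourly cost are hypothetical placeholders, not figures from the discussion. It adds up how often each task beyond a simple image update occurs per 100 deployments and how much developer and ops time each occurrence takes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DeploymentTask:
    """A task that goes beyond a simple image update (all values hypothetical)."""

    name: str
    occurrences_per_100_deployments: float
    dev_hours_each: float
    ops_hours_each: float


def hours_per_100_deployments(tasks: list[DeploymentTask]) -> float:
    """Total developer plus ops hours spent on these tasks per 100 deployments."""
    return sum(
        t.occurrences_per_100_deployments * (t.dev_hours_each + t.ops_hours_each)
        for t in tasks
    )


def cost_per_100_deployments(tasks: list[DeploymentTask], hourly_cost: float) -> float:
    """Convert the normalized hours into a cost using an assumed hourly rate."""
    return hours_per_100_deployments(tasks) * hourly_cost


# Placeholder inputs; replace with numbers measured in your own organization.
example_tasks = [
    DeploymentTask("add environment variable", 10, 0.5, 0.5),
    DeploymentTask("provision new database", 2, 1.0, 4.0),
    DeploymentTask("spin up new environment", 1, 2.0, 6.0),
]

if __name__ == "__main__":
    hours = hours_per_100_deployments(example_tasks)
    cost = cost_per_100_deployments(example_tasks, hourly_cost=100.0)
    print(f"Hours per 100 deployments: {hours:.1f}")
    print(f"Cost per 100 deployments: {cost:.2f}")
```

Running the same calculation with the estimated time per task once a platform is in place gives a second figure, and the difference between the two is one way to express the savings in an ROI case.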

Mathieu fully agreed with Kaspar but extended the equation, "Often there are specific things we want to tackle, in our case it's the struggle with integration testing and building CI, that are intricate and heavy for developers, and this takes a toll. That is, it takes time to run, it is difficult to set up when you want to troubleshoot, it is complicated to assemble and configure. By extension, it is extremely difficult to onboard new developers because of this complexity, and we end up losing developers because of this complicated work environment.

All of these are metrics that must be taken into account, too, but these are truly challenging to calculate. How can you put numbers on developer onboarding and experience? Maybe with a platform you can, and you have to because all of these factors are costs to the business."

What is important to include in a platform? Value versus cost

Proving value to the business, Paula agreed, is essential, "We have long talked about metrics in terms of the five Ss – security, scalability, speed, savings and stability. And these are important once a platform or product is built. But actually setting the right KPIs and measuring the right metrics depends on understanding true stakeholders and their journeys and what they each care about. We get an accurate reflection of what is critical and what isn't rather than building our platform and metrics around assumptions."

Paula went on to describe a two-day stakeholder workshop launched just before the inception of a platform within a large financial institution. As part of this exercise, they mapped out the journey from the developer's idea for an application through all the steps that have to happen to get it into production. Each team of stakeholders then added all their necessary steps to the journey. Paula continued, "We could easily see what steps took the most time, what could be cut, what steps are actually important to stakeholders, and build from there."

Defining what is critical before taking any action is also key to understanding what to build, for whom, and how to secure adoption and use, all of which will funnel into platform success metrics and ROI figures.

Platforms: Where to start?

Paula's views on the full journey highlighted one of the biggest questions – where should companies get started with the platform journey? The interviewees, coming from quite different roles, brought different perspectives to bear.

Paula, taking a high-level view, expanded on her earlier rationale, "My natural instinct is to examine what is commodity and what is the unique part you should focus on. You need your building blocks: analyze the full set of requirements, do a build-versus-buy analysis, and assess how you can build a platform from the best, lowest-cost, well-supported tools that already exist.

Most of your platform will be built this way. Then assess where your differentiator is, where your platform engineering team needs to focus. Analyze at a high level and determine what you need: 90-95% of what you need comes out of the box, and compose those. Your last, unique 5-10% is your special, unique sauce."

Kaspar shared Paula's take on determining the right building blocks, "I would ensure that the often underappreciated building blocks – the things that happen most often – are taken care of. For example, teams frequently need to fix configuration management.

The thing you do most often, apart from updating an image, is adding an environment variable or applying a change across all environments, and if you don't do that well, you drive change failure rate through inter-environment drift. It is not that complex but is a common problem."
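The inter-environment drift Kaspar mentions can be illustrated with a small check. This is a minimal sketch, assuming each environment's configuration is available as a plain dictionary of environment variables; the environment names, variable names and the `find_missing_keys` helper are all hypothetical. It compares variable names only, since values such as database URLs legitimately differ between environments.

```python
from __future__ import annotations


def find_missing_keys(envs: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each variable name to the sorted list of environments that lack it.

    Only variables missing from at least one environment are returned.
    """
    # Collect every variable name defined in any environment.
    all_keys: set[str] = set()
    for config in envs.values():
        all_keys.update(config)

    drift: dict[str, list[str]] = {}
    for key in sorted(all_keys):
        missing = sorted(name for name, config in envs.items() if key not in config)
        if missing:
            drift[key] = missing
    return drift


# Hypothetical configuration: FEATHER-weight example with one drifting variable.
environments = {
    "dev": {"DATABASE_URL": "postgres://dev", "FEATURE_X": "on"},
    "staging": {"DATABASE_URL": "postgres://staging", "FEATURE_X": "on"},
    "production": {"DATABASE_URL": "postgres://prod"},
}

if __name__ == "__main__":
    for key, missing in find_missing_keys(environments).items():
        print(f"{key} is missing in: {', '.join(missing)}")
```

A check like this could run before each deployment, so that a variable added to one environment but forgotten in another is caught before it causes a failed change.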

Mathieu agreed with both other guests, bringing everything home to following best practices. "There are a lot of different ways to do things, but if you start and stick with best practices from the beginning, you still have all the possibilities open to you. For example, by sticking to one app per process, sticking to regular Linux-based signaling, putting everything in containers, etc., you have not painted yourself into a corner and still have options.

Do you really need Kubernetes, or do you want to go with Heroku? All those things are still possibilities if you created your application with best practices in the first place. As Paula said, we are not trying to reinvent the wheel when we are designing our system - the main point is to focus on value."

Conclusion: Complexity reduction – Platform standardization and necessary exceptions

"Perhaps more important than fancy golden paths is to sit down and come to an agreement on the lowest common tech denominator," Kaspar declared. "Why run ultra-complex things if there is an alternative? It is like taking a tractor to do your grocery shopping, which is not productive. If you scatter things all over the place, you are not getting the effects of scale, and the tools you bring in are not delivering ROI. This is why I advocate for the value of standardization.

Standardization forms the lowest common tech denominators, clearing the way for individual freedom where needed. We cannot get too hung up on the idea of not getting locked in, as Gregor Hohpe has argued, because avoiding lock-in is a kind of lock-in. Instead, focus on the standardizations, or golden paths, that reduce complexity and help developers move faster."

Mathieu's experience mirrors Kaspar's thinking on standardization, "Golden paths should be the easiest path, which covers most of your cases. But there will always be exceptions. If developers do not have exit routes from the golden path to address exceptions, the golden path becomes a golden cage. In our case, we make exceptions but afterwards go back and figure out how to introduce that exception into the golden path, maybe make it a new standard. In our case, when we introduced Kafka as an alternate way of doing asynchronous messaging, we decided to make it the new standard. We will even try to phase out the old solution eventually. Exceptions are important - for us, they drove innovation and allowed us to move forward."

"Standardization is how to gain economies of scale and scope, helping organizations reap many benefits," Paula added. "There will always be edge cases, and making them as standard as possible, as Mathieu said, is key to creating the balance. Anyone who is thinking of building or buying a platform has to try to focus on making the 80% as standard as possible while leaving room for those critical edge cases, where you're likely to add that extra, unique, magic sauce that only you could build."