In this latest episode of The New Stack Makers podcast, we delve more deeply into the emerging practice of platform engineering. The guests for this show are Aeris Stewart, community manager at platform orchestration provider Humanitec and Michael Galloway, an engineering leader for infrastructure software provider HashiCorp. TNS Features Editor Heather Joslyn hosted this conversation.
Although the term has been around for several years, platform engineering caught the industry's attention in a big way last September, when Humanitec published a report that identified how widespread the practice was quickly becoming, citing its use by Nike, Starbucks, GitHub and others.
Right after the report was released, Stewart provided an analysis for TNS arguing that platform engineering solved the many issues that another practice, DevOps, was struggling with. "Developers don’t want to do operations anymore, and that’s a bad sign for DevOps," Stewart wrote. The post stirred a great deal of conversation around the success of DevOps.
Platform engineering is "a discipline of designing and building tool chains and workflows that enable developer self service," Stewart explained. The purpose is to give the developers in your organization a set of standard tools that will allow them to do their job (write and fix apps) as quickly as possible. The platform provides the tools and services "that free up engineering time by reducing manual toil [and] cognitive load," Galloway added.
But platform engineering also has an advantage for the business itself, Galloway elaborated. With an internal developer platform in place, a business can scale up with "reliability, cost efficiency and security," Galloway said.
Before HashiCorp, Galloway was an engineer at Netflix, and there he saw the benefits of platform engineering for both the dev and the business itself. "All teams were enabled to own the entire lifecycle from design to operation. This is really central to how Netflix was able to scale," Galloway said. A platform engineering team created a set of services that made it possible for Netflix engineers to deliver code "without needing to be continuous delivery experts."
The conversation also touched on the challenges of implementing platform engineering, and what metrics you should use to quantify its success.
And because platform engineering is a new discipline, we also discussed education and community. Humanitec's debut PlatformCon drew over 6,000 attendees last June (and PlatformCon 2023 has just been scheduled for June). There is also a platform engineering Slack channel, which has drawn over 8,000 participants thus far.
"I think the community is playing a really big role right now, especially as a lot of organizations' awareness of platform engineering is just starting," Stewart said. "There's a lot of knowledge that can be gained by building a platform that you don't necessarily want to learn the hard way."
Alex Williams 0:08
You're listening to The New Stack Makers, a podcast made for people who develop, deploy and manage at-scale software. For more conversations and articles, go to thenewstack.io. All right, now on with the show.
Heather Joslyn 0:49
Hello and welcome to another episode of The New Stack Makers. I'm Heather Joslyn, features editor of The New Stack. And today we're going to take up a topic that really caught fire in 2022, which is platform engineering. 2022 saw the first PlatformCon, a conference aimed at platform engineers that drew more than 6,000 attendees, and Gartner added platform engineering to its Hype Cycle for Software Engineering. So that's another sign of its hotness. And platform engineering was the talk of the sessions and the hallways at KubeCon in Detroit this fall. One of our guests today is one of the reasons the topic was on everyone's minds in the second half of 2022. You may have read the article by Aeris Stewart, community manager at Humanitec, that appeared in September on our site. It was titled "DevOps Is Dead. Embrace Platform Engineering." That's a title meant to get your attention, and so did the article. It was one of our most-read posts in 2022. And Aeris is here to talk about the response to that, and also how an organization that's interested in platform engineering might get started in 2023. Hi, Aeris, thanks for joining us.
Aeris Stewart 1:53
Hi, thanks for having me.
Heather Joslyn 1:54
And we're also joined by Michael Galloway, the new director of platform engineering infrastructure at HashiCorp. While in his previous role as senior director of engineering at Doma, Michael wrote a great piece for Medium, a couple of them, on building a platform engineering team with purpose and with a sense of mission. And that's worth the read, following our larger discussion today about adopting platform engineering. Hi, Michael.
Michael Galloway 2:16
Hi, thanks for having me.
Heather Joslyn 2:18
We've got a lot to talk about, so let's just jump right in. Just to get all our listeners on the same page: The two of you, how do you each define platform engineering? Aeris, do you want to go first?
Aeris Stewart 2:27
Sure, I can go first. I would define platform engineering as a discipline of designing and building tool chains and workflows that enable developer self service.
Michael Galloway 2:39
[Joslyn: And Michael?] I think I probably have a similar definition. I would define platform engineering as an organization that provides tools, services and opinions that free up engineering time by reducing manual toil and cognitive load. But it also enables the business to scale with reliability, cost efficiency and security. A real quick, great example of this was at Netflix, where they built a platform organization that really focused on enabling engineers to adopt a full-cycle developer approach. That meant all teams were enabled to own the entire lifecycle from design to operation. This is really central to how Netflix was able to scale. And so platform engineering teams built services that made it possible for engineers to deliver code, for example, with safety and reliability and consistency, without needing to be continuous delivery experts. So that would be one example of where I have found platform engineering to be a critical function in an organization.
Heather Joslyn 3:38
Aeris, your piece on TNS really drew together some of the thinking out there about the rise of platform engineering, and the reasons why DevOps has not caught on, particularly with devs. What kind of feedback did you hear from the larger community after it was published?
Aeris Stewart 3:52
Well, I wouldn't say that DevOps hasn't caught on with developers. In fact, I would say it's mostly the opposite. You have to put DevOps as a philosophy in context with the siloing and the inefficiencies that came before it. And so when you look at it that way, it really makes sense that DevOps has become as popular and pervasive as it is today. And I would say, because of that, the reception to the original article was quite mixed. It kicked off a very big and continuing debate about what DevOps is, and whether platform engineering can replace it in the first place. In the article, I used a very simplistic definition of DevOps, which is "you build it, you run it." And while I think that can become the reality for a lot of developers at these organizations, it doesn't quite capture all of the contributions that DevOps has made to the space. If you're building a platform, you can still be using DevOps metrics, DevOps tooling, DevOps principles. And so I think one of the best criticisms, and some of the best conversations I've had with the community about the article, is that platform engineering is making an important contribution to the space. It is helping alleviate cognitive load, and it's helping organizations scale a lot of these solutions. And platform engineering is helping a lot of these organizations realize what DevOps originally promised, which is that you can deliver software faster, just without the additional cognitive load.
Michael Galloway 5:23
I really liked the way Aeris framed the challenge in that article. Generally, I would very much agree. I think DevOps gets confused sometimes because it's been used for so many different things. But I tend to look at DevOps as a philosophy, more than how I sometimes see it applied, as a title or a role or something like that. I really don't think there is a DevOps person on a team; rather, it's the way that the team thinks about its responsibilities and role. But exactly as Aeris just said, around the idea of operating what you build or operating what you design, part of the problem there is that there's a significant amount of burden on engineers or others to learn how to be good at all of those other aspects of the lifecycle that they haven't really spent a lot of time on. It's why, historically, we've had things like folks dedicated to QA, or folks dedicated to operations, or folks dedicated to first-tier support or things like that. So this is where I think platform engineering largely enables that philosophy. It enables organizations to actually achieve that philosophy without placing an enormous burden on individual teams or individuals to build up those skills, and rather unlocks their ability to focus even more on the product goals, while still having the incentives and capability to manage problems in production that can then feed back into future designs. So you need to enable them to be effective at that.
Heather Joslyn 6:55
So it's more a complement to DevOps, in that it enables your organization to achieve DevOps principles and a DevOps way of working. A question for both of you: What kind of impact might that also have on developers' happiness, and operations engineers' happiness as well, in terms of their jobs? A lot of organizations struggle with how to retain engineers. How do you keep them on the job and happy, and make their jobs fulfilling and satisfying for them? Does platform engineering have a role there as well?
Aeris Stewart 7:29
Yeah, maybe I can start with a general example, and Michael can give us a more specific example from his work. I would say that platform engineering can improve the quality of life for both developers and operations. For devs, reducing that cognitive load of learning all of these different, really complex tools and workflows can massively improve quality of life, and having the support from the platform can be really helpful. That being said, platforms don't necessarily provide a golden cage for developers, either. They still have the freedom to look underneath the hood if they need to. And so having that balance is really important. For operations, sometimes they can be stuck with a lot of tickets from developers who don't know how to navigate all these complex tools or infrastructure, and this distracts from their more important responsibilities. So operations can really benefit from platforms, because they reduce the need for developers to rely on them for simpler tasks, and they can focus on their more important responsibilities.
Michael Galloway 8:35
Yeah, just maybe to build on that. I think engineers don't typically care that much about the underlying infrastructure or continuous delivery or continuous integration. I have very specific and vivid memories of being in delivery engineering at Netflix and examining pipelines to have conversations about ways we could inject better safety into the delivery workflows. This is something I was particularly passionate about, and I was pretty much alone in that passion when I engaged with teams. And I understand: sitting down and talking about how we can reliably deploy to three different regions, and roll back if we don't, is not really a day-to-day thought process or concern there. The teams are hired to deliver those awesome features, so they really just want to work on building the cool stuff for the business. I think engineers very much love the ability to observe, introspect and solve issues if they're happening in production. Nobody wants a release of theirs to be an issue, so they want that. But what this really comes back to is, I think, as you start to grow and scale as an organization, if you put the responsibility for all of the vertical bits, in terms of getting certain solutions out there, on engineers, each team is going to then carry that burden. And where it gets really problematic is when you need cross-team consistency. There isn't anybody who owns that. So usually you'll see working groups or communities of practice around delivery or other things, when you don't have platform engineering groups, to try to solve this problem. That, even more so, is where the burden starts to really creep in. And engineers start to express this: "Hey, 50% of my time I'm dealing with non-product engineering concerns, because I have to do this or I have to set up that."
So then they get into a copy-and-paste strategy just to basically move past that, which ultimately causes lots of problems when you want to scale as an organization. So yes, it alleviates a lot of burden.
Heather Joslyn 10:41
There are likely some listeners who want to try platform engineering in their organization. What would be their initial steps?
Michael Galloway 10:47
So the very first thing you actually want to do, if you want to get this started, is find DevOps-inclined engineers. You find those people who really do enjoy this, and you build a small working team, and that team is the seed. That team needs to come from your product engineering teams, so that they have a deep understanding of what their workflows and challenges look like. When you're going from zero to one, you really want to start delivering real value and impact quickly to earn engineering trust. Once you have that, then you can start taking a step back, because now you want to go from one to 10. And you need to start understanding things like: What is this organization here for? What are the business outcomes that need to be achieved? And you can start having a bigger direction. But it starts with a very scrappy, small working group that can quickly start resolving small problems in workflows, building some small tools, some simplification. You're not trying to replace anything yet; you just want to simplify. And that will give you a really good starting point for understanding what the value of a platform engineering team should really look like. You should not just copy and paste from some other company, because those problems are very different from one company to another, in terms of where you should focus and the maturity cycle of that.
Heather Joslyn 12:05
Aeris, do you have some thoughts about initial steps?
Aeris Stewart 12:08
Yeah, I would echo what Michael says: Start small, tackle the lowest-hanging fruit, and don't try to reinvent the wheel. And really start establishing a strong feedback loop with your developers, so that you understand whether or not your platform is really addressing their needs the way that you intend it to.
Heather Joslyn 12:29
Michael, you spell out some of the ways an organization should go about building a platform team in your articles on Medium. Can you walk through some of the steps you took at your previous organization, to give us a sense of how and why those steps made a difference?
Michael Galloway 12:44
So one of the most important ones, I think, starts with defining your purpose. There are four main steps that I recommend. This is, by the way, really about going from one to 10. Zero to one should be scrappy; you shouldn't spend a ton of time on this. You're just trying to find out if there's really a there there.
Heather Joslyn 13:02
Sort of a minimal viable product, in a way, of this?
Michael Galloway 13:07
Yeah, right. In this case, I now think there is a there there, and I want to start scaling up, and I need to think a little bit more. I need to form an actual organization. So the four steps, at a high level, start with defining your purpose. In other words, why do you exist? What are you here for? And that's a nontrivial question. There are usually common explanations for that, but you really want to go through the effort of defining this, and I go into a little bit of detail around how to get into that. After you've got your purpose defined, as in why you exist, you need to define where you're going to go: What is the outcome that you're going to go after? And that's really your vision. You're looking for about a 12- to 18-month timeline. You don't want to go too far into the future, but you want to get to something that is resonant and meaningful. Then you define a plan, a plan broken up into milestones, with delivery of real value on a steady cadence. That's really important, because platform engineering organizations are notorious for accidentally slipping into a black-box mode where nobody understands what they're working on or the value they're producing. So you need to be very conscious of regular drops of real value, as well as making sure you're tying them to the business needs, which evolve. And the last one is forming a comprehensive communication strategy. There are some great articles on platform engineering communication challenges. They are unique because of the nature of that organization, in terms of who you need to stay in communication with, how you communicate your value, and how you regularly establish those touch points. And they're more than just reaching out to the engineering teams.
And internally, you need to factor in upper leadership, and you need to factor in other stakeholders, because those are all people who impact your ability to really move the needle when it comes to things like horizontal migrations that might need to happen, like moving from one type of database system to another, if that's a really important thing for the business. In order to be able to do that, you have to have lines of sight to a variety of different parts of the organization.
Heather Joslyn 15:03
Yeah, the communication aspect. I think sometimes people forget about that: Make your work visible to the rest of the organization, especially to the people who decide your budget. Right? One thing that you address in the Medium pieces is the importance of dealing with unaddressed inefficiencies, the pebble in your shoe that can slow down an organization's overall productivity. How do you incorporate time to deal with that on a platform team, balancing the product and support aspects of the platform team?
Michael Galloway 15:34
Yeah, the best part, Heather, is that I think the answer is right there in your question. I think Aeris can probably talk to this a lot, too: A very big theme that came out a lot in 2022 is recognizing that a platform organization is, and must function like, a product organization. So it's actually both a support organization and a product organization. It's very interesting, because you have very different cycles and things you need to pay attention to in each type of organization. A platform organization that is formed from an IT group tends to over-index on functioning as a support organization, without enough product planning or direction setting. And that's where they fall into a trap where all the support requests coming in are effectively their roadmap, which means that they are only able to scale through the sheer number of humans, and some degree of automation that they can try to introduce. Similarly, one that only comes from the product engineering organization tends to want to go into a black box and build a massive solution that's going to get dropped 12 months later and hopefully change the planet, which is also a risky anti-pattern. From a platform's perspective, you really need to balance both. As the support organization, some tactical things you can do to find those pebbles are things like: When you are getting support requests, make sure that you start gathering context about them. Ask, "What are you trying to accomplish?" Gather that context and start capturing that information. Stay close to your customers, interview them, and engage in research in terms of how they're utilizing the tools. Ask questions in surveys, and get more curious, ultimately, about what it is that they're trying to accomplish. What are their concerns? What are the pressures that they're receiving?
Crucially, this is IC to IC. Honestly, I know I'm in leadership, but I'm going to disparage leadership slightly here: Leadership conversations alone, meaning me going in and meeting with other managers or directors of the other organizations, very rarely uncover the pebbles. Pebbles are really at the individual contributor level, where they're interacting directly with tools and solutions. And so you need to get those signals at that level, and you need to just budget time for it, like you would budget for anything else that's important.
Heather Joslyn 17:45
Aeris, do you have some thoughts?
Aeris Stewart 17:46
Yeah, it's always been very striking to me that a lot of finding the balance between support and product is really a cultural or people issue, and less so a technical one. And I think that is where the platform engineering community can really step in and start to share more of that knowledge. The practitioners have the technical side down. It's the "how do we communicate and build this feedback loop effectively, so that we're building the right features into the platform and addressing the right problems" part that tends to be a really big issue. So yeah.
Heather Joslyn 18:22
Just to follow up: What signals do platform teams need in order to build relevant solutions for engineering?
Michael Galloway 18:28
So some of the key signals that I've seen, first, come directly from customer interaction. There's a jobs-to-be-done framework that I've found is very helpful when you're trying to ask customers questions about what it is they're trying to achieve. How do they stand up a new application? How do they add a database to their service? How do they debug their service? How do they know if a service is failing? Those kinds of questions help to give you a really clear understanding of the constructs in their minds on how to do their jobs. And enough of those will start to give you signals as to where there are problems. Other important inputs are product roadmaps. As a platform organization, you actually need to understand where those product teams are going, because when you are engaging them, you need to talk about the things that are relevant in their world. Another signal is, of course, upper leadership. I always like the question "What keeps you up at night?" That helps you understand what senior leadership is most concerned with. Are they concerned about reliability or cost efficiency or scaling? Are they concerned about security? Probably the answer is yes, but to what degree? What is the focus that they have? So those are really important signals in planning your work.
Heather Joslyn 19:35
And how do you measure success as a platform team?
Aeris Stewart 19:38
I would say it's a combination of quantitative and qualitative feedback. You're going to want to keep track of the performance across key DevOps metrics, but you're also going to want qualitative feedback from your customers on what's working for them and what isn't. You don't just build a platform and leave it there. You're going to iterate upon it and improve it. So asking the right questions, and figuring out where the platform can still improve, is going to be really important in making sure that you're building the best solution for your developers.
Heather Joslyn 20:13
Michael, do you have some thoughts as well on that, on what might make for success?
Michael Galloway 20:17
Yeah, I would agree. I think there are some very simple metrics. The metrics that are going to matter depend on who you're communicating them to. There are your own internal metrics, in terms of: What kinds of use cases do we formally support? How much of the path have we paved for customers? There is also the question of how many individual steps you have eliminated that somebody needs to do. Those are very meaningful to your customers directly. The business as a whole may not look at those, but they matter in terms of your customer relationships. The business as a whole is going to care about overall feature velocity over time, and certainly they'll look at the DORA metrics. Platform engineering does not itself generally move those DORA metrics, though, and I think it's really important to know that. We can influence them, but a lot of the things that affect those metrics are cultural in nature and depend heavily on the team. So I think platform engineering can provide guidance, support and the tooling to help them, and in that way we can influence it. But we can't just point to those metrics alone. So you do need to take a few things that you can more directly influence, and speak to those.
Aeris Stewart 21:25
I forgot about one of the most important metrics: adoption. Are people actually using your platform?
Heather Joslyn 21:30
Yeah. Yeah. I mean, you can build the best solution possible, but if no one's using it, that's a problem. So Aeris, you were one of the forces behind the first PlatformCon in 2022. Can you tell people a bit about the role of community in supporting organizations that are on this platform engineering journey, and how they might find that community?
Aeris Stewart 21:54
I think the community is playing a really big role right now, especially as a lot of organizations' awareness of platform engineering is just starting or growing right now. There's a lot of knowledge that can be gained by building a platform that you don't necessarily want to learn the hard way. And I think having a venue for practitioners to share their platform journey is really important in supporting organizations in those first steps, figuring out what works and what doesn't. I think, especially going into 2023, now that there is a greater conversation around platform engineering, we're going to see more case studies and reference implementations and deep dives into tooling, with combinations of tools that platform engineering organizations can take advantage of. And I think this will be really helpful in getting platform engineering initiatives off the ground and working just a lot faster. And I also think the community provides a place to just discuss and learn what works and what doesn't. You can access the platform engineering community on the website, I think it's platformengineering.org. There is a Slack channel with over 1000 members across the world. I think Michael's also in there, too, with some of his insights. And then we have webinars and in-person meetups about product management, best practices, open source tooling and all that great stuff. And PlatformCon 2023 is just around the corner, and that is also another great resource for folks who are looking for that expert insight when it comes to platform best practices.
Michael Galloway 23:29
There's really a great community, I've met a lot of folks in there, I would highly recommend it to anybody interested in this space.
Heather Joslyn 23:36
Perfect. And we'll link to the platformengineering.org and PlatformCon websites in the show notes that go with this podcast, so people can access them pretty easily. Thank you very much, Aeris Stewart of Humanitec and Michael Galloway of HashiCorp, for joining us. And we'd like to thank all of you for listening today to our conversation. I'm Heather Joslyn for The New Stack, and see you next time.
Alex Williams 24:00
Thanks for listening. If you like the show, please rate and review us on Apple Podcasts, Spotify, or wherever you get your podcasts. That's one of the best ways you can help us grow this community, and we really appreciate your feedback. You can find the full video version of this episode on YouTube. Search for The New Stack, and don't forget to subscribe so you never miss any new videos. Thanks for joining us, and see you soon.
Transcribed by https://otter.ai