The New Stack Podcast

Chronosphere Nudges Observability Standards Toward Maturity

Episode Summary

DETROIT — Rob Skillington’s grandfather was a civil engineer, working in an industry that, over more than a century, developed processes and know-how that enabled the creation of buildings, bridges and roads. “A lot of those processes matured to a point where they could reliably build these things,” said Skillington, co-founder and chief technology officer at Chronosphere, an observability platform. “And I think about observability as that same maturity of engineering practice. When it comes to building software that actually is useful in the world, it is this process that helps you actually achieve the deployment and operation of these large-scale systems that we use every day.” Skillington spoke about the evolution of observability, and his company’s recent donation of an open source project to Prometheus, in this episode of The New Stack Makers podcast. Heather Joslyn, features editor of TNS, hosted the conversation. This On the Road edition of The New Stack Makers was recorded at KubeCon + CloudNativeCon North America in the Motor City. The episode was sponsored by Chronosphere.

Episode Notes


A Donation to the Prometheus Project

Helping observability practices grow as mature and reliable as civil engineering rules that help build sturdy skyscrapers is a tough task, Skillington suggested.

 

In the cloud era, he said, “you have to really prepare the software for a whole set of runtime environments. And so the challenges around that is really about making it consistent, well understood and robust.”

 

At KubeCon in late October, Chronosphere and PromLabs (founded by Julius Volz, co-creator of Prometheus) announced that they had donated their open source project PromLens to the Prometheus project, the open source monitoring and alerting primitive.

 

The donation is a way of placing a bet on a tool that integrates well with Kubernetes. “There's this real yearning for essentially a standard that can be built upon by everyone in the industry, when it comes to these core primitives, essentially,” Skillington said. “And Prometheus is one of those primitives. We want to continue to solidify that as a primitive that stands the test of time.”

 

“We can't build a self-driving car if we're always building a different car,” he added.

 

PromLens builds Prometheus queries in a sort of integrated development environment (IDE), Skillington said. It also makes it easier for more people in an organization to create queries and understand the meaning and seriousness of alerts.

 

The PromLens tool breaks queries into a visual format, and allows users to edit them through a UI. “Basically, it's kind of like a What You See Is What You Get editor, or WYSIWYG editor, for Prometheus queries,” Skillington said.

 

“Some of our customers have tens of thousands of these alerts defined in PromQL, which is the query language for Prometheus,” he noted. “Having a tool like an integrated development environment, where you can really understand these complex queries and iterate faster on setting these up and getting back to your day job, is incredibly important.”

 

Check out the full episode for more on PromLens and the current state of observability.

Episode Transcription

Colleen Coll  0:08  

Welcome to this special edition of The New Stack Makers On the Road. We're here at KubeCon North America, with discussions from the show floor and technologists giving you their expertise and insights to help you with your everyday work. Chronosphere is the only observability platform that puts you back in control by taming rampant data growth and cloud native complexity, delivering increased business confidence. Engineering organizations trust Chronosphere to help them operate scalable, highly available and resilient applications.

 

Heather Joslyn  0:47  

Hello, and welcome back to another On the Road edition of The New Stack Makers podcast. I'm your host, Heather Joslyn, features editor of The New Stack, and we're here at KubeCon + CloudNativeCon North America, here in Detroit, the Motor City. And we're talking about what lies beyond open source software. Everyone uses open source software, but what should you consider, you know, if you're thinking of building something in house, or you're thinking about purchasing something from a vendor, in terms of the issues around open source software, and especially around observability, which can be especially challenging in a distributed cloud environment or multicloud environment? We're joined here by Rob Skillington, co-founder and chief technology officer at Chronosphere. Hi, Rob. Hi, Heather. Thanks for having me. Oh, you're quite welcome. Thanks for joining us. Um, do you want to tell us a little bit about Chronosphere?

 

Rob Skillington  1:41  

Yeah, so Chronosphere is, you know, a product company, born out of a new set of operating environments and procedures that we find ourselves in today. You know, the shift to cloud native is something that myself and my co-founder felt all too real, going from, you know, slower, less ephemeral environments into this fast-paced world of Kubernetes, containers, also operating and deploying services yourself in a DevOps model. And Chronosphere is really a product born out of that set of problems that kind of intersect, which, you know, for engineers makes life pretty tough without the right tools.

 

Heather Joslyn  2:29  

And it's a pretty new company as well. It's, like, 2019, right?

 

Rob Skillington  2:32  

Yes, just three years ago, one whole pandemic.

 

Heather Joslyn  2:35  

Oh, yeah. It seems like just yesterday, doesn't it, and 100 years ago, at the same time. So I just want to also mention, we thank Chronosphere for being our sponsor for today's episode of Makers. So let's get to talking about what Chronosphere does. It's an observability platform. How would you define observability?

 

Rob Skillington  2:57  

So the way that I define observability is essentially the practice of operating, deploying and developing software. And in my mind, you know, my granddad was a civil engineer, and over 100 years, the processes and the know-how to actually build the structures reliably, in a way that serves everyone that coexists in these fantastic buildings and bridges and roads and whatnot, you know, a lot of those processes matured to a point where they could reliably build these things. And I think about observability as that same maturity of engineering practice. When it comes to, like, building software that actually is useful in the world, it is this process that helps you actually achieve the deployment and operation of these large-scale systems that we use every day.

 

Heather Joslyn  3:54  

So you can see what's going on in every stage, you can see what's happening,

 

Rob Skillington  3:57  

You can set up monitoring so that, you know, some of our customers have tens of thousands of alerts, so even the most minute problem bubbles up and is made known to a human before, you know, it fails spectacularly. So it's really the underpinnings and the foundation of building something that runs reliably.

 

Heather Joslyn  4:18  

When you're deploying into a multicloud environment or hybrid environment of some kind, it's distributed, it's all over, it's everywhere. What kind of challenges does that bring up, especially with open source software?

 

Rob Skillington  4:30  

The challenges of modern-day software development, like deploying into multiple cloud environments, having your application run and be portable, that is a real challenge. And back in the day, you know, you would have a smaller set of larger servers that you could kind of treat like, you know, pets, and these days, unfortunately, you're not able to do that. You know, there's a lot of upsides to making it more compartmentalized, portable, able to run everywhere, but it's a lot more challenging. And you have to really prepare the software for, you know, a whole set of runtime environments. And so the challenges around that is really about making it consistent, well understood and robust. And that is something that, obviously, observability plays a big part in.

 

Heather Joslyn  5:27  

I want to ask you a little bit about Prometheus. I mean, that's an open source monitoring tool that a lot of places use. What are some of the challenges, especially as an organization scales, that you see with that?

 

Rob Skillington  5:39  

Yes, so Prometheus is obviously a fantastic tool that, you know, really is married with Kubernetes. It's the second most mature project, only behind Kubernetes, in the Cloud Native Computing Foundation. It's that way for a reason. You know, when I mentioned how observability is just absolutely crucial to kind of operate, you know, Kubernetes provides the platform on which to operate your application. Prometheus is the partner to do that, to actually be able to see it, monitor it and correct things when, obviously, inevitably, software goes wrong. The challenge with Prometheus is it's a primitive, and it's a very strong primitive, and it has matured, obviously, during its lifetime. But it really is, you know, a core building block, and you need a lot more than just a magnifying glass to do this very difficult task of deploying into Kubernetes and cloud native environments. Like, you need a whole toolset and toolbox. Yeah, it's not just a magnifying glass you need.

 

Heather Joslyn  6:53  

I wanted to ask you about PromLens, the project which you donated to Prometheus. Can you tell us about what it does? And also the journey, the discussions you had with Julius Volz at Prometheus about how that donation came about?

 

Rob Skillington  7:09  

Yeah, I mean, Julius, you know, as the co-founder of Prometheus, is really passionate about it. And likewise, as a contributor to OpenMetrics myself, which is part of the Prometheus ecosystem and is kind of a formalizing of the Prometheus text exposition, there's this real yearning for essentially a standard that can be built upon by everyone in the industry, when it comes to these core primitives, essentially. And Prometheus is one of those primitives. And we want to continue to solidify that as a primitive that stands the test of time. We can't build a self-driving car if we're always building a different car every few years, right? Yeah. And so PromLens is almost like an IDE, or development environment, for building queries that run against Prometheus. And so, you know, when you think about, like, some of our customers have, you know, tens of thousands of these alerts defined in PromQL, which is the query language for Prometheus, having a tool like an integrated development environment, where you can really understand these complex queries and iterate faster on, you know, setting these up and getting back to your day job, is incredibly important. And so PromLens is a tool and a query builder and a developer environment that sets us up for Prometheus to continue to just be a, like, you've-held-it-for-five-years tool, that we hope will continue to solidify the penetration of the standard and bring more people to a world where everyone has transferable skills and they can work together. And, you know, PromLens really is such a powerful tool in a complex ecosystem that exists, that should make it much more usable and accessible to everyone.

 

Heather Joslyn  9:10  

So it helps to sort through all these alerts?

 

Rob Skillington  9:14  

It really helps to, like, look at a query or an alert. And obviously, like, the core part of an alert is kind of defining what data is being accessed to monitor. It really helps you pull out that underlying manipulation of the metrics data to give you something meaningful to a human. Yeah, and, you know, a lot of the time at these companies, only a small subset of power users really understand how to craft these. Like, they're really only, like, five lines of text. These queries are not that long, yet they can get complicated very quickly. And so when you have thousands of dashboards, thousands of alerts, all that have these queries that are kind of built in this text format, PromQL, to access that metric data, you don't want just one or 5% of your company to understand how to author these. You want everyone to be able to author these. And so PromLens, really, it breaks it down in a visual format, to understand how the different parts of the query relate to each other, and then allows you to edit it through a user interface instead of having to, like, write the text query language yourself. Yeah, so basically, it's kind of like a What You See Is What You Get editor, or WYSIWYG editor, for Prometheus queries. WordPress, you know, is successful for a reason: the ecosystem around it is very strong. And PromLens is yet another addition that will just make the function of being a software engineer, and performing the practice of observability, much easier for many more people. And this is why it needs to exist in open source, because it's a great addition to our data, obviously, to put on top of an observability product, but it's also just a standard that you want to exist, so everyone can level up and be on the same playing field and speaking the same language, which is Prometheus, PromQL and the cloud native ecosystem.

 

Heather Joslyn  11:13  

What's the difference between legacy APM and observability?

 

Rob Skillington  11:17  

That's a great question. I mean, to some people, it's the same thing under a different roof, or umbrella. But, you know, to me, it's actually delivering the promise of APM in a new world. One of the sessions that I do with new employees at our company, Chronosphere, you know, we're an observability vendor, and a lot of the time we like to kind of paint the picture of the history of, like, why are we here? What came before us? And the way I like to describe how APM came to exist is that all the primitives we had before APM were very much, like, raw tools, that, you know, maybe as a five-year-tenure engineer, you may finally know how to work together with all of them. Yeah, APM brought a lot of those under one experience of practicing observability. So, you know, it had an opinionated view, and kind of gave you a crisp kind of product experience around kind of visualizing all this data, and understanding, at a contextual level that mattered to you as an application developer, how your product was performing, what the performance is like. And for instance, like, with MySQL, it will tell you, like, these are the slow queries. It knew enough about the thing you were monitoring to give you real insights that didn't require you to spend hours looking at the raw tools that you used before. And so, you know, cloud native observability, and what comes after that, is bringing that vision in a way that's actually more generic again. So we started out with these really generic tools before APM that were really hard to use. APM made them opinionated and gave you, like, specific views into that set of data, problems and whatnot. And then, you know, observability, and cloud native observability, and all the tools in this ecosystem, is about giving you that kind of, like, rich, cross-cutting experience of being able to view and debug things from one place, but not specific to an individual runtime environment. Like, we don't build tools just to monitor MySQL anymore. You know, we build one experience that can monitor everything. And then we plug that data into that one tool. Yeah. But it's, you know, you're not learning how to use Microsoft Word and Microsoft Access and all the different tools under the Office suite. You're just learning one tool that can provide all the authorship and sets of tools under one roof.

 

Heather Joslyn  13:50  

So to bring it back to the electric car analogy, you're just working on one electric car. You don't have to learn all these other electric cars.

 

Rob Skillington  13:59  

Exactly. It's, you know, under one control plane. You know, you don't have the knobs all spread out through the car.

 

Heather Joslyn  14:10  

Good point. I guess that's all for now. But thank you, thank you very much for joining us. I really appreciate it.

 

Rob Skillington  14:15  

Well, thank you for having me again. Yeah. It's been great to catch up and great to see people here under one roof again.

 

Heather Joslyn  14:23  

Yeah, it's been a good conference for that, seeing people. People seem to be in a good mood here. Thank you very much, Rob. And thank you to Chronosphere for sponsoring us today. And this has been another episode of The New Stack Makers. I'm Heather Joslyn. We'll see you next time.

 

Colleen Coll  14:41  

Chronosphere is the only observability platform that puts you back in control by taming rampant data growth and cloud native complexity, delivering increased business confidence. Engineering organizations trust Chronosphere to help them operate scalable, highly available and resilient applications.

 

Alex Williams  15:01  

Thanks for listening. If you liked the show, please rate and review us on Apple Podcasts, Spotify or wherever you get your podcasts. That's one of the best ways you can help us grow this community, and we really appreciate your feedback. You can find the full video version of this episode on YouTube. Search for The New Stack, and don't forget to subscribe so you never miss any new videos. Thanks for joining us and see you soon.

 

Transcribed by https://otter.ai