The New Stack Podcast

Case Study: How SeatGeek Adopted HashiCorp’s Nomad

Episode Summary

LOS ANGELES - Kubernetes, the open source container orchestrator, may have a big footprint in the cloud native world, but some organizations are doing just fine without it. Take, for example, SeatGeek, which runs a mobile application that serves as a primary and secondary market for event tickets. For cloud infrastructure, the 12-year-old company’s workloads, which include non-containerized applications, have largely run on Amazon Web Services. A few years ago, it turned to HashiCorp’s Nomad, a scheduler built for running apps whether they’re containerized or not. “In the beginning, we had a platform that an engineer would deploy something to, but it was very constrained. We could only give them a certain number of options that they could use. It was a very static experience,” said Jose Diaz-Gonzalez, a staff engineer at SeatGeek, in this episode of The New Stack Makers podcast.

Episode Notes

For cloud infrastructure, the 12-year-old company’s workloads, which include non-containerized applications, have largely run on Amazon Web Services. A few years ago, it turned to HashiCorp’s Nomad, a scheduler built for running apps whether they’re containerized or not.

“In the beginning, we had a platform that an engineer would deploy something to, but it was very constrained. We could only give them a certain number of options that they could use. It was a very static experience,” said Jose Diaz-Gonzalez, a staff engineer at SeatGeek, in this episode of The New Stack Makers podcast.

“If they want to scale an application, [that] required manual toil on the platform team side, and then they can do some work. And so for us, we wanted to expose more of the platform to engineers and allow them to have more control over what it is that they were shipping, how that runtime environment was executed, and how they scale their applications.”

This On the Road episode of Makers, recorded here during HashiConf, HashiCorp’s annual user conference, featured a case study of SeatGeek’s adoption of Nomad and the HashiCorp Cloud Platform. The conversation was hosted by Heather Joslyn, features editor of TNS.

This episode was sponsored by HashiCorp.

Nomad vs. Kubernetes: Trade-Offs

SeatGeek essentially runs the back office for ticket sales for its partners, including Broadway productions and NFL teams like the Dallas Cowboys, providing them with “something like a software as a service,” said Diaz-Gonzalez.

“All of those installations, they're single tenant, but they run roughly the same way for every single customer. And then on the consumer side we run a ton of different services and microservices and that sort of thing.”

Though the workloads run in different languages or on different frameworks, he said, they are essentially homogeneous in their deployment patterns; SeatGeek deploys to Windows and Linux containers on the enterprise side, and to Linux on the consumer side, and deploys to both U.S. and European Union regions.

It began using Nomad to give developers more control over their applications; previously, the deployment experience had been very constrained, Diaz-Gonzalez said, resulting in what he called “a very static experience.”

“To scale an application required manual toil on the platform team side, and then they can do some work,” he said. “And so for us, we wanted to expose more of the platform to engineers and allow them to have more control over what it is that they were shipping, how that runtime environment was executed and how they scale their applications.”

Now, he said, SeatGeek uses Nomad “to provide basically the entire orchestration layer for our deployments.”

Forgoing Kubernetes (K8s) does have its drawbacks. The cloud native ecosystem is largely built around products meant to run with K8s, rather than Nomad.

The ecosystem built around HashiCorp’s product is “a much smaller community,” Diaz-Gonzalez said. “If we need support, we lean heavily on HashiCorp Enterprise, and we lean on the support team to answer questions. But if we need support on making some particular change, or using some certain feature, we might be one of the few people starting to use that feature.”

“That said, it's much easier for us to manage and support Nomad and its integration with the rest of our platform, because it's so simple to run.”

To learn more about SeatGeek’s cloud journey and the challenges it faced — such as dealing with security and policy — check out the full episode.

Episode Transcription

Colleen Coll 0:10

Welcome to this special edition of The New Stack Makers: On the Road. We're here in beautiful Los Angeles at HashiConf Global, with discussions with technologists giving you their expertise and insights to help you with your everyday work. Infrastructure enables innovation. HashiCorp provides consistent workflows to provision, secure, connect, and run any infrastructure for any application.

Heather Joslyn 0:40

Hello, and welcome to another On the Road episode of The New Stack Makers podcast. We're coming to you from beautiful, sunny downtown Los Angeles, California, at HashiConf Global, which is HashiCorp's user conference. I'm Heather Joslyn, the features editor of The New Stack. Today we'll be sharing with you a case study about the use of HashiCorp's Nomad, which is a workload scheduler and orchestrator for containerized and non-containerized applications, and particularly the challenges and solutions involved in scaling Nomad. We'll talk about how SeatGeek, a mobile application that serves as a primary and secondary market for event tickets, uses Nomad, how its engineers interacted with deployments, and how it worked alongside SeatGeek's best organizational practices in terms of scaling and security. Our conversation is brought to you by HashiCorp, makers of Terraform, Vault, Consul, Nomad, and other solutions for consistent cloud operations. We're joined today by Jose Diaz-Gonzalez, staff engineer at SeatGeek. Welcome, Jose.

Jose Diaz-Gonzalez 1:44

Thank you for having me.

Heather Joslyn 1:45

Thank you for joining us. Can you tell us a little bit about your role at SeatGeek?

Jose Diaz-Gonzalez 1:49

Yeah, I currently work on our release engineering team. So that's basically managing the lifecycle of an application and how it comes from an engineer's laptop, into CI/CD, and then inevitably in front of our customers, and what that experience looks like, both in delivering it as well as, like, what the engineer interacts with and how they, you know, support that application.

Heather Joslyn 2:10

So you see the whole lifecycle of the application. Those of you in our audience are probably very familiar with SeatGeek: if you've bought tickets to a Broadway show, if you've bought tickets to a football game or soccer game, you probably have used SeatGeek at some point. My understanding is that the company started about 13 years ago?

Jose Diaz-Gonzalez 2:31

Something like that. It was 2010. OK, 2009, 2010.

Heather Joslyn 2:35

Okay. Can you sketch in for us a little bit SeatGeek's relationship to the cloud? And, you know, was it always on the cloud, for example?

Jose Diaz-Gonzalez 2:43

Yeah, you know, when we started, I would say the cloud offerings were very immature in comparison to today. You know, we used essentially everything that was on AWS, but that was like four things at the time. And so, you know, since then we've been slowly adopting new technologies, newer frameworks, patterns, that sort of thing as they come up.

Heather Joslyn 3:02

Okay. And then, just to give us a sense of the scale and the environment that your developers were deploying into, you mentioned it was kind of homogeneous, the environment.

Jose Diaz-Gonzalez 3:11

Yeah, that's correct. On the back office, for our partners, the Dallas Cowboys, that sort of thing, we provide them with something like a software as a service, right? And so all of those installations, they're single tenant, but they run roughly the same way for every single customer. And then on the consumer side, you know, we run a ton of different services and microservices and that sort of thing. And while those are on different languages or frameworks, they run in basically the same exact way. So it's largely the same patterns, whether you're on one side of the business or the other.

Heather Joslyn 3:42

And do you deploy to multiple clouds or just...

Jose Diaz-Gonzalez 3:45

We're mostly on AWS. We deploy to Windows and Linux containers: on the enterprise side, Windows, and on the consumer side, Linux. And then we deploy to both the US and EU regions as well.

Heather Joslyn 3:56

Okay, so I guess we'll start talking about Nomad. What is Nomad?

Jose Diaz-Gonzalez 4:01

For us, Nomad? Well, it's a workload scheduler, right? So we use that to provide services to our engineering teams and to other folks. Maybe they have, like, jobs that they need to run on a certain period of time, or they need to run them ad hoc, or things that are persistent. You know, we use that to provide basically the entire orchestration layer for our deployments, whether that's on the Windows side for enterprise customers, or more traditional Linux-based applications.

Heather Joslyn 4:27

Do you use Kubernetes at all, or no?

Jose Diaz-Gonzalez 4:28

No, we do not. That's kind of unfortunate for us, because in a lot of cases, you know, folks will say, as a vendor, "We support Kubernetes. Do you have Kubernetes?" and we'll say no. So, like, we need to figure out creative ways to make the relationship work there, for whatever it is.

Heather Joslyn 4:44

What have you felt are some of the trade-offs? You mentioned the vendor issue, but...

Jose Diaz-Gonzalez 4:49

It's a much smaller community. So, you know, if we need support, we lean heavily on HashiCorp Enterprise, and we lean on the support team to answer questions. But if we need support on making some particular change or using some certain feature, we might be one of the few people starting to use that feature. So the community is much smaller. But that said, it's much easier for us to manage and support Nomad and its integration with the rest of our platform, because it's so simple to run. So, good and bad.

Heather Joslyn 5:17

Yeah. And you can run containers and non-containerized applications.

Jose Diaz-Gonzalez 5:21

That's correct, right. And we've used it for running IIS workloads directly on Windows, we've used it for raw execution jobs that are directly on the host. Of course, we use container-based workloads.

Heather Joslyn 5:33

When did SeatGeek begin using Nomad? And what were the problems that you were looking for it to solve for you?

Jose Diaz-Gonzalez 5:39

Yeah, I think in the beginning, we had a platform that, you know, an engineer would deploy something to, but it was very constrained. We could only give them a certain number of options that they could use. It was a very static experience. If they wanted to scale an application, that required manual toil on the platform team side, and then they can do some work. And so for us, we wanted to expose more of the platform to engineers and allow them to have more control over what it is that they were shipping, how that runtime environment was executed, and, you know, how they scaled their applications.

Heather Joslyn 6:08

What have been some of the challenges that you encountered as you started to use Nomad, in terms of, for example, deployment? Were there things that you needed to do so that your teams could work more efficiently?

Jose Diaz-Gonzalez 6:22

Yeah, I think the big thing for us is sort of, like, applying policies, you know, whether that's a security or, like, an audit policy that says you can't do X, you can't perform X operation, or this operation is the only safe way to do something. That sort of stuff has been a little bit more difficult for us to do. There have been tools and features that Nomad has provided over the years. But really early on, that was a problem for us. It's more or less solved for us at this point.

Heather Joslyn 6:47

All right. Were there other challenges? I went to your talk earlier today, and there was talk about the teams creating some things to help.

Jose Diaz-Gonzalez 6:57

Yeah, no, definitely. One of the things for us was trying to find what the best dividing line is for the platform, right? Nomad is very powerful, and allows you to pretty much do whatever you need to do in order to get your job running and your workload, you know, in front of your customers. But that's a lot of knobs for folks to tweak and turn. And so what we wanted to do was make sure we were providing the right knobs for folks, so we weren't inundating them with options, and also providing, you know, that best-practice experience around deployments and making it simpler, that sort of thing.

Heather Joslyn 7:26

Yeah, there was mention of Deli.

Jose Diaz-Gonzalez 7:29

Yeah, we built a couple of different tools internally. So the UI is called Deli, short for delivery. You can go in and kind of pick your environment, whether that's a production environment or pre-production environment, pick your application, pick a particular version of the app that you want to deploy, and get it deployed in front of customers.

Heather Joslyn 7:46

So that's great. You also mentioned in the talk that you'd asked your teams for feedback. What was some of the feedback that you received? And how did it change how you worked with Nomad?

Jose Diaz-Gonzalez 7:56

Yeah, again, going back to providing a simple interface for things. As an engineer, you're probably spending a minimal amount of time in the deployment world. You're mostly working on the product that you're building and how to deliver it. And so every three months, you go into this config file, and you find it's eight lines, and you copy it from here to here. And, you know, maybe you forget a line or two. That was a big concern for our engineers: like, what do I need to configure, and what can I leave out? So that's really something that we focused on, removing that extra configuration they may not have needed, or that was kind of standard across the board.

Heather Joslyn 8:32

So making it a little more like, again, fewer options, more convention over configuration? Yeah, yeah. Were there other best practices that you had to adapt, or adapt Nomad to?

Jose Diaz-Gonzalez 8:45

Yeah, I don't know if this is necessarily Nomad, but, you know, there were many cases where we were deploying something from a vendor, for instance, or from a third party, where it wasn't really adapted to the cloud. Those are things that you couldn't potentially put in a container, or where that was much harder to do. If we had been in a pure container world, if we were on ECS or Kubernetes, for instance, we would have had a rough time. But with Nomad, we were allowed to sort of step outside of the box and do that directly on the host, when necessary.

Heather Joslyn 9:16

What about security issues? Were there any new security challenges from either using Nomad or integrating it with the rest of your...

Jose Diaz-Gonzalez 9:23

Definitely not anything introduced. But certainly there were things, especially on the policy side, where we don't want to allow folks to do something insecure, and so we want to grab that ahead of time. I think those were, and still remain, a big concern for our security team. You know, how do we safely expose the runtime environment, and how do we allow engineers to have as much access as possible without, you know, giving them the keys to the kingdom and potentially breaking things?

Heather Joslyn 9:50

Yeah, yeah. And also, with what your company does, you have people's data. You know, people are buying tickets, giving you their credit card numbers and so on.

Jose Diaz-Gonzalez 9:59

Yeah, definitely. Like, one thing we want to avoid is... you know, we have a fairly flat internal organization, fairly open access to a lot of things. But we want to make sure that if you buy a ticket to the Super Bowl, those tickets are yours, and we've guaranteed that no one in our organization has access to them. And so those are the kinds of challenges that we have internally: making sure that, at a security level, folks are not accessing things that they shouldn't be accessing. And if they do have access to that, we audit it, that sort of thing.

Heather Joslyn 10:25

I wanted to go back a little bit to building things internally. In your presentation, you mentioned that's a benefit. Are you able to offer feedback to Nomad, you know, to the company?

Jose Diaz-Gonzalez 10:38

Oh, yeah, no, definitely. We're constantly giving them feedback on their tooling, and how it integrates with our systems, and what we're trying to see in the future. So, you know, they have features such as autoscaling. So how do we debug that? And how do we ensure that the policy that we've applied for scaling a particular part of our stack is actually what has executed? And when did it execute? That sort of thing, we constantly are feeding back, and they're building it into their platform.

Heather Joslyn 11:04

Yeah. Are there any challenges around scaling? Or has it gone pretty smoothly? Like if you have, you know, a big tour that's announced or something, or World Series tickets or Super Bowl tickets?

Jose Diaz-Gonzalez 11:13

Because we have on-sales all the time, you know, there are certainly challenges every now and then when we see, for instance, a high-performing concert come in. Usually, we scale up preemptively. We know that that's coming. For the most part, we're not, like, sort of on-demand scaling. We know that those events are coming. We know that there's an on-sale at 11 a.m. tomorrow for whoever it is. So we need to be prepared ahead of time versus acting reactively, in a sense.

Heather Joslyn 11:40

What do you sort of wish you'd known at the beginning of this journey with Nomad that you know now, or that you might advise people who are looking into this as a solution for orchestration? The sort of advice you might give them?

Jose Diaz-Gonzalez 11:52

Yeah, I think I wish I'd had a better grasp of the types of workloads that we were running. Now that we've built this over the past five, six years, we know that most of our workloads are relatively the same. They're all homogeneous; everyone runs roughly the same thing. And so we didn't necessarily need all of the features that the platform, that Nomad, provides us. And so if we had constrained it better, we would have had a much smoother experience, I think.

Heather Joslyn 12:16

Okay. Well, thank you very much for joining us today, Jose. We really enjoyed hearing about SeatGeek's journey with Nomad and how you made it really work for your company. This has been an On the Road edition of The New Stack Makers. I'm Heather Joslyn, and we'll see you next time.

Colleen Coll 12:36

Infrastructure enables innovation. HashiCorp provides consistent workflows to provision, secure, connect, and run any infrastructure for any application.

Alex Williams 12:48

Thanks for listening. If you liked the show, please rate and review us on Apple Podcasts, Spotify, or wherever you get your podcasts. That's one of the best ways you can help us grow this community, and we really appreciate your feedback. You can find the full video version of this episode on YouTube. Search for The New Stack, and don't forget to subscribe so you never miss any new videos. Thanks for joining us, and see you soon.

Transcribed by https://otter.ai