#206 - The Fundamentals and Future of DevOps and Software Delivery - Yevgeniy Brikman
“If you are a developer, but you can’t ship your code, it doesn’t matter. It won’t provide any value. So you have to fix how you ship code.”
Want to learn the key principles and future of DevOps that can help you ship code faster and more reliably?
In this episode, I sit down with Yevgeniy Brikman, co-founder of Gruntwork and author of “Terraform: Up & Running,” to discuss his upcoming book, “The Fundamentals of DevOps and Software Delivery.”
We explore:
- Common pitfalls and anti-patterns in DevOps implementations
- The concept of “minimum effective dose” and “incrementalism” in adopting technologies
- Why application developers should understand infrastructure and software delivery
- The future of DevOps, including “infrastructureless” and the impact of GenAI
- The importance of “secure-by-default” practices in modern software development
- Recent changes in open source licensing and their impact on the tech industry
- The power of continuous learning and sharing knowledge in tech careers
Listen out for:
- (00:02:15) Career Turning Points
- (00:08:32) Deliberate Time for Learning
- (00:16:27) Transitioning from App Dev to Infra
- (00:24:19) Understanding How to Deliver Software
- (00:32:05) Minimum Effective Dose
- (00:40:34) DevOps Antipatterns
- (00:44:02) Incrementalism
- (00:49:37) The Future of DevOps and Software Delivery
- (01:10:39) Recent Trend in Open Source License Changes
- (01:20:32) 3 Tech Lead Wisdom
_____
Yevgeniy Brikman’s Bio
Yevgeniy (Jim) Brikman loves programming, writing, speaking, traveling, and lifting heavy things. He does not love talking about himself in the 3rd person. He is the co-founder of Gruntwork, a company that offers products & services for setting up world-class DevOps Foundations. He’s also the author of three books published by O’Reilly Media: Fundamentals of DevOps and Software Delivery, Terraform: Up & Running, and Hello, Startup. Previously, he spent more than a decade building infrastructure and products that served hundreds of millions of users while working as a software engineer at LinkedIn, TripAdvisor, Cisco Systems, and Thomson Financial.
Follow Yevgeniy:
- LinkedIn – linkedin.com/in/jbrikman
- X / Twitter – @brikis98
- Website – ybrikman.com
- 📚 The Fundamentals of DevOps and Software Delivery – https://www.oreilly.com/library/view/fundamentals-of-devops/9781098174583/
Mentions & Links:
- 📚 Terraform Up and Running – https://www.oreilly.com/library/view/terraform-up-and/9781098116736/
- Steve Yegge - You Should Write Blogs – https://sites.google.com/site/steveyegge2/you-should-write-blogs
- Project InVersion – https://www.linkedin.com/pulse/case-study-linkedins-2011-operation-inversion-through-gene-kim-dht2c
- Terragrunt – https://terragrunt.gruntwork.io/
- Terraform – https://en.wikipedia.org/wiki/Terraform_(software)
- Elisha Otis – https://en.wikipedia.org/wiki/Elisha_Otis
- Ruby on Rails – https://en.wikipedia.org/wiki/Ruby_on_Rails
- Grails (Groovy on Rails) – https://en.wikipedia.org/wiki/Grails_(framework)
- Gruntwork – https://www.gruntwork.io/
- OpenTofu – https://opentofu.org/
- HashiCorp – https://en.wikipedia.org/wiki/HashiCorp
- MongoDB – https://en.wikipedia.org/wiki/MongoDB
- OpenBao – https://openbao.org/
- Vault – https://www.vaultproject.io/
- OpenSearch – https://en.wikipedia.org/wiki/OpenSearch_(software)
- Elasticsearch – https://en.wikipedia.org/wiki/Elasticsearch
Check out FREE coding software options and special offers on jetbrains.com/store/#discounts.
Make it happen. With code.
Get a 45% discount for Tech Lead Journal listeners by using the code techlead24 for all products in all formats.
Tech Lead Journal now offers some swag that you can purchase online. Each item is printed on demand based on your preference and delivered safely to you anywhere in the world where shipping is available.
Check out all the cool swag by visiting techleadjournal.dev/shop. And don't forget to show it off once it arrives.
Career Turning Points
-
The biggest thing to share, which I share with everyone I talk to and in most of my books, is this idea of taking time to learn every day or every week.
-
What I generally see is we go to school, some of us go to college, and we spend this whole time doing very deliberate learning. And then you get a job and you stop. People stagnate. I remember hearing that the average person reads about four books a year, and that people in the tech industry read less than one book a year about the industry itself, which makes me really sad. Books aren’t the only way to learn, but I think people don’t invest enough in continuing to learn new things.
-
That has been the single biggest thing that has changed the direction of my career and let it grow: forcing myself to spend time every week on deliberate learning. It’s always fun, always something interesting. I’m not forcing myself to learn something I don’t care about. I pursue things that are intriguing.
-
This can be by reading books, by playing with code. In fact, it’s better if you do a mix of both. Learning without doing doesn’t work, and that’s the basis of most of my books - you learn and you do tons of hands-on examples.
-
I encourage folks to carve out, find the time every week to learn something. Take the time to learn. Don’t stop learning as soon as you graduate from school or college.
Deliberate Time for Learning
-
I used to have a tradition. After everybody else around me went to bed, I would carve out 20 minutes to read something or watch an interesting video and then play with the technology.
-
I find that you learn well if you have theoretical, conceptual learning, and then you get to practice it on something concrete. I’d find a project I wanted to do. And no one has to see this thing. I have dozens of code snippets that will never see the light of day, and that’s fine. Some of them made it through, and I’ll open source them, but that doesn’t have to be the goal. They’re something I wanted to try and play with.
-
The whole point of that project was I wanted to do something, build something. But I didn’t do it using stuff I already knew. I explicitly built it with things I hadn’t used before as the way to learn them. It’s not a lot of time.
-
This is the thing that blows my mind - the difference between zero deliberate learning, which is where the average person is, versus somebody who carves out an hour a week. Find 20 minutes three times in a week. The difference is tremendous because that stuff adds up. It doesn’t sound like much, but there are 50 weeks in a year. The years go by and you’ve spent hundreds or thousands of hours doing something that no one else around you is doing. And you look like an amazing expert. So that’s one piece - to carve out that time.
-
Another thing that makes this effective is to share what you learned with others. That can be in the form of writing blog posts, giving talks, doing podcasts, writing books, whatever you’re able to do. Share it.
-
One of the biggest fears, when I tell people to start blogging and sharing what they know, is: “Who’s going to read my stuff? No one’s going to want to read what I write, right? I’m not good at it, no one cares, yadda yadda yadda.”
-
First of all, that doesn’t matter. You’re sharing this stuff primarily for yourself. The act of writing down and sharing something you learned will make you learn it better. It makes it stick better. It makes you do extra research. You’ll connect it to concepts you hadn’t thought about. The primary audience is yourself. Even if no one reads it, that doesn’t matter.
-
The reality is almost anybody can write something that will find at least a small audience. Steve Yegge’s blog post talks about this idea where we picture in our heads that other people know more than us. They have this giant mass of knowledge, and I only know this tiny bit of the world.
-
That’s not what reality looks like. Reality looks like everybody has these random circles of knowledge like a Venn diagram, and then these huge vast areas that we know nothing about. Everyone’s collection of these circles of what they know is different, and everyone has a different path for how they navigate through it.
-
So if you start sharing what you know, you are going to find someone else out there who’s on a similar path to you, or on a path that benefits from what you know, but might not benefit from others’ writing because it’s a different path. You will find an audience. And by writing, you will learn this stuff better.
-
And you will get these serendipitous, unpredictable impacts on your life and career. You never know where this stuff will go, but it will go somewhere. Almost everyone I talk to who regularly shares their work has this unexpected, profound impact.
-
In terms of how to learn, find time. And then if you can, share what you learned with others.
Transitioning from App Dev to Infra
-
Part of the reason I ended up on the infrastructure side was that LinkedIn had many issues. The term DevOps didn’t exist yet, or had only just appeared on the scene, so we didn’t call it that. We had issues delivering our software. We had app developers who wrote code and built things, but getting it out to users, running it, maintaining it, scaling it, and securing it was tough. We struggled with that enormously.
-
What got me in there was being thrown into the middle of this, and we had to fix it. It doesn’t matter if you’re an app developer - if we can’t ship your code, it won’t provide value. So we had to fix how we shipped code.
-
I personally struggled tremendously to learn it. It’s different now, but back then, it was hard. There was the app development side of the world which takes time to learn, but there were many resources for understanding it.
-
Then there was the other side which felt like magic. There were people who knew these magic incantations in Linux or DNS or whatever else, and they would wave their wand and suddenly your app was running. Then it would crash and they’re on vacation. You have no idea what to do. It was very frustrating.
-
So I learned it the hard way. And many people learn it the hard way. Something breaks. You go through tremendous pain. You fix it. You probably fix it wrong. It breaks again. You’re awake at four in the morning, you fix it. Eventually you figure out better ways to do it. But it was a slow and painful process.
-
One reason I built Gruntwork is because this stuff was driving me nuts. I did not like this aspect of software development where I’d have a great idea, build it, design it, we’d be excited, and then you think, how do we get it out there in a way that’s not a security nightmare and is maintainable? You can deploy it, but how do you update it without breaking everything? There must be an easier way.
-
This was also the motivation for writing this new book. I couldn’t find a comprehensive resource for learning this stuff. This book brings together a hands-on guide to the most common things you have to deal with to deliver software.
-
There are resources on the DevOps side that teach the cultural aspects of DevOps. Those are valuable. But I haven’t found a single guide that says: you built an app, here’s your Java app, your Ruby on Rails app. What’s next? How do you put it on www.something.com so everyone can use it and it scales and it’s secure? That was a big motivation for the book.
-
How to learn this stuff? Get my book, obviously. There are guides to the cultural aspects of DevOps that are solid. And there are online courses that do a decent job. Other than that, practice it. Find a side project. Find something at work. Find a way where you need to deploy an application in front of other humans and start practicing. That’ll make these concepts come together. It’s not easy to learn, but it’s getting easier. And with this book, it can be easier still.
-
In the preface of the book, I write that I’m hopeful we’ll have a generation of application developers and infrastructure developers who can learn this stuff not the hard way. Here’s the truth: When you have bugs in application code, there are severe bugs and nightmare stories, but usually they’re minor things, they’re annoying, and you fix them. When you have bugs on the software delivery side, you take everything down. You accidentally delete your production database. You have a security breach - these are serious problems.
-
Learning this the hard way is not only painful and unpleasant for you. It’s harmful for the software industry. We’re building software that isn’t secure, reliable, or stable, because people don’t know how to do it. I’m hopeful the next generation will benefit from this stuff and learn it better than I had to.
Understanding How to Deliver Software
-
It was around 2011, and LinkedIn was looking great. We had just had our initial public offering. The stock price was going up. The product was doing great. It was crazy hyper growth, and the company was growing rapidly internally as well. Things seemed good. Revenue was up.
-
But we got to the point with our software delivery practices where we could not deploy. Back then, we deployed once every two weeks. This was the release train model. If the train leaves the station, either you’re on it or you wait for the next one, two weeks from now.
-
We did all the usual work, cut the release branch, and went to deploy it. We rolled out code, and things died, broke, and crashed. We spent time bringing everything back up, fixing bugs, rolled out new code, and that broke more stuff. We rolled out code to fix those issues, and that broke more stuff. It went on for a day, then into a second day, and we could not stabilize the new code. We had to roll everything back and patch things up to get back to where we were before the deployment started. We could not deploy. We were stuck. The company was out of options.
-
A new VP of Engineering came in. He saw no other option, and we had to put all product development on hold completely. This was Project InVersion. We said, no more product development. Every person in the company - app engineer, designer, marketing - had to work on internal stuff and tooling. There was no DevOps term then, but that’s what we were trying to figure out - how to reliably and effectively ship code.
-
The first reason to learn software delivery is to avoid that state. That’s not a good place to be. That put the entire company at risk. We got through it - we went from deploying once every two weeks, to deploying hundreds of times per day with fewer outages and issues. It became a success story. A lot of good technologies came from that work. Apache Kafka was developed at LinkedIn during this period. It worked out, but it could have gone poorly.
-
It doesn’t matter if you’re an app engineer or whatever your career is, understanding these basics of software delivery is important. You don’t want to go there. You don’t want to get to those dark, long nights.
-
The second reason to learn it is independence. There are companies where you’re the app engineer, you build something, and then you toss it over the wall. Someone else deploys it for you. That’s becoming rare these days. But even if you’re at a company where that’s the case, there’s a good chance you won’t be tomorrow.
-
One reason is you might want to start your own company, a side project, or something for fun. Understanding how to take your app and make it work in production is incredibly liberating. I could take any idea and ship it. That gave me the confidence to start a company, and to do a lot of other projects before that, just from understanding these basic things.
-
These days, getting a startup noticed is hard. Marketing is still really hard. But starting a startup is easier than ever before because of the cloud, the technologies out there, the knowledge that’s available, and the reach through the internet and mobile apps. It is easier than ever to start your own thing, but if you don’t know how to deploy things, you won’t go very far. That’s another good reason to learn this stuff.
-
The third reason is that even if someone else handles the DevOps and software delivery for you, you’re not going to be a particularly good app developer if you don’t understand how your code gets deployed and maintained in production. The code you write is probably terrible. I can say this because this was me earlier in my career - my code was terrible. I see that now. You only realize this once you understand these concepts.
-
You have to understand the architecture of your site. That’s where app development and infrastructure development overlap. Is there one copy of your app running in the world? Or are there 200 copies on different servers? Are you a single monolith? Or are you 127 microservices that communicate over the network? These have profound impacts on how you build your application.
-
Security is another huge one. The amount of insecure code out there that exists just because people don’t understand what’s really happening on these servers is terrifying for our industry.
-
You don’t have to be an expert in this stuff. You don’t have to go as deep as the people handling it full time. But if you’re completely unaware of how code is deployed and scaled, what high availability means, things like the CAP theorem, and so on, you’re going to write terrible application code, and your career probably won’t go very far. Your bosses won’t be happy, your customers won’t be happy. So that’s a very compelling reason for everybody to learn at least a little bit about the other side of the house.
Minimum Effective Dose
-
One thing our industry is guilty of is following trends. We’re like the fashion industry. Someone puts a blog post out there saying, “Look at this microservice architecture that Netflix moved to.” And suddenly thousands of developers think “we must do microservices like Netflix.” Then someone writes about service meshes, and everyone rushes to implement them because they’re the hot new thing.
-
We follow these trends because they sound cool and sexy - it’s resume-driven development. “Oh, I have to put Kubernetes on my resume to get hired.” So we’re all going to use Kubernetes.
-
What people miss is context. When Netflix moved to microservices, there was a specific context where that made sense. They have thousands of developers working on millions of lines of code, serving hundreds of millions of customers, handling petabytes of data through their systems. And within that context, the technical decisions they’re making are a good fit.
-
If you’re a three-person startup trying to get off the ground, and you have zero customers, you don’t have that context. And if you try to apply their solutions to this wrong context, it will be actively harmful. It’ll be a really poor fit.
-
And I’ve seen this again and again. The classic case is the three-person startup: three developers managing 37 microservices, a service mesh, Kubernetes, and all these fancy things. It’s just not the right fit.
-
All of this complexity has a cost. These things all have overhead. They all have drawbacks. Every technology does.
-
That’s almost like the hallmark of a senior engineer. A junior engineer will tell you what’s cool. The senior engineer will tell you all the drawbacks to every single thing, because all of them have drawbacks. Every single thing you use has advantages and drawbacks. And those drawbacks are worth it in a certain context.
-
Like microservices, there’s a huge number of drawbacks, but those are worth it in a certain context like Netflix or Google or LinkedIn. They don’t make sense in all contexts.
-
You typically want the minimum effective dose. This is a term from medicine. The reality is most medicines at the wrong dose will kill you. So you usually aim for the minimum effective dose. What’s the smallest amount of this medicine I can take that gives me the benefits I’m looking for without all the drawbacks, without killing me?
-
And you want the same thing with almost any technology choice, and that includes all of these DevOps and software delivery processes. What’s the smallest amount I can do that gives me the benefits I want while minimizing the drawbacks?
-
Sometimes you need to look and ask, what’s the concrete problem we are actually facing? For example, if the problem is that you have an outage every time you do a deployment, maybe moving to a microservices architecture will help that, but maybe all you need is to automate your deployment. Maybe you need a little bit of infrastructure as code or some scripting. Maybe you need a better CI/CD pipeline. Maybe you just need better automated testing in your existing code. And the key thing is you want to invest as little as you can in this stuff, and still get the benefits and minimize the drawbacks.
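To make “a little bit of scripting” concrete, here is a minimal sketch of what deployment automation at the low end of the dose curve might look like. This is not from the episode; the image name, host, and exact commands are hypothetical placeholders:

```python
# Minimal deployment automation sketch: run the tests, build and push an
# image, then restart the service on one server. All names, hosts, and the
# exact commands are hypothetical; adapt to your own setup.
import subprocess
import sys

IMAGE = "registry.example.com/myapp:latest"   # hypothetical registry/image
HOST = "deploy@app-server.example.com"        # hypothetical server

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)  # abort the deploy on the first failure

def main() -> None:
    run(["python", "-m", "pytest"])            # never ship untested code
    run(["docker", "build", "-t", IMAGE, "."])
    run(["docker", "push", IMAGE])
    run(["ssh", HOST, f"docker pull {IMAGE} && docker compose up -d"])

if __name__ == "__main__":
    try:
        main()
    except subprocess.CalledProcessError as err:
        sys.exit(f"deploy aborted: {err}")
```

Even something this small removes the error-prone manual steps that cause deployment outages; heavier machinery can come later if a concrete problem demands it.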
-
Here’s something I’ve learned as a startup founder who’s seen the business world: your customers don’t care what technologies you use. They don’t care if you’re using Kubernetes or service meshes. Your investors probably don’t care either. Nobody cares - they just want something that works. Every minute you spend on under-the-hood stuff that isn’t necessary is wasteful.
-
Most companies live or die based on their product and ability to reach customers. If you reach Netflix’s scale, yes, you’ll need complex architecture to make the product work. But the product is the goal, not the complex infrastructure. The infrastructure complexity is a cost you pay to support the product. If you don’t need to pay that cost, don’t!
-
Architectures and software delivery processes tend to evolve within companies. There are some very common patterns. Almost everybody starts with a single monolithic app connected to a little database and that’s it. And that’s fine. That’s actually the minimum effective dose to start out. There are tremendous businesses that use that architecture very successfully. And there’s not one compelling reason to make that any more complicated if they don’t need to.
-
But if you now have millions of users, you start to make it a little more complicated, and you add some caches, and you add some other things. And eventually, usually not from scale in terms of customers, but from scale in the number of teams internally, you might break the monolith into some services, and then you could do service ownership. But you want to align your company to the right stage in this evolutionary process.
-
And it does feel like evolution. It feels like growth. We don’t design and build our technology. It grows. And if you try to design and build it and jump 10 steps ahead, it usually doesn’t work, and then you have to go back to the simple thing and let it grow.
-
Think of it as the minimum effective dose. Think of it as incrementalism, as the way you want to approach these things. Don’t go for the technology because you read a blog post. That’s not the best way to design your architecture.
DevOps Antipatterns
-
One antipattern comes up quite often, especially with infrastructure as code, but really everywhere: somebody gets excited about a technology and adopts it in the company, but doesn’t do the legwork to get everyone bought in or give everyone time to learn and master it. In the infrastructure as code space, this happens a lot. Somebody gets excited about Terraform, OpenTofu, Pulumi, whatever kind of tech.
-
The most common pattern is there’s just one person at the company who loves it. And they bring it in and write tons of code. And it can be amazing code, and it’s lovely and beautiful. But the rest of the team has no idea what that person’s doing. They haven’t been given the time to learn. It’s not something they’ve used before.
-
And inevitably what happens is there’s some sort of problem, an outage, something crashed, they need to fix it. And they don’t know how to use the code to fix it. So what do they do? They fix it manually.
-
The thing with infrastructure as code and these tools is that if you do something manually, your code no longer reflects reality. The next time you try to use the code, it runs into issues because of this mismatch. So the code doesn’t work. Then even if somebody tries to do it the right way and use the code, they can’t. They hit a bug, hit a problem, so they do more stuff by hand. ClickOps, essentially. That makes the code even more out of sync, and after you rinse and repeat a couple of times, the code doesn’t work at all.
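As a toy illustration of the drift described above (not how any particular tool works internally), here is a sketch: the code declares a desired state, someone changes the real system by hand, and the next automated run finds a mismatch it cannot safely reconcile. All resource names and attributes are made up:

```python
# Illustrative sketch of infrastructure-as-code drift. The "declared" state
# lives in version-controlled code; the "actual" state is what is really
# running after someone made a manual fix during an outage.

declared = {"web-server": {"instance_type": "t3.small", "count": 2}}

# Manual "ClickOps" fix at 4am: someone bumped the instance size by hand,
# so reality no longer matches the code.
actual = {"web-server": {"instance_type": "t3.large", "count": 2}}

def diff(declared: dict, actual: dict) -> list[str]:
    changes = []
    for name, want in declared.items():
        have = actual.get(name, {})
        for key, value in want.items():
            if have.get(key) != value:
                changes.append(f"{name}.{key}: code says {value!r}, reality is {have.get(key)!r}")
    return changes

# The next person to run the automation faces a confusing plan: applying the
# code as-is would silently undo the emergency fix.
for change in diff(declared, actual):
    print("DRIFT:", change)
```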
-
And all this work that one person did is thrown away, and they have to start from scratch or do something different. That’s wasteful, and it’s a shame. And people blame the infrastructure as code tool.
-
The reality is, it’s not about the tool - it’s that the team never had time to adopt it and buy into it. It’s a genuine change in process. You’re taking people who are used to SSH-ing to a server and running commands as the way to do things. You’re asking them to check out repos, open them in an editor, find some piece of code, run tests, commit code, and follow automated processes. It’s a completely different way of working.
-
It has advantages, but if the team isn’t bought in and doesn’t have time to learn it, you won’t gain those advantages. I’ve seen this again and again - teams not getting full buy-in and adoption. Sometimes it’s because only one person is excited. Sometimes because someone at the top mandates it.
-
You’re better off not using the “better technology” if you don’t have time to do it properly. If you’re going to do it, do it the right way. Otherwise, use your time for something more valuable. Again, minimum effective dose.
Incrementalism
-
A second, closely related issue, which ties into the minimum effective dose and the idea of incrementalism, is the attempt to do a big bang migration. This usually comes from somebody above: we have to do all the things at once. And if you have any complexity - these are often companies that have been around for a decade or two with millions of lines of code and lots of customers - you can’t do that stuff quickly, and you don’t want to try to do it as one giant ball of mud.
-
Trying to do everything at once, trying to do a massive rewrite, usually ends poorly. If I’m being honest, I would say 100 percent of the time. Across all of these large companies, I have not seen a single one of these big projects completed on time, on budget, or, usually, at all.
-
What I typically recommend is to do things incrementally. A lot of people hear incrementally, and they just think, okay, chop it up into small pieces. And that’s not really what it means.
-
To understand what incrementally means, the most useful way is to look at the opposite, which is false incrementalism. False incrementalism is where you take a project and slice it up into small pieces. But until the very last item is delivered, the project provides no value whatsoever. That’s what a lot of people think of with incrementalism, but until they’re all done, you can’t ship it live, can’t put any production traffic in it. That’s not incrementalism. That is a big bang migration that you happen to do in little pieces, but you get zero value until the very last item is delivered.
-
That’s a really risky way to approach a project. Because in the tech industry, large projects almost always fail. They don’t get completed. There’s a million reasons for that. The most common one is people just lose patience. If you haven’t gotten any value from your work, well congratulations, you just threw away 18 months and got nothing in return. That’s the worst possible outcome.
-
The better approach is to use actual incrementalism. The key is you don’t just chop it up into pieces. Each piece must be something you can deliver and get value from by itself, even if the other pieces never happen. That is real incrementalism. So you’re able to do one little thing, and if the rest of the project is canceled, well, that little thing was still worth doing. It still made your company better in some way. That’s the way to approach these things that actually works in the real world.
-
In practice, you look for concrete, specific pain points that your company is facing. You don’t look to do DevOps or cloud migrations - these are solutions. You look for problems. The problems might be outages, security problems, or that your team is really slow. Maybe you’re not agile or not delivering software quickly enough. You fix one problem at a time - whatever is most painful - and fix it all the way. Then you go to the next highest priority problem. And you pick the minimum effective dose to solve each problem.
-
You have to find out why the team is slow - that’s the most important question. The answer isn’t just “do DevOps.” There’s a specific cause and you want to fix that cause. Find specific problems, fix those specific problems, then repeat the process.
-
You get there in a way where every step delivers some value, which makes management happy, makes you happy, and makes your company more successful. Even if eventually somebody runs out of patience and you can’t do any more.
The Future of DevOps and Software Delivery
-
The first pattern we’ll see across all trends is the move toward higher and higher levels of abstraction. Looking at programming language history, we’ve seen a continuous move to higher and higher level programming languages.
-
At each level, two things happen. First, you give up some low-level control and power. But most use cases don’t need that low-level control. Second, these higher-level languages make development easier, faster, and more productive when you don’t need that low-level control. This makes programming more accessible and lets us build things faster.
-
That’s why the industry is shifting toward higher-level languages. Not completely - code is still being written in C, machine code, and assembly, and will be for a long time. But most developers are moving to higher-level languages.
-
The same evolution is happening in DevOps and software delivery. We moved from buying physical servers, shoving them into a rack, hooking up cables, and deploying software onto them, to the cloud with virtual machines, then containers, and now serverless, where it’s not even a container you hand over, just a deployment package.
-
We’ll keep climbing the abstraction ladder. I pitched the idea of “infrastructureless” as the next evolution from serverless. Serverless today is pretty close. If I were betting, I think the future will look much more like serverless than Kubernetes. I don’t think Kubernetes is the industry’s long-term direction.
-
When I say serverless, what we have today is you hand it a deployment package and say, please run it when a certain trigger happens, like an HTTP request coming in. So that’s pretty high level. Here’s code. You go figure out how to run it. I don’t want to be bothered with the details. But in practice, I’m still bothered with an awful lot of details with serverless. You still have to think about the networking aspect. You have to think about provisioning concurrency. If you need to talk to a database, you have to think about how to do long-running connections with serverless. So there’s still a bunch of infrastructure work left.
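For readers who haven’t used it, here is a minimal sketch of the “hand over a deployment package and run it on a trigger” model, assuming AWS Lambda’s Python handler convention behind an HTTP trigger (the event shape and response format follow that convention; the greeting logic is just an illustration):

```python
# Minimal serverless handler sketch: you ship this function as a deployment
# package and the platform runs it whenever a trigger fires (here, an HTTP
# request routed through an API gateway). You never provision or patch a server.

import json

def handler(event, context):
    # 'event' carries the trigger payload; for an HTTP trigger it typically
    # includes the path, method, headers, query string, and body.
    name = (event.get("queryStringParameters") or {}).get("name", "world")

    # Return a response in the shape the HTTP trigger expects.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Even so, as noted above, concerns like provisioned concurrency, VPC networking, and long-lived database connections still leak through, which is the gap an “infrastructureless” abstraction would aim to close.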
-
The next step is to go from serverless to this concept of infrastructureless. As a reminder, serverless doesn’t mean there aren’t servers. There are obviously still servers. And infrastructureless doesn’t mean there isn’t infrastructure. It’s still there. It’s just not something you have to think about as a developer on a regular basis.
-
You give up some control. I can’t control what’s happening under the hood. And in certain use cases, that’ll be a problem and I won’t be able to use it. And that’s okay. But for the majority of use cases, I don’t need that control. I just want to hand you some code and say, you go figure out how to run it, how to scale it, how to secure it, how to handle these details.
-
A second trend is the impact of AI on the infrastructure and DevOps world. When people talk about AI, they’re usually referring to large language models, LLMs, things like ChatGPT, and technologies of that sort.
-
There’s some early enthusiasm from folks saying, okay, AI is going to replace us. It’s going to write all the code for us. It’s so productive. That strikes me as less likely to be our future, especially in the DevOps and infrastructure world.
-
The reason I say that is the things that matter more than almost anything else on the infrastructure side are things like reliability, reproducibility, security, predictability - that’s what matters in the DevOps world. And the problem with all the LLM things that I’ve seen to this date is those are the exact areas where they’re really weak.
-
LLMs are notorious for hallucinating. This isn’t just a small bug in one LLM implementation - it’s fundamental to their design. They’re inherently random with built-in random seeds, and tiny prompt changes can dramatically affect results. It’s hard to get consistent, reliable output. The last thing I want is my company’s security depending on an AI that might hallucinate or give different answers each time. I’m not bullish on AI replacing this. With today’s technology, that doesn’t seem plausible.
-
AI becomes interesting in DevOps with retrieval augmented generation (RAG). You take a large language model and add some additional context - usually a database with extra up-to-date information about your specific use case. It combines the trained model with your up-to-date info to provide better responses about your context.
-
This can be interesting in two ways. The basic one, which we see today with various tools, is feeding it information about your current infrastructure - how you deploy, your metrics, your structured events. This additional context lets RAG tools answer questions about your infrastructure. So during an outage, you can ask: What changed? What did we just deploy? What happened? Which metrics changed?
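Here is a rough, self-contained sketch of that RAG flow: index some recent operational records, retrieve the ones most relevant to a question, and stuff them into the prompt. The retrieval is naive keyword matching and call_llm is a stand-in rather than a real model API; a production setup would use embeddings, a vector store, and an actual model client:

```python
# Minimal retrieval-augmented generation (RAG) sketch: augment an LLM prompt
# with up-to-date context about your own infrastructure before asking it a
# question such as "what changed before this outage?".

# Stand-in corpus: recent deploys, config changes, and alerts you have indexed.
documents = [
    "2024-05-01 14:02 deploy: payments-service v342 rolled out to prod",
    "2024-05-01 14:10 alert: p99 latency on checkout API above 2s",
    "2024-05-01 13:55 config: increased DB connection pool from 20 to 50",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Naive keyword-overlap retrieval; real systems use embeddings and a
    # vector store instead.
    words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call (hosted API or local model).
    return f"[model answer based on prompt of {len(prompt)} chars]"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, documents))
    prompt = (
        "Use only the context below to answer the question.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(answer("what changed right before the checkout latency alert?"))
```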
-
We’re starting to see this emerge. It’s pretty interesting, and I think eventually it’ll be reliable enough that you can use it to accelerate preventing or debugging outages, or even just navigating and understanding how your infrastructure works.
-
What excites me more, though it might not be feasible yet, is feeding in not just my context, but also context from thousands of other companies. The AI models could then recognize common patterns across companies. When a security vulnerability appears, instead of manually updating code, the model could say “957 other companies just updated their OpenSSL library. Here’s a patch for yours - I’ve rolled it out into your test environment. Does this look good? Should we push to prod?”
-
It’s the ability to see patterns that everyone faces and extrapolate them to my cases. That would be incredible. The reality is that thousands of DevOps engineers at thousands of companies are doing the same things over and over and over again. We don’t get to leverage much of that shared work, but maybe this will help.
-
The challenge is how do you get one of these large language models to expose information about other companies without leaking that company’s proprietary data, secret sauce? Can those models tell the difference and not hallucinate and not get it wrong between what should be kept private and what’s okay to share? I think that’s going to be a challenge, but if we can pull it off, I think that could be a really profound acceleration to how we build software.
-
The third idea is secure by default. The analogy I use has to do with elevators. What’s clever about the elevator design is that it is secure by default. The default state of the elevator is safe. It cannot move or fall. The only way it moves is when something else proves it’s in a safe state: in this case, tension on the cable. That made people confident enough to ride in elevators, transformed cities, and allowed us to build tall buildings.
-
Right now, the state of software delivery and DevOps is not secure by default. We’re very much in that early 20th century state where we want to build fancy tall buildings and infrastructure, but we can’t because we’re afraid of plunging to our deaths. It feels like all the defaults are not secure.
-
When you build something, the networking is usually wide open. Nothing’s encrypted. No one verifies third-party dependencies or keeps them updated. There’s no monitoring. Worse, vendors often charge extra for security features - single sign-on is typically only in expensive enterprise plans. So we’re the exact opposite of safe by default. We’re like horrifically dangerous and deadly by default, and if you pay us enough, maybe, we’ll make you more secure. So that’s bad.
-
The good news is the industry is beginning to move toward secure-by-default.
-
One concept that’s been appearing more and more recently is shift left. The idea is to move security testing further left, closer to the development of the code. You catch security issues as early in the life cycle as possible, which is when they’re easiest and cheapest to fix. You certainly don’t want to catch them all the way on the right, when you’re already in production and you’ve been hacked. There’s a whole bunch of shift-left tools that let you enforce policies around your code, plus automated testing tools. It’s really nice to see security being considered early on.
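As a small example of the shift-left idea, here is a sketch of a policy check that could run in CI on every commit and fail the build if a configuration opens a non-HTTPS port to the whole internet. The config format and file name are hypothetical; real teams typically use purpose-built linters, policy engines, and dependency scanners for this:

```python
# Shift-left sketch: a policy check that runs in CI, so an insecure setting is
# caught at code-review time rather than after a breach. The config format is
# hypothetical and only meant to illustrate the idea.

import json

def check_no_open_ingress(config: dict) -> list[str]:
    violations = []
    for rule in config.get("ingress_rules", []):
        # Flag anything open to the entire internet that is not HTTPS.
        if rule.get("cidr") == "0.0.0.0/0" and rule.get("port") != 443:
            violations.append(f"port {rule.get('port')} is open to the entire internet")
    return violations

if __name__ == "__main__":
    with open("network-config.json") as f:   # hypothetical config file
        config = json.load(f)
    problems = check_no_open_ingress(config)
    if problems:
        raise SystemExit("policy violations:\n" + "\n".join(problems))
    print("policy check passed")
```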
-
The second pattern is about supply chain security. In the world of software, your supply chain is basically all the software you depend on - your open source libraries, vendor libraries, even the cloud you deploy into. All this software is written and maintained by somebody else. They’re part of your supply chain. And if somebody compromises any part of that, they can do some amazing damage. Hackers have caught on to this, and they’re trying real hard.
-
Something like 70% of the code that a typical company deploys is not written by that company. And that’s an underestimate. That’s just the open source portion. If you factor in things like the cloud you’re running on and the Linux operating system and all the tools on there, it’s probably like 99% of the code that you rely on. And hackers know that. If they can go into some little library hidden somewhere in Linux and put a backdoor in there, they can take over everybody’s software and hardware.
-
The big challenge now is how do you secure the supply chain? How can you be confident that all the software you depend on is the software you think it is and is secure and properly tested? There’s a lot of interesting technologies emerging in that space.
-
Another trend with secure-by-default is the push to move to memory safe languages. Again, this is moving up the abstraction ladder. It turns out that about 70% of the security bugs we face as an industry are due to memory safety issues alone. And those bugs, for the most part, don’t exist in memory safe languages. You just can’t write them in higher level languages where memory is managed automatically. So if we switch programming languages, 70% of our security issues go away. That’s a huge deal.
-
You’re starting to see the US government pushing to move away from languages like C and toward languages like Rust and Go that are more memory safe.
-
That’s a really hard thing, because all of our operating systems are written in languages that are not memory safe. So there’s a tremendous amount of work to do there. But if we can do it, we will make things vastly more secure by default.
-
The final emerging pattern is zero trust networking.
-
The older way of doing things is the moat and castle approach. The idea is you create a secure perimeter around your infrastructure, like a castle with a moat. The perimeter is really secure, hard to get through. But once you’re inside that perimeter, you can do whatever you want.
-
In the software and networking world, this means having strong firewalls on the edge of your network. But once you’re in, if you somehow found a way in, you can access the wiki page over there and that service over there and that database and do whatever you want. You have free rein once you’re inside.
-
This approach made sense when everyone was in a company office, with all infrastructure in the same building and all computers in that office. That’s not the world we live in now. We work from home, coffee shops, and co-working spaces. Your infrastructure isn’t in your office - it’s in somebody else’s cloud. The devices you’re using are smartphones that aren’t particularly secured. So the moat and castle model doesn’t make sense anymore.
-
The new model is zero trust networking. The idea is that being in the network doesn’t give you any special privileges. Every request and connection must be authenticated, authorized, and encrypted. Just because I can access the wiki page doesn’t give me access to that database or issue tracker. Every single thing has to authenticate, authorize, and encrypt separately.
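A minimal sketch of that per-request discipline, using a hypothetical shared-secret identity scheme and a hard-coded policy table purely for illustration (a real deployment would use mTLS, short-lived certificates, or signed tokens from an identity provider):

```python
# Zero trust sketch: every request must present a verifiable identity and be
# authorized for the specific resource, regardless of which network it came
# from. Encryption in transit (e.g. mTLS) is assumed to happen below this
# layer and is not shown.

import hashlib
import hmac

SHARED_SECRET = b"example-only-secret"  # illustrative; use a real identity system

# Authorization policy: which identities may access which resources.
POLICY = {
    "wiki-frontend": {"wiki-pages"},
    "billing-service": {"wiki-pages", "billing-db"},
}

def verify_identity(caller: str, signature: str) -> bool:
    # Authenticate: the caller must prove who they are on every request.
    expected = hmac.new(SHARED_SECRET, caller.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

def authorize(caller: str, resource: str) -> bool:
    # Authorize: a valid identity alone grants nothing; check this caller
    # against this specific resource.
    return resource in POLICY.get(caller, set())

def handle_request(caller: str, signature: str, resource: str) -> str:
    if not verify_identity(caller, signature):
        return "401 unauthenticated"
    if not authorize(caller, resource):
        return "403 forbidden"
    return f"200 ok: {resource}"

sig = hmac.new(SHARED_SECRET, b"wiki-frontend", hashlib.sha256).hexdigest()
print(handle_request("wiki-frontend", sig, "billing-db"))  # 403: no implicit trust
```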
-
We’re starting to see this appearing from vendors. It’s not the default yet, but things like service meshes and similar tools are making it easier to implement, because zero trust networking is really hard to do properly. Maybe at some point it’ll become the default, which would be a tremendous improvement to security.
Recent Trend in Open Source License Changes
-
Terraform and other HashiCorp tools were open source under a fairly permissive license, the MPL license, for over 10 years. They built up huge communities, tons of contributors, third parties, etc. Recently, HashiCorp changed from this open source license, shifting almost everything to what’s called a business source license, which isn’t really an open source license. The short version is you can use this stuff unless you’re competitive with HashiCorp. And that’s problematic for many reasons.
-
But HashiCorp is not the only one. Many other companies across the industry have been changing from permissive open source licenses to these business-friendly licenses that say you can use this stuff as long as you don’t compete with us, or have various other restrictions.
-
I am a huge believer in actual open source, proper open source. Not these business licenses, but the permissive MIT, Apache, those types of licenses. I think they are one of the most important and valuable things we’ve done as an industry. They’re one of the biggest accelerators to all software. And I personally find it devastating to see companies doing anything to hurt trust in open source.
-
This will have profound negative consequences on the industry. There are thousands of DevOps engineers and companies doing the same thing, and that’s true of every kind of software development. We are often doing the exact same things, and sometimes we can share that commercially. But we’ve seen that open source, especially for infrastructure and the building blocks of the internet and our operating systems, needs to be truly open source to be reliable. So I’m personally devastated to see the industry repeatedly move away from these licenses.
-
How do we solve this? And what led to this? I don’t have any internal visibility into the companies that made these license changes, so this is me speculating.
-
What I’ve generally seen as a pattern is things being open sourced by companies trying to achieve hyper growth. These are typically venture-backed companies, either trying to go public or recently public. Their goal isn’t just to make a profit - it’s to grow massively huge and produce tremendous investment returns. And it makes sense. That’s literally why they took venture capital. It’s to produce that kind of growth.
-
I don’t think these license shifts happen when someone isn’t pursuing that growth. It’s rare to see somebody shift license just because they can’t afford to pay two developers. It’s mostly the VC-backed company that isn’t satisfied with making hundreds of millions of dollars - they need to be making billions. And if you need to make billions, you have to find every lever and advantage and eliminate competition to build a monopoly. Hence, the license change to prevent competition.
-
Here are a few key takeaways. If I were picking technology for my company to depend on, especially important foundational technologies that would be painful to swap out, I would only pick things that are under a truly open source license. Chat with your lawyers about what that means, but it’s usually things like MIT, Apache, MPL, and a handful of others like the BSD (Berkeley) licenses.
-
Second, I would look carefully at the companies behind those projects, if there’s a company. Some projects are just individual developers, some are built by foundations, which I think is more secure. Look who is behind the project. If it’s owned and operated by a venture-backed company aiming for hyper growth, or if it was acquired by private equity firms - that’s a similar set of incentives - be very cautious. Think twice about whether to adopt it, because eventually, the incentives will push them to change the license. That’s what we’re seeing.
-
The best version is something managed by a foundation like Linux Foundation or CNCF. These are specifically designed so no single company can mess with the license due to their incentives. Second best is a solo developer, who has the advantage that if they can make enough money or have enough spare time, they can make it work. They’re not likely to chase venture-backed returns.
-
Another option is something backed by a large company where they’re not trying to monetize that piece of open source. Google and Facebook open source things they’re not trying to sell. Those can be reasonably safe bets, but the foundation approach is still the best.
-
In the short term, there are many other factors to consider when picking open source libraries - it’s part of supply chain security. But from a licensing perspective, the hard lesson I’ve learned is to be very careful about who is behind the project and what license it has.
-
In the longer term, this is something we need to grapple with as an industry. What does open source mean? Do we need to start creating licenses that basically say, this is not only open source, but I also pledge to never relicense it? I legally prevent myself from ever switching it to some sort of non open source license in the future.
-
Even more generally, how do we fund open source? I haven’t seen anything that seems like a great solution, but there are at least some positive trends for solo developers or really small teams where you can make a good salary to live on and focus on open source.
-
And the final thing is if you pick projects with truly open source licenses like Apache or MIT, then if the worst happens and somebody changes the license or moves the project in a direction you don’t agree with, well, the whole point of open source is that you have the right - honestly, even the responsibility - to fork it. You can still use the code and you’re not completely stuck.
-
So that is what we’re seeing with a lot of these projects. We are seeing major forks being developed. That’s kind of the fallback, but nobody wants to fork things. It’s not great. You want to have one joint strong community. That’s the best option for everyone. And that can only happen around a truly open source license. But if things happen and you don’t have any other choice, well, make sure that you at least have the option to fork the code and do what you need to do as the backup.
3 Tech Lead Wisdom
-
Never stop learning. Always carve out time for learning.
-
Share what you learned.
-
Do things incrementally and iterate, iterate, iterate.
[00:01:40] Introduction
Henry Suryawirawan: Hello, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today, I have with me Yevgeniy Brikman. Some of you from the infra world probably know Yevgeniy. Yevgeniy is a co-founder of Gruntwork. So if you are into infrastructure as code, you might have heard about Terragrunt or Terraform. He’s the author of “Terraform Up and Running”. He’s coming up with a new book, which is currently in progress, titled “The Fundamentals of DevOps and Software Delivery”.
So today we’re going to cover topics from the book, right? And I’m happy to have a chat with you, Yevgeniy. Welcome to the show.
Yevgeniy Brikman: Thanks so much for having me.
[00:02:15] Career Turning Points
Henry Suryawirawan: Right. Yevgeniy, I always love to start my conversation by asking you to share a little bit about your career, any turning points that you think we can learn from that journey?
Yevgeniy Brikman: Sure. So really quick, kind of big picture on career. Started my career on the app development side of the house, software engineering. Worked at Cisco Systems, TripAdvisor, LinkedIn. And then started Gruntwork. And there’s been a lot of interesting twists and turns and things like that throughout the career. But maybe the biggest thing to share, and I share this with pretty much everyone I talk to and share it in most of my books as well, is just this idea of taking some time to learn every day or every week. But basically what I generally see is, you know, we go to school, some of us go to college, and so we spend this whole time doing this very deliberate learning. And then you get a job and then you stop. And people kind of just stagnate.
And so I’ve written a few books. And something that’s really sad to me is, and I don’t know how accurate these statistics are. Maybe they’re dated, but I remember hearing somewhere that people read something like four books a year. And in the tech industry, less than one book a year on the tech industry itself. Which makes me really sad. And, you know, books aren’t the only way to learn, but I think the general trend is people don’t invest enough in just continuing to learn new things.
And that, for me, has been the single biggest thing that has turned my career and let it grow, is just forcing myself to spend a little bit of time every week doing some sort of deliberate learning. And it’s always fun, right? It’s always something interesting. It’s not like I’m forcing myself to learn something I don’t care about. I just go and pursue the things that kind of are intriguing. Ooh, what’s this technology that I read about over here? What’s this kind of cool thing over there? And that can be by reading books, that can be by playing with code. In fact, it’s better if you do a little mix of both. Just learning without doing doesn’t quite work, and that’s kind of the basis of most of my books is you learn and you do this tons of like hands-on example. So that’s probably the biggest thing is I would encourage folks is just carve out, somehow find the time to carve out every week to learn something.
And I’ll give just a couple examples of how that’s had this just profound impact on my own career. One thing that I was just randomly curious about in, that must’ve been like 2007-2008, I started playing with CSS a lot at the time for building websites and decorating them. And I got curious and I’m going to skip all the technical details of it, but basically I ended up building in my spare time, a little tool that had to do with like creating CSS sprites and things like that. There wasn’t really good tooling around that. And I just did it cause I thought it was interesting. There was no other reason. I didn’t do anything with it. It was just like a fun little hobby thing.
As it happens, about a year later, I had an interview and questions around this stuff came up and I was literally able in the interview to bring up this tool, to talk about it, to share with it. Which was a big part of the reason I got my job at LinkedIn, which had a pretty profound impact on my career. And then I ended up meeting some of the folks that built some of the open source tooling, like around Compass CSS, if anybody remembers that from back in the day. So just this random little project literally helped me get a job.
Fast forward another five years or so working at LinkedIn, and something that caught my curiosity at that time that I spent just a bunch of time playing around. Honestly, that’s what it feels like. It doesn’t feel like learning. It feels like playing around with this stuff. I started looking at all sorts of web frameworks. I was looking at Ruby on Rails, and at the time there was a Groovy on Rails, and all these different things. Kind of became knowledgeable on these things.
And lo and behold, LinkedIn goes through this big DevOps transformation, something maybe we’ll chat about later. And one of the big things we end up doing there is changing the web framework that we use within the company. And I ended up leading that whole project just because I had been playing with these things and I knew about more about it than most other folks. That project got me from kind of the application work that I’d been doing as a software engineer into this DevOps infrastructure software delivery side of the house, which then led to me starting a company in that space and writing books in that space. So these little things that are just kind of fun, but you know, I had to take time to do them, have completely caused my career to change direction in very positive ways. That’s probably the biggest thing I can share with anybody. Take the time to learn. Don’t stop learning as soon as you graduate from school or from college.
Henry Suryawirawan: Very interesting insight from your turning points in your career, right? So I think it’s really key for everyone in tech to actually spend time to learn, right? Because obviously these days tech moves so fast, and there are so many technologies being invented, and, you know, maybe reinvented over time, right? So I guess without continuous learning, I think it’s pretty hard to actually catch up.
[00:08:32] Deliberate Time for Learning
Henry Suryawirawan: Maybe, I know it’s probably hard to generalize, right? Maybe if you can give some tips for the techies here, how do you actually spend your time learning? Is it like every day you spend, I don’t know, like focus time to actually read books? Or is there any kind of a tips that you would want to advise people to, you know, start their own learning, so they can be able to, maybe, I don’t know, like improve themselves in their career?
Yevgeniy Brikman: Yeah, there’s two things that I do, and exactly how I implement them has varied throughout my life. It’ll be different for everybody. One, I used to have just a little tradition where, like, I’m a night owl, naturally, so this is something that has changed in my life, but for most of my life, I’d be up really, really late. And so one of the things I used to do is after everybody else around me, my girlfriend, everybody went to bed, I would carve out like 20 minutes to just either read something or watch an interesting video and then play with the technology. And so, what this ends up looking like is, I find that you learn really well if you both have a little bit of like theoretical, conceptual learning, and then you get to practice it on something very concrete.
So I’d usually find some little project that I just wanted to do. And you don’t have to, like, no one has to see this thing. I have dozens and dozens of little code snippets that will never see the light of day and that’s fine. Some of them, a handful of them made it through and I’ll open source them or whatever else, but that doesn’t have to be the goal. They’re just something I wanted to try and play with. So I mentioned that CSS tool. That was just purely a side project that never really saw the light of day. Whereas the web framework stuff, I actually built a little project using several different web frameworks and then actually ended up putting it in production with one of them, which was a Ruby on Rails thing. I put it on Heroku.
But the whole point of that project was, I wanted to do something, build something. But I didn’t do it using stuff I already knew. I explicitly built it with things I hadn’t used before as the way to learn them. So then I’d watch a video on Ruby on Rails, and then I’d go and try to build something with Ruby on Rails. I’d watch a video on something or read a book on something and then go try to build something. So I think that’s one of the key ingredients: just find the time. And it’s not a lot of time. So this is the thing that really blows my mind, is the difference between essentially zero, deliberate learning, which is I think where the average person is, versus somebody who just carves out an hour a week. You know, just find 20 minutes three times in a week sometime.
The difference is tremendous because that stuff really, really adds up. It doesn’t sound like much, but there’s 50 weeks in a year. The years go by and all of a sudden you’ve spent hundreds or thousands of hours doing something that no one else around you is doing. And you look like you’re some kind of amazing expert. And you’re like, I don’t know, I was like in my pajamas like at like two in the morning like playing with this thing, but you still are far ahead of everybody who isn’t doing that. So that’s one piece is to carve out that little bit of time. When it is, really varies. I used to do it late at night. I’m no longer a night owl. I’ve kind of reworked my schedule. Now, I do it kind of a different time of day. It doesn’t matter. Just find some little time.
The one other thing that I would suggest, that really makes this effective, is to also, if you can, and this is a little bit of a bigger ask, but it’s even higher leverage in my opinion: share what you learned with others. And that can be in the form of writing blog posts. That can be in the form of giving talks, doing podcasts, writing books, whatever you’re able to do. Share it. And this is something, I think it was a Steve Yegge blog post I read years and years ago. One of the biggest fears when I tell people, you know, start blogging, start sharing what you know, is they say, but who’s going to read my stuff? No one’s going to want to read what I write, right? I’m not good at it, no one cares, yadda yadda yadda.
So first of all, that doesn’t matter. You’re sharing this stuff primarily for yourself. The act of trying to write down and share something that you learned will make you learn it better. It makes it stick better. It makes you, you know, you’ll do a little bit of extra research. You’ll connect it to some other concept you hadn’t thought about. So the primary audience really is yourself. So even if no one reads it, that doesn’t matter.
But the reality is almost anybody can write something that will find at least a small audience. And this is the Steve Yegge blog post where he talks about this idea: we picture in our heads that other people know more than us, right? They have this giant mass of knowledge, and I only know this tiny little bit of the world. And that’s not really what reality looks like. What reality looks like is that everybody has, like a Venn diagram, these random little circles of the world that they know, and then these huge vast areas that we know nothing about. And everyone’s collection of these little circles of what they know is different, and everyone has a different path for how they navigate through it. So if you start sharing what you know, you are going to find someone else out there who’s on a similar path to you, or on a path that benefits from what you know, but might not benefit from others’ writing because it’s a different path. So you probably will find an audience. And by writing, you will learn this stuff a lot better.
And the other thing is you will, again, get these really random, serendipitous, completely unpredictable impacts on your life, on your career. And I’ll give just one example from my own career, because it’s such a bizarre and unexpected one. While I was at LinkedIn working on some of this web framework stuff, redoing LinkedIn’s infrastructure, I decided to start blogging about that on LinkedIn’s engineering blog. And so I was just writing and kind of sharing what I learned, and it found a small audience of people that found it valuable. That led to a few talks, which was kind of cool. Got invited to some conferences.
But the really unexpected one is, I was at work one day. And this guy who I’ve never seen before just comes up to my desk and he says, hey, you’re Yevgeniy Brikman, right? I’m like, yeah, who are you? He’s like, oh, I found your blog posts on this framework stuff. And I read them and I really liked them and I thought, oh, this is so cool. LinkedIn is doing this kind of cutting edge, interesting stuff. And that’s what made me decide to be okay with LinkedIn acquiring my company. So this was a guy who had started a little startup and they got acquired by LinkedIn. And obviously this wasn’t the only factor, but like it kind of shared the culture and it made him excited. So I got to know the guy, he’s a really smart guy, really interesting. So first of all, that was just one random thing where I just connected with him.
And believe it or not, a few years later, I leave LinkedIn, I start Gruntwork, and one of my absolute first customers is this guy. He had moved on to a different company. They needed some infrastructure work, and they ended up hiring Gruntwork, and it’s one of the customers that helped us get off the ground, honestly, as a company. He knew he could trust me because he’d met me, and all of this happened because, years and years ago, I’d taken a few hours to write a couple of blog posts on some stuff I learned.
So you never know where this stuff will go, but it probably will go somewhere. Almost everyone I talk to who regularly shares their work, and I’m sure you’ve seen this in your own career, finds it has this weird, unexpected, profound impact. So yeah, I guess in terms of how to learn, find a little bit of time, and then if you can, share what you learned with others.
Henry Suryawirawan: Thank you for sharing your interesting personal story, right? I think I can relate to some of what you shared just now. So the first thing is to carve out time to actually do some kind of learning or maybe do some project, right? You can do some mini project, like, uh, a previous guest, John Crickett, with his coding challenges, right? He advised people to do some kind of mini project to get hands-on experience and learn some fundamental concepts. And sometimes, if we initially carve out maybe 20 minutes and you like that kind of project, you’ll easily find time to do more, simply because you enjoy what you do, right?
And sharing, definitely, yes. I find those are great tips you just mentioned, right? Don’t share it for other people, share it for yourself first. So make yourself the first audience.
[00:16:27] Transitioning from App Dev to Infra
Henry Suryawirawan: So one thing that also piqued my interest from what you shared in the beginning, right? You came from an application development background and then gradually, slowly moved into the infrastructure and, you know, DevOps kind of world. I find some engineers are actually interested in doing this, although some people find it really hard, because they are typically two different worlds altogether, right?
One is more like application design, system design, things like that, while infrastructure, maybe things like cloud, you know, data centers, you know, servers and things like that. So maybe from your experience as well. How do you actually bridge the gap between, you know, like knowing application development and then moving into the infrastructure world? Because for some people, this is typically a hard thing to do.
Yevgeniy Brikman: Yeah, that’s a good question. So one thing that I’ll say, part of the reason I ended up on the infrastructure side of the world was I mentioned LinkedIn had all of these issues, honestly, with all of its… the term DevOps didn’t really exist or had just appeared on the scene, so we didn’t call it that. But we had issues delivering our software, right? We had a bunch of app developers who wrote code and built this stuff and then getting it out to users and running it and maintaining it and scaling it and securing it. Those were really, really, really tough. We struggled with that enormously. And so part of what got me in there was kind of being thrown into the middle of this, and it’s like, well, we got to fix it. It doesn’t matter if you’re an app developer, if you can’t ship your code, it’s not going to provide any value. So we got to fix how we shipped code.
In terms of learning it, I personally struggled tremendously to learn it. I found the space, especially back when I was starting to do this stuff, around 2011, that’s roughly the timeframe. So it’s a little different now, but back then, it was just legit hard. It seemed like there was this app development side of the world, which takes a while to learn, but there seemed to be a lot of resources for understanding it and learning it. And then there was the other side of the house, which just felt like magic, right? Like there’s just these people that literally knew these little magic incantations in Linux or in DNS or in whatever else, and then they would wave their wand and suddenly your app was running, and then it would crash and they’re out on vacation. You have no idea what to do about this. And it was, it was very frustrating, to be quite honest.
So I kind of learned it the hard way. And I think a lot of people learn it the hard way. That is, something breaks. You go through a tremendous amount of pain. You maybe fix it. You probably fix it the wrong way. It’ll probably break again. You’ll be awake again at four in the morning, you’ll fix it. And eventually you kind of figure out better and better ways to do it. But it was a slow and painful process.
To be honest, one of the reasons I built Gruntwork is because this stuff was driving me nuts. Like, it’s a little weird to say, but I started a company to work on something that I just really didn’t like. I just really did not like this aspect of software development where I’d have a great idea, I’d build it, I’d design it, we’d all be excited, and then you’re like, oh, God, how do we get it out there in a way that’s not a security nightmare and is maintainable, right? You can deploy it, but then how do you update it without breaking everything? It was just a really, really tough space, and that’s part of the motivation for building Gruntwork: all I could think to myself was, there must be an easier way to do this.
And this was also the motivation for writing this new book: I couldn’t find a single comprehensive resource for learning this stuff. As I said, I had to learn it the hard way, piece it together a little bit from here and a little bit from there, some from doing it wrong. Oops, I found a book that happened to mention something, and oh, now I understand how to do this. This guy over here knows how to do it. This person over there knows how to do it. So in this book, I tried to bring together a very hands-on guide to the most common things you have to deal with to deliver software.
There are a lot of resources these days on the DevOps side that will teach you the cultural aspects of DevOps. And I think that’s also incredibly valuable, and I do recommend reading those. At the end of the book, I have a whole set of recommended reading on a variety of topics, including that. And so things like, you know, how do you structure your teams or how do you do a post mortem? Or how do you do things like error budgets and SLAs and SLOs and all of these things? And I think those are really valuable. But as far as I know, I still haven’t found a single guide that’s like, hey, you built an app. You know, here’s your little Java app, your Ruby on Rails app. What’s next? How do you put it on www.something.com so everyone can use it and it scales and it’s secure, etc. So that was a big motivation for the book as well.
So I guess my answer now, in 2024, for how to learn this stuff is, you know, get my book, obviously. And beyond that, there are a few other things that are quite valuable. I list, again, a whole bunch of them in the back of the book. There are a few guides to the cultural aspects of DevOps that I think are very solid. And there are a few online courses that do a decent job walking through things. Other than that, do it. Just practice it. Find some kind of a side project. Find something at work. Find some way where you need to deploy an application in front of other human beings and just start practicing it. And I think that’ll make all these concepts kind of come together. So it’s still not easy to learn. It’s a little easier. And maybe with this book, it can be a whole lot easier.
One of the things I write right in the preface of the book is, I’m really, really hopeful that we’re going to have a generation of application developers and infrastructure developers who can learn this stuff not the hard way. Because here’s the other truth. When you have bugs in your application code, occasionally, there’s some pretty severe bugs, and we all have, you know, occasional nightmare stories, but usually they’re minor things, they’re annoying, and you kind of fix them. When you have bugs on the software delivery side, you tend to like just take everything down. You tend to like accidentally delete your production database. You tend to have a security breach, right? Like these are serious problems. So learning this stuff the hard way is not only just painful and unpleasant for you. It’s also legit harmful for the software industry, in general, right? Like we’re building software in a way that isn’t very secure, that isn’t very reliable, that isn’t very stable. Because people don’t know how to do it. So I’m really hopeful that the next generation of folks that comes along will benefit from some of this stuff and maybe learn it just a little better than what I had to do.
Henry Suryawirawan: Thanks for a good segue to explain the motivation behind your book, right? So I find also, for application developers, it is definitely very, very valuable if you also know the infrastructure or software delivery side, right? Because you cannot just write code and let someone else ship it and make it into production. If you actually know end-to-end how to deliver your software to production and to users, and even better, if you know how to scale it, how to make it secure, and how to update and roll back, and things like that, I think that will make you a very, very valuable engineer. And those skills are really valuable in the industry.
So you mentioned that software delivery is a hard thing to learn. You typically learn from your mistakes. And there are so many concepts, especially now, you know, you have cloud, you have, uh, different products, different paradigms, like, for example, VMs, containers, serverless, and all that. I think there are just too many permutations for some people, right? And I find your book is one of the rare resources that you can actually use to learn, because there’s a mix of infrastructure-related content along with application development and software delivery practices as well. So I highly encourage people who are interested in learning this to pick up your book and try to learn from there.
[00:24:19] Understanding How to Deliver Software
Henry Suryawirawan: So you mentioned the story of LinkedIn, right? Initially they were struggling, you know, to deliver great software, in terms of scaling and all that. So what makes these kinds of skills really valuable? Maybe you can tell us why someone should learn how to deliver their software in a much better way?
Yevgeniy Brikman: Yeah. So I’ll add a little bit of context on this LinkedIn story, and that might help answer this question. So back in, it was around 2011, to some extent, LinkedIn was looking great. We had just had our initial public offering. The stock price was going way up. The product was doing great. We were adding something like two new members to the website every single second of the day. So it was just crazy hyper growth. It also sometimes felt like we were like hiring two employees every single second of the day, because the company internally was also growing through hyper growth. So things seemed really, really good. Revenue was way up, etc.
But the reality was we got to the point with our software delivery practices where we could not deploy. And I don’t mean hypothetically. I mean, like in actual practice. What we were doing back then is there’d be a deployment once every two weeks. And so this was like this release train model. You know, if the train leaves the station, either you’re on it or you got to wait for the next one, two weeks from now. And so we had one of these, we did all this stuff, the release branch, etc, and then we went to deploy it. And we rolled out a bunch of code, and a whole bunch of things died, and broke, and crashed, and things were pretty bad.
So we spent a bunch of time bringing everything back up, and trying to fix the bugs, and rolled out new code, and that broke more stuff. And then we rolled out new code to fix those issues, and that broke more stuff. And it went on for a day. And then to a second day, and we just could not stabilize the new code. And we literally just had to roll everything back and kind of patch things up and mostly get things back to working before the deployment started. So we literally could not deploy. Like we were completely, totally stuck. And so the company had no option.
At this point, a new VP of engineering came in. And he basically saw no other option, and we had to put all product development on hold completely. This was something called Project Inversion. You can Google it if you want a lot more of the details, but literally, we just said, no more product development. Every single person in the company, whether you’re an app engineer, whether you’re a designer, marketing, whatever, is going to go and work on internal stuff and tooling. And, again, the term DevOps didn’t exist at the time, but that’s basically what we were trying to figure out: DevOps processes and software delivery practices, until we could reliably and quickly and effectively ship code.
So maybe the very first reason to learn software delivery is so you don’t get into that state. That’s not a good place to be. That’s the kind of thing that put the entire company at risk. And we got through it, and you know, we went from deploying once every two weeks, or not, as the case may be, to deploying hundreds of times per day with far fewer outages and issues. So it ended up being a success story. A lot of good technology came out of that work. Like Apache Kafka was developed at LinkedIn during this kind of period. It worked out okay, but it could have gone very, very poorly. And so that’s one reason to understand some of the basics of software delivery, no matter whether you’re an app engineer or something else entirely. You don’t want to go there. You don’t want to get to the darkest, longest nights that you can get to.
Second reason to learn it is honestly to be independent. There are companies where you can work, and I, early in my career did this as well, where you’re the app engineer, you build a thing, and then you kind of toss it over the wall. And again, somebody magically deploys it for you. That’s actually more rare these days. But even if you are today at a company where that’s the case, there’s a real good chance you won’t be tomorrow. One of those reasons is you might want to start your own company, or a side project, or something for fun. And so understanding how to take your app and do something useful with it so it works in production in a reasonable capacity is incredibly liberating. This was one of my favorite kind of discoveries as part of this process is all of a sudden I could take any idea I had and ship it, right? This was one of the things that gave me confidence to like start a company and also do a lot of other projects even before that, is just understanding some of these real basic things.
So being independent. And these days, more than ever, I think it’s hard to get a startup noticed. So the marketing thing is still really, really hard. But, oh man, is it easier to start a startup now than ever before. Because of the cloud, because of the technologies out there, because of the knowledge that’s out there, because of the reach through the internet, through mobile apps, through things like this. It is easier than ever to start your own thing, uh, but if you don’t know how to deploy things, you’re not going to go very far. So that’s, I think, a real good second reason to learn this stuff.
The third one is, even if you’re in a place where somebody else does the DevOps-y software delivery stuff for you, and to call it out frankly, and I can say this because this was me earlier in my career: you are not a particularly good app developer if you don’t understand how the stuff gets deployed and maintained on the site. The code you write is probably terrible. And again, I can say this because the code I was writing was terrible. I see that now. And you only understand that once you understand these things.
Like you have to understand the architecture of your site, right? That is one of these places where app development and infrastructure development overlap: the architecture, right? Is there one copy of your app running in the world? Or are there 200 copies on different servers? Are you a single monolith? Or are you 127 microservices that communicate over the network? These have profound impacts on how you build your application. Security is another huge one. The amount of insecure code out there that exists just because people don’t have a concept of what is really happening on these servers is kind of terrifying for our industry.
So that, I think, is a huge reason: you don’t have to be an expert in this stuff. You don’t have to go super, super deep like the people that are handling this. But if you are completely unaware of how code is deployed and scaled, you know, what high availability means, things like the CAP theorem, etc., if you’re completely blind to those, you’re going to be writing terrible, terrible application code, and your career probably won’t go very far. Your bosses won’t be happy, your customers won’t be happy. So I think that’s a very compelling reason for everybody to learn at least a little bit from the other side of the house.
Henry Suryawirawan: I highly agree with your last point there, right? If an application developer doesn’t know how the code gets shipped, scaled, and secured, I think the code they write typically won’t be that great, simply because they don’t know how to support the system later on and what kinds of issues typically happen. And worse, they don’t actually have ownership, because they think their job is just to write code and somebody else will operate it. So if you want to be a good software engineer, make sure that you know how to deliver the software.
And thank you for sharing the LinkedIn story as well. I find it really fascinating, right? Almost every company that goes big, every scale-up out there, would face this challenge at a certain point in its life. Typically, they couldn’t deploy, they couldn’t scale when there was so much traffic and demand. They couldn’t release new features or new updates, or it took a long while to actually deploy their features. So I think these are typically the turning points where companies will invest a lot more in software delivery. And I’m sure we don’t want to wait until that time before it actually happens, right?
[00:32:05] Minimum Effective Dose
Henry Suryawirawan: So I think the other thing that you mentioned in the book is something that I find really, really important to discuss here. Because typically when people talk about software delivery these days, there are so many technologies. Okay, let’s use Kubernetes. Let’s use, you know, serverless. Let’s use containers. Let’s use whatever technologies are available out there. And even though it’s a simple problem, they start with all these different technologies and techniques available. In your book, you mention this concept called the minimum effective dose, right? Kind of like MVP style, you know, the minimum thing that you need. So maybe tell us why it is important to think of software delivery in this manner.
Yevgeniy Brikman: So, one of the things that our industry is very much guilty of is following. We’re like the fashion industry, right? Somebody puts a blog post out there that’s like, look at this microservice architecture that Netflix moved to. And now you have thousands of developers that are like, we must do microservices like Netflix. And then somebody else put something out about service meshes. And now everybody rushes to do service meshes. They’re the hot new thing. And we kind of are following these trends, these things that sound cool and sexy and this like resume-driven development, right? Ooh, I have to put Kubernetes on my resume, right, to get hired. So we’re all going to use Kubernetes.
And I think what people really miss out on is context, right? Netflix, just to use them as an example, you know, when they moved to microservices, there was a certain context in which that decision made sense. And that context is that they have thousands of developers working on millions and millions of lines of code to serve hundreds of millions of customers, with I don’t know how many petabytes of data going through their systems, and things like that. And within that context, the technical decisions that they’re making are a good fit. If you are a three person startup trying to get off the ground and you have zero customers, you don’t have that context. And if you try to apply their solutions in the wrong context, it will be actively harmful. It’ll be a really, really poor fit.
And I’ve seen this again and again. The classic one is, again, this three person startup. They have three developers and they have like 37 microservices to manage and a service mesh and they’re using Kubernetes and all these fancy things. It’s just not the right fit. Like the amount of complexity, these things have a cost. They all have overhead. They all have drawbacks. Every technology, right? That’s almost the hallmark of a senior engineer. A junior engineer will tell you what’s cool. The senior engineer will tell you all the drawbacks to every single thing, because all of them have drawbacks. Every single thing you use has advantages and drawbacks. And those drawbacks are worth it in a certain context. Like microservices: in the book, I have I don’t know how many pages of drawbacks to using a microservices architecture. There’s a huge number of drawbacks, but those are worth it in a certain context, you know, à la Netflix or Google or LinkedIn. They don’t make sense in all contexts.
So one of the things I talk about is you typically want the minimum effective dose. This is a term from medicine, where in the world of medicine, if you take some kind of a pill, the reality is most medicines at the wrong dose will kill you. That’s the reality. They’ll be really bad for you. So you usually aim for the minimum effective dose. What’s the smallest amount of this medicine I can take that gives me the benefits I’m looking for without all the drawbacks, without killing me, essentially? And you want the same thing with almost any technology choice, and that includes all of these DevOps and software delivery processes. What’s the smallest amount I can do that gives me the benefits I want while minimizing the drawbacks?
So jumping to like a giant ultra super fancy microservices architecture, that’s probably not the minimum effective dose, right? Sometimes you need to look and ask, like what’s the concrete problem we are actually facing? So for example, if the problem is that you have an outage every time you do a deployment, maybe moving to a microservices architecture will help that, but maybe all you need is to automate your deployment, right? Maybe you need a little bit of infrastructure as code or some scripting or things like that, right? Maybe you need a better CI/CD pipeline. Maybe you just need better automated testing in your existing code.
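To make the "just automate your deployment" option concrete, here is a minimal sketch of what that smallest-dose automation might look like; it is purely illustrative and not taken from the book. The registry name, image tag, test command, and the final deploy.sh step are all placeholder assumptions for whatever your team actually uses.

```python
import subprocess
import sys

def run(cmd):
    """Run one step of the pipeline and stop the whole deploy if it fails."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def main(tag="latest"):
    run(["pytest", "-q"])                                             # automated tests first
    run(["docker", "build", "-t", f"registry.example.com/app:{tag}", "."])
    run(["docker", "push", f"registry.example.com/app:{tag}"])
    run(["./deploy.sh", tag])                                         # hypothetical final deploy step

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "latest")
```

Even something this small makes every release happen the same way, which is often enough to stop the deployment-day outages without any re-architecture.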
And the key thing is you want to invest as little as you can in this stuff, and still get the benefits and minimize the drawbacks. Because here’s the other thing I can tell you as a startup founder, and someone that’s, you know, seen the business world: your customers do not care what technologies you’re using. None of them care if you’re using Kubernetes or service meshes or whatever. Like they just don’t care. It doesn’t matter. Honestly, your investors probably don’t care. Nobody cares. They just want something that works. And so every single minute that you invest in under-the-hood stuff that isn’t absolutely necessary is probably wasted.
Like most companies live or die based on the product, based on their ability to reach customers. So product I’m using as a broad umbrella for marketing, sales, product, right? That’s the stuff that matters. And if you get to the scale of a Netflix, yes, you’re going to need to do a lot of crazy stuff in your architecture to make that product work. But it’s the product, like making that work is the real goal, not the under the hood craziness. The under the hood craziness, it’s a cost. It’s the cost you pay to support that product. So if you don’t need to pay that cost, don’t. Absolutely, don’t!
So yeah, I talk about that quite a bit in the book. I talk about how architectures and software delivery processes tend to evolve within companies. Because there’s some very, very common patterns, right? Almost everybody starts with like the single monolithic app and maybe it’s connected to a little database and that’s it. And that’s fine. That’s actually the minimum effective dose to start out. There are tremendous businesses that use that architecture very successfully. And there’s not one compelling reason to make that any more complicated if they don’t need to.
But if you have to, right, if you now have millions of users, well now you start to make it a little more complicated and you add some caches and some of this and that. And eventually, usually not from scale in terms of customers, but from the scale of the number of teams internally, you might break the monolith into some services and then you could do service ownership. But you want to align your company to the right stage in this evolutionary process. And it really does feel like evolution. It feels like growth. We don’t design and build our technology in one shot, right? It grows. And if you try to design and build it and jump 10 steps ahead, it usually doesn’t work, and then you have to go back, you know, go back to the simple thing and let it grow.
So, yeah, think of it as the minimum effective dose. Think of it as incrementalism, which I can talk about at some point, maybe today, as the way you want to approach these things. Don’t go for the technology because you read a blog post. That’s not the best way to design your architecture.
Henry Suryawirawan: Right. I think this is gold, right, for everyone who just listened to the explanation. So I think context is king, right? Whatever technologies you read about, whatever technologies you learn, obviously they will paint all the great stuff. But there is always a cost associated with it, be it, for example, maintaining it, or the skill set that you need to learn. Don’t forget about that. And typically, it’s not just, okay, I managed to run it the first time, right?
So there are many other things that you need to take care of, like, for example, the administrative stuff. If massive scale suddenly comes, do you know how to operate it in the most efficient manner? And also the security part, right? How do you secure the actual infrastructure and the things that you use? That’s another thing you should not forget. And don’t forget the people aspect: when you operate multiple microservices, you will probably need more people to help you maintain them. So definitely context is king, and I really agree with you. Fully agree. If you’re a small startup, don’t start with the fancy kinds of technologies.
So stage also definitely matters, because in most of the stories that I’ve heard or read, there will be some point where you need to split, rewrite, or re-architect, right? But that is associated with the growth or success of the company. People like to gold-plate their solution, right? They think, okay, we will succeed. But don’t assume that you will reach that stage. Just wait until the growth happens and then you can adapt accordingly.
[00:40:34] DevOps Antipatterns
Henry Suryawirawan: So you mentioned the minimum effective dose. From your experience working with a lot of infrastructure and DevOps, are there any other anti-patterns or pitfalls that you want to advise people to avoid?
Yevgeniy Brikman: There’s a whole bunch, and again, they’re going to often be context specific. One of the ones that comes up quite often, I think you actually touched on it a little bit. I’ve seen this especially with the infrastructure as code space. But it comes up just about everywhere, is somebody gets excited about a technology that they ostensibly adopted in the company but they didn’t actually do the legwork to get everyone bought in and to get everyone the time to learn and really master this technology. So in the infrastructure as code space, this happens a lot. Somebody gets really excited about Terraform, OpenTofu, Pulumi, whatever kind of tech.
And the most common pattern is there’s just the one guy at the company who just loves it. And he brings it in and he writes tons and tons of code. And it’s like, it can be amazing code and it’s lovely and it’s beautiful. But the rest of the team has no idea what that guy’s doing. They haven’t been given the time to learn. It’s not something they’ve used before. And inevitably what happens is there’s some sort of problem, an outage, something crashed, they need to fix it. And they don’t know how to use the code to fix it. So what do they do? They fix it manually.
Well, the thing with infrastructure as code, and a lot of these tools, is if you go and do something manual over here, well, now your code doesn’t really reflect that. And the next time you try to use the code, it’s going to run into issues because of this mismatch. So now the code doesn’t work. So then even if somebody tries to do it the right way and use the code, they can’t. They hit a bug, they hit a problem, so then they go and do some more stuff by hand, manually. ClickOps, essentially. Which makes the code even more problematic and, you know, rinse and repeat a couple times, and suddenly the code doesn’t work at all. And all of this work that this one guy did is essentially just thrown away and they have to start from scratch or do something different.
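To show what that mismatch looks like in practice, here is a rough sketch using Pulumi's Python SDK (one of the tools he mentions), purely as an illustration; the AMI ID and resource names are placeholders.

```python
import pulumi
import pulumi_aws as aws

# The code says this server should be a t3.micro.
web = aws.ec2.Instance(
    "web",
    ami="ami-0123456789abcdef0",  # placeholder AMI ID
    instance_type="t3.micro",
    tags={"Name": "web"},
)

pulumi.export("public_ip", web.public_ip)

# If someone fixes an outage by resizing this instance to t3.large by hand
# in the console, this file no longer matches reality. After a
# `pulumi refresh`, the next `pulumi up` will show a diff and try to change
# the instance back to t3.micro -- exactly the kind of mismatch that pushes
# teams back into ClickOps.
```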
And that’s wasteful, and it’s a shame. And people of course blame the infrastructure as code tool. But the reality is, it’s not really the tool, it’s just that the team never had the time to adopt it, to buy into it. Because it is a genuine change in your process, right? You’re taking folks that are used to SSH-ing to a server and running a bunch of commands as the way to do things. And you’re asking them to, like, go check out a repo, open it in an editor, find some piece of code, maybe run some tests, commit the code, maybe have some more tests run. Then there’s some automated process, right? Like it’s a completely different way of doing things. And it has advantages. But if the team’s not bought in, if they don’t have the time to learn it, you’re not going to gain those advantages, essentially.
So that, that has been something I’ve seen again and again and again, just not getting that full buy in and adoption. And that sometimes happens because there’s one person excited. That sometimes happens because somebody at the top just mandates something. Just marching orders, we will use technology X. And everyone below them kind of shrugs and says, okay, but you know, they’re not really bought in so they don’t do it. So that’s usually, that’s a very, very common pain point. And honestly, I would say you’re better off not using the quote unquote “better technology” if you don’t have time to do it properly. Like if you’re going to do it, you need to do it the right way. Otherwise, use your time for something else that’s more valuable. Again, minimum effective dose.
[00:44:02] Incrementalism
Yevgeniy Brikman: The second, I think, closely related issue, and this really ties into the minimum effective dose concept and the incrementalism I mentioned, so it’s probably a good time to talk about it, is the attempt to do the big bang migration, right? This usually comes from somebody above, who says, we will move out of our on-prem data center in six months. And also we’re going to adopt DevOps, whatever that means, and we’re going to use Kubernetes, and we’re going to be cloud native, and it’s buzzword, buzzword, buzzword, a long list of things, and we have to do all of it at once. And if you have any degree of complexity, and mostly these are companies that have been around for a decade or two, with millions of lines of code and lots of customers, you can’t do that stuff quickly, and you don’t even want to try to do it as one giant ball of mud, which is essentially what it turns into. Trying to do everything at once, trying to do a massive rewrite, usually ends poorly.
And I would say usually, but if I’m being completely honest, I would say 100 percent of the time. Across every single one of these large companies, and at this point I’ve had the privilege to work with probably over a thousand different companies, I have not seen a single one of these big projects completed on time, on budget, or usually at all. What I typically recommend is to do things incrementally. And the word incrementally is very easy to misinterpret. A lot of people hear incrementally, and they just think, okay, chop it up into small pieces. And that’s not really what it means.
So to understand what incrementally means, the most useful way is to look at the opposite, which is false incrementalism. False incrementalism is where you take a project and you slice it up into small pieces, but until the very last item is delivered, the project does not provide any value whatsoever. So say you’re doing this big rewrite where you’ve got to rewrite the way your front end app works, and the way the back end app works, and the way you deliver these apps, and the way you secure these apps. You could do each of these as its own piece, and that’s what a lot of people think of as incrementalism, but until they’re all done, you can’t actually ship it live, you can’t put any production traffic on it. That’s not incrementalism. That is a big bang migration that you just happen to do in little pieces, but you get zero value until the very last item is delivered.
And that’s a really, really risky way to approach a project. Because in the tech industry, large projects almost always fail. They just don’t get completed. And there’s a million reasons for that. I’m not gonna have the time to dig into them, but the most common one, honestly, is people just lose patience. The CEO says, yes, you can do that rewrite, but after 18 months of waiting for your rewrite to complete, they’re like, no, we got a business to run. Your rewrite has been canceled, right? And if at 18 months that person upstairs says you’re done, you’re out of time, and you haven’t gotten any value from your work, well, congratulations, you just threw away 18 months and got nothing in return. That’s the worst possible outcome.
So the better approach is to use actual incrementalism. And the key is you don’t just chop it up into pieces. But each of those pieces must be something you can deliver and get value from by itself, even if the other pieces never happen. That is real incrementalism. So you’re able to do one little thing, and if the rest of the project is canceled, well, that little thing was still worth doing. It still made your company better in some way. That’s the way to approach this, to approach these things that actually works in the real world.
And in practice, what that translates into is you look for concrete, specific pain points that your company is facing. You don’t look to do DevOps. You don’t look to do cloud migrations. These are solutions. You look for problems. The problems might be outages. The problems might be security problems. The problems might be that your team is really slow. Like you’re not agile or you’re not delivering software as quickly, right? And you fix one, whatever that most painful problem is, you fix one at a time and you fix it all the way. And then you go to the next highest priority problem. And you pick the minimum effective dose to solve each of these problems. So if the team is slow, maybe what you need to do is to deploy more often, or maybe you need to test in production, or maybe you need to change the web framework, right? You have to find out why is the team slow, which is the most important thing you can ask. The answer isn’t just do DevOps, right? Like there’s a cause and you want to fix that very specific cause.
So that’s the approach I recommend. Go find very, very specific problems. Fix those very, very specific problems, and then just repeat the process over and over and over. And eventually, you’ll have something where you maybe are using the cloud or infrastructure as code, whatever. But you get there in a way where every single step has delivered a little bit of value, which makes the person upstairs happy, which makes you happy and actually makes your company more successful. Even if, eventually, somebody runs out of patience and you’re not allowed to do any more of those.
Henry Suryawirawan: Wow. I think those are really great learning points, right? The first is about bringing people in, you know, getting the buy in, making sure they follow the journey, right? Not something that just a few people want to champion, but then it dies off after some time, right? Incrementalism, thanks for explaining the concept, right? The false incrementalism and the true incrementalism. And the last thing is probably, yeah, find from the pain points first, the problems, rather than, you know, implementing from buzzwords or resume-driven development.
[00:49:37] The Future of DevOps and Software Delivery
Henry Suryawirawan: So speaking about, you know, some of this evolution, right? I know that recently we have AI and so many other cool new technologies coming. What do you think is the future of DevOps and software delivery? Maybe you have a crystal ball here that you can share.
Yevgeniy Brikman: Yeah. In the very last chapter of the book, I try to take some guesses at some interesting trends that are coming. I’ll put a big caveat here: anytime you try to guess the future, you’re usually hilariously wrong, and that’s okay. Um, I don’t mind being wrong. I think it’s just interesting to look at what kinds of cool trends are coming out. And my guess is a few of these will be kind of interesting, and then what will really happen is something none of us saw coming will be the big thing. And that’s fine. So what are some of the trends that I’ve been seeing? Um, I’ll list a few of them. I probably won’t have time to get into all of them.
But the first one, and I think it’s the pattern we’re going to see through all the trends, is this move towards higher and higher level abstractions. The analogy that I use is if you look at the history of programming languages, what we’ve seen is a move to higher and higher level programming languages. So you started with, you know, binary machine code, you moved on to assembly languages, you moved on to, you know, things like C, eventually Java, and then, you know, these days Scala, and even more modern languages over the last few years. And at each level, a couple things happen. One, you give up some degree of low level control and power, right? And we can all hear the C programmers yelling at Java programmers, you know, hey, I can’t control memory the way I want to, and you know, I need this control. And they’re right, right? For some percentage of applications, if you need that control, yeah, stick with a lower-level language. But in reality, the vast majority of use cases don’t need the lower level control.
And so the second thing that happens with these higher level languages is they make it easier. They make it faster. They make it more productive to do what you need to do when you don’t need the lower level control over things. And so that makes programming more accessible to more people, and that lets us build things faster. And so that’s why the whole industry gradually is shifting towards higher level languages. Never 100%. You know, there’s still code being written in C and machine code and assembly, and there will be for a long time. But the majority of developers gradually move to these higher level languages.
I think the same thing is happening and will continue to happen in the DevOps and software delivery space. We moved from, you know, I need to buy a physical server and shove it in a rack and deploy my software on it and hook up cables and all this stuff to the cloud using virtual machines, to the cloud using containers, to now serverless where it’s not even containers, it’s kind of this little deployment package. And I think there’s going to be, we’re going to keep climbing that abstraction ladder. I don’t have a good name for it. I kind of pitched this idea of infrastructureless, which is the next evolution from serverless.
So serverless today is pretty close. And if I was a betting man, and again, I could be hilariously wrong here, but if I was a betting man, I think the future is going to look a lot, lot more like serverless than it does like Kubernetes, for example. I don’t think Kubernetes is the long term direction of the industry. And when I say serverless, you know, what we have today is you kind of hand it a deployment package. Here’s a little piece of code, sometimes a function with an entry point. And you say, please run it when a certain trigger happens, like an HTTP request comes in. So that’s pretty high level, right? Here’s code. You go figure out how to run it. I don’t want to be bothered with the details.
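For readers who haven't used it, here is a minimal sketch of that programming model, assuming AWS Lambda behind an HTTP trigger such as API Gateway or a function URL; the greeting logic is just an illustration, not anything from the episode or the book.

```python
import json

def handler(event, context):
    # The platform decides where and how this runs; you only supply the entry
    # point and configure which trigger (e.g., an HTTP request) invokes it.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```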
But in practice, I’m still bothered with an awful lot of details with serverless. If you’re using, for example, Lambda in AWS, you still have to think about AWS accounts. You still have to think about the networking aspect. You still have to think about provisioned concurrency. If you need to talk to a database, you have to think about how to do long-running connections with serverless. So there’s just a bunch of infrastructure stuff that is still left.
I think when we go from serverless to this concept of infrastructureless, and just as a reminder, serverless doesn’t mean there aren’t servers. There’s obviously still servers and infrastructureless doesn’t mean there isn’t infrastructure. It’s still there. It’s just not something you have to think about as a developer on a regular basis. You give up some control, right? I can’t control what’s happening under the hood. And in certain use cases, that’ll be a problem and I won’t be able to use it. And that’s okay. But for the majority of use cases, I don’t need that control. I instead just want to hand you some code and say, you go figure out how to run it, how to scale it, how to secure it, how to handle a lot of these details.
So I think that’s where we’re headed. I think serverless is a very early glimpse of what that can look like. And you can tell it’s not a very mature technology yet, so there are all these problems and drawbacks to using it. But to me, it smells like the future. And I think if you extrapolate it out, things are gonna look much more like that and kind of turn into this infrastructureless thing for most companies. Not everyone. There will always be companies that need the lower level control, and they’ll stick with either running their own servers or something a little lower level. But most use cases will move on. So I think that’s one interesting pattern.
A second one. I’ll talk about three, I guess. A second one that I think is interesting, and I get questions about this all the time, is: what’s the impact of AI on the infrastructure and DevOps world? And when people talk about AI, they’re usually referring to these large language models, LLMs, things like ChatGPT, and technologies of that sort. I’m not an expert on this AI stuff, so I will put that right out there. My experience is with using a whole bunch of them and usually tearing out a lot of hair. I think there’s some early enthusiasm from folks that are saying, okay, AI is going to replace us. It’s going to write all the code for us. It’s so productive. To me, and again I could be hilariously wrong, that strikes me as less likely to be our future, especially in the DevOps and infrastructure world.
The reason I say that is the things that matter more than almost anything else on the infrastructure side of the house are things like reliability, reproducibility, security, predictability, right? Like it’s kind of the boring stuff, but that’s what matters in the DevOps world. And the problem with at least all the LLM things that I’ve seen to this date is those are the exact areas where they’re really, really, really weak, right? LLMs are notorious for just hallucinating things, right? Just making things up. And that doesn’t seem like a small bug of one LLM implementation. I think that’s just core to the infrastructure, to the way that they’re designed.
They’re also kind of random, like literally they have a random seed built-in and the tiniest little change to a prompt can profoundly impact the results, right? You change like two words, like a pronoun here, and all of a sudden you get a completely different response. So it’s really hard to get like repeatable, dependable results. And the last thing I want is like the security of my company being dependent upon an AI that will like hallucinate an answer or will give me a slightly different answer. You know, it used to work, but now today they changed something in the model or I changed my prompt and now I get something that’s not actually secure. So I’m not very bullish on the “AI is gonna replace this” angle. That strikes me, at least with what I’m seeing today, as not super plausible.
I think the place where AI can get interesting on the DevOps world is what’s called retrieval augmented generation, RAG. I don’t know if you’re supposed to pronounce it as RAG or not. You know, if this is like the JIF GIF debate. But anyway, I’ll call it RAG. And the idea here is you take one of these large language models, and then what you add is some additional context, usually some sort of database with extra up-to-date information about your very specific use case. And so this thing can combine these two pieces of information, the model it was trained on and your up-to-date info, to give you, hopefully, much better responses about your specific context.
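A rough sketch of that retrieval-augmented generation flow is below; it is not any specific product, and the `search_runbooks_and_metrics` and `call_llm` functions are hypothetical stand-ins for whatever vector store and LLM API you actually use.

```python
def search_runbooks_and_metrics(question, top_k=5):
    # Stand-in for a real vector-store query over your own deploy history,
    # metrics, and runbooks.
    return ["(retrieved snippets about your infrastructure would go here)"][:top_k]

def call_llm(prompt):
    # Stand-in for whatever LLM API you actually use.
    return f"(model response based on a prompt of {len(prompt)} characters)"

def answer_with_context(question):
    # 1. Retrieve: pull the most relevant, up-to-date documents about *your* systems.
    docs = search_runbooks_and_metrics(question, top_k=5)
    # 2. Augment: put that context into the prompt alongside the question.
    prompt = (
        "Answer using only the context below.\n\nContext:\n"
        + "\n---\n".join(docs)
        + f"\n\nQuestion: {question}"
    )
    # 3. Generate: the model combines what it was trained on with the retrieved context.
    return call_llm(prompt)

print(answer_with_context("What changed right before the 2 a.m. outage?"))
```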
And I think there’s two ways that that can be interesting. The basic one, and that’s the one we do see today, and there’s a bunch of tools trying to do this, is you feed it the information about your infrastructure as it is today. Here’s how I deploy things. Here’s all my metrics. Here’s all my structured events going through my systems. And now that’s the additional context that I give these RAG tools. Now they can answer questions for me about my own infrastructure. So if I have an outage, I can say, what changed? What did we just deploy? What happened? What metrics changed, right?
And so we’re starting to see that. You know, there’s a tool called Honeycomb, and it’s starting to use AI to kind of intelligently help you debug outages, for example. We’re seeing that in some monitoring tools where they are trying to predict that you’re about to have an outage based on certain metrics. I think that stuff is pretty interesting, and I do think eventually it’ll be reliable enough that you can really use it and really accelerate preventing outages, debugging outages, or even just navigating and understanding how your infrastructure works. I think this could be very useful.
The one that I’m more excited about, but I don’t know if we can do it, is if I can feed in not just my context, but the context of a thousand other companies. And now these AI models know, can see these common patterns amongst all the companies. So when there’s another security vulnerability, it’s not me having to go in there and say I need to update my code. It’s this model can say, hey, I saw that 957 other companies just updated their OpenSSL library. Here’s a patch to update yours, and I’ve already rolled it out into your test environment. Does this look good? Should we push to prod, right? It’s the ability to see patterns that everyone is facing and to extrapolate those into my case. I think that would be an incredible thing. Because that’s the reality, right? Like there are a thousand DevOps engineers at a thousand companies doing the same stupid thing over and over and over again. But we don’t get to leverage a lot of that work, so maybe it’ll help.
The challenge, as far as I can tell, is how do you get one of these large language models to expose information about other companies without leaking that company’s proprietary data, secret sauce? Can those models tell the difference and not hallucinate and not, you know, get it wrong between what should be kept private and what’s okay to share? I think that’s going to be a challenge, but if we can pull it off, I think that could be a really profound acceleration to how we build software. So that’s the second item.
The third one that I’ll mention, a little bit tied to that, but I think it’s its own thing, is the idea of secure by default. This is something I’m starting to see a little bit of, and some of it is wishful thinking, but maybe if I say it enough, people in the industry will start going in this direction. So what do I mean by this?
The analogy I use actually has to do with elevators. There’s this really famous demonstration, and there’s a little bit of controversy of whether it actually happened this way, but it makes for a great story. So back in the early 20th century, we wanted to make our cities taller, you know, taller and taller buildings. And we had elevators, but people were terrified of the cable snapping and plunging you to your death.
And so this Elisha Otis guy, of Otis Elevator, comes along and he does this amazing demonstration in front of huge crowds where he has this open elevator shaft and he has his assistant lift him up, and Elisha is standing on the elevator as it’s raised to like the fifth story or 60 feet up or something like that. And then the assistant comes up with a knife and cuts the cable of the elevator while Elisha is standing on it. And the elevator drops about an inch and then immediately stops. And Elisha’s like, all safe, all safe, gentlemen, everything’s fine. And it’s a really cool demonstration.
And the way that it works is incredibly clever. The basic idea with what’s called the safety elevator, this thing that Elisha came up with, and has been improved on since, is that if you look at the outside of the elevator, it has these metal hooks that stick out into the elevator shaft. And the elevator shaft has a bunch of metal teeth running along the side of it. So these hooks stick out and they grab onto those teeth and the elevator can’t move at all. And that’s its default state. It cannot move. The only way to pull these hooks in is if there’s an intact elevator cable. If there’s an intact cable, it kind of pulls up and these little springs come in. And these little hooks come into the elevator, now the elevator can move. And as soon as the cable is not intact, if somebody cuts it, they spring right back out into the shaft and the elevator can’t move.
Why do I bring this up? What’s so clever about this design is that it is secure by default. The default state of the elevator is safe: it cannot move, it cannot fall. The only way it moves is when something else, in this case an intact cable, proves that it’s in a safe state. That made people confident enough to ride in elevators, which transformed cities and allowed us to build tall buildings. Really cool stuff. Right now, I would argue that the state of software delivery, DevOps, and so on is not secure by default. We are very much in that pre-safety-elevator state: we want to build fancy tall buildings and infrastructure, but we can’t, because we’re afraid of plunging to our deaths.
It feels like all the defaults are not secure, right? Usually you build something and all the networking is wide open. Nothing’s encrypted. If you have third-party dependencies, nobody verifies where the hell they came from. Nobody’s keeping them up to date. There’s no monitoring. And maybe worst of all, a lot of vendors charge extra money for the more secure things; single sign-on, for example, is usually only in the expensive enterprise plan. So we’re the exact opposite of safe by default. We’re horrifically dangerous and deadly by default, and if you pay us enough, maybe we’ll make you more secure. So that’s bad.
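To make the secure-by-default idea concrete, here is a minimal sketch; the settings object and its field names are hypothetical, not from any particular framework or from the book. The point is that the zero-argument default is the safe state, and anything weaker has to be requested explicitly, mirroring the safety elevator’s hooks.

```python
from dataclasses import dataclass

# Hypothetical service settings illustrating "secure by default":
# constructing it with no arguments yields the safest configuration,
# and anything weaker must be requested explicitly.
@dataclass(frozen=True)
class ServiceSettings:
    tls_required: bool = True             # encrypt every connection by default
    allowed_ingress_cidrs: tuple = ()     # deny-all networking until explicitly opened up
    verify_dependency_checksums: bool = True
    sso_enabled: bool = True              # security is not an "enterprise add-on"

settings = ServiceSettings()                      # safe without any extra work
risky = ServiceSettings(tls_required=False)       # insecurity requires an explicit opt-out
```

The design choice is the same as the elevator’s: the system starts in its safest state and only relaxes when something explicitly asks it to.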
The good news is there are signs the industry is starting to move towards a more secure-by-default state. There’s a whole bunch of little signs, and hopefully we’ll get more; I’ll mention just a few. One that’s been appearing more and more recently is this concept of shift left. The idea here is that if you look at a diagram of how something goes from code to being deployed on the site, from left to right, there’s write the code, test the code, deployment, and so on.
The idea with shift left is to move security testing further and further left, in other words, closer and closer to the development of the code, so you catch security issues as early in the life cycle as possible, which is when they’re easiest and cheapest to fix. You certainly don’t want to catch them all the way on the right, when you’re already in production and you’ve been hacked. So there’s a whole bunch of these shift-left tools, from things that let you enforce policies around your code to all sorts of automated security testing tools. Those are really nice to see. Basically, it’s thinking about security really early on.
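As one small illustration of shifting left (a sketch only: the package names and “known vulnerable” versions below are made up, and a real setup would use a dedicated scanner backed by a vulnerability database), a pre-merge CI step might refuse to merge code whose dependency manifest pins a known-bad version:

```python
import sys

# Hypothetical advisory data; in practice this would come from a
# vulnerability database or a dedicated scanner running in CI.
KNOWN_VULNERABLE = {("examp003elib", "1.2.3"), ("openssl-wrapper", "0.9.1")}
KNOWN_VULNERABLE = {("examplelib", "1.2.3"), ("openssl-wrapper", "0.9.1")}

def parse_requirements(path):
    """Parse simple 'name==version' lines from a requirements file."""
    deps = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "==" in line:
                name, version = line.split("==", 1)
                deps.append((name.lower(), version))
    return deps

def main(path="requirements.txt"):
    bad = [d for d in parse_requirements(path) if d in KNOWN_VULNERABLE]
    for name, version in bad:
        print(f"Vulnerable dependency: {name}=={version}")
    # A non-zero exit fails the CI job, catching the issue before deployment.
    sys.exit(1 if bad else 0)

if __name__ == "__main__":
    main()
```

Failing the build at this point catches the problem at the far left of the pipeline, long before it reaches production.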
A second pattern, and I think you’re going to hear a lot more about this in the future, is supply chain security. In the world of software, your supply chain is basically all the software you depend on: all the open source libraries you use, all the vendor libraries, even the cloud that you deploy into. Those are all software written and maintained by somebody else, and they’re part of your supply chain. The reality is that if somebody compromises any part of that, they can do tremendous damage. And hackers have caught on to this, and they’re trying real hard.
I think these days, and I forget the exact number, something like 70 percent of the code that a typical company deploys is not written by that company. And I’d guess that’s an underestimate; I think that only counts the open source portion. If you factor in things like the cloud that you’re running on and the Linux operating system and all the tools on it, probably 99 percent of the code you rely on is code you had nothing to do with. And hackers know that. So if they can go into some little library hidden somewhere in Linux and put a little backdoor in there, well, they can take over everybody’s software, and everybody’s hardware.
So a big battle happening right now is how do you secure the supply chain? How can you be confident that all the software you depend on is the software you think it is, that it’s secure, that maybe its authors shifted left and actually tested it? There are a lot of interesting technologies emerging in that space. It’s super, super early days, and I think our answers there are still really weak, but I’m happy to see that at least someone is thinking about it. So supply chain security is going to be, I think, a big, big deal.
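One concrete building block in this space is verifying that an artifact you pull in is byte-for-byte what you expect. Here is a minimal sketch using only the Python standard library; the URL and the expected digest are placeholders, and in practice the trusted digest would come from a signed release manifest or a lock file:

```python
import hashlib
import urllib.request

# Placeholder values: the expected digest must come from a source you
# already trust, such as a signed release manifest or a lock file.
ARTIFACT_URL = "https://example.com/releases/mytool-1.0.0.tar.gz"
EXPECTED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

def verify_artifact(url, expected_sha256):
    """Download an artifact and refuse to use it unless its SHA-256 digest matches."""
    data = urllib.request.urlopen(url).read()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise RuntimeError(f"Checksum mismatch: expected {expected_sha256}, got {actual}")
    return data

# Uncomment with real values:
# verify_artifact(ARTIFACT_URL, EXPECTED_SHA256)
```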
Another trend with secure by default is the push to move to memory-safe languages. Again, this is moving up the abstraction ladder, as I mentioned before. It turns out that something like 70 percent, and again I don’t remember the exact number, but a very high percentage of security bugs are due to memory safety issues. You take a language like C, where you have pointer arithmetic and things like that, you put a little bug in there where you didn’t quite get the arithmetic right, and somebody can pass in some input that makes you read or write memory you’re not supposed to. Something like 70 percent of the security issues we face as an industry are due to that class of issue alone. And those bugs, for the most part, do not exist in memory-safe languages; you just can’t make them in higher-level languages where memory is managed automatically. So if we switch programming languages, roughly 70 percent of our security issues go away. That’s a huge, huge deal.
And so you’re starting to see that the US government is now pushing to move away from languages like C and towards languages like Rust and Go that are memory safe. I think that’s a really good thing. It’s also a really hard thing, because almost all of our operating systems are written in languages that are not memory safe, so there’s a tremendous amount of work to do there. But if we can do it, we will make things vastly, vastly more secure by default.
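A tiny illustration of the difference, using a memory-safe language: the same out-of-bounds mistake that in C could silently read or write adjacent memory instead fails loudly, so it cannot become a silent vulnerability.

```python
# In C, buf[20] on a 10-element array silently reads or writes adjacent
# memory, which is exactly the class of bug behind many security issues.
# In a memory-safe language, the same mistake fails loudly instead.

buf = [0] * 10

try:
    value = buf[20]          # out-of-bounds read
except IndexError as err:
    print(f"Caught out-of-bounds read: {err}")

try:
    buf[20] = 42             # out-of-bounds write
except IndexError as err:
    print(f"Caught out-of-bounds write: {err}")
```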
And then the final one, another pattern that’s appearing more and more, is this idea of zero trust networking. I talk a lot about this in the book. The older way of doing things is the moat-and-castle approach: you create this really secure perimeter around your infrastructure, a little bit like a castle with a moat. You have a really tall wall and a moat with alligators, so the perimeter is really secure, really hard to get through. But once you’re inside that perimeter, you can do whatever you want.
The equivalent of that in the software world, in the networking world, is you have really strong firewalls and things like that on the edge of your network. But once you’re in, if you somehow found a way in, well, now you can access the wiki page over there, and that service over there, and that database, and do whatever you want. You have free rein once you’re inside. That approach used to make sense when everybody worked in an office owned by the company, all the infrastructure was probably in that same building, and all the computers you used were in that office. It kind of worked. But that’s not the world we live in anymore. We all work from home now, the network extends into coffee shops and libraries and co-working spaces, your infrastructure isn’t in your office, it’s in somebody else’s cloud, and the devices you’re using are smartphones that are not particularly secure. So it doesn’t really make sense to use the moat-and-castle model anymore.
The new model is zero trust networking. And the idea is your location in the network, the fact that you’re in the network, doesn’t give you any special privileges of any kind. Every single request, every connection, must be authenticated, authorized, and encrypted. Every single thing. So just because I’m able to access the wiki page does not give me any access to that database or any access to that issue tracker or anything else at all. Every single thing has to authenticate, authorize, and encrypt separately. And we’re starting to see that appearing from vendors. It’s not the default. It’s very far from being the default. But things like service meshes and tools of that sort are starting to at least make it a little easier to do it, because it’s really hard actually to do zero trust networking properly. So we’re at least making it more accessible. And maybe at some point in the future, it’ll become the default, which I think will be a tremendous, tremendous improvement to security.
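A toy sketch of the zero trust idea (the token table and permission map below are stand-ins for a real identity provider and policy engine): every request is authenticated and authorized on its own, and the caller’s network location grants nothing.

```python
# Toy illustration of zero trust: no request is trusted because of where
# it comes from; each one must present verifiable credentials and be
# authorized for the specific resource it asks for.

VALID_TOKENS = {"token-alice": "alice", "token-bob": "bob"}     # stand-in for an identity provider
PERMISSIONS = {"alice": {"wiki"}, "bob": {"wiki", "database"}}  # per-user, per-resource grants

def handle_request(token, resource, source_ip):
    # source_ip is deliberately ignored: being "inside" the network
    # confers no privilege under zero trust.
    user = VALID_TOKENS.get(token)
    if user is None:
        return "401 Unauthorized"      # authentication failed
    if resource not in PERMISSIONS.get(user, set()):
        return "403 Forbidden"         # authenticated, but not authorized for this resource
    return f"200 OK: {user} may access {resource}"

print(handle_request("token-alice", "wiki", "10.0.0.5"))       # allowed
print(handle_request("token-alice", "database", "10.0.0.5"))   # denied despite an internal source IP
```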
So yeah, those are the three patterns that I’m seeing in the future: higher level abstractions, some role for generative AI, especially RAG, and this move towards secure by default.
Henry Suryawirawan: Wow, really fascinating hearing all the predictions and up-and-coming technologies and techniques that you just shared. I think we can definitely be optimistic about the future: higher-level abstractions, memory-safe languages, secure by default. Hopefully software gets easier and easier to deliver. But at the same time, hearing about supply chain attacks and all that, sometimes we also have to be paranoid about what we use from the open source world.
[01:10:39] Recent Trend in Open Source License Changes
Henry Suryawirawan: Speaking of supply chain attacks, one other trend that has happened quite recently is the changes in open source licensing. And I know you are one of the core founding members of OpenTofu, the alternative to Terraform. Maybe tell us a little bit about your prediction of what is actually happening with open source licensing, and, as software developers, what should we know in terms of how to protect ourselves from all these open source changes?
Yevgeniy Brikman: Yeah. So to add a little bit of context for folks that haven’t been following along with the Terraform and OpenTofu situation: Terraform and a bunch of other HashiCorp tools were open source under a fairly permissive license, the MPL, for more than 10 years, and built up huge communities, tons of contributors, third parties, and so on. Recently, HashiCorp shifted almost everything they have from that open source license to what’s called the Business Source License, which isn’t really an open source license. The short version is that you can use this stuff unless you’re competitive with HashiCorp. And that’s problematic for a whole bunch of reasons. But HashiCorp is not the only one. Many, many other companies across the industry have been changing from permissive open source licenses to some flavor of these business-friendly licenses that basically say you can use this stuff as long as you don’t compete with us, or impose various other restrictions.
I have a lot of thoughts on this, and I’ll try to keep it short. I’ll say first, I am a huge, huge believer in actual open source, proper open source. Not these business licenses, but the permissive MIT, Apache, those types of licenses. I think they are one of the most important and valuable things we’ve done as an industry, one of the biggest accelerators of all of software. And I personally find it devastating to see companies doing anything to hurt trust in open source. I think that is going to have profound negative consequences for the industry. Because, as I said, there are a thousand DevOps engineers at a thousand companies doing the same thing, and that’s true of every kind of software development. We are often doing the exact same things, and sometimes we can share that stuff commercially. But we’ve seen that the open source stuff, especially for things like infrastructure, these building blocks of the entire internet and our operating systems, needs to be truly open source to be reliable. So to an extent, I’m personally just devastated to see the industry again and again move away from these licenses. I’m hopeful there are a lot of people like me.
And so now we can ask, well, how do we solve this, and what led to this? I don’t have any internal visibility into the companies that made these license changes, so this is me speculating; I’m sure they could tell you a more accurate story. But what I’ve generally seen as a pattern is that the things changing license were open sourced by companies that are trying to do hyper growth. Typically, these are companies that are venture backed, either trying to go public or having just gone public. And their goal isn’t just to make a profit. It’s to grow massively huge, to produce tremendous investment returns.
What I’ve generally seen is that these are the companies doing these license shifts on us. And it makes sense, right? That’s literally why they took venture capital: to produce that kind of growth. So in a sense, that was part of the contract. But I don’t think those sorts of license shifts have happened in cases where someone isn’t trying for that growth. It’s rare to see somebody shift license just because they can’t afford to pay two developers. That does happen, and we need a better way to handle it, but mostly it’s the VC-backed companies that aren’t happy making only hundreds of millions of dollars; they need to be making billions of dollars. And if you need to be making billions of dollars, you’ve got to find every little lever and advantage, eliminate competition, and basically build a monopoly. Hence the license change that says you can’t compete with them.
I think there are a few key takeaways from that. If I were picking technology for my company to depend on, especially really important foundational technology that would be very painful to swap out, first of all, I would only pick things that, at least as of today, are on some sort of truly open source license. Chat with your lawyers about what that means, but it’s usually things like MIT, Apache, MPL, BSD (the Berkeley licenses), and a handful of others. Second of all, I would look very, very carefully at the company behind the project, if there is a company. Some projects are just an individual developer, and some are built by foundations, which I think is a much more secure arrangement.
But basically, look at who is behind the project. If it’s owned and primarily operated by a venture-backed company, somebody that wants to go hyper growth, or by a company that got acquired by a private equity firm, which is a very similar set of incentives, be really, really cautious. At this point, you need to think twice about whether adopting it is a good idea or whether you’re better off not doing so, because eventually the incentives are going to push them to change the license. That’s just what we’re seeing. So instead of venture-backed companies, try to pick tools from other kinds of owners.
The best version, I think, is something that’s managed by a foundation: the Linux Foundation, the CNCF, things like that. Those are specifically designed so that no one company, whatever incentives it has, can go and mess with the license. That’s the best option. Second best is either a solo developer, which has the advantage that if they can make enough money or have enough spare time, they can make it work, and they’re not likely to chase venture-backed returns; or possibly something backed by a large company that isn’t trying to make money off of that piece of open source. Google open sources stuff, Facebook open sources stuff, and they’re not trying to sell a lot of it. Those can be reasonably safe bets as well, but the foundation route is the best. So that’s, I guess, what I would really look for in the short term. There are a lot of other things you need to look at when picking open source libraries, as part of supply chain security, but at least from a licensing perspective, the hard lesson learned for me is: be really wary of who is behind the project and what license it has.
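As one lightweight way to act on the licensing part of this advice (a sketch only; the allowlist is illustrative and not legal guidance), you can audit the license metadata that your installed dependencies declare and flag anything outside the licenses you have decided to accept:

```python
from importlib.metadata import distributions

# Illustrative allowlist: decide on the actual list with your lawyers.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "Apache 2.0", "BSD", "MPL-2.0"}

def audit_licenses():
    """Print installed packages whose declared license is not on the allowlist."""
    for dist in distributions():
        name = dist.metadata.get("Name", "unknown")
        license_field = dist.metadata.get("License", "") or ""
        classifiers = [c for c in dist.metadata.get_all("Classifier", []) if c.startswith("License ::")]
        declared = license_field or "; ".join(classifiers) or "UNKNOWN"
        if not any(ok.lower() in declared.lower() for ok in ALLOWED_LICENSES):
            print(f"Review needed: {name} declares license {declared!r}")

if __name__ == "__main__":
    audit_licenses()
```

This only reads the metadata packages declare about themselves, so it is a starting point for a conversation with your lawyers rather than a verdict.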
I think in the longer term, this is something we need to grapple with as an industry. What does open source mean? Do we need to start creating, and maybe there are some out there, licenses that basically say: this is not only open source, but I also pledge to never relicense it, I legally prevent myself from swapping this to some sort of non-open-source license in the future? Maybe we need some sort of guarantees along those lines. But even more generally, how do we fund open source? This is something a lot of people have been thinking about. There are some interesting directions that I’ve seen. I haven’t seen anything that looks like a great solution, but there are at least some positive trends for solo developers or really small teams, where you can make a good salary to live on and afford to focus on open source. And I think that’s great.
And so what we’re really going to have to reckon with as an industry is: do we trust companies with hyper-growth incentives to release something as open source when it is core to their business? I should put that as a big asterisk. With HashiCorp, Terraform is what they make money on. With MongoDB, MongoDB is what they make money on. So if it’s the thing they’re trying to monetize and some aspect of it is open source, should we really trust that? Or is that just clearly an anti-pattern in our industry? Is there a way we can build trust around that? I think that’s something that smarter people than me will have to look at and figure out. But yeah, I guess that’s where my mind goes with that stuff today.
And the final thing is, if you do pick projects that have truly open source licenses, Apache, MIT, and so on, then if the worst happens and somebody does change the license or moves the project in a direction you really don’t agree with, well, the whole point of open source is that you have the right, honestly even the responsibility, to fork it. You can still use the code and you’re not completely stuck. That is what we’re seeing happen with a lot of these projects: major forks being developed, OpenTofu for Terraform, OpenBao for Vault, OpenSearch for Elasticsearch, and many others. So that’s kind of the fallback. But nobody wants to fork things, right? We ended up forking Terraform, and it’s not great. We asked HashiCorp not to change the license; I think that’s the best option. You want one joint, strong community. That’s the best option for everyone, and it can only happen around a truly open source license. But if things happen and you don’t have any other choice, well, make sure that you at least have the option to fork the code and do what you need to do as the backup.
Henry Suryawirawan: Thank you for explaining the intricacies of these open source changes. I find that many engineers just use open source because it’s so simple, you know, npm install, whatever install, but they don’t actually understand the intricacies behind open source, and especially these licensing changes, because they can affect the software that you are delivering. Like you mentioned, 70 to maybe 99 percent of the things we use now rely on open source technologies and dependencies. So if something changes, it can definitely break your software altogether, or you have to find a way to rewrite or change some parts of your code base, which is not easy to do and can typically be risky.
And I personally would like to thank you as well for championing OpenTofu and making sure that open source stays truly open source. I think it’s really, really important, as you mentioned. Open source has been accelerating progress in the world today; so many technologies and so many apps have been built simply because open source technologies exist.
[01:20:32] 3 Tech Lead Wisdom
Henry Suryawirawan: So Yevgeniy, thank you so much for your time today. I think we covered a lot of things, and I’m sure the listeners are quite fascinated by some of the things you mentioned. Unfortunately, due to time, we have to wrap it up soon. I have one last question, which I always ask my guests, what I call the three technical leadership wisdoms. You can think of it just like advice. Maybe you can share something for us to learn, from your journey or from your experience. What would they be?
Yevgeniy Brikman: I’ll keep it really, really short, just repeat a few of the points we talked about earlier. One, never stop learning. Always, always, always carve out time for learning. Two, share what you learned. And three, do things incrementally and iterate, iterate, and iterate. That’s it.
Henry Suryawirawan: Beautiful! I think that sums up pretty nicely the conversation we had today. So Yevgeniy, if people want to reach out to you, speak to you, or ask you more questions, is there a place where they can find you online?
Yevgeniy Brikman: Yeah, I’m on most of the typical social media things: Twitter, LinkedIn, Facebook, and so on. And my homepage is ybrikman.com. If you Google my name, one of those things will come up. Happy to chat on any of them.
Henry Suryawirawan: All right. Thank you so much for speaking with me today. I wish you good luck with writing the last part of the book, and I’m looking forward to its release. Thank you so much for your time, Yevgeniy.
Yevgeniy Brikman: Thanks for having me. It was fun.
– End –