#137 - Lean DevOps: A Practical Guide to On-Demand Delivery - Robert Benefield
“It’s not about the tools or processes. Most important is you understand the target outcomes for your customers and establish the right level of shared situational awareness across the teams.”
Robert Benefield is the author of “Lean DevOps: A Practical Guide to On Demand Service Delivery”. In this episode, Robert shared insights on how we can apply the Lean DevOps mindset for building successful IT delivery organizations. Robert started by sharing what initiated him writing the book and how it differs from the other available DevOps books. Robert described the concept of on-demand service delivery and important concepts, such as knowing the target outcomes, building situational awareness, and making effective and timely decisions based on the OODA loop. Robert also shared a few practices and techniques he outlined in the book, such as mission command, workflow board, queue master, service engineering lead, value stream mapping, and Einheit.
Listen out for:
- Career Journey - [00:03:58]
- Writing a DevOps Book - [00:14:14]
- On Demand Service Delivery - [00:18:58]
- Mission Command - [00:21:42]
- OODA Loop - [00:26:56]
- Building Situational Awareness - [00:33:16]
- Workflow Management - [00:39:43]
- 3 Tech Lead Wisdom - [00:49:41]
_____
Robert Benefield’s Bio
Robert Benefield is an experienced technical leader who has decades of experience delivering robust on-demand services to solve hard problems in demanding ecosystems including banking and securities trading, medical and pharmaceutical, energy, telecom, government, and Internet services. His continual eagerness to learn and work with others to make a difference has taken him from building computers and writing code in the early days of the Internet at Silicon Valley startups to the executive suite in large multinational companies. He shares his unique experience in the hopes that others can continue to build on it without having to collect quite as many scars along the way.
Follow Robert:
- LinkedIn – linkedin.com/in/robert-benefield-25482
- Website – leandevops.com
- Twitter – @leandevops
Mentions & Links:
- 📚 Lean DevOps: A Practical Guide to On Demand Service Delivery – https://www.amazon.com/Lean-DevOps-Practical-Service-Delivery/dp/0133847500
- 📚 Learning to See – https://www.amazon.com/Learning-See-Stream-Mapping-Eliminate/dp/0966784308
- Lean – https://www.lean.org/explore-lean/what-is-lean/
- The Fighter Mafia – https://en.wikipedia.org/wiki/Fighter_Mafia
- OODA – https://en.wikipedia.org/wiki/OODA_loop
- Kanban – https://www.atlassian.com/agile/kanban
- Spotify model – https://www.atlassian.com/agile/agile-at-scale/spotify
- Cynefin – https://en.wikipedia.org/wiki/Cynefin_framework
- Marc Andreessen – https://a16z.com/author/marc-andreessen/
- Howard Gobioff – https://en.wikipedia.org/wiki/Howard_Gobioff
- Gene Kim – http://www.realgenekim.me/
- Jez Humble – https://www.linkedin.com/in/jez-humble
- Dave Farley – https://www.davefarley.net/
- Mary Poppendieck – http://www.poppendieck.com/
- John Boyd – https://en.wikipedia.org/wiki/John_Boyd_(military_strategist)
- David Anderson – https://djaa.com/david-j-anderson/
- Apache Zookeeper – https://zookeeper.apache.org/
- Apache Hadoop – https://hadoop.apache.org/
- Energy-maneuverability theory – https://en.wikipedia.org/wiki/Energy%E2%80%93maneuverability_theory
Career Journey
-
My true interests have all been around understanding the dynamics of systems. So I always wanted to know how things work. But most importantly, what I wanted to know was actually how things break and what causes them to break.
-
This goes with not just technical systems, but also biological and social systems. There are always factors that are unknown or poorly understood that cause the unexpected to happen. Learning how to see what you can and then figuring out how to effectively manage the repercussions of that helps you actually improve your chances of success.
-
I learned the importance of actually having a shared understanding of what’s going on and the importance of understanding the desired outcomes and the means for maintaining enough situational awareness to make effective decisions. I saw with my own eyes that any weaknesses could lead to bad decisions and failure, even when you had the best technology money could buy at your fingertips.
-
I learned what it takes to build effective, scalable, and ultimately successful delivery organizations, and how to build highly productive teams. And to figure out what the actual trade-offs were. And I realized that it wasn’t really just the technology and tools or even the quality of the engineers.
-
What was most important was that you actually understood the target outcomes that your customers, the people who actually use your stuff, are trying to get to. And then being able to establish the right level of shared situational awareness across the team, to allow everyone to make the right decisions, to safely fail, and to actually learn and improve from all of that.
-
That’s where a lot of the Agile and DevOps implementations tend to miss. People focus on the tools, they focus on the processes. And while those can be effective in reducing some of the noise, the problem is that they miss the point of what it is they are actually trying to do. What is it you’re trying to actually solve? How can you go through and improve the quality and the timeliness of the decisions that you’re trying to make?
Writing a DevOps Book
-
The biggest thing that most people miss: It’s not about the tools. Tools will come and tools will go. And I can tell you in my career, it will come, it will go, there’ll be something else.
-
And so if I just focus on the technologies, I’m missing the point. The point is, how can you go through and make better decisions? How is it that you’re able to achieve the actual outcomes? This is something that actually drives me a bit crazy with a lot of the DevOps community. People focus so much on outputs. They focus on uptime. But they don’t actually think about what is it that the customers are trying to do and what’s important.
-
That’s why I’m constantly going through and saying, what is it? What is your purpose? How is it that you actually know what’s going on and how you’re meeting your purpose? What is the data, the information, the awareness? Because it’s not just what you know and not just the data that you have, but what your ability is to pull those things together to be able to go through and make effective decisions.
-
Those are things that really come by actually understanding that True North of what actually matters to the customer. That is something that time and time again in the community people miss. They miss what is it that I’m trying to do? Why am I doing it? And that is the big thing that I see in the community that I really, really, really want people to learn and improve.
-
It’s not that the tools are bad. Tools are great. But understand what the tools are doing. Understand how it is helping you. Understand what it is that it’s simplifying and hiding from you. And understand if that is actually something that doesn’t actually matter or ultimately is gonna be important to you.
On Demand Service Delivery
-
Why I talk about on demand service delivery is if you’re in a shrink wrap software world or one where you have low engagement, low cycle, low tie to your customer, you can ignore a lot of things. Or you can push back onto the customer and their IT organization. But if you’re doing on-demand service delivery, much like your telephone, internet connection, all of that stuff, the quality of the whole end-to-end matters. You need to make sure that you actually understand who your customers are and what your customers are doing.
-
There’s still this difficulty for a lot of organizations, and not just the technical organizations, but actually the non-technical, managerial organizations to really understand how integral the IT service delivery, the technology and the teams that are building and running the technology are to understanding and working with the business to actually achieve what is needed.
-
A lot of people will go, okay, well, I’ll just go through and give them a bunch of requirements. I don’t need to tell them anything. Parts are parts. In fact, some people go even further and go, “You know what? Anyone can sling code. So if I just give requirements, I can outsource all that stuff and I can outsource it to some random other company that doesn’t need to know anything about my company or my industry or any of that. And I’ll get something back and I’ll be able to use it.”
-
What they quickly realize is, yeah, okay, that works for some generic things, but when it’s actually something that’s core, core to your business, it doesn’t work. Things start to fall down. Things don’t behave exactly how you would expect that they would.
Mission Command
-
Mission command is very much this mechanism that is talking about and making sure that there’s a shared understanding of what the actual objective or ultimately the target outcomes that you’re trying to achieve are. And the entire structure of it is all in and around how you enable the people that are down and on the ground to make the right decisions in order to do things.
-
And why is that important? That’s important because the people that are down on the ground are the ones that are gonna have the most context of exactly what’s going on. And if they don’t actually understand what the underlying intent of what you’re trying to achieve is, they’re going to do whatever they’re gonna do.
-
Mission command flips it and says, I’m going to tell you what it is that our goal is, and I’m gonna tell you what the constraints are of what you can’t do.
-
And so what you do is what they call a briefing. You give them a briefing and you don’t tell them what to do. You tell them what the goals are and you tell them what the anti goals are, what the constraints are. And then what you do is you say, okay, go off and tell me what you think you’re gonna do. And so what happens is that the people lower down will then go off and talk amongst themselves. And sometimes, if they’re like middle management, it’ll go down and down and down until it gets to the people at the bottom. And they will go in and they will come up with some ideas and some plans, and they’ll come up with something that’s called a back brief.
-
Back brief is, this is how we think we might do it, and here are some questions that we have, and here’s what we think that we might need. This allows them to have the flexibility, to be able to have the conversation. So back brief is very much a conversation. They’ll present back, the people above will go, okay, this isn’t quite right. And they’ll tune and they’ll tune until they’re given the right resources and everything to go out in the field.
-
Then it’s not done. Now the people are out in the field. That’s when the crud hits the fan. And as they say, no plan lasts into battle. And so, again, the people understand what the ultimate objective is. They understand where the anti goals, where the constraints are. And they can then go through and do what is necessary in order to meet the objective.
-
I think this is really important with on demand service delivery. We hire smart people, and we need to give them an understanding of what the goals are and what the constraints are that they need to work with, and then allow them to use their brains to go through and make the right decisions. And that’s something that time and time again, I see technical and non-technical people completely missing.
-
People want to be proud of what it is that they do and not just the technical parts. They wanna be proud of what is it that their contribution ultimately is able to go through and achieve.
OODA Loop
-
Along with Lean, the OODA loop is probably one of the most misunderstood things out there. The OODA loop is all about decision making and how to go through and make more effective, more timely decisions.
-
Ultimately, what [John Boyd] found out was that the first important thing was that you needed to actually understand what outcome you needed to achieve. What he realized was that it was all about how quickly you can make decisions. And if you can make decisions more effectively than the enemy, and more effectively means both quicker and better, then you’re gonna win.
-
And the best way to do that is first understanding what it is that you’re trying to actually do. What is the outcome? Then he has this Observe and Orient, which are the first two Os. How do you go through and pull that information together? How do you go through and build context, so that you can then hit the D and the A, Decide and Act? And if you can get within the enemy’s decision loop, meaning that you’re able to make decisions faster and act on those decisions faster than the enemy can actually figure out how to observe and orient, you can outmaneuver the enemy all day long.
-
If you can actually understand how the decision-making process works, then even more importantly, you can understand what can get in the way of making decisions. This is everything from biases (we all bring our own biases to work, whether it is our favorite technologies, languages, or ideas about how things work or don’t work) to information flow. Am I getting the right information at the right time? Am I getting it in the right context, or am I getting too much information in the wrong context? And then there is poor framing of the problem, and of the context of the information needed to decide how to go through and solve that problem.
-
Those constantly get in the way of people being able to make effective decisions. If you can understand those things and you can understand where they’re coming from, you’re not gonna be able to stop them all, but it’s gonna be able to allow you to set the right trip wires and filters. To be able to go through and at least catch when those things happen. And then be able to learn and fail to figure out how to improve and overcome those.
Building Situational Awareness
-
There are a lot of things that are floating around in all our organizations. How can you go through and organize the information and filter the information such that you are able to learn to see?
-
And what I do is, I tend to have someone at the top of the workflow board. The name I use in the book is the queue master. In essence, this is the person that is there to watch what’s going on on the board. They’ll often be the one that is fielding things that are coming in randomly. And they’re the ones that will work with the team to make sure that work that gets stuck can get unstuck. They’ll help go and fix this. So it ends up being a combination of, like in Scrum, part Scrum Master and part product owner. Like how am I gonna go through and help the team succeed?
-
The other thing that this person does is they’re allowed to go through and take a step back and look at what is going on in the big picture. Because they can see what’s going on, they can see the mayhem. And all it is, this is about helping the team be able to see what’s going on. I have some, again, pattern, processes in there for being able to go through and get the team to be able to understand what’s going on. To be able to go through and get a debrief to be able to allow the teams to learn and improve from what’s going on.
-
I have the concept of what I call a service engineering lead. It is kind of that expert to be able to allow you to go through and make sure that if you need help or you’re missing something, that they can go in and actually help you go through and improve and get better. They’re not a policeman. They’re not the one that’s gonna say whether or not you can go live or any of that stuff. But they’re there to be able to go through and help make sure that everything’s coming together. They make sure that if you’re working in an ecosystem where you have lots of other teams that are doing things, they will work to help to make sure that you have connections with those other teams.
-
[Companies using the Spotify model] will set up tribes and squads and chapters. And what I’ve noticed is things will work quite well within the tribe. And tribes will tend to be aligned to a product line. But the problem is customers don’t tend to use a product line. Customers go across products. And the problem that you have is when a customer is cutting across products, those always break down in companies that have silos. Silos, whether they’re Agile silos, Spotify silos, waterfall silos, whatever.
-
I often will create a highly effective team. It often looks like a squad. But really, it’s a kind of cut on the service engineering lead model where they cut across the various different squads to go through and help stitch things together to help people be able to learn to see. Be able to help them be able to make better decisions and be able to understand what the actual outcomes are and be able to deliver those.
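As a rough illustration of the queue master watching the workflow board for stuck work, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `WorkItem` fields, the column names, the sample items, and the five-day idle threshold do not come from the book or the episode.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkItem:
    """One card on a hypothetical workflow board."""
    title: str
    column: str
    last_moved: date  # the last day the card changed column


def find_stuck(items, today, max_idle_days=5, done_column="Done"):
    """Return unfinished items that have not moved for more than max_idle_days.

    The oldest items come first, so the queue master sees the worst cases at the top.
    """
    stuck = [
        item
        for item in items
        if item.column != done_column
        and (today - item.last_moved).days > max_idle_days
    ]
    return sorted(stuck, key=lambda item: item.last_moved)


# Illustrative data only.
board = [
    WorkItem("Rotate TLS certificates", "In progress", date(2023, 6, 1)),
    WorkItem("Add checkout metrics", "Review", date(2023, 6, 13)),
    WorkItem("Fix flaky build", "Blocked", date(2023, 6, 7)),
    WorkItem("Upgrade database driver", "Done", date(2023, 5, 20)),
]
today = date(2023, 6, 15)

for item in find_stuck(board, today):
    idle_days = (today - item.last_moved).days
    print(f"{item.title}: idle for {idle_days} days in '{item.column}'")
```

A list like this is only a prompt for a conversation with the team about why the work is stuck; it is not a performance measure.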
Workflow Management
-
What I try to go through and do is, I try to go in and I try to build the instrumentation mechanisms. So workflow board and queue masters and those sorts of things. I’ll also instrument up things such as, I really like to try to go through and get a lot of things in around code metrics. Do I have a really complex branching strategy? What’s going on as far as code churn? Do I have lots of people touching the same code? Those give me little ideas of things that are going on. And it’s the same with being able to build observability and instrumentation within the production environment.
-
And then what I try to do is I try to then rebuild and reduce noise, and build a common understanding of what’s going on within the teams.
-
Then something that Boyd found out, which is something called Einheit. So Einheit is togetherness. This is being one team. This is knowing your team members. This is knowing like good, bad, ugly, building real relationships with them. And what Boyd found out was that when both the team themselves and the people that are the leaders or managers of the team actually get to know the team and they get to know each other, you can then understand how people are going to approach a problem. You’ll understand where their strengths and weaknesses are, and you can go through and help them. And you can help them in ways where you’re not getting in the way. And you can do it in a way where you don’t have to actually say a lot.
-
If you’re actually able to build that and you’re able to build that level of trust within the organization, then that allows for when people make mistakes, they’re not gonna hide them. They’ll be able to go through and surface them. They’ll be able to work together with others to go, “Hey, I made this mistake. Is there a way that we can make it more difficult to do this mistake in the future?” And understand what the root cause of that is. Are there ways to be able to go through and understand like, why is it that work is taking too long?
-
And this gets into the friction side of things. Friction are all those various different things that get in the way of you being able to not only make effective decisions, but to be able to act effectively.
-
What I notice is lots of people focus on throughput and output measures. How many features can I have? How quickly can I turn things around? Those aren’t super valuable. What’s more valuable is being able to go through and understand what are the root causes that are happening underneath? Do you have people that are task switching all the time? Task switching is really bad.
-
Another one that I noticed, and this gets back into what I was talking about earlier with silos, is that you’ll get over-specialization. I decided to break down the specialization. It wasn’t that those guys didn’t continue to do database work in one example, but they needed to be part of delivering the overall outcomes. They were there to work with the teams and try to figure out how to help enable the teams to do a lot of the things that they weren’t interested in doing half the time, and it allowed the team to be faster. And it allowed the DBAs to really focus on how they can deliver and engineer solutions that are able to go through and get there.
-
I use value streams quite a bit to go in and try to understand where things are getting caught up.
-
I give an example. At one company I was at, the complaint was that the delivery teams were delivering too slowly. It took between 14 and 18 months for something to get out there. So I did a value stream. And what I found out was that the vast majority of the delivery time was actually spent with the non-technical people arguing and getting approvals and all that kind of stuff. The delivery teams took no more than five weeks to deliver something once they actually got it. And so I said, okay, I can improve that. I can get it down to probably four, maybe three weeks. But that’s not gonna save you the 14 to 18 months. So it’s about being able to get people to actually see and understand what’s going on.
-
And it’s the same within an engineering team. If your builds are taking too long, try to understand why. See if there are ways to be able to go through and reduce that. Go through and check your code in early and often, because again, if you’re not getting that feedback or you’re holding onto something for too long, then you’ll lose context. It’ll make it more difficult to merge.
-
I also then teach non-technical people as well how to go through and understand this information. So they don’t need to be technical, but I’ll go through and say, do you understand that there’s a lot of technical debt in these places? You can see that there’s a lot of code churn that’s happening in these places. The teams need help, or here are the places where you have a lot of defects. If you go through and focus on giving the teams time to be able to go through and deal with those, that will be able to help not only reach your outcomes, but will be able to make the things be less difficult for the teams.
-
I also go through and do the other thing of instrument up all of your features and everything that are out there. Are there ones that nobody’s actually ever using? And if they’re not using them, maybe you should kill them off. It creates waste. It’s something that has to get maintained.
-
People don’t understand that. They think that it’s a one and done sort of thing. They don’t understand that with technology, it’s a constant evolution. It’s a constant change. Everything has a cost that’s associated with it, so get it so that they understand that and that they can tune that side of things effectively.
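The arithmetic behind the value stream example above (14 to 18 months end to end, no more than five weeks in the delivery teams, possibly improved to three) can be sketched in a few lines of Python. The `ValueStream` class, the function names, and the month-to-week conversion are assumptions for illustration; only the durations come from the episode.

```python
from dataclasses import dataclass

WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks


@dataclass
class ValueStream:
    """End-to-end lead time split into delivery work and everything else."""
    total_lead_time_weeks: float
    delivery_weeks: float

    @property
    def waiting_weeks(self) -> float:
        # Time spent outside the delivery teams (approvals, arguments, queues).
        return self.total_lead_time_weeks - self.delivery_weeks

    @property
    def delivery_share(self) -> float:
        # Fraction of the total lead time spent in the delivery teams.
        return self.delivery_weeks / self.total_lead_time_weeks


def months_to_weeks(months: float) -> float:
    return months * WEEKS_PER_MONTH


def saving_share(stream: ValueStream, improved_delivery_weeks: float) -> float:
    """Fraction of total lead time removed by speeding up delivery alone."""
    saved_weeks = stream.delivery_weeks - improved_delivery_weeks
    return saved_weeks / stream.total_lead_time_weeks


shortest = ValueStream(months_to_weeks(14), delivery_weeks=5)
longest = ValueStream(months_to_weeks(18), delivery_weeks=5)

for label, stream in (("14 months", shortest), ("18 months", longest)):
    print(
        f"{label}: delivery is {stream.delivery_share:.1%} of lead time, "
        f"{stream.waiting_weeks:.0f} weeks are spent elsewhere, "
        f"cutting delivery to 3 weeks saves {saving_share(stream, 3):.1%}"
    )
```

Under these assumptions the delivery teams account for less than 10% of the lead time, and trimming delivery from five weeks to three removes less than 4% of it, which is why the approvals stage is the place to look first.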
3 Tech Lead Wisdom
-
Have a purpose.
-
This is important for everybody, no matter whether you’re the most junior person on the team or the CEO.
-
Understand what are the target outcomes that you are trying to achieve. Why are they important? How do you contribute to them being able to be achieved? And most importantly, how do you measure whether you’re on the right track to progress them?
-
And outcomes aren’t outputs. They’re not feature or defect counts. They’re not lines of code. A lot of people think that the O (in OKR) stands for output and the key results are measures of those outputs. No, it’s all about understanding what are the things that matter to people who derive value from what you’re trying to deliver. And if you don’t know what it is, try to find out. Because that’s what’s going to differentiate you.
-
And if you’re a leader, how do you go through and communicate that stuff to your teams? That’s gonna be important. And don’t tell them what to do or how to do it. Tell them what it is you’re trying to achieve.
-
-
Ensure effective decision making.
-
Boyd’s OODA loop is all about how do you go through and pull information and knowledge together to shape the right context to make quick and effective decisions. And it’s not about guessing. It’s about being able to go through and consider decisions critically, thinking through how things can go wrong and what might happen when they go wrong. This helps you be able to go through and test the quality of your situational awareness. It helps you filter out assumptions and biases and mitigate any unnecessary and uncontrolled risk.
-
For leaders, one of the things that I find fascinating is that most leaders don’t realize that they’re often too far away from what is happening to be able to make the best decisions at the time that’s needed. Even if they have the information, they’re not gonna be able to turn it around and be able to make it happen the right way. And oftentimes, they may think they have the right information, but they don’t.
-
And that’s where mission command comes in. Arm your people with purpose, help them get the information, give them the safety to be able to make the decisions that they’re best placed to make. That way, the decisions you need to make are made at the right level, where you have the time necessary to be effective.
-
And then if you’re an individual contributor, you also need to be equipped to make decisions. You need to be able to realize what you know and what you don’t know, where the gaps are in your awareness. And gaps are okay. I mean, this is where Einheit, understanding your teams, information flow, and mission command come in.
-
You need to be able to feel safe to make mistakes and to be able to learn from them. And learning is how we all get better. And so, this is where managers and teams need to work together to make sure that nobody is put into a situation where they must make a highly risky decision with no support.
-
-
You really need to respect people.
-
People are your most valuable resource. You have to involve them in the decisions. You have to make sure that they feel safe and part of the decision. They need to understand the purpose. You need to make sure that you have an environment that feels like it’s a supportive team where people feel committed to helping each other pursue the target outcomes.
-
And what I notice time and time again is that if you don’t do that, you build a low trust environment. And a low trust environment is where information gets hidden, mistakes get made, but they also get hidden. And all of this damages decision making and it makes learning nearly impossible. It’s also really stressful. And inevitably, it creates a situation where the people that you can least afford to lose usually end up being the first to go.
-
By respecting people, you make it so that those people really never want to leave. They always want to put in their all. They always want to be able to help be successful. Because, again, they’re there and they’re helping to be part of achieving the target outcome.
-
[00:01:09] Episode Introduction
Henry Suryawirawan: Hello again to all of you, my friends and my listeners. After a one week break, I am back here again with a new episode of the Tech Lead Journal podcast, the podcast where you can learn about technical leadership and excellence from my conversations with great thought leaders in the tech industry.
If you haven’t, please follow the show on your podcast app and social media on LinkedIn, Twitter, and Instagram. And to appreciate and support my work, subscribe as a patron at techleadjournal.dev/patron, or you can also fuel me with coffee at techleadjournal.dev/tip.
My guest for today’s episode is Robert Benefield. Robert is the author of “Lean DevOps: A Practical Guide to On Demand Service Delivery”. In this episode, Robert shared insights on how we can apply the Lean DevOps mindset for building successful IT delivery organizations. Robert started by sharing what initiated him writing the book and how it differs from the other available DevOps books.
Robert described the concept of on-demand service delivery and important concepts, such as knowing the target outcomes, building situational awareness, and making effective and timely decisions based on the OODA loop. Robert also shared a few practices and techniques he outlined in the book, such as mission command, workflow board, queue master, service engineering lead, value stream mapping, and Einheit.
This is such a great refresher of the Lean mindset combined with the DevOps culture and practices. And I hope you enjoy listening to this episode and learn a lot from it. And if you do, it would really be awesome if you can also share this episode with your colleagues, your friends, and your communities. And also don’t forget to leave a five-star rating and review on Apple Podcasts and Spotify. It may sound simple, but it will help me a lot in getting more people to discover the podcast on the platforms. Let’s go to the conversation with Robert, after hearing a few words from our sponsors.
[00:03:26] Introduction
Henry Suryawirawan: Hey, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today, I have with me a guest named Robert Benefield. He’s the author of a book titled “Lean DevOps”. So if you think this is just another DevOps book, I think when you find the book and you read it, I think it really offers something different from the other DevOps books that I have read in the past.
So, Robert, today I’m really looking forward to discussing topics from the book and maybe learning something new from its angle or perspective on DevOps compared to the other books. So welcome to the show.
Robert Benefield: Thank you.
[00:03:58] Career Journey
Henry Suryawirawan: Robert, I always love to ask my guests to introduce themselves by telling their career journeys, especially the highlights or turning points that you feel are useful for listeners to learn from.
Robert Benefield: Okay, yeah. My career journey has actually played a really important role in the way that I think, which ultimately is a lot of what the book is about. So, while I’ve always been interested in technology, I started out playing with piles of components and soldering irons and all that sort of thing. Really, my true interests have all been around understanding the dynamics of systems. So I always wanted to know how things work. But most importantly, what I wanted to know was actually how things break and what causes them to break.
This goes with not just technical systems, but also biological and social systems. There are always factors that are unknown or poorly understood that cause the unexpected to happen. I mean, if you think about it, like, you know, we live in a crazy day with crazy politics and all kinds of things, and there are all sorts of weird things that happen. And it’s hard to actually understand that, let alone to be able to go through and traverse it. And so, learning how to see what you can and then figuring out how to effectively manage the repercussions of that helps you actually improve your chances of success.
So, my career. I studied aerospace engineering and international relations. Both are disciplines that require working in a system of complex systems. And so, I realized quite early on that focusing on just a fragment in isolation was a surefire path to failure. So, I’ve been really fortunate in my career. Sometimes I feel like Forrest Gump, given how lucky I’ve been. A lot of it was solving problems that no one had actually ever solved before, and me being too dumb to actually realize that no one had solved them before half the time. And also having fortuitous circumstances.
So like for instance, I learned a lot of the concepts of Lean back in the 1980s because I happened to actually run into some industrial engineers from Matsushita. And so I was interested in engineering and they showed me industrial engineering and Lean. And so I took Lean as actually being industrial engineering and I took a lot of the learnings to heart. Similarly, later on I ended up working for the US government, and I met some of the Fighter Mafia, of Boyd’s Fighter Mafia people, and the concepts around OODA and decision making. And again, I thought that was normal. And I thought, oh, this is how you actually go through and make effective decisions.
And from there, I actually learned the importance of actually having a shared understanding of what’s going on. And the importance of understanding the desired outcomes and the means for maintaining enough situational awareness to make effective decisions. I saw with my own eyes that any weaknesses could lead to bad decisions and failure, even when you had the best technology money could buy at your fingertips.
So again, at the government, I was lucky that I was on one of the teams that was originally tasked with putting the government online and built the first news site ever, and the first digital audio site ever on the internet. This was before web browsers. This was before the NCSA web server was out there. We did it with FTP and Gopher. So that’s how long ago it was.
I then went to a company that was building electronic trading and crossing systems for investment banks. And again, it was a brand new world. Very few people had actually been going through and working on this thing. And we were fortunate enough that we got bought by Jefferies cause it was a startup. And one of the things that was important there was, you needed to understand the business. We worked closely with the traders. We were technical, we were helping build the trading strategies. And we really needed to be in tune with what were the desired outcomes that people were trying to go through and achieve.
Then I went to Silicon Valley at the beginning of the Dot-com boom. And managed to go through and help a bunch of startups get off the ground. Ones that many of us have heard of. Did road shows, and also crashed some. And learned from that. Back then in the Valley, it was a really small world. Everyone knew everyone. And I got in arguments with Marc Andreessen about things. I helped Howard Gobioff, he was one of the early people at Google, with helping getting Google off the ground. So it was a really interesting world to be able to go through and try things out.
And since then, I learned all these interesting techniques. And I learned what it takes to build effective and scalable, and ultimately successful delivery organizations, how to build highly productive teams. And to figure out what the actual trade-offs were. And I realized that it wasn’t just really the technology and tools or even the quality of the engineers. I mean, sure that’s important, kind of like having enough cash on hand to be able to go through and do things. But what was most important was that you actually understood what were the actual target outcomes that your customers, the people who actually use your stuff, are trying to get to. And then being able to establish the right level of shared situational awareness across the team. To be able to allow everyone to be able to make the right decisions, to be able to safely fail, and to actually be able to learn and improve from all of that.
And I think that’s a lot of where the Agile and DevOps side of things, they tend to miss in a lot of the implementations. People focus on the tools, they focus on the processes. And while they can be effective for reducing some of the noise, the problem is that they miss the point of what is it they are actually trying to do? What is it you’re trying to actually solve? How can you go through and improve the quality and the timeliness of the decisions that you’re trying to make?
And so, I figured all that stuff out, and then I thought, okay, let me go try some things out. So I went to a startup and we were working on the collaborative software development environments for both large companies as well as open source. So we helped HP Imaging and Printing, as one example, rebalance all of the features that they had in all the printer lines. And they managed to reduce their delivery life cycle from 36 months to getting a new feature out, down to about two or three months. You know, big thing. Really big thing.
And then from it, when I was there, I started to really build a lot of the DevOps centric model that I actually talk about in my book. So how to go through and actually work closely with the customers, how to go through and combine the operational side of things with the engineering side of things. We were truly software as a service, as was the case back when I was working at the investment bank. We provided all of our stuff as software as a service, which in the mid nineties was completely strange. Nobody had actually thought of that.
So then I thought, okay, well I’ve done this, now let’s go and try to do this at scale. So a bunch of the people at the company, we went to different places. A lot went to Google, some went to Facebook, I ended up going to Yahoo. And at Yahoo, that was a really amazing experience. I was given a completely free hand to do and show what was possible there. They had just signed service level agreements with AT&T and with Rogers and with BT, and they were trying to figure out, how do we do that? How do we go through and actually build things at scale with high availability, high reliability?
So one of the things that I worked with people on was pioneering, we called it service engineering, which looked very similar to what SRE looks like at Google and Production Engineering looks like at Facebook. We built automated CI/CD pipelines. We called them code pipes. This was before, again, there was such a thing. We created a rudimentary container system before that stuff existed. We also created what I consider to be a real masterpiece of how to go through and actually manage all of the software and configurations and services all throughout the whole ecosystem and how to do that at scale and how to authoritatively understand what was actually going on. And then most importantly, how to go through and instrument all of the software and services and everything like that to really figure out what the customer’s experiences were and figure out most importantly, what matters.
So with all of that, we started doing a lot of really amazing things. Out of that, we came up with ZooKeeper. We were working on a lot of the big data side of things. We brought in the people who had created Hadoop and built a whole big data side of things around it. And again, all of this was really early days. And people might go, yeah, yeah, yeah. But you know, again, we’re talking about like 2006, 2007, 2008 timeframe. So eventually, the politics at Yahoo got to be a little much. So I decided with my family, we decided to move to Europe.
And in Europe was where I realized that all of the concepts and habits that I had picked up before, early in my life, they weren’t normal. They weren’t the things that normal people do. And so I spent the next few years carefully examining and trying to figure out how is it that I actually solve problems? How is it that I build effective teams? And are there things that I can actually go through and help teach others? Are there things that I can help to actually help others be able to be successful?
And so, I ended up working at a bunch of big companies. I worked at British Telecom. I worked at an energy company called RWE and they’re all over the world, but they’re primarily in Germany. This was also at the time where the Germans had come up with Energiewende, which is, we’re gonna shut down all the nuclear power plants. Oh my God, how are we gonna power everything? So that was an interesting challenge. I was at Skype and a bunch of other places as well. And the output of a lot of that actually ultimately came and built the book. So that’s a bit of my journey.
Henry Suryawirawan: Wow! Sitting here and listening to you telling the story, right? I’m very amazed with what you have done. You seem to be at the edge, the first pioneers of so many great technologies, so many great practices. And I can tell as well, you summarize some parts of the book, some gist of the book throughout your introduction as well, right? Things like target desired outcome, situational awareness, and things like that.
[00:14:14] Writing a DevOps Book
Henry Suryawirawan: Which is the interesting thing that I picked when I read this book is that I find it a bit different than some other DevOps books that I have read. For example, I’ve read DevOps Handbook, Accelerate, Phoenix Project. Although it sounds very different, definitely. But can you tell us, maybe, what would be the silver lining when you write the book, right? What is something that you want to offer different from the other DevOps books out there or DevOps practitioners that have been established throughout the last few years?
Robert Benefield: So I think that the biggest thing that most people miss, and this is something I’ve talked with Gene Kim about. I’ve actually had great conversations with people like Jez Humble and Dave Farley about as well. It’s not about the tools. Tools will come and tools will go. In fact, this was something when I first wrote the book and I gave an early draft to people they went, in fact one of them was Mary Poppendieck, she’s like, why aren’t you talking about containers? And I’m like, containers is an implementation thing. And I can tell you in my career, it will come, it will go, there’ll be something else.
And so if I just focus on the technologies, I’m missing the point. The point is, how can you go through and make better decisions? How is it that you’re able to achieve the actual outcomes? This is something that actually drives me a bit crazy with a lot of the DevOps community. People focus so much on outputs. They focus on uptime. But they don’t actually think about what is it that the customers are trying to do and what’s important.
And I give a story. I was working for a company that did a lot of commodities trading. And when I came in, they said, well, we’re trying to produce all of our systems and our services, we want to have five 9s uptime. And I went, oh, that’s interesting. So tell me more. Why is it five 9s uptime? And they went, well, our services are really important and we can’t afford to be down. If we are down, then our traders can’t trade. And I said, okay. So are your traders trading in the middle of the night? And they went, no. I said, are they trading on the weekends? No, no, no. So when is it that they’re trading? And I found out when they’re trading, and they did have a bit of an extended day, but their extended day was like 9:00 AM to like 7:00 PM. And I went, okay, so 9:00 AM to 7:00 PM you need to be up 100% of the time. The rest of the time the whole thing could be down. And they went, “Yep!”.
So it’s not about five 9s uptime. It’s about understanding what it is that you’re trying to do. This is something I learned when I was in investment trading, where we were building the trading systems. For us, the crosses were the most important. It was nine minutes that we had to have 100% uptime. It was nine minutes in like six windows during the day. The rest of the time the whole thing could be down. But those times, everything needed to be up. It needed to be performant, it needed to respond exactly how it was that I wanted.
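To put rough numbers on the commodities trading story, here is a minimal sketch comparing an around-the-clock five 9s target with what the business actually needed. The weekday-only, 9:00 AM to 7:00 PM window is an assumption drawn from the anecdote, and the function names are made up for illustration.

```python
# Illustrative only: the trading window (weekdays, 09:00-19:00) is an
# assumption based on the anecdote, not a figure from the book.

MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget for an around-the-clock availability target."""
    return MINUTES_PER_YEAR * (1 - availability)


def window_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of the week during which the service actually has to be up."""
    return (hours_per_day * days_per_week) / (24 * 7)


five_nines_budget = allowed_downtime_minutes(0.99999)
trading_fraction = window_fraction(hours_per_day=10, days_per_week=5)

print(f"Five 9s downtime budget: {five_nines_budget:.2f} minutes per year")
print(f"Trading window covers {trading_fraction:.1%} of the week")
print(f"Time outside the window: {1 - trading_fraction:.1%} of the week")
```

Under these assumptions, five 9s allows a little over five minutes of downtime a year at any hour, while the traders only needed the service for roughly 30% of the week. The outcome that matters is zero downtime inside the window, which is a different engineering problem from chasing a blanket uptime number.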
And I think that’s something that people miss. They miss what is it that people are trying to do? What is it that they need to try to solve? My experience of watching video is gonna be very different than if I’m actually grabbing a regular text website. If I lose frames, the world comes to an end when I’m actually watching a video. If I get a retransmit on something that’s text, I’m not even gonna notice. And you need to be able to actually understand that, understand those things, and then build your ecosystem to be able to go through and manage all of that stuff.
So that’s why I’m constantly going through and saying, what is it? What is your purpose? How is it that you actually know what’s going on and how you’re meeting your purpose? What is the data, the information, the awareness? Because it’s not just what you know and not just the data that you have, but your ability to pull those things together to be able to go through and make effective decisions. Do I deploy this way or that way? Do I build it this way or do I do it that way? Those are things that really come by actually understanding that True North of what actually matters to the customer.
That is something that time and time again in the community people miss. They miss what is it that I’m trying to do? Why am I doing it? They miss if I do things in one way, it’s going to hide information that might be important to me. Or it might flood me with so much information that isn’t actually important. And that makes it very difficult for me to sift through it to be able to make decisions. And that is the big thing that I see in the community that I really, really, really want people to learn and improve.
And again, it’s not that the tools are bad. Tools are great. But understand what the tools are doing. Understand how it is helping you. Understand what it is that it’s simplifying and hiding from you. And understand if that is actually something that doesn’t actually matter or ultimately is gonna be important to you.
[00:18:58] On Demand Service Delivery
Henry Suryawirawan: Thank you for giving the context why you wrote this book and why it differs from others. So the subtitle of your book also I find very interesting. You mentioned “A Practical Guide to On Demand Service Delivery”. First of all, what are you referring to as “on demand service delivery”? Is it some specific thing about, you know, I don’t know, like software consulting or is it like software engineering team? Or is this something that is applicable to all engineering teams no matter what? And if you can also give us some highlights, what are some of the problems that IT service delivery teams are dealing with in these modern days?
Robert Benefield: Sure. So the reason I say on demand service delivery is I’ve worked in different types of delivery organizations. I’ve worked in ones that build shrink wrap software. And of course, I’ve worked in ones that actually have services. And why I talk about on demand service delivery is if you’re in a shrink wrap software world or one where you have low engagement, low cycle, low tie to your customer, you can ignore a lot of things. Or you can push back onto the customer and their IT organization a lot of the heavy lifting of what’s going on. But if you’re doing on-demand service delivery, much like your telephone, internet connection, all of that stuff, the quality of the whole end-to-end matters. You need to make sure that you actually understand who your customers are and what your customers are doing.
And that part is actually really important. And that gets back into the IT service delivery teams. There’s this interesting thing, and I’ve seen it far more since I’ve been outside of the Valley, in that there’s still this difficulty for a lot of organizations, and not just the technical organizations, but actually the non-technical, managerial organizations to really understand how integral the IT service delivery, the technology and the teams that are building and running the technology are to understanding and working with the business to actually achieve what is needed.
So a lot of people will go, okay, well, I’ll just go through and give them a bunch of requirements. I don’t need to tell them anything. Parts is parts. In fact, some people go even further and go, “You know what? Anyone can sling code. So if I just give requirements, I can outsource all that stuff and I can outsource it to some random other company that doesn’t need to know anything about my company or my industry or any of that. And I’ll get something back and I’ll be able to use it.” And then what they quickly realize is, yeah, okay, that works for some generic things, but when it’s actually something that’s core, core to your business, it doesn’t work. Things start to fall down. Things don’t behave exactly how you would expect that they would. And that is a big, big problem.
[00:21:42] Mission Command
Robert Benefield: And that gets back into something that I talk about quite a bit in the book. I talk about mission command. So mission command is very much this. And this, I got a lot of heat about in the beginning because there’s a lot of the world out there that finds anything that ever came out of the military to be a bad thing. Not realizing that there are a lot of interesting learnings that come from there that are not militaristic, even remotely; that are actually incredibly helpful for being able to help do the right thing.
So mission command is very much this mechanism that is about talking about and making sure that there’s a shared understanding of what the actual objective or ultimately the target outcomes that you’re trying to achieve are. And the entire structure of it is all in and around, how do you enable the people that are down and on the ground to make the right decisions in order to do things? And why is that important? That’s important because the people that are down on the ground are the ones that are gonna have the most context of exactly what’s going on. And if they don’t actually understand what the underlying intent of what you’re trying to achieve is, they’re going to do whatever they’re gonna do.
In fact, this is actually something that you can see in Ukraine right now. So the West spent quite a bit of time after 2014 teaching the Ukrainian military about mission command. They originally were in a command and control style military structure, much like what you’re seeing in Russia. The Chinese do the same thing. So the people at the top tell the people at the bottom what to do. The people at the bottom do what the people at the top say, and they just act as automatons. Mission command flips it and says, I’m going to tell you what it is that our goal is, and I’m gonna tell you what the constraints are of what you can’t do. You can’t go into the village and kill everybody, you know, or whatever it is. And in software, it may be, well, you have to be really careful that you don’t lose data, or some other type of concept like that.
And so what you do is they call this a briefing. You give them a briefing and you don’t tell them what to do. You tell them what the goals are and you tell them what the anti goals are, what are the constraints. And then what you do is you say, okay, go off and tell me what you think you’re gonna do. And so what happens, lower down people will then go down and they’ll talk amongst themselves. And sometimes, if they’re like middle management, they’ll go down and down and down until it gets to the people at the bottom. And they will go in and they will come up with some ideas and some plans, and they’ll come up with something that’s called a back brief.
So back brief is, this is how we think we might do it, and here’s some questions that we have, and here’s what we think that we might need. This allows them to have the flexibility, to be able to have the conversation. So back brief is very much a conversation. They’ll present back, the people above will go, okay, this isn’t quite right. And they’ll tune and they’ll tune until they’re given the right resources and everything to go out in the field.
Then it’s not done. Now the people are out in the field. That’s when, you know, the crud hits the fan. And as they say, no plan lasts into battle. And so, again, the people understand what the ultimate objective is. They understand where the anti goals, where the constraints are. And they can then go through and do what is necessary in order to meet the objective.
And I gave, I think I gave an example of this in the book of, say for instance, somebody is told that they need to go out and take a hill in order to be able to understand what’s going on with the enemy, to then be able to go in and stop the enemy from being able to advance. So the outcome is stop the enemy from being able to advance. And the team is going through and they’re going towards the hill and they notice that they get behind the enemy lines and they see where they can go through and take out a command and control center. So they could continue up the hill, but if they take out the command and control center, well, that meets the objective even better than taking the hill, doesn’t it? And so they’re given the freedom to be able to go through and do that.
And that is something again, that I think is really important with on demand service delivery. We hire smart people, and we need to give them an understanding of what the goals are and give them what the constraints are that they need to work with, and then allow them to use their brains to be able to go through and make the right decisions. And that’s something that time and time again, I see technical and non-technical people like completely missing.
And when they actually see what’s possible, when you actually get people to use their brains, they’re just like, oh, I didn’t realize that. And every, like, this light bulb goes off in their heads and they’re like, oh, wow. They’re actually partners. They actually care about the business. They don’t just care about playing around with technology. And it’s like, no, actually they genuinely care. And that’s, you know, people want to be proud of what it is that they do and not just the technical parts. They wanna be proud of what is it that their contribution ultimately is able to go through and achieve. And so that’s something I think is really, really important.
Henry Suryawirawan: Thanks for explaining this concept of mission command. When I read the chapter also, I find it quite fascinating, right? This concept borrowed from military. And that also explains a lot of, why, like I interviewed some product people as well, they always talk about explain the outcome, not the output. Try to get engineering being involved in making the decisions how things should get done, not just coming from the top.
[00:26:56] OODA Loop
Henry Suryawirawan: And I think you also borrow a lot of other military concepts. One is the great work by John Boyd, the OODA loop, which is sometimes also quoted in the Agile methodology books, right? Maybe if you can give us also a glimpse like what is this OODA loop and how do you use it to apply in the Lean DevOps concept?
Robert Benefield: Sure. OODA loop is probably also one of the most, along with Lean, it’s probably some of the most misunderstood things out there. OODA loop is all about decision making and how to go through and make more effective, more timely decisions. So a lot of this came out of Boyd. Boyd is an interesting character.
See, he wanted to figure out how you could go through and outdo the enemy. So at first, he did what all us technologists do. He went, clearly, having the best technology is what’s going to get you there. So he came up with this thing called energy-maneuverability theory. And so this is a formula that allows you to be able to go through and determine with any aircraft, what is the maneuverability of the aircraft. You can compare different aircraft and the values of different aircraft and you can ultimately decide which one is better and which one will ultimately succeed. So he stole a bunch of mainframe computer time and ran all these calculations and everything, and came up with this amazing theory that’s still used today in aerospace.
Then what he did was, he happened to be a fighter pilot during the Korean War, and so the North Koreans were using MiG-15s and the US was using F-86s. So the US had a kill ratio that was just astronomical, just knocking out lots and lots of the enemy planes. So he went, clearly, this is gonna be an example where the F-86’s gonna be a better plane than the MiG-15. Then he ran the numbers, then he ran them again, and then he ran them again.
Every single time, the MiG-15 came up as a better plane. And he went, huh, that’s interesting. So he ran it against a bunch of other aircraft, and found that there actually was no relation between the two for when it came into combat. So then he went to try to figure out, well, what was it? So ultimately, what he found out was the first thing that was important was that you needed to actually understand what is the outcome that it was that you needed to do. He also understood that and he actually spent a lot of time studying ancient military people. He spent a lot of time working with people who had been in the Wehrmacht, that had been really massively successful during World War II.
And what he realized was it was all about how quickly you can make decisions. And if you can make decisions more effectively than the enemy, and more effectively means both quicker and better, then you’re gonna win. And the best way to do that is first understanding what is it that you’re trying to actually do? What is the outcome? Then be able to go through and he has this Observe and Orient, which is the first two Os. How do you go through and pull that information together? How do you go through and build context, so that you can then hit that D, Decide and Act? And if you can get within the enemy’s decision loop, meaning that you’re able to make decisions faster and act on those decisions faster than the enemy can actually figure out how to observe and orient, you can outmaneuver the enemy all day long.
So that’s where OODA came in place. And so, Boyd, interestingly enough, it wasn’t the Air Force that ended up taking his knowledge. Though he did come up with, and this was part of his journey as well, he thought, okay, well, if you can come up with practices and you have better practices than the enemy, then you’ll do better. And so he came up with the book, the seminal book that every military uses on all of dog fighting tactics. And he found out that didn’t work either. That it is not about practices, it’s not about methodologies. They’re not gonna get you there. But he figured out that, okay, well if you have all this information, and you’re actually able to make the right decisions, you’ll get there.
So in the Air Force, they didn’t take much of it on. They totally took his EM theory stuff. They totally took his dog fighting stuff and they thought he was a nut. But the US Marine Corps and more importantly the Special Forces units, both the US and the UK very much liked the way that he thought. And actually at Quantico, the Marine Corps actually built a library around a lot of his teachings. He spent a lot of time training a lot of the Marine Corps leadership of how to go through and make more effective decisions, how to go through and pretty much turbocharge the whole mission command side of things. To be able to go through and out decide the enemy.
So that was the reason that I put it in my book was if you can actually understand how the decision making process works. And even more importantly, you actually can understand what can get in the way of making decisions, and this is everything from biases. So we all bring our own biases to work, whether it is our favorite technologies, languages or ideas about how things work or not work, or like information flow. Am I getting the right information the right time? Am I getting it in the right context or am I getting too much information in the wrong context? And then the poor framing of the problem. And the context of the information to be able to decide how to go through and solve that problem. Those constantly get in the way of people being able to make effective decisions. If you can understand those things and you can understand where they’re coming from, you’re not gonna be able to stop them all, but it’s gonna be able to allow you to set the right trip wires and filters. To be able to go through and at least catch when those things happen. And then be able to learn and fail to figure out how to improve and overcome those.
Henry Suryawirawan: When I read this chapter, this part of OODA loop, right, it’s quite fascinating, I would say. Thinking of the Observe-Orient-Decide-Act and the associated things that are involved to do all those things, which you summarize in a couple of key points, which become chapters as well in the later chapters, right? Things like, of course, you mentioned it probably around 10 times already by now, knowing the target outcome. And then the second is building situational awareness about your situation, the context, and also friction, right? You mentioned about what other things that stand in the way. And the last part is about learning, right? How the team is able to learn and learn from the mistakes.
[00:33:16] Building Situational Awareness
Henry Suryawirawan: So I wanna go to the second part, building situational awareness, because sometimes in IT service delivery, in engineering team, we are always busy no matter what, right? We got requirements, we got issues, incidents, on-call duties and things like that. And even we have so many disruptions along the way. And also new technologies coming. We are basically constantly interrupted and distracted. So how can we build better situational awareness? Maybe you can give some advice here?
Robert Benefield: Sure. So actually I talk about this in some of the practical sections of the book. So my book is kind of split up into, there’s an introduction part, and then there’s theory chapters and there’s practice chapters. And so, what I talk about is a term that it’s actually the title of one of the Lean books “Learning to See”. So there’s a lot of things that are floating around in all of our organizations. How can you go through and organize the information and filter the information such that you are able to learn to see?
And so I come up with a bunch of different mechanisms and they aren’t those “thou must do” sorts of things. These are much more patterns that I’ve noticed that tend to work. So what I try to go through and do is I try to go in and look at how to go and organize a lot of the work that’s coming in. And so I talk a lot about workflow boards. And so these are more or less like Kanban boards. But I don’t get super caught up on all of the Kanban-y sorts of things. I’ve actually talked quite a bit with David Anderson and some of the Kanban folks about this. And it’s not that they’re not important. It’s that you need to actually understand what’s important in your context, which is to actually understand what’s going on in your ecosystem.
And so I try to go through and make sure that all work goes through there, including even the most simple, dumbest things. Like I’ll give one that hopefully most of us don’t have to deal with these days. Password resets and printer things or stuff like that. Being able to understand those and being able to actually capture those, even if it’s no more than a tally board, actually allows you to understand what is the size of the prize by actually fixing that problem, either by putting in some form of automation or doing something that makes it go away.
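As a rough illustration of the tally board idea, the sketch below counts small recurring requests and ranks them by total time consumed, which is one way to estimate the size of the prize. The request types, the minutes per request, and the `size_of_prize` helper are all hypothetical; none of them come from the book.

```python
from collections import Counter

# Hypothetical tally of small, repetitive requests captured on a workflow board.
# Request types and minutes-per-request are made-up illustrative values.
requests_logged = [
    "password_reset", "printer_issue", "password_reset",
    "access_request", "password_reset", "printer_issue",
]
minutes_per_request = {
    "password_reset": 10,
    "printer_issue": 25,
    "access_request": 15,
}


def size_of_prize(logged, minutes):
    """Return (request type, total minutes spent) pairs, largest first."""
    counts = Counter(logged)
    totals = {kind: counts[kind] * minutes[kind] for kind in counts}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = size_of_prize(requests_logged, minutes_per_request)
for kind, total in ranking:
    print(f"{kind}: {total} minutes")
```

With this made-up data, printer issues cost more total time than password resets even though they occur less often, so counting alone would point at the wrong thing to automate first.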
And what I do is, I tend to have, at the top of the workflow board, the concept of, I’ve used various terms for this. The one I use in the book is called the queue master. We’ve called them ops monkeys. We’ve called them all sorts of things. And in essence, this is the person that is there to watch what’s going on on the board. They’ll often be the one that is fielding things that are coming in randomly. And they’re the ones that will go through and work with the team to go through and make sure that that work that gets stuck, they can get unstuck. They’ll help go and fix this. So it ends up being a combination of, like in Scrum, kind of like a Scrum Master part, product owner part. Like how am I gonna go through and help the team succeed?
But the other thing that this person does is they’re allowed to go through and take a step back and look at what is going on in the big picture. And every place that I’ve implemented a queue master, I almost always start with the first person on the team that feels the strongest against actually doing this. And I go, okay, congratulations. You’re gonna be a queue master. And every single time, every, every, every single time that I’ve done that, by day three at the absolute latest and often before that, they come to me and they go, oh my God, I get it. Why is it that they get it? Because they can see what’s going on, they can see the mayhem. And all it is, this is about helping the team be able to see what’s going on. And I have some, again, pattern, processes in there for being able to go through and get the team to be able to understand what’s going on. To be able to go through and get a debrief to be able to allow the teams to learn and improve from what’s going on.
And I also have other patterns for whether all of the operational side of things is within a dev team or you have separate operations and dev teams. And I have the concept of what I call a service engineering lead. And this is kind of similar to the SRE side of things. You don’t always have to have it. But it is kind of that expert to be able to allow you to go through and make sure that if you need help or you’re missing something, that they can go in and actually help you go through and improve and get better. They’re not a policeman. They’re not the one that’s gonna say whether or not you can go live or any of that stuff. But they’re there to be able to go through and help make sure that everything’s coming together. They make sure that if you’re working in an ecosystem where you have lots of other teams that are doing things, they will work to help to make sure that you have connections with those other teams.
And again, I’ve noticed this. I’ve gone into a lot of companies lately. They use the Spotify model and they’ll set up tribes and squads and chapters. And what I’ve noticed is things will work quite well within the tribe. And tribes will tend to be aligned to a product line. But the problem is customers don’t tend to use a product line. Customers go across products. And the problem that you have is when a customer is cutting across products. Like, for instance, I’ll do things in banks, and one of the things that cuts across all products is Know Your Customer. So this is all the things to be able to make sure that the customer is who they say they are and they’re not doing some sort of illegal things. Those always break down in companies that have silos, whether they’re agile silos, Spotify silos, waterfall silos, whatever.
And what I’ve noticed in those is I often will create a highly effective team. It often looks like a squad. But really it’s kind of a cut on the service engineering lead model, where they cut across the various different squads to go through and help stitch things together, to be able to help people learn to see. Be able to help them make better decisions and understand what the actual outcomes are and be able to deliver those. So does that answer that? I know that I kind of skipped through a bunch of things. And I didn’t talk through the friction stuff, but I can as well, cause that’s also important.
[00:39:43] Workflow Management
Henry Suryawirawan: Sure. I’m quite interested when you are sharing this story about implementing queue master and you turned a skeptic into a promoter within just a few days, right? So for people here who are intrigued as well, maybe can you tell us a little bit of summary or gist, especially I find that many enterprise, you know, big enterprise or even the startups these days, right? There are just so many things to do. The company has so many initiatives, so many parallel tracks. And especially at the bottom, the engineering team, you know, like the tribe or the squad that you mentioned, people just are not aware of what are the things that are probably happening in the company. Second thing is about dependencies. How we align different teams across each other, right? So maybe if you can align a little bit about this concept of workflow board and queue master and how a company can help to see, be able to see what is happening in the organization so that they can improve from then on.
Robert Benefield: Yeah, so, I talk quite a bit about both, in the book as well as in general. I talk about how you can go through and instrument up your ecosystem. And in the book I also talk about, I’m a big fan of Cynefin as far as being able to go through and understand what sort of domain you happen to be operating in. Because that defines the way that you can make decisions. And so for people that don’t know Cynefin, it talks a lot about complexity theory. So you can have an obvious type of environment where everything can be pretty much plotted out ahead of time. All the way to complex and even chaos. And you have to take different approaches in those worlds because of what you know or don’t know.
And what I try to go through and do is, I try to go in and I try to build the instrumentation mechanisms. So workflow board and queue masters and those sorts of things. I’ll also instrument up things such as, I really like to try to go through and get a lot of things in around code metrics. Do I have a really complex branching strategy? What’s going on as far as code churn? Do I have lots of people touching the same code? Those give me little ideas of things that are going on. And it’s the same with being able to build observability and instrumentation within the production environment. They’ll give you some hints at things. And then what I try to do is I try to then rebuild and reduce noise, and build a common understanding of what’s going on within the teams. So this gets into something that, again, I talk about in the book.
Then there’s something that Boyd found out, which is something called Einheit. So Einheit is togetherness. This is being one team. This is knowing your team members. This is knowing like good, bad, ugly, building real relationships with them. And what Boyd found out was that when both the team themselves and the people that are the leaders or managers of the team actually get to know the team and they get to know each other, you can then understand how people are going to approach a problem. You’ll understand where their strengths and weaknesses are, and you can go through and help them. And you can help them in ways where you’re not getting in the way. And you can do it in a way where you don’t have to actually say a lot. In fact, Boyd talks about this. It was like a German, I don’t know if it was a colonel or whatever, and a lieutenant, and they exchanged a little small talk and suddenly they’re both able to go off to the races and stay totally aligned.
If you’re actually able to build that and you’re able to build that level of trust within the organization, then that allows for when people make mistakes, they’re not gonna hide them. They’ll be able to go through and surface them. They’ll be able to work together with others to go, “Hey, I made this mistake. Is there a way that we can make it more difficult to do this mistake in the future?” And understand what the root cause of that is. Are there ways to be able to go through and understand like, why is it that work is taking too long? And this gets into the friction side of things.
So, to me, friction is all those various different things that get in the way of you being able to not only make effective decisions, but act effectively. So what I notice is lots of people focus on throughput and output measures. How many features can I have? How quickly can I turn things around? Those aren’t super valuable. What’s more valuable is being able to go through and understand what are the root causes that are happening underneath? Do you have people that are task switching all the time?
Task switching is really bad, because task switching means that you’ll get in your head especially if you’re a programmer. And any of the programmers out there will understand this. I’m working on something, something else comes. You have to set it down. And then, you work on that other thing and then you come back and you’re like, “Oh God, what was I thinking?” Okay. And then it takes you a while to get back into it. And you may never, ever get there. Or having partially done work, which is another one. And you have that same sort of a problem.
Another one that I noticed, and this gets back into what I was talking about earlier with silos, is you’ll get overspecialization. So one of the things that a lot of companies have done, Yahoo did this to me, in fact, is a lot of companies will have things like DBAs and those sorts of people or network engineers, and they go, “I do this one thing. I am an expert at Oracle databases. And God help you, I’m not gonna do anything else.” And you go to them and they’re like, you know, the Oracle says no. And what happened at Yahoo is they gave me all of the Oracle DBAs. And the reason for that was, I decided to break down the specialization. It wasn’t that those guys didn’t continue to do database work, for example, but they needed to be part of delivering the overall outcomes. They were there to work with the teams and try to figure out how to help enable the teams to do a lot of the things that they weren’t interested in doing half the time, building an index or something like that, and it allowed the team to be faster. It allowed the team to understand a bit more of what chaos and mayhem they happened to be doing to the relational database so that they wouldn’t do that, they wouldn’t code it that way. And it allowed the DBAs to do what I told them, which was that they should all ultimately become DBEs (database engineers), so that they’re really focused on how they can deliver and engineer solutions that are able to go through and get there.
And I noticed that all of these things seem to work. I go through and I use value streams quite a bit to go in and try to understand where things are getting caught up. I’ll give an example. At one company I was at, the complaint was that the delivery teams were delivering too slowly. It takes between about 14 and 18 months for something to get out there. The delivery teams needed to get better. So I did a value stream. And what I found out was that the vast majority of the time was actually spent with the non-technical people arguing and getting approvals and all that kind of stuff. The delivery teams, they took no more than five weeks to deliver something once they actually got it. And so I said, okay, I can improve that. I can get it down to probably four, maybe three weeks. But that’s not gonna save you the 14 to 18 months. So it’s about being able to get people to actually see and understand what’s going on.
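The arithmetic behind that value stream example can be sketched as follows. The stage names and the 60-week figure for approvals are illustrative assumptions chosen to land in the 14 to 18 month range mentioned above; only the five-week delivery time and the three-week target come from the conversation. The helper names `stage_shares` and `saving_fraction` are hypothetical.

```python
def stage_shares(stages):
    """Return each stage's share of total lead time (0.0 to 1.0)."""
    total = sum(stages.values())
    return {name: weeks / total for name, weeks in stages.items()}


def saving_fraction(stages, stage, new_weeks):
    """Fraction of total lead time saved if one stage shrinks to new_weeks."""
    total = sum(stages.values())
    return (stages[stage] - new_weeks) / total


# Illustrative value stream, durations in weeks (about 15 months in total).
stream = {"approvals and debate": 60, "delivery": 5}

shares = stage_shares(stream)
print(f"delivery share of lead time: {shares['delivery']:.1%}")

# Speeding delivery up from five weeks to three barely moves the total.
print(f"saving from faster delivery: {saving_fraction(stream, 'delivery', 3):.1%}")
```

Under these assumed numbers, delivery is under a tenth of the lead time, so even a large improvement there saves only a few percent overall, which is the point of mapping the whole stream before deciding where to optimize.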
And it’s the same within an engineering team. If your builds are taking too long, try to understand why. See if there are ways to be able to go through and reduce that. Go through and check your code in early and often, because again, if you’re not getting that feedback or you’re holding onto something for too long, then you’ll lose context. It’ll make it more difficult to merge. If there’s a problem with the build, you’ll be sitting there scratching your head, trying to go, what the hell? What’s going on? So those are all things, again, to be able to go through and improve and get people to understand what’s going on.
And I also then teach non-technical people as well, how to go through and understand this information. So they don’t need to be technical, but I’ll go through and say, do you understand that there’s a lot of technical debt in these places? You can see that there’s a lot of code churn that’s happening in these places. The teams need help, or here’s the places where you have a lot of defects. If you go through and focus on giving the teams time to be able to go through and deal with those, that will be able to help not only reach your outcomes, but will be able to make the things be less difficult for the teams.
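One way to surface the churn hot spots described above is to count how often each file changes and how many different people touch it. The sketch below is an assumption-laden illustration, not the method from the book: it takes plain (author, path) pairs, which could be extracted from version control history, for example from `git log --name-only` output, and the function name `churn_hotspots`, the sample authors, and the file names are all hypothetical.

```python
from collections import defaultdict


def churn_hotspots(changes, min_authors=2):
    """Rank files by number of changes, keeping those touched by several people.

    changes: iterable of (author, path) pairs, one per file touched in a commit.
    Returns a list of (path, change_count, author_count), most-changed first.
    """
    touches = defaultdict(int)
    authors = defaultdict(set)
    for author, path in changes:
        touches[path] += 1
        authors[path].add(author)
    hot = [
        (path, touches[path], len(authors[path]))
        for path in touches
        if len(authors[path]) >= min_authors
    ]
    # Sort by change count descending, then by path for a stable order.
    return sorted(hot, key=lambda row: (-row[1], row[0]))


# Hypothetical history: who touched which file.
history = [
    ("ana", "billing.py"), ("ben", "billing.py"), ("ana", "billing.py"),
    ("cy", "auth.py"), ("ben", "auth.py"),
    ("ana", "README.md"),
]

for path, change_count, author_count in churn_hotspots(history):
    print(f"{path}: {change_count} changes by {author_count} people")
```

A short ranked list like this is the kind of view a non-technical stakeholder can read without needing to understand the code itself.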
I also go through and do the other thing, which is to instrument up all of your features and everything that is out there. Are there ones that nobody’s actually ever using? And if they’re not using them, maybe you should kill them off, which is something that is like telling a product person to go kill their babies. But again, it creates waste. It’s something that has to get maintained. It’s something that, if you’re going to rev an operating system or a version of Java or whatever, you have to spend time to maintain.
And people don’t understand that. They think that it’s a one and done sort of a thing. They don’t understand that with technology, it’s a constant evolution. It’s a constant change. Everything has a cost that’s associated with it, so get it so that they understand that and that they can tune that side of things effectively. So that’s a lot of the way that I try to go about doing a lot of those things.
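The check for features that nobody uses can be sketched very simply once usage events are being recorded. This is a minimal illustration under stated assumptions: the function name `unused_features`, the feature names, and the event list are hypothetical, and a real system would read events from whatever instrumentation it already has.

```python
from collections import Counter


def unused_features(all_features, usage_events):
    """Return the features that never appear in the recorded usage events."""
    used = Counter(usage_events)
    return sorted(f for f in all_features if used[f] == 0)


# Hypothetical feature catalogue and a sample of recorded usage events.
catalogue = {"export_csv", "dark_mode", "fax_report"}
events = ["export_csv", "dark_mode", "export_csv"]

print(unused_features(catalogue, events))
```

A feature that shows up in this list over a long enough observation window is a candidate for retirement, since it still carries a maintenance cost.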
Henry Suryawirawan: Wow, thanks for highlighting all these different things. It’s quite fascinating. I highly recommend people to read this book. Part of the summary that I learned from reading this book is that the first thing is that you need to enable the team, right? The whole team, to make quicker decisions. I think that is the crucial thing about all this, maybe Lean DevOps or DevOps in general as well.
And one of the key first things is to understand the target outcome. So you mentioned mission command, understanding the goals and the constraints, and letting people figure out the solution. And over time, you also need to build situational awareness, understand the friction, where things go wrong within the team, being able to see what is happening. Because many times I think we are just busy, but we are not able to see what are the things that are producing the friction that we are facing each day. And the last part is, I think, building the learning culture, right? And you also touched on togetherness, where people know each other personally. And also building the trust so that we can build a highly performing team.
[00:49:41] 3 Tech Lead Wisdom
Henry Suryawirawan: So thank you so much Robert, for all this great explanation, great insights. Again, I would highly recommend people to read the book. As we reach the end of our conversation, I have one last question that I would like to ask you. This question I call the three technical leadership wisdom. You can think of it just like an advice that you want to give people here to learn from you about your experience or from your knowledge throughout your career so far.
Robert Benefield: Sure. So I would say the first one, and you probably heard me pound on about this all throughout it, and this is something that’s important for everybody; that no matter if you’re the most junior person in the team or you’re the CEO, and that is have a purpose. Understand what are the target outcomes that you are trying to achieve. Why are they important? How do you contribute to them being able to be achieved? And most importantly, how do you measure whether or not you’re on the right track to progressing them?
And outcomes aren’t outputs. They’re not feature or defect counts. They’re not lines of code. In fact, this drives me completely up the wall with things like OKRs. A lot of people think that O stands for output and the key results are measures of those outputs. No, it’s all about understanding what are the things that matter to people who derive value from what you’re trying to deliver. So that’s number one. And if you don’t know what it is, try to find out. Because that’s what’s going to differentiate you from some J. Random guy who knows JavaScript or whatever that’s out there. And they’ll go, oh my God, this person’s able to go through and achieve things and make it happen. And if you’re a leader, how do you go through and communicate that stuff to your teams? That’s gonna be important. And don’t tell them what to do or how to do it, tell them what it is you’re trying to achieve.
So the second one is ensure effective decision making. So Boyd’s OODA loop is all about how do you go through and pull information and knowledge together to shape the right context to make quick and effective decisions. And it’s not about guessing. It’s about being able to go through and consider decisions critically, thinking through how things can go wrong and what might happen when they go wrong. This helps you be able to go through and test the quality of your situational awareness. It helps you filter out assumptions and biases and mitigate any unnecessary and uncontrolled risk.
So for leaders, one of the things that I find fascinating is that most leaders don’t realize that they’re often too far away from what is happening to be able to make the best decisions at the time that’s needed. Even if they have the information, they’re not gonna be able to turn it around and be able to make it happen the right way. And oftentimes, they may think they have the right information, but they don’t. And that’s where mission command comes in. Arm your people with purpose, help them get the information, give them the safety to be able to make the decisions that they’re best placed to make. And that way, what decisions you need to make are at the right level where you have the time necessary to be effective.
And then if you’re an individual contributor, you also need to be able to be equipped to make decisions. You need to be able to realize what you know and what you don’t know, where the gaps are in your awareness. And gaps are okay. I mean, this is where Einheit is important, and understanding your teams, where information flow, and mission command come in. You need to be able to feel safe to make mistakes and to be able to learn from them. And learning is how we all get better. And so, this is where managers and teams need to work together to make sure that nobody is put into a situation where they must make a highly risky decision with no support.
And then the third one, which I touched on a bit in the first two, is you really need to respect people. So people are your most valuable resource. You have to involve them in the decisions. You have to make sure that they feel safe and part of the decision. They need to understand the purpose. You need to make sure that you have an environment that feels like it’s a supportive team where people feel committed to helping each other pursue the target outcomes.
And what I notice time and time again is that if you don’t do that, you build a low trust environment. And a low trust environment is where information gets hidden, mistakes get made, but they also get hidden. And all of this damages decision making and it makes learning nearly impossible. It’s also really stressful. And inevitably, it creates a situation where the people that you can least afford to lose usually end up being the first to go. And so, by respecting people, you make it so that those people really never want to leave. They always want to put in their all. They always want to be able to help be successful. Because, again, they’re there and they’re helping be part of achieving the target outcome.
Henry Suryawirawan: Wow! Really beautiful message for people who just listened, right? I think respecting people is one of the very crucial, important elements of leadership, I would say. Because at the end of the day, it’s not just the leader who does the job, it’s actually the people who help to achieve the target outcome.
So, Robert, if people want to follow you or continue the conversation online, is there a place where they can find you or buy the books and things like that?
Robert Benefield: Yeah, you can buy the book. I have a site that I’m slowly cobbling together, leandevops.com. I do have a Twitter handle, which is leandevops, and sometimes I post things to LinkedIn as well. So I haven’t been quite as out there as I would like, because I’ve been brought into the mother of all flaming projects at the moment. And a lot of people say, “New book?” and I’m like, “Nope, that’s okay.” But it’ll definitely be a learning experience, and it’s definitely one of those completely super important, very incredibly complex projects. So those would be the places to go. I also do try to go out from time to time to speak in the community and meet people as well. So, you know, keep your eye out there. You’ll find very quickly, I am not a one way sort of person. I like to listen and understand what people are actually seeing as well as to be able to provide any sorts of insights and help that I can.
Henry Suryawirawan: Sounds like the new project is similar to last time, right? You always seem to do things that are new, right? Nobody has done it before, and I wish you good luck with that project as well. So thanks for coming here and sharing your insights, Robert.
Robert Benefield: Thank you. Thank you for your time.
– End –