#135 - Microservice Reflection & Scaling Complex Adaptive System - James Lewis
“Spend some time looking at the system in which you work. Understand how the work is working. Understand how flow is for your organization. And then you can work to optimize that.”
James Lewis is a Director at ThoughtWorks and a pioneer of microservice architecture. In this episode, we took a trip down memory lane to the time when James first coined and popularized the microservice architecture. James described his definition of a microservice and its important characteristics. He also shared the recent microservice evolution, including the swing between microservice and monolith. In the second half, James shared his insights from complexity science related to different scaling patterns. In particular, he explained how different hierarchy types can affect an organization’s growth rate. Towards the end, James gave some tips on how organizations can detect signs of suboptimal growth and what we can do to maintain organizational agility.
Listen out for:
- Career Journey - [00:03:48]
- Coining Microservices - [00:07:25]
- Definition of Microservices - [00:14:13]
- Microservices Swing - [00:18:42]
- Scaling Law and Complexity Science - [00:24:05]
- Complex and Adaptive System - [00:40:01]
- Examining Sublinear Growth - [00:43:47]
- 3 Tech Lead Wisdom - [00:51:19]
_____
James Lewis’s Bio
James is a Software Architect and Director at Thoughtworks based in the UK. He’s proud to have been a part of Thoughtworks’ journey for fourteen years and its ongoing mission of delivering technical excellence for its clients and amplifying positive social change for an equitable future. As a member of the Thoughtworks Technical Advisory Board, the group that creates the Technology Radar, he contributes to industry adoption of open source and other tools, techniques, platforms and languages.
He is an internationally recognised expert on software architecture and design and on its intersection with organisational design and lean product development. After defining what was the newly emerging Microservices architectural style back in 2014, James’ primary consulting focus these days is helping organisations with technology strategy, distributed systems design and adoption of SOA.
Follow James Lewis:
- Twitter – @boicy
- LinkedIn – linkedin.com/in/james-lewis-microservices/
- Email – james.lewis@thoughtworks.com
Mentions & Links:
- 📚 The Goal: A Process of Ongoing Improvement – https://www.amazon.com/Goal-Process-Ongoing-Improvement/dp/0884271951
- 📚 Beyond the Goal – https://www.amazon.com/Beyond-Goal-Eliyahu-Goldratt-Constraints/dp/1596590238
- 📚 The Quark and the Jaguar – https://www.amazon.com/Quark-Jaguar-Adventures-Simple-Complex/dp/0805072535
- 📚 Growing Object-Oriented Software, Guided by Tests – https://www.amazon.com/Growing-Object-Oriented-Software-Guided-Tests/dp/0321503627
- 📚 REST in Practice – https://www.oreilly.com/library/view/rest-in-practice/9781449383312/
- Characteristics of a Microservices Architecture – https://martinfowler.com/articles/microservices.html
- Java the Unix Way – http://2012.33degree.org/pdf/JamesLewisMicroServices.pdf
- Continuous Delivery – https://www.amazon.com/Continuous-Delivery-Deployment-Automation-Addison-Wesley/dp/0321601912
- Behavior-Driven Development – https://en.wikipedia.org/wiki/Behavior-driven_development
- XP (eXtreme Programming) – https://www.agilealliance.org/glossary/xp/
- Value stream map – https://en.wikipedia.org/wiki/Value-stream_mapping
- Team topologies – https://teamtopologies.com/
- Gini coefficient – https://en.wikipedia.org/wiki/Gini_coefficient
- Dave Farley – http://www.davefarley.net/
- Jez Humble – https://www.thoughtworks.com/profiles/j/jez-humble
- Nat Pryce – http://www.natpryce.com/bio.html
- Steve Freeman – https://www.linkedin.com/in/stevefreeman
- Dan North – https://dannorth.net/about/
- Cyndi Mitchell – https://www.thoughtworks.com/profiles/c/cyndi-mitchell
- Erik Doernenburg – https://erik.doernenburg.com/
- Martin Fowler – https://martinfowler.com/
- Jim Webber – https://jimwebber.org/
- Ian Robinson – https://twitter.com/iansrobinson/
- Jimmy Nielsen – https://www.jim-nielsen.com/
- Roy Fielding – https://en.wikipedia.org/wiki/Roy_Fielding
- Adrian Cockcroft – https://www.oreilly.com/people/adrian-cockcroft/
- Fred George – https://www.linkedin.com/in/fred-george/
- Sam Newman – https://samnewman.io/
- Geoffrey West – https://www.santafe.edu/people/profile/geoffrey-west
- Murray Gell-Mann – https://en.wikipedia.org/wiki/Murray_Gell-Mann
- Manuel Pais – https://www.manuelpais.net/
- Matthew Skelton – https://blog.matthewskelton.net/about/
- Eli Goldratt – https://en.wikipedia.org/wiki/Eliyahu_M._Goldratt
- YOW! Sydney 2022 – https://yowcon.com/sydney-2022
- JAOO Conference – https://www.infoq.com/jaoo/
- Sun Microsystems – https://en.wikipedia.org/wiki/Sun_Microsystems
- PL/SQL – https://en.wikipedia.org/wiki/PL/SQL
- Tcl (Tool command language) – https://en.wikipedia.org/wiki/Tcl
- Vignette StoryServer – https://en.wikipedia.org/wiki/StoryServer
- jMock – http://jmock.org/
- Docker – https://www.docker.com/
- ThoughtWorks – https://www.thoughtworks.com/
- Santa Fe Institute – https://www.santafe.edu/
Coining Microservices
-
Ian’s observation is “be of the web, not behind the web”. The idea is: why, in big organizations, do we spend so much time creating our own infrastructure and our own protocols? We try to reinvent the wheel the whole time when actually there’s this massively successful distributed system called the World Wide Web. Why shouldn’t we just use the technologies from the World Wide Web?
-
It’s an unconference, and we were talking over three days, forming our own agenda. And one of the themes that came out over those three days was, why are things so hard to change? We’ve got this big product. How do we break it up? How do we change it more effectively? How do we improve maintenance of it? Do we build a new one, or do we strangle it? All these different questions.
-
Genuinely, one night, I thought maybe we’re solving the wrong problem. Maybe if we didn’t have the problem of things that were too big, if we had lots of smaller things, that might make things easier to deal with.
-
And I said, “Jimmy, I’ve got this idea. Like, what happens if we just have everything as an aggregate root? And they talk to each other across the network.”
-
So rather than doing threading within a monolith or having a structured monolith, we decided to scale via processes. So every time we wanted to have a new capability or subdomain, if you like, we would create a new service. And along with that came a bunch of other practices. We’d have one repository per service.
-
And the reason we’d have one repository per service is if you put them all into the same repository, then I won’t be able to avoid finding the abstractions and extracting stuff into shared libraries. And we don’t wanna do that. We want to have across-the-wire, API-to-API communication.
-
We started putting all these things into practice. We had something that looked very much like Ansible–pre-Ansible–in order to manage all this stuff on AWS. And it seemed to work pretty well.
-
And Martin Fowler had been trying to persuade me to write it up. There were other people at the same time talking about exactly the same thing. So whilst I would argue that Martin and myself, we wrote down what became the description of the characteristics, Fred George at the time was also talking about microservices in a different style. But he was also talking about these really small things. He was talking about functional microservices. And Adrian Cockcroft at Netflix, he was also on a journey with Netflix to what he was calling fine-grained service oriented architecture.
Definition of Microservices
-
There’s always gonna be, I think Martin calls it semantic diffusion, where the original meaning of a term is lost over time, as many people start to add their own meanings to it. But actually if you look at those characteristics, they stand up pretty well. I might make more of a focus on the business side of things rather than the technical side of things. I think the technical side of things has accelerated off in all sorts of directions.
-
When I’m consulting, the things I see least applied are the things around products versus projects. And the organization around business capabilities and really embedding Domain-Driven Design into the heart of what you do. And I think that’s because it’s actually harder in some ways to solve than the technology problems of breaking things up and remote communication.
-
Because it involves humans. Whenever we’re in a position when we involve humans and teams and people and structures, the time scales to change and for experimentation are much longer. It’s much harder to change your development organization or your organizational structure frequently, like we can do when we refactor code bases or where we actually redesign or re-architecture. Because it takes a long time to see if something’s working.
-
The other thing with the characteristics, I probably would’ve emphasized Conway’s law more because I think it is something that’s a real driver for a lot of the microservices.
-
A lot of the problems I see with implementations. You often get asked a question like, “You know, we’ve got these 2,000 services. Like, how am I supposed to understand all of those?” To which my answer is, “Well, you’re not, right?” That’s the whole point of breaking things up into smaller units, organizing around the capabilities, around the domains, and then splitting the teams up like that.
-
Conway’s law acts in there so that all you need to know about are the things that you know about and the things that may be adjacent to you. You don’t need to know about all 2,000 or 4,000 of these things that you’ve got. I think that’s definitely something that I might go back and focus a little bit on.
-
[Sam Newman] talks about, when you’ve got a big problem, how do you solve a really big complex problem? Break it up into a lot of smaller but still complex problems and solve the small problems. And that’s really what we were trying to do. It’s one of the hearts of microservices.
-
The other thing I probably would’ve emphasized a bit more, and this is maybe due to my journey as a software professional, is the concept of flow. So I probably would’ve put more effort into describing how these optimize your ability to get things done. And to scale teams around these things, because that, fundamentally, is such a huge issue in our industry. You know, how do we go faster? How do we get more stuff done? How do we improve the developer experience? How do we get value out the door, improve cost of delay, all these things?
-
And I think microservices at the heart are a way of doing that. But only if they’re done “properly” by paying attention to how you split up your teams and that you don’t get this massive distributed monolith that everyone has to understand everything, and everything that has to be deployed all at once.
Microservices Swing
-
The ideas behind service-oriented architectures, it was about how to maximize the investment in our internal IT estate. If you think about how organizations grew in the eighties and nineties, what you had internally, you’d end up with lots of these different platforms that were sometimes doing the same thing. These are big ERP systems that sucked all your data in. It’s really hard to get stuff out of. Maybe one department bought one thing, another department bought another thing, and they both did the same thing, so you end up with duplication of investments and things.
-
And service oriented architecture really was this idea of, can we create a set of common building blocks that we can use in our organization? So this is going back to capabilities as well. Business capabilities we first talked about in the late fifties, so this is nothing new, right?
-
When we talked about capabilities, they were talked about not in the sense of purely as the software. They were talked about in the sense of the people, processes, and tools that make up an important part of what your business does. And service-oriented architecture was about then almost like creating these units that would allow these things to talk to one another without a duplication and so on. So reducing cost.
-
As things change, as you then move into the internet age, into a world where everything is digital, then the meaning of a service oriented architecture started to change. We started to think less about internal integration between systems and how to make that more stable, and more about this idea of how do we offer APIs externally? How do we build software that’s providing these APIs that either our channels can use or that we can offer as a service? There’s been this long-term trend towards more service oriented architecture everywhere, if you like.
-
Then there’s also the idea of modular monoliths. Do we have to start with microservices? This seems like a really complex way to build simple systems. And it is a really complex way to build simple systems. I don’t think anyone’s ever said that you shouldn’t build structured monoliths.
-
I think the problem came that over time, we saw that it’s really hard once you go past a certain scale to keep the monolith modular. If you’ve got something that’s relatively simple and small, then that’s fine. But at a certain point, with a certain amount of churn on the team, with a certain number of people working on the platform, etc, you tend to reach a point where things start to become tangled. You end up having to put a lot of energy into managing tech debt. Or a lot of energy into avoiding entropy, if you like, in the monolith.
-
Naturally, we sort of swung one way. I think then naturally we’re swinging somewhat in the other direction. But I also think there’s something in the middle. And I suspect we still haven’t settled on that yet, which probably looks a bit more like fine-grained service oriented architecture, which Adrian would say. Because at certain points, at certain scales, you can’t solve the problems without using this sort of distributed system. There are certain things you have to use these sorts of patterns for.
-
And obviously, as an industry, we are amazing. We’re a bit like Ouroboros, the snake that eats its own tail. We keep going back round and round and round and doing the same things.
-
And I think that’s the same with microservices. We’ll hit everything with the microservices hammer, forgetting that you don’t always have to solve every problem that way.
Scaling Law and Complexity Science
-
Every time you see straight lines on a graph in lots of different places, all these straight lines basically replicating across different domains, different parts of the world, different types of complex system, then you should find something interesting in it, right? You should be looking like, maybe, there’s something underlying here. Maybe there’s something that’s like a truth we can go after. And that we can use scientific methods to try to determine. And that’s what they’ve kind of been doing in Santa Fe around scaling.
-
And what they’ve sort of found essentially is with different types of network–whether you’ve got like hierarchical, like graph, like directed graph like networks versus things like social networks–different types of networks can explain the different straight lines on graphs that they’ve been finding in all sorts of weird and wonderful ways.
-
So, an example: the straight line on the graph that is the scaling law for metabolic rate in mammals. As mammals get bigger, there is this straight line that can be drawn for how many calories, essentially, they need to take in to survive. And there’s another straight line for how the infrastructure in cities expands. So as a city gets bigger, how many more water pipes do you need? And it’s a very similar straight line with a very, very similar exponent. And it’s the same for companies as well: company growth and revenue follow the same pattern.
-
And all of these different things appear to be related to a particular type of network, a hierarchical network. So water pipes in cities are hierarchies. In mammals, our circulatory system is a hierarchy, essentially from a heart down to the capillaries. And in organizations, we have hierarchies of information flow. So where do ideas come from? Where does direction come from? Where do orders in some senses come from?
-
As companies grow, it tends to become easier to create a hierarchy through which information trickles down, in the same way that water flows through pipes in the city, than it is to do it any other way.
-
The interesting thing about these hierarchical networks is that they all scale using what’s called a sublinear scaling law. So they scale less than proportionally. What that means is as you double the size of a city, you don’t have to double the number of water pipes. You need less than that. And the same, if you double the size of a mammal, you don’t have to double the amount of calories they take in. It scales less than that.
-
And it’s the same for organizations, for physical infrastructure. As you double the size of an office, the network cables aren’t double, right? I mean, it’s the infrastructure in the office, it doesn’t double. This is what economists call economies of scale. So as you get bigger, you pay less for things, essentially. And all of these complex systems exhibit these behaviors.
-
But it does also imply some other things. It implies the fact that things slow down. Things move more slowly when you have deeper hierarchies. And similarly with cities, as cities get bigger, the hierarchies do the same thing.
-
But then the interesting thing is, with companies, if you work for a big company, as you get bigger, things do get slower. Also, you can see it in the things, like the results that public companies post in terms of revenue and so on. So as a company doubles in size, it doesn’t double the amount of revenue, interestingly.
-
The other thing, like mammals, companies die all the time. As mammals get older, as we get older, we are taking the same number of calories, but we’re having to direct more and more of those calories to self-repair. And therefore, we slow down. As we age, things start to break down, and eventually, we pass away. It seems like companies do the same thing. There’s some interesting facts in the book “Scale”. The half life of a company on the exchanges is 10 years. So every 10 years, 50% of the companies on the exchanges will change.
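The half-life remark can be turned into a quick calculation. This is only a sketch: it assumes a constant 10-year half-life (the figure quoted above) and simple exponential decay, and the function name is made up for illustration.

```python
def surviving_fraction(years: float, half_life_years: float = 10.0) -> float:
    """Fraction of listed companies still on the exchange after `years`.

    Assumes plain exponential decay with a constant half-life, which is
    a simplification made for this example.
    """
    return 0.5 ** (years / half_life_years)


for years in (10, 20, 30):
    share = surviving_fraction(years)
    print(f"after {years} years: {share:.1%} of the original companies remain")
```

Under that assumption, each further decade halves whatever share was left at the start of it.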
-
There’s all these interesting things that seem to be related to hierarchy. Things slowing down, things taking longer, as things get older and so on.
-
In my experience, when you plot out things like development life cycles, you look at a big old organization, you look at the software development life cycle, you plot that as a value stream map, you’ll end up with these huge, long processes. Like, it takes a year from an idea to get even to the project management function, to be able to be scheduled to have some work done. And you see this in every old, big, traditional company that I visit.
-
Conversely, if you look at younger companies, often they don’t have that. Usually they have much shorter time to market. They’re able to be much more innovative, because the time it takes for an idea to go from in someone’s head, that information passing through a series of process steps to making money, it tends to be much shorter. And so over time, companies add all these processes, these constraints. They add hierarchy to manage all these constraints and all these processes and things start slowing down.
-
There’s also another interesting idea that larger organizations do the same thing. Bigger older organizations rather do the same thing that happens with mammals, where they have to spend more of their revenue on keeping the company running. In the same way that as we get older, we spend more of the caloric intake on repairing ourselves, companies spend more money on just being the company, whereas at the start they were spending money on innovation and products and people and all this cool stuff, and that’s why they were successful.
-
Eventually, they end up in this position where they’re spending so much just keeping themselves going. They stop thinking about innovation, or innovation becomes hard, R&D becomes hard.
-
But the other interesting thing is, it appears there’s another type of network, which in the natural world is associated with a different type of scaling law. Something called super linear scaling. So super linear scaling is when growth runs ahead of proportional growth. So as you double the amount of one thing, as you double the number of people, you get more than double the revenue, for example.
-
And in cities, cities are another example where social networks, massive social networks exist, between people communicating. As you double the size of a city, you get more than double the amount of socioeconomic things out of it. So you get more than double the innovation. You get more than double wage growth. Other effects, negative effects, you get more than double crime, pollution, disease.
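The "double the size" statements in this section can be made concrete with a power law of the form Y = c * N**beta, which is what shows up as a straight line when both axes are logarithmic. The sketch below is not from the episode: the exponent values are illustrative assumptions chosen to show the three regimes, and the function name is hypothetical.

```python
def doubling_factor(beta: float) -> float:
    """Factor by which Y grows when N doubles, given Y = c * N**beta."""
    return 2.0 ** beta


# Illustrative exponents only (assumptions, not figures quoted in the episode).
regimes = {
    "sublinear (beta = 0.75)": 0.75,    # e.g. pipes, calories
    "linear (beta = 1.00)": 1.00,       # plain proportional growth
    "superlinear (beta = 1.15)": 1.15,  # e.g. socioeconomic outputs of cities
}

for label, beta in regimes.items():
    factor = doubling_factor(beta)
    print(f"{label}: doubling N multiplies Y by {factor:.2f}")
```

With an exponent below one, doubling the size costs less than double; with an exponent above one, doubling the size yields more than double, for good and bad outputs alike.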
-
And related to these social networks and these social network graphs. If you structure yourself, if information is allowed to flow in a particular way, then you get, like, the whole is greater than the sum of the parts, if you like. You get more out than you think you should. And if you structure yourself in a different way, conversely, you get less out than you think you should, informationally. If the information flows hierarchically versus via social network.
-
So the question is how to take advantage of this? How to use this knowledge to improve our ability to get stuff done. If it’s a big organization, how do we organize ourselves such that people have got everything they need within a short number of hops from them?
-
Microservice was all about product teams which have everything there. You’ve got everything you need on your team to solve the problem, to evolve your product towards being a better product. And it’s related to Conway’s law as well. How do we avoid having to go outside our team, ask other people to do stuff for us, because that’s a kind of hierarchical way of doing things?
-
And the observation I make in the talk was, that’s how AWS structures themselves. These many small, independent teams relying on self service infrastructure. They’re so decoupled from one another that they’re able to add new things on the side super fast. So their ability to spin up new product teams is super quick.
-
Interestingly, I also point out their APIs. You could be using one set of APIs and you look at a different set of products and they bear no relation to one another, right? Like the user experience, the developer experience across the different products is very different. But it’s deliberate in the sense that we’re not gonna pay the coordination cost to make our APIs consistent, cause the coordination cost is going to slow us down.
-
We’re going to allow the teams to be independent. And that means we have to put up with the fact that the developer experience isn’t gonna be potentially the best.
Complex and Adaptive System
-
From the Santa Fe Institute, they’ve settled on a definition, which is that complex adaptive systems display four characteristics.
-
The first one is that they’re inherently complex so that the whole is greater than the sum of the parts. How do you get such complexity from such simplicity? What makes a jaguar, or what makes a human, or what makes a team or an organization or a city? So this is the idea of complexity. The whole is greater than the sum of the parts.
-
There’s also this idea of emergent behavior. So when we say emergent behavior, we mean you can’t reliably predict the outputs of a complex adaptive system from its inputs every time. It’s not a functional problem where you can say, given these stimuli, this thing will always happen.
-
Coming back to the thing about teams, we think teams exhibit that behavior all the time, right? You could never run the same situation, maybe even the same user story or sprint, twice. You couldn’t do it. You’d get different outcomes if you ran it a second time.
-
The third characteristic is this idea that they’re made up of self-similar parts. So in humans, the self similarity is the cells. We are self-similar. Our cells are self-similar to one another. And on teams, the self similarity is the people on the teams are self similar with one another within limits.
-
And then this idea of self-organization. So there’s no central thing telling a complex adaptive system how to organize itself.
-
And there’s obviously similarity with teams there as well. And so that’s why I talk about teams being complex adaptive systems. An example I use, again, the sprint. If you just change one member of a team, you’d get a different outcome, right? If one person is sick at a different time, you get a different outcome. If people pair or don’t pair, you get a different outcome. It’s almost like every single line of code could be or could not be written depending on so many different factors. And that’s why we talk about teams as being complex adaptive systems. And, of course, organizations are made up of these components. They’re made up of these things. So an organization is also a complex adaptive system.
Examining Sublinear Growth
-
For example, you can use the metrics from the DORA reports or the “Accelerate” book, the four key metrics as leading indicators of organizational health. Lead time to value, mean time to recovery, change failure rates, number of deploys per unit time. Those things are really interesting to track.
-
Now, a lot of people then fall into the trap of saying, “Ah, great, we’ve got tools that will tell us this.” And then they spend eight months trying to automate measuring all this stuff. There’s no need, right? We don’t need precision. That’s not what we’re after here. What we’re after is saying, you know, finger in the air, your team, how long on average does it take to get stuff into production? That’s good enough.
-
As long as you track that somewhere, and as long as you reflect on it, maybe at the end of an iteration or whatever it is. When you’re doing quarterly planning or when your OKRs are being reviewed and you look, are we better or worse? That’s what we’re really looking at. Are we trending in the right direction? We can use those trends to work out which direction we’re heading.
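A rough trend check like the one described above needs very little tooling. The sketch below assumes a hand-kept list of finger-in-the-air lead time estimates, one per iteration; the numbers and the function name are hypothetical, and comparing the earlier half against the later half is just one simple way to ask "are we better or worse?".

```python
from statistics import mean


def trend(samples: list[float]) -> str:
    """Compare the average of the later half with the earlier half.

    Samples are rough lead times (lower is better), oldest first.
    """
    if len(samples) < 2:
        return "not enough data"
    mid = len(samples) // 2
    earlier = mean(samples[:mid])
    later = mean(samples[mid:])
    if later < earlier:
        return "improving"
    if later > earlier:
        return "worsening"
    return "flat"


# Hypothetical estimates in days, one per iteration.
lead_time_days = [14, 12, 13, 9, 8, 7]
print(f"lead time trend: {trend(lead_time_days)}")
```

The same comparison works for the other three metrics, as long as you flip the interpretation for ones where higher is better, such as deploys per unit time.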
-
I use the analogy from biology, from medicine of going to the doctors. They’ll take your blood pressure, your temperature, your heart rate, that kind of stuff. If you’ve got super high blood pressure, maybe there’s an intervention that needs to be staged. It doesn’t tell you exactly what the problem is, but it might tell you there is an issue.
-
There are a couple caveats to using the four key metrics. The four key metrics are only going to be improvable to the limit of the system you find yourself in. So if you’re in a deeply hierarchical organization with these massive long value streams to get stuff done, you can optimize four key metrics, but you are only going to be able to optimize to the limits imposed by that system.
-
The other thing is the idea of signals. When we know that how we structure ourselves fundamentally impacts performance to such a degree, how do we take advantage of that, and how do we listen to signals from our own organization? That’s one set of signals I’ve just sort of given you, which is the four key metrics. But the other thing is taking time setting the organization up so that people in the organization can take the time to listen for the signals.
-
I often these days use the phrase that people are so busy doing the work that they don’t have time to think about how the work works. They don’t have time to think about the system and optimize the system. Because people are just head down, just doing stuff, doing stuff, doing stuff all the time.
-
That will be the other piece of advice I would give. Spend some time looking at the system in which you work. Understand how the work is working. Understand how flow is for your organization. And then you can work to optimize that. You can find, you can look at where those hierarchies are, the informational hierarchies. It might not be on the org chart, it might be in the value stream maps and work to change that.
-
One thing that is often overlooked is self-service and why self-service is so powerful. As I mentioned, the problem with hierarchies is you are trickling information one way. And often, if I have to ask someone to get something done, I’m waiting for someone else doing some work. That’s kind of like blocking my flow, blocking my circulatory system in a sense. So how can we make those systems such that they’re self-service? So I don’t ask someone to do something for me. Instead, they provide that in a self-service way.
-
And that’s what’s so powerful about AWS is that they recognize that by providing the undifferentiated heavy lifting when they provide that self service, they solve a couple of problems. The first is you don’t have to wait on anyone to get stuff done. So that’s the flow improvement problem.
-
And the other thing is you solve the problem of scarce resource. Because it’s queuing theory. Say you’ve got requests coming in to a queue, and you’ve got one thing able to take a request off the queue and process it. If you’ve got multiple producers feeding messages into the queue, then you have to have multiple consumers to process the messages at the same rate, to keep the queue from backing up.
-
And it’s the same with what we do. It is exactly the same as what we do. When we are asking people to spin up VMs for us, when we are asking another team to make a change for us, the only way for them to scale is for that team we’re asking to add more people. As they get more requests, they have to add more people. It’s unsustainable.
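The queue argument can be sketched as a small deterministic model. Everything here is an assumption made for illustration: fixed arrival and processing rates per tick, a hypothetical function name, and made-up numbers. Real request flows are bursty, which generally makes the backlog worse, not better.

```python
def backlog_over_time(
    arrivals_per_tick: int,
    consumers: int,
    per_consumer_rate: int,
    ticks: int,
) -> list[int]:
    """Return the queue length at the end of each tick."""
    backlog = 0
    history = []
    capacity = consumers * per_consumer_rate
    for _ in range(ticks):
        backlog += arrivals_per_tick           # producers add requests
        backlog -= min(backlog, capacity)      # consumers drain what they can
        history.append(backlog)
    return history


# One team handling tickets from many requesters: the queue keeps growing.
print(backlog_over_time(arrivals_per_tick=5, consumers=1, per_consumer_rate=2, ticks=4))

# Enough consumers to match the arrival rate: the queue stays empty.
print(backlog_over_time(arrivals_per_tick=5, consumers=3, per_consumer_rate=2, ticks=4))
```

In the ticket-based world, adding consumers means adding people to the team being asked. Self-service removes that team from the queue instead of making it bigger.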
-
So you invert the relationship and say, make these things self-service. Make providing infrastructure self-service. Make providing changes self-service. Things like developer platform engineering teams providing self-service access to tooling, to release pipelines, to databases on demand. That’s all about improving flow.
-
But you don’t get to see that unless you are able to visualize your system. Unless you’re able to step back, take the time and understand where the blockages are, where the cholesterol is building up in your arteries, if you like.
3 Tech Lead Wisdom
-
Empathy is a superpower, especially as you get more experienced.
-
Being empathetic is absolutely the key to getting things done. It’s all about being empathetic. It’s all about understanding not just what your team needs and what you need, but it’s also about understanding what your other stakeholders need, that aren’t just the business stakeholders, that are your stakeholders in security and operations, your downstream, your upstream folks.
-
Empathy in the large as part of the software development process, but also empathy in the small. Empathy with the people you are working with. You don’t know what’s going on in people’s lives often. And, being empathetic goes a long way to creating a really fun environment.
-
-
Seriousness does not equal professionalism.
-
The best projects I’ve ever worked on, the most successful products I’ve ever built, the best teams I’ve worked in, they understood that seriousness was not professionalism. Having fun makes a better team. Having a good time, being empathetic with your colleagues, sharing good experiences.
-
We sometimes make the mistake that if we’re seen to be having a good time doing something, then surely we’re not doing it right. But actually often I think it’s completely the inverse.
-
-
Look for the new rules you need and implement the new rules.
-
Why do organizations often fail at adopting new things, new ideas, new innovations? [Eli Goldratt] says that, in his experience, it’s because there are four questions they need to ask, and they forget to ask one of them. The four questions are:
-
If you’ve got a new thing, whatever that innovation is (microservices, Docker, Kubernetes, Continuous Delivery), you should ask yourself, what are the cool things that this thing gives me? What can this unlock for my organization if we adopt this innovation as technology?
-
What current limitations in the company will this new innovation help us overcome?
-
And you should then look and say, what rules did we have to manage our existing limitations?
-
And he says, when people adopt new technologies, often what they do is they stop there. They forget to ask the fourth question: what new rules do we need? And the new rules are the important bits, because if we try to manage new stuff with the old rules, we don’t get anywhere.
[00:00:56] Episode Introduction
Henry Suryawirawan: Hello again to all of you, my listeners. You’re listening to the Tech Lead Journal podcast, the podcast where you can learn about technical leadership and excellence from my conversations with great thought leaders in the tech industry. If you haven’t, please follow the show on your podcast app and social media on LinkedIn, Twitter, and Instagram. And to appreciate and support my work, subscribe as a patron at techleadjournal.dev/patron or buy me a coffee at techleadjournal.dev/tip.
My guest for today’s episode is James Lewis. James is a Director at ThoughtWorks and a pioneer of microservice architecture. In this episode, we went back down memory lane to the time when James first coined and popularized the microservice architecture. James described his definition of a microservice and its important characteristics. He also shared the recent microservice evolution, including the swing between microservice and monolith.
In the second half of our conversation, James shared his insights from complexity science related to different scaling patterns. Particularly, he explained how different hierarchy types can affect an organization’s growth rate. And towards the end, James gave some tips on how organizations can detect signs of suboptimal growth and what we can do to maintain organizational agility.
This is such an insightful discussion with James, and I find it a really interesting story how he first got the idea of the microservice architecture. I hope you enjoy listening to this episode and learning a lot from it. And if you do, please share this with your colleagues, your friends, and your communities, and also leave a five star rating and review on Apple Podcasts and Spotify. Let’s go to my conversation with James after hearing a few words from our sponsors.
[00:03:02] Introduction
Henry Suryawirawan: Hi, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today, we have another ThoughtWorker on the show. His name is James Lewis, a software architect and Director at ThoughtWorks. He’s based in the UK. If some of you don’t know about James Lewis, he’s actually the person who coined the term microservices with Martin Fowler back in 2014.
And I think some of you would have known by now that since then it has been a craze and hype and so many technology trends following microservices. And today we will touch on a little bit about the history of microservices and some evolutions since then, and we’ll talk about some other things as well.
So James, thank you for the opportunity to have you on the show. Looking forward to this conversation.
James Lewis: Hey, Henry, thanks very much for asking me to be on the show. It’s a real privilege and pleasure, so, yeah. Thank you.
[00:03:48] Career Journey
Henry Suryawirawan: So James, in the beginning, I always love to ask my guests to actually share any career journey highlights or turning points that you have in your career to be shared here for the listeners to learn from.
James Lewis: Yeah, that’s a great question, actually. What those career highlights are. There’s been so many. I’ve been, I think, genuinely so lucky over the 26 or so years I’ve been doing this or adjacent roles in technology. I think, I use the word luck deliberately, right? Because my career has spanned really what, if you think about the last 26 years of what’s been happening in technology, it spanned the birth of the internet and the worldwide web.
I was one of the first generation in the UK to grow up privileged enough to have access to a computer in school. You know, we had this BBC Micro thing, where every school was sent a computer by the government or by the BBC actually. So my career really spanned from that when I was sort of eight, nine years old, through Java, through the birth of the worldwide web, through Apache, through Sun Microsystems, all these different milestones.
And I’ve been very privileged along the way to work with some brilliant people. I mean, going way back, I started off doing physics. To be honest, I wasn’t diligent enough to be a good theoretical physicist. I didn’t do enough practice, and I’m definitely not accurate or precise enough to be an experimental physicist.
So, I sort of fell into systems administration and programming after that. Which I’ve really loved ever since. And I started off as a sysadmin. Then I was an Oracle database administrator for a while. All that PL/SQL goodness back in the day. And then I started with programming in, would you believe, Tcl (Tool Command Language), for things like Vignette StoryServer, which was a sort of very early content management system for the web. And then Java. I got into Java, I guess back in sort of 2001-ish.
I joined ThoughtWorks about 2005 as a Dev-2. Didn’t know what that meant at the time. And I still don’t, but that was my title, Dev-2. And I’ve been at ThoughtWorks ever since, which has meant I’ve been incredibly privileged to spend time with some amazing people. You know, ThoughtWorks London, 2005 until today, has had a tremendous amount of talent. ThoughtWorks in general, ThoughtWorks global, has this huge amount of talent. But at ThoughtWorks London where I joined, there were these people who went on to really change our industry.
Dave Farley and Jez Humble were there and they went on to write “Continuous Delivery”. We had Nat Pryce and Steve Freeman, who were the creators of jMock and went on to write “Growing Object-Oriented Software, Guided by Tests”. Dan North was there, who went on to create Behavior Driven Development. And now, obviously, Daniel Terhorst-North, he’s a very, very influential and wise figure in our community. And you know, Cindy Mitchell who was the MD at the time, but was a former Sun architect, I think behind the Swing libraries for Java. So there were these amazing intellects, amazing people. I should mention Erik Doernenburg as well, because he was an old friend before I joined and I sort of reunited with him at ThoughtWorks.
But these sort of huge talents really helped me to embrace the fun, right? The fun of working with technology. You know, there’s the old adage, if you enjoy what you do, you don’t work a day in your life. And I think I’ve been very privileged and lucky to have that come true for me. You know, I enjoy what I do and I enjoy working with the teams of people at ThoughtWorks. I enjoy meeting new clients and learning new domains. And I still do, I still find it fascinating. So, yeah, I guess that’s a brief précis of my early career.
Henry Suryawirawan: So you’ve been at ThoughtWorks for I think about 17, 18 years by now, so I think that’s pretty long, super long. And you were naming some of the people just now, right? I think some of these are thought leaders in the industry. They also changed some of the trends in the industry as well. So I’m sure today we’ll be talking a lot about some of the trends that you yourself also drive in the industry.
[00:07:25] Coining Microservices
Henry Suryawirawan: So let’s go back to the time then, 2014, right? You were the first to coin the term microservice with Martin Fowler. I dunno who came up with the term, but I think we can go back and reflect on the story. How did you actually come up with this idea?
James Lewis: It’s a really good question. And actually you need to go a few years before, in fact. So, the one thing about ThoughtWorks is there are so many amazing people working at ThoughtWorks doing amazing things, right? So I mentioned a couple of them, Dave Farley. I mentioned Jez Humble, so Continuous Delivery. I mentioned Daniel Terhorst-North. His thinking around architecture really informs me. I worked on a team building out services at a retail bank with him in RESTful services. That really influenced my thinking. Jim Webber and Ian Robinson, who wrote “REST in Practice” along with their co-authors. So that really tied together for me.
And also Ian’s observation that, you know, be of the web, not behind the web. This idea that why is it in big organizations, we spend so much time creating our own infrastructures, our own protocols. We try and reinvent the wheel the whole time when actually there’s this massively successful distributed system called the worldwide web, right? Why shouldn’t we just use the technologies from the worldwide web, and RESTful architecture level 3, Richardson maturity model level three architecture. So I was lucky enough to be in that sort of place.
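As a rough illustration of what level three of the Richardson Maturity Model means in practice, here is a minimal sketch of a hypermedia-style response, where the representation itself advertises what the client can do next. The order resource, the URL layout, and the `_links` key are assumptions invented for this example; they are not taken from the conversation or from any particular framework.

```python
def order_representation(order_id, status):
    """Build a hypothetical order resource that carries its own links."""
    links = {"self": f"/orders/{order_id}"}
    if status == "unpaid":
        # Only offer the transitions that are valid in the current state.
        links["payment"] = f"/orders/{order_id}/payment"
        links["cancel"] = f"/orders/{order_id}"
    return {"id": order_id, "status": status, "_links": links}


def next_actions(representation):
    """A client discovers available actions from the links, not from hard-coded URLs."""
    return sorted(rel for rel in representation["_links"] if rel != "self")


print(next_actions(order_representation(42, "unpaid")))
print(next_actions(order_representation(42, "paid")))
```

The design point is the same one the web itself relies on: the client follows links it is given rather than knowing the server's URL structure in advance.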
And the other thing, I was lucky enough to experience, a really formative experience for me actually was in 2007, I think I went to my first cross-disciplinary or cross technology conference. It was called JAOO in Aarhus. Daniel Terhorst-North, he really pressed me to go to this thing. And it blew me away, right? Blew my mind. People talking about so many different things, different types of integration, different programming languages, kind of where Java came from, virtual machines, infrastructure. It was just this massive interesting information that sort of went into my head. And I think that started me on this sort of journey to try and synthesize a bunch of the things that I was seeing that were successful. And to come up with some way of doing these all at once.
I guess in the same way that XP (eXtreme Programming) came about, right? It’s a bunch of practices or a bunch of things that people were doing that were really successful independently. So let’s put them all together and to do them all the time. And I think, you know, my brain works like that. I see patterns and I’m able to form connections between things. And I think there’s maybe a legacy of the physics earlier in my life. And so gradually, these sort of things I was seeing in ThoughtWorks and outside of ThoughtWorks started to fit into place.
Then there was one event actually in 2011. I was lucky enough to be invited to something called the Software Architecture Workshop. This was just north of Venice, actually. I can’t remember the exact name. And there were about 30 or so people there who’d been invited from all across Europe and the US and Africa actually. And it’s an unconference, so we were talking over three days, form your own agenda. And one of the themes that came out over those three days was, why are things so hard to change? Why is it that, these big products–there were some product people there–we’ve got this big product, how do we break it up? How do we change it more effectively? How do we improve maintenance of it? Do we build a new one or do we strangle it, all these different questions.
I can honestly say I had a sort of, I’ve never, it’ll never happen to me again. It happened once, at least. I literally had like a ‘bing’ in my mind, like literally almost like a chime, right? And I know it’s like a cliche to say, but genuinely, one night, I thought, well, maybe we’re solving the wrong problem. Maybe if we didn’t have the problem of things that were too big, maybe if we had lots of smaller things, that might make things easier to deal with. So I came downstairs very excited in the morning to Jimmy Nilsson, who was organizing the event, you know, the great, influential Domain-Driven Design expert.
I said, “Jimmy, Jimmy, I’ve got this idea!” And his wife and kids were there. He often travels with his wife and children. And he was looking at me quizzically. And I said, “Jimmy, I’ve got this idea. Like, what happens if we just have everything as an aggregate root. And they talk to each other across the network.” And he looked at me as like, “Yeah, okay, that’s interesting James, but you are crazy, you know?” He probably remembers it differently. And yeah, and then we sort of went our separate ways and I went back to a client project in the UK, and we started putting some of this stuff into practice.
So rather than doing threading within a monolith or having a structured monolith, we decided to scale via processes. So every time we wanted to have a new capability or subdomain, if you like, we would create a new service. And along with that came a bunch of other practices. We’d have one repository per service. And the reason we’d have one repository per service is, because I’m stupid, right, and genuinely, that was the reason. It’s like, if you put them all into the same repository, then I won’t be able to avoid finding the abstractions and extracting stuff into shared libraries. And we don’t wanna do that, right? We wanna have across-the-wire, API-to-API communication. We started putting all these things into practice. We had something that looked very much like Ansible, pre Ansible, in order to manage all this stuff on AWS. And it seemed to work pretty well.
And I did a talk at the time called “Java the Unix Way”. And I first did that, I think, at the conference that became Devoxx in Poland. I also then did it at short notice at QCon. I think it was QCon 2012 in San Francisco. Jez Humble had asked me to go into a training course. I did a training course on microservices, which was a little bit scary because I had the guy who sat next to Roy Fielding at Adobe on my training course. And I was trying to teach him about REST, which was a bit awkward. And then I did this talk and then it sort of went a bit quiet.
And Martin Fowler, you know, Martin’s a good friend now. He’d been trying to persuade me to write it up. We should write it up. We should write it up. We should write it up. We funded an internal summit, the ThoughtWorks Microservices Summit. We flew people in from around the world to talk about our experiences. And eventually, he decided that the only way to get me to write it up was to fly me to Boston and to go for a sleepover at his house. So that’s what we did. I flew to Boston. I spent a week living with Martin and his amazing wife, Cindy. Drinking good craft beer and writing about microservices.
I should also point out that, in the history here, and this is where the whole ThoughtWorks diaspora and the sort of connections to the rest of the industry come in. There were other people at the same time talking about exactly the same thing. So whilst I would argue that Martin and myself, we wrote down what became the description of the characteristics, Fred George at the time was also talking about microservices in a different style. But he was also talking about these really small things. He was talking about functional microservices, if you like. And Adrian Cockcroft at Netflix, he was also on a journey with Netflix to what he was calling fine-grained service oriented architecture.
But then, you know, Fred George is ex-ThoughtWorks. Adrian has been a long time ally and friend of ThoughtWorks and knows all the people I know. So it’s no surprise really, that these ideas all came together. And so you could probably say that the three of us, Fred, Adrian, and myself, were the people who sort of put it all together in one place. And then I wrote it up. So, yeah. That’s, I guess a very long or very short version of the story depending on.
Henry Suryawirawan: Right. I think it sounds really interesting. I didn’t know some of these historical moments, right? Like when you shared the light bulb moment that you got, and then you shared this idea. And then it became like a thing that we found out from Martin’s blog, right? So I think it’s still there. In 2014, both of you published this on his blog.
[00:14:13] Definition of Microservices
Henry Suryawirawan: And I think, if we have to go back to the definition, right? If you would define microservices now, I know that people take microservices into different types of understanding, different types of characteristics. But maybe from you originally, how would you define microservices now?
James Lewis: I probably, and we’ve talked about going back and writing a version two or writing an updated one as a second edition, if you like. I think it holds up pretty well. I mean, there’s always gonna be, I think Martin calls it semantic diffusion, right, where the original meaning of a term is lost over time, as many people start to add their own meanings to it. But I think actually if you look at those characteristics, they stand up pretty well. I might make more of a focus actually on the business side of things rather than the technical side of things. I think the technical side of things has accelerated off in all sorts of directions.
Remember Docker came after this, right? And Kubernetes. These are technologies that have been invented or popularized since (because, I mean, with Docker, the underlying tech has been around for ages), and they obviously help a lot with building distributed systems. They make it a lot easier. But I think that when I’m consulting, the things I see least applied are the things around products versus projects, the organization around business capabilities, and really embedding Domain-Driven Design into the heart of what you do. And I think that’s because that’s actually harder in some ways to solve than the technology problems of breaking things up and remote communication.
And why do I say that? Well, because it involves humans, right? So, whenever we’re in a position where we involve humans and teams and people and structures, and this maybe relates to some of the stuff I talk about later, the time scales for change and for experimentation are much longer. It’s much harder to change your development organization or your organizational structure frequently like we can do when we refactor code bases or where we actually redesign or re-architect. Because it takes a long time to see if something’s working.
I think the other thing is, with the characteristics, I probably would’ve emphasized Conway’s law more in a sense. Because I think that again, is something that’s a real driver for a lot of the microservices. A lot of the problems I see with implementations. You often get asked a question like, “You know, we’ve got these 2,000 services. Like, how am I supposed to understand all of those?” To which my answer is, “Well, you’re not, right?” That’s the whole point of breaking things up into smaller units, organizing around the capabilities, around the domains, and then splitting the teams up like that.
Conway’s law acts in there so that all you need to know about is the things that you know about, and the things that may be adjacent to you. You don’t need to know about all 2,000 or 4,000 of these things that you’ve got. I think that’s definitely something that I would maybe go back and focus a little bit on. Sam Newman has got a lovely phrase. He talks about, you know, when you’ve got a big problem, how do you solve a really big complex problem? Break it up into a lot of smaller but still complex problems and solve the small problems. And that’s really what we were trying to do. It’s at the heart of microservices.
The other thing I probably would’ve emphasized a bit more, and this is maybe due to my journey as a software professional since, is the concept of flow. So I probably would’ve put more effort into describing how these optimize your ability to get things done. And to scale teams around these things, because that, fundamentally, is such a huge issue in our industry. You know, how do we go faster? How do we get more stuff done? How do we improve the developer experience? How do we get value out the door, improve cost of delay, all these things.
And I think microservices at heart are a way of doing that. But only if they’re done, I’m gonna say “properly”, I’m air quoting properly, by paying attention to how you split up your teams, so that you don’t get this massive distributed monolith where everyone has to understand everything and everything has to be deployed all at once. But that just might be my obsession with flow and flow efficiency that I’ve developed over the last few years. I’m not sure.
Henry Suryawirawan: Yeah. So I think what you said is really insightful because many people actually took to the extreme on the technology side, right? So things like, for example, lines of code, like how big is a microservice, right? Or what technologies that we should use, maybe like Kubernetes, container, and things like that. But I think you remind us the important things, which is already defined by both of you in the article itself, nine characteristics, right?
And things that stand true to the test of time, I believe. Things like, for example, componentization via services, organized around business capabilities, products not projects, and things like that. So I think that’s really insightful. And I like the way you also mentioned about Conway’s law. I think Vaughn Vernon also mentioned that it’s like a law of gravity. You can’t escape from that. And you just need to adapt to Conway’s law, right?
[00:18:42] Microservices Swing
Henry Suryawirawan: The other thing about microservices lately is that people have been swinging, right? So some started with service oriented architecture, then they went into microservices. Monolith became quite a trend as well, because people implemented microservices in the wrong way. And now people have started to think about right-sized services. So what do you think about all these different swings, right? What could go wrong here? Or what is the underlying trend behind all these?
James Lewis: That’s a good question. I mean, going back and I’m fortunate enough to have been involved in the industry for a fair amount of time. You mentioned service oriented architecture. But what was the driver for SOA to start with, right?
So this is pre API economy. This is, in many ways, this is pre e-commerce, right? The ideas behind service oriented architectures, it was about how to maximize the investment in our internal IT estate, right? If you think about how organizations grew in the eighties and nineties, what you had internally, you’d end up with lots of these different platforms that were sometimes doing the same thing. Sometimes not. These are big ERP systems that were like, sucked all your data in. It’s really hard to get stuff out of. Maybe one department bought one thing, another department bought another thing, and they both did the same thing, so you end up with duplication of investments and things.
And service oriented architecture really was this idea of, can we create a set of common building blocks that we can use in our organization, right? So this is going back to capabilities as well. Business capabilities were first talked about in the late fifties, so this is nothing new, right? But when we talked about capabilities, they were talked about not in the sense of purely the software, right? Just the services. They were talked about in the sense of the people, processes, and tools that make up an important part of what your business does. And service-oriented architecture was then about, almost like, creating these units that would allow these things to talk to one another without duplication and so on. So reducing cost by doing X, Y, Z.
And I think, as things change, as you then move into the internet age, into the world where everything is digital. Doesn’t matter if you are a bricks-and-mortar retailer, you’re also a digital retailer, in a sense, at least you have to offer your catalogs digitally, etc. Then, the meaning of a service oriented architecture started to change. You know, we started to move from thinking about internal integration between systems and how to make that more stable, which is really where my head was back in the day, to this idea of, okay, how do we offer APIs externally? How do we build software that’s providing these APIs that either our channels can use or that we can offer as a service. Google Maps as an example.
So I mean, there’s been this long-term trend towards more service oriented architecture everywhere, if you like. Back in 2014-15, in my training courses, I’d be asked, how big should one be? You know, do you think we’ll end up with thousands of these small things that are 50 lines of code each? Which some proponents of them at the time were saying, you know, there was one classically, oh, what was his name? Chad. Chad. Chad. It’ll come to me in a minute. But his rule was you can write your service in any language you like, as long as you can rewrite it in half a day in another language. Which means things are really small, you know? So there was that movement.
Then there’s also the idea of modular monoliths. Do we have to start with microservices? This seems like a really complex way to build simple systems. And it is a really complex way to build simple systems. I don’t think anyone’s ever said that you shouldn’t build structured monoliths. I think the problem came that over time, we saw that it’s really hard once you go past a certain scale to keep the monolith modular. If you’ve got something that’s relatively simple and small, then that’s fine. But at a certain point, with a certain amount of churn on the team, with a certain number of people working on the platform, etc, you tend to reach a point where things start to become tangled. You know, you end up having to put a lot of energy into managing tech debt. Or a lot of energy into avoiding entropy, if you like, in monolith.
So I think, naturally, we sort of swung one way. I think then naturally we’re swinging somewhat in the other direction. But I also think there’s something in the middle, right? And I suspect we still haven’t settled on that yet, which probably looks a bit more like fine-grained service oriented architecture, which you know Adrian would say. Because at certain points, at certain scales, you can’t solve the problems without using this sort of distributed system. There are certain things you have to use these sort of patterns for.
And obviously, as an industry, we are amazing. We’re a bit like Ouroboros, you know, the snake that eats its own tail. We keep on going back round and round and round and doing the same things. I had a funny conversation with someone recently about what simple stack we should use to build an internal, simple web app that’s never gonna be exposed to the outside world, it’s gonna be a really simple line of business kind of thing, not many users. And of course the immediate reaction from many people was, “Well, first of all, you start with React and then you need maybe GraphQL.” This massive, massive stack of stuff, right? Just to build something that you could do really simply via server-side rendering.
And I think that’s the same with microservices. We’ll hit everything with the microservices hammer, forgetting that you don’t always have to solve every problem that way. I still think they’re very useful tools though. Although I do often, when I introduce myself as the sort of grumpy uncle of microservices, I do say, and I’m sorry about that.
Henry Suryawirawan: Right. So I think, one of the classic things also for people who are building internal tools, like what you said just now, right? So there are only a few developers, but they create more services than they can handle. So, for example, one person ends up handling three to five microservices on their own. I think that’s also probably not the right thing to do.
[00:24:05] Scaling Law and Complexity Science
Henry Suryawirawan: So I think, let’s see how it goes in the industry these days. I wanna continue our conversation to the other thing that you mentioned just now in our conversation, right? We are always trying to optimize ourselves. How can we deliver things faster? And lately you have also been giving talks in various conferences about this topic, which I had the opportunity to attend at YOW! Sydney back in December, which I found really insightful. And I wanna go and maybe cover some of the things that you mentioned there as well.
So one of the most important things that you mentioned in the talk is that for most companies, right, as they get bigger, it actually becomes very difficult for them to scale even bigger. While in the talk, you also mentioned that AWS, interestingly, didn’t have that characteristic. As they get bigger, actually they become much, much bigger. Looking back from my experience as well, right? I can see many companies, as they tend to become bigger, as more people join the companies, things start to get slower. So maybe in the first place, let’s go there and try to analyze why things become slower as most organizations grow bigger?
James Lewis: Sure. Yeah, absolutely. Let’s do that. My observation on that is, and I should point out that again, this is meta research. This is me summarizing a whole lot of work, really insightful stuff that’s been going on in the worlds of biology and economics, in things like city planning, all these different areas that generally come under the term complexity science.
So there’s been a huge amount of work going on. Singapore is a big center of complexity sciences. There’s an institute in Singapore. Probably, the home of complexity science has been the Santa Fe Institute in New Mexico. And there’s just been some super interesting things coming out of there.
Well, what I find fascinating is they put together a cross-disciplinary group of scientists and academics, right? People who wouldn’t normally talk to one another talk to each other. And they get them together and they force a conversation, if you like. So you end up with economists and biologists and physicists and chemists and geographers, and all these different groups of people coming together to talk about problems in a holistic way. Or about interesting things that they’re seeing, in a holistic way.
One example would be economics. So the Santa Fe Institute is a driver of this idea of complexity economics or non-equilibrium economics, where –I’m gonna murder this definition–but the idea is that traditional economics is based on equilibrium states, frankly, because the math is easy. The math is easy if you’re talking about systems at equilibrium. The observation from a bunch of people coming together in Santa Fe was, “Hang on a minute, most things aren’t at equilibrium, right?” Most things are in non-equilibrium states. Like the world is in a non-equilibrium state. So why is it we’re trying to approximate what’s really going on in economics with these massive oversimplifications. And the answer is because the economists are economists, they’re not mathematical physicists, right? So the maths that is easier for the mathematical physicists is not so easy if you don’t have that background.
So that’s one example of the sort of cross-disciplinary stuff that’s been happening. And they’ve been doing some amazing things. Using agent-based models, they can now do things like simulate the economies, non-equilibrium economies in the large. And they can do that by putting in the rules that the different countries in the world apply to the capitalist systems that they’re in. So the different market controls. Just by using these simple agent based models exchanging value, and these market controls, they can produce the Gini coefficients for nations around the world. So they can say, look, if we’re in the Netherlands, we’ve got this type of market control, right? Let’s run the model with these market controls. Oh, turns out we can get the Gini coefficient. So that’s the ratio between the richest in society and the poorest in society where the wealth is. It pops out for the Netherlands. If you put the market controls the UK has, the Gini coefficient pops out.
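To give a feel for what "simple agents exchanging value" can look like, here is a toy sketch. It is emphatically not the Santa Fe model: the random pairwise exchange rule, the `saving_rate` parameter standing in for a "market control", and all the numbers are assumptions made up for illustration. The Gini coefficient is computed in the standard way, as a measure of inequality running from 0 (everyone holds the same wealth) towards 1 (one agent holds everything).

```python
import random


def gini(wealth):
    """Gini coefficient of a list of non-negative wealth values."""
    values = sorted(wealth)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum formula for sorted data.
    weighted = sum((rank + 1) * v for rank, v in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


def simulate_exchange(agents=200, rounds=20000, saving_rate=0.0, seed=1):
    """Toy economy: random pairs pool the wealth they do not save and split it randomly.

    `saving_rate` is a hypothetical stand-in for a market control: the fraction
    of wealth each agent keeps out of every exchange.
    """
    rng = random.Random(seed)
    wealth = [100.0] * agents
    for _ in range(rounds):
        a, b = rng.sample(range(agents), 2)
        pot = (1 - saving_rate) * (wealth[a] + wealth[b])
        share = rng.random()
        wealth[a] = saving_rate * wealth[a] + share * pot
        wealth[b] = saving_rate * wealth[b] + (1 - share) * pot
    return wealth


free_market = gini(simulate_exchange(saving_rate=0.0))
controlled = gini(simulate_exchange(saving_rate=0.8))
print(round(free_market, 2), round(controlled, 2))
```

Even in this crude sketch, changing the single rule parameter changes the inequality that emerges, which is the flavour of result James describes: plug in a country's rules and see what Gini coefficient falls out.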
So they’re doing some fascinating, interesting stuff across the board. I think what really interested me was how it then related to how some of that research into scaling laws, into how things get bigger, relates to what we do as technologists or what we do in here or the organizations in which we find ourselves.
And I think in the talk I mention how Geoffrey West says, to paraphrase, every time you see straight lines on a graph in lots of different places, all these straight lines basically replicating across different domains, different parts of the world, different types of complex system, then you should find something interesting in it, right? You should be thinking, maybe there’s something underlying here. Maybe there’s something that’s like a truth we can go after. And that we can use scientific methods to try and determine. And that’s what they’ve kind of been doing in Santa Fe around scaling.
Geoffrey West, Professor West, I should say. I don’t know him, don’t know the gentleman. He’s been instrumental, as well as some others. And what they’ve sort of found essentially is with different types of network, and I’m using the word network in the kind of mathematical sense, if you like. So whether you’ve got, like hierarchical, like graph, like directed graph like networks versus things like social networks. Different types of networks can explain the different straight lines on graphs that they’ve been finding in all sorts of weird and wonderful ways. Weird and wonderful places.
So an example, the straight line on the graph that is the scaling law for metabolic rate in mammals. You can draw a straight line. As mammals get bigger, there is this straight line that can be drawn about how many calories essentially they need to intake, to take in to survive. And there’s another straight line for how the infrastructure in cities expands. So as a city gets bigger, how many water pipes, how much more water pipes do you need? And it’s a very similar straight line with a very, very similar exponent. And it’s the same for companies as well, and company growth, revenue is the same.
And all of these different things appear to be related to a particular type of network, a hierarchical network. So water pipes in cities are hierarchies, in mammals our circulatory system is a hierarchy, essentially from a heart down to the capillaries. And in organizations, we have hierarchies of information flow, right? So where do ideas come from? Where does direction come from? Where do orders in some senses come from, right? And they tend to be hierarchies as well. Because as companies grow, it tends to become easier to create a hierarchy for which information trickles down through in the same way that water flows through pipes in the city than it is to do it any other way.
But the interesting thing about these hierarchical networks is that they all scale using what’s called a sublinear scaling law. So they scale sub-exponentially. What that means is as you double the size of a city, you don’t have to double the number of water pipes. Instead you get less, you have to do less than that. And the same is true as you double the size of a mammal, you don’t have to double the amount of calories they take in. It scales less than that. And it’s the same for organizations, for physical infrastructure. As you double the size of an office, the network cables don’t double, right? I mean, it’s the infrastructure in the office, it doesn’t double. This is what we call or economists call economies of scale. So as you get bigger, you pay less for things, essentially. And all of these complex systems exhibit these behaviors.
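To make the doubling arithmetic concrete, here is a minimal Python sketch of a power-law scaling rule, quantity = baseline × size^exponent, which is the kind of relationship that shows up as a straight line when both axes are logarithmic. The exponent of 0.75 is an assumed illustrative value (it is the figure commonly quoted for metabolic rate against body mass); the function names and the baseline are hypothetical.

```python
def scaled_quantity(size, exponent, baseline=1.0):
    """Power-law scaling: quantity = baseline * size ** exponent."""
    return baseline * size ** exponent


def doubling_factor(exponent):
    """How much the quantity is multiplied by when the size doubles."""
    return 2 ** exponent


# An exponent below 1 is sublinear; an exponent of exactly 1 is plain linear growth.
for label, exponent in [("sublinear (assumed 0.75)", 0.75), ("linear", 1.0)]:
    factor = doubling_factor(exponent)
    print(f"{label}: doubling the size multiplies the quantity by {factor:.2f}")
```

With the assumed exponent, doubling the size needs roughly 1.68 times the resource rather than 2 times, which is the economy of scale described above.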
But it does also imply some other things. It implies the fact that things slow down, right? Things move more slowly when you have deeper hierarchies. So humans experience life more slowly than a mouse does. And it’s down to a mathematical relationship related to the depth of our circulatory system compared to the depth of the circulatory system of a mouse. Or seems to be explainable that way anyway, I should say. Science, we might find something else out. And similarly with cities, as cities get bigger, the hierarchies do the same thing.
But then the interesting thing is with companies, like what happens with companies and I think we experience this in our everyday lives. You know, if you work for a big company, as you get bigger, things do get slower. We do experience this personally day to day, you know, individually day to day. But also you can see it in the things, like the results that public companies post in terms of revenue and so on. So as a company doubles in size, it doesn’t double the amount of revenue, interestingly.
The other thing is like mammals, companies die all the time. And this is another fascinating similarity between the two which is also explainable via these ideas behind complexity science, complex adaptive systems. So as mammals get older, as we get older, we are taking in the same number of calories, but we’re having to direct more and more of those calories to self-repair. To repairing ourselves, whereas, you know, in our vigorous youth, we’re not having to do that. And so therefore we slow down as we age, as things start to break down, and eventually we pass away. It seems like companies do the same thing. There’s some interesting facts in the book “Scale”. I think it’s the half life of a company on the exchanges is 10 years. So every 10 years, 50% of the companies on the exchanges will change.
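Taking the ten-year half-life at face value (it is quoted from memory above), the arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the constant half-life is a simplifying assumption.

```python
def surviving_fraction(years, half_life_years=10.0):
    """Fraction of an initial cohort still listed after `years`, assuming a constant half-life."""
    return 0.5 ** (years / half_life_years)


for years in (10, 20, 30):
    print(f"after {years} years: {surviving_fraction(years):.1%} of the original cohort remains")
```

Under that assumption, half of a cohort is left after ten years, a quarter after twenty, and an eighth after thirty.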
So as I say, there’s all these interesting things that seem to be related to hierarchy. Things slowing down, things taking longer, as things get older and so on. Which I thought was really interesting, right? Because in my experience, when you plot out things like development life cycles, right, you look at a big old organization, you look at the software development life cycle, you plot that as a value stream map, you’ll end up with these huge, long processes, right? Massive. Where you’ve got, like, it takes a year from an idea to get even to the project management function, to be able to be scheduled to have some work done, right? Cause it’s bouncing around between different review boards, it’s bouncing between different architecture groups, and the technical business analysis versus the business analysis. And then what, I never know what any of these things mean. Then you hit development and then maybe some development gets done and it takes however long it takes. And then it pops out into some really long involved process of testing, you know, integration testing, user acceptance testing, regression testing, performance testing, blah, blah, blah, until you’re eventually in production. And you see this in every old, big, traditional company that I visit.
Conversely, if you look at younger companies, often they don’t have that. Usually they have much shorter time to market, right? They’re much more able to be much more innovative, because the time it takes for an idea to go from in someone’s head, that information passing through a series of process steps to making money, it tends to be much shorter. And so over time, there’s this thing where companies add all these processes, these constraints. They add hierarchy to manage all these constraints and all these processes and things start slowing down.
There’s also another interesting idea that larger organizations do the same thing. Bigger older organizations rather do the same thing that happens with mammals, where they have to spend more of their revenue on keeping the company running. In the same way that as we get older, we spend more of the caloric intake on repairing ourselves, companies, they spend more money on just being the company, whereas at the start they were spending money on innovation and products and people and all this cool stuff, and that’s why they were successful.
Eventually, they end up in this position where they’re spending so much just keeping themselves going. They stop thinking about innovation, or innovation becomes hard, R&D becomes hard. I’ve got this great idea, but I don’t know where to put it, because there’s no one interested in it, because we’re spending everything we’ve got on, I don’t know, would it be rude to say marketing? I don’t know. But you know, that’s an observation.
But the other interesting thing is, coming back to what I was saying earlier, the two types of network. It appears there’s another type of network, which in the natural world is associated with a different type of scaling law. Something called super linear scaling. So super linear scaling is when you’re in advance of exponential growth. So as you double the amount of one thing, as you double the number of people, you get more than double the revenue, for example. And in cities, cities are another example where social networks, massive social networks exist, between people communicating. As you double the size of a city, you get more than double the amount of socioeconomic things out of it. So you get more than double the innovation. You get more than double wage growth. Other effects, negative effects, you get more than double crime, pollution, disease, these things, right? And all these super linear scaling, these super linear factors are related to these social networks. So people talking with people.
I like to sort of explain it in the sense that, you know, if you live in a village of 150 people, right? I’m a software developer. I live in a village of 150 people. How many other software developers live in that village for me to connect to on a professional level? Maybe one? Who knows maybe more than one these days. But probably not that many. Whereas if you go to the YOW! Sydney, there’s a thousand people just at YOW! Sydney talking about innovation, talking about bootstrapping each other’s ideas, like the original conference I went to that we talked about right at the start of this, that bootstrapped all my ideas. So as you get a bigger city, as you get bigger places where more people live, people tend to congregate together with like interest, right? So you get this bootstrapping effect where people with like interest get together. That’s true for not just software and cool stuff with innovation and software. You get more than double the number of lawyers in bigger cities, right? You get more than double the number of accountants, etc, etc.
And that’s related to these social networks and these social network graphs. For me, that’s a really interesting observation, that we know mathematically there’s this relationship, right? If you structure yourself, if information is allowed to flow in a particular way, then you get, like, the whole is greater than the sum of the parts, if you like. You get more out than you think you should. And if you structure yourself in a different way, conversely, you get less out than you think you should. Informationally, you know. If the information flows hierarchically versus via social network.
Mathematically there’s a choice between the two. So the question is how to take advantage of this? How to use this knowledge to improve our ability to get stuff done, because that’s really what we want to do. Not necessarily make more money. Maybe, it’s saving lives, maybe it’s working at hospitals and organizing teams such that they’re communicating super effectively, which they do incidentally. And certainly in the UK. I can’t speak for the rest of the world. Or maybe, it’s schools. How do we organize our schools such that we’ve got the right information at the right time and stuff doesn’t have to trickle down slowly from above. Or if it’s a big organization, how do we organize ourselves such that people have got everything they need within a short number of hops from them.
And this is why, you know, we talk and go back to microservices. Microservices was all about product teams which have everything there. You’ve got everything you need on your team to solve the problem, right? To evolve your product towards being a better product. Which is, it’s the same idea, you know. And it’s related to Conway’s law as well, how do we avoid having to go outside our team, ask other people to do stuff for us, because that’s a kind of hierarchical way of doing things.
And the observation I make in the talk, of course, which was related to me originally by someone at AWS, was that that’s how AWS structures themselves. These many small, independent teams relying on self service infrastructure. And that’s another key thing that I should talk about in terms of microservices more. They’re so decoupled from one another that they’re able to add new things on the side super fast. So their ability to spin up new product teams is super quick.
Interestingly, I also sort of point out, I don’t know if you’ve ever done much work with AWS, but you know their APIs, you could be using one set of APIs and you look at a different set of products and they bear no relation to one another, right? Like the user experience, the developer experience across the different products is very different. But that’s deliberate. It’s not deliberate per se, as in we want things to be different. But it’s deliberate in the sense that we’re not gonna pay the coordination cost to make our APIs consistent, cause the coordination cost is going to slow us down.
So we’re going to allow the teams to be independent and that means we have to put up with the fact that the developer experience isn’t gonna be potentially the best. Which of course has interesting implications for things like microfrontends. Or how do you build apps that are able to use multiple teams to deliver parts of an app or parts of a website, parts of a page or whatever it is, and retain that experience. Cause most organizations, they want their brand to be relatable across the different parts. You don’t want the recommendations bit to look very different from the main product bit. But it means you have to pay a coordination cost to do that. I think that that’s a direct result of some of this complexity stuff.
Henry Suryawirawan: Wow, so much interesting stuff that you brought up here, right? So things from the type of networks that you have in the organization, whether it’s hierarchical, social. It’s also the amount of flow, how much information trickles down easily.
[00:40:01] Complex Adaptive System
Henry Suryawirawan: So one thing that I wanna bring up first, before we go into recommendations for all organizations or software teams here.
Many times actually, I also read that software teams is also, the analogy is like that it’s also a complex adaptive system, right? So for many people who may not be familiar with this term, they always think like, okay, it’s just another department within an organization. You can just produce whatever code that you can do, and then just deploy it. But I think we all know as a software engineer, that’s not that easy, right? So there are a lot of things that actually going on, and some people actually call it a complex adaptive system.
Maybe if you can touch on a little bit and explain why we should also think that software engineering team or product engineering team is also a complex adaptive system.
James Lewis: Yeah, that’s a good question. What is the definition of a complex adaptive system, right? I mean, again from the Santa Fe Institute, they’ve settled on a definition, which is that complex adaptive systems display four characteristics. The first one is that they’re inherently complex so that the whole is greater than the sum of the parts. That’s the first thing.
So for example, I normally explain that by saying, you know, I am a complex adaptive system, James. If you put James into a blender and came up with like James soup, it’s not very nice idea. And then you poured James soup into a James shaped jar. Would it be James? And the answer is no, right? I mean, this is the interesting thing, and this is why the history is fascinating.
It comes back to an old physicist called Murray Gell-Mann, who was fascinated by the idea of how you get such complexity from such simplicity. He wrote a book called “The Quark and the Jaguar”, right? How do you get something as complex as a jaguar from something as simple as a subatomic particle called a quark, right? What makes a jaguar, or what makes a human, or what makes a team or an organization or a city? So this is the idea of complexity, right? The whole is greater than the sum of the parts.
There’s also this idea of emergent behavior. So when we say emergent behavior, we mean that you can’t reliably predict the outputs of a complex adaptive system from its inputs every time, right? So it’s not a functional problem where you can say, given this stimulus, this thing will always happen. I mean, coming back to the thing about teams, we think teams exhibit that behavior all the time, right? You could never run the same situation, maybe even the same user story or sprint, twice. You couldn’t do it. You’d get different outcomes if you ran it a second time.
The third characteristic is this idea that they’re made up of self-similar parts. So in humans, the self similarity is the cells. We are self-similar. Our cells are self-similar to one another. Actually in mammals, and this is where the scaling laws work or come from. Our cells are self-similar to a mouse’s cell or to an elephant’s cell or to a blue whale’s cell. And on teams, the self similarity is the people on the teams are self similar with one another within limits.
And then this idea of self-organization. So there’s no central thing telling a complex adaptive system how to organize itself. In mammals, in our case, it’s DNA, right? There’s no thing when we’re in the womb that says, “Okay, now build a heart.” They’ve just evolved over time so that that happens according to some weird, magical, biological process.
And there’s obviously similarity with teams there as well. And so that’s why I talk about teams being complex adaptive systems. An example I use, again, the sprint thing. If you just change one member of a team, you’d get a different outcome, right? If one person is sick at a different time, you get a different outcome. If people pair or don’t pair, you get a different outcome. It’s almost like every single line of code could be or could not be written depending on so many different factors. And that’s why we talk about teams as being complex adaptive systems. And of course, organizations are made up of these components. They’re made up of these things. So an organization is also a complex adaptive system. I hope that somewhat answers the question.
Henry Suryawirawan: Definitely, and it’s not just within internal, the teams themselves. I think the external factors as well. There are so many complexities, what people call the VUCA world, right? The disruptions, the new technologies that come as well. So I think all these play a part as well to make it much more complex.
[00:43:47] Examining Sublinear Growth
Henry Suryawirawan: So, coming back to the topic of how organizations can avoid this sublinear growth. So would you be able to suggest some things, like, how can we actually first identify if we are showing a sign of aging? So, things get slowed down, things are not flowing as fast as when we were small. And how should organizations start to think about the more optimal way to actually have this super linear growth that you mentioned?
James Lewis: That’s a really great question. And what I normally do is I dodge that by saying, “Hey, I’m just here to provide the information. You can do with it what you want.” But actually that’s not very helpful at all, is it? I mean, I do have in that talk some concrete advice, right? Which is, for example, you can use the metrics from the DORA reports or the Accelerate book, the four key metrics which have become very popular, rightly so, as indicators, leading indicators actually of organizational health. So, you know, the four key metrics, lead time to value. They’ve got a weird definition of lead time. It’s from commit through to production, whereas real lead time is actually from idea through, soup to nuts, as my old MD used to say. So lead time to value, mean time to recovery, change failure rates, number of deploys per unit time. But those things I think are really interesting to track.
Now, a lot of people then fall into the trap of saying, “Ah, great, we’ve got tools that will tell us this.” And then they spend eight months trying to automate measuring all this stuff. There’s no need, right? We don’t need precision. We don’t need the precision you get from knowing, on average, we have exactly 32.5 hours between each commit. That’s not what we’re after here. What we’re after is saying, you know, look, accuracy, finger in the air. Your team, how long on average does it take to get stuff into production? That’s good enough.
And as long as you track that somewhere and as long as you reflect on it, maybe at the end of an iteration or whatever it is, when you’re doing quarterly planning or when your OKRs are being reviewed and you look at, are we better or worse? That’s what we’re really looking at. You know, are we trending in the right direction? Same for mean time to recovery, same for the others. We can use those trends to work out which direction we’re heading. And I use the analogy from biology, from medicine of going to the doctors, our general practitioners in the UK. They’ll take your blood pressure, your temperature, your heart rate, that kind of stuff. They’ll know if there’s something a bit dodgy going on, right? If you’ve got super high blood pressure, maybe there’s an intervention that needs to be staged. It doesn’t tell you exactly what the problem is, but it might tell you there is an issue.
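A low-tech version of that trend check might look like the Python sketch below. Everything in it is hypothetical: the metric names, the units, and the numbers are invented finger-in-the-air readings, and comparing the newer half of the readings with the older half is just one simple way to ask “are we better or worse?”.

```python
from statistics import mean

# Hypothetical rough estimates jotted down at each review, oldest first.
history = {
    "lead_time_days": [30, 28, 21, 14],
    "mean_time_to_recovery_hours": [8, 8, 6, 5],
    "change_failure_rate_pct": [20, 25, 30, 35],
    "deploys_per_week": [1, 1, 2, 4],
}

# Deploy frequency improves as it goes up; the other three improve as they go down.
HIGHER_IS_BETTER = {"deploys_per_week"}


def trend(values, higher_is_better=False):
    """Compare the mean of the newer half of the readings with the older half."""
    if len(values) < 2:
        return "not enough data"
    half = len(values) // 2
    older, newer = mean(values[:half]), mean(values[half:])
    if newer == older:
        return "flat"
    improved = newer > older if higher_is_better else newer < older
    return "improving" if improved else "worsening"


for name, values in history.items():
    print(f"{name}: {trend(values, name in HIGHER_IS_BETTER)}")
```

Like the blood pressure reading, a “worsening” result only says that something deserves a closer look; it does not say what the problem is.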
Now there are a couple caveats to using the four key metrics, right? The four key metrics are only going to be improvable to the limit of the system you find yourself in. So if you’re in a deeply hierarchical organization with these massive long value streams to get stuff done, you can optimize four key metrics, but you are only going to be able to optimize to the limits imposed by that system, right? That’s one key thing. I think that’s kind of interesting. So that would be one thing.
I think the other thing is this idea of signals. When we have some of this information about how we structure ourselves fundamentally impacts performance to such a degree, which we kind of now know based on a bunch of data that’s out there in the public domain, how do we take advantage and how do we listen to signals from our own organization? So that’s one set of signals I’ve just sort of given you, which is the four key metrics. But the other thing is taking time, setting the organization up so that people in the organization can take the time to listen for the signals.
Now I often these days use the phrase that people are so busy doing the work, that they don’t have time to think about how the work works. They don’t have time to think about the system and optimize the system. Because people are just head down, just doing stuff, doing stuff, doing stuff all the time. And often that’s actually why they hire people like ThoughtWorks to come in, right? Because we come up with a toolbox, with a bunch of techniques. We can say, “Hey, we’ll build you this value stream map, now we can optimize it.” I probably shouldn’t say this, it’s a secret. There’s no reason that you, for example Henry, can’t do the same thing with the same tools. It’s just that often we don’t have time to do that ourselves, you know. Cause we’re so busy doing other things.
So that will be the other piece of advice I would give is spend some time looking at the system in which you work. Understand how the work is working. Understand how flow is for your organization. And then you can work to optimize that. You can find, you can look at where those hierarchies are, the informational hierarchies. It might not be on the org chart, it might be in the value stream maps and work to change that.
And one thing, and I think I mentioned it briefly, I think one thing that is often overlooked is self-service and why self-service is so powerful. Because, as I mentioned, the problem with hierarchies is you’re trickling information one way. And often, if I have to ask someone to get something done, I’m waiting on someone else doing some work, right? That’s kind of like blocking my flow, blocking my circulatory system in a sense, right? So how can we make those systems such that they’re self-service. So I don’t ask someone to do something for me. Instead, they provide that in a self-service way.
And that’s what’s so powerful about AWS is that they recognize that by providing the undifferentiated heavy lifting–was the original term they used, all the stuff that the original AWS platform did –when they provide that self service, they solve a couple of problems. The first is you don’t have to wait on anyone to get stuff done, right? So that’s the flow improvement problem.
And the other thing is you solve the problem of scarce resource. Because it’s queuing theory, right? If you’ve got requests coming in to a service, into a queue rather, and you’ve got one thing able to take a request off the queue and process it, then you can only add messages in at the same rate that it processes them. If you’ve got multiple producers feeding messages into a queue, then you have to have multiple consumers to process the messages at the same rate, to keep the queue from backing up.
And it’s the same with what we do. It is exactly the same with what we do. When we are asking people to spin up VMs for us, when we are asking another team to make a change for us, the only way for them to scale is for that team we’re asking to add more people. As they get more requests, they have to add more people. It’s unsustainable. As we ask ops to do more for us, they have to add more people. And that’s unsustainable.
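The queueing argument can be sketched with a few lines of Python. This is a deliberately simple fixed-rate model, not a full queueing-theory treatment, and the function name and the rates are assumptions chosen for illustration.

```python
def backlog_over_time(arrivals_per_tick, consumers, per_consumer_rate, ticks):
    """Track the queue length when requests arrive and are served at fixed rates."""
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += arrivals_per_tick
        capacity = consumers * per_consumer_rate
        backlog -= min(backlog, capacity)  # cannot serve more than is waiting
        history.append(backlog)
    return history


# One person serving 3 requests per tick while 5 arrive: the queue keeps growing.
print(backlog_over_time(arrivals_per_tick=5, consumers=1, per_consumer_rate=3, ticks=5))
# -> [2, 4, 6, 8, 10]

# Adding a second person lifts capacity to 6, so the queue stays empty.
print(backlog_over_time(arrivals_per_tick=5, consumers=2, per_consumer_rate=3, ticks=5))
# -> [0, 0, 0, 0, 0]
```

In this model, the only lever the serving team has is to add consumers as demand grows, which is the unsustainable part; self-service removes the shared queue instead.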
So you invert the relationship and say, make these things self-service. Make providing infrastructure self-service. Make providing changes self-service. Things like developer platform engineering teams providing self-service access to tooling, to release pipelines, to databases on demand. All that stuff. That’s all about improving flow. But you don’t get to see that, really, unless you are able to visualize your system. Unless you’re able to step back, take the time and understand where the blockages are, where the cholesterol is building up in your arteries, if you like.
Henry Suryawirawan: So I think that’s a very good reminder. Yeah, like you mentioned, right? When we are working in the organization, in the team, we tend to just get things done, right? Okay, we have new features, we have new ideas, we have new roadmaps, whatever that is. And we just get things done. And we just live with the same habit. Maybe, let’s put it this way, it’s the same habit that we used to do, right? Maybe it proves successful in the beginning and we just continuously do that without actually monitoring and reviewing the system, right? How the system looks after some time that we go through it. So I think that’s a good reminder to always look at the systems. Think about how we do the work, right?
And I think the concept of platform teams, the self-service thing, it goes back also to team topologies, right? I think this work by Manuel Pais and Matthew Skelton. It’s really quoted in everywhere, I think I would say. You know, the concept of the stream aligned team, and the platform team, enabling team, and things like that. So I think that’s really good concept as well for people to look at how you can actually optimize your organization.
And you kept mentioning about flow, right? Optimizing flow. I think this is also another thing that is really critical for organization to figure out how the flow in your team actually moves. Is it fast or is it slow? And we can use the key metrics from DORA as one indicator. It’s not the true answer of everything, and we don’t have to be accurate. So thanks for sharing that.
[00:51:19] 3 Tech Lead Wisdom
Henry Suryawirawan: So James, unfortunately, due to time, we have to wrap up pretty soon. I have one last question though, which I would like to ask you, which I call this question, three technical leadership wisdom. So you can think of it like an advice that you wanna impart for us here to learn from your experience or your expertise. So can you share, maybe, your three technical leadership wisdom?
James Lewis: Yeah. And this is a big one, right? And you may expect me to come up with some deeply technical kind of set of answers here. But I’m gonna go completely the other way, I think. And these are three things that I’ve learned from people I’ve worked with in ThoughtWorks.
And in particular, I’m gonna call out, I’m gonna quote Daniel Terhorst-North here, cause he’s one of the people that I’ve really learned a lot from. He’s a very, very wise person. And the first thing is that empathy is a superpower, especially as you get more experienced. Being empathetic is absolutely the key to getting things done, I think from a personal perspective. So much so that, you know, I’ll go back to 2008, 2009, working on a team with Daniel. It was when he wrote the article that became the BDD article, that sort of invented whatever it was. He originated BDD, behavior driven development.
And in that, it’s all about being empathetic. It’s all about understanding not just what your team needs, what you need. But it’s about understanding what your other stakeholders need that aren’t just the business stakeholders, that are your stakeholders in security and operations. Your downstream, your upstream folks. Empathy in the large as part of the software development process, but empathy in the small. You know, empathy with the people you are working with. You don’t know what’s going on in people’s lives often. And, being empathetic goes a long way to creating a really fun environment.
And that comes to my second piece of advice or maybe rather it’s a statement. And I would say that seriousness does not equal professionalism. The best projects I’ve ever worked on, the most successful products I’ve ever built, the best teams I’ve worked in, they understood that seriousness was not professionalism. Having fun makes a better team. Having a good time, being empathetic with your colleagues, sharing good experiences; that makes the team perform as much as anything else. And I think we sometimes make the mistake that if we’re seen to be having a good time doing something, then surely we’re not doing it right, you know. But actually often I think it’s completely the inverse. So that would be my second one. Seriousness does not equal professionalism.
And then the final one, which is interesting, and again, this I got from Dan North. It comes from a conversation with Eli Goldratt, the great management consultant who wrote “The Goal” and “Beyond the Goal” and a number of other books. And he was asked about why organizations often fail at adopting new things, new ideas, new innovations. And he says that in his experience, it’s because there are four questions they need to ask, right? And they forget to ask one of them.
So the four questions are, if you’ve got a new thing, whatever that innovation is, microservices, Docker, Kubernetes, Continuous Delivery, you should ask yourself, okay, what are the cool things that this thing gives me? What can this unlock for my organization if we adopt this innovation or technology? And the second question you should ask is, okay, what current limitations in the company will this new innovation help us overcome? And then he says, you should then look and say, okay, what rules do we have to manage our existing limitations? And he says, when people adopt new technologies, often what they do is they stop there.
So as an example, Continuous Delivery would be a good example of this. When you’re adopting Continuous Delivery, that’s gonna give you the ability to speed your time to market, to automate security, audit, all these really cool things, bake quality in, etc, etc, etc, right? Now, what limitations do we currently have? Oh, we’ve got this massive manual testing process, and Continuous Delivery would probably help shorten the amount of time it takes to do that. Okay. And then there’s, what rules do we currently have? Well, you know, the rules are that in order to get into UAT, you have to pass this barrier. In order to get into integration, pass this manual barrier.
And so you adopt Continuous Delivery and the development teams go super, super fast. But you forget to ask the fourth question: what new rules do we need? And the new rules are the important bits, cause if we try and manage new stuff with the old rules, you don’t get anywhere. So in the Continuous Delivery case, you can say, well, we’ll manage our new Continuous Delivery process, but using the same set of manual gates to get through to different environments. And it doesn’t matter if you’ve got a computer doing it, if it still takes you six weeks to get through, that you don’t gain anything. So what are the new rules? Look for the new rules that you need and implement the new rules. That would be my third one.
Henry Suryawirawan: Well, I’ve never heard of these four key questions that you mentioned, right? The ones to ask whenever we want to adopt new things. So thanks for sharing that. And I love the second one. Seriousness does not equal professionalism. So I think, yeah. In some organizations they tend to want to be very strict and follow the process, always hit the target, right? But that doesn’t always equate to professionalism, as you mentioned, right? We also need to create fun. And I think when we work in a fun manner, people tend to thrive, right? They tend to produce the best results.
So James, I think it’s really been a great conversation, and I learned a lot. So if people want to continue the conversation and maybe connect with you, is there a place they can find you online?
James Lewis: Yeah, I mean, the best place is probably via email, if they wanna chat with me that way. I’m james.lewis@thoughtworks.com. I’m also on Twitter @boicy, b-o-i-c-y. And I’m on LinkedIn. I don’t use it that much. So those are probably the three places that are the best.
Henry Suryawirawan: Thank you for this opportunity, James. I hope people also learn a lot from this conversation. So thank you again.
James Lewis: Well, thank you so much, Henry. It’s been an absolute pleasure talking with you and, uh, hopefully I’ll get to see you in Australia again before too long.
Henry Suryawirawan: Yeah. Looking forward to that.
– End –