#179 - Bottlenecks of Scaleups - Tim Cochran & Kennedy Collins
“As a startup, as a scaleup, you often get one chance. If the first impression is something that’s slow, doesn’t work, is down entirely, people will move on and go find some other way to solve that problem.”
Tim Cochran and Kennedy Collins are the co-authors of the “Bottlenecks of Scaleups” series published on Martin Fowler’s website. In this episode, we explore several key challenges faced by scaleups, such as product-engineering friction, service disruptions, accumulation of tech debt, and onboarding. Tim and Kennedy share their experiences and provide actionable advice on fostering collaboration, creating unified roadmaps, ensuring system reliability, and managing technical debt. They also emphasize the importance of efficient onboarding and developer experience in navigating the complexities of scaling up a startup.
Listen out for:
- Career Journey - [00:02:02]
- Definition of a Scaleup - [00:05:29]
- Bottleneck #1: Friction Between Product and Engineering - [00:08:24]
- Healthy Product-Engineering Tension - [00:13:36]
- Unified Product-Engineering Roadmap - [00:18:54]
- Bottleneck #2: Service Disruptions - [00:22:16]
- Cross-Functional Attributes - [00:27:09]
- Bottleneck #3: Accumulation of Tech Debt - [00:32:39]
- Systems Ownership - [00:38:37]
- Bottleneck #4: Onboarding - [00:41:01]
- 3 Tech Lead Wisdom - [00:46:35]
_____
Tim Cochran’s Bio
Tim Cochran is a Principal in Amazon’s Software Builder Experience (ASBX) group. He was previously a Technical Director at Thoughtworks.
Tim has over 20 years of experience working with both scaleups and enterprises. He advises on technology strategy and making the right technology investments to enable digital transformation goals. He is a vocal advocate for the developer experience and passionate about using data-driven approaches to improve it.
Kennedy Collins’ Bio
At Thoughtworks, Kennedy leads product and design for the Central Market of North America. A product manager by trade and a designer by training, he’s most interested in creating (and helping others create) useful and valuable things, be it software or organizational structures.
He’s also a bit of a nerd about strategy, human behavior, health and fitness, productivity, writing, coffee, cocktails, board games, and the history of product management.
Follow Tim:
- LinkedIn – linkedin.com/in/timcochran
Follow Kennedy:
- LinkedIn – linkedin.com/in/kennedycollins
- Twitter / X – @kennedycollins
Mentions & Links:
- ✍🏻 Bottlenecks of Scaleups – https://martinfowler.com/articles/bottlenecks-of-scaleups
- #01: Tech Debt – https://martinfowler.com/articles/bottlenecks-of-scaleups/01-tech-debt.html
- #02: Talent – https://martinfowler.com/articles/bottlenecks-of-scaleups/02-talent.html
- #03: Product v Engineering – https://martinfowler.com/articles/bottlenecks-of-scaleups/03-product-v-engineering.html
- #04: Cost Efficiency – https://martinfowler.com/articles/bottlenecks-of-scaleups/04-costs.html
- #05: Resilience and Observability – https://martinfowler.com/articles/bottlenecks-of-scaleups/05-resilience-and-observability.html
- #06: Onboarding – https://martinfowler.com/articles/bottlenecks-of-scaleups/06-onboarding.html
- 📚 Extreme Programming Explained – https://www.amazon.com/Extreme-Programming-Explained-Embrace-Change/dp/0321278658
- 📚 Refactoring: Improving the Design of Existing Code – https://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672
- 📚 Implementation Patterns – https://www.amazon.com/Implementation-Patterns-Kent-Beck/dp/0321413091
- 📚 The Five Dysfunctions of a Team – https://www.amazon.com/Five-Dysfunctions-Team-Leadership-Fable/dp/0787960756
- Martin Fowler – https://martinfowler.com/
- Kent Beck – https://en.wikipedia.org/wiki/Kent_Beck
- Thoughtworks – https://www.thoughtworks.com/
- Basecamp / 37signals – https://en.wikipedia.org/wiki/37signals
- Stoplight – https://stoplight.io/
- Confluence – https://www.atlassian.com/software/confluence
Definition of a Scaleup
-
The way that I think about this: one way to think about it is as a kind of post product market fit. The other way is through the lens of these problems and these scaling challenges that we’re talking about.
-
And having worked at a number of startups, the problem you face is that no one cares and no one cares and no one cares. And the thing that you’re really facing is awareness and getting those users and getting people to actually use your product. And then one day you have the opposite problem. If you’re successful, if you were so lucky, and then everyone is trying to come in all at once and everyone is trying to use your things and you don’t have enough time to keep up or do anything.
-
We really wanted to think about what are some of those foundational problems that you can solve that you should put in at the beginning? But it’s very hard to find, because we also experienced a number of scaleups where they had had a problem, because they’d over engineered or over invested at the beginning before they had product market fit. So it’s this incredibly hard balance of what things you should put in place that you won’t regret.
Bottleneck #1: Friction Between Product and Engineering
-
We do see it in a lot of our larger companies, although the solution is often slightly different and the cause is usually quite different. And that’s sort of why we wanted to tackle this: the solutions are often very common, but the cause is usually quite different. It’s worth talking about.
-
Because in these larger companies, oftentimes you get this disconnect because of large bureaucratic organizations and organizations that build up all these systems and processes where folks don’t talk to each other. And organizations that get caught up in the expectations of their funders and their capital. Public companies wanting predictability, turning a lot of software development into something about time and about budget rather than about outcome and about results.
-
But that’s not really why this happens in scaleups. This happens in scaleups, in our experience, because people were successful. You had a core team. You had product and engineering, and there were no issues with communication because you just talked. And it’s this fundamental thing that happens when you start to grow, and especially when you start to add new people into the organization, where you don’t build out those systems to allow for collaboration. And collaboration is what allows for autonomous teams, allows for a kind of success.
-
And without that, you end up in this place where the principals, people who started the organization, are still the bottlenecks constantly for product decisions, engineering decisions. And the people within the organizations aren’t aligned, aren’t working together, aren’t working as teams, and that causes a lot of issues. So that was kind of the thing that we identified as why this would be interesting to talk about specifically for scaleups and how to solve it.
-
The biggest warning signs that we look at are really that breakdown of that shared goal. That finger pointing and saying, well, engineering didn’t do this, well, product didn’t do this. When you’re a scaleup, it doesn’t matter who didn’t do it. You’re gonna go out of business if you don’t succeed, so that’s not a really helpful pattern, but it’s really common.
-
The other side of that, too, is oftentimes missed dependencies from a timing perspective. People understand that they rely on each other’s work, which again goes to this idea of going from one team to multiple teams. This idea of engineers not having enough understanding of the goal and the product to be able to make autonomous decisions, be able to move autonomously.
-
And with engineers and also kind of product folks at the team level, not knowing enough about where they’re going and the strategy and the goals to work autonomously and to have things continue to be bottlenecked through the original founders.
-
And then also this thing which is this negotiation between tech debt, infrastructure improvements and technical investment and product and feature and things that ostensibly are adding new features and functionality and bring new people in.
-
There’s this moment where you run into these scaling problems and you need to invest in the technology. You haven’t invested enough. Or the inverse, like Tim said, you’ve baked in a lot of assumptions about your product into the system, and now you’re living with that, and you’re struggling with that.
-
And this conversation about how you make those investments and how you make really smart strategic decisions about what you’re going to let live and what you need to invest in so that you can continue to grow. It’s a conversation that needs to happen as a negotiation of equals. And when that can’t happen, then things really start to break down.
-
As a startup, you’re experimenting. And the only way you’re going to be able to experiment efficiently is that balance of a product idea. And then what’s the quickest way or the leanest way that I can experiment with it? And that only comes through negotiation between product and engineering.
-
We’re not saying that product and engineering should get along all the time. In actual fact, there should be a healthy tension there. A push and a pull, you know, in a good way, because ultimately product will want the feature out as fast as possible and engineering should push back on the quality, the observability, the availability where appropriate.
Healthy Product-Engineering Tension
-
One of the important things to remember is you’re going to fail like over and over and over again.
-
The important thing is having a healthy transparency. And this ability to do blameless postmortems. And to really think about things objectively. Because if you get to that area where you’re siloed, you have protectionism, as a startup it won’t work. You have to look at things objectively and understand. No finger pointing and that kind of stuff.
-
I run into this a lot because we go to a lot of organizations and we talk about cross-functional teams. Everyone’s like we’re cross functional and what probably that meant in some places is they’re just kind of like the floor plan. They just sat teams next to each other, but they weren’t actually like communicating or like actually working together. It was still like a handover, a document handover, and not like, are you actually discussing and brainstorming and literally just pairing and working together?
-
A positive thing is just to try to build that vulnerability of actually creating and retrospecting together. So building that sort of vulnerability, forming that team at sort of higher level.
-
In terms of Agile ceremonies, the retrospective is the only thing that’s required. Everything else is optional. There are some good places to start, but the only thing you really need to do is sit down, make sure you do it. Sit down, talk to each other about how it’s going and what you can do together to change and improve. Everything else you can throw out. That’s the only thing that has to happen, in my opinion. And you can negotiate and figure out the best way of working for you and your team and your goals.
-
One of the things that I think happens a lot: engineering is saying we’re focused on scalability. We’re focused on tech debt. We’re focused on uptime. And product is saying we’re focused on shipping new features. We’re focused on achieving product market fit. We’re focused on bringing a new persona into our product, whatever. All those goals are fine and reasonable a lot of the time. But they are ultimately goals that are designed to further the success of the scaleup, the success of the organization.
-
And if you’re not talking together and setting and making both goals everybody’s goals, and talking about it through the lens of the business value. And at the same time, comparing that as much as you can, apples to apples, and you can’t always. But trying to kind of have a conversation about those things and say, okay, do we need to focus this month, this week, this quarter on five 9s of uptime or on getting a new persona in? Those are really different questions. They do ultimately impact your success as an organization. And so getting really clear and having that conversation together around what’s important and how we’re going to create that value is really important.
-
The other big thing, it’s just making sure everyone understands how your organization creates value. Scaleups, at least commercial ones, are trying to provide a product or a service for people at a cost to them that is lower than what they charge those people, and therefore make a profit. And there needs to be a very clear understanding of that value: what value are you providing to your customer, and what value can you capture back in the form of, usually, revenue if it’s a direct SaaS business.
-
Making sure everyone understands that core business, really the business that you’re in, the exchange of what you’re providing to society and the value you’re getting back from them in order to kind of continue providing that value and then how their team contributes to that. And the inverse is once people start to understand that, in our experience anyway, they’ll come up with great ideas, actually.
Unified Product-Engineering Roadmap
-
It comes back to this idea of fundamentally understanding the business and what you’re doing.
-
Ultimately, every result an organization has is just human behavior. Including often the most important human behavior that we care about as a business, people giving us money. But it’s all just human behavior. Every metric you look at is just human behavior in aggregate. Or it’s a system behavior that was programmed by humans.
-
It’s having that really clear understanding, that value loop or that hypothesis. What behaviors are you supporting? What behaviors are people paying you to support? And then when you have that, it makes it much easier to have this idea of a unified roadmap, at least to have these high-level hypotheses, where you can go top-down from the CTO or whomever.
-
Oftentimes in my experience, there’s a healthy balance between the two. And finding the right balance between those two is a conversation between leadership and the teams. And it’s a conversation between product and engineering. Understand how it ties back to what you’re doing as a business. And that ultimately gives you a common language to negotiate with or to collaborate with and to find that space.
Bottleneck #2: Service Disruptions
-
This is what really hurts when you get to hypergrowth and you haven’t put the fundamental systems in place.
-
There’s a lot of stuff that we’re quite pragmatic about, like availability. It’s easier. It’s almost a non-negotiable. Of course, there’s a spectrum to it, but I think it’s a lot easier to just build those habits and practices in from the beginning rather than trying to do it once you’ve scaled.
-
We have to be a little bit careful about scalability, but scalability in a simple way, not in a complex way. Because right at the moment when you’ve finally gone viral and your product’s caught on, your customer service is disrupted.
-
As a startup, as a scaleup, you often get one chance. Or two or you maybe get another one in five years if you stick around that long. That first impression is often the only impression. If the first impression is something that’s slow, doesn’t work, is down entirely, people move on. People go find a competitor. People will go find some other way to solve that problem.
-
Three 9s of availability. You might say, oh, that’s only 0.1 percent of users who are going to see that. Those 0.1 percent are never coming back. Risk actually is a function of both odds and impact. And stuff that is low odds, but really, really high impact, you should work to avoid. It’s critical.
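As a rough illustration of the "nines" arithmetic, the sketch below converts a number of nines into the share of time (or requests) that is allowed to fail, and the downtime that implies over a year. The function names are made up for this example, and the 365-day year is an assumption.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def unavailability(nines: int) -> float:
    """Fraction of time (or requests) allowed to fail at N nines of availability."""
    return 10.0 ** -nines


def downtime_minutes_per_year(nines: int) -> float:
    """Yearly downtime budget implied by N nines of availability."""
    return unavailability(nines) * MINUTES_PER_YEAR


# Print the budget for two through five nines.
for n in range(2, 6):
    print(
        f"{n} nines: {unavailability(n) * 100:g}% unavailable, "
        f"about {downtime_minutes_per_year(n):.1f} minutes of downtime per year"
    )
```

Three nines leaves a budget of 0.1 percent, roughly 8.8 hours a year, and each extra nine shrinks that budget by a factor of ten.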
-
It’s one of those things that, at least in our experience, is not that hard to build: you just make different decisions every day. It’s not like it’s this heroic effort. It’s a lot of little decisions you make every day, as you’re building software, that allow you to have software that’s instrumented and scalable, that kind of thing. Whereas it becomes a Herculean effort later if you try and do it, and it’s even more expensive, usually.
-
It’s always these trade-offs we think about, but it’s one of the things that is really worth trying to build in from day one. Because ultimately, you’re trying to get to that growth. Ultimately, you’re trying to get to that scale. And resilience and monitoring and understanding, obviously in a reasonable way, not gold plating anything, is really worth your time.
-
In the realm of service disruption, it’s monitoring and observability; and monitoring not just at an infrastructure level, but at a business metric and a product metric level. Otherwise you’re in the dark, because you’re doing experiments but you have no data to go off. So I think that sort of experimentation infrastructure is important for a scaleup.
-
And then I have to say testing. We’re big fans of TDD. As a scaleup, you will have to change the software a lot. Unless it’s a one-off throwaway experiment where you’re just trying to see what happens, and that you know you’re going to toss and kind of rebuild later, good automated testing around that is really critical to that ability to change safely, that ability to tweak, that ability to experiment, that ability to evolve your product.
-
If not, you end up in this place that we see so often where you hit that hyper scale. And then the first thing you have to do is a massive rewrite of your core platform. And then you’re not shipping any new features for six months or a year, and then you get lapped by your competitors. It happens. We see it a lot, actually. And if you build in the ability to change through testing from day one, it really kind of makes that ability to evolve possible.
Cross-Functional Attributes
-
We don’t like the term non-functional requirements. We use the term cross-functional requirements. A lot of it cuts across different specific pieces of functionality, but it is critical to the ultimate success of the product.
-
One of the things that I’ve been on about is product people participating in the setting of SLAs or SLOs. Product should be part of that conversation, because thinking about the customer impact and the relative risk of downtime and the relative risk of service disruptions is important. This goes back to this idea that risk is a function of both impact and odds. And if it’s low odds but really high impact, that’s one thing. If it’s something that’s like really low impact, it doesn’t even matter that much. Invest where it’s important, where it’s going to really impact your business.
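One simple way to make the idea of risk as a function of odds and impact concrete is to multiply the two and rank the results. This is a minimal sketch, not a method described in the episode: the `Risk` class, the ranking function, and every number below are hypothetical and purely illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    odds: float    # probability of the event in a given period, between 0 and 1
    impact: float  # cost if it happens, in whatever unit the team agrees on

    @property
    def expected_loss(self) -> float:
        """Odds multiplied by impact: one simple model of risk."""
        return self.odds * self.impact


def rank_by_expected_loss(risks: list[Risk]) -> list[Risk]:
    """Return the risks sorted from highest to lowest expected loss."""
    return sorted(risks, key=lambda r: r.expected_loss, reverse=True)


# Hypothetical, made-up numbers purely for illustration.
risks = [
    Risk("checkout outage during launch", odds=0.02, impact=500_000),
    Risk("slow admin report page", odds=0.50, impact=1_000),
    Risk("broken avatar upload", odds=0.10, impact=200),
]

for risk in rank_by_expected_loss(risks):
    print(f"{risk.name}: expected loss {risk.expected_loss:,.0f}")
```

Expected loss is only one possible model. For a scaleup, an event that could end the business may deserve more attention than the product of the two numbers suggests, which is a conversation for product and engineering to have together.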
-
Often, I’ve seen the non-functional, cross-functional label used as an excuse for engineers to build the most fanciful systems, because it has to be like absolutely real time or something, and they want some sort of event sourced, event driven system. And because they want to build it, they’re just trying to justify it.
-
There’s a lot of pragmatism that I think is important. And it’s from both sides. Sometimes, a developer will ask a product manager, should it be in real time? And they’ll probably just say yes, unless they’ve properly understood what that trade-off is. We’re probably talking about a millions-of-dollars decision right there and then. With real time, there’s such a difference. Every nine is another order of magnitude more expensive. So what does it need to be?
-
This tax can be hidden quite a lot, especially with scaleups. You often have a core team and unless you’ve got a good product management system, they may be just doing a lot of work to keep the system up. We see this quite a lot. The core team, they build it, they can solve problems really quickly. And perhaps they don’t need to build automations at the beginning, because they can just solve it.
-
But the problem is, over time, those issues increase. And then, someone’s asking, well, why are we not building features fast enough? And it’s actually because a lot of this is a tax on the team. And often it’s just unknown, because it doesn’t show up in the product management system.
-
You don’t see that unplanned work. You talk about how fast the planned work is happening, but you don’t talk about what’s happening with the unplanned work and the volume of that unplanned work.
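To make unplanned work visible, a team could log it alongside planned work and look at its share. The sketch below assumes a very simple work log; the entries, field layout, and function names are hypothetical and not taken from any particular tracking tool.

```python
from collections import Counter

# Hypothetical work log for one iteration: (description, kind, hours).
work_log = [
    ("new onboarding flow", "planned", 30),
    ("restart stuck queue workers", "unplanned", 6),
    ("manual data fix for a customer", "unplanned", 9),
    ("pricing page experiment", "planned", 20),
    ("investigate paging alert", "unplanned", 5),
]


def hours_by_kind(log):
    """Sum the logged hours per kind of work."""
    totals = Counter()
    for _description, kind, hours in log:
        totals[kind] += hours
    return totals


def unplanned_share(log) -> float:
    """Fraction of tracked hours spent on unplanned work (0.0 for an empty log)."""
    totals = hours_by_kind(log)
    total_hours = sum(totals.values())
    return totals["unplanned"] / total_hours if total_hours else 0.0


print(f"Unplanned work: {unplanned_share(work_log):.0%} of tracked hours")
```

Tracking this share over several iterations gives product and engineering a shared number to discuss when feature delivery seems to slow down.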
-
A lot of that is just not lying to yourself. That sounds simple, but it’s like being honest with yourself. And this goes back to this thing we talked about: blameless cultures and retrospectives and really talking about what’s going on in a blameless way.
-
If you’re not having those conversations in a meaningful and transparent way, it all just gets hidden or pushed under the rug, or you don’t want to bring it up, and then it doesn’t get addressed.
Bottleneck #3: Accumulation of Tech Debt
-
Some of the warning signs are things like development slowing down. Some of it could be in the impact to the end user. Some of it’s going to be harder to track unless you’re really tracking developer satisfaction. It can be problems like this: often we’d go to a client and there’ll be some strange, unique system that they’ve created that everybody complains about. And perhaps it was not documented in a very good way, or it’s some legacy or something. And those kinds of things really hurt: they can hurt onboarding, they can hurt maintenance.
-
The other thing is it’s about engineering satisfaction, because if a developer is kind of like complaining about tech debt and they have no autonomy or no ability to reduce that tech debt, then that’s going to affect productivity.
-
Often a pragmatic developer understands that it’s debt, and it’s something that we’ve taken on. But if there’s no conversation about it, or there’s no light at the end of the tunnel, that’s when it can become an issue. And that’s when it’s probably imbalanced in that sort of product engineering balance. Maybe it’s leaning more towards the sales team, or the product team is sort of making all the decisions.
-
A lot of that time you could spend on paying down tech debt is spent on keeping the lights on. That’s often a good signal that it’s time to invest a little bit more because you can buy that time back and buy that scale in the future and it’s worth taking the time now to pay down that debt a little bit.
-
The other thing is to think about onboarding and to use onboarding as a moment to identify some of that stuff. Like if you’re successful and lucky in scaling, you’re often hiring. And there are new folks coming in frequently. Asking them and setting an expectation that they are sensitive to things in your systems is really helpful. Because you only get those new eyes once.
-
Having those new people and asking them, making it part of the onboarding to identify some of those things, things that they don’t understand or that they think are strange, so that you can use that as a feedback loop, is really helpful.
-
The other part of that, there’s a certain quality of work or pride of work that becomes important in this space. And this goes back to the satisfaction thing, like there’s that pride of work. And if you’re able to instill that, it creates a lot of value. It’s almost like camping: this idea of cleaning up after yourself while you’re camping, leaving it cleaner than you found it, when you’re refactoring, that kind of stuff.
-
One of the things that I do a lot as a product manager is often set a budget where that’s the team’s carte blanche to spend on tech debt and specifically usually tech debt [that] is impacting their day to day, their quality of life.
-
Oftentimes, to your point about a unified roadmap, there are larger technical initiatives or larger infrastructure, reliability, whatever initiatives that should come in on a unified roadmap. But there’s also this budget of just like, this variable is named six different things in six different places. We should probably get some consistency here.
-
Setting that budget, I think is really helpful. And that budget should be negotiable as well. If there’s a big crunch time, we need to turn that to zero right now, that’s fine. But what’s the payback?
-
Finding a way to make that time and do that cycling back and that cleanup is really helpful. And kind of framing it that way. It’s like, hey, engineering leaders, this is your place to make decisions. And hopefully, the engineering leaders are devolving that to the team too, so they can clean up after themselves and do that stuff that makes them proud of their work.
-
Sometimes tech debt isn’t a very useful term, because while it’s a sort of good analogy, it becomes this sort of amorphous blob of stuff. You should be describing what is actually in the backlog, and the product team should take time to actually understand it. And the tech team should take time to explain it as well. But if there’s just this list of tech debt that the product team doesn’t understand, then you can’t have that negotiation kind of thing. Sometimes it’s not very helpful. You just got to be careful not to just bucket everything. Because of the stuff that we see in tech debt backlogs, not a lot of the things will actually affect a customer’s experience.
Systems Ownership
-
It can be difficult at the beginning, because you have this small team where everybody owns everything. And probably at the beginning, you may have an individual who would probably build something. But I think it’s important as you scale that you understand your technical landscape and that everything has an owner. That is probably something we’d also say is a sort of non-negotiable: making sure that everything has an owner and is assigned to someone and someone’s aware of it, even if they’re not working on it.
-
And then this goes back to that pride of ownership, pride of work thing. You care about things you own, and you care less about things you don’t own. So it’s just having that ownership, and also having engineering leadership that checks in.
-
Following up and making sure that the people who are owning it are actually owning it, and not just have their name next to it on a sheet of paper somewhere, and aren’t actually doing the things you need to do to really own something.
-
The original metaphor that the term tech debt comes from is this idea that, in the same way that I took out a mortgage to buy my apartment, that debt allowed me to do a thing I wouldn’t otherwise be able to do, but I do have to pay that down. And if I don’t pay it down, the bank’s going to take my house.
-
It’s this balance, making some decisions right now to get some leverage when you wouldn’t otherwise be able to get that leverage, which, especially as a scaleup, is really, really critical. But also not getting so over leveraged that the smallest issue blows everything up. Or that, you know, you’re spending your entire paycheck on debt service instead of actually saving some money or, you know, investing in things for the future.
-
Calling everything tech debt and throwing your hands up is not really helpful, but that metaphor is actually a really useful frame to have conversations about this stuff.
Bottleneck #4: Onboarding
-
The last step of the onboarding checklist should be to provide feedback on the onboarding checklist, so that for the next person, it’s better.
-
In the article, we have this kind of optimal onboarding timeline. And it’s a very extreme example. It’s extreme, but it’s based on real world experience. We work for companies that have amazing onboarding experiences. And I do believe that those companies can pivot quickly, can scale up faster.
-
One of the things we often think about when we think about onboarding is that it’s given short shrift because it’s treated as the new hire orientation. The point we make in the article is that it’s actually a key business capability. And it can drive a lot of your scale.
-
A few tips. Deploying on your first day. Having a workstation, having your dev environment set up and good to go. Some research from Microsoft, they were noticing that people that deployed early actually were more productive in their career. They thought it was because, oh, they must’ve done like a story early and got into like learning the domain and all that kind of thing. But actually what it was is more, they were doing a trivial change, and it was because they were able to talk the language of the team. I recommend doing a trivial change on the first day just to make sure the environment’s set up and then you start to understand all the different systems and those kinds of things.
-
A lot of the onboarding thing is like I am improving onboarding, but I’m also improving the developer experience and effectiveness of the team. The things that a new developer will find hard, existing developers are probably also finding hard, but not to the same extreme. If you improve knowledge, if you improve the developer experience, the friction, the communication tools, it’s going to improve for everyone. So it’s not right just to think about it like an onboarding or a new hire orientation. It’s really improving the effectiveness of the whole team this way.
-
Beyond the sort of technical aspects, a lot of it is about building that ability to collaborate. And quickly getting to know your team, working with your team, getting to know your cross-functional counterparts. Some of those things that we have in our checklist are just really understanding what the company mission is and what the business goals are, like actually having a leader present that. Having an OKR, those kinds of things.
-
At one of the scaleups I worked at, I was the Head of Product and I personally did the business context onboarding for every single new hire far after you would think I would have stopped doing that. Because I found it incredibly valuable. If I can spend an hour of my time or 90 minutes and increase this person’s effectiveness by 10 percent over the course of the time they work here, that’s an incredible return on investment. Even though, yes, it’s hard for me to find 90 minutes a lot of the time.
-
I made that a priority and just did it a lot of the time. Cause it’s really, really, really, really valuable. To make sure you understand and can ask those questions. They can ask those questions from someone who has a more holistic view. Having more senior folks do that, and then your team can onboard into your team context, but getting that holistic view so you can understand what’s happening outside of your team context is really helpful.
-
This idea of onboarding as not being just for the person who is being onboarded, but also for everybody, and also kind of building this set of things that is most important for the new person, but it’s helpful for everyone. Those onboarding documents are often so helpful for adjacent teams who need to understand about what you are doing who are new to collaborating with you. That onboarding happens constantly, especially as you grow.
-
If you can go read those things and then go to the conversation, rather than showing up and going, so I hear you guys are the contracts team, what does that mean? That’s such a wildly different experience to showing up knowing what the software does today, knowing what they think their goals are, knowing what their roadmap is in the near future. That makes it such a different conversation. And that’s helpful for onboarding, but it’s also helpful for everyone.
-
And the other thing too is scaleups change so fast that a team you thought you knew what they were doing six months ago or a year ago is completely different now. Even if it has the same name. And again, that being able to re-onboard yourself or re-orient yourself into what your colleagues are doing, it’s just very, very helpful. And using onboarding of new hires as a lens or a forcing function to improve all of that is really key.
3 Tech Lead Wisdom
-
When we talk about knowledge and sort of collaboration and that kind of stuff, a lot of times it comes down to documentation.
-
One of the things is the best way, especially for a junior developer, to kind of learn is via observation. When you document something, you write down what you think is important. But actually, what might be really important is, you know, how the shortcuts, how does someone arrange their tabs, these things.
-
Especially with onboarding, having opportunities for developers to observe other developers, especially senior developers, is incredibly important, whether that’s pair programming, mob programming, those kinds of things.
-
It’s not just for developers. Pairing as product people, pairing as designers is really, really valuable too. In the same way, how you navigate through problems, how you navigate through finding information, seeing those things.
-
And to Tim’s point, like the stuff you write down is the stuff that you think is important, but oftentimes, the stuff that you do every day is often the more valuable stuff and you just don’t even think about it because it’s so automatic.
-
-
It’s those little things that you do every day that add up to success.
-
A lot of these things that we talk about, they’re not big initiatives. They’re not big efforts and they aren’t successful if they are big efforts. It’s stuff you show up and do every day. It’s those little things that you do every day that add up to success.
-
Making sure that you’re paying down tech debt every so often, making sure that you are showing up and taking the time to think about how the work you’re doing is going to impact the overall goals of the organization. Not lying to yourself about the impact of the things you want to do because you want to do them.
-
All that, it’s simple, but it’s hard. And that’s the stuff that is surprisingly valuable and surprisingly difficult. But it’s also one of the things that I find heartening about that is because it’s simple, but difficult. Anyone has access to it.
-
It’s not this thing that you need to be a genius to do. It’s just a thing you need to, the commitment and the discipline and the showing up every day to give a shit to do it.
-
-
Particularly, if we’re talking about scaleups. You may have some exotic, complex technology that should be in the core feature of your system, but everything else should just be simple.
-
What’s the simplest way that I can build whatever supporting systems I need and not waste any time on it? And put all your energies into those complex things.
-
As an engineer, you get excited about solving a problem or creating a rules engine or whatever it is. Ideally, you can use a third party or something. But if you can’t, how can we do it in the most simple way and still scalable?
-
One of the best pieces of advice I got when I started my first company was startups don’t starve, they drown.
-
The thing that will kill you is a lack of focus, not focusing too hard on one plan and not being successful in that place. You obviously have to stay honest with yourself and pivot when you need to, but you can’t pivot if you’re not going in a direction. That’s not what pivoting means.
-
If you’re just spinning, it doesn’t matter. If you’re just trying to do everything all at once, it doesn’t matter. Focus on what the core is, the thing that’s really the value you’re creating. Make everything else as simple as possible. Just don’t worry about it.
-
[00:01:30] Introduction
Henry Suryawirawan: Hello, guys! Welcome to the Tech Lead Journal podcast. Today, I have with me Tim Cochran and Kennedy Collins. Both are some of the co-authors of this series called “Bottlenecks of a Scaleup”, right, published on Martin Fowler. And I believe when they wrote it, they were in ThoughtWorks. Although now probably it’s a little bit different. But I think I find these articles really, really insightful, especially for those of you who are going through this journey, you know, working in scaleups, there are so many challenges, there are so many things that you have to take care of. So welcome to the show, Tim and Kennedy.
Kennedy Collins: Thank you.
Tim Cochran: Hi.
[00:02:02] Career Journey
Henry Suryawirawan: So maybe before we start with all the discussions about the bottlenecks, I’ll let you introduce yourself a little bit and maybe share us highlights or turning points that you think we all can learn from you.
Kennedy Collins: Sure. I can go first. Kennedy Collins, based in Chicago. Product manager by trade, I guess, although my background is in architecture. I studied buildings. But yeah, in terms of how I kind of ended up in this space and this topic, I graduated, uh, university into kind of right after the financial crisis of 2008, which led to a really bad time to build buildings, right? And so I had been doing various bits of marketing work and various kind of marketing and design things here and there since I was like in high school, just kind of working here and there. And so I ended up getting a job with a startup here in Chicago, because it was a bad time to do buildings. And grew that into a product management role. And then spent about three or sort of five years or a little bit more working in startups in Chicago.
And then from there, joined ThoughtWorks. And it was a very interesting transition, because I had been in startups, including one that I started myself, for my entire career at that point, pretty much. And then not only was ThoughtWorks the biggest company I’d ever worked for, I was, you know, my first client was bigger than ThoughtWorks, the biggest company I’d ever worked for. But actually the ThoughtWorks team on that client was about the same size as the largest company I’d ever worked for. So it was a big transition. And then from there, I ended up spending a number of years working with some of our public sector clients, including for The Department of Veterans Affairs which is literally the largest federal government agency. So I went from this tiny company thing to this giant company thing. Now, I bounced around a little bit inside of ThoughtWorks as well, but I’m currently leading our product design and development practice for North America. That’s me.
Tim Cochran: So I worked for ThoughtWorks for 19 years. I think my career highlight was before ThoughtWorks, I worked in a company that was exploring Agile. And I think, you know, I bought the book by Kent Beck, Extreme Programming, and Refactoring, and the Design Patterns book. And I think that led me down the route of researching and then applying to ThoughtWorks and getting in. And it was a great company to learn your craft as a developer. During that time, lots of innovation coming out of the company. And how I ended up working with scaleups is, so I was working in the East market. And the nature of some of ThoughtWorks principles like open source and diversity and things like that led us to work with a number of scaleups or more like scaled digital companies, so companies like Spotify and Etsy.
The thing with ThoughtWorks is you never get a project, they don’t call you if everything’s going well. So it’s always a problem. So we got called into a few, like scaling companies that were scaling or scaleups that had a massive bottleneck. And then at the same time, I was also working at enterprises, right? So it led me down this route of sort of comparing all these things of like these scaled digital natives, these enterprises that were struggling to modernize, and then these companies that were really bottlenecked. And so that got me interested and I petitioned to sort of run a studio at ThoughtWorks and it led to a lot of research and a lot of talking to people. At ThoughtWorks, we have tons of different backgrounds and people that had worked in different areas. And so doing a lot of research with people like Kennedy to try to create these articles. And worked together on projects. Like all the articles were based on real world experience.
[00:05:29] Definition of a Scaleup
Henry Suryawirawan: Thanks for sharing some of your journey, right? So I think, definitely, when we talk about scaleups, right, there’s so many challenges. But before we go through that, maybe if you can level set, what is scaleup, right? And at which point of a startup journey that you can call yourself scaleup? I think that will be great for the listeners.
Kennedy Collins: Sure. I think we probably each have our own slightly different definitions, but the way that I think about this is like, it’s one way to think about it is kind of post product market fit. But I think the other way to think about it is kind of through the lens of these problems and these scaling challenges that we’re talking about, right? And having worked at a number of startups, the problem that you face is that no one cares and no one cares and no one cares. And the thing that you’re really facing is awareness and getting those users and getting people to actually use your product. And then one day you have the opposite problem. If you’re successful, if you were so lucky, and then everyone is trying to come in all at once and everyone is trying to use your things and you don’t have enough time to keep up or do anything.
I think about this for a company that I worked at, right? We were, as a media company, and we launched a new product. We launched a new website. And we had 10,000 users, 20,000 users, 30,000 users, 100,000 users, 10 million users, 40 million users. That’s monthly numbers. And it just like went, phttt! And then everything started melting all the time. And then also it became really, really hard from a product perspective and also we raised 25 million Series B at that point, which also led to a bunch of other issues, because now we have, as opposed to operating from this place of scarcity, we’re operating from this place of surplus and how do we make good choices.
And so it’s really that transition, and the period after that transition, which is one of the reasons why we wanted to write these articles and why I was really excited when I found out Tim was doing it. Because it is such a dramatic mindset shift in that early kind of scaling portion that it’s challenging. And I think it’s a hard place to navigate. I don’t know. Would you agree with that, Tim?
Tim Cochran: Yeah, I do. And I think, you know, a lot of folks obviously think about hypergrowth. And it’s the desired state, I think. But I think also like there could be companies that are just growth, not necessarily hypergrowth. Like ThoughtWorks is a good example of that, like we had our own scaling problems, but I don’t think it was hypergrowth. So it’s interesting. But as Kennedy says, that sort of post-product market fit. And yeah, we really wanted to think about like, you know, what are some of those foundational problems that you can solve, that you should put in at the beginning? But it’s very hard to find, because we also experienced a number of scaleups where they had had a problem, because they’d over engineered or over invested at the beginning before they had product market fit. So it’s this incredibly hard balance of what things you should put in place that you won’t regret.
Kennedy Collins: Right. Yeah.
Henry Suryawirawan: Yeah, I think it’s always challenging, right? So as a startup, you want to get the product market fit, so you build as much as you can, right, to get the product market fit. But sometimes when you are successful, right, you kind of like lose the balance a bit. And like what Kennedy said, somehow, there’s a point in time where it just flips. All these concerns suddenly become more important.
[00:08:24] Bottleneck #1: Friction Between Product and Engineering
Henry Suryawirawan: So in the series, there are six bottlenecks that are identified for scaleups. Probably, we won’t be able to cover all of them, but let’s just start with some of those things that I find really interesting, based on my experience as well working in the scaleup. The first one is about friction between product and engineering. I think we can see it not just in scaleup, right, in many organizations, sometimes this becomes a friction. Maybe we’ll try to cover by covering what are the warning signs, how can people identify if there are major frictions, and what are some solutions? So maybe let’s start with the product and engineering friction.
Kennedy Collins: This is the one that I wrote. And I helped edit some of the other ones kind of stuff. But this the one that I was one of the two main co-authors on. So yeah, I can start talking about this.
This is really something that we started talking about because we do see it in a lot of our larger companies, right? Although the solution is often slightly different and the cause is usually quite different, right? And that’s sort of why we wanted to tackle this is because, well, the solutions are often very common but the cause is usually quite different. It’s worth talking about. Because in these larger companies, oftentimes you get this disconnect because of large bureaucratic organizations and organizations that build up all these systems and processes where folks don’t talk to each other. And organizations that get caught up in the expectations of their funders and their capital oftentimes as well, right? Public companies wanting predictability, turning a lot of software development into something about time and about budget rather than about outcome and about results.
But that’s not really why this happens in scaleups. This happens in scaleups in our experience because people were successful, right? You had a core team, you had, you know, it was me and Tim starting a company. We had product and engineering and there was no issues with communication, cause I just talked to Tim. And it’s this fundamental thing that happens when you start to grow and especially when you start to add new people into the organization where you don’t build out those systems to allow for both collaboration. And collaboration will allow for autonomous teams, allow for kind of success, right? And without that, you end up in this place where the principals, people who started the organization, are still the bottlenecks constantly for product decisions, engineering decisions. And the people within the organizations aren’t aligned, aren’t working together, aren’t working as teams, and that causes a lot of issues. So that was kind of the thing that we identified as why this would be interesting to talk about specifically for scaleups and how to solve it.
Yeah, the biggest warning signs that we really look at are really that breakdown of that shared goal, right? That finger pointing and saying, well, engineering didn’t do this, well, product didn’t do this. Well, I mean, I don’t know, when you’re a scaleup, like, it doesn’t matter who didn’t do it, you didn’t do it, and you’re gonna go out of business if you don’t succeed, so that’s not really a helpful pattern, but it’s really common.
The other side of that, too, is oftentimes missed dependencies from a timing perspective. People don’t understand that they rely on each other’s work, which again goes to this idea of going from one team to multiple teams. This idea of engineers not having enough understanding of the goal and the product to be able to make autonomous decisions, be able to move autonomously, right? And with engineers and also kind of product folks at the team level, not knowing enough about where they’re going and the strategy and the goals to work autonomously and to have things, again, continue to be bottlenecked through the original founders.
And then also this thing that Tim brought up a minute ago that I think is really interesting to talk about, which is this negotiation between tech debt and this negotiation between infrastructure improvements and technical investment and product and feature and things that ostensibly are adding new features and functionality and bring new people in, right?
It’s like we were talking about a minute ago, and I’d love for Tim to talk about this a little more because he’s much more articulate about it than I am. There’s this moment where everything, you know, where you run into these scaling problems, right, and you need to invest in the technology. You haven’t invested enough. Or the inverse, like Tim said, you’ve baked in a lot of assumptions about your product into the system, and now you’re living with that, and you’re struggling with that. And this conversation about how you make those investments and how you make really smart strategic decisions about what you’re going to let live and what you need to invest in so that you can continue to grow. It’s a conversation that needs to happen as a negotiation of equals. And when that can’t happen, then things really start to break down.
Tim Cochran: Yeah, I mean, it’s important, right? Like as a startup, you’re experimenting, right? And the only way you’re going to be able to experiment efficiently is that balance of a product idea. And then what’s the quickest way that I, or the leanest way that I can experiment with it. And that only comes through negotiation between product and engineering. And it’s interesting, but we’re not saying that product and engineering should get along all the time. In actual fact, there should be…
Kennedy Collins: There’s a healthy tension there.
Tim Cochran: Yeah, a tension, right? A push and a pull, you know, in a good way, right? Because, you know, ultimately product will want the feature out as fast as possible and engineering should push back on the quality, the observability, the availability, right, where appropriate.
Henry Suryawirawan: Yeah, I think one of the biggest disconnects as well that I can see, right, as both organizations grow, they kind of like focus on their functional silo most of the time, right? And people have different priorities. Maybe sometimes, as they get larger, right? Kennedy mentioned bureaucracy and process. So things probably start getting slower. There’s a lot of bureaucracy. And I think there are a lot of cases as well where sometimes engineering doesn’t understand why the product wants it a certain way and also the other way, right, product doesn’t understand why engineering thinks some of the things are important.
[00:13:36] Healthy Product-Engineering Tension
Henry Suryawirawan: So maybe if you can give us some pointers, like what are some of the solutions that you can propose for us to have lesser friction? Some friction is good, right, just like Tim mentioned. But how can we have the healthy friction, healthy tension?
Tim Cochran: Well, I mean, I think one of the important things to remember is you’re going to fail like over and over and over again. So the important thing is like this sort of like having this healthy transparency. And this kind of like blameless…
Kennedy Collins: Absolutely. …
Tim Cochran: ability to do blameless post mortems. And to really think about things objectively. Because if you get to that area where you’re siloed, you have protectionism, as a startup, it can’t, it won’t work. Because you have to be a Teflon. You have to look at things objectively and understand. No finger pointing and that kind of stuff. So I mean, one of the big things we see a lot is this. I run into this a lot because we go to a lot of organizations and we talk about cross functional teams. Everyone’s like, oh, we’re cross functional and what probably that meant in some places is they’re just kind of like the floor plan. They just sat teams next to each other, but they weren’t actually like communicating or like actually working together. It was still like a handover, a document handover, and not like, are you actually discussing and brainstorming and literally just pairing and working together?
Yeah, so I think, not to an anti-patterns, but a positive thing is just to try to… and that kind of requires a lot of work, right? To build that vulnerability of like actually creating and retrospecting together. So building that sort of vulnerability, forming that team at sort of higher level.
Kennedy Collins: Yeah, we love retrospectives for that in particular, right? I used to joke, I still do actually, that in terms of like Agile ceremonies, whatever. People talk about like big A Agile and you have to do all this stuff. The retrospective is the only thing that’s required. Everything else is optional. There are some good places to start, but the only thing you really need to do is sit down, make sure you do it. Sit down, talk to each other about how it’s going and what you can do together to change and improve. Everything else you can throw out. That’s the only thing that has to happen, in my opinion. And you can negotiate and figure out the best way of working for you and your team and your goals.
Which takes me to the other big thing that we talk about a lot, right, which is goals. One of the things that kind of I think happens a lot, that we see a lot is where engineering ends up with their own goals, right? The engineering is trying to say, we’re focused on scalability. We’re focused on tech debt. We’re focused on uptime. We’re focused on whatever, right? And product is saying we’re focused on shipping new features. We’re focused on achieving product market fit. We’re focused on bringing in a new persona into our product, whatever, right? All those goals are fine and reasonable a lot of the time. But they are ultimately goals that are designed to further the success of the scaleup, the success of the organization, right?
And if you’re not talking together and setting and making both of those goals everybody’s goals, and both thinking about the business value of things like scalability, which there is a lot of, don’t get me wrong. And talking about it through that lens. And at the same time, comparing that as much as you can, apples to apples, and you can’t always. But trying to kind of have a conversation about those things and say, okay, do we need to focus this month, this week, this quarter on five 9s of uptime or getting a new persona in? Those are really different questions, but they do ultimately impact your success as an organization. And so getting really clear and having that conversation together around what’s important and how we’re going to create that value is really important.
The other big thing that we think about a lot too, and this doesn’t, I’m going to say this, but this doesn’t have to be heavyweight, right? It’s just making sure everyone understands how your organization creates value. Every scaleup, at least commercial ones are trying to provide a service for people, provide a product or a service for people at a cost to them that is lower than they charge the people and therefore make a profit. And that value, there’s a very clear understanding of what that value, what value are you providing to your customer and what value can you capture back in the form of, usually, revenue if it’s a direct SaaS business. Or when I worked in media, then the value we’d get is people looking, and then we’d sell that to advertisers. And then, you know, that’s the value loop.
But making sure everyone understands that, that core business, really the business that you’re in, the exchange of value and what you’re providing to society and the value you’re getting back from them in order to kind of continue providing that value and then how their team contributes to that. If you don’t understand that, then it’s really hard to have these conversations, right? And the inverse is once people start to understand that, in our experience anyway, they’ll come up with great ideas actually around, hey, I think we could do a better job. Or, you know, for example, I think we could do a better job getting this value if we do this or that or the other thing, creating this value, ensuring that we create this value for folks.
So yeah, those are some of the key things. It sounds a bit vague. One of the things we mentioned in the article too is there’s a book called The Five Dysfunctions of a Team by Patrick Lencioni. It’s a good book. We like it a lot. I like it a lot. And it’s a good, I think it’s a good starting point just to talk about psychological safety like Tim said, and like finding that place so that team can really, really collaborate. And then you get into autonomous teams and structure and stuff, but we can get into that some other time. That’s a longer discussion.
Henry Suryawirawan: Right. I think there are so many insightful things, right? I must admit as well, last time when I worked in a scaleup, right? So we used to have the engineering roadmap, so called engineering initiative, right? And also the product initiative, like sometimes they can be totally independent, right? And I think all these blameless things are definitely very important, right? We want to have like no finger pointing or, you know, blaming why certain things are done this way.
[00:18:54] Unified Product-Engineering Roadmap
Henry Suryawirawan: But I think if I can pick one particular thing that if you can give us more suggestion is how to come up with a unified roadmap right? Because sometimes it comes from the top, you know, the CEO just said, oh, yeah, we want to increase revenue, increase users, and that gets translated usually with more features. How do you get this mixed balance of product and engineering priorities?
Kennedy Collins: So I think that comes back to this idea of fundamentally understanding the business you’re in and what you’re doing, right? And when I say that, I mean like every organization, ultimately, every result an organization has is just human behavior. Including often the most important human behavior that we care about as a business, which is people giving us money. But it’s all just human behavior, right? And it’s all just aggregate human behavior. Every metric you look at is just human behavior in aggregate. Or it’s system behavior that was programmed by humans or whatever, right? It affects human behavior at the end of the day. And thinking about really, and to your point, right? Like, just to rag on my own side of the house for a little bit, right? People saying, we want more users. Great, man! It doesn’t matter. Like I want a lot of things, it doesn’t help.
What’s your theory for what you can provide to those users to get them in the door, right? It’s having that really clear understanding of that value loop or that hypothesis. We can hypothesize, right? But hypothesis is not this idea of, well, if we do this, then we will get more users. More users is not the output of the hypothesis. The output of hypothesis, if we build this feature, then more people will show up every day. Because they want that value, right? It’s a slight difference, but it’s important. Anyway, point being, what behaviors are you supporting? What behaviors are people paying you to support? And then I find that when you have that, it makes it much easier to have this, you know, this idea of unified roadmap, at least to have these high level hypotheses, where you can have from the top down from the CEO, from the CTO, from whomever.
And oftentimes in my experience, there’s a healthy balance between the two, right? Where oftentimes you’re seeing this idea from leadership of pushing into adjacent areas, pushing into new products, new extensions, new markets. And then where you do expect the teams, say, hey, we should try and target. We think our product would be good for this other persona we don’t really target right now. If we did some stuff, let’s go dig into that. But you also don’t need to provide more detail than that. Although you can, right, team ownership of that too.
And then at the team level, teams kind of saying, hey, we think that we can improve our chunk of this if we do this, right? And finding the right balance between those two, again, as a conversation between leadership and the teams. And it’s a conversation between product and engineering to also say these technology initiatives, this goes to kind of one of the next things that one of the other articles that we wrote which is about service disruptions, right? Service disruptions are hugely painful for scaleups. And a lot of that has to do with not thinking about the fact that five 9s is not just a technical idea. It really means that people can’t get the value they’re trying to get from your product for many days or weeks or hours or whatever, depending on how many 9s you got.
And having that conversation through that lens of, well, how does this enable us? Or, you know, and if the answer is that you’re working on developer tooling to get the second order effect of being able to help our customers in an abstract way more quickly, that’s also part of it. But understand how it ties back to what you’re doing as a business. And that ultimately gives you a common language to negotiate with or to collaborate with and to find that space.
Henry Suryawirawan: Yeah, I find that the gist is always like understanding the business value, right? Understanding how the organizations can actually move the needle, right? Instead of, you know, just focusing on the functional aspect of each of the team, right?
[00:22:16] Bottleneck #2: Service Disruptions
Henry Suryawirawan: You just mentioned about service disruption. Maybe let’s go to that bottleneck, right? I also find that a lot of scaleups will have this challenge. For example, downtime or the system throws a lot of errors, right? Users complaining. There could be various other things, right? But tell us why this bottleneck is also so important for us to think about?
Tim Cochran: Why? Because this is what really hurts when you get to hypergrowth and you haven’t put the fundamental systems in place. I worked in the crypto space for a little bit. And yeah, when certain tweets went out, the systems all used to come down and losing millions of dollars, because they designed it in not a great way, right? So it’s interesting. Like, I think service disruption is like, there’s a lot of stuff that we’re quite pragmatic about, but I think almost like availability. It’s easier. It’s almost a non-negotiable. Of course, there’s a spectrum to it, but I think it’s a lot easier to just from the beginning to build those habits and practices in rather than trying to do it once you’ve scaled. It’s sort of a nightmare. So if you start to build it in a way that you are thinking about…
You know, I think we have to be a little bit careful about scalability, but scalability in a simple way, not in a complex way. But if you’re thinking about those things and high quality code and those kind of things. But yeah, I mean, it’ll take your business down, right? Like, because right at the moment when you finally, you’ve gone viral, your product’s caught up and then your customer service is disrupted.
Kennedy Collins: Yeah. And one of the things that I think about a lot, right, is as a startup, as a scaleup you often get one chance. Or two or you maybe get another one in five years if you stick around that long, right? But that first impression is often the only impression. If first impression is something that’s slow, doesn’t work, is down entirely. People move on. People go find a competitor. People will go find some other way to solve that problem. And I think the other thing to think about too, right, is like that’s an interesting one, I think, because if you think about, oh, well, let’s say you have, you know, four 9s, three 9s, I’m all about 9s today.
So you have, you know, three 9s of availability. Tim, how much downtime is that over the course of like a month? It’s a day or more, right? Like a month? No, over a quarter, it’s a day, I think, something like that. Point being, you might say, oh, that’s only 0.01 percent of users who are going to see that. Those 0.01 is never coming back, right? It doesn’t matter. Risk actually is a function of both odds and impact. And stuff that is low odds, but really, really high impact, you should work to avoid. It’s critical, right?
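To put rough numbers on the 9s mentioned here, the following is a minimal Python sketch of the downtime arithmetic. The period lengths (30-day month, 91-day quarter, 365-day year) and the function names are assumptions chosen for illustration. Under those assumptions, three 9s (99.9 percent) allows roughly 43 minutes of downtime a month, or about 0.1 percent of the time, while about a day per quarter is closer to two 9s.

```python
# Allowed downtime for a given availability target ("number of nines").
# Assumes a 30-day month, a 91-day quarter and a 365-day year for simplicity.

MINUTES_PER_DAY = 24 * 60

PERIOD_DAYS = {"month": 30, "quarter": 91, "year": 365}


def availability_from_nines(nines: int) -> float:
    """Return the availability fraction for a count of nines, e.g. 3 -> 0.999."""
    return 1 - 10 ** (-nines)


def downtime_minutes(nines: int, period: str) -> float:
    """Return the minutes of downtime allowed per period at the given number of nines."""
    unavailable_fraction = 10 ** (-nines)
    return PERIOD_DAYS[period] * MINUTES_PER_DAY * unavailable_fraction


if __name__ == "__main__":
    for nines in (2, 3, 4, 5):
        row = ", ".join(
            f"{period}: {downtime_minutes(nines, period):.1f} min"
            for period in PERIOD_DAYS
        )
        print(f"{nines} nines ({availability_from_nines(nines):.5%}): {row}")
```

Each additional 9 divides the allowed downtime by ten, which is why the target is worth choosing deliberately rather than guessing.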
And to Tim’s point, it’s one of those things that, at least in our experience, is not that hard to just make different decisions every day to build it in. It’s not like it’s this thing where it’s this heroic effort, you have to invest a lot. It’s a lot of little decisions you make every day as you’re building software that allow you to have software that’s instrumented and scalable and that kind of thing, right? Whereas it becomes a Herculean effort later if you try and do it, and it’s even more expensive, usually. And so it’s one of those things like, as I said, it’s always these trade-offs we think about, but it’s one of the things that is really worth trying to build in from day one. Because ultimately, you’re trying to get to that growth. Ultimately, trying to get to that scale and resilience and monitoring and understanding, you know, how to do this obviously in a reasonable way, not gold plating anything is really worth your time. I’m curious, Tim, is there anything else you would put on that list of like non-negotiables? I have a thought but I want to hear yours first.
Tim Cochran: I mean, certainly, it’s in the realm of service disruption, but monitoring, observability, and monitoring, not just at an infrastructure level but at a business metric and a product metric level, I think is, cause it’s kind of like otherwise you’re in the dark, right? Cause you’re doing experiments, but I have no data to come off. So I think that sort of experimentation infrastructure is, for a scaleup, is important. And then, you know, I worked for ThoughtWorks, right? So I have to say testing.
Kennedy Collins: That, that was gonna be mine, actually. We’re big fans of TDD and big fans of kind of how all that stuff works.
Tim Cochran: Yeah.
Kennedy Collins: As evidenced by the fact that a product person understands TDD quite well at ThoughtWorks, anyway. But this idea of test-driven development, or even, as I said, basically for me, the way I think about it is as a scaleup, you know you will have to change the software a lot, right? If it’s not a one-off throwaway experiment that you’re just trying to see what happens, that you know you’re going to toss and kind of rebuild later, having good testing, good automated testing around that is really critical to enable that ability to change safely, that ability to tweak, that ability to experiment, that ability to evolve your product. If not, you end up in this place that we see so often where you hit that hyper scale. And then the first thing you have to do is a massive rewrite of your core platform. And then you’re not shipping any new features for six months or a year, and then you get lapped by your competitors. It happens. We see it a lot actually. And it’s really, really rough. And if you build in that ability to change through testing from day one, it really kind of makes that ability possible.
Henry Suryawirawan: Yeah, thanks for those tips, right? Uh, I would just add one more, which is try to build a robust continuous delivery. Because like what you said, right? You will introduce a lot of change. So if you can deliver those changes fast, I think that will be great as well.
[00:27:09] Cross Functional Attributes
Henry Suryawirawan: Another thing that I find for service disruptions, right? Most people actually find it’s related to the non-functional aspect of your software, not really necessarily the functional. And hence again, coming back to our first bottleneck, right? Sometimes it is deemed as engineering problem. All these service disruptions probably for engineers to solve. Again, I think this is not healthy, right? How can the product also take accountability on such service disruptions and maybe come up with, again, like a unified roadmap, to actually tackle this together?
Kennedy Collins: Yeah. I, we, we don’t like the term non, I don’t like the term. ThoughtWorks has since not used the term non-functional requirements. We use the term cross functional requirements a lot, right? Because it’s really this space of, yeah, it cuts across different specific pieces of functionality, but it is critical to the ultimate success of the product, right? And one of the things, Tim and I haven’t talked in a little bit, and one of the things that I’ve been on about lately is product people participating in kind of the setting of SLAs or SLOs, right? Because oftentimes, you know, you ask engineering’s like, well, what should the SLA be? And they just guess, they go, I don’t know. Like it needs to be up a lot. Or most of the time, five 9s, six 9s.
Right now, I’m working on a commercial. I’m working with a client. Not a scaleup, but they run auctions, right? Those auctions happen in a set period of time. So we need to have five, you know, really, really high availability in a known scheduled period of time. And then it does not matter the rest of the time. And that’s a different consideration, right? Similarly, like, in some parts of that, you know, with auctions, bidding and response times with bidding can be really, really fast. Response times after you win, response times for taking a payment can be a bit slower. Especially, and this is funny, this thing we talked about in this auction business, just for example.
Response times for taking a payment in the auction business are actually not as important as they are in e-commerce, in my opinion. And this is what we’ve talked about from SLA perspective, because if you’ve won the auction, you’re already legally obligated to pay for it. If you have some downtime, if it’s slow, people come back because they are legally obligated to do so. Whereas in e-commerce, there’s friction that’s introduced by bad payment gateways and bad payment structure, where you really want to have that high availability and really, really quick, right?
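The auction example suggests measuring availability only inside the scheduled windows that matter. A minimal Python sketch of that idea follows. The function names and the sample times are hypothetical, and it assumes the windows do not overlap each other and the outages do not overlap each other.

```python
from datetime import datetime

# An interval is a (start, end) pair of datetimes.
Interval = tuple[datetime, datetime]


def overlap_seconds(a: Interval, b: Interval) -> float:
    """Return how many seconds two intervals overlap (0 if they are disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0.0, (end - start).total_seconds())


def windowed_availability(windows: list[Interval], outages: list[Interval]) -> float:
    """Availability counted only during the scheduled windows.

    Downtime outside every window is ignored, matching the idea that it
    "does not matter the rest of the time".
    """
    total = sum((end - start).total_seconds() for start, end in windows)
    if total == 0:
        raise ValueError("at least one non-empty window is required")
    down = sum(overlap_seconds(w, o) for w in windows for o in outages)
    return 1 - down / total


if __name__ == "__main__":
    # Hypothetical data: one two-hour auction, one outage that clips its last
    # ten minutes, and one overnight outage that falls outside the window.
    auction = [(datetime(2024, 1, 1, 10), datetime(2024, 1, 1, 12))]
    outages = [
        (datetime(2024, 1, 1, 11, 50), datetime(2024, 1, 1, 12, 30)),
        (datetime(2024, 1, 1, 3), datetime(2024, 1, 1, 5)),
    ]
    print(f"{windowed_availability(auction, outages):.4f}")
```

Only the ten minutes inside the auction count against the target here; the two-hour overnight outage does not.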
And so talking with that, and product should be part of that conversation because thinking about those, the customer impact and the relative risk of downtime and the relative risk of service disruptions is important. This goes back to this idea that like risk is a function of both impact and odds, right? And if it’s low odds for really high impact, that’s one thing. If it’s something that’s like really low impact, it even doesn’t matter that much, right? And invest where it’s important. And that goes back to this idea of investing where it’s important, investing where it’s going to really impact your business.
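One simple way to make "risk is a function of both impact and odds" concrete is to multiply the two and rank by the result. The Python sketch below does that. Treating risk as odds times impact (an expected loss) is an assumption made for illustration, not something stated in the conversation, and the example entries and numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    odds: float    # estimated probability of the event in a given period (0..1)
    impact: float  # estimated cost if it happens, in any consistent unit

    @property
    def exposure(self) -> float:
        """Expected loss, assumed here to be odds multiplied by impact."""
        return self.odds * self.impact


def rank_by_exposure(risks: list[Risk]) -> list[Risk]:
    """Return risks sorted from highest to lowest expected loss."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    risks = [
        Risk("typo on help page", odds=0.9, impact=10),
        Risk("checkout outage during launch", odds=0.01, impact=1_000_000),
        Risk("slow admin report", odds=0.5, impact=1_000),
    ]
    for risk in rank_by_exposure(risks):
        print(f"{risk.name}: {risk.exposure:.0f}")
```

With these made-up numbers, the low-odds, high-impact outage ranks first, which is the case the conversation says to invest in.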
Tim Cochran: Often I’ve seen the non-functional, cross functional used as an excuse for engineers to build the most fanciful systems, right? Like because it has to be like absolutely real time or something, and they want some sort of event sourced, event driven system. And because they want it to, and they’re just trying to justify it. So, yeah. I mean, there’s a lot of like pragmatism that I think is important. And it’s from both sides, right? Because I think probably what you’re talking about is like, sometimes like a developer will ask a product manager, it’s like, should it be real time? And they probably just say yes.
Kennedy Collins: That sounds great.
Tim Cochran: Unless they’ve actually understood it properly, like what is that trade-off? Like, you know, we’re probably talking like millions of dollars decision right there and then, right? And, you know, in real time, there’s such a difference. Yeah…
Kennedy Collins: I mean, yeah, like you know, I think you told me this, Tim. It’s like, I was asking you about like the relative investment of uptime and you’re like every nine is another order of magnitude more expensive. And like, so what does it need to be?
Tim Cochran: Funny, I was going to say about this, which is sort of interesting is this tax can be hidden quite a lot, especially with scaleups. Because what can happen is you often have a core team and they may, unless you’ve got a good product management system, they may be just doing a lot of work to keep the system up, and you may not know it, because they’re sort of, we see this quite a lot like the core team, they build it, they can solve problems really quickly. And perhaps they don’t need to build automations at the beginning, because they can just solve it. But the problem is like over time, those issues increase. And then, you know, someone’s asking, well, why are we not building features fast enough? And it’s actually because a lot of this is a tax on the team. And often it’s just unknown, um, because it doesn’t show up in the product management system.
Kennedy Collins: Yeah, you don’t see that unplanned work. You talk about how fast the planned work is happening, but you don’t talk about what’s happening with the unplanned work and the volume of that unplanned work, right? And I mean, this goes to your point, Tim. It’s like a lot of that is just not lying to yourself. Like, I know that sounds simple, but it’s like being honest with yourself. And this goes back to this thing we talked about kind of blameless cultures and retrospectives and really talking about what’s going on in a blameless way. Because if I come into the room and I go, Tim, why didn’t you ship anything this week? And you’re like, well, we’re working on the thing that you did that may not scale. And you know, there’s just an argument. It’s not a useful question to then go, could we make an investment to pay down some of this tech debt today, so that we can reap those benefits tomorrow? That’s a real question, conversation we should really have, which I think goes to one of the other topics we wrote an article about. But if you’re not having those conversations and in a meaningful and transparent way, it all just gets hidden or pushed under the rug, or, you know, you don’t want to bring it up and then it doesn’t get addressed.
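One way to make that hidden tax visible is to log unplanned work alongside planned work and report its share. The following is a minimal Python sketch. The `WorkItem` fields and the sample entries are hypothetical and are not drawn from any particular tracking tool.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    title: str
    hours: float
    planned: bool  # False for interruptions, firefighting, manual fixes


def unplanned_share(items: list[WorkItem]) -> float:
    """Return the fraction of logged hours that went to unplanned work."""
    total = sum(item.hours for item in items)
    if total == 0:
        return 0.0
    unplanned = sum(item.hours for item in items if not item.planned)
    return unplanned / total


if __name__ == "__main__":
    # Hypothetical week for one team.
    week = [
        WorkItem("new onboarding flow", hours=30, planned=True),
        WorkItem("restart stuck queue by hand", hours=6, planned=False),
        WorkItem("investigate payment timeouts", hours=4, planned=False),
    ]
    print(f"unplanned share: {unplanned_share(week):.0%}")
```

A rising share over several weeks is the kind of signal that can turn "why are we not building features fast enough?" into a conversation about where to invest.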
Henry Suryawirawan: Yeah, I find that the tax is really important, right? The invisible tax. Sometimes this is very hidden, right? Sometimes the management and the product maybe only see about the features that is coming out. Sometimes you do have a lot of these issues. Maybe sometimes people call it developer experience these days, right? So I think highlighting that, making people aware that such thing exists is also very important.
[00:32:39] Bottleneck #3: Accumulation of Tech Debt
Henry Suryawirawan: And I think I find that service disruptions most of the time is related to tech debt, right? So engineers will say, yeah, it’s a tech debt. You know, we did it in a shortcut way and now we have to pay it back, right? So talking about tech debt will be never ending. But from your view, what are the typical tech debt things that people should really think about? And what are the warning signs?
Tim Cochran: The warning signs? Well, some of the warning signs is kind of what we talked about, right? Like development slowing down. That’s certainly one. Some of it could be in the impact to the end user. But I think some of it’s going to be harder to track. But unless you’re really sort of tracking developer satisfaction, it can be problems like, you know, often we’d go to a client and there’ll be some strange, unique system that they’ve created that everybody complains about. And perhaps it was not documented in a very good way or some legacy or something. And those kinds of things, like they really hurt, like onboarding, can hurt maintenance, those kinds of things.
The other thing is kind of like, it’s about engineering satisfaction, because if a developer is kind of like complaining about tech debt and they have no autonomy or no ability to reduce that tech debt, then that’s going to affect productivity, right? And, you know, often a pragmatic developer does understand that it’s debt and it’s something that we’ve taken on. But if there’s no like conversation about it, or there’s no light at the end of the tunnel, if you know what I mean. That’s when it can become an issue. And that’s when it’s probably like imbalanced in that sort of product engineering balance, right? Maybe leaning more towards the sales team or the product team is sort of making all the decisions.
Kennedy Collins: Yeah. One of the things, to go a little bit back to the conversation we had about reliability as well, right? Oftentimes, a lot of that time you could spend on paying down tech debt is spent on just keeping the lights on, right? That’s often a good signal that it’s time to invest a little bit more because, you know, you can buy that time back and buy that scale in the future and it’s worth taking the time now to pay down that debt a little bit. The other thing Tim mentioned briefly, but I think is really, really helpful is to think about onboarding, right? And to use onboarding as a moment to really identify some of that stuff. Like if you’re successful and lucky in scaling, you’re often hiring, right? And there’s new folks coming in frequently.
And those new folks, asking them and setting an expectation that they be sensitive to the weird things in your systems. And both pragmatic about like what can be fixed, what can’t be fixed, but sensitive to the weird stuff in the systems, right, is really helpful. Because, you know, you only get those new eyes once and it’s a really good time to say, you get that new person, they go, this is really weird. And you go, this is weird. I hadn’t noticed because we’ve been living with it for three years now, but this is weird. You’re right. We should do something about it, right? And that new person is often able to see that in a way that the people who’ve been around for a while just don’t anymore. You just look right past it. The same way that, you know, you got a pile of whatever in your house and you just, you know, your eyes just skip right over it. You don’t even notice. So yeah, having those new people and asking them, making it part of the onboarding actually to kind of identify some of those things and identify things that they don’t understand or they think are strange so that you can use that as a feedback loop is really helpful.
The other part of that, I think is just, and Tim, you probably speak as to how to create this better than I can, but there’s a certain like quality of work or pride of work thing that becomes important in this space as well. And I think this goes back to the satisfaction thing, like there’s that pride of work. And if you’re able to instill that, it creates a lot of value of almost like camping. This idea of cleaning up after yourself while you’re camping, leaving it cleaner than you found it, when you’re refactoring, that kind of stuff, right? And making those decisions and taking that, again, this goes back to the, you know, little things you do every day, taking that five minutes to do something as you’re going rather than leaving it off, is really, really valuable as well.
One of the things that we do, that I do a lot as a product manager is often set a budget where basically that’s the team’s carte blanche to spend on tech debt and specifically usually tech debt that is impacting their day to day, their quality of life. You know, making things weird, making things annoying. Oftentimes, to your point about a unified roadmap before, you know, if there’s larger technical initiatives or larger kind of infrastructure, reliability, whatever initiatives that you need to make, that should come in on a unified roadmap. But there’s also this budget of just like, this variable is named six different things in six different places. And we should probably get that, get some consistency here. So it’s not confusing. I’m sure actual developers on this call can give me better examples. But that’s just one that drives me crazy sometimes.
But yeah, and setting that budget, I think is really helpful. And that budget should be negotiable as well, right? Like if there’s a big crunch time, we need to turn that to zero right now, that’s fine. But, you know, what’s the payback period? I’ve even seen like, the Basecamp 37signals folks, they tend to do a pattern where they’ll do, I want to say it’s six weeks of like the feature side, and then they have two weeks of like hardening and that. And so they fit that budget that way, right? But finding that way to do that time and do that cycling back and that cleanup is really helpful. And kind of framing it that way. It’s like, hey, engineering leaders, this is your place to make decisions. And hopefully the engineering leaders are devolving that to the team too, so they can, Tim’s point, clean up after themselves and do that stuff that makes them proud of their work.
Tim Cochran: I do think that sometimes tech debt isn’t a very useful term, because I think while it’s sort of a good analogy and that kind of thing, it becomes this sort of amorphous blob of stuff. And it’s just like, and the product team’s like, why can’t the engineers go fast? It’s like, oh, tech debt, you know? And it’s like really, you should be describing what is actually in the backlog and the product team should take time to actually understand it. And the tech team should take time to explain it as well. But if you’re just like this sort of like, if there’s just this list of tech debt that the product team doesn’t understand, then you can’t have that negotiation kind of thing. Sometimes it’s not very helpful. You just got to be careful not to just bucket everything. Because a lot of the stuff that we see in tech debt backlogs really isn’t…, not a lot of the things will actually affect a customer’s experience.
Kennedy Collins: Right. Exactly.
Henry Suryawirawan: Yeah. And I think sometimes, yeah, the term debt implies that why did you make the debt in the first place, right? But we forgot actually during that context when we make that shortcut or decision, right, we actually wanted to get the product market fit or experiment, do quick things as much as possible, right?
[00:38:37] Systems Ownership
Henry Suryawirawan: Another thing that I find part of the tech debt is the ownership, right? As the team grows, there are so many aspects of the system, it grows larger and larger, but somehow there are some parts of the systems which are ownerless. So maybe from your view, how can we have the ownership problem also tackled as we grow larger and larger?
Tim Cochran: Yeah, you nailed it, right? It can be difficult at the beginning, right? Because you sort of have this small team that everybody owns everything. And probably at the beginning you may have like an individual would probably build something. But I think it’s important as you scale that you understand your technical landscape and everything has an owner. That is probably something we’d also say is sort of non negotiable, that kind of like making sure that everything has an owner and is assigned to someone and someone’s aware of it, even if they’re not working on it.
Kennedy Collins: And then this goes back to that, you know, pride of ownership, pride of work thing, right? Like you care about things you own and you don’t like, you care less about things you don’t own and just having that ownership and also having engineering leadership that checks in, right? And just says, hey, how’s this thing going? And if you say, hey, I haven’t looked at that in months, because it’s not important. OK, that’s a discussion we can have. But like actually following up on and making sure that the people who are owning it are actually owning it, and not just have their name next to it on a sheet of paper somewhere, and aren’t actually kind of doing the things you need to do to really own something.
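The two checks described here, that everything has an owner and that someone actually follows up on it, can be sketched as a small audit over a list of components. The Python below is illustrative only. The `Component` fields, the 90-day threshold, and the sample data are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Component:
    name: str
    owner: Optional[str]           # team or person; None if nobody owns it
    last_reviewed: Optional[date]  # last time the owner checked in on it


def audit(
    components: list[Component], today: date, max_age_days: int = 90
) -> tuple[list[str], list[str]]:
    """Return (ownerless, stale) component names.

    "Stale" means it has an owner on paper, but no review within max_age_days.
    """
    max_age = timedelta(days=max_age_days)
    ownerless = [c.name for c in components if not c.owner]
    stale = [
        c.name
        for c in components
        if c.owner
        and (c.last_reviewed is None or today - c.last_reviewed > max_age)
    ]
    return ownerless, stale


if __name__ == "__main__":
    # Hypothetical inventory.
    inventory = [
        Component("billing-api", "payments team", date(2024, 5, 20)),
        Component("legacy-export-job", None, None),
        Component("email-templates", "growth team", date(2023, 11, 1)),
    ]
    ownerless, stale = audit(inventory, today=date(2024, 6, 1))
    print("ownerless:", ownerless)
    print("stale:", stale)
```

A stale entry is not automatically a problem. As noted above, "I haven’t looked at that in months, because it’s not important" is a discussion worth having rather than a failure.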
And yeah, I mean, I don’t know, it’s interesting, this conversation, it takes me back to kind of, you know. Forgive me everyone, if you’ve heard this before, but like the original metaphor of tech debt, right, where the term comes from is this idea of, in the same way that, you know, I took out a mortgage to buy my apartment, right? Like that debt allowed me to do a thing I wouldn’t otherwise be able to do, but I do have to pay that down. And if I don’t pay it down, the bank’s going to take my house. And it’s this balance of making some decisions right now to get some leverage when you wouldn’t otherwise be able to get that leverage. Which, you know, especially as a scaleup is really really critical. But also not getting so over leveraged that the smallest issue blows everything up. Or that you know you’re spending your entire paycheck on debt service instead of actually saving some money or you know investing in things for the future.
And so that, I think that metaphor is still really great and really strong for the conversation. I think to Tim’s point, just saying, calling everything tech debt and throwing your hands away is not really helpful, but that metaphor is actually a really useful frame to have conversations about this stuff.
Henry Suryawirawan: Yep. So I think it’s been a great talk about the product engineering friction, right? The service disruption and tech debt. They are all interrelated, I find, right? So you cannot just separate these three things.
[00:41:01] Bottleneck #4: Onboarding
Henry Suryawirawan: Maybe if we can go, just now we mentioned about onboarding, right? So that is also one aspect of the bottlenecks that you mentioned in your series. Onboarding and also having to hire a lot of people. Maybe a little bit here, what are your tips to actually give to people about onboarding and also hiring?
Kennedy Collins: I’ll do one really quickly, and then Tim, you think about this more than I do, you can probably take more time. But the last step of the onboarding checklist should be to provide feedback on the onboarding checklist, so that for the next person, it’s better. Anyway, that’s my one little thing that I think is a really helpful little tip, which goes back to retrospectives and everything else. But, anyway.
Tim Cochran: Well, so yeah, I have been thinking about it quite a lot. And so in the article, right, we have this kind of like optimal onboarding timeline. And, of course, it’s like, it’s a very extreme example. It’s extreme, but it’s based on real world experience. Like we work for companies that have amazing onboarding experiences. And I do believe that those companies can pivot quickly, can scale up faster. So, I mean, one of the things we often think about when we think about onboarding is like, it’s kind of given short shrift because it’s treated as like the new hire orientation. The point we make in the article is that no, it’s actually a key business capability. And it can drive a lot of your scale.
But yeah, so just a few tips, right, is deploying on your first day. Having a workstation, having your dev environment set up and good to go. There’s a couple of things, I actually just saw some research from Microsoft about this, because they were noticing that people that deployed early actually were more productive in their career. They did this correlation. And they thought it was because, oh, they must’ve done like a story early and got into like learning the domain and all that kind of thing. But actually what it was is more they were doing a trivial change and it was because they were able to talk the language of the team. I recommend doing a trivial change on the first day just to make sure the environment’s set up and then you start to understand all the different systems and those kind of things.
A lot of the onboarding thing is like the way to think about it is not, I’m not really improving… I am improving onboarding, but I’m also improving the developer experience and effectiveness of the team. And a lot of the things that you’ll improve for… because the things that a new developer will find hard, probably, developers are also finding hard, but not to the same extreme, right? Cause like if you improve knowledge, if you improve the developer experience, the friction, the communication tools, you know, so it’s going to improve for everyone. So it’s not right just to think about it like an onboarding or a new hire orientation. It’s really improving the effectiveness of the whole team this way.
And then, you know, most of it, I mean, beyond the sort of technical aspects, you know, a lot of it is about building that ability to collaborate. And to quickly getting to know your team, work with your team, getting to know your cross functional counterparts. So some of those things that we have in our checklist is just really understanding what is the company mission, what are the business goals, like actually talking to a leader to present that. Having an OKR, those kind of things.
Kennedy Collins: Yeah, I mean, one thing I’ll say, especially for the product folks listening to this. Like at one of the scaleups I worked at, right? I was the, you know, Head of Product and I personally did the business context onboarding for every single new hire far after you would think I would have stopped doing that. Because I found it incredibly valuable. Like to Tim’s point, right? If I can spend an hour of my time, hour and a half, 90 minutes, I think is what we usually did and increase this person’s effectiveness by 10 percent over the course of the time they work here, that’s an incredible return on investment. Even though, yes, I know it’s hard for me to find 90 minutes a lot of the time.
And so, yeah, I made that priority. I made that a priority and just did it a lot of the time. Cause it’s really, really, really, really valuable. To make sure you understand and can ask those questions. They can ask those questions from someone who has a more holistic view, right? Having more senior folks do that, and then your team can onboard into your team context, but getting that holistic view so you can understand what’s happening outside of your team context is really helpful.
Which again, I think goes back to this idea of onboarding as well, and this idea of onboarding as not being just for the person who is being onboarded, but also for everybody, and also kind of building up this corpus and this set of things that is most important for the new person, but it’s helpful for everyone. Those onboarding documents are often so helpful for adjacent teams who need to understand about what you are doing who are new to collaborating with you, right. That onboarding happens constantly, especially as you grow about, you know, oh, we haven’t really worked with that team, but we need to integrate with them for this thing or that thing, or we need to collaborate with them to ship this feature or whatever.
It’s much, much easier to do that if you can go look at their docs on Stoplight, if you can go read the business context behind what they’re doing and in Confluence, not to name products. But like, if you can go read those things and then go to the conversation rather than showing up and going, so I hear you guys are the contracts team. What does that mean? Like, like that’s such a wildly different experience to showing up and knowing what the software does, knowing what it does today, knowing what they think their goals are, knowing what their roadmap is in the near future, right? That makes it such a different conversation. And that’s helpful for onboarding, but it’s helpful for everyone. Or, you know, onboarding into these other things as you go and onboarding across the business.
And the other thing too is like scaleups change so fast that a team you thought you knew what they were doing six months ago or a year ago is completely different now, right? Even if it has the same name. And again, that being able to re-onboard yourself or re-orient yourself into what your colleagues are doing, it’s just very, very helpful. And using onboarding of new hires, which is, you know, a critical moment as a lens or a forcing function to improve all of that, I think is really key.
Henry Suryawirawan: Yeah, I like the point where you mentioned onboarding is not just for the new joiner, right? It can also be an investment for everyone, including the developer experience, right? And I like the final checklist that you have, that you mentioned in the beginning, Kennedy, like always try to give feedback to the onboarding process so that you can have the next person enjoying the benefit of the improvement, right?
[00:46:35] 3 Tech Lead Wisdom
Henry Suryawirawan: So I think we cover a lot today. Uh, obviously we can’t cover everything, all the bottlenecks. I’d like to probably wrap up here and ask you one last question, which I always ask to all my guests, which I call the three technical leadership wisdom. I leave it up to you how to arrange the answers. But if you can share some of these advice, what would that be?
Tim Cochran: Okay, so I was thinking about… You asked this earlier, I’ve been thinking about it. So one thing that’s interesting, because I run into this quite a bit, right? When we talk about knowledge and sort of collaboration and that kind of stuff. And a lot of times it comes down to documentation. One of the things I think is the best way, especially for a junior developer, to kind of learn is via observation. And, you know, when you document something, you write down what you think is important. But actually, what might be really important is, you know, how the shortcuts, how does someone arrange their tabs, you know, these things. So I think especially with onboarding, like having opportunities for developers to observe other developers, especially senior developers is incredibly important, whether that’s pair programming, mob programming, those kind of things.
Kennedy Collins: It’s not just for developers. This is a thing I get into a lot too. It’s like pairing as product people, pairing as designers is really, really valuable too. In the same way, right? How you navigate through problems, how you navigate through finding information, seeing those things. And to Tim’s point, like the stuff you write down is not necessarily, you know, you write down the stuff that you think is important, but that’s not really, oftentimes, you know, the stuff that you do every day is often the more valuable stuff and you just don’t even think about it because it’s so automatic.
One thing I’ll add to this list, just really reflecting on a conversation we’ve been having, is that a lot of these things that we talk about, they’re not big initiatives. They’re not big efforts and they aren’t successful if they are big efforts. It’s stuff you show up and do every day. It’s those little things that you do every day that add up to success, right? Making sure that you’re paying down tech debt every so often. Making sure that you are showing up and thinking about and taking the time to think about how the work you’re doing is going to impact the overall goals of the organization. Not lying to yourself about the impact of the things you want to do because you want to do them, right? All that, it’s simple, but it’s hard. And that’s the stuff that’s surprisingly valuable and surprisingly difficult. But it’s also one of the things that I find heartening about that is because it’s simple, but difficult, anyone has access to it, right? It’s not this thing that you need to be a genius to do. It’s just a thing you need to have the commitment and the discipline and the showing up every day to give a shit to do it.
Tim Cochran: I think, particularly, if we’re talking about scaleups. So, of course, right? You may have some exotic, complex technology that should be in the core feature of your system. Everything else should just be simple. Like, what’s the simplest way that I can build whatever supporting systems I need and not waste any time on it. And put all your energies into those complex things. And I know as an engineer, you get excited about solving a problem or creating a rules engine or whatever it is, right? But like, ideally, you can use a third party or something. But if you can’t, you know, how can we do it in the most simple way and still scalable, but just. Cause I think these sort of complex applications, or applications that are over complex in places that they don’t need to be, is a big one.
Kennedy Collins: Yeah. Yeah, no, I agree with that. That will make that number three. Cause one of the things, one of the best pieces of advice I got when I started my first company was, startups don’t starve, they drown, right? The thing that will kill you is a lack of focus, not focusing too hard in one place and not being successful in that place, right? You obviously have to, again, stay honest with yourself and pivot when you need to, but you can’t pivot if you’re not going in a direction, right? That’s not what pivoting means. If you’re just spinning, it doesn’t matter. If you’re just trying to do everything all at once, it doesn’t matter. And so, to Tim’s point, focus on what the core is, what the thing that’s really the value you’re creating. Make everything else as simple as possible. Just don’t worry about it.
Henry Suryawirawan: Yep. Thanks for the plug, right? So I think I love all the wisdom, definitely. So if people want to continue this conversation or they want to reach out to you, ask you questions, is there a place where they can find you online?
Tim Cochran: Uh, yeah. LinkedIn.
Kennedy Collins: Yeah, I’m on LinkedIn. I am on the application formerly known as Twitter too. It’s just, it’s Kennedy Collins, first name, last name. That’s probably the best places, yeah.
Henry Suryawirawan: So thank you again for your time today. I really learned a lot and I hope the listeners here also learn a lot about the challenges in a scaleup, right? And hopefully you get some gist on what to do next in order to tackle those bottlenecks. So thanks again.
Kennedy Collins: Thank you everybody.
Tim Cochran: Thank you.
– End –