#40 - Data-Driven DevOps with Launchable - Kohsuke Kawaguchi
“By and large, the way people put together the delivery process is by gut and instinct. The next step up from there is to use the data that comes out of your system to help you make the right decisions. When I say data-driven DevOps, don’t rely on this human experience, and let the system tell you. We should be able to find that kind of information from data.”
Kohsuke Kawaguchi is widely known as the creator of Jenkins and is currently the co-CEO and co-founder of Launchable. In this episode, Kohsuke talked about data-driven DevOps, developer productivity, the future of software testing, and why he created Launchable to help us move closer to achieving those. At the beginning of the episode, Kohsuke also shared the story of how he created Hudson during his time at Sun Microsystems, which eventually became Jenkins, the most popular open-source CI/CD tool, used by millions.
Listen out for:
- Career Journey - [00:05:24]
- Hudson/Jenkins Story - [00:07:30]
- Current CI/CD Landscape - [00:12:18]
- Developer Productivity - [00:15:04]
- Improving Our Productivity - [00:17:06]
- Launchable - [00:21:06]
- Launchable Customer Story - [00:33:54]
- Future of Software Testing - [00:37:13]
- Data-Driven DevOps - [00:40:41]
- Running Launchable - [00:44:09]
- 3 Tech Lead Wisdom - [00:45:14]
Kohsuke Kawaguchi’s Bio
Kohsuke is the co-CEO & co-founder of Launchable. He is passionate about developer productivity. He created Jenkins, the most popular open-source CI/CD system used by millions. As CTO of CloudBees, he helped CloudBees go from <10 to 400+.
Kohsuke has received the O’Reilly Open-source Award, JavaOne Rockstar, Japan OSS Contributor Award, and Rakuten Technology Award.
- Website – https://kohsuke.org/
- Twitter – https://twitter.com/kohsukekawa
- LinkedIn – https://www.linkedin.com/in/kohsukekawaguchi/
- Launchable – https://www.launchableinc.com/
Mentions & Links:
- Continuous Integration (CI) – https://en.wikipedia.org/wiki/Continuous_integration
- Jenkins – https://www.jenkins.io/
- Hudson – https://en.wikipedia.org/wiki/Hudson_(software)
- CloudBees – https://www.cloudbees.com/
- Cruise Control – http://cruisecontrol.sourceforge.net/
- BazelCon – https://conf.bazel.build/
- Bazel – https://bazel.build/
- XML – https://en.wikipedia.org/wiki/XML
- W3C – https://www.w3.org/
- Java Enterprise Editions – https://www.oracle.com/sg/java/technologies/java-ee-glance.html
- Sun Microsystems – https://en.wikipedia.org/wiki/Sun_Microsystems
- Oracle – https://www.oracle.com/
Hudson/Jenkins Story
I discover that I wasn’t the only guy who’s breaking builds all the time. Everybody else in the team did. I just didn’t even notice those.
As a general rule of thumb, developers are more interested in just doing things than convincing other people that doing this is the right thing.
Current CI/CD Landscape
One thing that I find interesting was when I originally did Jenkins, one key role, the big function was a place for the collective truths. So in other words, it was used as a means for people in the team to be on the same page about the state of things. Did this change successfully land? Is this test passing or failing?
Nowadays, typically it’s almost completely invisible. Maybe the only interaction people have with it is when their pull request gets the green check marks.
When it’s invisible, you can go up a lot more in the complexity sophistication. So to me, this is actually an opportunity.
When people need to understand what those things being there means, you can’t really go too far.
I’m thinking mainly from the angle of, if I’m a leader of a team, then I’m thinking about, I have a set of people with different skill sets and different levels of experience and sophistication, and I need to raise all the boats. I’m going to make everybody a little bit more productive. So arm them with more something, like more information, more power.
What the CI system or DevOps in general is doing is that kind of like arming the humans, like amplifying their muscle with the machine power. So it’s almost like an exoskeleton, how to do more with less. That is what I think about the developer productivity.
We are just basically trying to create more seatbelts. Because we can churn out the changes as fast as we want if you don’t care about whether it works or not. And then it’s like an everlasting challenge. And there are so many other layers that we rely on to build confidence of the change.
Improving Our Productivity
Productivity as an individual, it almost feels like how can you grow? Or how can you become a better engineer? Given the same person, how can we do more?
The key thing for me is demand that you’d be armed. Demand with your boss in the organization that you need that exoskeleton, the power loader.
We have these tendencies, and I do, to build my own tooling. We know because we can, and it’s fun to do it. We tend to just prefer to roll our own, instead of spending money.
I learnt a lot by writing code. Maybe more so by looking it back in a few years later, which ones worked out, which ones didn’t. So there’s definitely the value in time testing things. More generally, I think it’s the idea of retrospecting. When you do it with some intentionality, do it with some hypothesis. Without those theories and hypotheses and retrospect, it’s difficult to learn from how you did it.
Engineering work is more like finding the right balance or compromise. That balancing act is an important skill, but I also find it useful from time to time, just go deep dive. You decide to solve the problem the hard way and then go solve it. In some sense, I think it kind of grows the muscle.
Finally, I think the communication and influencing others. I think the communication skills in engineering become far more important.
Launchable
We have 18,000 or so test cases. Every time I make one line change, I have to wait for one whole hour for all the tests to pass. And then, somebody comes in for a code review, maybe he doesn’t like the variable name. We make some changes, and then another whole hour of testing cycle before I can land this change into the master branch. I find that quite painful.
It’s a good thing to have a lot of tests, but if I’m changing a small part in your large system, then I don’t think running all the tests is actually particularly meaningful. There should be a smaller subset of the test that would be pretty effective in catching any of regressions that I might cause.
The idea of the smoke test is if you have a big test suite that takes like hours to run, you create a really small subset, and you know that if any of these tests fail, then it’s probably not even worth running the whole test suite.
What Launchable is doing is we look at the changes that the people have applied and created. And then we look at all the tests that you have, and then we make a prediction that okay, for these changes, you should be running these tests in this order to maximize the chance that you get the failure right away, right off the gate.
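To make the idea of change-based test ordering concrete, here is a minimal sketch in Python. It is not Launchable's actual algorithm, which the episode describes only at a high level. It assumes a hypothetical history of past CI runs, each recording which files changed and which tests failed, and uses a simple co-failure count as a stand-in for a learned prediction. The names `build_failure_index` and `prioritize`, and all file and test names, are made up for illustration.

```python
from collections import defaultdict


def build_failure_index(history):
    """Count how often each test failed in a run where a given file changed.

    `history` is a list of (changed_files, failed_tests) pairs, one per past
    CI run. This data layout is an assumption made for the sketch.
    """
    index = defaultdict(lambda: defaultdict(int))
    for changed_files, failed_tests in history:
        for path in changed_files:
            for test in failed_tests:
                index[path][test] += 1
    return index


def prioritize(all_tests, changed_files, index):
    """Order tests so that those most often failing with these files run first."""
    def score(test):
        # Sum the co-failure counts over every file touched by this change.
        return sum(index.get(path, {}).get(test, 0) for path in changed_files)

    # sorted() is stable, so tests with equal scores keep their original order.
    return sorted(all_tests, key=score, reverse=True)


# Hypothetical CI history: (files changed, tests that failed).
history = [
    ({"core/queue.py"}, {"test_queue"}),
    ({"core/queue.py", "ui/view.py"}, {"test_queue", "test_view"}),
    ({"ui/view.py"}, {"test_view"}),
    ({"docs/readme.md"}, set()),
]

index = build_failure_index(history)
order = prioritize(["test_auth", "test_view", "test_queue"], {"core/queue.py"}, index)
print(order)
```

With this history, a change to `core/queue.py` puts `test_queue` first, because it failed most often alongside that file. A real system would presumably use richer signals than raw counts, but the shape is the same: the change goes in, and an ordered list of tests comes out.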
If you have this automation process going all the time, producing data and behaviors, then the system itself should be able to use those information. First, you can inform people better. And the second thing that can make the whole delivery pipeline self-tuning.
Are all these test cases actually really useful? We kind of amassed them over time. I mean, I don’t even know what this test case is testing. I just want to tweak the assertion to make it pass. When you add more tests down the road, maybe are they overlapping? So there should be some critical eyes to, let’s say, the Return on Investment. Every test case has incurred some cost in terms of the execution time, in terms of the maintenance, and all that. So is it actually producing the value to justify that cost? And then we should be able to get those information from the system itself that we don’t today.
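The return-on-investment question can be sketched as a small calculation over data a CI system already collects. The example below is only an illustration under stated assumptions. It treats "failures caught per hour of execution time" as a rough proxy for value, and the `CaseStats` record, the `failures_per_hour` metric, and all the numbers are hypothetical. A test that never fails can still be valuable, so a ranking like this is a prompt for review, not a verdict.

```python
from dataclasses import dataclass


@dataclass
class CaseStats:
    """Aggregated history for one test case (hypothetical record layout)."""
    name: str
    runs: int
    failures: int
    total_seconds: float  # cumulative execution time across all runs


def failures_per_hour(stats):
    """Failures caught per hour of machine time spent running this test."""
    if stats.total_seconds == 0:
        return 0.0
    return stats.failures * 3600 / stats.total_seconds


def rank_by_value(all_stats):
    """Sort test cases from highest to lowest failures-per-hour."""
    return sorted(all_stats, key=failures_per_hour, reverse=True)


# Made-up numbers for three test cases over 200 runs each.
stats = [
    CaseStats("test_report", runs=200, failures=0, total_seconds=7200),
    CaseStats("test_login", runs=200, failures=10, total_seconds=400),
    CaseStats("test_cache", runs=200, failures=4, total_seconds=1200),
]

for case in rank_by_value(stats):
    print(f"{case.name}: {failures_per_hour(case):.1f} failures per hour of runtime")
```

In this made-up data, `test_report` consumes the most machine time and has never caught anything, which is the kind of test the quote suggests looking at with a critical eye.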
As the project becomes bigger or changes, what’s the optimal strategy to do the merging or where to run the test actually should change. So we run it every day. Is it better to run it every day? Or is it better to run it every one, two days so that we have extra cycle that can be used to run different kinds of tests. Now what’s optimal? So people are not actually intentionally making those decisions.
If you can make those decisions data-driven at least, and then if you can further step up that decision by leaving those decisions to the machines.
The key to make it universally useful is to reduce the intrusion as much as possible. So make the surgery as small as possible. Because I know how scary for some people to modify.
Future of Software Testing
Ensuring quality became a lot bigger than testing.
Now for the most of the people, that’s not how we ensure quality. I mean, testing is still a key pillar, but now we have other means like feature flags, controlling, doing a smarter rollout, using monitoring to flag the failure risk before too many customers get impacted. I feel like there’s more defense in depth. There’s all sorts of practice combined, like each of them designed to reduce the risk, accepting the fact that some bad stuff is going to happen. So I think there’s a lot of that going on, more of that is going to happen.
And then, not only on the testing side, I think what needs to happen is focusing more on the increments. If you’re trying to ship every day, like deploy every day, you cannot possibly have capacity weeks. But the amount of tests that you amass could easily run up to those.
I think the right thing to focus on is not to look at one frame in the movie, but look at two frames in a movie, and compare the difference. Because if your speed goes up, that delta is going to go down.
Data-Driven DevOps
By and large, the way people put together the delivery process is by gut and instinct. So how do you go up from there? I think the next step up from there is to use the data that comes out of your system to help you make the right decisions.
When I say data-driven DevOps, I mean don’t rely on this human experience, and let the system tell you. We should be able to find that kind of information from data.
3 Tech Lead Wisdom
Leadership by code is fundamentally not a scalable enough way to make a difference.
Most technical leaders, their starting point is leadership by code. You just write better code more quickly, and that’s how you influence people around you.
That is great, but it actually only carries you so far.
As a technical leader, you influence other engineers around you and make an impact by influencing people around you.
In some sense, they don’t really need you. They get things done in their own ways. So what is the value that you can add? One thing that I gravitated there is the power of storytelling.
The idea here is to help them mentally picture the problems you’re solving. And then also hoping to get them excited in solving those people’s problems.
Empowering more individuals.
You’re trying to do something by yourself is useful, but again, it only scales so much.
At some size of the organizations, it becomes more efficient to spot people who are trying to do the right thing, and then send more spotlights and resources to them.
Episode Introduction [00:00:54]
Henry Suryawirawan: [00:00:54] Hello everyone. This is Henry Suryawirawan. I’m back here again with a new episode of the Tech Lead Journal podcast. Thanks for tuning in and spending your time with me today listening to this episode. If you haven’t, please subscribe to the Tech Lead Journal on your favorite podcast apps and also follow Tech Lead Journal social media channels on LinkedIn, Twitter, and Instagram. And if you’d like to make some contribution to the show and support the creation of this podcast, please subscribe as a patron at techleadjournal.dev/patron, and help me towards producing great content every week.
For today’s episode, I am very happy to share my conversation with Kohsuke Kawaguchi. Many of you would have heard his name before, as he is widely known as the creator of the famous Jenkins CI, the most popular CI/CD tool used by millions of developers. And I guess, it’s still the leading CI/CD tool by a mile. I personally have always wanted to have a chance to speak and interact with Kohsuke, as Jenkins has always been one of my favorite developer tools since a long time ago, when I started my career. After his stint at CloudBees, Kohsuke is currently the co-CEO and co-founder of Launchable, a test recommendation engine that uses machine learning to speed up CI pipelines by selecting the right tests to run at the right stage of your development workflow.
In this episode, Kohsuke shared about data-driven DevOps, how to improve developer productivity, the future of software testing, and why he created Launchable to help us move closer to achieving those things. In the beginning of the episode, you can hear from Kohsuke his story on how he created Hudson during his time at Sun Microsystems to solve his own problem of frequently breaking the build (I mean, who never breaks the build, right?), which eventually became Jenkins, and the rest, as many of us know, is history.
It was really a fun and great conversation with Kohsuke, and I hope that many of you could also benefit from this episode. If you like it, leave this episode a rating or review or comment on your podcast app or social media channels. Those reviews and comments are one of the best ways to help me get this podcast to reach more listeners. And hopefully they can also benefit from the contents in this podcast. So let’s get this episode started right after our sponsor message.
Henry Suryawirawan: [00:03:48] Hey, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today’s guest, I’m so excited. I’m sure most of you would have been familiar with his name. He’s actually Kohsuke Kawaguchi. One of the most popular figures in the tech industry, I would say. If you ever used the CI server, which is Jenkins, I’m sure you know Kohsuke. He is the creator of it. When he was working with Sun, it used to be called Hudson. And then because of few reasons, then it became Jenkins as a fork of open source project. And then Kohsuke fronted it, and became one of the most popular, if still not the most popular CI server available in the market today. Kohsuke also used to be the CTO of CloudBees, the company that runs behind the Jenkins that is supported for the enterprise. And also won few awards, O’Reilly open source award, JavaOne Rockstar Award, Rakuten award and things like that. So he’s very famous. In this time, Kohsuke is actually launching his new company and product which is called Launchable. I’m sure we will be talking a lot about it later on. So Kohsuke, welcome to the show. It’s great to have you here.
Kohsuke Kawaguchi: [00:04:58] Hey, hey. Thank you very much. I think you give me more credit than I deserved. Thank you for that.
Henry Suryawirawan: [00:05:03] Yeah, actually you are one of the person that I would like to have a chat, because few years ago, this could be even decades ago, right? I was so stunned by the Jenkins CI server. So, I was always wondering why this person is creating this project? It’s so cool and it’s free. So during that time, when I started my career, it was really, really cool. So really happy to have you here in the show.
Kohsuke Kawaguchi: [00:05:23] Thank you.
Career Journey [00:05:24]
Henry Suryawirawan: [00:05:24] So Kohsuke, maybe before we start, we would like to hear more from you about your career journey. So of course there are so many highlights, but maybe in short, if you are able to maybe summarize your career journey so far. What are the turning points, and maybe major highlights for you personally?
Kohsuke Kawaguchi: [00:05:40] Yeah, so I grew up in Tokyo, as you know. That’s where I picked up programming, and then I think by high school I started studying software. Made it operational, a little bit bigger in the college. And then after I graduated, I kind of got involved in this schema language for XML. I guess the picture that I want people to have in their head is like, it’s almost like a Star Wars movie. So there’s like a standard body called W3C. It’s like they’re building a Death Star. So they are the Imperial Force, all the vendors that you’ve heard of. And then there are these small bunch of idiots that are coming together. They have different ideas of how to define a schema language for XML. So they are like Rebel forces, and I kind of got recruited into that rebel force side, and I started working on it. And then because of that deviation, I landed in California, to work for Sun Microsystems. I kept doing XML stuff a little bit time. Eventually the Rebel lost. I mean, that’s how it always work, right? The empire wins. And then I started working more in the Java Enterprise Editions, and that’s when I started working on this software you mentioned that eventually became Jenkins.
But Sun as a company kept going down. Their business wasn’t doing very well. So, I think it was 2009 or 2010, it got acquired by Oracle. So I experienced three months of Oracle, and then I left. By then, Hudson was really doing pretty well. So I thought, okay, let’s see what I can do with it. So I created a company that became CloudBees, and we had a lot of fun, and that lasted until 2020. That’s where I had the opportunity to learn as a CTO, and it was really my first experience at a startup. That was a lot of fun. And then in, I guess 2019, so shortly before the whole pandemic happened, I founded another company. And then that’s where I’m working day in, day out. Probably still in the office in California.
Hudson/Jenkins Story [00:07:30]
Henry Suryawirawan: [00:07:30] Thanks for sharing that. It’s pretty interesting that the way you were working initially on schema language. So tell us more like, how did you end up working with Hudson? During that time, initially it was called Hudson. Because CI was, I think probably still in the early stage. I think there was probably one other CI server available, which is called Cruise Control. What was the intention behind Hudson?
Kohsuke Kawaguchi: [00:07:54] So around that time, Sun was a big company, and when you are a big company, people don’t seem to be working as hard as, let’s say, they are in small companies. So, they show up to the office at, I don’t know, 9:00 or 10:00 AM, and then like by 6:00 PM, the office is completely deserted. So that was unthinkable from where I originally came from. In other words, I had a lot of time, and then I did enjoy writing programs. So like I had a lot of open-source projects going on at that time. Hundreds of them probably. In fact, I had this blog series called “Project of the Day”, and then I also created like one little library here and there a day and posted those. But there are a few other motivations. So one was that, like I mentioned, I was working for Java Enterprise Edition group. That’s a middleware that Sun was creating. It’s a standard so that the other people can build applications on top of it. It’s a platform, right? And then I realised, like one day I looked around, and nobody, well, nobody might be too strong a statement, but most of the people who were working on producing the platform were not actually using the platform to do anything.
So that seems like to go against this like a dogfooding principle. So I felt like, okay, I got to write some of these, use the stuff that our team produces to see if it’s any good. Back then my day job was XML, and I was engineering this like a four people or so team. I used to be known as the guy who breaks builds. Cause I failed to commit some changes here and there and then next time somebody updates their workspace, like things don’t even compile. So they did a little bit of digging and pick up a phone and call me, “Hey, you touched this file last time and it doesn’t seem to compile, so like can you check what’s going on?” And sure enough, like my change wasn’t fully complete. So that was embarrassing. At some point I got to fix that. So I said, “Okay, let’s write some program that do that.” So that it catches my mistake before my colleagues. And then, you know, it just so happens that there was this CI software written in, I think in Groovy or maybe Python. But then I really liked the way it looks, so I started using it, but it just, it didn’t scale. So the moment I started putting a little bit more work it started to crumble. So I thought, “Okay, maybe I can write this.” So that’s kind of how it started.
Henry Suryawirawan: [00:10:01] Wow. It’s really interesting because it started as a personal project to help you to avoid embarrassment of breaking the build. And actually the concept of CI itself is yeah, to actually detect that early so that people either can revert or fix the build, so to speak. So then once it becomes Jenkins, what was your contribution in order to make it such a successful project? So why people are picking it up massively?
Kohsuke Kawaguchi: [00:10:27] Sure. So in the beginning, like I was the only user, and that went on for like, I think almost two years. I was using it in my team. And by the way, like I discover that I wasn’t the only guy who’s breaking builds all the time. Everybody else in the team did. I just didn’t even notice those. So that was useful, and then they kept giving me these feedbacks, “Maybe you can do this and that.” And that was really encouraging for me. So that kept me going. By then I knew like I have this theory about what makes this thriving open-source project. Then in my mind, that is the kind of get out of people’s way. So in other words, often people who want to participate in open source project has the idea of what they want to do, like the direction in which they want to take this project. What’s really painful is to convince other people, especially the existing contributors in the project, so that they buy into the idea. Especially the bigger changes, or unproven changes.
As a general rule of thumb, developers are more interested in just doing things than convincing other people that doing this is the right thing. So I thought if you can create this also a technical architecture that enables people to experiment on their own, and then build things on top of Jenkins. Collectively as a whole, it still looks like Jenkins. And then if we build a supporting social structures so that the people are incentivized to bring those pieces together into one place, and kind of collaborate in some limited degree. So they still get it. They still get that as the kings??? when it comes to their plugins. Collectively that we need to move to certain direction. So I implemented that, and it has a technical part like I mentioned, but also has the social part. That means maybe like I’d be key contributions. And then, there are also the people started writing those. I was no longer writing a CI system. I was writing a platform that allows other people to write the CI systems. So that’s how I kind of grew it.
Current CI/CD Landscape [00:12:18]
Henry Suryawirawan: [00:12:18] So in terms of the current DevOps CI/CD landscape, what’s your view on this, actually? Your personal view? Because it has grown so much since that day.
Kohsuke Kawaguchi: [00:12:27] Right.
Henry Suryawirawan: [00:12:27] There are so many CI or CI/CD tools available these days. Some are popular, some are like open source, some are commercial. So what’s your view on the current CI/CD DevOps landscape?
Kohsuke Kawaguchi: [00:12:38] Yeah. So one thing that I find interesting was when I originally did Jenkins, one key role, the big function was a place for the collective truths. So in other words, like it was used as a means for people in the team to be on the same page about the state of things. Did this change successfully land? You know, is this test passing or failing? Now that you think, I mean, in 2020, 21, talking about these things like crazy, but back then, like these things involve a phone calls and emails and stuff. That’s often there’s the pain days??? and stuff like that, that made it difficult for people to agree whether this problem that they were seeing yesterday has already been fixed or not. So in that sense, like a CI server was very much visible. It’s almost like a billboard that the people go to see. And then I think over the last 10 years, it sort of like completely flipped in the other directions. Nowadays, typically like it’s almost completely invisible. Maybe the only interaction people have with it is when their pull request gets this like green check marks to we’re good to go or not. Yeah, so that was a surprising change for me.
So from there, I think about two things. The one is that, when it’s invisible, you can go up a lot more in the complexity sophistication. So to me, this is actually an opportunity. When people need to understand like what those things being there means, you can’t really go too far. It’s like an automatic transmission in a car, like it still makes you feel like there’s a distinct gear that you’d be shifting, but then this technology innovation happens like a continuous variable transmission, completely get rid of the notion the gear. So when you’re automated, you can take on the second next level of like evolution. That’s something I think about a lot. That’s part of what eventually became into Launchable.
Another part of it is I think there needs to be a little bit more on the pushback, like a swinging back on to serving useful information to the developers. So the reason you’re running these CI processes so that it catches something and then informs developer, they work on something. And then it went back to used to just basically like a one big information, I think there’s a lot of opportunities lost. I think there’s an opportunity for us to do things better and that’s where I see the CI system currently.
Henry Suryawirawan: [00:14:53] Yeah, I agree like CI server currently is more like a transparent system. Hopefully, everyone has been running CI server for building their software, and it became part of the norm, I would say in terms of software development.
Developer Productivity [00:15:04]
Henry Suryawirawan: [00:15:04] I think one of your passion is actually about developer productivity, right? So in your view, what is the productivity?
Kohsuke Kawaguchi: [00:15:11] Yeah. So I’m thinking mainly from the angle of, if I’m a leader of a team, then I’m thinking about, you know, I have a set of people with different skill sets and different levels of experience and sophistication, and I need to raise all of the boats. I’m going to make everybody a little bit more productive. So arm them with more something, like more information, more power. So that’s the angle that I’m thinking about. Eventually, like Hudson became one of the key drivers for me. So that was not how I started it. But as it got used inside the organizations, I started thinking, “Oh, so what this is doing is it’s helping junior engineers to be more effective.” Just a little bit, and it’s not like this one software is going to make them twice as valuable. But I have my own little contribution going there, so that became like an important part.
Another way I think of this is you know, in the movie Alien, who’s the lady?
Henry Suryawirawan: [00:16:02] Sigourney Weaver?
Kohsuke Kawaguchi: [00:16:04] Yeah. So she is actually riding through this power loader, and then she fights with the aliens. So I think of those, like what the CI system or this DevOps in general is doing is that kind of like, arming the humans, like amplifying their muscle with the machine power. So it’s almost like an exoskeleton, how to do more with less. That I think is what I think about the developer productivity. We are just basically trying to create more seatbelts. Because like we can churn out the changes as fast as we want if you don’t care about whether it works or not. And then it’s like an everlasting challenge. I mean, testing is a part of it, but it’s also just one part of it. And there are so many other layers that we rely on to build confidence of the change. That’s kind of developer productivity.
Henry Suryawirawan: [00:16:50] Yeah. Actually I like the second angle. I wasn’t thinking it that way. So creating seatbelts, or maybe guard rails so that you are also productive but effective. It’s not like productive as in like churning a lot of outputs, but also the right output. So I like that definitely.
Improving Our Productivity [00:17:06]
Henry Suryawirawan: [00:17:06] So as a developer, maybe from your point of view, because you have written so many code in open source project, how should one be conscious about improving their developer productivity? We talked about enterprise with CI server and things like that, but how as individual, we should be conscious about improving our productivity?
Kohsuke Kawaguchi: [00:17:24] Yeah. So when you frame it like that, productivity as an individual, it almost feels like how can you grow? Or like, how can you become a better engineer? I mean, I can talk about that, but before I get there, I guess in more like a narrowly, this developer productivity, as I discuss in terms of like, you know, given the same person, how can we do more? I think the kind of key thing for me is demand that you’d be armed. Demand with your boss in the organization that you need that exoskeleton, the power loader. I guess, like we have these tendencies, and I do, I did to build my own tooling. We know because we can, and it’s fun to do it. We tend to just prefer to roll our own, instead of spending money. Those money that we are spending it’s not even our own, right? But, I’ve been a long time user of IntelliJ. Because one of my hero programmer was using it and that’s how I picked it up. I think it was like 400 bucks per seat or something back then. And I thought, gosh, this is so expensive. It’s ridiculous. So I refused to spend money on it. But then like later I became a CTO, and like sales people, look at sales people. Every one of them spends $150 a month to Salesforce to get anything done. Our IDE is 400 bucks for perpetual license back then. I mean, right now I think they change to subscription, but back then, you just pay it once and then be done. I was so cheap. I didn’t want to do it. So that to me is like, that’s silly.
So I learnt a lot by writing code. Maybe more so by looking it back in a few years later, which ones worked out, which ones didn’t. So there’s definitely like the value in time testing things. More generally, I think it’s the idea of retrospecting. When you do it with some intentionality, do it with some hypothesis. “I’m going to do it this way this time, because I think of X, Y, and Z.” Normally, often people don’t think about those things, and they just do it because they feel like it. Without those theories and hypotheses and retrospect, it’s difficult to learn from how you did it. So to me, that’s the key thing, like intentionally trying a different API design style, like whether that work. Or like a walking back for what are the assumptions that I made in doing it this way? Did it actually hold or not? So somewhat like a mindset almost. I think that’s important.
So another thing, more often than not, in the context of work, engineering work is more like a finding the right balance or compromise. We don’t want to waste, “waste” like three hours of looking into something deep, and you can band-aid the program in 15 minutes. So that balancing act is an important skill, but I also find it useful from time to time, just go deep dive. You decide to solve the problem the hard way and then go solve it. In some sense, I think it kind of grows the muscle. Sometimes there’s no easier way out, and have to go straight, really, like no matter what the obstacle is. Being able to do that when you need to, I think that’s really powerful. I’m sure we all have this feeling of we spent three hours debugging, and eventually we only find like a one line that needs to be changed, not always. But from time to time, I think it’s a good thing.
Finally, I think communication and influencing others. Back when I started working for Sun, programming was a much more individual activity. We literally had this whole system, split it up two or three ways, and different people generally “owned” different parts. They did things the way they wanted. Around the interfaces, we talked about how to do things, but that was it. But now it’s a lot more of a collaborative activity. So I think communication skills in engineering have become far more important.
Launchable [00:21:06]
Henry Suryawirawan: [00:21:06] Thanks for sharing your personal view on this. I’m sure the reason why you launched Launchable is also partly to improve this developer productivity. So you mentioned something about testing and Launchable is actually some kind of like the tooling arm with machine learning model in order to be able to predict something. So maybe you can share a little bit more what is Launchable? Especially for people who haven’t heard about it.
Kohsuke Kawaguchi: [00:21:28] Yeah. Thanks. I’m going to start with what we do today in terms of functional software, and then where I am trying to go, because the latter is much bigger than the former. So I was working on this Jenkins project. That’s a project that’s been alive for a long time. We have, what, 18,000 or so test cases. Every time I make a one-line change, I have to wait one whole hour for all the tests to pass. And then somebody comes in for a code review, maybe he doesn’t like the variable name. We make some changes, and then it’s another whole hour of testing before I can land this change into the master branch. I find that quite painful. It’s a good thing to have a lot of tests, but if I’m changing a small part of a large system, then I don’t think running all the tests is particularly meaningful. There should be a smaller subset of the tests that would be pretty effective in catching any regressions that I might cause. That’s the reason we have smoke tests. The idea of the smoke test is, if you have a big test suite that takes hours to run, you create a really small subset, and you know that if any of these tests fail, then it’s probably not even worth running the whole test suite. So that’s another way of saying, “Okay, sometimes the subset is useful.” That smoke test is usually a fixed, manually curated set of tests. Actually, if you think about it, it’s probably not very optimal.
So one way to think about what Launchable is doing is, we look at the changes that people have created. And then we look at all the tests that you have, and then we make a prediction: okay, for these changes, you should be running these tests in this order, to maximize the chance that you get the failure right away, right out of the gate. And then depending on where you use it, that can be used differently. If you’re using it for pull requests, then what that means is the chance of you getting the failure right after creating a pull request goes up. So you don’t have to context switch to something else. You can immediately start working on the fix, push another change, and iterate a lot more quickly. In other places, I mean, I mentioned that a one-hour test is bad, but in actual real-world software projects, a one-hour test could be awesome. I’ve seen a full test cycle that takes seven days. It’s ridiculous. Imagine you find one bug and you’ve got to fix it, and then you need to redo the whole test cycle again. You can’t even do it, and those things are still happening. I’m sure a lot of the audience knows a project just like that. So in those cases, imagine if you can tell people, okay, in order to get 95% confidence that this change is regression free, maybe you only need to run 15% of the tests, and these are the tests. We can be precise about these things, and that’s what we do.
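To make the idea of "run these tests in this order" concrete, here is a minimal Python sketch. It is not Launchable’s actual algorithm, which is not described in the conversation. It assumes some model has already produced a failure probability for each test given the current change; the function names and the way "confidence" is computed (share of the total predicted failure probability) are illustrative assumptions.

```python
def prioritize(tests):
    """Order (name, fail_probability) pairs so likely failures run first.

    The probabilities are assumed to come from some prediction model;
    how that model works is outside the scope of this sketch.
    """
    return sorted(tests, key=lambda t: t[1], reverse=True)


def subset_for_confidence(tests, confidence=0.95):
    """Return the shortest prioritized prefix that covers `confidence`
    of the total predicted failure probability (an assumed definition)."""
    ordered = prioritize(tests)
    total = sum(prob for _name, prob in ordered)
    if total == 0:
        return []
    chosen, covered = [], 0.0
    for name, prob in ordered:
        if covered / total >= confidence:
            break
        chosen.append(name)
        covered += prob
    return chosen


if __name__ == "__main__":
    scored = [("a", 0.01), ("b", 0.6), ("c", 0.3), ("d", 0.09)]
    print(subset_for_confidence(scored, 0.95))
```

With scores like these, most of the predicted risk sits in a few tests, so the subset can leave out the long tail of tests that are unlikely to fail for this change.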
So now if I may step on to where that’s going to lead, this is in some sense just one “minor optimization”. And I like this optimization because it’s a personally painful thing for me, and I know a lot of people suffer from it. But I think of it, essentially, as a self-tuning delivery cycle. Like I talked about how the stick shift evolved into automatic transmission, into continuously variable transmission that doesn’t even have gear ratios. I think it’s something similar. If you have this automation process going all the time, producing data and behaviors, then the system itself should be able to use that information. First, you can inform people better. And second, that can make the whole delivery pipeline self-tuning. I’ll give you some examples. We talked about this one bit of information you get from the CI system. Sometimes it’s “failed”. So if something failed, what are you going to do? You click the link and look at how it failed, like you have five failed test cases. Now, what do you want to know? Okay. So three of these failures I actually know to be flaky tests, like who doesn’t have a flaky test problem? So somebody can tell you that, by the way, these three failures are probably not worth mentioning to you because they fail the same way left and right. But these other two are more interesting because those are new.
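The "three are known flaky, two are new" split can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any real tool: it assumes you keep a history of (test name, failure signature) pairs from recent runs, and it treats a failure as "known flaky" when the same test has failed the same way at least `flaky_threshold` times before. Both the data shape and the threshold are hypothetical.

```python
from collections import Counter


def split_failures(current_failures, history, flaky_threshold=3):
    """Separate failures that have been seen repeatedly from new ones.

    current_failures: list of (test_name, failure_signature) for this run.
    history: list of (test_name, failure_signature) from earlier runs
             (a hypothetical record you would have to collect yourself).
    """
    seen = Counter(history)
    known_flaky, new = [], []
    for name, signature in current_failures:
        if seen[(name, signature)] >= flaky_threshold:
            known_flaky.append(name)
        else:
            new.append(name)
    return known_flaky, new


if __name__ == "__main__":
    past = [("t1", "Timeout")] * 4 + [("t2", "NullPointer")]
    now = [("t1", "Timeout"), ("t2", "NullPointer"), ("t3", "AssertionError")]
    print(split_failures(now, past))
```

A real system would need a better notion of "failed the same way" than an exact string match, but even this crude version captures the contextual information being described.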
So those are kind of examples of contextual information. And it’s sort of like small ways that make you productive. So you make some changes and you break, like, two dozen different test cases on master because, you know, you updated the system, so you need to update the test code. And then you kind of begrudgingly do it because we all feel naked about quality, right? So like who am I to sit away??? the test. But are all these test cases actually really useful? We kind of amassed them over time. I mean, I don’t even know what this test case is testing. I just want to tweak the assertion to make it pass. When you add more tests down the road, maybe they are overlapping? So there should be some critical eye on, let’s say, the Return on Investment. Every test case incurs some cost in terms of execution time, in terms of maintenance, and all that. So is it actually producing the value to justify that cost? We should be able to get that information from the system itself, and we don’t today. That’s the kind of contextual information I’m talking about.
Now if you look at the whole delivery pipeline, it’s not even just one test suite. You have different test suites, or maybe sometimes you have different branch merging strategies, or designs. Both are the seatbelts that we talked about. In some small teams, the naive thing is everybody creates a pull request, it gets tested, and it lands in master. In a larger team, that actually breaks down. So they need a team-level integration branch where changes land, and then a second level of integration. I’ve seen those phases. And then there are unit tests that run quickly, so you can run them very close to the left side of the development cycle, very close to where the coding happens. Some of the integration tests can only run much later because they take five hours to run. You can only run them every night. So all these pieces, the build guy is going to go in and say, okay, we’re going to do it this way. I’m going to put these tests here to run every night, or maybe every week or something, and then that’s it. And once that’s in place, it stays for years, because nobody dares to touch it. Well, of course, as the project becomes bigger or changes, the optimal strategy for merging, or for where to run the tests, should actually change. So we run it every day. Is it better to run it every day? Or is it better to run it every one or two days, so that we have extra cycles that can be used to run different kinds of tests? Now what’s optimal? People are not actually making those decisions intentionally. So that’s the point: if you can at least make those decisions data-driven, and then further step up and leave those decisions to the machines. Now that’s the world that I want to get to.
Henry Suryawirawan: [00:28:10] So I can definitely relate to some of the problems that you mentioned. One line change, like worse, if it’s just a, it’s a string, right? It’s like, it’s harmless, like a string on a label or something, and you run through whole cycle of the build and you wait for that until it finishes. Only then, of course, you are sure that it will pass, anyway. So I understand all these things that you mentioned. But how do you integrate Launchable? Where exactly that Launchable is actually triggered, and how does it help over the time to actually make this kind of prediction or judgment for you to tweak? And then what you should do afterwards? Maybe you can explain a little bit on that side.
Kohsuke Kawaguchi: [00:28:47] Yeah. So the key to making it universally useful is to reduce the intrusion as much as possible. So make the surgery as small as possible. Because I know how scary it is for some people to modify their delivery pipeline. It can feel like a snowflake. So we created this very small open source command line tool, and that’s your agent. The first thing you do is, when you’re building a binary, you tell us, okay, here’s the source code that turned into this build, otherwise known as the bill of materials. What went into this program? Like a Git commit, basically. Now at some later point, when you run the tests, you tell us, okay, these are the 15,000 test cases I’m thinking about running, and this is the build that I’m going to test, and you point that back to that bill of materials. And then another parameter is, what’s your goal? So maybe you tell us, let’s create a 20-minute subset of the tests, or there are different ways to specify it. And then we’re going to say, okay, in that case, run tests number 1, 5, 7, and 20, and blah, blah, blah. And we give it in a form that’s very easy to pass into the test runner.
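The "20-minute subset" goal mentioned above can be illustrated with a small Python sketch. This is only one plausible way to do it, not how Launchable works internally: it assumes each test comes with a predicted failure probability and a historical duration, and it greedily picks tests with the best probability per second until the time budget is used up. All names and numbers are hypothetical.

```python
def subset_for_time_budget(tests, budget_seconds):
    """Greedily choose tests that fit within a time budget.

    tests: list of (name, fail_probability, duration_seconds).
    Tests are ranked by predicted failure probability per second of
    runtime (an assumed heuristic), then added while they still fit.
    """
    ranked = sorted(tests, key=lambda t: t[1] / max(t[2], 0.001), reverse=True)
    chosen, used = [], 0.0
    for name, _prob, duration in ranked:
        if used + duration <= budget_seconds:
            chosen.append(name)
            used += duration
    return chosen


if __name__ == "__main__":
    candidates = [
        ("slow_likely", 0.5, 600),
        ("fast_likely", 0.4, 60),
        ("fast_unlikely", 0.01, 30),
        ("huge", 0.9, 3000),
    ]
    # Print one test name per line, a form most test runners can consume.
    print("\n".join(subset_for_time_budget(candidates, 20 * 60)))
```

The output is a plain list of test names, which mirrors the point about handing the result to the test runner in a form that is easy to pass along.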
You might be using, I don’t know, Maven or Node, and every programming language has multiple test runners. So there’s an incredible number of those. But usually, these test tools are capable of receiving which tests to run, because developers need that interactively. So we use that mechanism to pass the subset to the test runner. And then after everything is said and done, we need to see how those tests behaved. So you point us to the test reports, usually JUnit files, which is the de facto standard XML report format. And then we look at those to see, okay, this test ran and took 1.5 seconds, this one failed, and blah, blah, blah. We get a slew of that information from your side, and we crunch those numbers with AI. Tomorrow our brain will be a little better, so that we can make a little better prediction. That’s how it works.
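As an illustration of the kind of information a JUnit-style XML report carries, here is a minimal Python sketch that reads one with the standard library. The sample report and the output dictionary shape are made up for this example; real reports vary between tools, and this is not the parser Launchable uses.

```python
import xml.etree.ElementTree as ET

# A tiny, hand-written report in the common JUnit XML shape (illustrative only).
SAMPLE_REPORT = """
<testsuite name="example" tests="2">
  <testcase classname="FooTest" name="testA" time="1.5"/>
  <testcase classname="FooTest" name="testB" time="0.2">
    <failure message="boom"/>
  </testcase>
</testsuite>
"""


def read_junit_report(xml_text):
    """Extract name, duration, and status for every <testcase> element."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "failed"
        elif case.find("skipped") is not None:
            status = "skipped"
        else:
            status = "passed"
        results.append({
            "name": f'{case.get("classname", "")}.{case.get("name", "")}',
            "seconds": float(case.get("time") or 0),
            "status": status,
        })
    return results


if __name__ == "__main__":
    for row in read_junit_report(SAMPLE_REPORT):
        print(row)
```

Each row is the sort of record described above: which test ran, how long it took, and whether it failed.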
Henry Suryawirawan: [00:30:51] You mentioned about language and testing framework. Is this tool language specific? Like tied to a specific programming language? For example, we have also multiple layers of tests, just like you mentioned, you have unit tests, functional tests, maybe UI tests, security tests, performance tests. Can this be tool used for all those tests as well?
Kohsuke Kawaguchi: [00:31:09] So no, it does not depend on any programming language, and yes, it can be used with any tests. Quite intentionally, I decided that, at least right now, I’m not going to look into what the source code is and how it’s structured. There are some tools and approaches in which people parse, let’s say, the Java source code, build the syntax tree, and try to do an almost code-analysis-based approach, or a code-coverage-based approach. But that’s going to, how to say, naturally make it programming-language specific. We have this incredible diversity in how we do things. It’s actually a problem, it’s really a problem. But that’s the reality. So I wanted to make this universally applicable, because, you know, the quality problem is universal. The flip side of it is, I think I’m failing to exploit some performance and efficiency that the machine learning could achieve. But right now, frankly, squeezing out the remaining 5% of the performance isn’t my concern. I think it’s the idea that something like this is useful, that we should be leaving more control to the machines. That’s the idea that we need to educate people about, and people need to pick up. So that’s where we stand.
And the same thing goes for the phase of testing. You’re right, you may be doing anything from unit testing, to API testing, to UI testing, to whatnot. So the general idea is, you have this thing you’re testing that can be attributed to certain source code, the bill of materials, and then you have something that runs against it and produces a binary result, pass or fail. Then we can fit into this mold. So that’s what we do. Unit tests, for example, people tend to run a lot, so that produces a lot of data, and more data tends to lead to a better model. So I’m seeing different performance. But that is also true from one project to another. Right now, yeah, you can take this to anything, and then let’s see what it does in your project.
Henry Suryawirawan: [00:33:03] I mean, certainly it will work nicely with the software that has a lot of tests. But what if the software has very little tests? I’m sure there are many software these days that don’t have enough test coverage.
Kohsuke Kawaguchi: [00:33:14] Yeah. So there’s like a whole other area of quality effort like you said focused on automating tests. I’m not in that fight. Automating is hard. So I expect that to continue. But it’s also, the automation is a very fragmented field. Like in a browser, automation it drives one thing. Mobile automation is a wholly different game. And if you’re trying to automate the testing of a video game, it’s entirely another thing, and then so on and so forth. So, again, it’s really hard to build anything universal in that space. Yeah, so I leave that fight to these people. And then when you start to accumulate more tests, now you have a different problem, and then that’s where we can help you.
Launchable Customer Story [00:33:54]
Henry Suryawirawan: [00:33:54] Right. I’m sure you have a number of customers who have tried and used Launchable. Maybe any particular story that stood out for you that you can share with the audience here. How Launchable actually helped that customer to improve their productivity or their testing cycle?
Kohsuke Kawaguchi: [00:34:10] Yeah. So I went to this conference, BazelCon. Bazel is a build tool that came out of Google. It’s an incremental build and test system. So when people have massive software, their build and test take a long time, and these people rely on these tools. Now, changing a build tool is a huge deal. It impacts everybody’s working style. So like I said, BazelCon is a place where all the users, I guess the key users of Bazel, come together and swap notes. In one of the presentations there, the speaker was saying, “Oh, in this company, we assigned four engineers working for two years to take on this project, and we are almost done, but it’s still going to take half a year or so.” I was shocked to hear that. Gosh, you spent eight man-years of, you know, good engineers. These build guys, I know that they are good engineers, and you’re still not done. And then the first guy who stood up to ask a question said, “Oh, that’s amazing. You did it so quickly. What was the trick?” And I was shocked, like, “What’s going on here?” This is not the timeframe that I’m used to. So you know, as I’m saying to him when calling??? those guys. And then they introduced us to another team inside the company, and so now we started working with those.
So in some sense, this is not an ideal world. Most of the time, especially in established projects, these people are between a rock and a hard place, and have to keep things running every day. It’s difficult for them to squeeze out a little bit more bandwidth. They are doing embedded development, with test and hardware combinations, so tomorrow a new hardware type arrives and the test workload multiplies. So they need immediate help to get stuff under control, so that they have the cycles and the option to take on larger, longer projects. I think they see us as a key way to buy more time. Because when you’re running a subset of tests, you don’t need to change anything fundamental. You don’t need four engineers working for two years before you start to see any benefit. So that was that experience.
And there’s like another, on the totally opposite end of the spectrum. There’s a team of two engineers working on the web service, and their testing cycle is twenty minutes. So like a far cry from one hour unit test. But he was nonetheless still frustrated. 20 minutes is a difficult time. If I’m telling you that your next meeting is going to be in 20 minutes, what are you going to do? What can you do? It’s too short to work on something new. So it’s like adjusting this like a weird neck spot???. So we’re able to cut that down into like 11 or 10 or something, basically half. So he really liked that. That creates this much rapid cycle. He doesn’t need to context switch as much. He even said that allowed him to write more tests because like previously, he was wary that every test he adds, it’s going to add to the time. So that was a demotivator, but now he doesn’t have it. So that was another great story for me.
Future of Software Testing [00:37:13]
Henry Suryawirawan: [00:37:13] So you have been dealing with these problems since I think last few years, about software testing. So what do you think the future of software testing? Like tools like this, I think this is probably one of a kind. I haven’t heard such tool before. Like I know about Bazel incremental build, like it doesn’t build the whole software. But to test only a part, a subset of your test cases, I think it’s still one of a kind. So what do you foresee the future of software testing going forward?
Kohsuke Kawaguchi: [00:37:40] Yeah. So there’s like I think two things going on. One is like ensuring quality became a lot bigger than testing. So at Sun, again, it’s like 20 years ago, we are shipping packaged software. A lion’s share of the quality came by running tests, and making sure that things are good before you ship it. Now for the most of the people, that’s not how we ensure quality. I mean, testing is still a key pillar, but now we have other means like feature flags, controlling, doing a smarter rollout, using monitoring to flag the failure risk before too many customers get impacted. I feel like there’s more defense in depth. There’s all sorts of practice combined, like each of them designed to reduce the risk, accepting the fact that some bad stuff is going to happen. So I think there’s a lot of that going on, more of that is going to happen.
And then, not only on the testing side, I think what needs to happen is focusing more on the increments. Again, Sun was in the business of selling packaged software, and we were releasing it maybe once every three months, or a major release once every two years, a minor release every six months, and patches every month or something like that. I mean, that’s the kind of timeframe we used to operate on. So there, having a full test cycle that takes two weeks is kind of okay. It’s painful, but it was manageable. But if you’re trying to ship every day, deploy every day, you cannot possibly have a test cycle that takes weeks. But the amount of tests that you amass could easily run up to that. So I think the way to make this work is, it’s almost like movie frames. If you look at the reels, there are 24 frames per second or something like that, right? One frame to the next is only going to change a little bit. But if you try to inspect every frame thoroughly, then you’re not going to have enough time. Maybe you can do it with one frame per second, not 24. So it really creates this natural upper limit on how frequently you can do it. So I think the right thing to focus on is not to look at one frame in the movie, but to look at two frames in the movie and compare the difference. Because if your speed goes up, that delta is going to go down. So I think that’s the change that needs to happen. Some people are already doing it, like with a hotfix. There’s a different process for a hotfix, like an expedited lane. That’s a very crude way. Well, the hotfix is small, so we should be able to do something small and put that into production. So it’s the same thinking. I don’t think that they’re doing it consciously. But that needs to be more embraced and become more of a daily routine.
Henry Suryawirawan: [00:40:19] Yeah. I mean like hotfix, yeah. I must admit, I also have this kind of high-speed lane where you probably don’t write the whole test. Sometimes even you are so confident that you just deploy it in whatever way. So I can see definitely with tools like this, and hopefully in the future, more and more such tools available, we can improve this kind of hacky way of doing the software deployment. So I think that will be great, definitely.
Data-Driven DevOps [00:40:41]
Henry Suryawirawan: [00:40:41] So I saw another talk that you gave maybe few months ago. It’s about data-driven DevOps, right?
Kohsuke Kawaguchi: [00:40:48] Yeah.
Henry Suryawirawan: [00:40:48] So what do you mean by data-driven here?
Kohsuke Kawaguchi: [00:40:51] I mean, I talked a little bit about how we went from stick shift, a manual transmission, to automatic, to the continuously variable transmission. So my thesis is that, by and large, the way people put together the delivery process is by gut and instinct. So how do you go up from there? I think the next step up from there is to use the data that comes out of your system to help you make the right decisions. Is this test failure real or flaky? The experienced person who’s been on this project for long already knows that, “Oh, sometimes it fails in that way.” So let’s say it’s tribal knowledge. When I say data-driven DevOps, I mean don’t rely on this human experience, and let the system tell you. We should be able to find that kind of information from data. So that’s one.
And then I gave a few stories of how real-world teams are able to do something seemingly very small and easy, and still manage to get a lot of value out of these efforts. There was one example of a guy doing regular expression pattern matching on the last X lines of the build log. Regular expressions, of course, that has to be the tool that we bring in. It’s almost got this smell of a band-aid solution, right, from the beginning. So depending on how stuff failed: if the infrastructure failed, if the CI build and test infrastructure failed, then you want to notify the CI team. If the failure is due to the application code, or test failures, you’re going to notify the application developers. And nothing is worse for the CI team than for the app developers to get notified of a failure caused by the CI team, because it really hurts their credibility. So he wanted to defend his own fame against this, and that was really effective. I can totally see that. So those are great examples in my mind of how you’re using data to make little improvements here and there. So I’m not talking about the number of pull requests per day, or story points. I’m sure those are good for some things, but that’s not what I mean. It’s more about the tactical improvements we can do. So that’s the kind of story behind data-driven DevOps.
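The log-matching story above can be sketched in a few lines of Python. This is a guess at the general shape of such a script, not the one from the talk: the patterns, the team names, and the default of 200 lines are all hypothetical and would need to be tuned to a real CI setup.

```python
import re

# Hypothetical signatures of infrastructure trouble; a real list would be
# built up from failures actually observed in your own CI system.
INFRA_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"no space left on device",
        r"connection (refused|reset|timed out)",
        r"agent .* (offline|disconnected)",
    )
]


def route_failure(build_log, last_n_lines=200):
    """Decide who to notify by scanning only the tail of the build log."""
    tail = "\n".join(build_log.splitlines()[-last_n_lines:])
    if any(pattern.search(tail) for pattern in INFRA_PATTERNS):
        return "ci-team"
    return "app-developers"


if __name__ == "__main__":
    print(route_failure("compiling...\nERROR: No space left on device\n"))
    print(route_failure("Tests run: 10, Failures: 1\nBUILD FAILURE\n"))
```

Only looking at the tail keeps the check cheap and focused on the lines where the actual cause of the failure usually appears.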
Henry Suryawirawan: [00:42:52] So for those kinds of scenarios, then how would you actually collect the data? Because yeah, it’s not integrated most likely into our tools, into our software development. So how would you track all these data, and actually use that to do this retrospection, and also improve in the future?
Kohsuke Kawaguchi: [00:43:09] Yeah. So, luckily, we are not the only people who are suffering from the data problem. The whole world is going through “the big data thing”. So there are a lot of people getting data from wherever, marketing, sales, customer interactions, and trying to extract useful knowledge out of it. So this is a well-trodden space. Lots of BI solutions. And I’ve seen a number of teams over the last decade that built their in-house solutions. You can spot those. Those are a dime a dozen. If you ask me, and this might sound really selfish, that smells like back in the days when every company was building its own CI system because they needed it. But it takes time, and trust me, you don’t want to do it. So I think that’s a sign for me that a more general solution needs to appear. That’s part of the motivation for me doing Launchable. Clearly it can be seen as a universal problem, and people are solving it on their own. We should be able to do better than that.
Running Launchable [00:44:09]
Henry Suryawirawan: [00:44:09] So for people who would like to learn and use Launchable, is there some kind of a free tier, where do they play? Actually, I don’t know how to run this Launchable. You mentioned there’s a CLI, is it like running on local?
Kohsuke Kawaguchi: [00:44:21] Yeah. So if you go to LaunchableInc.com, you can sign up and get the API key within 15 seconds, and then there’s public documentation. And as I said, you only need to insert a few things in your CI script to get this going. It’s free for open source projects. It’s also free for small teams. And I’m trying to work on building a playable demo. Because the thing is, before you can start seeing the impact in your project, it takes a little bit of time for data to amass. So I have this toy project. The idea is that you go ahead and create a pull request against this toy project, and we will show you that it’s going to pick up the right tests.
Henry Suryawirawan: [00:44:59] So I’m sure it will be great if it’s integrated into an open source project, and you can see it over time, because open source normally has a lot of contributions. Especially the most popular ones, of course. Probably it’s a good way to demo the capability of Launchable, right?
Kohsuke Kawaguchi: [00:45:13] Yeah, exactly.
3 Tech Lead Wisdom [00:45:14]
Henry Suryawirawan: [00:45:15] So Kohsuke, thank you so much for your time. Before we end the conversation, normally I would ask one question for all my guests, which is about three technical leadership wisdom for all the listeners to listen from. What kind of advice would you give to them?
Kohsuke Kawaguchi: [00:45:28] Yeah. So this is an interesting question that made me pause a little bit. I think for most technical leaders, the starting point is leadership by code. That was the case for me. You just write better code more quickly, and that’s how you influence people around you. And that’s great, but it actually only carries you so far. You might be able to do it 3 times better, maybe 10X better, but by the time you need to influence 10 people, your 10X better coding skill has become a wash. So I felt like I struggled a lot jumping from that to the next thing. So I think recognizing that that’s fundamentally not a scalable enough way to make a difference, that’s one.
And two, I often describe it as a sphere of influence. As a technical leader, you make an impact by influencing the engineers around you. In some sense, they don’t really need you. They get things done in their own ways. So what is the value that you can add? One thing that I gravitated to is the power of storytelling. At one point at CloudBees, I made it my job to go out to software development teams, talk to them about how they do things and their struggles, and create a story, almost like a portrait of their lives, and then bring it back to the company. The idea here is I want to help you mentally picture whose problem you’re solving, and also hopefully get you excited about solving those people’s problems. If you can picture your customers, it’s always better than trying to help an abstract set of people. That’s the power of storytelling, and I’d like to think it made some impact. It changes how people do things a little bit. And that’s all you can hope for, but those things are much more scalable. Like I can write this one piece in a few hours, and it gets read by 200 people or so. So coding is difficult, but writing more so. Presentations are similar.
And then along similar lines, I started thinking about empowering more individuals. So you’re trying to do something by yourself is useful, but again, it only scales so much. At some size of the organizations, like it becomes more efficient to spot people who are trying to do the right thing, and then send more spotlights and resources and I can tell people like, okay, this guy is doing awesome things, let’s help him. So that, I find to be somewhat rewarding too, but more effective. Those are the things that I felt like it’s worthwhile to think about.
Henry Suryawirawan: [00:48:05] Those are pretty good. I really like the storytelling part, and also like putting the spotlight for good behaviors that you want to promote within your team. I think that really resonates with me as well. So thanks, Kohsuke, for this lovely chat. So if people want to find you online, they want to connect with you or learn more about this Launchable product, maybe where can they find you online?
Kohsuke Kawaguchi: [00:48:25] Yeah. So Kohsuke.org is my website, and there are all the kind of key links available. LaunchableInc.com is our company website. I’m there on Twitter, Facebook, et cetera, the same with the company and Jenkins is also as well. So yeah, that’s how you can find me.
Henry Suryawirawan: [00:48:43] So thanks, Kohsuke. I wish you good luck with Launchable. Hopefully, one day I will be able to play around with it as well.
Kohsuke Kawaguchi: [00:48:49] Thank you. Yeah. And I’m looking forward to going back to Singapore at some point.
Henry Suryawirawan: [00:48:53] All right, cool. Then we’ll see you by then.
– End –