#34 - Improving Developers’ Productivity With Universal Code Search and Sourcegraph - Beyang Liu
“Developer productivity is not lines of code written. It’s not the number of commits. It has to do with the ultimate problem you’re solving and the users that you’re solving it for.”
Beyang Liu is the CTO and co-founder of Sourcegraph, a developer tools company that brings universal code search to developers. In this episode, Beyang shared with me his perspective on developers’ productivity and how we should go about measuring it, including the danger of measuring productivity with proxy metrics. We then discussed the rationale for universal code search and why he thinks there is a massive need for it to increase developers’ productivity, borrowing from his experience working at Google, and especially to cope with the current era of “Big Code”. Towards the end, Beyang shared how individuals can improve their personal developer productivity and what the future state of developer tools might look like. Also, listen out for some of the cool Sourcegraph use cases that Beyang shared based on feedback from his customers.
Listen out for:
- Career Journey - [00:04:53]
- Developers Productivity - [00:07:47]
- Measuring Developers Productivity - [00:12:15]
- The Danger of Proxy Metrics - [00:16:51]
- Productivity in Enterprise vs Startup - [00:23:40]
- Rationale for Code Search - [00:26:54]
- Code Search Case Studies - [00:33:16]
- Other Useful Developer Tools - [00:38:32]
- Ex-Googler’s Guide to Developer Tools - [00:42:20]
- Improving Personal Developer Productivity - [00:46:07]
- Future State of Developer Tools - [00:49:32]
- 3 Tech Lead Wisdom - [00:54:28]
_____
Beyang Liu’s Bio
Beyang is the CTO and cofounder of Sourcegraph, a developer tools company bringing universal code search to every developer in open source and every software organization, including leading companies like Uber, Dropbox, Yelp, PayPal, Cloudflare, and more.
Follow Beyang:
- Twitter – https://twitter.com/beyang
- LinkedIn – https://www.linkedin.com/in/beyang-liu/
- Website – https://beyang.org
- Sourcegraph – https://sourcegraph.com/
Mentions & Links:
- “An ex-Googler’s guide to dev tools” blog – https://about.sourcegraph.com/blog/ex-googler-guide-dev-tools/
- An ex-Googler Technologies Guide – https://github.com/jhuangtw/xg2xg
- Quinn Slack – https://www.linkedin.com/in/quinnslack/
- 📚 Accelerate – https://amzn.to/3tcjzc6
- DORA metrics – https://cloud.google.com/blog/products/devops-sre/using-the-four-keys-to-measure-your-devops-performance
- Flow state – https://en.wikipedia.org/wiki/Flow_(psychology)
- Observability – https://docs.honeycomb.io/learning-about-observability/
- TI-83 graphing calculator – https://en.wikipedia.org/wiki/TI-83_series
- BASIC – https://en.wikipedia.org/wiki/BASIC
- Google – https://about.google/
- Microsoft – https://www.microsoft.com/
- Palantir – https://www.palantir.com/
- Google Code Search – https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43835.pdf
- Microsoft Windows – https://www.microsoft.com/windows/
- Scrum – https://scrumguides.org
- DevOps – https://en.wikipedia.org/wiki/DevOps
- Coveralls – https://coveralls.io/
- Codecov – https://codecov.io/
- Sentry – https://sentry.io/
- Honeycomb – https://www.honeycomb.io/
- Pulumi – https://www.pulumi.com/
- Terraform – https://www.terraform.io/
- HashiCorp – https://www.hashicorp.com/
- AWS CloudFormation – https://aws.amazon.com/cloudformation/
- Docker – https://www.docker.com/
- Kubernetes – https://kubernetes.io/
Tech Lead Journal now offers swag that you can purchase online. Each item is printed on demand based on your preference and delivered to you anywhere in the world where shipping is available.
Check out all the cool swag available by visiting techleadjournal.dev/shop. And don't forget to show it off once you receive it.
Career Journey
-
After I left Google and worked at other companies, I realized that code search was not the standard thing, and so it always felt like this piece of the toolkit that was missing.
-
We spend very little of our time doing that (the act of creation) and way too much of our time just trying to make sense of the existing code base. We also saw that pain reflected in the software engineers at the customers we were working with. That was the kernel or the initial conversation that ultimately grew into Sourcegraph. We wanted to get back to that awesome flow state that every developer loves, the one that inspired us to get into programming, and make it more accessible to day-to-day developers by helping them understand their existing code.
Developers Productivity
-
Developer productivity is not lines of code written. It’s not number of commits. It’s kind of irreducible to any sort of low level metric like that. Because if you think about it, at the end of the day what we do as developers is we’re really creating new technology. Every single feature or product that we work on is this new thing. It’s never existed before. Our job is to build a program that solves that new problem in a new way. It’s almost like a mini R&D project rather than a factory model.
-
It has to do with the ultimate problem you’re solving and the users that you’re solving it for.
-
Organizations that end up treating that (lines of code) as a proxy, even like a first order proxy, I think end up introducing perverse incentives into their development teams. That’s how you get a lot of code written, but not actually solving user problems.
-
The other kind of angle to look at this is to view it in terms of the overall picture. In some sense, every piece of software is worth some amount of time saved or money because time is money in a certain sense. So if you look at the value that you’re creating, one thing that you can look at is how much is this piece of software being used by users? Who’s using it? And how much are they paying you? Or how much revenue are you generating from this piece of software? Because that gives you a better proxy for how much productivity you’re generating.
-
If you want to measure productivity at a high level, you start with the end user. Think about time saved or money saved through the use of your software.
-
One responsibility of the product organization is to roughly map user value, time saved, whatever matters to the company and to the customers to this point system that reduces all those complex factors and their interrelationships between different teams to this rough point allocation that’s easy for engineering teams to reason about.
-
At an individual level, it’s almost like this feeling. It’s almost like this binary thing. As a developer, you either feel productive or you don’t. So at an individual level, I think it often reduces to just which mode are you in? Are you in paralysis mode or are you in getting stuff done, shipping code mode?
Measuring Developers Productivity
-
It is important to measure things and to quantify them. Because numbers keep people honest. Obviously, they’re never going to capture everything that you care about. But in order to have something falsifiable and checkable, it’s good to define a metric and then track your progress toward it. So metrics and quantifying things are important. But it’s really important what metrics you choose and how you quantify that.
-
Right now, I don’t know if there is like a single universal metric of developer productivity that makes sense to adopt across every software organization. Because I think whatever metrics that you use are, in some sense, tied to your unique organization, and tied to the problem that you’re solving, or the type of software that you’re building.
-
Every organization is a little different. It starts from the top. The person at the top has a number that they’re trying to optimize as a proxy for the overall end user value that they’re trying to deliver. And then part of the art and craft of building an organization is figuring out what the right breakdown of responsibility is, and what metrics are going to be used to evaluate how each constituent part of that organization is working.
The Danger of Proxy Metrics
-
If you’re using any sort of point-based system, one of the things you want to be wary of is ensuring that you’re being honest with yourselves about point allocation. A lot of it depends on the specific people you have on the team.
-
One thing you can do is send out a survey. Just survey your developers to see how satisfied they are, and ask them how productive they feel relative to their ideal level of productivity, or the levels of productivity that they attained working at previous companies. A lot of developers have a good innate sense of how productive they feel, because again, it’s this binary thing.
-
You can also look at the ops and deployment side. There’s a standard set of metrics now, or I should say an emerging standard, related to how quickly you deploy and ship software.
-
Unfortunately, I don’t have a great answer, other than talking to your developers and asking them, “Hey, are you hurting right now?” or “Do you feel like you’re being really productive?”
-
There isn’t really a good universal set of metrics that you can look at on the application engineering and product side that gives you a good idea of how well your team is operating. I think that side of the picture is almost emotional, intuitive, empathetic. And I think that’s why empathy is such an important quality in engineering managers. Because they kind of have their finger on the pulse of the development team. They kind of know what the potential of the people on that team is, and how close they are to realizing or actualizing that potential.
-
I think you can’t use those (e.g. code coverage) as top-line indicators of overall health. They are just KPIs that you track to address specific problems. And you always keep the human in the loop here. You’re always in touch with your development team to ensure that as the number goes down, people also feel better about it.
Productivity in Enterprise vs Startup
-
I think fundamentally it’s the same. For software developers, wherever you are, whether in a big company or a small company, there are a couple of things that you really care about that make you really productive. Fast feedback loops: are you able to build, test, and compile the code quickly? How long does it take between authoring a change and seeing it in production, in the hands of users?
-
One of the things that might be undervalued at both startups and large enterprises is the amount of the job of being a professional software engineer or developer that is reading and understanding code.
-
Before you can actually write the code that builds the specific feature that you’re implementing, you have to understand the code base that you’re contributing to. You have to understand how your change fits into that. You have to understand what other shared libraries or packages exist in your code base, or perhaps out there in the open source world, that you can use and leverage, because you don’t want to reinvent the wheel. And then, when you go to submit the change, most organizations these days practice code review. So someone else on your team has to go and review the code and understand your change, understand how it impacts the rest of the code base. Ideally, they understand it as well as the person who authored the change.
-
I think where it gets different between the two scenarios is just the constraints that you have to work with. If you’re working inside a large existing organization, especially one that has an established product with a ton of users, you’re going to be working in an environment with more constraints. Those constraints are sometimes bad, but often good, and they always exist for a reason. These are all constraints that are necessary to make the team functional on a larger scale. All in the service of building a more complex product, that’s able to do more stuff for more users.
-
It becomes a bigger part of the problem-solving role of software engineering inside larger organizations. Because you’re not just solving for the needs of the users directly, but you’re solving also for these constraints imposed by the overall organization, in the service of the overall user set or customer set of the organization.
Rationale for Code Search
-
The problem is this collection of challenges and difficulties that people are starting to call big code. Big code is a buzzword. It’s like big data in a sense. But what it encapsulates is all the things that get much harder when you’re developing code at scale inside this world and universe with way more code than there used to be. One side of that is the code bases inside organizations.
-
And the other side is the exploding world of open source. The past 10 years have seen this sharp uptick in the volume, in the number, and the diversity of open source libraries and packages that are available.
-
Working inside a large code base means that there’s more context that you might be unaware of. There’s more to read and understand before you can start writing your new feature with confidence.
-
All these challenges are related to the fact that we’re operating in this world of an unprecedented volume of code, and all of it might be relevant to the task you’re doing. And you want to take advantage of that wealth of knowledge. And at the same time, you have to satisfy the constraints that volume of code might impose on you.
-
All these challenges add up, and at some point they reach a breaking point. At some point, the constraints that you have to work with and the context that you have to load up in your head start to be too much to fit into your IDE. You can’t just clone all the code that you need to know about to your local machine, and one by one set up their development environments and spin them up in your IDE.
-
The best way to learn a new library is often by example.
-
When you’re operating in this giant code base, you never know when you’re going to have to understand a part of that code that’s unfamiliar to you, or that you’ve never worked on before. And in order to get there as quickly as possible, with minimal friction and minimal context switching, you need something like code search that’s optimized to help you discover and understand those unfamiliar pieces of code.
Other Useful Developer Tools
-
This is a really exciting time to be working on developer tools and to be a software developer in general.
-
It used to be very much the case that good developer tools were associated with a single proprietary ecosystem.
-
It’s also been a bit of a challenge because now there’s like a bunch of different choices to choose from.
Ex-Googler’s Guide to Developer Tools
-
Google as a development organization is one of the most advanced and sophisticated in the world. They pioneered a lot of tools and technologies that have since either made it into open source or they’ve inspired similar open source counterparts.
-
A lot of the tools that they use internally are similar to ones in open source, but they’re not the same ones.
-
The developer experience inside Google is so good that one of the first things that developers do when they leave Google is they try to recreate pieces of that developer experience.
-
Our goal in the post was to present some of that information in more of a narrative fashion. And talk about not only what tools you might want to look at, but like, how do you bring those tools into your organization, as an ex-Googler entering the outside world.
Improving Personal Developer Productivity
-
You always want to start with the pain that you’re trying to solve.
-
When you’re coding every day, make a note of what you find difficult, what annoys you, and where you’re context switching away from code. Anything that takes you out of that flow state, or anything that prevents you from getting into it, is something that you should address. And you should address it in the way that programmers address every problem, which is to figure out how to automate it. Chances are if it’s causing you pain, it’s because it’s something rote or repetitive or manual, or potentially unnecessary.
-
Find developers on the internet who write blog posts, or maybe they tweet about their workflow or the tools that they use, and use the tools that they use.
-
One of the ways you get good at any sort of craft is you learn from the masters of that craft.
-
Software development is one of those things where there’s so many tools that chances are if you talk to anyone who’s been doing this for at least some period of time, they’ll probably have at least one tool that they are aware of, that would be really useful to you, that would be really cool to try out.
Future State of Developer Tools
-
I think you’ll see more and more companies that are building tools, where the tool that’s relevant for the individual developer, that part is going to be open source. And then there’s an enterprise component that is more important for teams or organizations that might be kept proprietary in order to be able to build a sustainable business on top of the technology.
-
With the explosion in diversity of tools that we’re currently looking at, there will inevitably be the pendulum swinging backwards a little bit into more consolidation. Especially around particular ecosystems and targeting specific aspects of the software development life cycle.
-
I think now we’ve gotten to the point where there is no single entity that can capture all the creativity and innovation that can happen with software. And you see that with the web.
-
You’re going to see the continual flourishing of a third-party, independent developer tools ecosystem that’s going to resist complete consolidation.
-
As software continues to eat the world, as software development becomes ubiquitous, at some point, the dev tools market will become like the productivity tools market. Because code will become so synonymous with knowledge work that the vast majority of people in the community will be building software in some shape or form. And that’s going to be like this huge, vibrant, diverse ecosystem.
3 Tech Lead Wisdom
-
When it comes to motivating the people on your team, and unlocking their potential, you always want to start with the why aspect of their jobs.
-
So rather than telling people “do this, do that”, you always want to give them a kind of high-level goal that they can use their creativity to find a solution to.
-
That creative drive is what motivates every developer. It’s why we got into programming. That act of creation. That act of creative problem solving.
-
If your manager is explaining a problem to you, ask the questions about, what is the end desired state here? Rather than just taking down the orders. Because I think that will ultimately help you do your job better, and lead to a happier manager because you’ve done a better job. You’ve found a more creative solution to the problem that you were asked to solve.
-
If you find yourself doing anything twice that you don’t like doing, find a way to automate it, or look for a tool that automates it.
- Your job as a programmer is to automate things. And if you’re not automating your own life, then you’re not living your own values.
-
Remember to have fun. It’s amazing that we get to work in a job like this.
- At the end of the day, the best way to make yourself do a good job, and also not get overwhelmed by the complexity you have to deal with, is to remember what got you into programming in the first place. Focus on that and try to experience that as many times per day as possible. Because I think if you do that, it will drive you to focus on more of the creative aspects of the job. It will also motivate you to automate the more mundane or rote aspects of the job. And it’ll just make you a happier person because it’ll feel like you’re doing something that truly exercises the human creative aspects of your brain.
Episode Introduction [00:00:45]
Henry Suryawirawan: [00:00:45] Hey everyone. Very excited to be back here again with another new episode of the Tech Lead Journal podcast. Thank you for tuning in and spending your time with me today listening to this episode. If you’re new to the podcast, know that Tech Lead Journal is available for you to subscribe on major podcast apps, such as Spotify, Apple Podcasts, Google Podcasts, YouTube, and many others.
Also, please check out and follow the Tech Lead Journal social media channels on LinkedIn, Twitter, and Instagram. Every day, you will find words of wisdom from the latest podcast episode, which I share on those channels to give us some inspiration and motivation to get better each day. And if you’d like to make some contribution to the show and support the creation of this podcast, please consider joining as a patron by visiting techleadjournal.dev/patron. I highly appreciate any kind of support, and your contribution would help me towards sustainably producing this show every week.
For today’s episode, I am happy to share my conversation with Beyang Liu. Beyang is the CTO and co-founder of Sourcegraph, a developer tools company that brings universal code search capability to developers in open source and every software organization, including leading companies like Uber, Dropbox, Yelp, PayPal, and Cloudflare.
In this episode, Beyang shared with me his story on why he started building Sourcegraph, and the kind of problems that he’s trying to solve with it, learning from his experience working at Google and experiencing firsthand a great developer experience using Google’s internal tool called Code Search. He shared with me his perspective about developers’ productivity, and how we should go about measuring it, including some words of caution about measuring productivity using proxy metrics, such as lines of code, number of commits, or pull requests.
We then discussed the rationale for universal code search, and why he thinks there’s a massive need for such a developer tool to increase developers’ productivity, and especially to cope with the current era of “Big Code”, a new kind of challenge that every software team and company has to deal with, related to the ever-increasing size and complexity of the codebases that a typical organization has to maintain. Towards the end, Beyang shared how individuals can improve their personal developer productivity and what the future state of developer tools might look like. Do also listen out for some of the cool Sourcegraph use cases that Beyang shared with me based on feedback from his customers. Those use cases are pretty cool to me, and you may understand why universal code search might just be the tool that you need to increase your developers’ productivity.
I enjoyed my conversation with Beyang. And if you also enjoy listening to this episode, consider helping the show by leaving it a rating, review, or comment on your podcast app or social media channels. Those reviews and comments are one of the best ways to get this podcast to reach more listeners. And hopefully they can also benefit from the contents in this podcast. So let’s get this episode started right after a sponsor message.
Introduction [00:04:24]
Henry Suryawirawan: [00:04:24] Hey, everyone. Welcome to another new episode of the Tech Lead Journal. Today, I have a special guest with me. His name is Beyang Liu. He’s the co-founder and the CTO of Sourcegraph. So if you haven’t heard about Sourcegraph, it’s actually a tool for developers to do code searching, basically helping developers be more productive with their code. I’ll let Beyang share more about Sourcegraph later. So Beyang, welcome to the show.
Beyang Liu: [00:04:51] Hey Henry. Thanks for having me on.
Career Journey [00:04:53]
Henry Suryawirawan: [00:04:53] So Beyang, before we start, maybe let me hear your introduction. What’s your background? What are the highlights and turning points in your career for us to hear about?
Beyang Liu: [00:05:03] Yeah, definitely. So, I’ve been programming since maybe middle or high school. I think I got into it first on a TI-83 graphing calculator. There was a dialect of BASIC that you could code in. That kind of got me into it. It was really fun. I was always kind of a math and science nerd in my elementary and middle school days, and this kind of fit right in with that. After I got to college, I took CS101 and fell in love with it, and things unfurled from there, I guess. From there, I ended up doing this internship at Google during my college years. I mention that because that was where I first got exposed to this concept of code search, which is the thing that started me down this path that ultimately culminated in starting Sourcegraph with Quinn, and building developer tools. But that was cool, because it gave me a bunch of experience playing around with different developer tools that Google built internally. One of which was this Google Code Search thing. After I left Google and worked at other companies, I kind of realized that code search was not the standard thing, and so it always felt like this piece of the toolkit that was missing. That was the case at Palantir. I joined Palantir shortly after graduating college. I worked on the commercial side of the company. We were building big data analysis tools for large banks and financial institutions. And that’s where I got to work closely with Quinn, my co-founder, for the first time. We were on this kind of small startup-within-a-startup team, trying to tackle the data-related problems of these large banks. Quinn and I were both programmers by background. Both had this kind of side hobby of trying out different tools, different editors, different command-line utilities. And we were always talking about how much of our time was spent reading through existing code and making sense of that, and also this like gap between.
When we first got into programming, I think a lot of people get into it and fall in love with it, with that sense of being able to create something. You sit down. You type away at your desk, and then a couple of hours later, or maybe a couple of days later, however long the coding session is, you walk away with something that you built and created. That is almost like this living thing that does what you want. Contrast that with the experience of a typical professional software developer, which we were getting more and more acquainted with. And we were like, man, we spend very little of our time doing that and way too much of our time just trying to make sense of the existing code base. We also saw that pain reflected in the software engineers at the customers we were working with. That was the kernel or the initial conversation that ultimately grew into Sourcegraph. So we wanted to get back to that awesome flow state that every developer loves, the one that inspired us to get into programming, and make it more accessible to day-to-day developers by helping them understand their existing code. So we ended up starting Sourcegraph from there. At this point, probably the majority of my professional career has been working on Sourcegraph. We started the company back in mid-2013 and now it’s seven going on eight years later.
Developers Productivity [00:07:47]
Henry Suryawirawan: [00:07:47] Thanks for sharing your story. Definitely, it’s interesting because you bumped into this opportunity also by chance, seeing how Google does the internal tooling for developers. You mentioned developers’ productivity a lot of times. Maybe let’s start with the definition. What do you mean by developers’ productivity? Because these days, there are so many things about productivity, right? Everyone is into productivity, like your to-do lists, your calendar, whatever apps that you use. What do you mean exactly by developers’ productivity? Is it any different from any other kind of productivity out there?
Beyang Liu: [00:08:17] Yeah. That’s a great question. Developer productivity is a notoriously difficult thing to define. So I will try to come at it from maybe a couple of different angles and see if that can help hone in on roughly what it means. First off, I just want to say, developer productivity is not lines of code written. It’s not number of commits. It’s kind of irreducible to any sort of low level metric like that. Because if you think about it, at the end of the day what we do as developers is we’re really creating new technology. Every single feature or product that we work on is this new thing. It’s never existed before. Our job is to build a program that solves that new problem in a new way. It’s almost like a mini R&D project rather than a factory model where you’re just churning out widgets or whatever. Because of that property, you can’t really break it down into okay, this is just the sum of all the functions and classes and files that you author. It really has to do with the ultimate problem you’re solving and the users that you’re solving it for. So, that’s one angle. It’s not lines of code. It’s not like any sort of low level metric. Organizations that end up treating that as a proxy, even like a first order proxy, I think end up introducing perverse incentives into their development teams. That’s how you get a lot of code written, but not actually solving user problems.
The other kind of angle to look at this is to just view it in terms of the overall picture. In some sense, every piece of software is worth some amount of time saved or money because time is money in a certain sense. So if you look at the value that you’re creating, one thing that you can look at is how much is this piece of software being used by users? Who’s using it? And how much are they paying you? Or how much revenue are you generating from this piece of software? Because that gives you, I think, a better proxy for how much productivity you’re generating. Because at the end of the day, software is all about solving and automating problems for people, saving people time. Having the computer do something so that a human being doesn’t have to do this rote and repetitive process. And so if you want to measure productivity at a high level, you start with the end user. Think about time saved or money saved through the use of your software. Now, that’s the high-level principled way of looking at it. How does that actually help you if you’re an engineering manager or a programmer on a development team inside a large organization? The connection between your individual work or your team’s work and the overall revenue of the company, or the time saved across all the users of the product, is often a hard line to draw. And that’s why you get things like the product organization coming up with points, like sprint points if you use Scrum or agile or one of those processes. One of the responsibilities of the product organization is to roughly map user value, time saved, whatever matters to the company and to the customers to this point system that reduces all those complex factors and their interrelationships between different teams to this rough point allocation that’s easy for engineering teams to reason about. And so then you’re talking about how many points you’re doing in a given quarter or so.
I don’t know if that’s a satisfying answer, but that’s roughly how productivity is measured organizationally. But I think at an individual level, it’s almost like this feeling. It’s almost like this binary thing. As a developer, you either feel productive or you don’t. If you’ve done any amount of software engineering, you know that feeling. The difference between being in this flow state where your brain feels like it’s firing on all cylinders. You know you have all the contextual information you need, and you’re just coding away, and you’re like, “Holy crap, I’m getting a lot done.” Versus the other feeling, which is when you’re like, “Hmm, I don’t actually know how this piece of code that I’m contributing to works, so all the changes I make are too local, or I keep breaking things, or I keep having to ask this other person a bunch of questions, or the iteration cycle is too long for me to get feedback, to get into that kind of flow state.” So at an individual level, I think it often reduces to just which mode are you in? Are you in kind of paralysis mode or are you in getting stuff done, shipping code mode?
Measuring Developers Productivity [00:12:15]
Henry Suryawirawan: [00:12:15] I like the way that you explain that at the end of the day, it has to bring value either to your users or bring revenue to your company. But obviously, this is a very tough thing to define. Because how do you actually map back what you do as a developer? You know, like churning out code, maybe debugging, testing, and map that back into how you bring value to the users. And I think it’s taken for granted in many companies that they don’t actually do this kind of measurement. I know when you say for individuals, we know it when we feel it. There will be a day when you feel like, “Oh, you’re on fire. You’re getting a lot of things done. Your code just works fine.” But there are days where you are trudging through maybe processes, maybe, I don’t know, very difficult bugs to solve. Or even competing priorities between multiple tasks that you need to do. But in many companies, I’m sure that there’s no such thing as a good, well-defined developer productivity metric or measurement. So do you think that we should start measuring and investing our time in coming up with these developer productivity numbers, metrics, or even some kind of rough proxy?
Beyang Liu: [00:13:19] Yeah. That’s a great question. I think it is important to measure things and to quantify them. Because I think numbers keep people honest. Obviously, they’re never going to capture everything that you care about. But in order to have something falsifiable and checkable, it’s good to define a metric and then track your progress toward it. Because if you just define things qualitatively, it’s very easy to trick yourself and other people into thinking that, “Oh, we actually hit that.” Where maybe you didn’t actually deliver on what you were trying to deliver on. So metrics and quantifying things are important. But it’s really important what metric you choose and how you quantify that. Right now, I don’t know if there is a single universal metric of developer productivity that it makes sense to adopt across every software organization. Because I think whatever metrics you use are, in some sense, tied to your unique organization. And in some sense, tied to the problem that you’re solving, or the type of software that you’re building. So, the way I think about it is, say you’re a single developer working on a solo project that’s completely self-contained. Let’s say you’re an indie developer, you’re going to build a simple tool, then you’re going to sell that for money. Your metric, the one that matters, is really revenue. How much money are you generating? How much are users willing to pay for this piece of software? Because that tells you how useful this piece of software is to them in some sense. Or it’s at least a lower bound on how useful this thing is to them.
And you scale that up to a hundred-person org, a thousand-person org. At the top of your product engineering organization, there’s a person, maybe their title is VP, maybe the title is CTO, maybe it’s Director, who is responsible for delivery of the whole system. And is, in some sense, responsible for ensuring that system is delivering as much value as it can to its users. Whether it’s measured by revenue that product generates or time spent using the product. Whatever it is, they probably have one or two metrics that really are the KPIs, the key performance indicators of how well they’re doing their job. When they report to the CEO, when they report to the board, that’s how they’re evaluated, at least in most well-run software companies. It’s that person’s job to then go break it down for the people reporting to them: “Okay. What do I need them to do in order to achieve this goal?” In some cases, they might be able to break it down such that, okay, this part of the product is very independent from this other part of the product. So I’m going to have two reports of mine. You’re responsible for this one, and you’re responsible for that one. And I’m going to look at the engagement numbers of each separately, and I’m going to tie your compensation to those metrics. Cause that’s a really good proxy of the overall usefulness of each part of the system.
Now, that’s the lucky case, if you can take apart the things and it breaks down nicely without interdependencies. I think more commonly there are a lot of interdependencies. You can’t really evaluate any piece of an end-user software application in isolation without reasoning about its impact on other parts. And there are also cross-cutting concerns, such as technical debt, test coverage and things like that. You have to worry about the overall health of your team. And at the end of the day, it really is that person’s responsibility to figure out how this all breaks down.
Yeah, I unfortunately don’t have a satisfying answer here because I think every organization is a little different. It starts from the top. That person has a number that they’re trying to optimize that is a proxy for the overall end user value that they’re trying to deliver. And then part of the art and craft of building an organization is figuring out what the right breakdown of responsibility is. And what metrics are going to be used to evaluate how each constituent part of that organization is working.
The Danger of Proxy Metrics [00:16:51]
Henry Suryawirawan: [00:16:51] So, in the beginning you mentioned things like story points, right? These days, almost every company runs some kind of agile methodology, Scrum, or t-shirt sizes, or whatever they use. A lot of them actually start using that as the measurement of productivity, either for the product engineering team or even for individuals. And I know this is a double-edged sword. Sometimes it works. But in many cases, it actually doesn’t work because, you know, you can game the system. You create as many points as you want to, but actually it doesn’t mean anything. Or sometimes you just fake it. Okay, this story, let’s assume it’s done, let’s create a new card. So it’s an illusion that you’re making progress. But my question on this is, do you see this as a problem? Coming up with a proxy that actually doesn’t work. And if so, what would be, from your experience, some basic indicators that any team could start with? Ones that don’t mislead them about the actual things that they want to measure.
Beyang Liu: [00:17:45] Yeah. So, if you’re using any sort of points-based system, I think one of the things you want to be wary of is ensuring that you’re being honest with yourselves about point allocation. A lot of it depends on the specific people you have on the team. If you have someone who’s experienced in running Scrum or agile, who knows how to work the system, that can make the difference between this being a working system for your team and it being a not working system for your team. Another thing that a lot of organizations do is there’s this separation of product from engineering. So that product is responsible for allocating the points, and then engineering is responsible for executing the projects associated with those points. And because there’s that separation, you remove one of the incentives to game the system. Because in some sense, engineering is measuring themselves by velocity, but they don’t get to choose how many points are associated with each project. So that’s if you’re using a points-based system. That’s not for everyone. Full disclosure, at Sourcegraph we used a points-based system in the past. We currently don’t on any of the engineering teams. We might use one on select engineering teams in the future. Not ruling it out. It’s a perfectly good system, but it’s not the one that we’re currently using.
I think there are a lot of other proxy metrics that you can look at that give you a sense of whether your organization overall is in a healthy or not healthy state. I think one thing you can do is send out a survey. Just surveying your developers to see how satisfied they are and ask them how productive they feel relative to their ideal level of productivity, or the levels of productivity that they attained working at previous companies. I think a lot of developers have a good innate sense of how productive they feel. Cause again, it’s this binary thing where you either feel like, “Okay, I’m getting stuff done,” or “No, there’s a lot of barriers and paper cuts that are hurting me, preventing me from getting my job done.” You can also look at the ops and deployment side. There’s a standard set of metrics now, or I should say an emerging standard, related to how quickly you deploy and ship software. So those, I think they go by the name Accelerate, or sometimes DORA metrics. I think there’s four of them. They have to do with things like how frequently you deploy code, and how much time there is between submitting a commit into the upstream Git repository and when it shows up in production. Things like that. Those are good things to minimize. If they’re too high, then that’s an indicator that your org is likely not as productive as it could be. In fact, probably much less productive than it could be. So those more quantitative things are tied more to the operational side. In my view, the operational side tends to be a little bit more quantitative because ops is all about taking the software, the source code that you’ve written, and just getting it into production. And that process tends to be, these days, a more automated process. A lot of humans have been taken out of the loop there with things like CI/CD. You have these systems that are designed to take the changes to source code and then push them through some sort of QA pipeline until they’re automatically deployed.
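As a rough illustration of the two deployment metrics described above, here is a minimal Python sketch. The `deployments` records, the seven-day window, and both function names are hypothetical and chosen only for the example; a real team would pull these timestamps from its own Git history and deployment logs.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical records: (time the commit landed upstream, time it reached production).
deployments = [
    (datetime(2021, 3, 1, 9, 0), datetime(2021, 3, 1, 15, 0)),
    (datetime(2021, 3, 2, 10, 0), datetime(2021, 3, 3, 10, 0)),
    (datetime(2021, 3, 4, 8, 0), datetime(2021, 3, 4, 10, 0)),
]

def deployment_frequency(records, window_days):
    """Average number of production deployments per day over the window."""
    return len(records) / window_days

def median_lead_time(records):
    """Median time between a commit and its arrival in production."""
    return median(deployed - committed for committed, deployed in records)

print(deployment_frequency(deployments, window_days=7))
print(median_lead_time(deployments))
```

With these sample records the lead times are 6, 24, and 2 hours, so the median is 6 hours. Lead time is the number you would want to push down; deployment frequency is usually one you want to see go up.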
On the more product and application engineering side, it’s tougher to quantify. And that’s why you have these points systems, where that’s an attempt at quantification through product prioritization. But you know, unfortunately I don’t have a great answer. Other than talking to your developers and asking them, “Hey, are you hurting right now?” or “Do you feel like you’re being really productive?” There isn’t really a good universal set of metrics that you can look at on the application engineering and product side that gives you a good idea of how well your team is operating. I think that side of the picture is almost like emotional, intuitive, empathetic. And I think that’s why empathy is such an important quality in engineering managers. Because they kind of have their finger on the pulse of the development team. They kind of know what the potential of the people is on that team, and how close they are to realizing or actualizing that potential.
Henry Suryawirawan: [00:21:25] I like the way that you simplify all this to just talking to your developers. Sometimes a lot of managers, a lot of leaders, forget this aspect. Maybe it also relates back to the empathy thing that you mentioned. Because maybe they try to find the perfect system, like agile story points, technical debt, lines of code being tested, or whatever that is, without actually thinking back on the human side. The developers themselves might give a lot of insights into: are they productive? Or are they being hindered by processes? Or politics, teams, or even technical debt which has accumulated over a period of time. So I like the way that you relate back to developers. For leaders out there, I think it’s very easy. Just talk to your developers, ask them how they feel when they work on the code lately.
Beyang Liu: [00:22:08] Just real quick. I just want to clarify. You’re absolutely right. You should have your finger on the pulse of the development team and you should be talking to your developers daily. I’m not saying that you can’t come up with any sort of quantitative metrics on the engineering side. A lot of organizations do. Let’s say you identify tech debt as a big problem, and then you come up with a way to evaluate code for tech debt. Maybe there’s some simple parsing or linting mechanism that is a proxy for that. Or if there’s not enough tests in your code base. If that’s a clear need, then you can spin up a coverage tool like Coveralls or Codecov and track that metric, because that gives you a line-by-line view. But I think you can’t use those as top-line indicators of overall health. They are just KPIs that you track to address specific problems. And you always keep the human in the loop here. You’re always in touch with your development team to ensure that as the number goes down, people also feel better about it.
Henry Suryawirawan: [00:22:57] So, in the context of the technology industry these days, right. Obviously, there are the traditional enterprises, companies that have been around a long time. They might have a well-defined software development process. And there are also the startups, which, arguably, do not have any process in place and just figure it out along the way. So do you think there are these two sets of problems? Where from the legacy point of view, maybe you have traditional ways of doing software development, either waterfall or the software development life cycle that companies adopt. And also the other extreme, which are the startups, who are trying to come up with a good way of making their developers more productive, churning out more features for their products. Do you think there are these two extremes that you need to think about, especially for those who work at these kinds of companies?
Productivity in Enterprise vs Startup [00:23:40]
Beyang Liu: [00:23:40] Yeah. So between startups and enterprises, I guess the question is how different are they, actually? Do we need a completely different set of rules and best practices for how to do great software development inside large enterprises versus startups? My sense on that is it’s always yes and no, but I think fundamentally it’s the same, right? Like software developers, wherever you are, whether in a big company or small company, there’s a couple of things that you really care about that make you really productive. Fast feedback loops: are you able to build, test, and compile the code quickly? How long does it take between authoring a change and seeing it in production and getting it into the hands of users? All those loops, you care about making as quick as possible. You care about automating as much as you can. So if there’s any sort of QA that you need to do in order to ensure that your software meets a certain quality bar, ideally you want that automated without human beings in the loop. So that everything just works and you can focus on the creative parts of the job. And you want a really robust development environment where it’s easy to look up the context that you need to have in order to be productive.
One of the things that might be undervalued at both startups and large enterprises is the amount of the job of being a professional software engineer or developer that is reading and understanding code. Before you can actually write the code that builds the specific feature that you’re implementing, you have to understand the code base that you’re contributing to. You have to understand how your change fits into that. You have to understand what other shared libraries or packages exist in your code base, or perhaps out there in the open source world, that you can use and leverage, because you don’t want to reinvent the wheel. And then, when you go to submit the change, most organizations these days practice code review. So someone else on your team has to go and review the code and understand your change, understand how it impacts the rest of the code base. Ideally understand it as well as the person who authored the change, which is no small feat. In some ways it’s more difficult, but we can get into that later. That’s a problem that applies, I think, universally from the largest enterprise development organizations all the way down to the individual developer.
I think where it gets different between the two scenarios is just the constraints that you have to work with. So, if you’re working inside a large existing organization, especially one that has an established product with a ton of users, you’re going to be working in an environment with more constraints. Those constraints are sometimes bad, but often good, and they always exist for a reason. You have quality constraints to ensure that code that breaks functionality for end users doesn’t make it into production. You have the constraints of fitting your change into the overall code base. Following the stylistic conventions, meeting the necessary test coverage bars, passing architectural review from senior engineers and architects who are responsible for maintaining a certain level of consistency and quality across the code. These are all constraints that are necessary to make the team functional at a larger scale. All in the service of building a more complex product, that’s able to do more stuff for more users. And so dealing with those constraints, especially in the context of a larger code base, I think it becomes a bigger part of the problem-solving role of software engineering inside larger organizations. Because you’re not just solving for the needs of the users directly, but you’re solving also for these constraints imposed by the overall organization, in the service of the overall user set or customer set of the organization.
Rationale for Code Search [00:26:54]
Henry Suryawirawan: [00:26:55] Definitely makes sense. I think you brought up a wonderful point. For developers, most of the time, we don’t work with greenfield projects. We don’t come up with code from scratch. Even if we do, right, we normally utilize code from somewhere, like open source projects or Stack Overflow, and we just search and maybe copy-paste and adapt. I think it’s a good segue as well, to talk about code search, which is what you’re doing with Sourcegraph. So maybe you can share why you think companies should invest in building a tool like Sourcegraph or having a code search capability within a company? And how does it differ from a normal IDE, which can search code as well? So maybe you can share a little bit more about code search.
Beyang Liu: [00:27:36] Yeah. So, code search is a solution to a problem, and I’ll start with the problem. The problem is this collection of challenges and difficulties that people are starting to call big code. Big code is a buzzword. It’s like big data in a sense, I guess. But what it encapsulates is all the things that get much harder when you’re developing code at scale inside this world and universe with way more code than there used to be. And that applies both to code bases inside the organizations. Now you have more and more companies that are more and more software driven, and they have larger and larger code bases than ever before. And a lot of these companies, by the way, are not like what you would call Tech companies traditionally. There are companies in other sectors like agriculture, healthcare, or finance, or whatever. There’re some really old established companies that are now mainly software companies.
So that’s one side of big code. It’s the code inside your companies. These large proprietary code bases. And the other side is the exploding world of open source. Open source has been around for a while now. It’s decades old, but the past 10 years have seen this sharp uptick in the volume, in the number, and the diversity of open source libraries and packages that are available. A lot of this is driven by the proliferation of the web and web services. All of which is powered by these open source libraries. There’s no software organization today that isn’t heavily reliant on open source code, which is amazing in a way. But with all this progress also come challenges, right? So the challenges of working inside a large code base, it means that there’s more kind of context that you might be unaware of. There’s more to kind of read and understand before you can start writing your new feature with confidence.
There’s often this paradox of choice or embarrassment of riches issue that comes along with it. You know, you start writing something, you don’t want to reinvent the wheel and you’re like, “Oh yeah, there’s probably an open-source package that already does this thing that I’m building. Let me go find it.” And it turns out there’s not just one, there’s five. And then you have to go and figure out which one is the right one to use. If I choose incorrectly, it could cause me a lot of pain down the road, so this is an important decision. But how do I evaluate things? All these challenges are related to the fact that we’re operating in this world of unprecedented volume of code, and all of it might be relevant to the task you’re doing. And you want to take advantage of that wealth of knowledge. And at the same time, you have to also satisfy the constraints that volume of code might impose on you. So, this is going back to the proprietary side of big code. If you’re contributing to a large established code base, you need to ensure that you follow conventions, you need to ensure that the change that you commit doesn’t just work in isolation, but interacts well with other components. You need to understand essentially how it’s going to fit in.
So anyway, all these challenges add up, and they add up so much that at some point it reaches a breaking point. At some point, it reaches the point where the constraints that you have to work with and the context that you have to load up in your head start to be too much to fit into your IDE. You can’t just clone all the code that you need to know about to your local machine, and one by one set up their development environments and spin them up in your IDE. And at some point it can get so bad that inside some of these larger organizations, the customers that we’ve worked with, the developers literally say, “We don’t feel like we can get stuff done. We’re not shipping things anymore because the burden of the existing code base is so great.” And so that’s where code search comes into play. Because code search is this tool that 20 years ago, there wouldn’t be a need for inside most organizations because code bases were smaller back then. But these days, everyone’s working inside these large code bases where they want to know about the code that’s not on their local machine. They want to understand what’s out there. They want to see what libraries exist and how to use them. They need a tool to search over this giant corpus of data. Just like in the early days of the internet, you didn’t necessarily need a search engine because it was smaller and you could use something like Yahoo and that got you everywhere that you might want to go.
But then, as the volume of data and knowledge increased in that corpus of data, you wanted something that allowed you to instantly jump to what you’re trying to get to. And make accessible the kind of long tail of things available in that global knowledge graph. And so, with code, it’s things like, okay, how do I discover the libraries that are available either internal or external, that might help me get my current job or task at hand done. Once I find the available libraries, how do I use them? How do other people use them? The best way to learn a new library is often by example. So I want to see like how this particular function is called by other folks, my teammates, or maybe other open source contributors out there. Once I start using it, maybe there’s a bug that happens in production. Now I have to go and explore this unfamiliar code base. This code that I didn’t write, someone else wrote. Maybe I’ve never met that person before, but now I need to go and debug their code and figure out, okay, is this error message coming from within this library? And if it is, how can I fix it? And sometimes you’re doing it while production is down. So your company’s literally burning money while you’re trying to fix this problem. So all these things go back to like you’re operating in this giant code base, you never know when you’re going to have to understand a part of that code that’s unfamiliar to you, or that you’ve never worked on before. And in order to get there as quickly as possible with minimal friction, with minimal context switching, you need something like code search, that’s optimized to help you discover and understand those unfamiliar pieces of code.
Henry Suryawirawan: [00:32:53] So this is my first time actually hearing the term “big code”. I mean, we all know about big data, but big code, I think the way that you explain it makes sense. We see more and more code. Even if you just count GitHub repositories, the open source world, I think it just increases exponentially. It will keep growing as more and more people get into engineering, programming, and things like that. I think, yeah, the term big code certainly makes sense.
Code Search Case Studies [00:33:16]
Henry Suryawirawan: [00:33:16] Coming back to the use cases of code search, maybe you can share some of the case studies from some of your customers. How does something like code search actually help to improve their productivity? Is there any particular research that you have done that shows, “Oh, by using code search, we improved something by X”?
Beyang Liu: [00:33:36] Yeah. So there’s a couple different case studies that are relevant. I’ll start with kind of the most concrete one, which is when your site goes down, when there’s a production issue, you want to get the site back up as quickly as possible, and you want to solve the root problem as quickly as possible. But in either case you want that time to be as short as possible, ideally seconds, if not minutes. Definitely within the same day. When an issue arises, the engineer gets paged, often woken up in the middle of the night. You wake up, wipe the sleep from your eyes, and log onto your machine, and the first question is “what the heck went wrong?” You’re trying to root cause that issue. So you go through the logs. You see an error message in the logs, and now you have to go and actually make that change in source code. Or maybe you have to identify which commit caused that change and revert that.
A lot of our customers, when they start using Sourcegraph, it becomes their go-to tool for doing this. You just take whatever errors appear in the logs, you plop them into Sourcegraph, and 9 times out of 10, that gives you exactly what you’re looking for. You click on the result. You see exactly the line of code that’s throwing the error. And then from there, you can quickly figure out what’s going on and write the fix and commit it. Get production back online. So we’ve heard from customers who’ve honed in on that use case. We’ve turned some incidents that would have been hour-long downtime incidents into things that were solved in a matter of minutes. Which is huge, I think, especially in this age when a lot of applications are web services. So if there’s a production error, it means your site is literally down, and it means you can’t make any money off the site. And your users are also yelling at you because this thing that they’ve probably started to depend on, rely on, is currently unavailable. So that’s a very acute thing that we’ve been able to solve for.
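The log-to-source lookup described above can be sketched in a few lines of Python. This is only a naive linear scan under stated assumptions, not how Sourcegraph works internally; the function names, the file extensions, and the `example` code base are all hypothetical.

```python
from pathlib import Path

def search_sources(sources, message):
    """sources maps a file path to its text; return (path, line number, line) hits."""
    hits = []
    for path, text in sorted(sources.items()):
        for number, line in enumerate(text.splitlines(), start=1):
            if message in line:
                hits.append((path, number, line.strip()))
    return hits

def load_sources(root, suffixes=(".py", ".go", ".js")):
    """Read every file under root whose extension is in suffixes (an assumed list)."""
    return {
        str(path): path.read_text(encoding="utf-8", errors="ignore")
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix in suffixes
    }

# Hypothetical in-memory code base standing in for many repositories.
example = {
    "billing/quota.py": 'def check(t):\n    raise ValueError("quota exceeded for tenant")\n',
    "web/views.py": "def index():\n    return 'ok'\n",
}

# Paste the message seen in the logs and get back the line that raises it.
print(search_sources(example, "quota exceeded for tenant"))
```

A scan like this only covers code already checked out on one machine and gets slow as the corpus grows, which is the gap a dedicated code search tool is meant to fill.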
At the opposite end of the spectrum, there is what impact have we had on just the day-to-day developer productivity? That, as I said before, is a notoriously difficult thing to measure. So one of the challenges that we face is, when we go sell to companies, they’re like, “Okay, you know, we’d like to measure the impact that you have on our organization.” And then we say, “Okay, great. How do you currently measure the productivity of your developers?” Some people, they use sprint points, in which case, we’ll say, after adopting Sourcegraph, you’ll see an increase in overall velocity measured in points. But if they don’t use one of those frameworks and they don’t have a good way of measuring developer productivity, then it sort of becomes our responsibility to help them determine a metric that is a reasonably good proxy for the productivity of their developers. And sometimes, it literally just reduces to developer satisfaction. So we’ve had cases like that. Over the past year, with COVID and the pandemic, we actually had zero customers churn. We went and asked our contacts at a lot of customers and said, “You know, it’s interesting. We knew that your business was really hard. We expected you to be a potential churn risk.” They said, “Well, when COVID first hit, we did a survey of all the third-party vendors and tools that we use to see which ones we could possibly cut.” Because remember, February, March last year, the sky was falling and everyone was like, okay, if we have to cut things, what can we cut? When they got to Sourcegraph and asked, “Hey development team, how would you feel if we cut Sourcegraph?”, the response was just overwhelming: “This is a need-to-have tool. If you remove this tool, I literally won’t be able to get my job done.” The response was that intense. For that reason, it saved our necks last year. So that’s the opposite end of the spectrum.
And then there’s this third angle which touches upon one of the other kinds of metrics that I mentioned before, which is these kind of ad-hoc or organization-specific metrics. Say a particular organization has a goal in mind. Let’s say it’s to increase test coverage by X percent, or maybe we want to deprecate this old hacky API that we’ve been trying to get rid of for the past couple of years, but somehow, it’s impossible to stamp out. We’ve recently built into our app the ability to track usages of particular functions and things like that. And also the ability to track this numerically on a chart. So if you have one of these metrics that you can define in terms of a small piece of code that takes your code base as input and spits out a number for that metric, we can actually help you track that number as it goes down. Another case study is a particular customer that has this use case of deprecating an API. And so we actually have a burn-down chart: here are all the outstanding usages of this API. Now your goal as an organization is to make this go to zero by the end of the year, and here’s how you’re doing right now.
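The idea of “a small piece of code that takes your code base as input and spits out a number” can be sketched as follows. This is an illustration only, not Sourcegraph’s feature: `legacy_fetch` is a made-up deprecated function, the `snapshots` are hypothetical, and a regex count is a crude stand-in for real usage tracking (it would also count the function’s own definition line).

```python
import re

def count_usages(sources, function_name):
    """Count textual call sites of function_name across a {path: text} code base."""
    pattern = re.compile(r"\b" + re.escape(function_name) + r"\s*\(")
    return sum(len(pattern.findall(text)) for text in sources.values())

# Hypothetical snapshots of the same code base at three points in time.
snapshots = {
    "2021-01": {"a.py": "legacy_fetch(1)\nlegacy_fetch(2)\n", "b.py": "x = legacy_fetch(3)\n"},
    "2021-02": {"a.py": "fetch(1)\nlegacy_fetch(2)\n", "b.py": "x = legacy_fetch(3)\n"},
    "2021-03": {"a.py": "fetch(1)\nfetch(2)\n", "b.py": "x = fetch(3)\n"},
}

# One number per snapshot: the data behind a burn-down chart.
burn_down = {date: count_usages(files, "legacy_fetch") for date, files in snapshots.items()}
print(burn_down)
```

Here the count falls from 3 to 2 to 0 as call sites migrate to the replacement, which is the shape of the burn-down chart described above.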
Henry Suryawirawan: [00:37:54] Wow. I love all those use cases that you mentioned. Certainly I can relate, and I really like the true validation of your tool. When things are in crisis, like in the COVID situation, and most developers still want to keep Sourcegraph in place, I think that’s a true validation of how your tool, or code search in general, helps developers be productive. Some of the companies or teams might find these use cases fancy. But I think if you ask developers, these are definitely very important, but there was just no tool around to solve this problem. So thank you for solving this problem, definitely.
Other Useful Developer Tools [00:38:32]
Henry Suryawirawan: [00:38:32] A few years ago, you wrote this blog post about the current state of developer tools. And I think code search with Sourcegraph is one thing. What are the other tools that you think are worth trying out and experimenting with, to see whether they can bring the same impact as Sourcegraph to a team?
Beyang Liu: [00:38:47] Yeah, totally. I think this is a really exciting time to be working on developer tools, and I guess to be a software developer in general. Because what we’ve seen in the past five years or so, but probably even the past, like two or three years, there’s just been this explosion in the number of available open source developer tools, and developer tool companies. It used to be very much the case, I think, that really good developer tools were associated with a single proprietary ecosystem. I think Windows in the nineties they had Visual Studio and .NET and C#. I guess to some extent you could argue like the Java ecosystem is fostered by Sun and then Oracle. And that was the case for a long time. But I think in recent years, perhaps driven by the proliferation of cloud and web services, you’ve just seen this explosion of tools that are not tied to any particular vendor ecosystem. They’re just available for anyone to use, regardless of whether you’re using this particular technology stack. That’s been a huge boon because it’s freed up a lot of room for innovation. But it’s also been a bit of a challenge because now there’s like a bunch of different choices to choose from. It’s no longer just choose your vertical, choose which stack you’re on, and then adopt all those standard tools in that ecosystem.
Sorry, I’ve digressed from the original question here, which is what tools would I recommend that are really useful? Yeah. So I think there are a lot of great tools. I think that there is a lot of great work being done on the DevOps or Ops side of things right now. So here are a couple of tools that we use at Sourcegraph. On the monitoring side, we use this tool called Sentry. It’s like application error monitoring. Think of it as like a stacktrace explorer, but on steroids for your production systems. That’s great. We use this tool called Honeycomb, which is for observability. Observability is this new way of thinking about how you keep an eye on the state of your production systems. It allows us to instrument our application and then go and explore the data set of production events in this kind of open-ended fashion, which is very important for debugging anomalous events. Hopefully, you’re fixing issues as they pop up in production. But that also means that every new error in production is one that you haven’t seen before. So in order to be able to diagnose these effectively, you kind of want this more open-ended exploration tool, which is what Honeycomb is great for.
On the deployment side, we use a tool called Pulumi and we also use Terraform from HashiCorp. The infrastructure management and deployment tools have gotten a lot better since 2013. The tool that we were using back then, I think, was like AWS CloudFormation, and Docker didn’t even exist back then. Nowadays, you’ve got tools like Terraform and Pulumi that make it much easier to manage all the configuration state of what you want deployed in a kind of declarative fashion. So it’s easy to reason about, and then the system takes care of spinning up the proper infrastructure. There’s obviously Docker and Kubernetes, which have made deploying software a lot easier, I think especially for multi-service applications, and Sourcegraph is a multi-service application. It’s also been critical for making our software available in a self-hosted way. The conventional wisdom these days is the world is moving to cloud. But from our experience, many, if not most, large enterprises still prefer a solution that they can deploy into their own environment. Especially a product that indexes their code, which is very sensitive data. And so without Docker and Kubernetes, it’d be much harder to offer a self-hosted product because you wouldn’t have as much control over the application deployment environment. And a lot of this is going to be organization specific, right? So what works for us might not necessarily be the right thing for you. But those are the tools we use.
Ex-Googler’s Guide to Developer Tools [00:42:20]
Henry Suryawirawan: [00:42:20] Right. And I can also relate that to another article that you wrote, which is “An ex-Googler’s guide to developer tools”. During your internship, you were exposed to a lot of internal tools that maybe Google built, or maybe they adopted from open source or wherever they found them. So why do you think it was important for you to write it? And what’s your mission in educating people about what tools exist in Google and how they map back to the rest of the world?
Beyang Liu: [00:42:45] Yeah. So I have my own experience as an ex-Googler. But to be fully upfront, I did an internship at Google, so I was there for all of three months. It’s been a while since I’ve been at Google and I had that experience. But what we started to realize as usage of Sourcegraph grew was that we were landing in those customers because an ex-Googler would bring us in. Because the pattern that we saw over and over again was someone would leave Google, and they would miss Google Code Search, which is Google’s internal code search utility. It indexes all of Google’s code and makes it accessible to every developer. They would look around and they would find Sourcegraph, and they would say, “This is awesome. I’m going to introduce this to my new company.” Google as a development organization is one of the most advanced and sophisticated in the world. They really pioneered a lot of tools and technologies that have since either made it into open source or inspired similar open source counterparts. But because all these kinds of technologies were first built at Google, and there are Google-specific tools inside Google, Google has become something of its own evolutionary branch. The analogy I like to use is it’s kinda like Australia. You go to Australia and you look at the animals there, they look similar to animals elsewhere in the world, but they’re all different. They’ve got kangaroos and stuff. Google is like that, where they branched off from the rest of the world a while back, and they’ve been on their own evolutionary path. A lot of the tools that they use internally are similar to ones in open source, but they’re not the same ones.
And so one of the challenges that we saw, in speaking with all these ex-Googlers who became Sourcegraph users and customers, was like, “Hey, you know, I’m having some difficulty mapping from Google internal technology to what’s available outside.” And that’s a pain point because I think the developer experience inside Google is so good that one of the first things that developers do when they leave Google is they try to recreate pieces of that developer experience. Because, like I felt, the absence of code search was super painful. And so we wrote that blog post based on conversations with a bunch of such ex-Googlers and tried to put together a kind of quick explainer for how you map from Google internal tools to the ones available outside. To be upfront, this was not the first post that touched on this topic. I think there’s a GitHub repo actually, which I think we linked to in the post, that just gives a comprehensive listing of all the Google internal tools and their external counterparts. And our goal in the post was just to present some of that information in more of a narrative fashion. And talk about not only what tools you might want to look at, but like, how do you bring those tools into your organization, as an ex-Googler entering the outside world.
Henry Suryawirawan: [00:45:18] So for those of you who would like to learn more about all these mappings, what tools exist in Google and how they map to others in the world, I’ll put that in the show notes. Also, having seen inside Google myself, I can validate that these tools really bring a lot of productivity, and they are well-integrated end-to-end. From the beginning of your software development life cycle to the end, up to your code being deployed, even maybe the logs and monitoring and observability and all that. So sometimes the integration part is probably one of the most crucial pieces of all these developer tools. Because without that, you’re still scrambling in silos. When you want to debug something, you go to this tool, but when you find another problem, you go to the other tools. So I think it will be great if one day in the future, hopefully soon, we start seeing all this interoperability between different developer tools. And it will be a great experience for developers to have that as well.
Improving Personal Developer Productivity [00:46:07]
Henry Suryawirawan: [00:46:07] So coming back to developer productivity, right? So we talked a lot about it from the product engineering point of view, company, team and all that. But what if I, as an individual, as a programmer, would also like to improve my productivity? What should I do? Maybe you have some perspective here?
Beyang Liu: [00:46:24] Yeah. So on an individual basis, I think you always want to start with the pain that you’re trying to solve. When you’re coding every day, make note of what you find difficult. What annoys you? Where are you kind of like context switching away from code? Like anything that takes you out of that flow state, where the state of flow is this kind of mental state where you just feel like your thought process is extremely fluid and you’re just cranking out code. Anything that takes you out of that, or anything that prevents you from getting into that, that’s something that you should address. And you should address it in the way that programmers address every problem, which is figure out how to automate it. Chances are if it’s causing you pain, it’s because it’s something rote or repetitive or manual, or potentially unnecessary. Something that is boring to your brain, but you have to do. And so when you go to automate it, there’s obviously two ways of doing that. You can either build something that automates it for you, or you could find a tool that already exists, that solves the issue. Do a quick Google search and see if anything exists. So that’s like a very needs-driven way of addressing pain points.
The other angle I’d recommend is to just find developers on the internet who write blog posts, or maybe they tweet about their workflow or the tools that they use, and use the tools that they use. Cause I think one of the ways you get good at any sort of craft is you learn from the masters of that craft. And really, it’s not even about learning from masters of the craft per se. Cause I think software development is one of those things where there’s so many tools that chances are if you talk to anyone who’s been doing this for at least some period of time, they’ll probably have at least one tool that they are aware of, that would be really useful to you, that would be really cool to try out. In fact, we actually started doing this thing during quarantine on our development team, where a bunch of engineers on the Sourcegraph team would join like a live stream and we’d have one person just screen-share their dev environment setup and walk us through like, how do you make a code commit? Show us your editor. What command line utilities do you use on a daily basis? And that was awesome. On the first one that we did, I walked away with three new tools that I had to try out that weekend. Cause it was like, “Wow, you can actually do that? I had no idea.” I don’t think it’s announced yet, but we’re going to turn that into a web series. Cause we think that sort of conversation is so interesting and useful. Especially in the COVID world or as the world becomes more remote, I think one of the things that we lose out on is the ability to look over our teammates’ shoulders, walk over to the desk and be like, “Oh, what is that on your screen?” That sort of social connection, I think, is just a great way to discover new tools and technologies.
Henry Suryawirawan: [00:48:58] I totally agree with you on this. So from my experience, the way I normally discover other tools is when I pair with other developers. So during the time when we could sit together side by side. And also, yeah, like what you said, on Twitter people sometimes post random things that look cool, or maybe it’s just a new open source tool that someone just wrote and published. You think that it will help your productivity. So I think, yeah, I totally agree. And I think the number of permutations of things that we can apply in our developer workflow is just tremendous. There will be a lot of things that you can try in order to improve your productivity.
Future State of Developer Tools [00:49:32]
Henry Suryawirawan: [00:49:32] So a little bit here, I know that you mentioned this is an exciting time for developer productivity tools, right? What are some of your predictions? Looking, maybe, I don’t know, five years forward, what will be the state of these developer productivity tools?
Beyang Liu: [00:49:47] Yeah, so I think right now we’re in this explosion of different tools, in the kind of diversity of offerings, in the parts of the software development life cycle that they tackle, and it’s amazing. I think that will continue over the next five years. I think this is a trend that’s already playing out, which is more and more of these tools are either open source or open core, including the ones that enterprises pay for. And so there’s a lot of companies that are pioneering business models built on top of open source technologies, which I think is really cool. Because developers care a lot about the openness of the software they build on top of, and also the software they use. I think that kind of gets at the importance of being able to introspect into your tools. If something goes wrong, you can poke in and see what went wrong. There’s also this almost like philosophical feeling that, “Hey, if I’m building a substantial portion of my workflow on top of this tool, then I deserve a certain level of freedom in terms of the ability to continue using that tool.” To some extent, I think people are still burned by proprietary lock-in, which was a common strategy, I think, for software vendors in the 1990s. But increasingly, the consensus is that open is the way to go. And so I think you’ll see more and more companies that are building tools, where the tool that’s relevant for the individual developer, that part is going to be open source. And then there’s like an enterprise component that is more important for teams or organizations that might be kept proprietary in order to be able to build a sustainable business on top of the technology. So that’s one trend that I think will continue.
I do think that with the explosion in diversity of tools that we’re currently looking at, there will inevitably be the pendulum swinging backwards a little bit into more consolidation. Especially around particular ecosystems and targeting specific aspects of the software development life cycle. You’ll see essentially like winners and losers to a certain extent. And the winners will probably buy up some of the losers, and there’ll be some amount of consolidation and standardization. But I don’t think it will ever go all the way back because I think we’re really in this new world right now. I think in the old world, the model was you had these proprietary vendors that built up these vertically integrated ecosystems that they controlled. They controlled the marketplace. They controlled what partners were able to get in front of customers, and they controlled the technology stack. Microsoft built out the tech stack for Windows and made it a really lucrative environment. And I don’t think that model is necessarily wrong. It obviously yielded a lot of value over the years. But I think now we’ve gotten to the point where there is no single entity that can capture all the creativity and innovation that can happen with software. And you see that with the web. Why is the web the platform that appears to have won out over all the existing kind of incumbent desktop operating systems? And I think that some of it is tied to the openness of the web. The fact that it’s not under the control of any single entity, and that permits this kind of flourishing and diversity of ideas that can spring up and find users and use cases that would have been unthinkable even a couple of years before. And so I think that’s going to very much continue.
You’re going to see the continual flourishing of this like third-party independent developer tools ecosystem, that’s going to resist complete consolidation. Into that kind of world, what’s going to become more important are these kind of like meta tools or aggregators that unify multiple tools into a single coherent experience. And at some point I think, as software continues to eat the world, as software development becomes ubiquitous, at some point, the dev tools market will just become like the productivity tools market. Because code will become so synonymous with knowledge work that the vast majority of people in the community will be building software in some shape or form. And that’s going to be like this huge, vibrant, diverse ecosystem. The last thing I’ll say there is that in that explosion of different offerings, anytime you have a large dataset or a market that explodes like that, I think search is going to be an incredible piece of functionality. Because there’s just going to be like so many things to consider, and a good search engine I think will be necessary to make sense of it all. And I hope Sourcegraph can continue to play a good role there.
Henry Suryawirawan: [00:53:51] Definitely very exciting. You’re right. I mean, software is eating the world. So more and more people get into programming and writing code, and more and more open source projects are being published. It’s going to grow tremendously over time, and I’m sure there will be plenty of developer tools as well for us to try out. And don’t forget as well that content about code these days is not always in text, right? There is a lot of audio, something like this, or even videos on YouTube. I think those are probably some of the opportunities that haven’t been tapped. Like people who are teaching tutorials in videos. Maybe sometimes you just want to search for which part of the video actually teaches a particular thing. Maybe that’s going to be cool.
3 Tech Lead Wisdom [00:54:28]
Henry Suryawirawan: [00:54:28] So, thanks so much for sharing all these stories. I think due to time, we have to cut it out. But before I let you leave, normally I would ask this one question for all my guests to share with my audience here, which is the three technical leadership wisdom. So Beyang, do you have any kind of wisdom that you want to share with all of us here?
Beyang Liu: [00:54:45] Yeah. So the one piece of advice that I got early in my career as a Tech Lead or an engineering manager that really stuck with me is when it comes to motivating the people on your team, and unlocking their potential, you always want to start with the why aspect of their jobs. So rather than telling people do this, do that, you always want to give them a kind of high-level goal that they can use their creativity to find a solution to. For a junior engineer, that high-level goal might be something that is relatively low level. Go in, make this button work really well for our users. But there should always be that component of here is what the end goal is, and I do not care how you do it. That’s up to you to figure out. That’s up to you to exercise your human brain and creative inspiration to figure out how best to solve that problem. Because I think that creative drive is what motivates every developer. It’s like why we got into programming, right? That act of creation. That act of creative problem solving.
I think oftentimes, especially as the organization grows, and you’re trying to solve these bigger problems, and you’re solving them by breaking them down into smaller bits, it can be very easy to just become more prescriptive or more imperative when it comes to telling people what they need to do. But what I found works for both myself and the people I work with is always keeping that goal in a way that the person actually satisfying that goal is able to work creatively. That’s good advice, I would say, for both like Tech Leads, when you’re talking to others and explaining a problem that you want them to solve, and for the person who is receiving that. If your manager is explaining this problem to you, ask the questions about, what is the end desired state here? Rather than just taking down the orders. Because I think that will ultimately help you do your job better, and lead to a happier manager because you’ve done a better job. You’ve found a more creative solution to the problem that you were asked to solve.
I think another piece of advice is if you find yourself doing anything twice that you don’t like doing, find a way to automate it, or look for a tool that automates it. That is your kind of job as a programmer, is to automate things. And if you’re not automating your own life, then you’re not living your own values.
And then lastly, just remember to have fun. It’s amazing that we get to work in a job like this. Because coding is so fun. You get to create these like applications that are used by dozens or hundreds or thousands, or maybe sometimes millions of users. And you get to do all that just by sitting at your desk and typing things into a computer. I think that there’s a lot of complexity to deal with. In professional software engineering, there’s a lot of people issues. There’s a lot of understanding existing context. There’s a lot of slog to the job. But I think at the end of the day, the best way to make yourself do a good job, and also not get overwhelmed by the complexity you have to deal with, is to remember what got you into programming in the first place. Focus on that and try to experience that as many times per day as possible. Because I think if you do that, it will drive you to focus on more of the creative aspects of the job. It will also motivate you to automate the more mundane or rote aspects of the job. And it’ll just make you a happier person because it’ll feel like you’re doing something that truly exercises the human creative aspects of your brain.
Henry Suryawirawan: [00:57:53] Thanks for sharing the wisdom. I like the last one. Definitely, we all got into programming because of something. Something got, you know, sparked when we tried it, writing some code and something magically happened. I think it’s also synonymous with the concept of developer productivity, right? When you are productive, basically you can churn out code that actually works, and that maybe impacts many other people’s lives. But I think we have to be conscious over time. If, let’s say, we don’t enjoy what we are doing, the programming, maybe the productivity goes down and we are not having fun. So you always have to look back and think about what got you into programming in the first place.
So thanks again for your time, Beyang. For people who want to find you online or learn more about Sourcegraph, do you have a place where they can go to?
Beyang Liu: [00:58:35] Yeah. For Sourcegraph, just go to sourcegraph.com. You should be able to search over all the open source code from that URL, or you can download and run your own Sourcegraph instance. And for me, probably the best place to reach me is just on Twitter. I’m @beyang, B E Y A N G. Feel free to reach out or DM me or tweet at me or whatever you like.
Henry Suryawirawan: [00:58:55] Cool. So thanks again, Beyang. I hope one day I will be able to use your tools in my development workflow. And it will be great, I think, to have all these superpowers for developers.
Beyang Liu: [00:59:05] Awesome. Thank you, Henry. And thank you so much for having me. This was great.
– End –