#205 - Measuring Code Quality in the Age of GenAI - Matt Van Itallie

 

   

“GenAI can be such a contributor to developer productivity that not doing it leaves value on the table. Organizations go slower.”

Brought to you by Lemon.io
Lemon.io is your go-to platform for hiring top-tier, pre-vetted software engineers from Europe and Latin America. You'll get matched with your developer in just 48 hours.

Tech Lead Journal listeners get 15% off your first 4 weeks of work at lemon.io.

Ever wonder what a credit score for your codebase would look like?

In this episode, Matt Van Itallie, founder of Sema, discusses how his company is revolutionizing the way we assess and understand code quality, including how much of a codebase originates from GenAI. Learn about Sema’s innovative approach to technical due diligence and their comprehensive credit score system.

Key topics discussed:

  • The concept of “code as data” and how it’s changing codebase evaluation
  • Measuring and improving developer productivity in the age of GenAI
  • The importance of the Generative AI Bill of Materials (GBOM) in technical due diligence
  • Why having 15-30% of your code originating from GenAI could be optimal
  • The seven modules Sema uses to evaluate codebases, including GenAI usage, security, and team retention
  • The CTO dashboard concept and its potential to transform software engineering metrics
  • Why treating code as a craft is crucial for effective communication with non-technical stakeholders

Whether you’re a developer, engineering leader, or investor, this episode offers invaluable insights into the evolving landscape of software evaluation and the growing importance of quantitative metrics in technical due diligence.

Listen out for:

  • (02:05) Career Turning Points
  • (05:15) Treating Code as Data and a Craft
  • (11:09) Comprehensive Codebase Scans
  • (14:31) How to Explain Codebase Health
  • (20:31) Measuring & Improving Developer Productivity
  • (23:33) GenAI for Increasing Developer Effectiveness
  • (25:59) CTO Dashboard & The 7 Metrics
  • (29:55) Measuring GenAI Usage
  • (31:51) Healthy Dose of GenAI Usage
  • (36:50) Generative AI Bill of Materials (GBOM)™
  • (39:24) Technical Due Diligence
  • (45:18) Sema Adoption
  • (49:48) Integrating with Sema
  • (52:17) 3 Tech Lead Wisdom

_____

Matt Van Itallie’s Bio
Matt Van Itallie is the Founder and CEO of Sema.

Formerly, Matt was an Operating Executive at Vista Equity Partners portfolio companies, Chief Analytics Officer for a $1BN operating organization, and a McKinsey consultant. Matt has a JD from Harvard Law School.

Sema is the leading provider of comprehensive codebase scans that assess the risks of software and tech-enabled businesses. They have analyzed over $1.6T of software enterprise value.

Sema is a leading expert on managing GenAI risks, with presentations to leading investors and operators across sectors. They are the inventors of the Generative AI Bill of Materials (GBOM).

Follow Matt:

Mentions & Links:

 

Our Sponsor - JetBrains
Enjoy an exceptional developer experience with JetBrains. Whatever programming language and technology you use, JetBrains IDEs provide the tools you need to go beyond simple code editing and excel as a developer.

Check out FREE coding software options and special offers on jetbrains.com/store/#discounts.
Make it happen. With code.
Our Sponsor - Manning
Manning Publications is a premier publisher of technical books on computer and software development topics for both experienced developers and new learners alike. Manning prides itself on being independently owned and operated, and for paving the way for innovative initiatives, such as early access book content and protection-free PDF formats that are now industry standard.

Get a 45% discount for Tech Lead Journal listeners by using the code techlead24 for all products in all formats.
Our Sponsor - Tech Lead Journal Shop
Are you looking for some cool new swag?

Tech Lead Journal now offers swag that you can purchase online. Each item is printed on demand based on your preference and delivered safely to you anywhere in the world where shipping is available.

Check out all the cool swag available by visiting techleadjournal.dev/shop. And don't forget to show it off once it arrives.

 

Like this episode?
Follow @techleadjournal on LinkedIn, Twitter, Instagram.
Buy me a coffee or become a patron.

 

Quotes

Career Turning Points

  • Sema is about treating code as data, code as math.

  • I decided to found Sema for a pain point I personally was experiencing: as someone who is data-focused and tech-savvy, I nonetheless struggled to understand what was coming out of the engineering team leadership.

  • We founded Sema to bridge the gap: to make technology understandable to non-technical audiences, like the C-suite and the Board of Directors, and to make it easier for non-technical folks to make their preferences known back to the engineering team.

Treating Code as Data and a Craft

  • Sema has two products today. One of them is like a credit score for a software engineering organization: it analyzes different sources of information about the codebase and comes up with a one-to-a-hundred score. That can be controversial, and it’s hard. It’s easy to do it wrong and hard to do it right. We spent many years trying to get those numbers right.

  • We look at the number of in-file security warnings, which are warnings that would be detected by a SAST or a DAST tool, and we count the critical warnings. Because most businesses and organizations don’t have time to fix all security issues, they focus on the most critical ones. Let’s say the number is 250 high risk security warnings.

  • Whether 250 high risk security warnings is too high, too low, or just right depends on the context of that organization. When we look at that number, 250, we compare it to similar size and stage codebases. What you’d expect to see on security and technical debt and all these other factors varies depending on the size and stage of the business. If the company is young with a small codebase (a segment defined as fewer than two years old and fewer than 25 developers), having 250 high risk security warnings would put them in the highest risk.

  • We put it in quartiles, and without looking at the data, I can tell you that if you’re in young and small code base with 250 high risk security warnings, you’re definitely in the fourth or highest risk quartile. By contrast, if you’re in the segment called giant, and giant is a thousand or more all time developers, not necessarily current, but all time developers, 250 high risk security warnings is the best quartile.

  • Credit score report is frequently used in technical due diligence, where someone is thinking of buying a software company or a team, a company that has software and that’s a high stakes moment. We want to be as explainable as possible given how stressful it is. Being able to explain it mechanically and give the context. It’s not right or wrong. It’s the decisions the team has made, the business that has made all along.

  • Business executives think of the sales team or sales numbers. In sales, more sales is better. Like sports, more points is better. In code, are more lines of code better? Maybe? Some of the best coding work people have done is removing code. Or instead of writing something, finding the right open source package.

  • Sema’s job is to produce the data, but also to facilitate a conversation. We saw this trend. What is the reason behind it? It’s a question not just for the engineering team. It’s a question for sales and product and finance and the CEO.

  • I’ve never met a team that didn’t want to fix security risks or clean up technical debt, but there’s not always money for that. There’s not always time given the business needs. That conversation, coding is a craft, but we have to fit that into these business contexts where there’s optimization functions. That conversation matters so much more.
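The segment-based benchmarking Matt describes can be sketched as a small lookup. This is a minimal illustration only: the segment names (“young and small”, “giant”) come from the conversation, but the quartile boundaries below are invented for the example and are not Sema’s actual thresholds.

```python
# Toy sketch of segment-based risk benchmarking, as described in the episode.
# The quartile boundaries are invented purely for illustration.

# Hypothetical upper bounds for quartiles 1..3 of "high-risk security
# warnings"; anything above the last bound falls in quartile 4.
QUARTILE_BOUNDS = {
    "young and small": [10, 40, 120],   # < 2 years old, < 25 developers
    "giant": [400, 1200, 4000],         # 1000+ all-time developers
}

def risk_quartile(segment: str, warning_count: int) -> int:
    """Return 1 (best) .. 4 (highest risk) relative to similar codebases."""
    for quartile, upper in enumerate(QUARTILE_BOUNDS[segment], start=1):
        if warning_count <= upper:
            return quartile
    return 4

# 250 warnings lands in the highest-risk quartile for a young codebase,
# but the best quartile for a giant one -- matching the episode's example.
print(risk_quartile("young and small", 250))  # 4
print(risk_quartile("giant", 250))            # 1
```

The point of the sketch is that the same raw number gets an entirely different interpretation once it is placed in the context of similar-stage codebases.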

Comprehensive Codebase Scans

  • We have two products. The first one, called Comprehensive Codebase Scans, which is the credit score report, is a point-in-time report using static tools like the SAST tool you described. It’s not dynamic; we don’t build the code. A static SAST tool, a CVE scanner, a linter, a cyclomatic complexity scanner: there are about 50 of those tools running all at once. The code scans product is like bringing your car to a different mechanic to understand how much maintenance there was.

  • We’re brought in what we call moments of evaluation. Started with technical due diligence, helping a buyer decide if a company is safe enough, if the code base is healthy enough to buy. We also help sell side prep, which means a company wants to get bought, they want to know what score they’re going to get before they go into the test.

  • The other product is a roll up of existing SAST tools to tell that story in an ongoing way.

  • Neither one requires engineers to change tools. You should let engineers pick their own tools. Back to code is a craft. Even when it’s important for executives to understand what’s going on. You still want to give power in decision making and authority to engineers and engineering teams.

How to Explain Codebase Health

  • It’s important for everyone in their careers to start developing a perspective on what they want to do next, in five years, and in ten years. It doesn’t matter if it changes. It will change, because as you develop in your career, you learn things. You’re testing hypotheses by having different experiences either within the same job or in different jobs.

  • If you are an engineer or an engineering lead, and you have an interest in becoming a manager, a VP of engineering, CTO, or CEO, you have to understand the business impact or the organizational impact of the code, more than the specifics of the code. You have to understand how the business optimization function works.

  • That is about codebase health, which gets summarized, because CEOs have to get the summary view of sales, and the summary view of marketing. They will get the summary view of code one way or the other. You are better served by being able to explain that in a way that the CEO will understand so she can support you in the ways that you need.

  • The better you understand the pressures and constraints of the rest of the organization, the more likely you’re going to get what you want. That’s not just about coding, that’s a lesson for life.

  • There are good ways and bad ways to ask for more time to work on unit testing. The two good ways are to explain it in terms of business outcomes or engineering impact.

  • Business outcomes could be, “We should have more unit testing, because the number of user identified bugs has doubled over the last six months. Those came from users at our biggest customers, our most important customers. If we had more testing, we could keep those customers happier.” That is language that the CEO and the CFO and sales understands.

  • The second good reason is engineering impact. We need to add more tests, because we estimate that our team could be 10 percent more effective in six months, because they’re not hand-checking their code each time. Testing speeds up the velocity and throughput of engineering work. Either or both of those reasons, provided they are true, are compelling reasons for an executive team to invest.

  • Unfortunately, what is not a compelling reason for almost any organization is: we don’t think the code is as good as it should be. The quality is not as good as it should be. We would like it to be better.

  • You want to be persuasive. It’s true about code quality, functional or non-functional requirements. It’s true about anything, understand what matters to your audience.

  • If your listeners can think about how non-functional requirements can impact the organization, they can make requests for resources, for support, for time to work on them that are more likely to succeed than if they just express it in terms of the craftsmanship of the code.

Measuring & Improving Developer Productivity

  • There are many ways to think about developer productivity and the effectiveness of the engineering organization.

  • As easy as it is to pick wrong metrics that tell the wrong story, if you can explain it quantitatively, you’re more likely to have a conversation effectively with other members of the organization.

  • There’s always going to be this tension. I say to non-technical executives, imagine going to talk to a potter. If you weren’t an expert potter yourself, you would not go to that potter and say, you should spin that wheel faster, or you should use this kind of clay instead of that kind of clay. Because pottery is a craft. It’s not reducible to: how many pots can you make, and can you make them faster? Much of coding can feel like that in a good way.

  • One of the amazing things about careers in engineering is you get to do a craft for a living. You get paid well to work on something that you can be proud of and put yourself into. It would be easier for the rest of the business if you could explain the effectiveness of the team in a way that makes sense to the non-tech folks.

  • There are different ways to measure engineering effectiveness and efficiency. We love DORA metrics. By the time you’re a large software team, you should be using DORA metrics, but for the rest, you should figure out a system that works for you and is understandable to the rest of the organization.

GenAI for Increasing Developer Effectiveness

  • One change in understanding developer effectiveness and increasing developer effectiveness is the use of generative AI tools for coding.

  • If your company doesn’t let you, email me and I will send your CEO, CTO, or Chief Legal Counsel a note explaining how important it is for engineers to be able to use the right kinds of GenAI tools at work.

  • We measure how much Gen AI code is in the codebase. In that credit score, companies get a red if their coders aren’t using it enough. Because GenAI can be such a contributor to developer productivity that not doing it leaves value on the table. Organizations go slower.

  • It’s like open source. Imagine a company saying, we’re not going to use open source. We’re going to build everything ourselves. That would be crazy! It means you’re not taking advantage of best practices that someone else has figured out, a community has figured out. It means you’re wasting time. It would be annoying as a developer to do this by hand when a good solution exists.

  • We are at that stage of GenAI code usage, it would be crazy in almost any setting to start from scratch on a new topic, rather than seeing how GenAI assistance can help. Because of this opportunity to increase developer throughput through GenAI tools, our CTO dashboard measures how much GenAI code is in use. We explain it back in terms of a return on investment for the organization. It helps make the case for those GenAI tools for the coders.

CTO Dashboard & The 7 Metrics

  • Today, we use seven modules: how much GenAI is in use; security of the code; cybersecurity (e.g. pen test results and dark web scans); and open source legal risk (also known as open source intellectual property risk). Using open source is a good idea, but certain packages come with intellectual property risks that have to be managed, which matters to medium and large organizations.

  • The other three are some forms of code quality or technical debt. Process is the sixth module, about the consistency, or lack thereof, and the trends in development activity. The seventh area is team. In the team module, most importantly, is the retention of developers who know the code, the deep subject matter experts in the code.

  • Everyone knows how important it is to have the expertise of developers. No matter how good the documentation is, no matter how skilled a new developer is, there’s wisdom and knowledge about that codebase itself. Developer retention matters.

  • The team module is the single most important of the modules with respect to the overall score.

Measuring GenAI Usage

  • One of those methods is in production, a deep learning based prediction of GenAI code versus not GenAI code. Deep learning means training. We started with code that was not GenAI originated, and code that was GenAI originated.

  • For GenAI originated code, we synthesized it. We had an LLM produce samples that came out of the LLM. For not GenAI code, we used open source code created before GenAI tools, so we had true comparison sets.

  • That was a good starting point for the prediction model, but we’ve added data science on top of it to refine it, to further train the model itself, and to add complementary rules based on the data science team’s work. When we find things that we can conclusively prove are GenAI or not GenAI, we override what the model says.
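As a much simpler stand-in for the pipeline described above, here is a toy sketch of the two-layer idea: a statistical model trained on labeled samples (synthetic LLM output vs. pre-GenAI open source code), with hand-written override rules that win whenever a snippet can be conclusively classified. The trigram scorer is purely illustrative and is nothing like the deep-learning model Sema actually uses.

```python
from collections import Counter

# Toy illustration of "model + conclusive rule overrides". The model is a
# trivial character-trigram scorer, standing in for a trained classifier.

def trigrams(code: str) -> Counter:
    return Counter(code[i:i + 3] for i in range(len(code) - 2))

class ToyGenAIDetector:
    def __init__(self):
        self.genai_profile = Counter()   # trigram counts from GenAI samples
        self.human_profile = Counter()   # trigram counts from pre-GenAI code
        self.override_rules = []         # (predicate, verdict) pairs

    def train(self, samples):
        """samples: iterable of (code, is_genai) labeled pairs."""
        for code, is_genai in samples:
            profile = self.genai_profile if is_genai else self.human_profile
            profile.update(trigrams(code))

    def add_override(self, predicate, verdict: bool):
        """Rules for snippets that can be conclusively classified."""
        self.override_rules.append((predicate, verdict))

    def predict(self, code: str) -> bool:
        # Conclusive rules win over the statistical model.
        for predicate, verdict in self.override_rules:
            if predicate(code):
                return verdict
        grams = trigrams(code)
        genai_score = sum(self.genai_profile[g] * n for g, n in grams.items())
        human_score = sum(self.human_profile[g] * n for g, n in grams.items())
        return genai_score > human_score

# Tiny demo with toy training strings:
det = ToyGenAIDetector()
det.train([("aaaa", True), ("bbbb", False)])
print(det.predict("aaa"))   # True: closer to the GenAI profile
det.add_override(lambda c: "generated-by-llm" in c, True)
print(det.predict("bbbb generated-by-llm"))  # True: override rule fires
```

The design point, as described in the episode, is that learned predictions and hand-verified rules coexist, with the rules taking precedence whenever they apply.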

Healthy Dose of GenAI Usage

  • Sema tries to be objective on balancing the risks and rewards of using GenAI code, and setting standards for the industry. Those standards have two parts. One is about GenAI originated or not. The second is if it is GenAI originated, how blended is the GenAI code?

  • We’ll take GenAI originated first. The safe zone, what would be green or a strength in the credit score, is 15 to 30 percent of the code originating with GenAI. Less than that could be yellow (low risk), orange (medium risk), or red (high risk), depending on how much less GenAI code is being used. For reasons of organizational effectiveness, you’re not going to be able to go as quickly as you could if you were using GenAI coding the right way.

  • We think it is appropriate to be using more GenAI if you have the proper protocols and safety measures in place to be using GenAI appropriately. For GenAI code usage, you can solve being a red or high risk by making sure engineers have the right licenses. It’s important that they use licenses that don’t allow the model to train on your code. That’s one part of the standards: GenAI originated or not.

  • The other part is pure versus blended GenAI code. As big of a fan as we are of GenAI code, it frequently is wrong. It hallucinates. It lacks context. It’s insecure. It makes up packages. If you looked at a codebase that was entirely GenAI pure, meaning it came out of the prompt and hadn’t been modified, you should be nervous. Did they not find any security issues, or did they find them and not fix them? What about the institutional knowledge? There are things that can’t be solved just through prompting.

  • If more than 10 percent of the total codebase is pure GenAI, we start asking more insistent questions about how you got to this place.

  • For example, if you had 50 percent GenAI usage, which would potentially be red, but 5 percent pure, that means they are blending 45 percent of it. If they were using the right kinds of tools in the right way, that would go to strength. By contrast, if you had 15 percent GenAI code usage, but all 15 percent was pure, that would be high risk, or at least medium risk, because you’re not taking the precautions to make sure the code is as high quality as it needs to be.
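The two-part standard Matt outlines can be sketched as a small rating function. The 15–30 percent green zone and the 10 percent pure-code threshold come from the episode; the exact bands below the green zone and the string labels are simplified assumptions for illustration.

```python
# Toy rating of GenAI code usage, following the two-part standard from the
# episode: (1) share of code originating with GenAI, (2) how much of the
# total codebase is "pure" (unmodified prompt output). The sub-green bands
# are illustrative assumptions, not Sema's actual rubric.

def rate_genai_usage(genai_pct: float, pure_pct: float) -> str:
    """genai_pct: % of codebase originating with GenAI.
    pure_pct: % of the total codebase that is pure, unblended GenAI code."""
    # Part 2: too much pure GenAI code is risky regardless of overall usage.
    if pure_pct > 10:
        return "high risk: too much unmodified GenAI code"
    # Part 1: the healthy adoption zone.
    if 15 <= genai_pct <= 30:
        return "strength"
    if genai_pct > 30:
        # Heavy but well-blended usage can still be a strength,
        # given the right tools, protocols, and licenses.
        return "strength (assuming proper protocols and licenses)"
    # Under-adoption bands (thresholds below are invented for the sketch).
    if genai_pct >= 10:
        return "low risk: under-using GenAI"
    if genai_pct >= 5:
        return "medium risk: under-using GenAI"
    return "high risk: leaving productivity on the table"

# The episode's worked examples:
print(rate_genai_usage(50, 5))    # heavy but blended -> a strength
print(rate_genai_usage(15, 15))   # all pure -> high risk
```

Note how the pure-code check runs first: in the 15-percent-usage, 15-percent-pure example from the episode, the blending failure dominates even though overall usage sits in the green zone.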

Generative AI Bill of Materials (GBOM)™

  • The traditional software bill of materials (SBOM) is about the open source packages being used.

  • Why does an SBOM matter? Because open source code has security risks: CVEs. It has intellectual property risk if you use certain licenses the wrong way. It has operational risk: even if there aren’t any security issues, if the code is old and you’re many versions behind, there’s a risk of it no longer working. Security risk, intellectual property risk, maintainability risk for open source: that is what led to a bill of materials. They are used by insurance companies, lawyers, and due diligence teams in evaluation settings.

  • Same thing applies for Generative AI, which is why there is now a GBOM, a Generative AI Bill of Materials.

  • Sema is honored to be the inventor of the Generative AI Bill of Materials. GenAI is a good idea, like open source. But it has maintainability risk. It lacks context. It has security risk. It comes with vulnerabilities, like other code. It has intellectual property risk under certain circumstances.

  • The Generative AI Bill of Materials, or GBOM, is used in high stakes situations. Today, it’s part of diligences. If you have a software team, and you’re getting investment from one of the best software investors, there’s a chance they’re going to ask you for a GBOM that Sema would produce as part of the scan. Over the next years, this will become more important for procurement, for insurance, for regular check-ins, for the same reasons that the traditional BOM is used.

Technical Due Diligence

  • Technical due diligence is near and dear to our heart, because it’s how Sema got started. We’ve had the pleasure of evaluating a trillion and a half dollars’ worth of companies in technical due diligence and other settings.

  • In the olden days, technical due diligence was done entirely qualitatively. You would read code. You’d read snippets of code. You would interview people. You’d get a sense of the team. Now, back to code as a craft, those are important, but imagine trying to understand more than 100,000 lines of code by reading it and trying to get a sense of it. It’s not possible, much less if there are millions, tens of millions, billions of lines of code. We’ve scanned companies with billions of lines of code.

  • A real challenge is, if you’re about to write a serious cheque to invest in or to buy a software organization, we recommend understanding it qualitatively and quantitatively to make sure that you’re de-risking it, and not surprised by any amount of cleanup that you have to do on the other side.

  • We believe that quantitative scans are, in partnership with qualitative assessment, an important part of doing tech due diligence the right way. That’s from the buyer’s perspective.

  • From the seller’s perspective, a challenge can be explaining the choices that you’ve made in the context of the size and stage of your organization.

  • I would expect that most times when one company is buying another company’s software, the buyer is more advanced or further along in the market. Because they’re further along in the market, they can afford to fix product technical debt and security debt. The most important thing for early stage businesses is not non-functional requirements; it’s product market fit. It’s traction. It’s revenue. If we were to see an early stage codebase that was “totally clean”, “totally perfect”, and there’s no such thing as totally perfect anyway, it would be a red flag that they aren’t building as fast as they can.

  • You should be in that risk zone. That is how technical debt, including security debt, should work. You should carry some debt with you.

  • Depending on your size and stage, if you are on the receiving end of a technical due diligence, selling your company or raising money, explain your choices in terms of business goals. I love code quality, but investors don’t care about code quality for its own sake. They care because it delivers business outcomes. It helps developers go faster.

  • Too much tech debt slows down engineers and could hurt revenue. Even if you’d like it to be better, put it in terms of how you’ve made those choices. That is the way to be persuasive, whether it’s a manager that you work with or it’s an investor or an acquirer who’s looking at your company.

Sema Adoption

  • We’re excited about the possibility and realized vision of Sema helping bridge that gap between tech and non-tech in a way that’s understandable and credible for both sides. Having established ourselves as a leader in technical due diligence, we’re now building the CTO dashboard; it is still in beta, but we’re excited about what’s coming.

  • When I started this business seven years ago, I would explain what we were doing by pointing at Salesforce. Salesforce is a tool that helps individual salespeople keep track of the sales work they are doing. It helps the head of the sales team, such as a Chief Revenue Officer, explain what is going on with sales to the CEO and to the Board of Directors. If you were anything more than a tiny company, you would never not have a CRM. You wouldn’t do it, because salespeople need a tool to manage their work, and executives need insight into sales. You couldn’t run a modern business without it.

  • Sales is more important than code quality. That’s why executive insight into sales has 100 percent adoption. Sema’s tool and other executive insight products into code do not have 100 percent adoption yet. But as the years go by, you’ll see adoption of a CTO-level dashboard get close to 100 percent, for the same reasons that sales uses something like a CRM.

  • With the addition of how much GenAI is changing things, with more code being written in full or in part by GenAI, you need automated systems to keep track, because much of it is outside of the heads of the individual developers. Because they’re relying more on GenAI. The rise of GenAI will be another factor leading to the rise of CTO dashboards.

Integrating with Sema

  • The CTO dashboard connects to GitHub, Azure DevOps, and several other developer tools, including security tools. It starts with whatever engineering tools you are using. If we don’t have them, we’re adding them.

  • We are opinionated about the metrics. If you come to Sema, we’ll give you an opinion. This is how an investor will view your code, because you should hear that opinion so you shouldn’t be surprised. But we are not opinionated on what tool you should use. That’s for you to decide.

  • Your business is to figure out the right tools and make your code optimal, which does not mean perfect. Perfect is not a good idea, and it’s not feasible. It means optimal, while you’re building product market fit and driving revenue. Our job is to help you tell that story with whatever set of tools you decide to use.

  • It is important to be able to explain code to non-coders, but it’s hard, and it’s easy to make poor choices, like treating an increase in lines of code as progress. You do have to keep track of how big the codebase is; I’m not saying you should never pay attention to lines of code.

3 Tech Lead Wisdom

  1. Figure out how to explain code to non-coders. The more empathy you have, regardless of what you want to do for your career, the easier it will be for you to achieve that for your code, business, team, and yourself.

  2. Treat your career as a series of hypotheses to test.

    • If you’re a team lead and think you want to be an engineering manager, try it. Give it your best effort, learn, and do it for two years. The best you can.

    • If you’re going to work for 40 years, two years of investment and seeing if you want to be an engineering manager is a good use of time because if you like it, it’s a good job. If you don’t like it, you have 38 other years when you’re not being an engineering manager. You’ve ruled it out. You’ve collected data instead of dreaming about alternatives, and that is important for high quality career choices.

    • It’s not imagining how a different job could be, it’s experiencing it. Because the grass is always greener. If you compare your current job to the theoretical best parts of all the other jobs you might take, you’re going to be unhappy. Because no job is better than the theoretical composition. But that’s not the next job you’ll have. The next job you’ll have is a specific job with a specific boss in a specific sector. So try it out.

    • I’m not telling you to switch your jobs. Take on different projects within your organization. But collect data about the things you like to do.

  3. Learn, experiment with different stages of organizations because that’s going to be a key part of professional happiness.

    • Most of us are good at thinking about the industry we’re part of: gaming, fintech, insurtech. We’re also good at figuring out the role. You want to be a tech lead. You want to be an individual contributor. You want to be an architect. You want to be a CTO. Those are important questions. But my experience is folks undercount an important third dimension, and that is the stage of the organization. Just like the stage matters for codebase health, it matters for job satisfaction.

    • Broadly speaking, there are four stages that are different. Startup, zero to one. Fast growth or scale up, one or two to seven or eight. A great organization that’s staying that way, it’s in the eight to ten range. And turnaround, where it used to be good, but now it’s low, and you’re trying to get back.

    • I know those four stages are different, because I’ve had enough jobs to sit in those different places, and I’ve experienced them. I also know that I’ve never met anyone who was happy in more than three, and most people are only happy in one or two.

    • Even if everything else is right about your job title and comp and people and sector, folks who like startups are impatient when an organization moves slowly. And folks who don’t like messes don’t like turnarounds. Some people love turnarounds, and if that’s you, the world needs more people who can do that.

Transcript

[00:01:36] Introduction

Henry Suryawirawan: Hello, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today, I have with me Matt Van Itallie. He’s the founder of Sema, a company that I found very interesting when I did my research. So Matt, I think today we’re going to talk about what Sema is doing and all the things that are probably unique for the tech listeners here to learn from you. So welcome to the show.

Matt Van Itallie: Thank you so much, Henry. I’m so glad to be talking with you and really looking forward to this conversation.

[00:02:05] Career Turning Points

Henry Suryawirawan: Right. Matt, I always love to ask my guests first, maybe to share any kind of career turning points that you think will be interesting for us to learn from that.

Matt Van Itallie: Absolutely! So I am the son of a computer programmer and a math teacher. And as you’re about to hear, I had a pretty varied career before Sema, but Sema ultimately is about treating code as data, code as math, and so it kind of is the connection all the way back from the beginning, although I most certainly didn’t learn it, didn’t know it at the time. I learned BASIC on a Commodore 64, so I show my age, but was not a professional coder. I worked as a management consultant, serving governments and private sector organizations. I then served in government doing data, data and analytics for school districts. And then went into software, doing data and analytics and customer support.

And I decided to found Sema for a pain point that I personally was experiencing, which is, as someone who is very data focused and, let’s say, tech savvy, I nonetheless struggled to understand what was coming out of the engineering team leadership. Given their deep levels of technical expertise, and as a not deeply technical person myself, I didn’t quite understand it as well as I wanted to. And so we founded Sema to bridge the gap, to make technology understandable to non-technical audiences, like the C-suite and the Board of Directors, and also make it easier for non-technical folks to make their preferences known back to the engineering team, to bridge that gap.

[00:05:15] Treating Code as Data and a Craft

Henry Suryawirawan: Very interesting story in the beginning when you mentioned code. You treat code as data, right? So I think maybe for some of us, it’s unintuitive. So maybe tell us a little bit more. What do you mean treating code as data? Because all along when we write programs, right, we write code to actually execute something, right, in order to do something. So why are you treating code as data? Maybe a little bit on here.

Matt Van Itallie: Absolutely. So I'm going to answer that, and then I'm going to talk about code as a craft, because that's a really important caveat, but I'll answer your question first. So Sema has two products today, and I'm just going to use one as an example. One of them is like a credit score for a software engineering organization, in that it analyzes many different sources of information about the codebase and comes up with a one to a hundred score. Which of course can be very controversial. It's easy to do that wrong and hard to do it right, and we spent many, many years trying to get those numbers right.

Let me give you just a tiny example of what we do. We look at the number of in-file security warnings, which are warnings that would be detected by a SAST or a DAST tool like Snyk, and we count the critical warnings. Because most businesses and organizations don't have time to fix all security issues, so they really focus on the most critical ones. And then, let's say the number is 250 high-risk security warnings. Now you could say 'objectively', I'm using air quotes if anyone's just listening, maybe that number seems too low or too high. I would suggest it depends. Whether 250 high-risk security warnings is too high, too low, or just right depends on the context of that organization.

And so when we look at that number, 250, we compare it to similar size and stage codebases. This is probably obvious to your listeners, but we have to explain codebases to non-technical audiences. What you'd expect to see on security and technical debt and all these other factors really varies depending on the size and stage of the business. So again, back to that 250 high-risk security warnings. If the company is young with a small codebase, which is a segment of fewer than two years old and fewer than 25 developers, having 250 high-risk security warnings would put them in the highest risk. You're smiling, because you can intuit it: that's a lot of security warnings for a company of that size and stage. Specifically, we put it in quartiles, and without even looking at the data, I can tell you that if you're a young and small codebase with 250 high-risk security warnings, you're definitely in the fourth, or highest risk, quartile.

By contrast, if you're in the segment called giant, and giant is a thousand or more all-time developers, not necessarily current, but all-time developers, 250 high-risk security warnings is best quartile. There are a couple of reasons, but one is that it's a lot of code, and you're going to maintain a certain level of security warnings as you go.
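
The segment-relative scoring Matt describes can be sketched in a few lines. This is an illustrative toy: the segment definitions ("young and small" is under two years and 25 developers, "giant" is 1000+ all-time developers) and the 250-warning examples come from the conversation, but the numeric quartile cutoffs below are made-up placeholders, not Sema's benchmarks.

```python
# Hypothetical peer-group cutoffs: warning counts at the 25th/50th/75th
# percentile of similar codebases. These numbers are invented for illustration.
HYPOTHETICAL_QUARTILE_CUTOFFS = {
    "young_small": (5, 20, 60),      # <2 years old, <25 developers
    "giant": (300, 800, 2000),       # 1000+ all-time developers
}

def risk_quartile(segment: str, high_risk_warnings: int) -> int:
    """Return 1 (best) to 4 (highest risk) relative to peer codebases."""
    q1, q2, q3 = HYPOTHETICAL_QUARTILE_CUTOFFS[segment]
    if high_risk_warnings <= q1:
        return 1
    if high_risk_warnings <= q2:
        return 2
    if high_risk_warnings <= q3:
        return 3
    return 4

# 250 warnings: highest risk quartile for a young, small codebase...
print(risk_quartile("young_small", 250))  # 4
# ...but best quartile for a "giant" codebase.
print(risk_quartile("giant", 250))        # 1
```

The point of the sketch is that the same raw number maps to opposite risk ratings depending on the segment it is compared against.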

And what I love about this story is that it's very specific, it's mechanical, and we really like being able to explain it so there's no black box involved. I should say that that credit score report is frequently used in technical due diligence, where someone is thinking of buying a software company, or a team, or a company that has software, and that's a very scary, very high-stakes moment. And so we want to be as explainable as possible given how stressful it is, being able to explain it that mechanically and give the context. Again, it's not right or wrong. It's just the decisions the team and the business have made all along. So that's a very long answer. But only part one.

And the second part, because I'm so passionate about this, is that as much as you can understand code with data, it only goes so far. And I think the best thing I've written over the last seven years is a piece from two years ago called 'Code is a Craft, Not a Competition', and that is fundamental to my worldview and also to what Sema does. And again, I know this is obvious to your listeners, but business executives think of the sales team or sales numbers. And with sales, more sales is better. Like sports: more points is better, more goals is better. In code, are more lines of code better? Maybe, right? Some of the best coding work people have ever done is removing code, right? Or instead of writing something, finding the right open source package, right?

And so at Sema, our job is to produce the data, but then also to facilitate a conversation. Okay, we saw this trend. What is the reason behind it? And frankly, it's a question not just for the engineering team. It's a question for sales and product and finance and the CEO, right? Go back to that security risk issue. I've never met a team that didn't want to fix security risks or clean up technical debt, but there's not always money for that. There's not always time, given the business needs. So coding is a craft, but we have to fit it into these business contexts where there are optimization functions. And so that conversation matters so much more.

[00:11:09] Comprehensive Codebase Scans

Henry Suryawirawan: Wow, very intriguing and interesting, definitely, right? When we talk about code as data, right? So you kind of like analyze maybe from patterns and, you know, contextual thing as well, team size and things like that. And you mentioned that this is kind of like similar to some static code analysis, probably, like you mentioned about Snyk, SAST, DAST tools, right? Maybe some engineers are also familiar with static code analysis, maybe like the linter, so maybe some kind of code pattern, cyclomatic complexity, and things like that. So how does Sema actually differ from all these tools, or is it more like working on top of these tools and aggregating, you know, more patterns and more signals from those tools?

Matt Van Itallie: That is an amazing question, Henry! We have two products, and the first one is called Comprehensive Codebase Scans, which is the credit score report. That is a point-in-time report using those static tools that you described. It's not dynamic; we don't build the code. So a static SAST tool, a CVE scanner, a linter, a cyclomatic complexity scanner, about 50 of those tools all at once. If I can use an analogy, the code scans product is like bringing your car to a different mechanic to understand how much maintenance has been done elsewhere. And the other maintenance is using SAST tools and linters on a regular basis. Hopefully a security tool, hopefully linters, etc.

We're brought in at what we call moments of evaluation. It started with technical due diligence, so we're helping a buyer decide if a company is safe enough, if the codebase is healthy enough to buy. We also help with sell-side prep, which means a company wants to get bought, and they want to know what score they're going to get before they go into the test, so to speak. And so very commonly, one of the findings from a Sema scan is that you now need to give your developers the tools you described. Because that product is not a SAST tool. It is a point-in-time evaluation of the state of all of the tooling you have. So that's one of our products.

The other product is a roll-up of existing SAST tools, to be able to tell that story in an ongoing way. So if you're using a security tool, let's just use Snyk, and it reports 250 warnings, and you're of this size, you're going to get this score, and you can see that over time. So a roll-up of all those other tools. That's the SaaS version of this. Fundamentally, neither one requires engineers to change tools. I don't need to tell you: you should let engineers pick their own tools. Back to code is a craft, the professionalism, the commitment to doing it right. Even when it's very important for executives to understand what's going on, you still want to give as much decision-making power and authority as you can to engineers and engineering teams.

Henry Suryawirawan: Right. So thanks for sharing what your tool is doing, right? So I think I can pick two interesting things that probably we can discuss further, right? The first is about engineering leaders knowing the state of their codebase, the healthiness, how the team is doing, right? And the other one is the technical due diligence thing, right?

[00:14:31] How to Explain Codebase Health

Henry Suryawirawan: Maybe let's start with the challenge of engineering leaders knowing the signals from the codebase, right? Because, to be transparent, leaders won't have time to actually go deep into the code and look at everything the developers are doing, especially if the teams are large, right? So maybe tell us a little bit more: what kind of challenge are engineering leaders facing here? Why should engineering leaders know the credit score, so to speak, of their systems?

Matt Van Itallie: Yeah, and Henry, you are great at helping your listeners think about their career planning. So I’m going to do a little bit of career planning first. I believe that, as hard as it is, it’s really important for everyone in their careers to start developing a perspective on what do they want to do next, and what do they want to do in five years, and what do they want to do in ten years. It doesn’t matter if it changes. And by the way, it is going to change, because as you develop in your career, you learn things. Hopefully, you’re testing hypotheses by having different experiences either within the same job or in different jobs. And you can see I’m a data driven guy. I love collecting data on those experiences and then using that to decide what you want to do next. I’ve been very lucky to have these very varied experiences.

So of course, if you are an engineer or an engineering lead, and you have an interest in becoming more and more of a manager, as a VP of Engineering, CTO, or CEO yourself, the more you have to understand the business or organizational impact of the code, and the less the specifics of the code matter. I'll just say business, although it applies to nonprofits and governments as well. So certainly for anyone listening who wants to advance in their career in management, you have to understand the business optimization function. And that is about codebase health, which gets summarized, because CEOs have to get the summary view of sales, and they have to get the summary view of marketing, so they will get the summary view of code one way or the other. And you, as listeners, are better served by being able to explain that in a way that the CEO will understand, so she can support you in the ways that you need.

And I would say, even if you are very interested in staying at your level, and I don't mean that as a judgment, I just mean organizational level: the better you can understand the pressures and the constraints of the rest of the organization, the more likely it is that you're going to get what you want. By the way, that's not just about coding, that's really a lesson for life.

I'll give you my favorite example. When we do these code scans, we have a report-out to teams, and we measure their testing levels. Let's just focus on unit testing for a second. We discuss with them what led to the unit testing levels that the organization has, and then we ask them: do you think unit testing should be increased? And it turns out there are good ways and bad ways to say 'I want more time to work on unit testing.' The two really good ways are to explain it in terms of business outcomes or engineering impact.

Business outcomes could be: yes, I think we should have more unit testing, because the number of user-identified bugs has doubled over the last six months. And, to go a step further into the business world: we recognize that those bugs came from users at some of our biggest, most important customers. So if we had more testing, we could keep those customers happier. That is language that the CEO and the CFO and sales really understand. So that's a business reason.

The second very good reason is engineering impact. Which is to say: we need to add more tests because we estimate that our team could be 10 percent more effective in six months, since they wouldn't just be hand-checking their code each time. That testing actually speeds up the velocity and throughput of engineering work. Either or both of those reasons, assuming they're true, are very compelling reasons for an executive team to invest. Unfortunately, what is not a compelling reason for almost any organization is: we don't think the code quality is as good as it should be, and we would like it to be better.

Now, you can get to the right answer, you just have to frame it right. And again, you have to be honest. But if you want to be persuasive, and it's true about code quality, functional or non-functional requirements, it's true about anything, understand what matters to your audience. So, long story short, Henry, if your listeners can think about how non-functional requirements impact the organization, then they can make requests for resources, for support, for time to work on them, that are more likely to succeed than if they just express it in terms of the craftsmanship of the code.

Henry Suryawirawan: Wow. So I think many listeners who heard what you just said can definitely relate to some of the pains, especially when you're growing, right? When you're small, your codebase and its healthiness are kind of understandable. But once you have larger teams and multiple streams of delivery, that's where the challenge is. I used to work in a scale-up as an engineering leader. It's always difficult when you're asked to explain how engineering is doing at this point in time. What are the challenges? Why is it so slow to deliver something? What is the tech debt's impact on the business? It's always very difficult, and quantifying that is not an easy job.

[00:20:31] Measuring & Improving Developer Productivity

Henry Suryawirawan: So, these days, there are so many trends about developer productivity, measuring code metrics and all that. So how does this differ to, you know, those kind of developer experience, developer productivity kind of thing? Is this kind of the same kind of techniques that we want to do in order to improve developer productivity? Or is this something different? Because I know that you also have a unique way of coming up with the metrics, you know, the CTO dashboard, so to speak. So maybe if you can explain a little bit.

Matt Van Itallie: Yeah. There are many, many ways to think about developer productivity and the effectiveness of the engineering organization. What I would say to your listeners is: as hard as it is, and frankly, as easy as it is to pick wrong metrics that tell the wrong story, if you can explain it quantitatively, you're just so much more likely to be able to have an effective conversation with other members of the organization.

Now, there's always going to be this tension. I sometimes say to non-technical executives: imagine going to talk to a potter, a potter who works in a pottery guild for fun, who sits at a wheel and throws clay pots. If you weren't a very expert potter yourself, you would not go to that potter and say, you should spin that wheel faster, or you should use this kind of clay instead of that kind of clay. Because pottery is a craft. It comes with expertise inside. It's not really reducible to, well, how many pots can you make, and can you make them faster? Especially as a hobby, it's really different.

You're laughing. I'm sure much of coding can really feel like that, in a good way. And I think one of the most amazing things about careers in engineering is that you actually get to do a craft for a living. You get paid, sometimes very well, to work on something that you can be really proud of and put all of yourself into. I would suggest it would make things easier for the rest of the business if, in exchange for having such a neat job, and it's not always neat, but frequently it is, one of the things you could do is try to explain the effectiveness of the team in a way that makes sense to the non-tech folks.

So it's kind of a non-answer first, Henry: there are many different ways to measure engineering effectiveness and efficiency. Obviously, we love DORA metrics. We think those have really proven to be very strong. So generally speaking, at least by the time you're a large software team, you should be using DORA metrics. Beyond that, in general, you should figure out a system that works for you and is understandable to the rest of the organization.

[00:23:33] GenAI for Increasing Developer Effectiveness

Matt Van Itallie: That said, one change in understanding developer effectiveness, and frankly increasing developer effectiveness, is the use of generative AI tools for coding. I am sure that almost everybody here is using GenAI coding tools at work. If your company doesn't let you, email me and I will send your CEO, CTO, or Chief Legal Counsel a note explaining how incredibly important it is for engineers to be able to use the right kinds of tools. We now measure how much GenAI code is in the codebase. And in that credit score, companies get a red if their coders aren't using it enough. Because GenAI can be such a contributor to developer productivity that not doing it leaves value on the table. Organizations go slower.

You know I like analogies; it's like open source. Imagine a company saying, we're not going to use open source, we're going to build everything ourselves. That would be crazy! It would mean you're not taking advantage of best practices that someone else, a community, has figured out. It would mean that you're wasting time. And of course, it would be incredibly annoying as a developer to have to do things by hand when a good solution exists. We are absolutely at that stage of GenAI code usage: it would be crazy in almost any setting to start from scratch on a new topic rather than seeing how GenAI assistants can help, whether they're specific to coding like GitHub Copilot, or a general LLM like ChatGPT. It absolutely makes sense for developers and engineers. Because of this opportunity to really increase developer throughput through GenAI tools, our CTO dashboard actually measures how much GenAI code is in use. And we explain it back in terms of a return on investment for the organization. Executives want to see the ROI on things. So if everyone is now getting a tool at, say, $40 per user per month, being able to show the impact, the return on investment, helps continue to make the case for those GenAI tools for the coders.
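
As a rough illustration of that ROI framing, here is a back-of-the-envelope sketch. The $40 per-seat price comes from the conversation; the loaded salary and productivity-gain inputs are hypothetical, and this is not Sema's actual methodology.

```python
def genai_roi(developers: int, seat_cost_monthly: float,
              avg_loaded_salary_yearly: float, productivity_gain: float) -> float:
    """Return an annual ROI multiple: value created per dollar of tooling spend."""
    annual_tool_cost = developers * seat_cost_monthly * 12
    annual_value = developers * avg_loaded_salary_yearly * productivity_gain
    return annual_value / annual_tool_cost

# 100 developers, $40/month seats, $150k loaded cost, a 10% effectiveness gain:
# value of $1.5M against $48k of tooling spend.
print(genai_roi(100, 40, 150_000, 0.10))  # 31.25
```

Even with much more conservative inputs, the multiple stays well above one, which is the kind of argument executives respond to.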

[00:25:59] CTO Dashboard & The 7 Metrics

Henry Suryawirawan: You just opened up a topic that I think pretty much everyone is interested in at the moment. But before we go into discussing more about GenAI, what are the things that you put in your CTO dashboard? We talked about GenAI just now, and about security findings. What are the things that you actually include as part of your credit score?

Matt Van Itallie: Absolutely. At that level of abstraction, we call them modules, and today we use seven. How much GenAI is in use. Security of the code. Cybersecurity, so pen test results and a dark web scan, if you will. Open source legal risk, otherwise known as open source intellectual property risk: using open source is an extremely good idea, but certain packages come with intellectual property risks that have to be managed, so that is something that matters to medium and large organizations.

And then the fifth is various forms of code quality or technical debt. Process is the sixth module, and that is about the consistency, or lack thereof, and the trends in development activity. A quick story on that one. We looked at a company in the fall of 2020 and did a credit score report on them, and they had a 20% decline in development activity in the spring. Now 20%, if it's a small team, is not that much, but this was a big, big team, and 20% was meaningful. I love code as data, but code is a craft and needs a conversation. So we went back and asked that code owner: hey, we observed that in the spring of 2020 your development activity seemed to drop off by about 20%. And this was recent; we were doing the report in the fall. And they said, you're right. In early spring 2020, COVID came and had a really negative impact on our business. And to make sure that we could be financially secure, we furloughed employees for one day a week. They got unpaid Fridays off. I literally still get goosebumps about this story. They went from working five days a week to working four, and the data showed this 20 percent decline. In that case, the data was really clear, and there was a good story behind it. So that's an example of process metrics.

And the seventh area is team. What's in the team module, most importantly, is the retention of developers who really know the code, the deep subject matter experts in the code. Everyone listening knows how important it is to have the expertise of developers. No matter how good the documentation is, no matter how skilled a new developer is, there's wisdom and knowledge about that codebase itself. That developer retention really matters. When Sema got started, many of our clients did not know that, or didn't know it as deeply and as quantitatively. And so the credit score shows, as an example, that if you have a small number of security warnings and a consistent development process, but all of your developers have quit and been replaced, that's super risky, far riskier than the reverse. So the team module is frankly the single most important of the modules with respect to the overall score.

Henry Suryawirawan: Wow, so very intriguing to see these various dimensions and aspects that you just covered, right? We definitely can't cover all of them, but let's try to dive deep into some of them. The first one is definitely the GenAI usage. I think this is sometimes a very controversial topic. Some companies allow developers to use it, maybe even freely, subscribing to all the different AI tools. But some companies are more cautious because of data leakage, or maybe code IP concerns, right? And things like that.

[00:29:55] Measuring GenAI Usage

Henry Suryawirawan: So in the first place, I want to understand from you: how do you actually come up with these metrics, like the percentage of GenAI usage? Because, to me as a layman, the way I see us using it is copy-pasting from the tools back into the code, right? There's no trace of where it was copied from. So tell us, in the first place, how can you get this data on GenAI usage?

Matt Van Itallie: Yeah, so there are two methods that Sema is working on. One of those methods is in production, and it is a deep learning based prediction of GenAI code versus not-GenAI code. Of course, deep learning means training. And so we started with code that was definitely not GenAI originated, and code that definitely was GenAI originated. How did we do that? Well, for GenAI originated code, we synthesized it. We had an LLM produce samples that by definition came out of the LLM. And for not-GenAI code, we used open source code created before GenAI coding tools were in the wild, so we had true comparison sets. That was a good starting point for the prediction model, but we've since added a ton of data science on top of it to refine it, both to further train the model itself and to add complementary rules based on our data science team's work. When we find things that we can conclusively prove are GenAI or not GenAI, we override what the model says. That itself is a super fun problem. For those of you who like AI, I hope you get to work on building AI solutions; that one has been a really fun one for our engineers and scientists. It's a hard problem, and I'm very proud of what they've been able to produce so far.
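
Sema's production system uses deep learning plus rule-based overrides. As a deliberately simplified sketch of the same idea, here is a naive Bayes-style token-frequency classifier trained on two tiny corpora standing in for "definitely GenAI" (synthesized by an LLM) and "definitely human" (pre-GenAI open source) code. The corpora and tokenizer are toys, not Sema's model.

```python
import math
from collections import Counter

def tokenize(code: str) -> list[str]:
    # Crude tokenizer, for illustration only.
    return code.replace("(", " ( ").replace(")", " ) ").split()

def train(corpus: list[str]) -> tuple[Counter, int]:
    """Count token frequencies across a training corpus."""
    counts = Counter(t for snippet in corpus for t in tokenize(snippet))
    return counts, sum(counts.values())

def log_likelihood(tokens: list[str], counts: Counter, total: int, vocab: int) -> float:
    # Add-one smoothing so unseen tokens don't zero out the score.
    return sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)

def classify(snippet: str, genai_model, human_model, vocab: int) -> str:
    """Label a snippet by which corpus makes it more likely."""
    tokens = tokenize(snippet)
    g = log_likelihood(tokens, *genai_model, vocab)
    h = log_likelihood(tokens, *human_model, vocab)
    return "genai" if g > h else "human"

# Toy stand-ins for the two training corpora described above.
genai_corpus = ["result = helper(a, b)", "result = helper(x, y)"]
human_corpus = ["total += price", "total -= discount"]
genai_model = train(genai_corpus)
human_model = train(human_corpus)
vocab = len(set(genai_model[0]) | set(human_model[0]))

print(classify("result = helper(p, q)", genai_model, human_model, vocab))  # genai
```

A real system would train a deep model on far richer features and, as Matt notes, layer conclusive rules over the statistical prediction.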

[00:31:51] Healthy Dose of GenAI Usage

Henry Suryawirawan: So maybe let's imagine that you can come up with this percentage of GenAI usage in the codebase. What is the healthy mix here? Is it like, I don't know, 50%, 10%, 30%, or even higher than that? Because in the beginning you said that everyone should use GenAI, and if not, you are missing a lot of potential. So what is the healthy mix in your view?

Matt Van Itallie: For sure. Sema sees ourselves as trying to be as objective as we can in balancing the risks and the rewards of using GenAI code, and frankly, setting standards for the industry. I'm sure you're going to mention it at the end of this call, but if you go to semasoftware.com, the first pinned blog article is the current version of our GenAI standards, which we have updated probably three times this year, and it's going to evolve over time as the tools change. Those standards have two parts. One is about whether code is GenAI originated or not. The second is, if it is GenAI originated, how blended is the GenAI code? So we'll do GenAI originated first.

Right now, the safe zone, what would be green or a strength in the credit score, is 15 to 30 percent of the code originating with GenAI. Less than that, as I mentioned, could be yellow (low risk), orange (medium risk), or red (high risk), depending on just how much less GenAI code is being used. Again, for reasons of organizational effectiveness: you're just not going to be able to go as quickly as you could if you were using GenAI coding the right way. And above that range could also be yellow, orange, or red.

Now, this ties into having a conversation. We do think it is very appropriate to be using more GenAI than that if you have the proper protocols and safety measures in place. Contrast it with security warnings: you should solve being a red in security warnings by getting rid of security warnings. For GenAI code usage, you can solve being a red, or high risk, by, for example, making sure engineers have the right licenses, ones where the vendor is not training their model on your code. That's an example where the behavior probably is okay, it just needs to be tweaked a little, as opposed to something that needs to fundamentally change. So that's one part of the standards: GenAI originated or not.

The other part is pure versus blended GenAI code. The idea behind this is that, as big a fan as we are of GenAI code, it frequently is wrong. It hallucinates. It lacks context. It's insecure. It makes up packages, right? All of these things, just like GenAI human language output. And so if you looked at a codebase that was entirely pure GenAI, meaning it came out of the prompt and hadn't been modified, you should be very nervous. Did they not find any security issues? Or did they find them and not fix them? What about the institutional knowledge? There are just things that can't be solved through prompting alone.

And so, for GenAI-originated code, we have a pure rating, and basically if more than 10 percent of the total codebase is pure GenAI, we start asking more and more insistent questions about how you got to this place. So that's, Henry, I hope you're having a good time. Every question you ask is a good one, and it takes me like five minutes to answer. But this one, I really wanted to give it the care. For example, if you had 50 percent GenAI usage, which would potentially be red, but only 5 percent pure, that means they are blending 45 percent of it. If they were using the right kinds of tools in the right way, that would go to strength. And by contrast, if you had, let's say, 15 percent GenAI code usage, but all 15 percent was pure, that would actually be high risk, or at least medium risk, because you're not taking the precautions to make sure the code is as high quality as it needs to be.
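
The two-part standard described here can be sketched as a small rating function. The stated numbers (15 to 30 percent as the green zone, more than 10 percent pure drawing scrutiny, high usage being acceptable with safeguards) come from the conversation; the exact labels and any band boundaries beyond those numbers are my own placeholders, not Sema's published standard.

```python
def genai_rating(genai_pct: float, pure_pct: float,
                 safeguards_in_place: bool = False) -> str:
    """Rate GenAI usage: overall share plus the share of pure, unmodified output."""
    if pure_pct > 10:
        # Too much unmodified prompt output in the total codebase.
        return "high-risk: too much pure (unblended) GenAI code"
    if 15 <= genai_pct <= 30:
        return "green: healthy GenAI adoption"
    if genai_pct > 30 and safeguards_in_place:
        # Above the band can still be a strength with proper protocols,
        # e.g. licenses where the vendor doesn't train on your code.
        return "green: high usage with proper safeguards"
    return "at-risk: usage outside the safe zone, review practices"

# Matt's two examples: 50% usage but only 5% pure, with safeguards, is a strength;
# 15% usage that is all pure is risky.
print(genai_rating(genai_pct=50, pure_pct=5, safeguards_in_place=True))
print(genai_rating(genai_pct=15, pure_pct=15))
```

The ordering matters: the pure-code check runs first, because heavy unmodified GenAI output is a red flag regardless of how much total GenAI code there is.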

Henry Suryawirawan: Wow, very interesting indeed. Because I use GenAI quite a number of times in my work these days as well, right? Sometimes I can see GenAI producing wrong output. Sometimes it couldn't even come up with the right output at all. So the pure versus blended distinction is definitely very important for engineers to understand: be conscious that what you copy from the GenAI output is not necessarily the best, given your context, given your design, given your solution, right? So I think that's kind of the key here.

[00:36:50] Generative AI Bill of Materials (GBOM)™

Henry Suryawirawan: And also, if we go to your website, we can see one other interesting thing. You have this concept called the Generative AI Bill of Materials. I know that these days in the DevOps world we have the SBOM, right? The Software Bill of Materials. So what is this GenAI Bill of Materials? How do you come up with it? And why is it important to also have this BOM for GenAI?

Matt Van Itallie: Yeah, well, you exactly understand it. So of course, your audience knows that the traditional bill of materials is about the open source packages being used. Why does an SBOM matter? Because open source code has security risks, CVEs. Because it has intellectual property risk if you use certain licenses the wrong way. Because it has operational risk: even if there aren't any security issues, if you're many versions behind, there's a risk of it no longer working. So security risk, intellectual property risk, and maintainability risk for open source, and that led to a bill of materials, and your audience has at least seen them. They are used by insurance companies, lawyers, and in due diligences and other evaluation circumstances.

Well, guess what? The same thing applies to Generative AI, which is why there is now a GBOM, a Generative AI Bill of Materials, and all kudos to our customers who asked us to build this. Sema is very honored to be the inventor of the Generative AI Bill of Materials. Because you just heard all the reasons. GenAI is a really good idea, just like open source. But it has maintainability risk. It can lack context, Henry. It has security risk; it can come with vulnerabilities, just like other code. And it has intellectual property risk under certain circumstances.

And so the Generative AI Bill of Materials, or GBOM, is used in high-stakes situations. Today, it's already part of diligences. So if you have a software team and you're getting investment from one of the best software investors, there's a reasonable chance they're going to ask you for a GBOM that, in this case, Sema would produce as part of the scan. Over the next years, this will become more important for procurement, for insurance, for regular check-ins, for exactly the same reasons that the traditional BOM is already used.
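
By analogy with SBOM entries, a GBOM entry might record where GenAI code lives and which of the risks discussed (security, IP, maintainability) attach to it. The field names below are my illustration of that shape, not Sema's actual GBOM schema, and the review rule reuses the 10-percent pure threshold mentioned earlier.

```python
from dataclasses import dataclass

@dataclass
class GBOMEntry:
    """Hypothetical per-file record in a Generative AI Bill of Materials."""
    file_path: str
    genai_share: float          # fraction of the file predicted GenAI-originated
    pure_share: float           # fraction that is unmodified prompt output
    tool: str                   # e.g. "GitHub Copilot", "ChatGPT"
    license_safe: bool          # vendor does not train its model on your code
    open_vulnerabilities: int   # security warnings attributed to this code

def needs_review(e: GBOMEntry) -> bool:
    """Flag entries carrying the GenAI risks discussed: pure code, IP, security."""
    return e.pure_share > 0.10 or not e.license_safe or e.open_vulnerabilities > 0

entry = GBOMEntry("src/billing/invoice.py", 0.40, 0.05, "GitHub Copilot", True, 0)
print(needs_review(entry))  # False
```

A real GBOM would aggregate records like this across the whole codebase so a diligence team, insurer, or procurement reviewer can audit GenAI exposure the way they already audit open source dependencies.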

Henry Suryawirawan: Wow, very interesting. I’m sure many listeners here will be very interested, because right now we can’t really trace which code is generated by AI. Maybe we just assume that this developer is very clever, right? But sometimes it’s produced by AI. So having this GBOM is definitely very, very useful.

[00:39:24] Technical Due Diligence

Henry Suryawirawan: That’s also a good segue, because you mentioned evaluation, technical due diligence, you know, investors wanting to understand the kind of software that their portfolio companies are producing. So tell us about the big challenges of technical due diligence, because for me, I’m just relating to my past experience, right? Sometimes when we want to buy a software, we actually don’t know what we are buying. The vendors always say nice things, great things, especially in the marketing materials, right? And we kind of assume the best intent from them. But actually, when we take the code, it’s actually quite crap, right? And there are so many things that we need to fix. So I think technical due diligence is very important, especially if you want to buy software or invest in a startup. So maybe elaborate, in the first place, what are the big challenges of technical due diligence in the current world?

Matt Van Itallie: Sure. So technical due diligence is near and dear to our heart, because it’s really how Sema got started. We’ve had the pleasure of evaluating a trillion and a half dollars’ worth of companies in technical due diligence and other settings. So to some extent, I see it less as challenges and more as opportunities. But fundamentally, remember back to code as data. In the olden days, technical due diligence was done entirely qualitatively. You would read code. You’d read snippets of code. You would interview people. You’d get a sense of the team. Now, back to code as craft, those are all incredibly important, but my goodness, imagine trying to understand, I don’t know, more than 100,000 lines of code by reading it and trying to get a sense of it. It’s not possible, much less if there are millions, tens of millions, billions of lines of code. We’ve scanned companies with billions of lines of code.

And so, of course, a real challenge is, if you’re about to write a serious cheque to invest in or to buy a software organization, we highly recommend understanding it qualitatively and quantitatively, to make sure that you’re de-risking it and also not surprised later by any amount of cleanup that you have to do on the other side. So we believe deeply that quantitative scans, in partnership with qualitative assessment, are a really important part of doing tech due diligence the right way. That’s from the buyer’s perspective.

I think from the seller’s perspective, a challenge can be explaining the choices that you’ve made in the context of the size and stage of your organization. And so, Henry, you said something like, you’ve seen codebases that are crap. It’s actually kind of expected. I would actually expect, most times when one company is buying another company’s software, that the buyer is more advanced or further along in the market. Because they’re further along in the market, they can afford fixing technical debt and security debt, etc. The most important thing for early stage software companies is not non-functional requirements. It’s product market fit. It’s traction. It’s revenue. And, you know, if we were to see an early stage codebase that was, quote, “totally clean”, quote, “totally perfect”, and there’s no such thing as totally perfect anyway, it would be a real red flag that they aren’t building as fast as they can. And so, Henry, do you drive stick shift cars? Have you ever driven one?

Henry Suryawirawan: No, I haven’t.

Matt Van Itallie: Okay, I’m old, so some of your listeners might have driven a stick shift car, where you have to engage the clutch to shift between gears rather than an automatic. And when I was learning to drive a stick shift, one of my teachers said, if your car never stalls, then you are riding the clutch too hard. If you have never driven stick, it’s kind of a weird analogy, but basically, stalling is not good. But if you never stall, that means you’re staying too far away from the right zone. You should actually get into that risk zone a bit. And that is exactly how technical debt, including security debt, should work. You should carry some debt with you, and depending on your size and stage, you should have more or less, back to the segment that you’re in.

And so, for folks who are on the receiving end of a technical due diligence, who are selling their company or raising money, what I would say to you is, back to the beginning of the conversation, explain your choices in terms of business goals. I love code quality, but investors don’t care about code quality for its own sake. They care because it delivers business outcomes. It helps developers go faster, right? Too much tech debt slows down engineers and could hurt revenue. Explain it in those terms, even if it’s also true that you’d like it to be better. Put it in terms of how you’ve made those choices. That is the way to be persuasive, whether it’s a manager that you work with or an investor or acquirer who’s looking at your company.

Henry Suryawirawan: Well, after you explained about the stick shift car, actually I did drive those cars, but we used to call it manual instead of automatic. So I kind of understand what you’re explaining, right? If your code is too perfect, or if you drive too smoothly, sometimes you are not taking advantage of, you know, maybe accelerating more or controlling the car more, right? So I think definitely what you said from the buyer’s and seller’s perspective is very, very important.

[00:45:18] Sema Adoption

Henry Suryawirawan: So how about the widespread adoption of this tool, right? Because I think what you mentioned is definitely advantageous not just to investors who want to buy software; practically all engineering teams could benefit from having such a capability to measure the effectiveness, the healthiness of the software that they’re producing, right? So how do you foresee the widespread use of Sema in the software world?

Matt Van Itallie: Well, with a question like that, I’m happy to answer it. Of course, we’re very excited about the possibility, and not just the possibility, but the realized vision of Sema helping bridge that gap between tech and non-tech in a way that’s understandable and credible for both sides. Having established ourselves as a leader in technical due diligence, we’re very, very excited about the CTO dashboard. It’s still in beta, but we’re very excited about what’s coming.

And you know, I started thinking about this business nine years ago and started it seven years ago. When I would explain what we were doing, I’d say, well, look at Salesforce. For folks who haven’t spent time on that side of the house, Salesforce is a tool that helps individual salespeople keep track of the sales work that they are doing. And it helps the head of the sales team, such as a Chief Revenue Officer, explain what is going on with sales to the CEO, to the Board of Directors, etc. And you would never, ever, if you were anything more than a tiny company, not have a CRM. You just wouldn’t do it, because salespeople need a tool to manage their work and executives need insight into sales. You just couldn’t run a modern business without one.

Now, sales is more important than code quality, as much as I love code quality. That’s why executive insight into sales has 100 percent adoption. Sema’s tool and other executive insight products into code do not have 100% adoption yet. But as the years go by, I think you’ll see something pretty close to 100% adoption of a CTO level dashboard, for exactly the same reasons that sales uses something like a CRM. Add to that how much GenAI is changing things: with more and more code being written in full or in part by GenAI, you really need automated systems to keep track, because so much of it is outside the heads of the individual developers, who are relying more and more on GenAI. So the rise of GenAI will be yet another factor leading to the rise of CTO dashboards.
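To make the tracking point concrete: a dashboard that follows how much new code is GenAI-originated only needs per-commit annotations to aggregate. This is a hypothetical sketch, not Sema’s method; the annotation fields and sample numbers are invented, and the 15-30% band reflects the range mentioned in this episode’s summary:

```python
# Hypothetical sketch: aggregating per-commit GenAI annotations into the
# kind of trend a CTO dashboard might surface. How "genai_lines" is
# detected is assumed to exist upstream and is not modeled here.
commits = [
    {"sha": "a1", "lines_added": 120, "genai_lines": 30},
    {"sha": "b2", "lines_added": 80,  "genai_lines": 40},
    {"sha": "c3", "lines_added": 200, "genai_lines": 20},
]


def genai_share(commits):
    """Fraction of newly added lines that originated from GenAI."""
    total = sum(c["lines_added"] for c in commits)
    genai = sum(c["genai_lines"] for c in commits)
    return genai / total if total else 0.0


share = genai_share(commits)
print(f"GenAI-originated share of new code: {share:.0%}")

# Flag whether the sample sits in the roughly 15-30% band the episode
# suggests may be optimal.
flag = "within" if 0.15 <= share <= 0.30 else "outside"
print(f"This sample is {flag} that range.")
```

The design point is the one Matt makes: once GenAI-written code is pervasive, no individual developer holds this picture in their head, so the aggregation has to be automated.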

Henry Suryawirawan: Wow, so with this CTO dashboard, engineering teams would no longer be like a black box. Like what you mentioned with Salesforce, right? Everyone is looking at the pipeline, the healthiness of the pipeline, how many leads go in. Same thing we could probably do if we have this kind of dashboard, right? How much input the engineering team is taking in, how much output they’re producing. Is there anywhere they stall in the process, right? So definitely looking forward to having this.

And I could even imagine if you made this free for open source. All the open source packages that are available out there, if they had this credit score, or maybe the GBOM score, associated with them, maybe we could rely on that to actually know the credibility of the open source software that we’re using. Because at the moment we just look at, I don’t know, the number of GitHub stars, maybe the number of downloads. But that’s kind of less meaningful, because it doesn’t always mean high enough quality, right?

So is there anything else that you want to share that we haven’t discussed so far, about Sema, about the comprehensive code scan that you’re doing, or technical due diligence aspects that we haven’t uncovered?

Matt Van Itallie: No, you’ve done such a great job asking questions, Henry. I will say, if you go to semasoftware.com, you can read a lot more. You can read about the specific components of the credit score. You can try out the credit score for free by entering your own data. It’s not exact, but it’s pretty close. And you can also try out the CTO dashboard for free. So if any of those things are interesting to your audience, come find us. And if you don’t see what you like on the website, you can send us a message through the contact form and you will hear from me, because I love, love interacting with listeners. So feel free to send a note on our website and I most certainly will get in touch with you.

[00:49:48] Integrating with Sema

Henry Suryawirawan: Right. Throughout the conversation, at one point you mentioned integration, right, that a tool should seamlessly integrate with the existing tools that developers actually use. Because so many engineering metrics tools are kind of opinionated, in a sense, right? Like you have to follow their framework, otherwise you won’t be able to produce the output. Maybe tell us a little bit more here how easy it is to actually integrate with Sema. Like, how can we start getting the metrics and the dashboard by integrating our codebase?

Matt Van Itallie: Yep. The CTO dashboard connects to GitHub, Azure DevOps, and several other developer tools, including security tools along the way. So it starts with whatever engineering tools you are currently using. If we don’t have them, we’re adding them.

And we are opinionated about the metrics. You said opinionated, and absolutely, if you come to Sema, we’re going to give you an opinion. This is how an investor will view your code, and we think you should hear that opinion so you aren’t surprised. But we are not opinionated on what tools you should use, right?

Back to telling a potter, oh, you should use this clay, or make the wheel go faster. That’s for you to decide; that’s not our business. Your business is figuring out the right tools and making your code optimal, which does not mean perfect. Perfect is not a good idea, and it’s not feasible. It just means optimal, while you’re building product market fit and driving revenue. Our job is to help you tell that story with whatever set of tools you decide to use. So that’s the long answer. The short answer is setup takes two minutes, and we’re happy to walk you through it, but it is just connecting to the tools that you have.

Henry Suryawirawan: Right. Pretty seamless, it seems. So for those listeners who would like to give it a try, maybe you can sign up, right? Or maybe even contact Matt. We’ll put his contact details in the show notes.

Matt Van Itallie: Exactly right. Free is free: while it’s in beta, we’re offering time limited but free trials. And we’re keen to have people try it out and give us feedback. I’m repeating myself, but it is so important to be able to explain code to non-coders, and it’s really hard, and it’s really easy to make poor choices, like rewarding increasing lines of code. You do sort of have to keep track of how big the codebase is; I’m not saying you should never pay attention to lines of code. But it’s so important to do it with nuance, and the more user input we get, the better we can make it. So I really would love to hear from you, and we’d really love you to try it out and tell us what you think and how we can make it better.

[00:52:17] 3 Tech Lead Wisdom

Henry Suryawirawan: So Matt, it’s been a pleasant conversation. I learned a lot, especially many aspects that I didn’t know before, so thank you so much for that. Before we wrap up our conversation, I have one last question for you, which I normally ask all of my guests. I call it the three technical leadership wisdom question. Think of it as advice that you want to give to the listeners, maybe something you want to convey to them. What would that wisdom be?

Matt Van Itallie: Henry, it’s a great question. And those who are watching will notice I just switched rooms because of a noisy lawnmower. So, three pieces of wisdom. Of course, I’m going to start with: figure out how to explain code to non-coders. The more empathy you have, regardless of what you want to do in your career, the easier it will be for you to achieve it, for your code, for your business, for your team, for you, all of the above. That’s number one.

Number two, really try to treat your career as a series of hypotheses to test. If you’re a team lead and you think you want to be an engineering manager, well, try it. Give it your best effort and learn as you go, and do it for two years, the best you absolutely can. And you know, if you’re going to work for 40 years, two years of investment in seeing if you want to be an engineering manager is a really good use of time, because if you like it, it’s a really good job. And if you don’t like it, well, you have 38 other years where you’re not being an engineering manager, and you’ve ruled it out. You’ve collected data instead of just dreaming about alternatives, and I think that is so important for really high quality career choices.

It’s not imagining how a different job could be, it’s experiencing it. Because, you know, the grass is always greener. If you compare your current job to the theoretical best parts of all of the other jobs you might take, well, you’re going to be unhappy, because no job is better than that theoretical composite. But that’s not the next job you’ll have. The next job you’ll have is a specific job with a specific boss in a specific sector, and so on. So try it out. And I’m not telling you to go switch jobs; take on different projects within your organization, etc. But collect data about the things you like to do.

And then, career tip three, or tech tip three. Most of us are pretty good at thinking about the industry that we’re a part of: gaming, fintech, insurtech, whatever. We’re also pretty good at figuring out the role. You want to be a tech lead. You want to be an individual contributor. You want to be an architect. You want to be a CTO. Those are very, very important questions. But my experience is folks undercount a really important third dimension, and that is the stage of the organization. And just like the stage matters for codebase health, as we discussed, it also matters so much for job satisfaction.

Broadly speaking, there are four stages that feel pretty different. Startup, let’s say zero to one. Fast growth or scale up, you know, one or two to seven or eight. A good or great organization that’s staying that way, in the eight to ten range. And turnaround, where it used to be good but now it’s low and you’re trying to get back. I know those four stages are different, because I’ve had enough jobs to sit in those different places and experience them. And I also know that I’ve never met anyone who was super happy in more than three, and most people are only happy in one or two.

So even if everything else is right about your job title and comp and people and sector, folks who like startups are impatient when an organization moves slowly. And folks who don’t like messes don’t like turnarounds. Well, guess what? Some people really love turnarounds, and if that’s you, well, the world needs more people who can do that. So the last piece of advice is to experiment with different stages of organizations, because that’s going to be a key, key part of professional happiness.

Henry Suryawirawan: Wow, really lovely, beautifully conveyed as well, and pretty unique. So far I haven’t heard this kind of advice. So definitely, the first is try to explain code to non-coders, right? I think that itself is pretty challenging for some of us. And treat your career as a hypothesis: test, experience it rather than just comparing with some theoretical perfect career out there. And the last one, I think it’s really important to understand the different stages of organizations. So if you are into startups, maybe you should try working in a well established organization just to get a feel. Maybe you will do well, maybe you will not, but at least you kind of understand how it goes, right?

So thank you so much, Matt, for this conversation. If people would love to connect with you, where can they find you online, maybe?

Matt Van Itallie: Sure, I’m on LinkedIn @mvi, Matt Van Itallie. And again, please feel free to send me a message via semasoftware.com. If you ask for Matt in the message, I promise it will be me, not a colleague, not GenAI. It will actually be me replying, so please feel free to reach out.

Henry Suryawirawan: Thank you for such an enlightening conversation, Matt. And thank you for the software that you’re producing as well. I hope to see it widely used in the world.

Matt Van Itallie: Henry, thank you so much and thank you for being such a great host and interviewer. I had a really nice time.

– End –