#222 - Closing the Knowledge Gap in Your Legacy Code with AI - Omer Rosenbaum

 

   

“There are things you can’t deduce from code. No AI can look in the code and understand that you didn’t do five things that are not implemented, because you tried them and you knew they were not gonna work.”

What if your most critical systems run on code that no one fully understands?

In this episode, Omer Rosenbaum, CTO and co-founder of Swimm, explains how to use AI to close the knowledge gap in your legacy codebase. Discover the limitations of AI in understanding legacy code and learn novel approaches to automatically document complex systems, ensuring their critical business logic is preserved and understood within the organization. Beyond legacy systems, Omer also shares practical advice for how junior developers can thrive in the AI era and how teams and organizations can conduct more effective research.

Key topics discussed:

  • How junior developers can thrive in the age of AI
  • The danger of shipping code you don’t fully understand
  • Why AI can’t deduce everything from your code alone
  • How writing documentation becomes more critical now with AI
  • How to analyze code that even LLMs struggle to read, like COBOL
  • How to keep your organization’s knowledge base trustworthy and up to date
  • The real danger of letting AI agents run unchecked
  • A practical approach to conducting more effective research  

Timestamps:

  • (02:10) Career Turning Points
  • (05:24) What Juniors Should Do in the Age of AI
  • (11:05) Junior Developer’s Responsibility When Using AI
  • (14:50) AI and Critical Thinking
  • (16:20) Understanding & Preserving Domain Knowledge
  • (18:11) The Importance of Written Knowledge for AI Usage
  • (21:51) Limitations of AI in Understanding Knowledge Base
  • (26:34) The Limitations of LLMs in Navigating Legacy Codebases (e.g. COBOL)
  • (32:38) Effective Knowledge Sharing Culture in the Age of AI
  • (34:54) Keeping Knowledge Base Up-to-Date
  • (36:55) Keeping the Organization Knowledge Base Accurate
  • (39:08) Fact Checking and Preventing AI Hallucination
  • (41:24) The Potential of MCP
  • (43:24) The Danger of AI Agents Hallucinating with Each Other
  • (45:00) How to Get Better at Research
  • (53:41) The Importance of Investing in Research
  • (57:18) 3 Tech Lead Wisdom

_____

Omer Rosenbaum’s Bio
Omer Rosenbaum is the CTO and co-founder of Swimm, a platform reinventing the way engineering organizations manage internal knowledge about their code base. Omer founded the Check Point Security Academy and was the Cyber Security Lead at ITC, an educational organization that trains talented professionals to develop careers in technology. Omer has an MA in Linguistics from Tel Aviv University and is the creator behind the Brief YouTube Channel.

Follow Omer:

Mentions & Links:

 

Our Sponsor - Tech Lead Journal Shop
Are you looking for some cool new swag?

Tech Lead Journal now offers swag that you can purchase online. Each item is printed on demand based on your preference and will be delivered safely to you anywhere in the world where shipping is available.

Check out all the cool swag available by visiting techleadjournal.dev/shop. And don't forget to show it off once it arrives.

 

Like this episode?
Follow @techleadjournal on LinkedIn, Twitter, Instagram.
Buy me a coffee or become a patron.

 

Quotes

Career Turning Points

  • You basically have a few years of hands-on experience, which is more similar to a job. When you go to university, you can choose to focus (say you study computer science); you come with an understanding of what it looks like in real-world application, and you want to learn about the science or about specific aspects of it, like algorithms. Or you can choose to learn other things and broaden your horizons, while you already have a lot of knowledge and a firm basis from the hands-on experience you got.

  • It gives the whole learning experience a very different atmosphere. You can approach it not like, “this is my way to get a job,” but “I’m here to learn. I’m here to deepen my knowledge, to understand things better.”

What Juniors Should Do in the Age of AI

  • Juniors need to acknowledge that it’s a challenging time for them because it’s unclear. We can postulate about the impact of AI on junior developers. Different people have different opinions, but we don’t know any of it for a fact.

  • As a junior, you can’t impact the fact that AI is changing the world. What you can do is acknowledge that and try to find the opportunities.

  • On one hand, it’s really challenging to justify the need for a junior developer for some tasks. If you think of a junior developer as someone who will do the small tasks while gradually learning and becoming better, then now you could let some coding assistants do that for you.

  • On the other hand, I don’t think juniors are going to be extinct from the world. Everyone starts as a junior, and we eventually evolve to being more senior and more experienced. Everyone goes through that.

  • For junior developers, the important thing is to understand what’s important to learn and what is not. Learning with AI and how to use AI effectively is important, and it’s one of the advantages they can get.

  • Junior developers can actually start with AI and feel even more comfortable working with AI tools. On the other hand, they should be super careful about not understanding things.

  • What differentiates experienced, highly qualified, and effective engineers or researchers from the rest is that they understand deeply how things work.

  • The greatest risk with AI is that you ask it to do something, you get some code back, and it does what you wanted it to do. If you don’t understand what it does or the alternative, then you sabotage your own understanding and learning journey. As a junior, that’s the most important thing; you need to learn and improve every day.

  • You might introduce critical bugs without knowing it. There’s the urge to just ship something. Let’s say I got a task, I sent it to one of the AI coding assistants or used a completion tool, and I got a result. I can just issue the PR. As a junior, you should resist the urge to send out anything you don’t completely understand.

  • There is an amazing opportunity here. Before AI, if juniors worked on something and got stuck, they needed a senior developer to help them. That’s it. That might be a waste of the senior developer’s time. Or maybe, as a junior developer, I don’t feel so comfortable approaching a senior for help and admitting that I’m blocked.

  • AI can actually unblock you from lots of things. You might not know how to run some tests, which is something generic that would take you a long time to learn. But if you ask Claude or GPT or whatever, you’d get an answer really quickly.

  • Also, if you select a specific part of the code that is hard for you to understand, you can ask AI to walk you through it step by step. It might be an easier, faster way to learn than the non-AI way of having to figure it out yourself or, if you’re stuck, asking someone for help.

  • It’s like having someone with you. In a sense, it’s the dream of every junior developer to have a senior developer who doesn’t get tired of your questions. You just need to realize it’s not really a senior developer. They don’t really know the codebase. They don’t really understand all the pros and cons. They are limited senior developers with some strengths and some weaknesses, and your goal is to make sure you understand what’s going on.

Junior Developer’s Responsibility When Using AI

  • Let’s start from what you shouldn’t do. I gave a specific coding assistant a small task, and I wanted to see what it does and whether it could save me time. I told it, “Okay, write a test, implement the task, and keep iterating until it passes the test.” Then it says, “Okay, great, everything passes. You can work with it now.” I look at the code, and I see that it has an “if” for the specific case of the test, and it does something to pass that test specifically. It hardcoded a string from the mock data I used: if the input equals that string, return this, and the test passes. A junior developer might say, “Okay, I have it, the test passed, and I will ship it.” This would create a really bad impression, of course.

  • Also, think of a less clear-cut example where, if you actually understand the code, you realize it solved a very specific case and not the broader scope. Or maybe there are other considerations, like security: maybe there’s something in there that introduces a vulnerability. Or performance: it works, but it wouldn’t work at scale. If this code needs to operate at scale and performance is important, you must know that.

  • As a junior developer, you can take the task and explain it. The fact that AI makes you explain in natural language what the task is, is great. Whenever I taught people how to program, the first thing I told them was to explain it to themselves: what you’re supposed to do, what the steps are. Writing this out for AI is actually good practice.

  • When you get back a response, you need to go over everything and understand it, and make sure it makes sense in the context of the task. At which point, if you don’t understand something, you can use AI. Ask it: “Why did you do this? Is there another way? What are the pros and cons?” Keep checking that. But never ship anything you don’t completely understand.

  • It will be very challenging for people now. It requires a lot of discipline not to ship something you don’t completely understand, because the temptation is to ask, “why would I even care?” People say, “AI will write the code, then AI will fix the code.” At least for now, that’s very far from the truth. As a junior, you need to understand so you can grow. You won’t grow by delegating all of your tasks to AI.

AI and Critical Thinking

  • There is a risk there, but I do think that people will get burned a lot because they will believe what AI sent them, and then they’ll learn the hard way that they shouldn’t have. So it might make them more willing to check everything they get as a response. I’ll be optimistic and say it would help our critical thinking.

  • If some junior developers are listening, I would advise you to do that deliberately. Make sure you set out to improve your critical thinking rather than get burned and then iterate from there.

Understanding & Preserving Domain Knowledge

  • We’re raising two areas where juniors can grow here. One is the technological area, which will stay with you when you switch jobs or switch business domains; it’s the foundation of being an engineer.

  • AI excels at creating demos from scratch, because that’s something generic. It’s getting better, and we have other tools that enrich it with context. But AI never really knows your company. It doesn’t understand the broader business logic. It wasn’t in the meeting where the product manager met with the client and understood the needs, and it wasn’t in the meeting where the product manager told you what they need.

  • One of the responsibilities of every engineer, not just junior engineers, is to communicate to AI, and to the code you’re writing yourself, the specific business logic, business rules, and constraints: all of the context that is unique to your organization.

  • Writing it, by the way, is crucial and it’s something we’ve been neglecting as humanity for various reasons. With AI, it can actually be easier because you explain to AI why you did something, and then you can hopefully preserve that knowledge alongside the code.

The Importance of Written Knowledge for AI Usage

  • If you have an organization that values written knowledge and specifically documentation, you get lots of benefits. Before the era of Gen AI and coding assistants, you would say it’s crucial for developers and the communication between them.

  • One of the most extreme cases I’ve experienced myself was a crucial real-time system that two separate teams worked on. There was a queue of messages where one team assumed that one was the top priority and 10 was the lowest priority, and the other team assumed exactly the opposite. They didn’t understand why they got seemingly random sequences of messages that weren’t ordered by priority. This is just an extreme, clear example of miscommunication.

  • In general, when you write code and it accumulates over time, what you lose is the business logic context and why you did lots of things. There are things you can’t deduce from code. No AI can look at the code and understand that five things are not implemented because you tried them at some point and learned they weren’t going to work, because of this and that. It doesn’t know that something exists because of a request by a specific client. When you have such unique knowledge, it’s critical to capture and preserve it.

  • Now, given that we provide context to AI coding assistants to help us with coding tasks, the value of writing is even clearer. Before that, people were scared or reluctant to write because they would say, “Okay, no one ever reads it.” Now AI is going to read it. AI is not lazy. The coding assistant will read your docs and use them.

  • One of the biggest challenges is actually keeping that knowledge up to date with the code as it evolves. This is actually one of the things we solved first in Swimm, even before the era of Gen AI as it is today. We started by allowing developers to write documentation and automatically ensure it’s kept up to date with the code as the code evolves.

  • With AI, it’s even more crucial because AI can look at a piece of documentation, not know that it’s outdated, and rely on it when generating code or other docs or tests.

  • This is one of the pillars: having a knowledge base that is comprehensive, that describes everything you need to know (especially things you cannot deduce from the code), that is kept up to date as your code evolves, and that is reachable by both the humans who might need it and AI assistants and agents.

Limitations of AI in Understanding Knowledge Base

  • The difference lies in the small details sometimes. When you give AI a code piece in isolation, and you ask AI, “please explain what’s happening here,” it’ll do an amazing job most of the time.

  • When you talk about knowledge that you might need, what kind of knowledge is it? One type of knowledge is understanding some code in isolation, but there are other things. The things that are not in the code cannot be deduced from the code: business logic that is not clearly translated into the code, requirements that stem from regulation or client demands, things you ended up not implementing for some reason. None of that can be deduced from just reading the code.

  • AI is great at writing things in a way that humans can understand. If you provide it with all the context it needs and say, “now write it in a way that’s easy to understand,” it does very well. Perhaps translate it to another language, because not all of us are native English speakers and not everyone wants their documentation or explanations to be in English. AI is amazing at this.

  • We also need to understand and acknowledge the limitations of AI. One is not assuming it could understand things that are not in the code. Another is that code is not always easily understood, even by AI.

  • If you give AI one function written in a way that is very clear (the flow is linear, the function name is clear, it’s documented), of course it’s easy to understand. Take the same function, rename it, remove all the comments, and change the variable names to something a bit less clear. That might sound like an exercise in bad coding, but I’ve seen enough codebases to know it happens in many companies. Then, all of a sudden, AI starts to make mistakes when understanding it.

  • When you get to complex flows, there are lots of cases that compilers know to take into account and an AI doesn’t. For example, ambiguity resolution: if you call a function named find and there are three different find functions in your codebase, picking the wrong implementation gets the whole flow wrong (see the sketch after this list). The same goes for code that depends on external resources, like reading from a database it might not have access to.

  • AI is a super powerful tool, especially in explaining things to humans and also allowing humans to ask questions in natural language. It’s amazing. But we shouldn’t expect AI to understand the code. It doesn’t really understand the code.

  • When you provide it with complex codebases that are convoluted with mixed conventions, specific domain knowledge that it hasn’t been trained on, sometimes even misleading comments that are not up to date with the code, variable names that sometimes are cryptic and sometimes are even confusing and misleading, AI makes mistakes. I’ve tested it thoroughly. Here at Swimm, we also create documentation automatically from codebases.

  • One of the first hypotheses was, “Okay, let’s just use an LLM.” We tried a lot, and it made us understand it’s an amazing tool, but it has some very fundamental limitations. That’s where you need other techniques, like static code analysis, which you can combine with AI to get clear, coherent documents or other forms of written knowledge, explained in a way that is useful for humans.
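To make the ambiguity-resolution point concrete, here is a small, purely illustrative Python sketch. The module path billing.search and the snippet are invented for the example; the point is only that resolving which find a call refers to is a deterministic computation, something a static analyzer or compiler does by following imports, whereas a model reading the snippet out of context can only guess.

```python
# Illustrative sketch: resolve an ambiguous call deterministically via imports.
# The module path "billing.search" and the snippet are invented for this example.
import ast

source = """
from billing.search import find  # one of several find() functions in the codebase

def lookup(customer_id):
    return find(customer_id)
"""

tree = ast.parse(source)

# Map imported names to their fully qualified origin.
imports = {
    alias.asname or alias.name: f"{node.module}.{alias.name}"
    for node in ast.walk(tree) if isinstance(node, ast.ImportFrom)
    for alias in node.names
}

# Resolve every simple call site against the import table.
for node in ast.walk(tree):
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        target = imports.get(node.func.id, node.func.id)
        print(f"call to {node.func.id!r} resolves to {target}")
# Prints: call to 'find' resolves to billing.search.find
```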

The Limitations of LLMs in Navigating Legacy Codebases (e.g. COBOL)

  • The most extreme case we’ve had in that regard, and it’s also our focus now, is legacy codebases, and more specifically mainframe legacy codebases in COBOL, a language I had never run into before I started working on this problem for those clients. Since then, we’ve doubled down on COBOL.

  • It’s a language where most of the code is not available online; there is almost no real COBOL code on the public internet. The code you find on GitHub doesn’t look like the code that companies actually run.

  • When GitHub launched, no one writing COBOL was in an organization that wanted to publish its code. If you start a new project today, even without an AI assistant, you start with lots of libraries. You have lots of frameworks, and those frameworks are built on open source. When people developed big codebases in COBOL in the seventies and eighties, they didn’t have libraries. They had to build everything themselves. So every organization looks very, very different.

  • LLMs don’t have access to real-world COBOL code, and LLMs don’t have access to the specific code of your organization. In COBOL, you have cryptic variable names all the time, and the structure is different from other programming languages.

  • It was the most extreme case we saw. Clients tried sending some code to an LLM and asking, “Okay, what does it do?” You get super generic, confusing, and, part of the time, simply wrong results.

  • What we did was we wrote a COBOL parser that actually takes the code, parses it syntactically, and connects the dots together in a way that makes sense for an LLM.

  • In the end, for example, we have a variable and we want to understand what it does. We take the variable name. The variable name can be reused in lots of different places in the codebase. We find only the occurrences that are related to this occurrence of the variable. Then we send all of that context to an LLM and we ask for a summary about this variable.

  • When the LLM gets the right context, and only the right context, it does a great job of explaining it in natural language. If you just throw in the whole code, it starts conflating different variables that have the same name. That happens a lot in COBOL. It’s really hard to parse all of that COBOL and understand what really belongs together and what doesn’t.

  • What we do is first analyze the code statically. We build our own internal representation of the code: how things relate, which functions call which other functions, which variables are used where, the hierarchy of, say, a flow. Then we slowly build up the knowledge by sending small bits to an LLM to explain, after we’ve done a lot of work cleaning everything.

  • This is an example of combining static analysis (code that analyzes code in a deterministic way, not the probabilistic output an LLM produces) with using an LLM for what it does best: taking text or specific parts of code and explaining them in natural, coherent language. A rough sketch of this combination follows after this list.

  • In the last phase, we also ask the LLM to generate parts of the documents we show to the user, because the LLM is great at formulating that in clear, coherent English that tells the story, once it already has all the context we built up, bit by bit.
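As a rough illustration of that pipeline (deterministic scoping first, then narrowly scoped LLM calls), here is a minimal Python sketch. The Occurrence data model, the scoping rule, the prompt wording, and the ask_llm callable are all assumptions made for this example; this is not Swimm’s actual parser or API.

```python
# Illustrative sketch only: pair deterministic scoping with an LLM summary.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Occurrence:
    program: str   # COBOL program (file) containing the occurrence
    line: int      # line number of the occurrence
    snippet: str   # the statement that references the variable


def related_occurrences(occurrences: List[Occurrence],
                        target: Occurrence) -> List[Occurrence]:
    """Keep only occurrences that belong with the target occurrence.

    Real COBOL analysis would resolve data-division scopes and copybooks;
    grouping by program name is a deliberately crude stand-in for that.
    """
    return [o for o in occurrences if o.program == target.program]


def summarize_variable(name: str,
                       occurrences: List[Occurrence],
                       ask_llm: Callable[[str], str]) -> str:
    """Send only the pre-filtered context to the LLM and ask for a summary."""
    context = "\n".join(f"{o.program}:{o.line}: {o.snippet}" for o in occurrences)
    prompt = (
        f"In plain English, explain what the COBOL variable {name} is used for, "
        f"based only on these statements:\n{context}"
    )
    return ask_llm(prompt)


if __name__ == "__main__":
    occurrences = [
        Occurrence("BILLING.CBL", 120, "MOVE WS-CUST-ID TO INV-CUST-ID"),
        Occurrence("BILLING.CBL", 233, "IF WS-CUST-ID = SPACES PERFORM REJECT-PARA"),
        Occurrence("REPORTS.CBL", 48, "DISPLAY WS-CUST-ID"),  # same name, other program
    ]
    scoped = related_occurrences(occurrences, occurrences[0])
    # Stub the model call so the sketch runs offline; swap in a real client as needed.
    print(summarize_variable("WS-CUST-ID", scoped, ask_llm=lambda p: "(LLM summary here)"))
```

The essential move is that the unrelated REPORTS.CBL occurrence never reaches the model, so it cannot be conflated with the BILLING.CBL variable of the same name.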

Effective Knowledge Sharing Culture in the Age of AI

  • The first thing is to acknowledge that it’s really, really important, especially now that AI will read the knowledge base. So you want to invest in it, and it’s going to be a multiplier effect.

  • Specifically, you want to create a culture, as a leader, that values people who capture knowledge. Some organizations used to look at these people as wasting their time or doing the easy task by just explaining what’s there instead of creating. Now, it’s not a viable argument at all. It’s clear that if you document what’s happening, it will help AI accomplish anything afterwards.

  • The other thing is that you should put effort into finding the tools that will help you with the task of writing this comprehensive knowledge base and documents, and keeping it up to date as the code evolves. Otherwise, you just have misleading and wrong information.

  • Third is being able to find that information. Both humans and AI should be able to find the information they need when they need it. So you should invest in tools that help with that task, and also create a culture that values knowledge creation, preservation, and sharing.

Keeping Knowledge Base Up-to-Date

  • We at Swimm spend a lot of time working on this specific problem. Nowadays, there are basically two approaches.

    • One is to regenerate the documentation every time. You generate documentation automatically. If you regenerate now, it’ll match the code state right now. I think it makes sense in some cases, for example, for API documentation. But what you’re going to lose if you regenerate every time is additional context that is not there in the code. You need a way to preserve that, because that could be the most important piece of knowledge that is written there.

    • Another approach is to somehow track the changes made to the code that is referenced in specific documents and then update those parts of the documents based on the changes and maybe ask for human intervention in case the code changed drastically, for example.

  • This is actually something we provide with Swimm. When you create a document with Swimm, we track the changes made to the code you relied on in the document. We either automatically update the document or, if the change is drastic, we tell you as a human, “Please decide what you want to do now: whether you want to reselect this part of the code or rewrite it; maybe this part is no longer relevant, or maybe you need to add some unique information.” A minimal sketch of this drift-detection idea follows after this list.

  • The goal here is to understand that a lot of the unique knowledge that only developers have in their minds is what you need to work so hard to preserve. Therefore, you can’t just rely on AI generating documents.
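Here is a hedged sketch of the second approach, checking whether code referenced by a document has drifted. The file contents and the similarity threshold are invented for the example; this is not how Swimm actually implements it.

```python
# Illustrative sketch: classify a document's code reference as intact,
# auto-updatable, or in need of human review. Threshold and data are invented.
import difflib
from dataclasses import dataclass


@dataclass
class CodeReference:
    path: str       # file the document points at
    snapshot: str   # the snippet as it looked when the document was written


def check_reference(ref: CodeReference, current_source: str) -> str:
    """Compare the documented snippet against the current source."""
    if ref.snapshot in current_source:
        return "up-to-date"          # snippet still present verbatim
    best = max(
        (difflib.SequenceMatcher(None, ref.snapshot, line).ratio()
         for line in current_source.splitlines()),
        default=0.0,
    )
    # Small edits can be patched into the doc automatically;
    # larger rewrites should be escalated to a human.
    return "auto-update" if best > 0.8 else "needs-review"


ref = CodeReference("queue.py", "PRIORITY_HIGH = 1")
print(check_reference(ref, "PRIORITY_HIGH = 1\nPRIORITY_LOW = 10\n"))  # up-to-date
print(check_reference(ref, "PRIORITY_HIGH = 0\nPRIORITY_LOW = 9\n"))   # auto-update
print(check_reference(ref, "rewritten with a priority enum\n"))        # needs-review
```

The third case is where the unique, human-held context matters most: the tool can only flag that the document no longer matches the code; a person has to decide what the document should now say.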

Keeping the Organization Knowledge Base Accurate

  • AI has great promise in the sense of accumulating all of this information from across the organization—from say, Jira, Slack, documentation tools, and others. It’s suddenly possible to just ask a question and get a response from various resources.

  • The key here is to understand that some of these resources are more historic references than actual up-to-date information, which is also sometimes valuable. The Jira ticket can tell you what a product manager wanted you to accomplish at some point. At least, most of the time, it won’t tell you what’s actually happening right now.

  • If you have code documentation software that actually explains what happens in the code and keeps it up-to-date, then you can rely on it.

  • AI coding assistants, and AI tools that help you find information from across the organization, should always explain what resources they used to formulate their responses, and perhaps mark those messages or snippets of knowledge with how likely they are to be up to date.

Fact Checking and Preventing AI Hallucination

  • We worked a lot on this at Swimm: when we generate documents, we make sure they reflect the accurate state of the code.

  • The most interesting part, in terms of what the end user gets from it, is that when we generate the documents, you can see, for everything we write, what we relied on to provide that information. We show you that it’s grounded in this part of the code or that part of a document, and you can validate it yourself.

  • In addition to doing everything we can to avoid hallucinations, using all the techniques available to eliminate or at least reduce them, hallucination can still happen. There is always some of it.

  • As the end user, you can never tell whether the LLM made a mistake or one of the underlying sources was wrong. So you should trust tools that show you what they relied on. If you want to incorporate a tool into your organization that answers questions based on your own knowledge base, you should insist that it provides sources or citations for everything it outputs (illustrated in the sketch below).
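As a small illustration of that “always show your sources” requirement, here is a hedged Python sketch. The snippet labels, prompt wording, and ask_llm callable are assumptions made for the example, not any particular vendor’s API.

```python
# Illustrative sketch: force answers to cite the knowledge snippets they used,
# and reject answers that cite nothing we actually provided.
from typing import Callable, Dict


def answer_with_citations(question: str,
                          snippets: Dict[str, str],
                          ask_llm: Callable[[str], str]) -> str:
    """Ask the model to answer from labelled sources and cite them per claim."""
    context = "\n".join(f"[{sid}] {text}" for sid, text in snippets.items())
    prompt = (
        "Answer the question using ONLY the sources below. After each claim, "
        "cite the source id in square brackets. If the sources do not contain "
        "the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    answer = ask_llm(prompt)
    cited = {sid for sid in snippets if f"[{sid}]" in answer}
    if not cited:
        raise ValueError("Answer is not grounded in any provided source")
    return answer


snippets = {
    "doc-12": "Message priority 1 is the highest; 10 is the lowest.",
    "code-queue-40": "Consumers pop the lowest numeric priority first.",
}
# Stubbed model call so the sketch runs offline; a real client would go here.
print(answer_with_citations(
    "Which priority is processed first?",
    snippets,
    ask_llm=lambda p: "Priority 1 is processed first [doc-12][code-queue-40].",
))
```

The check is deliberately naive, but it captures the contract described above: if a tool cannot point at the code or document it relied on, the answer should not be trusted.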

The Potential of MCP

  • MCP will be a game changer in the sense that you will see lots of information being fed in all the time, and it’ll help create a flywheel effect: when you put the effort into generating valuable documents or a knowledge base, all of the AI assistants will be able to reach it, find the relevant information, and make use of it.

  • Then we would get some mind-blowing things. You can have an AI assistant that analyzes your Jira tickets and provides a summary and, all of a sudden, it knows what’s happening in the codebase. It gets the broader context from a document that was written partially by AI and partially by a human.

  • To get there, we need to make sure we create these explicit knowledge fragments along the way, and that we rely on them, so we guide the AI coding assistant to rely on specific resources.

The Danger of AI Agents Hallucinating with Each Other

  • If you have an agent doing that, you have to somehow constrain it and validate the output that it generates.

  • With the announcement of A2A, we’re talking about new protocols for agents communicating with other agents. At some point, it’s going to be hard for a human to understand what’s going on. That’s where I think we’ll have to stop and think: what’s actually happening here? Where should we have a human in the loop, and where not? It’s going to be a really exciting time in that sense.

How to Get Better at Research

  • First of all, we need to define what research is. Engineering organizations are usually called R&D: research and development, and the research piece comes first. Usually you have research in the sense of problem-solving: you need to find the best way to do something and learn along the way. But that’s part of development.

  • Where I draw the line is this: if you know the task is achievable and you know the right approach to get there, then it’s development. It’s research when you have a task and you’re not sure whether it’s possible, or you’re sure it’s possible but you really don’t know how to get there, because there are so many different options and it’s unclear.

  • When you work on a product, research is often the thing that blocks you. The key difference between development and research is time estimation. It’s notoriously hard to give real time estimates, but usually when developers say, “this will take me a week,” it won’t take a year; there is some link between the estimate and how long it actually takes. With research, you can sometimes simply not know. That’s something to acknowledge.

  • When you have research that is guided by a product, it means you have some problem you want to solve. You need to do a few things. One is to lay the entire flow from beginning to end, even though you can’t solve all the intermediate stages.

  • Say I don’t know how to find the right components. For now, I’ll wrap that in a box and keep going to the next step: now that I know what components there are, how do I document each component? I actually draw this on a whiteboard with boxes, and I keep them closed. I don’t want to open the boxes now. I want to make sure I understand what the overall process would be.

  • Before opening a box, the thing you naturally want to do is open an interesting one, peek inside, and try to solve it. But it might turn out to be irrelevant.

  • You first need to make sure you can achieve everything end to end. You say, “I assume all the boxes work. Will this work? Yes, okay. What don’t I know now? There is this specific box; I’m not sure it’s possible.”

  • The crucial thing when managing research is to deliberately pause and think about the different directions together. Every researcher will do the research themselves. They will read the code and try to understand what’s happening. But you can help them by stopping and thinking about what their best technique is.

  • When I led a cybersecurity course, we taught reverse engineering. The lesson learned is you don’t always have to reverse engineer by reading through the code. We wanted to teach them that before you jump into one way of solving the problem, stop and consider different solutions.

  • What I usually do when I work with people on research tasks is draw it as a tree. “Okay, we are here. How can we solve it? We have options one, two, and three.” We don’t have a time estimate because we don’t know what we’ll find out, so let’s give it a day and see what happens. I usually call it time-to-live: how long are we going to work on this before we stop, re-discuss what we found out, and decide whether to keep pursuing this specific direction or change to another direction of the research?

  • One of the most important things is to make everyone stop and think about the various ways to approach problems. Sometimes, the easy solution is there, but you need to think about it. Sometimes, it’s just clicking help, and you have the solution. You don’t have to read through the code.

  • When you work on a product, you don’t have all the time in the world to just do research. You have to make sure you can provide a product to a user in a timely manner.

  • To summarize, one crucial thing is to understand that time estimation is hard. What you can do is give it time-to-live. How long am I willing to spend on this before I stop and reevaluate?

  • The second thing is to make sure we get the end-to-end process, from the input to the output that the user sees or that the other product takes into account as an input.

  • The third part is pausing and thinking together about the different ways to approach a research task. What characterizes research tasks is that it’s unclear how to make progress. So we need to stop and think together on how to approach this.

The Importance of Investing in Research

  • On a personal level, if you are a CTO, one of your responsibilities is to know how technology can empower your business. This applies internally as well. As a very recent example: you’re a CTO, you keep yourself informed about what’s happening, and you know there are great AI coding assistants that can help people become more effective. You go to your engineering teams, introduce them to the tool, and help them adjust to or adopt new tools.

  • Some people will always be a bit wary of trying new things. As leaders, one of our responsibilities is to show them, “Look how easy it is, look how useful it could be.” That’s the personal level: how to incorporate new tools, methodologies, or techniques into the organization.

  • Another great way to drive change is by giving talks. Get your company, team, or group together and give a live demo of using a cool new tool, for example. That’s more about incorporating new methodologies, techniques, and tools.

  • When we talk about deep research, it’s not for every organization. I’m not going to say that every organization needs a research person or team. But if you do, developing new technology takes time and a different mindset. You can’t expect a research team to operate the same way as a development team, with a clear timeline for every milestone.

  • Make sure that the value for the product is clear and relatively easy to get. Get those people to understand what research means. Help them become professional at managing research: assigning a time-to-live to different directions, and holding brainstorms about the best way to approach a specific issue.

  • If you don’t have this expertise, it’s fine. Consult with others who do. There are people who are experienced researchers that work differently from people who are not experienced researchers. It’s the same with engineers and anything else. Research is a skill. It’s a skill that people can improve at. It’s a skill you can learn and you can get help from others if you don’t have the experience.

3 Tech Lead Wisdom

  1. Put your time and effort into your people. They make all the difference.

    • It means talking with them about how they are doing and what can help them, and making sure you help them grow in their position.

    • Finding them is hard. Leading people is sometimes hard, but I think it’s also the most rewarding part of the leader’s job.

  2. In this era, you must be open-minded to trying new things.

    • I don’t think it makes sense for someone in 2025 to work the same way they worked in 2024. Things change really fast. In order not to fall behind, you have to be on top of it. So you have to be open-minded and keep yourself informed.

  3. In case it’s viable for your organization, put the time into research.

    • Because it can open up new directions, and it can help you in ways that you haven’t dreamt of before.

    • Sometimes spending two days on a research task can make you change your decisions completely. So allocate the time, if it’s relevant for your business case.

Transcript

[00:01:25] Introduction

Henry Suryawirawan: Hello, everyone. Welcome back to another new episode of the Tech Lead Journal podcast. Today, I have with me the CTO and co-founder of Swimm. Swimm.io, right? Omer Rosenbaum. He’s here with me today in the show. So I think we plan to have a lot, a lot of discussions about, you know, learning, research, AI, of course, and also other technical leadership wisdom that Omer will share today with us. So Omer, looking forward for our conversation today.

Omer Rosenbaum: Me too, Henry. Thank you for having me. It’s great to be here.

[00:02:10] Career Turning Points

Henry Suryawirawan: Yeah. Omer, I’d like to invite you, maybe share something about yourself first. Any career turning points that you think we can learn from you.

Omer Rosenbaum: Uh, of course. So I started my career and I had an opportunity to be a part of a special technological unit and actually started with a very special training that kind of changed my life. It was 13 weeks throughout which I learned more than I had in all of my years before that, just professionally speaking. And it was really mind blowing, like how much you can learn in such a short amount of time and also the different methodologies. And it really got me into getting interested in teaching or in learning, which also affected a lot of the things I did later.

Then I went to the university, where I took all kinds of courses in variety of topics, including chemistry, math, linguistics, psychology, and others. And while I was a student, I also taught a lot, wrote a few books, and started a few training programs. One in Singapore, by the way. So yeah, so I got to teach in different places and again, think about teaching methodology as well as research methodology. And in 2019, I co-founded Swimm, and ever since then I’ve been the CTO and the co-founder here.

Henry Suryawirawan: Wow. Thank you for sharing your interesting journey. I didn’t know that you kind of like started a job first before you go into university. What do you think is an advantage of starting the job first before you get into uni?

Omer Rosenbaum: Yeah. So first of all, you don’t have to start a job per se. It’s just that you basically have a few years of hands, hands-on experience, right, which is more similar to a job. And then when you go to the university, either you choose to focus, say you study computer science, right? But you come with understanding what this looks like in application in the real world and you want, you want to learn about the science or about specific aspects of it like algorithms and so on. And you can also choose to learn other things and broaden your horizons while you already have a lot of knowledge and the firm basis based on the hands-on experience you got. So I think it gives the whole learning experience a very different atmosphere. And you can approach it not as like, this is my way to get a job, because I already have my skills, right? I’m here to learn. I’m here to deepen my knowledge, to understand things better.

Henry Suryawirawan: I think that’s really interesting, Omer. So what I think as well, right? If you started kind of like a job first before you go into uni, you kind of like have a good understanding of what kind of roles that you like or don’t like. And probably the hands-on experience will be also quite relatable, right? When you study, because sometimes, I imagine if we go to uni we can’t even relate to some of the subjects, right? So we don’t even know it’s gonna be useful or not. And I think a lot of challenges in parts of the world as juniors, right, is actually to understand whether what I’m learning now is going to be usable or relatable with my job or not.

[00:05:24] What Juniors Should Do in the Age of AI

Henry Suryawirawan: And especially these days, you know, the craze is about AI, right? And there’s a lot of fear for juniors about, you know, getting a job, because some people think AI could replace the need for any kind of juniors. So maybe let’s start our discussion by discussing about this challenge, right? So what do you think juniors should do in the age of AI these days, about getting a good job or about learning new skills.

Omer Rosenbaum: Sure. So first of all, I think juniors need to acknowledge that it’s a challenging time for juniors. First of all, because it’s unclear. I mean, we can postulate about the impact of AI on junior developers. Different people have different opinions. We don’t know them for a fact, right? So clearly, it’s more challenging. And I think, I mean, as a junior, you can’t impact the fact that now AI is changing the world, right? What you can do is acknowledge that and try to find the opportunities. I think, on one hand, it’s really challenging to justify the need for a junior developer for some tasks. If you think of a junior developer as someone who will do the small tasks, right, while gradually learning and becoming better, then now you could let some coding assistants do that for you, right?

On the other hand, I don’t think juniors are gonna be extinct from the world, right? Everyone starts as a junior, and we eventually evolve to being more senior and more experienced. Everyone go through that. And I think for junior developers, the important thing is to understand what’s important to learn and what is not. For example, I think learning with AI and how to use AI effectively is important, and it’s one of the advantages they can get, right? Like if you see, I don’t know, or whenever we see our grandparents operate a computer, we feel the difference, right? Even when you see like a 10-year-old, he is computer native, right? They started with a computer, they know what it’s like. So junior developers can actually start with AI and feel even more comfortable working with AI tools, perhaps. On the other hand, they should be super careful about not understanding things. I think what differentiates experienced, high qualified and effective engineers or researchers from the rest is the fact they understand deeply how things work.

And the greatest risk with AI is that it seems to do, you know, you’re asking it to do something, you get some code back, it kind of does what you wanted it to do, and if you don’t understand what it does or you don’t understand what the alternative is, then you really sabotage your own understanding and learning journey. And as a junior, that’s the most important thing, right? You need to learn and improve every day. And also you might introduce critical bugs without knowing that. So I think the urge to just ship something. Let’s say I got a task. Okay, I sent it to one of the AI coding assistants, or I use the completion tool or whatever. I got a result. I can just issue the PR. I think, as juniors, you should resist the urge to send out anything you don’t completely understand.

On the other hand, there is an amazing opportunity here, before AI if they worked on something, they could get stuck and just they need, a senior developer to help them. That’s it. Which might be a waste of time of the senior developer. Maybe I don’t feel so comfortable as a junior developer to approach a senior to help me and admit that I’m blocked, right? And AI can actually unblock you from lots of things. It might be a generic thing about, you know, as it’s junior developers, you might not know how to run some tests, which is something that generic that would take you a long time to learn. But if you ask Claude or GPT or whatever, you’d get an answer really quickly, right? And also, if you select a specific part of the code that is hard for you to understand it, you ask AI to walk you through it step by step. It might be an easier, faster way to learn than the non-AI way of having to understand yourself, or again, if you’re stuck asking for help from someone. So it’s like having someone with you. I think on, in a sense, the dream of every junior developer to have a senior developer who doesn’t get tired of your questions, right? You just need to realize it’s not really a senior developer. They don’t really know the codebase. They don’t really understand all the pros and cons, right? There are limited senior developer with some strength and some weaknesses, and your goal is to make sure you understand what’s going on. That’s the main thing, I think, of the junior developer you have to do.

Henry Suryawirawan: I think you brought up very good points, right? I specifically like your term, you know, I’m gonna use it like AI native generation, right? So because when they go into the workforce these days, right, they can use AI. It is simply available, right? And I think, looking back at my time back then, right, we… probably the best that we had are books, right? Maybe like Google, Stack Overflow was probably also just starting, right? So like now you have AI natively that you can use straight away. So I think this gives a lot of opportunity for sure, for anyone to upskill and learn about something. And especially, if the workplace also allows AI usage, you know, like AI coding assistance. You can also get up to speed to the codebase asking about specific parts of the code that you don’t understand without bugging the so called the senior developers, right?

[00:11:05] Junior Developer’s Responsibility When Using AI

Henry Suryawirawan: And I think you brought up a good points about, you know, understanding your codebase even though it’s generated by AI before you actually submit or even push it to an environment or production, right? So tell us maybe about this workflow, right? So for example, a junior, you know, specifically has a task and it has to run specific, I dunno, like business logic or something like that, but AI gives them like a code that they don’t understand. What specifically should a junior developer do?

Omer Rosenbaum: As a junior developer, you mean, right? Let’s start from what you shouldn’t do, right? So I gave a specific coding assistant a small task, and I wanted to see what it does, and I wanted it to help me save me time. And I told her, okay, write a test, implement the task, and keep iterating until it passes the test, right? And then it says, okay, great, everything passes. You can work with it now. Then I look at the code, right? And I see that it has an if for the specific case of the test and it does something to pass the test specifically, right? So it has a string of like the mock data that I used. So if that return, this, okay, the test passes. Great, right?

Now, it’s an extreme example, but it happened to me right now, right? I mean, very recently. And a junior developer might say, okay, I have it, the test pass and I will ship it, right? This would create a really bad impression, of course. But also think of a less clear example where if you actually understand the code, you understand that it actually solved a very specific case and not the broader scope. Or maybe there are some other considerations, security considerations. Maybe there is something in there which puts a vulnerability, maybe performance considerations, like it works, but it wouldn’t work on scale. And in case this code should operate in scale, right? And performance is important. You must know that.

So I think as a junior developer, you can take the task, explain it. By the way, the fact that AI makes you explain in natural language what the task is, is great. Because whenever I got to teach people how to program, the first thing I told them is explain to yourself. I wouldn’t say natural language, right? Just explain to yourself what you’re supposed to do, what the steps are. Now if you write these to AI, it’s actually a good practice. And then when you get back a response, you need to go over everything and understand everything and make sure it makes sense in the context of the task.

At which point, if you don’t understand something, you can use AI. Ask it. Why did you do this? Is there another way? What are the pros and cons, right? Keep checking that. But never ship anything you don’t completely understand. I think it would be very challenging for people now. It would require a lot of discipline not to ship something you don’t completely understand, because they’re just like, why would I even care, right? People say like, AI will write code, then AI will fix the code. At least, for now, it’s very far from the truth. And again, for junior, you need to understand so you can grow. You won’t grow by delegating all of your tasks to AI.

Henry Suryawirawan: Yeah, I think that’s a very good reminder, I would say, right? Because sometimes I think even seniors using AI, right? If they think, oh, this code looks okay, just by looking at a glance, right? Sometimes a particular bug could just appear and before you realize it, it makes a, you know, like a production issue or something like that, right? So I think, you know, no matter whether you’re senior, junior, right, always look at the generated code by AI, right? And make sure that you understand, again, like the key point that you emphasize is like, understand exactly what the code is doing and maybe also the design, right, what certain aspects suggested by the AI. I think that’s a very good thing.

[00:14:50] AI and Critical Thinking

Henry Suryawirawan: And I think this brings me to the next question that I’d like to ask, right. But simply just by understanding and maybe asking back, right, or maybe being curious about why certain things are suggested that way. It’s actually like a critical thinking kind of a capability, right?

And there’s a recent research saying that, you know, using AI a lot will actually impact your critical thinking ability. Maybe it’ll reduce it or even make you less critical. So maybe in your point, what’s your view about this? Will AI actually improve our critical thinking or actually reduce our critical thinking?

Omer Rosenbaum: I think that it’s hard to tell, right? I’m not a prophet. I do think there is a risk there, but I do think that people will get burned a lot, because they will believe what AI sent them, like the responses they get. And then they’ll learn in the hard way that they shouldn’t have. So it might make them more willing to check everything they get as a response. And so I’ll be optimistic and say it would help our critical thinking. If some junior developers are listening, I would advise you to do that elaborately, make sure you want to improve on your critical thinking rather than get burned and then, uh, iterate from there.

Henry Suryawirawan: Yeah, I think like any kind of technological advancement, right, it makes your life easier, right? And by making life easier, sometimes we get, you know, lazy, so to speak, right? So don’t forget, like, you have to be critical, right? That’s the first thing. Understanding fundamentals.

[00:16:20] Understanding & Preserving Domain Knowledge

Henry Suryawirawan: And I think the other aspect that is really important, especially for junior, right, it’s actually understanding the domain knowledge, right? Maybe it’s the business aspects of it, it maybe, I don’t know, other aspects of the codebase that not necessarily just technological.

So maybe in your view, what’s your take about juniors, you know, upskilling themselves in domain knowledge? Because yeah, sometimes you can ask AI, maybe if it’s like generic domain knowledge. But there are specific things that are in the organization that probably is not easy for AI to suggest something.

Omer Rosenbaum: Of course. So I think we’re raising two areas where juniors can grow here. One is the technological area, where it’ll also be with you when you switch jobs or switch business domains, right? And it’s the fundamental of being an engineer, I think. So it’s a must. But you’re also raising another very valid point, which is AI excels as creating demos from scratch, right? Because it’s something generic. It doesn’t excel as much. It’s, it’s getting better. We get other tools that enrich it with context. But AI never knows your company, really. It doesn’t understand the broader business logic. It wasn’t in that meeting where the product manager met with the client, understood the needs, and then on the meeting when the product manager told you what they need, right?

So I think one of the responsibilities of every engineer, not just junior engineers, would be to communicate to AI and to the code you’re writing yourself the specific business logic, business rules, constraints. All of that context that is unique to your organization. And I think writing it, by the way, is crucial and it’s something we’ve been neglecting as humanity for various reasons. But I think, with AI, it can actually be easier because you explain to AI why you did something, and then you can hopefully preserve that knowledge alongside the code.

[00:18:11] The Importance of Written Knowledge for AI Usage

Henry Suryawirawan: Yeah. So I think, writing, I’ve been hearing a lot of times, right, in so many different episodes, like writing skills, very crucial, right? Leaders especially, uh, you can’t just lead by talking to people. You have to write more, right? And writing here also means like some kind of knowledge base, right? And I know Swimm.io is, you know, kinda like dealing with knowledge problem, right? So tell us how can we actually learn better by, you know, having a knowledge base within the company or the practice itself, right? Having more writing, more documentations within the organization.

Omer Rosenbaum: Right. So I think if you have an organization that values written knowledge and specifically documentation, you got lots of benefits. Before the era of Gen AI and coding assistance, you would say it’s crucial for developers and the communication between them, right? I think one of the extremist cases I’ve experienced myself was a crucial real time system that two separate teams worked on. And there was a queue of messages where one team assumed that one was the top priority and 10 was the lowest priority. And the other team assumed exactly the opposite, and they didn’t understand why they get random sequences of messages and it’s not according to priority, right?

So this is just an extreme clear example of miscommunication. But in general, when you write code and it accumulates over time, what you lose is the business logic context and why you did lots of things. There are things you can’t deduce from code. No AI can look in the code and understand that you didn’t do five things that are not implemented, they’re not there, because you tried them sometime and you know that it’s not gonna work because of this and that. And they don’t know that it’s a request by a specific client.

So when you have such unique knowledge, it’s critical to capture it and preserve it. And I think now given that we provide the context to AI coding assistants to help us with the coding tasks, it’s even more clear that you have clear value from writing. Before that people were scared or reluctant to write, because they would say, okay, no one ever reads it. Now AI is gonna read it. AI is not lazy. AI is gonna read it, right? The coding assistant will read your docs and use them. And then one of the biggest challenges is actually keeping that knowledge up to date with the code as it evolves.

And this is actually one of the things we solved first in Swimm, even before the era of Gen AI as it is today, when it was commonly used. We started by allowing developers to write documentation and make sure automatically that it’s kept up to date with the code as the code evolves. And with AI, it’s even more crucial because AI can look at a piece of documentation, not know that it’s outdated, and rely on it when generating code or other docs or tests and so on. So this is one of the pillars for having a knowledge base that is comprehensive, describes everything you need to know, especially things that you cannot deduce from the code and kept up to date as your code evolves, and then reachable to both humans who might need it and AI assistants and agents.

Henry Suryawirawan: Wow! I think I laughed when you mentioned that before we were not so sure anyone would read the documentation, right? So sometimes we are lazy to write some documentation, because we think nobody is gonna read it. But now I think you make a fair point, right? The AI will be the first audience, especially if the AI tool has access to your documentations, right?

[00:21:51] Limitations of AI in Understanding Knowledge Base

Henry Suryawirawan: So I think you have dealt with this kind of challenge, knowledge sharing and also understanding documentations with your company, right? So tell us how maybe some use cases of how AI actually improves this kind of knowledge sharing and also documentation. Because all we know, I mean, we are layman people, we think we just feed to AI, AI will summarize. AI will tell you what to do. I think there’s some danger here about hallucination, you know, the context, right? How much context should you give to AI? Do you actually also share everything within the organization to AI? So tell us a little bit more about this, because this, these are the nuances that I think some of us might not understand.

Omer Rosenbaum: I agree, and I think the difference lies in the small details sometimes. Because when you give AI a code piece in isolation, and you ask AI, please explain what’s happening here. It’ll do an amazing job most of the time. However, when you talk about knowledge that you might need, anything, what kind of knowledge is it? So one type of knowledge is understanding some code in isolation, right? But there are other things. The things that are not in the code cannot be deduced from the code, right? Obviously. So again, some business logic that is not clearly translated into the code, requirements that stem from some regulation or client requirements, things you ended up not implementing for some reasons. All of that is not something you can deduce from just reading the code.

And I think AI is great at writing things in a way that humans can understand, right? So if you provide it with very clear context and all the context needs and you say, now create, write it in a way that’s easy to understand. Perhaps translate it to another language because not all of us are native English speakers and not everyone wants their documentation or explanations to be in English. Translate that. This is amazing for AI. But we also need to understand and acknowledge the limitations of AI. One is not assuming it could understand things that are not in the code. But another is that also code is not always easily understood, even by AI.

Just to give a few examples here. If you give AI one function and it’s written in a way that is very clear, the flow is linear, the function name is clear, it’s documented, of course it’s easy to understand, right? You take the same function, you rename it, you remove all the comments, and you change the variable names to something a bit less clear. That might sound like an exercise in bad coding, but I’ve seen enough codebases to know that it happens in many companies, okay? Then all of a sudden, AI starts to make some mistakes when understanding it. And then you get to complex flows. And when you have flows, you have lots of cases that compilers know to take into account but an AI doesn’t. For example, you have ambiguity resolution. Let’s say you have a call to a function called find, and you have the built-in JavaScript find, but you also have three different find functions in your codebase. And you call find. If AI picks the wrong implementation of find, it can get the whole flow wrong, right? So ambiguity resolution is one case. And also, if the code involves looking at resources that are external, like reading from a database that it might not have access to.
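To make the ambiguity-resolution point above concrete, here is a minimal, hypothetical sketch in Python; the module names, the toy codebase, and the resolver are illustrative only, not a description of any particular tool. A bare call to find only becomes meaningful once a deterministic pass has worked out which implementation is actually in scope, and that resolved implementation is the context slice you would want to hand to an LLM.

```python
# Three unrelated implementations named "find" coexist in one codebase, and a
# bare call to find(...) does not say which one is meant. A deterministic
# resolver (here, a toy import table) pins down the right implementation
# before anything is sent to an LLM.

def find_user(user_id):
    """users/repository.py (hypothetical): look up a user record by primary key."""
    return {"id": user_id}

def find_documents(query):
    """search/fulltext.py (hypothetical): run a full-text search and return matching ids."""
    return [1, 2, 3]

def find_intersections(shape_a, shape_b):
    """geometry/intersections.py (hypothetical): return intersection points of two shapes."""
    return []

# In each hypothetical module, the function is exported simply as "find".
CODEBASE = {
    "users.repository": {"find": find_user},
    "search.fulltext": {"find": find_documents},
    "geometry.intersections": {"find": find_intersections},
}

def resolve(call_name, imports_at_call_site):
    """Resolve a bare call like find(...) using the calling file's import table."""
    module = imports_at_call_site.get(call_name)
    return CODEBASE.get(module, {}).get(call_name)

# The call site imports find from users.repository, so only that implementation
# should be handed to the LLM as context, not the other two.
resolved = resolve("find", {"find": "users.repository"})
print(resolved.__doc__)
```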

So in short, I think AI is a super, super, super powerful tool, especially at explaining things to humans and at allowing humans to ask questions in natural language. It’s amazing. But we shouldn’t expect AI to understand the code. It doesn’t really understand the code. And when you provide it with complex codebases that are convoluted, with mixed conventions, specific domain knowledge that it hasn’t been trained on, sometimes even misleading comments that are not up to date with the code, variable names that are sometimes cryptic and sometimes even confusing and misleading, AI makes mistakes.

And I’ve tested it thoroughly, okay. Here at Swimm, we also create documentation automatically from codebases. And one of the first hypotheses was, okay, let’s just use an LLM. Let’s try that. And we tried a lot, okay? And it made us understand it’s an amazing tool, but it has some very fundamental limitations. And that’s where you need other techniques like static code analysis and other things that, when you combine them with AI, give you clear, coherent documents or other forms of written knowledge, explained in a way that is useful for humans to understand.

[00:26:34] The Limitations of LLM in Navigating Legacy Codebases (e.g. COBOL)

Henry Suryawirawan: Yeah, I think so, for people who have used AI coding assistants a lot, right? Especially working with a bigger codebase, very complex, written by so many developers. I think that’s also one thing, right? Because you can see the amount of inconsistency, or not-so-coherent code, from one module to the others, right? And variable naming as well, right? Some people like to use certain terms, others use other terms. And they can be duplicates but mean different things. So I think there’s really a big challenge here if you just rely on an LLM, right? Because the LLM will just take it word by word.

And I think you mentioned a very good point about combining it with, you know, maybe static code analysis or some other kind of compiler-like ability. Because computers are also good at that, right? Not just AI. Maybe tell us how you actually combine these results? Like, for example, any specific study or maybe a customer case that you have solved using these kinds of techniques?

Omer Rosenbaum: I think the most extreme cases we had in that regard, and it’s also our focus now, are actually legacy codebases. And more specifically, mainframe legacy codebases with COBOL, which is a language I had never run into before I started working on this problem for those clients. And since then, we’ve doubled down on COBOL, right?

But it’s a language where most of the code is not available online; there is almost no real COBOL code available online. The code that you have on GitHub doesn’t look like the code that companies run. And that stems from multiple reasons, but I think the most important one is that when GitHub launched, no one was writing COBOL in an organization that wanted to publish their code, right?

Let’s say you start a new project today, even without an AI assistant: you start with lots of libraries. You start with Python, you have your libraries and frameworks for Python; you start with JavaScript, whatever, right? You have lots of frameworks. And those frameworks are built on open source. When people developed big codebases in COBOL in the seventies and eighties, they didn’t have libraries. They had to reinvent everything themselves. So every organization looks very, very, very different. So LLMs don’t have access to real-world COBOL code, and LLMs don’t have access to the specific code of your organization. And in COBOL, you have cryptic variable names all the time. And the structure is different from other languages. So it was, I think, the most extreme case we saw. When clients try to send some code to an LLM and ask, okay, what does it do, they get super generic, confusing, and sometimes wrong results.

And there, what we did was write a COBOL parser that actually takes the code, parses it syntactically, and connects the dots in a way that makes sense for an LLM. So in the end, for example, we have a variable and we want to understand what it does. We take the variable name. The variable name can be reused in lots of different places in the codebase. So we find only the occurrences that are related to this occurrence of the variable. And then we send all of that context to an LLM and ask for a summary of this variable. So when the LLM gets the right context, and only the right context, it does a great job at explaining it in natural language.

If you just throw the whole code at it, as we said, it starts combining different variables that have the same name. It happens a lot in COBOL. And you know, I’m not blaming it. It’s really hard to parse all of that COBOL and understand what really belongs together and what doesn’t.

So what we do is first analyze the code statically. We build our own internal representation of the code: how things relate, which functions call other functions, which variables are used where, the hierarchy of, say, a flow. And then we slowly build the knowledge by sending small bits to an LLM to explain, after we do a lot of work on cleaning everything. And this is an example, I think, of combining static analysis, or code that analyzes code in a deterministic way, not something probabilistic that an LLM produces, right?

And then using an LLM for what it does best, which is taking text or specific parts of code and explaining it in natural, coherent language. At the last phase, we also ask the LLM to generate parts of the documents we show to the user. Because again, the LLM is great at formulating that in coherent English that is clear, that explains the story, after it already has all the context that we built bit by bit.
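As a rough illustration of this two-phase approach, not Swimm’s actual implementation, the sketch below shows the shape of such a pipeline in Python: a deterministic parsing step decides which occurrences of a variable genuinely belong together, and only that narrow context is handed to an LLM for prose. The parser, the symbol table, and call_llm are hypothetical placeholders.

```python
# Deterministic analysis first, LLM prose generation last.
from dataclasses import dataclass

@dataclass
class Occurrence:
    file: str
    line: int
    snippet: str

def parse_cobol(source_files):
    """Placeholder for a real COBOL parser: returns a symbol table mapping each
    variable to the occurrences that actually belong to it, not just same-name hits."""
    return {"WS-CUST-BAL": [Occurrence("billing.cbl", 120, "ADD PAYMENT TO WS-CUST-BAL")]}

def related_occurrences(symbol_table, variable):
    """Deterministic step: select only the occurrences tied to this variable's
    definition, so the LLM never sees unrelated code that reuses the same name."""
    return symbol_table.get(variable, [])

def call_llm(prompt):
    """Placeholder for whatever LLM provider you use."""
    return "Summary of the variable based only on the supplied occurrences."

def summarize_variable(symbol_table, variable):
    occurrences = related_occurrences(symbol_table, variable)
    context = "\n".join(f"{o.file}:{o.line}: {o.snippet}" for o in occurrences)
    prompt = (
        f"Explain the role of the COBOL variable {variable} "
        f"using ONLY the following occurrences:\n{context}"
    )
    return call_llm(prompt)

symbols = parse_cobol(["billing.cbl"])
print(summarize_variable(symbols, "WS-CUST-BAL"))
```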

Henry Suryawirawan: Wow! I think it’s a very novel approach, so to speak, right? And especially, you dealt with the most extreme codebase available, I think, COBOL. I also have no experience with COBOL, right? I can only imagine the difficulty of dealing with such a legacy codebase. And I think you brought up a very good realization for people. Because I’m sure a lot of tech leaders or senior executives think, okay, now we have AI, it can understand any kind of codebase and explain it to us. So we probably don’t need to be so concerned about losing the ability to understand the codebase. And we can probably even hire, you know, less experienced developers and just use AI to fix all the problems we have. I think maybe for simple cases, or for more up-to-date libraries and programming languages, you can do that. But if you look back, we have so many legacy systems, right, written so many years ago, with the people also leaving and the business knowledge probably changing a lot, right? So I think this is one task that AI alone will probably not be able to do, and combining different kinds of approaches, like static code analysis, would make AI work much better.

[00:32:38] Effective Knowledge Sharing Culture in the Age of AI

Henry Suryawirawan: So maybe in terms of the knowledge base, right? With the ability of AI these days, for any medium-size or large organization, what would be your advice on practices or cultural things that fit with AI and help the knowledge sharing or knowledge base aspect become much more effective, maybe even a multiplier effect within the organization, right? Any kind of practices or cultural things that you can share?

Omer Rosenbaum: Yeah. So I think the first thing is to acknowledge that it’s really, really important, for all the reasons we said before, right? Especially now, AI will read the knowledge base, right? So you want to invest in it, and it’s gonna have a multiplier effect, as you said, Henry. I think, specifically, you want to create a culture, as a leader, that values people who capture knowledge. Some organizations used to look at these people as wasting their time or doing the easy task by just explaining what’s there instead of creating. I don’t think it’s a viable argument anymore. I didn’t think it was a viable argument back then either, but let’s say it’s arguable. I think, now, it’s not a viable argument at all. It’s clear that if you document what’s happening, it will help AI accomplish anything afterwards.

The other thing is that you should put effort into finding the tools that will help you with the task of writing this comprehensive knowledge base and documents, and with keeping it up to date as the code evolves. Because otherwise you just have misleading and wrong information.

And third, being able to find that information. Again, both humans and AI should be able to find the information they need when they need it. So you should invest in tools to help you with that task, and also create a culture that values knowledge creation, preservation, and sharing.

Henry Suryawirawan: Wow! I like the emphasis you put on valuing people who actually do the so-called… I would say it’s a hard job, actually, to capture knowledge, distill it, and summarize it for other people to understand. It’s not an easy job. I would say it’s maybe becoming more valuable now, because you can feed it into AI as context and, you know, everyone can benefit from just one piece of writing, right? And maybe it can be reused multiple times.

[00:34:54] Keeping Knowledge Base Up-to-Date

Henry Suryawirawan: So I think the other challenge with knowledge bases, documentation, and all that is keeping it up to date, right? Whatever documentation you have within the organization, I’m sure most of it is not up to date. Maybe it’s even wrong when you read it again. So tell us maybe some good practices that we can do to actually keep it up to date.

Omer Rosenbaum: Right. So I think there are two kinds of ways to approach it. And we, at Swimm, spend a lot of time working on this specific problem. So we’re very emotionally attached to it, I would say. I would say that nowadays there are basically two approaches.

One is to regenerate the documentation every time. You generate documentation automatically; just regenerate it all the time. If you regenerate now, it’ll match the code state right now. I think it makes sense in some cases. For example, for API documentation, it could make sense. But what you’re gonna lose if you regenerate every time is the additional context that is not there in the code. And you need a way to preserve that, because that could be the most important piece of knowledge that is written there.

So another approach is to somehow track the changes made to the code that is referenced in specific documents and then update those parts of the documents based on the changes, and maybe ask for human intervention in case the code changed drastically, for example.

So this is actually something we provide with Swimm. When you create a document with Swimm, we track the changes made to the code you relied on in the document. And we either automatically update the document or, if the change is drastic, we tell you, as a human, please decide what you want to do from here. Whether you want to reselect this part of the code, whether you want to rewrite it; maybe this part is no longer relevant, maybe you need to add some unique information. But the goal here is to understand that a lot of the unique knowledge that only developers have in their minds is what you need to work so hard to preserve. Therefore, you can’t just rely on AI generating documents.
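The sketch below illustrates the general idea of tracking doc-to-code references, not Swimm’s actual mechanism: each document stores a fingerprint of the code it relies on, small drift is patched automatically, and large drift is escalated to a human. The similarity measure and the threshold are arbitrary illustrative choices.

```python
# Each document records the code snippets it relies on plus a fingerprint of
# their content. On every change, the fingerprints are rechecked; small drift
# can be patched automatically, large drift is escalated to a human.
import difflib
import hashlib

def fingerprint(snippet: str) -> str:
    return hashlib.sha256(snippet.encode()).hexdigest()

def check_doc_reference(stored_snippet: str, current_snippet: str) -> str:
    if fingerprint(stored_snippet) == fingerprint(current_snippet):
        return "up-to-date"
    similarity = difflib.SequenceMatcher(None, stored_snippet, current_snippet).ratio()
    # Mild change: safe to auto-update the doc; drastic change: ask a human.
    return "auto-update" if similarity > 0.8 else "needs-human-review"

stored = "def charge(invoice):\n    return invoice.total * 1.17"
current = "def charge(invoice):\n    return invoice.total * 1.18  # VAT updated"
print(check_doc_reference(stored, current))  # auto-update
```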

[00:36:55] Keeping the Organization Knowledge Base Accurate

Henry Suryawirawan: Yeah, so looking at the past, right, when I had difficulties finding knowledge. Sometimes the knowledge is there, but we just don’t know where to find it, right? I think that’s one thing. And keeping it up to date, right? Because sometimes, oh, I find this piece of documentation, I read it, I assume it’s correct, but it’s actually wrong. That’s also quite dangerous, right? And I think we also have multiple tools within the organization which are siloed, right? For example, some information may be in our ticketing system, let’s say Jira, some in Confluence, some in Slack, some in email. How do you actually build these kinds of linkages and references? And again, prompting people: hey, this part of the knowledge base is not up to date. I think this is a real-world challenge if we can solve it, right?

Omer Rosenbaum: Actually, AI has great promise in the sense of accumulating all of this information from across the organization, right? From, say, Jira, Slack, documentation tools, and others. And I think it’s suddenly possible to just ask a question and get a response from various resources. The key here is to understand that some of these resources are more historical references than sources of actual up-to-date information, which is also sometimes valuable, right? Like, the Jira ticket can tell you what a product manager wanted you to accomplish at some point, right? But, at least most of the time, it won’t tell you what’s actually happening right now. But if you have code documentation software that actually explains what happens in the code and keeps it up to date, then you can rely on it.

So I think AI coding assistants, or AI tools that help you find information from across the organization, should always explain what resources they’re using to formulate their responses, and perhaps mark some of those messages or snippets of knowledge with how likely they are to be up to date.

Henry Suryawirawan: Yeah. So I think it’s very challenging, right, in the first place, accumulating it all, provided that we can give AI the tools and the capability to, I don’t know, crawl our knowledge base and get the context and all that.

[00:39:08] Fact Checking and Preventing AI Hallucination

Henry Suryawirawan: I think we all know one danger of AI and LLMs is the hallucination part, right? You mentioned providing references and all that. But assuming that we have a large knowledge base, how do you actually ensure that it is not hallucinating? Because sometimes the hallucination could happen in a very small part of the summary that it generates. Especially these days we have deep research tools that can work on their own for hours and provide you a summary. But the challenge is always: how do you fact check it, right? How do you know which part is hallucinating? Or maybe they provide statistics; how do you know the statistics are actually correct? So do you have any experience in doing this fact checking and preventing AI hallucination from actually making your decisions wrong?

Omer Rosenbaum: Yeah, so we worked a lot on this when working on Swimm, when we generate documents, to make sure they reflect the accurate state of the code. And for that, we do lots of things, but I think the most interesting part, in terms of what the end user can get from it, is that when we generate the documents, you see, for everything we write, what we relied on to provide that information. We show you that it’s grounded in this part of the code or that part of the document, and you can validate it yourself.

And in addition, of course, to making sure we do everything we can to avoid hallucinations with all kinds of techniques that are available to eliminate or at least reduce them, it can always happen, right? There is some hallucination. And as the end user, you can never tell if the LLM made a mistake or maybe the providers made a mistake. So I think you should trust tools that show you what they relied on. And if you want to incorporate a tool into your organization that gives you answers to questions based on your own knowledge base, you should enforce that it gives you resources or citations for everything it outputs.
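A small sketch of what such grounding can look like in practice, with an illustrative structure rather than any specific product’s format: every generated statement carries the source spans it relied on, so a reader or an automated checker can trace each claim back to the code it came from.

```python
# Pair every generated statement with the source span it relied on, so a reader
# (or a checker) can trace and verify each claim instead of trusting
# free-floating summary text.
from dataclasses import dataclass
from typing import List

@dataclass
class Citation:
    path: str
    start_line: int
    end_line: int

@dataclass
class GroundedStatement:
    text: str
    sources: List[Citation]

doc = [
    GroundedStatement(
        text="Payments are retried up to three times before the order is flagged.",
        sources=[Citation("payments/retry.py", 42, 67)],
    ),
]

for stmt in doc:
    refs = ", ".join(f"{c.path}:{c.start_line}-{c.end_line}" for c in stmt.sources)
    print(f"{stmt.text}  [grounded in: {refs}]")

# A statement with an empty sources list is exactly the kind of output
# you should treat with suspicion.
```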

Henry Suryawirawan: Yeah, I think it’s pretty dangerous if you do not have the citations, you know, the references, right? And even these days when you do get a citation, sometimes, from my experience, they give you a citation, but the summary itself can still hallucinate a little bit. So I think that’s a very interesting experience as well.

Omer Rosenbaum: Yeah, I agree.

[00:41:24] The Potential of MCP

Henry Suryawirawan: The other thing: these days people are crazy about, you know, this agentic capability of AI and also the MCP protocol, right? Maybe tell us about the next evolution you can see for using AI with all these cool things. And especially in the context of documentation and knowledge sharing, is there anything that you can see up and coming?

Omer Rosenbaum: So I think MCP will be a game changer in the sense that you will see lots of information being fed all the time, and I think it’ll help create a flywheel effect where, when you put the effort into generating valuable documents or a knowledge base, all of the AI assistants will be able to reach it, find the relevant information, and make use of it. And then we would get some mind-blowing things, right? You can have an AI assistant that analyzes your Jira tickets and provides a summary and, all of a sudden, it knows what’s happening in the codebase. And it gets the broader context from a document that was written partially by AI and partially by a human. And I think that’s what we want to get to, right? But to get there, we need to make sure we create these explicit knowledge fragments along the way, and also that we rely on them. So we guide, say, the AI coding assistant to rely on specific resources.
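As a minimal sketch of exposing a knowledge base to assistants over MCP, assuming the FastMCP helper from the official MCP Python SDK; the server name, the in-memory knowledge base, and the lookup_doc tool are hypothetical stand-ins for a real, continuously updated one.

```python
# Expose an organization's documentation to MCP-aware assistants as a tool they
# can call, so answers can be grounded in the knowledge base rather than
# guessed from the code alone.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("org-knowledge-base")

# Stand-in for documents that are kept up to date alongside the code.
DOCS = {
    "billing-flow": "Invoices are generated nightly; payment retries are capped at three.",
}

@mcp.tool()
def lookup_doc(topic: str) -> str:
    """Return the knowledge-base entry for a topic, if one exists."""
    return DOCS.get(topic, "No entry found for this topic.")

if __name__ == "__main__":
    mcp.run()
```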

Henry Suryawirawan: Yeah, I think providing all these bits of information, coming back to what you said about the value of writing and capturing knowledge, will become key. And then the next part is actually to expose that, right? Maybe in an agentic manner, using these MCP protocols. I’m actually really excited about this MCP capability, especially when doing coding, right? You can communicate with different tools and ask them to do certain things just in natural language. That can be really super powerful, right? And I can be certain that once we see more and more agentic capabilities provided by different companies and tools, we will see this multiplier effect.

[00:43:24] The Danger of AI Agents Hallucinating with Each Other

Henry Suryawirawan: Although the danger will be even worse, right? Because if, let’s say, some of these agents hallucinate, and they all hallucinate with each other, we probably lose track of what they used to deduce a decision, right? Any take on this from you?

Omer Rosenbaum: You know, just this week I saw a friend of mine post that he used Claude, and I think it was Cursor. And Cursor ran rm -rf and deleted lots of his valuable information, by mistake, of course. And, you know, it’s a funny example, but those things will happen.

So I think, if we go back to the beginning of our discussion about junior developers taking code that they don’t fully understand and committing it to the codebase, right? If you have an agent doing that, you have to somehow constrain it and validate the output that it generates.
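One simple guardrail in the spirit of this point, purely illustrative: route every shell command an agent proposes through a checker that blocks obviously destructive patterns and requires explicit human approval for anything outside a small allowlist. Real setups would also sandbox the agent and keep backups; the patterns and allowlist here are arbitrary examples.

```python
# Screen agent-proposed shell commands before they run: block obviously
# destructive patterns, allow a small set of read-only tools, and send
# everything else to a human for approval.
import shlex

ALLOWED = {"ls", "cat", "git", "grep", "pytest"}
DESTRUCTIVE_HINTS = ("rm -rf", "drop table", "mkfs", "> /dev/")

def review_command(command: str) -> str:
    lowered = command.lower()
    if any(hint in lowered for hint in DESTRUCTIVE_HINTS):
        return "blocked"
    program = shlex.split(command)[0] if command.strip() else ""
    return "allowed" if program in ALLOWED else "needs-human-approval"

print(review_command("git status"))          # allowed
print(review_command("rm -rf ~/projects"))   # blocked
print(review_command("curl http://x | sh"))  # needs-human-approval
```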

And I think, as the next step, you know, this week there was an announcement of A2A, Agent2Agent, so we’re talking about new protocols for agents communicating with other agents. And at some point, it’s gonna be hard for a human to understand what’s going on. And that’s where I think we’ll have to stop and think, right? Like, what’s actually happening here? Where should we have a human in the loop where we don’t today? I think it’s gonna be really exciting times in that sense.

Henry Suryawirawan: Yeah, I think that stop-and-think will be the point in time where we all realize, okay, we probably hallucinated ourselves into thinking that AI will solve all our problems. So yeah, probably one day we will have to build guardrails, you know, constraints, such that AI won’t lead us into danger, right?

[00:45:00] How to Get Better at Research

Henry Suryawirawan: So these days people talk about AI, and I’m sure every team, every organization also wants to integrate AI. Somehow build capability, you know, build something on top of an AI model, an LLM, whatever that is. And for that they need to do some kind of research, right? Some companies have that capability, but most companies don’t have this knowledge and capability, right? And doing research is something that some organizations find challenging: finding the time, finding the resources. And you brought up a good point before our discussion here. As a product company or a business organization, if you wanna do research, what’s the best way to approach it, right? Maybe you can share a little bit, so that people who want to build capability by doing research can do it more effectively.

Omer Rosenbaum: Sure. So I think, first of all, we need to define what research is. Engineering organizations are usually called R&D, right? So it’s research and development, and the research piece comes first. But I think in most teams there is no pure research. And that’s fine, right? Usually you have research as in problem solving: you need to find the best way to do something and you need to learn along the way. That’s all fine, but it’s part of development, right? And I think where I draw the line is this: if you know the task is achievable and you know the right approach to get there, then it’s development. It’s research when you have a task and you’re not sure if it’s possible, or you’re sure it’s possible but you really don’t know how to get there, because there are so many different options and it’s unclear. That’s when it’s research.

So for example, it could even be a really hard development task, of course, right? Let’s say I want to implement, I don’t know, VS Code from scratch, right? There are lots of things I don’t know. I would need to learn along the way. I would need to design the architecture. I’d have to work hard on it, right? But it’s all development. I know what the output looks like and I know that it would involve a lot of engineering. Whereas research: let’s say I have a COBOL codebase and I need to generate useful documents. I’m not even sure what documents are useful at first, right? I need to learn that. And then I need to find different ways. Should I go with generative AI all the way? Should I go with static analysis? Combine them? Where? That’s more of a research task.

So I think, when you want to work on a product, research is something that can be blocking. For example, I don’t know if it’s possible to achieve this, right? And the key difference between development and research is time estimation. In development, I mean, it’s notoriously hard to give real time estimations. But usually when developers say this will take me a week, it won’t take a year, right? Usually. There is some link between the estimate and how long it takes. With research, sometimes you just don’t know, right? Maybe it’ll take me two days because there is an easy win, and maybe I’ll hit a wall that is much harder to get past. So I think it’s something to acknowledge.

And when you have research that is guided by a product, it means you have some problem you want to solve, and you need to do a few things. One is to lay out the entire flow from beginning to end, even though you can’t solve all the intermediate stages yet. Take the example of: I have a COBOL repository and I need to generate useful documents automatically. The first thing I will do is take a COBOL repository to play with, generate the documents by hand, manually, myself, and get feedback on them. Are these documents really valuable? Is this where I’m heading, right? Once I know that’s what I want, I say, okay, what do I need to do? I need to, say, parse the COBOL repository, and I need to find, say, a few components, for example. Okay, I don’t know how to find the right components. So for now, I’ll wrap that in a box and I’ll keep going to the next step. Now that I know what components there are, how do I document a component? And I actually write that on a whiteboard with boxes, and I keep them closed. I don’t wanna open the boxes now. I want to make sure I understand what the process will be. Because the thing you most want to do is open an interesting box, peek inside, try to solve it, right? But it might be irrelevant. So you first need to make sure you can achieve everything. You say: I assume all the boxes work. Will this work? Yes, okay. What don’t I know now? There is this specific box, I’m not sure it’s possible. I don’t know if an LLM can read a COBOL program and describe it. Okay.

And then the key thing, and this is, I think, the crucial thing when managing research, is to deliberately pause and think about the different directions together. Because every researcher will do the research in their own way, right? Let’s say you give someone the task of understanding what a program does, okay? They will read the code and try to understand what’s happening. But you can help them by stopping and thinking about what the best technique is.

When I led a cybersecurity course, we taught reverse engineering. And one of the exercises we would give, as part of this course, was a game. The students would reverse engineer more and more applications, and then they would get a game, and the question was: what are the rules of the game? After an hour, we would stop them and show how to approach it correctly, which is: you open the game, you click on Help, and you have a textual description of the instructions, right? And the lesson learned is that you don’t always have to reverse engineer by reading through the code, right? What we wanted to teach them is that before you jump into one way of solving the problem, stop and consider different solutions.

So what I usually do when I work with people on research tasks is draw it as a kind of tree. Like, okay, we are here. How can we solve it? We have options one, two, and three. Okay. We don’t have a time estimate because we don’t know what we’ll find out. If the LLM can just read an entire codebase and give us great documentation, okay, we’re done here. Let’s give it a day and see what happens. And I usually call it time-to-live: for how long are we going to work on this before we stop and re-discuss what we found out, and whether we should keep pursuing this specific direction or change to another direction of the research.

And one of the most important things is to make everyone stop and think about the various ways to approach a problem. Sometimes the easy solution is there, but you need to think of it. Sometimes it’s just clicking Help, and you have the solution. You don’t have to read through the code, right? And I have many, many examples of that, also not from courses, from real life, where, in retrospect, people say, oh right, we should have done this. And when you work on a product, you don’t have all the time in the world to just do research. You have to make sure you can provide a product to a user in a timely manner.

So to summarize, I think one crucial thing is to understand that time estimation is hard. What you can do is give it a time-to-live: how long am I willing to spend on this before I stop and re-evaluate? The second thing is to make sure we map the end-to-end process, from the input to the output that the user sees, or that another product takes as its input, and so on. And the third part is pausing and thinking together about the different ways to approach a research task. Because, again, what characterizes research tasks is that it’s unclear how to make progress. So we need to stop and think together about how to approach it.
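A tiny sketch of how these research directions and their time-to-live might be written down explicitly; the data structure and the example values are purely illustrative of the practice, not a prescribed tool.

```python
# Each open question becomes a node with candidate directions, and every
# direction gets a time-to-live before the team regroups and decides to
# continue, drop, or branch.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Direction:
    name: str
    ttl_days: int          # how long we'll try before re-discussing
    status: str = "open"   # open | promising | dropped

@dataclass
class ResearchQuestion:
    question: str
    directions: List[Direction] = field(default_factory=list)

q = ResearchQuestion(
    question="Can we generate useful docs from a COBOL repository automatically?",
    directions=[
        Direction("Feed whole files to an LLM and ask for docs", ttl_days=1),
        Direction("Static analysis first, LLM only for prose", ttl_days=5),
        Direction("Write manual docs for one repo to define 'useful' first", ttl_days=2),
    ],
)

for d in q.directions:
    print(f"[{d.status}] {d.name} (revisit after {d.ttl_days} day(s))")
```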

Henry Suryawirawan: Wow! I think there are so many good nuggets there, I would say. Because doing research by itself is kind of unpredictable, right? Like what you mentioned: you don’t know the time estimate or the effort required. Sometimes it could be easy, right? Let’s say one day you find, oh, there’s a library that you can use. But most of the time, you know, you don’t have the skills, you have to gather a lot of knowledge, maybe ask for expertise and things like that. So I think you have given some good things. I would just call out a few that I can remember. The first is to analyze the process, the workflow, the direction that you’re going in, right? Because sometimes we can go down a rabbit hole easily, especially playing with technologies, right? We techies all love playing with technologies. We keep digging and digging, but maybe we go in the wrong direction. The time-to-live, I think, is also very crucial, right? You can’t spend all your time just doing research that goes nowhere.

[00:53:41] The Importance of Investing in Research

Henry Suryawirawan: And I think many people find it difficult to juggle or even, for example, justify the value of doing research. And we all know these days, especially again bringing up the age of AI, right? If you don’t do research on AI capabilities that could help your business, maybe you will lose out in the next, I don’t know, the time span can be really short these days, right? So how can business leaders or executives keep this in mind, right, and spend some time doing research, even though it’s difficult, unpredictable, and maybe they cannot justify the profits and revenue coming out of the research? Maybe you can give us some examples here?

Omer Rosenbaum: So I think, on a personal level, say you are a CTO. I think one of your responsibilities is to know how technology can empower your business, and this is also internal, right? So, as a very recent example: you’re a CTO, you keep yourself informed about what’s happening, you know that there are great AI coding assistants that can help people become more effective. You go to your engineering teams, you introduce them to these tools, and you help them adjust to or adopt new tools. Some people will always be a bit wary of trying new things, right? And I think, as leaders, one of our responsibilities is to show them: look how easy it is, look how useful it could be. So this is on the personal level, and about how to incorporate new tools or methodologies or techniques into the organization. Another great way to drive change is by giving talks. So get your company, team, or group together and give a live demo of using a cool new tool, for example.

So that’s more about incorporating new methodologies, techniques, and tools. When we talk about deep research, I think it’s not for every organization. I’m not going to say that every organization needs a research person or team, right? But if you do, know that new technology takes time and a different mindset. You can’t expect a research team to operate the same way as a development team, with a clear timeline for every milestone. So you can place people there, assign them to research tasks, and make sure that the value for the product is clear and relatively easy to get, at least. And get those people to understand what research means. Make them professional at managing research, at assigning time-to-live to different directions, at having brainstorms about the best way to approach a specific issue. If you don’t have this expertise, it’s fine. Consult with others who do. There are people who are experienced researchers, and they work differently from people who are not experienced researchers. It’s the same with engineers and anything else, right? But research is a skill. It’s a skill that people can improve at. It’s a skill you can learn, and you can get help from others if you don’t have the experience.

Henry Suryawirawan: I think those are really, really great pieces of advice, right? I particularly like the aspect of changing your mindset. Because people think doing research is straightforward: you do research, you get something out of it, and you can use it straight away. Especially now, people think that with AI you can get even more intelligence and speed up the research, whatever that is, right? But doing something where you don’t yet have the capability within the organization is tricky, right? You sometimes can’t justify the effort. So hopefully people today learned a lot of things, you know, about knowledge bases, research, using AI for documentation, and all that.

[00:57:18] 3 Tech Lead Wisdom

Henry Suryawirawan: So Omer, as we reach the end of our conversation, I have one last question that I’d like to ask you, which I ask all my guests. I call this the three technical leadership wisdom. If you can think of it, just like advice to us, what advice do you wanna give us today?

Omer Rosenbaum: Okay. So I think the first one would be: put your time and effort into your people. I mean, they make all the difference, right? And it means talking with them about how they are and what can help them, and making sure you help them grow in their position. Finding them is hard. Leading people is sometimes hard, but I think it’s also the most rewarding part of a leader’s job.

The second thing is, in this era, you must be open-minded to trying new things. I don’t think it makes sense for someone in 2025 to work the same way as they worked in 2024. And it sounds almost childish, right? I used to make fun of people saying things like that. But nowadays, it doesn’t make sense. Things change really fast. And in order not to stay behind, you have to be on top of it. So you have to be open-minded and keep yourself informed.

And the third thing, in case it’s viable for your organization, is to put the time into research. Because it can open up new directions and it can help you in ways that you haven’t dreamt of before. Sometimes spending two days on a research task can make you change your decisions completely. So I think, allocate the time in case it’s relevant for your business case.

Henry Suryawirawan: Yeah, specifically about being open-minded: I myself am also quite concerned, you know, about whether some of the habits and skills that I learned in the past are still relevant, especially since the pace of change these days is so rapid, right? Every day you will probably hear, oh, there’s a new way of doing things, there’s a new tool that can help you do this or that. Being open-minded and willing to try and to be challenged, I think, is another thing. Sometimes, as seniors, we think we know the problem well and can solve it by heart. But sometimes, yeah, there are new ways of doing things these days.

So thank you so much for sharing that wisdom. If people want to connect with you or ask you more things, is there a place where they can find you online?

Omer Rosenbaum: Sure. So you can reach out to me via email. It’s omer, O-M-E-R, at swimm, that’s S-W-I-M-M, .io. I’m also on LinkedIn, though I don’t really use social media much, so if you send me a message and I don’t get back to you, I apologize, but I probably didn’t see it. I do answer emails. And I’ll be happy to stay in touch.

Henry Suryawirawan: Thank you again, Omer, for spending the time today. I think we all learned a lot about using AI, building knowledge bases, and doing research, as you advised just now. So thank you again.

Omer Rosenbaum: My pleasure. Thank you for having me, Henry.

– End –