#165 - Learning to Program in the Era of Generative AI - Leo Porter & Daniel Zingaro

 

   

“As software engineers, only a fraction of your time is spent coding. A lot of your time is spent thinking. And I’m not seeing LLMs taking that away from us anytime soon, at least, for now.”

Can AI help you learn to code? Will AI take your developer job? Join me as I discuss these topics with Leo Porter and Daniel Zingaro, the co-authors of “Learn AI-Assisted Python Programming”.

In this episode, we discuss the impact of AI assistants on how we learn and approach programming, particularly for students and educators. We examine the shifting skillset of developers, emphasizing the importance of code reading, specification, testing, and problem decomposition over syntax and library semantics.

We also confront critical questions, like the ethical implications of AI, its potential impact on developers’ jobs, and whether it can help lead us to a more equitable society.

Listen out for:

  • Career Journey - [00:01:11]
  • AI Assistant - [00:07:55]
  • How AI Assistants Affect Students - [00:11:04]
  • Problem Decomposition Skill - [00:16:46]
  • How LLMs Work - [00:19:47]
  • Prompt Engineering - [00:23:36]
  • Automating Tedious Tasks - [00:29:29]
  • AI Ethical Issues - [00:33:30]
  • AI Replacing Developers - [00:40:08]
  • A More Equitable Society - [00:47:34]
  • 3 Tech Lead Wisdom - [00:55:58]

_____

Leo Porter’s Bio
Leo Porter is a Teaching Professor in the Computer Science and Engineering Department at UC San Diego. He is best known for his award-winning research on the impact of Peer Instruction in computing courses, the use of clicker data to predict student outcomes, and the development of the Basic Data Structures Concept Inventory. With Daniel Zingaro, he co-wrote the first book on integrating LLMs into the teaching of programming, “Learn AI-Assisted Python Programming: With GitHub Copilot and ChatGPT”. He also co-teaches popular Coursera and edX courses with over 500,000 enrolled learners. He is a Distinguished Member of the ACM.

Daniel Zingaro’s Bio
Dr. Daniel Zingaro is an award-winning Associate Teaching Professor of Mathematical and Computational Sciences at the University of Toronto Mississauga. He is well known for his uniquely interactive approach to teaching and internationally recognized for his expertise in active learning. He is the co-author of “Learn AI-Assisted Python Programming” (Manning Publications, 2023), author of “Algorithmic Thinking”, 2nd edition (No Starch Press, 2024), co-author of “Start Competitive Programming!” (self-published, 2024), and author of “Learn to Code by Solving Problems” (No Starch Press, 2021).


 

Our Sponsor - Manning
Manning Publications is a premier publisher of technical books on computer and software development topics, for experienced developers and new learners alike. Manning prides itself on being independently owned and operated, and on paving the way for innovative initiatives, such as early access book content and DRM-free PDF formats that are now industry standard.

Get a 40% discount for Tech Lead Journal listeners by using the code techlead24 for all products in all formats.
Our Sponsor - Tech Lead Journal Shop
Are you looking for some cool new swag?

Tech Lead Journal now offers swag that you can purchase online. Each item is printed on demand based on your preference and will be delivered safely to you anywhere in the world where shipping is available.

Check out all the cool swag available by visiting techleadjournal.dev/shop. And don't forget to show it off once you receive your order.

 

Like this episode?
Follow @techleadjournal on LinkedIn, Twitter, Instagram.
Buy me a coffee or become a patron.

 

Quotes

AI Assistant

  • An AI assistant is a piece of software that helps you get work done more efficiently. But the key to it is the way you communicate with it.

  • Typically, when people think about using computers, they think you need to be very rigid in how you communicate with a computer. And that’s what we would say when we teach programming courses. Every symbol matters, every space matters, every key press that you make is important. And that’s the language of computing. It’s very precise.

  • What an AI assistant allows you to do is communicate in English or any natural language. And there are way more people in the world who know natural languages compared to programming languages. And so the hope and the goal is that people will be able to use their own language and have the computer translate that. Because computers still can’t run or do anything with natural languages like English; they have to be translated into something a computer can work with (a small sketch of this idea follows this list).

  • That’s what we’re trying to automate with these AI assistants. It’s like making it so that people can communicate in their language and have it automatically translated over to lower-level stuff that the computer understands.

  • What’s great about these AI assistants is it’s almost a step in the natural progression of making programming and interacting with computers easier for humans.

  • We’ve seen this evolution from having to push buttons on a machine to make it do things. Then writing Assembly code for the stored-program computer was a huge improvement. And then we moved to the point where we could start writing in higher-level languages that were more English-readable than Assembly and compiled down to Assembly, which then ran. And over time, we’ve developed more and more advanced languages that make it easier to express our goals in the language available to us.

  • What’s unclear to us is whether these LLMs are gonna be just the next language in which we interact. Right now, you can interact with them and get working code fairly often, but it’s not always correct. And so it’s not quite the same as a compiler, which is deterministically gonna be correct. But it does seem like the next step in a really nice evolution of making it easier and more accessible to write software.
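
To make this concrete, here is a minimal, hypothetical sketch of the kind of interaction described above: the programmer states a request in plain English, and a Copilot-style assistant proposes Python code. The request, the function name, and the code are illustrative; real assistants will phrase and structure their output differently.

```python
# A request in plain English:
#   "How many days are left until January 1st of next year?"

# Code an AI assistant might produce from that request (illustrative only):
from datetime import date

def days_until_new_year(today=None):
    """Return the number of days from today until January 1 of next year."""
    today = today or date.today()
    new_year = date(today.year + 1, 1, 1)
    return (new_year - today).days

print(days_until_new_year())
```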

How AI Assistants Affect Students

  • We’ve been experimenting with how to teach new learners to program in the presence of LLMs. And I’ll be upfront that we don’t have all the answers yet by any means. I think it’s gonna be probably a decade before we really know from the research what the best way is to teach students now that these new tools are available. Students very quickly recognize how powerful the tools are.

  • So the question then is what are we teaching the students? What changes here? There’s still a lot to teach students. It’s just that the scope has shifted, and the skills have shifted with it. We also have to say what our goal is. Are we training the next software engineers, or are we training someone who works in business or data science or accounting to be able to write software that does something useful for them?

  • We’ve kind of realized that the skills that you need to interact with an LLM are actually fundamental programming and software development skills.

  • The general workflow in working with an LLM is that you have some desired function that you wanna write; you want to accomplish some small task. You then describe that task, and the LLM is gonna generate code for you.

  • Now, the code it’s gonna generate may not be right. It may not even be close to addressing what you want. If you read through it, you can quickly recognize this isn’t what I want. And then you can pull up, basically, a window asking: are there other solutions that are good for me? And so already students need to know how to read code, understand what it’s doing, potentially be able to trace code, and be able to pick from multiple code examples which one’s gonna work.

  • Then the next step for them, once they’ve picked one that they think works, is to write tests, because you can’t trust the LLM. And this is actually a point that I think is a really encouraging piece for new students: testing has been a point where new students struggle. They tend to write code and, by definition, assume it’s right, which is basically the opposite of what we all do as software engineers. You write the code and you gather tons of evidence for it being correct before you have any faith in it. And the students, because the code is coming from a machine that they know makes mistakes, are actually more willing to test.

  • Then you test the code. Writing good tests is super important and something that we haven’t taught as well as we should have. And then once they’ve tested it, now they know that piece of code is working (a small sketch of this read-and-test workflow follows this list).

  • The one catch is sometimes the code doesn’t work and sometimes you can’t get the LLM to give you the exact right answer. And so there’s still this last step of being able to debug. And so you have to teach explicitly how do you modify code that’s slightly buggy to do what you want. And so we’re still teaching debugging skills and that also is a fundamental skill.

  • What’s different here? There is a shift to reading and modifying and testing code away from looking at a blank screen and writing code from scratch, which is what we used to do the most of.

  • The syntax of a programming language is something that takes some students weeks and some students just don’t get past it. It’s extremely stressful for students and also artificial. The only reason we need all this syntax is because the compiler has to be able to unambiguously understand what your goal is.

  • And if you’re getting stuck on syntax, there are many people who just want to do something with the code. They don’t need to know what exactly each piece of syntax is doing. What’s super exciting for me and Leo about LLMs is I think for the first time, we see a future where people who don’t know how to program could be afforded some of the same benefits as people who do.
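
As a concrete illustration of that read-then-test workflow, here is a hypothetical snippet. Suppose the assistant proposed the `count_vowels` function below; the learner's job is to read it and then gather evidence with tests before trusting it. The function and the tests are illustrative, not taken from the book.

```python
# Suppose an AI assistant proposed this for the prompt
# "Write a function that returns how many vowels appear in a string" (illustrative output):
def count_vowels(text):
    """Return the number of vowels in text."""
    return sum(1 for ch in text.lower() if ch in "aeiou")

# Step 1: read the code and convince yourself it matches the task.
# Step 2: don't trust it -- gather evidence with tests before relying on it.
assert count_vowels("") == 0
assert count_vowels("rhythm") == 0
assert count_vowels("Programming") == 3
assert count_vowels("AEIOU") == 5
print("All tests passed")
```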

Problem Decomposition Skill

  • You can’t just give a large task to an LLM in one go. And that’s actually a super important skill that we’re now teaching in our classes, which candidly we did not in the past.

  • What we used to do in the past is that students are given essentially just a function. The function is basically perfectly described, because you need to be able to auto-grade it. And so every possible case is covered in that description. And then they have to just fill in the code for that function.

  • Now, LLMs do that incredibly well. The shift now is if I give a fairly vague task, like a large project to work through, how do they break apart that large project into smaller tasks that the LLM can then help them solve?

  • This is problem decomposition. This is probably one of the most important skills we learn as software engineers. And it used to be that we didn’t teach that to new students learning how to program until much later in their careers. Now it’s actually front and center, incredibly important to learn in your first programming class, because that’s how you have to interact with these LLMs (see the sketch after this list).
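
As a hypothetical illustration of decomposing a vague project into LLM-sized pieces, consider the kind of open-ended assignment mentioned in the episode: explore a data set (say, age and stroke outcomes) and answer a question about it. One possible decomposition into small functions, each of which can be described to the assistant and tested on its own, might look like this. The field names and function names are made up for illustration.

```python
# Vague project: "Load a CSV of patient records and report how age relates to stroke."
# One possible decomposition into small, LLM-sized tasks:

import csv

def load_records(filename):
    """Read the CSV file and return a list of row dictionaries."""
    with open(filename, newline="") as f:
        return list(csv.DictReader(f))

def ages_by_outcome(records, outcome_field="stroke"):
    """Group patient ages by the value of the outcome field (e.g., '0' or '1')."""
    groups = {}
    for row in records:
        groups.setdefault(row[outcome_field], []).append(float(row["age"]))
    return groups

def mean(values):
    """Return the average of a non-empty list of numbers."""
    return sum(values) / len(values)

# Each small function above is something you could ask an LLM to draft,
# then read and test, before composing them into the final report.
```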

How LLMs Work

  • We all joke about shutting your computer off, turning it back on if the thing you’re trying to do doesn’t work. But I think many of us go under the assumption, in our day-to-day computing lives, that computers are deterministic. If you do something, you’re gonna get the same response.

  • If you write a program and you run it, you’re gonna expect that if you run it a second time, the same thing’s gonna happen. And if it doesn’t, you probably start thinking about maybe I have a memory allocation bug or some sort of like transient behavior problem in my program.

  • LLMs are inherently non-deterministic. So you ask for some code and you get the code. And then you ask a second time, and you’ll get different code. And you ask a third time, and you’ll get different code again.

  • This is, first of all, kind of challenging as an instructor. It makes it very difficult to plan sometimes, because typically our lectures are sort of scripted in some ways where we need to demonstrate specific things in our lecture. And it’s very hard to do that when you don’t know what the LLM is going to respond in real time in class.

  • On the other hand, there’s also a benefit, believe it or not, of being non-deterministic. And that is because these things can make mistakes. Imagine how frustrating it would be if you asked it for some code and it gave you code that was wrong.

  • In this case, you don’t want it to be deterministic, right? ‘Cause then you’re just gonna get the same wrong code every time. So the fact that it is not deterministic means that you have a chance, even if the most probable response is wrong, maybe you can ask again or look at maybe the top 5 or top 10. And maybe you could pick out the correct code from that list.

  • And this is a skill that our students or learners did not need before, but they do now. Because the first response may not be correct, students have to know how to go through the list of potential solutions and figure out which ones are perhaps not correct immediately, but which ones are worth further testing and further consideration (see the sketch after this list).

  • We’re not yet at the point where you can give English or whatever natural language instructions to the LLM and get a result back in your own language. You’re still getting back Python code. And so learners still need to understand and work with Python, though not at the syntax level. We’re spending less time on the low-level syntax details, but they still need to understand Python. And one of the reasons is so that when this non-determinism is happening, they can look at and evaluate a bunch of different solutions for which one may be correct.
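
Because the assistant is non-deterministic, one practical habit is to collect several candidate completions and let a small test suite filter them. Below is a minimal, hypothetical sketch: the three candidate functions stand in for different responses an assistant might give to the same prompt, and the tests encode the behavior you actually want.

```python
# Ask for "a function that returns the second-largest value in a list" several
# times; the assistant may return a different solution each time (illustrative):

def candidate_a(numbers):
    return sorted(numbers)[-2]

def candidate_b(numbers):
    biggest = max(numbers)
    return max(n for n in numbers if n != biggest)

def candidate_c(numbers):
    return sorted(set(numbers))[-2]

# The tests encode the behavior you want: here, duplicates of the largest value
# count, so the second-largest of [5, 5, 2] should be 5.
tests = [([3, 1, 4, 1, 5], 4), ([5, 5, 2], 5), ([2, 9], 2)]

for name, func in [("A", candidate_a), ("B", candidate_b), ("C", candidate_c)]:
    ok = all(func(numbers) == expected for numbers, expected in tests)
    print(f"candidate {name}: {'passes' if ok else 'fails'} the tests")
```

Only candidate A satisfies all three tests here, which is exactly the kind of judgment call learners now have to make when choosing among an assistant's suggestions.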

Prompt Engineering

  • Prompt engineering is just the task of writing a prompt in such a way that the LLM will give you back the response you want. And what is tricky there, when we’re teaching people who don’t know how to program, is that the LLMs do very well if you describe problems in technical language.

  • If you say, for example, “write a function that returns the largest value in the parameter list,” I’ve really specified the behavior, and the LLM does a much better job with that. The problem is, I’m using keywords that you have to teach learners. And so there is still this task of teaching them how we would speak about these functions.

  • They are reading from large code bases and they’re learning from those large code bases. So in a sense, what we’re trying to get them to do is just give that function header that a human would’ve written to describe their function. And if you can generate one that’s very close to the behavior you want, the model’s gonna find something very similar in its training set and then generate code that’s gonna be paired with that.

  • The other thing you can do with prompts is there’s a whole bunch of ways in which you can basically tell the LLM to behave in a particular way.

  • People have started to catalog the different ways of interacting with the LLM. It kind of reminds me of object-oriented design patterns.

  • One of them is what if you don’t know what information the LLM needs to perform your task? You have to be very precise sometimes in your natural language. Hopefully not as precise as you do with programming, but you still need to know a lot of the terminology that you might not know. And so one thing you can do is you can use this flipped interaction pattern where you ask the LLM to ask you questions.

  • There’s another pattern that I find kind of interesting, which is called the persona pattern. And a lot of educators are using this to good effect right now, where you can ask the AI to act like a specific kind of person. And so what educators right now are doing is they’re using the persona pattern, and they’re saying, okay, LLM, you are a CS-1 instructor. And that conveys a lot of information, like, don’t use advanced programming concepts that have not been taught yet.

  • You try to scope the types of responses that you get. You try to change the types of responses from the default ones, because there are a lot of situations where the default responses might include code that students have not seen before, or might use types of code that you don’t want them to see yet, like code outside the scope of the course. It’s kinda amazing to me how much leverage you can get by just telling the LLM how to behave in the upcoming interactions (the sketch after this list shows a couple of these prompt patterns).
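
Here is a hypothetical, comment-only sketch of the difference between a vague prompt and a well-specified one, plus persona- and flipped-interaction-style instructions of the kind described above. The wording is illustrative; assistants will respond differently, and the exact phrasing is not prescribed by the book.

```python
# Vague prompt -- the assistant has to guess what "biggest" means and what the input is:
#     "find the biggest thing"

# Well-specified prompt, using the technical vocabulary the model has seen in code:
#     "Write a function that returns the largest value in the parameter list."

# Persona pattern, for a chat assistant (illustrative wording):
#     "You are a CS-1 instructor. Explain this code using only concepts covered in
#      the first weeks of an introductory Python course."

# Flipped interaction pattern (illustrative wording):
#     "I want to write a program that organizes my photo files. Before writing any
#      code, ask me the questions you need answered to do this well."
```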

Automating Tedious Tasks

  • A lot of people are used to asking ChatGPT or GitHub Copilot for code. And that’s a great use case. Something else Leo and I have learned is you can also use these tools to ask for libraries or modules that you might be able to use to make your task easier. We have a chapter in our book called “Automating Tedious Tasks”. And it’s amazing to me how many libraries in Python are available to help you, even if you didn’t know those libraries existed (a small sketch follows this list).

  • And I think it speaks to the resilience of the students. When they learn this way, working with a whole bunch of different libraries and a whole bunch of different examples from the LLM, I think they become more resilient. We give them small code snippets that perform specific tasks in specific domains, and jumping from domain to domain helps them a lot in building a robust understanding.
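
As a hedged illustration of the kind of tedious task where an assistant can point you toward a library you didn't know existed: suppose you want to gather every .txt report in a folder into one ZIP archive. Asking "is there a Python module for creating ZIP files?" surfaces the standard-library zipfile module, and the sketch below (folder and file names hypothetical) is the kind of script it might draft.

```python
# Task described to the assistant: "Collect every .txt file in the 'reports'
# folder into a single archive called reports.zip."
from pathlib import Path
import zipfile

def archive_reports(folder="reports", archive_name="reports.zip"):
    """Add every .txt file in the folder to a ZIP archive."""
    with zipfile.ZipFile(archive_name, "w") as archive:
        for report in Path(folder).glob("*.txt"):
            archive.write(report, arcname=report.name)

archive_reports()  # assumes a 'reports' folder exists next to this script
```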

AI Ethical Issues

  • The first is the ownership of the code bases that these models learned from. I don’t think we as a society have figured out, first, how we should view this ethically, and second, how we should view this legally. We are obviously building tools that can help empower people, and so in some light, we would say this is a good, ethical thing. But we do have to ask how the tools are built, and who isn’t benefiting while their code is potentially being taken, and things like this. So that’s kinda the first concern.

  • The second concern would be copyright. Are the LLMs commonly parroting code which might be under someone’s ownership? It is hard to assess, particularly for the kind of small pieces of code that LLMs tend to be able to generate well.

  • Occasionally, in my interaction with Copilot, it will generate an author name in its recommendations to me. It does give you some doubts about where this code came from and whether we have ownership of it.

  • Feel free to use these tools for your own use, but if you were to go try to build a company off the software that you’re writing, you should be a little careful until these legal questions get resolved.

  • And then the third piece, I’d say, in terms of the ethics is bias in the models. We’ve seen this across artificial intelligence: models reflect biases within society.

  • What I think is important to do, since I don’t think we’ve worked out these issues as a society, is to bring the readers of our book and the students in our class into this conversation and say: these are the ethical concerns with these models. And have a direct conversation about it. And be frank about what we know and what we don’t know.

  • The fear is that if we kinda pretend these models don’t exist, and we try not to let the students use the models, and they go on to use them on their own, they’re gonna run into these issues. And so it’s better for us to teach them upfront than to just leave them blind to it.

  • To me as a teacher, it seems a little dishonest to not show students these tools. We cannot pretend that these issues don’t exist.

  • There are people who try to pretend these tools don’t exist and ban them so the students can’t use them in their courses. And I totally get why. It’s a very upsetting thing that has happened, in terms of upsetting the status quo of how courses are taught.

  • And it’s very tempting to just try to pretend these tools will go away. But the tools are out there and our students are going to be using them. More importantly, they’re going to be using them when they get their next co-op position or their next industry job. Or at the very least, they’re gonna be asked about these tools at future companies and asked for their opinions of them.

  • We need to be teaching these ethical concerns. We may not have solutions for them, but I don’t think the solution is to try to scare students away from using these tools, or to somehow try to prevent them from using these tools, cause that’s never gonna happen.

  • It’s more useful if we teach the tools along with the concerns that we have. We have a lot of work to do. There’s a reason that they’re at the beginning of our book and not at the end. This is a big deal.

  • One of the worst things we can do is to introduce students to these tools and then not help them understand what the costs are. Cause I even think once students understand what’s going on, they’ll be on the lookout for this. And they won’t just accept whatever the LLM tells them as the correct answer.

  • We’re trying to balance the fact that they’re out there and students are gonna be using these tools, with also training students to understand the deficits. And who knows, our students might be the ones who end up in positions where they can make these kinds of improvements. Students are potentially a couple of years away from graduating and being able to inform how these tools are deployed and how these tools are used.

AI Replacing Developers

  • I think if any of your listeners spend a little time with Copilot, their fears will be quickly taken away. These tools are fantastic. They do great things, but they make mistakes. And you realize very quickly there are still essential skills that are required to use them properly. And so I don’t think we as programmers are gonna go away. And so that’s kinda the first takeaway.

  • The second is as software engineers, only a fraction of your time is spent coding. A lot of your time is spent thinking, how should I lay out the interfaces? How do I work with the other software within the company? How do I make sure I’ve got really clear requirements for my code? Like all of these things are the really big problems that still humans have to wrestle with. And I’m not seeing LLMs taking that away from us anytime soon. At least, for now.

  • If you look back at computing evolution, I wonder if people have had the same discussion when Visual Basic came out, where you could drag-and-drop components into a form. Probably, people back then were saying the same sorts of things.

  • These advances, I don’t know if they lead to more or fewer jobs, but I think it’s likely that it’s gonna be a steady state, and perhaps we’ll be more productive with what we’re able to do. I don’t think that they make jobs in programming go away.

  • Most of what we’ve been talking about and reading has been for introductory programming. I don’t know if we know what the impact will be on industry level projects. We know people are using these tools in industry and we know they’re more productive. But I don’t know if we know whether there are more or fewer jobs or if there will be in the future. I have a feeling that the results will be that the existing programmers are just gonna be more efficient.

  • Like with all the other major advancements in technology, we’ve just adjusted and done bigger and better things as the technology got better. And so naively, I think that’s the case.

  • There are people saying that it improves their productivity by maybe 30-40%. The gap between junior and senior might be smaller now, because the juniors might be able to take on more advanced and complicated problems.

  • Writing code is not the only job for software developers. They still need to understand requirements. And we know in the industry, a lot of times, requirements are vague or not well specified. It’s the software developer’s job to actually translate that into a good design, proper design.

  • Also, don’t forget about evolving the code: writing it in a maintainable way, writing it in such a way that it can scale. We have to be able to live with it and leverage it to improve our productivity so that we can move on to solve bigger and bigger problems.

  • It’s not great at writing efficient code. At least in my experience, it hasn’t done very well. For advanced code, there’s a lot of room for us, as software engineers, to develop it ourselves.

  • What I find kind of interesting about the discourse right now is that, because it’s so new, people want to be able to make these claims, like LLMs are crap or LLMs are amazing. It’s very early, so people are gonna make these kinds of claims right now. But I guess I’m more interested in what happens when the dust settles. Of all the polarizing opinions right now, I don’t think any of them are gonna end up being what actually happens. Is every software engineer gonna be fired? No. Are we going to have a different number of software engineers? Probably. I don’t know if it’s gonna be more or fewer.

  • We tend to overestimate the effects of technology in the short term and underestimate it in the long term.

  • We’re still in the throes of this thing, where it’s very difficult right now to separate hype from what’s actually happening.

  • One thing is clear for sure: if you rely too much on the LLM, I think we are still not there yet. In your book, you also mention it is not an expert; it is trained on existing code bases. If you wanna solve a new problem, in quantum computing, let’s say, it may not even be able to give you a proper solution. Let’s not forget about that. We still need to use our judgment, as humans, to apply what the LLM is suggesting to us in our software.

A More Equitable Society

  • People who already have prior programming experience tend to do better in introductory CS courses. If they had more opportunities in high school, for example, or their parents had access to computing or courses, or directed them into this field, then they tend to perform better.

  • I wouldn’t necessarily have a problem with this except that these opportunities are not evenly distributed. And so they’re made more accessible to dominant groups. And so then this gap in prior experience leads to a gap across different types of students, which is obviously not okay.

  • What we’re hoping is that, because there’s a reduced emphasis on syntax when using LLMs, the gaps in prior experience, while they will certainly still exist, will perhaps not lead to the gaps in outcomes that we’ve been seeing in introductory CS courses.

  • There are a lot of caveats here. One of them is that maybe the students with privilege are going to be using LLMs earlier than other students. And then they’ll have prior experience using LLMs too. And that may convey an advantage just like prior programming experience does right now.

  • Our hope stems from the fact that learning syntax is so difficult and such a barrier for so many students. With these LLM skills, maybe the gap can be made smaller more quickly.

  • The status quo of how we assess students in computer science classes is solving these really small functions that aren’t particularly exciting, to be quite frank. And there’s been a whole bunch of work within the community showing that students from demographic groups that are currently underrepresented in computing tend to want to see that their work is gonna help society. It’s gonna be for the societal good. And they wanna see that computing can serve that good.

  • When we move to LLMs, unless you wanna keep giving these outdated assignments that the LLM solves for you, you have to move to these kinds of open-ended, large projects, where students can pick the domain that matters to them. And then it can be something that’s meaningful to them personally. And I think if you can do that, we’re gonna bring in a broader audience of people who are interested in computing, cause they see how it matters to them as people. So that’d be the first reason for optimism.

  • And then the second one is that there’s been a whole bunch of research, already started by members of our community, that’s really interesting in terms of how we can turn these AI assistants into tutors: essentially, intelligent tutoring systems. How could we help, through prompt engineering, through really careful crafting of the introductory prompts? How can we make it such that when a student is struggling, they don’t have to wait for the instructor’s next office hours? They can just have a quick conversation, and they’re gonna get mostly correct answers.

  • They’re gonna get encouraging answers, ones that encourage them to keep trying. How can we get them the help they need when they need it? Cause if there’s a gap in terms of how much support different groups need, making sure everyone has lots of support will help everyone, but it will help disadvantaged groups more.

  • It seems that these techniques, like active learning, disproportionately help students who are underprivileged. And so it helps everyone. It’s kind of a rising tide that lifts all boats, but the folks who are struggling are raised more. And you see a larger impact for those struggling groups.

  • People are already referring to LLMs as like one-on-one tutors. And I’m not willing to go there quite yet, but I think that’s the dream. The dream is that they can reduce the time delay between having a question and getting an answer. Because if we can reduce that to zero, like just imagine that any question a student has could be answered immediately, that bodes well for students to catch up.

  • A lot of the times, I think the limiting factor is just resources. If a student gets stuck before, maybe they have to wait for me to get them unstuck and maybe they can get unstuck sooner with LLMs, and then catch up.

  • You’ll hear folks say we can’t change what we’re teaching in our introductory courses right now, because students are learning the fundamentals. And they’ll start kinda hammering on how great their current CS-1 class is. But the evidence is that, of students finishing an introductory programming class, the majority can’t find the average of the positive numbers in a list (a sketch of that task follows this list).

  • That’s a super easy task for computer scientists, for software engineers. And the majority of students can’t do that at the end of CS-1. So we need to make sure that we’re very clear about what we’re comparing against. What’s happening now isn’t successful for everyone.

  • The second piece is that we’ve done a whole bunch of research out of my lab finding that both students and tutors have significant incentives to essentially just give away the answer and fix the problem for the student right there, basically acting as human debuggers without actually teaching the process.

  • When we imagine that human tutor interacting with a student, we imagine a great teacher like Dan: sitting down, going back to step one, diagnosing the problem, and giving them the right instruction at the right time to address their misconceptions. When the reality is, it’s mostly students giving this kind of tutoring help, and they’re maybe not giving the best instruction. And so we have to be honest with ourselves about what these AI assistants are being compared against. Then we can actually do a fair comparison.
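
For reference, the task Leo mentions above, finding the average of the positive numbers in a list, is roughly the following. This sketch is one reasonable reading of the task, not code taken from any study.

```python
def average_of_positives(numbers):
    """Return the average of the positive numbers in the list, or None if there are none."""
    positives = [n for n in numbers if n > 0]
    if not positives:
        return None
    return sum(positives) / len(positives)

print(average_of_positives([3, -1, 4, -2, 5]))  # (3 + 4 + 5) / 3 = 4.0
```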

3 Tech Lead Wisdom

  1. From a leadership perspective, it’s important to be good, it’s important to be fast, it’s important to be productive, but it’s just as important or even more important to know where you’re going. And so I spend a lot of time with my group and with my lab making sure we have a vision for where we’re going and that we are going in the right direction.

  2. Always test assumptions or always be aware of assumptions that people are making.

    • We’re at the beginning of this, and a flood of research and commentary is gonna come out about LLMs. Always take the time to understand where the writer is coming from or where your own assumptions are coming from.

    • Especially now, I just want to caution that people are going to be making sweeping statements about LLMs. There are so many assumptions that are baked into the experiments that people are doing right now. We can’t even agree on the right skills that we want students to have when they’re working with LLMs.

    • We need to dig beneath the headlines to see exactly what’s going on, especially in a new area like LLMs, where there are so many assumptions that have not even been written down yet that people might be making.

    • It’s not even that anybody who’s involved is being deceptive. I think everybody’s being super honest about what’s happening. But the assumptions, I think, are so new that we’re not even necessarily writing them down. If we’re not being careful enough, we might be making assumptions about LLMs.

    • For example, I could just think in my head, okay, students still must know syntax. And maybe that’s true, maybe it’s not true, but it might be so obvious to me one way or the other that I just might not even take it into account in my research. And this is one of the most dangerous things for researchers.

    • It’s like an assumption that is apparently super obvious that you don’t even question it. Or even worse, you don’t even write it down. As a community, we’re at risk of doing this right now because everything is moving so quickly.

  3. I believe very fervently in the notion of empowered teams. You wanna make sure your teams are empowered to be able to do the work they want to do and solve important problems.

    • As PhD advisors, empowering our PhD students to find their own path is probably one of the best things we get to do as faculty: watching them go from not really knowing what they wanna study initially, with us being close to them on every project they do, to five or six years later, when they are essentially running their own research program and we’re just giving occasional advice.

    • For the tech leaders out there who’ve been listening to Marty Cagan’s message of Empowered Teams, it applies more than just software engineering teams. I think you empower all the people who work with you, and you end up in a better place.

    • You can’t match a small team that’s empowered to do great work.

Transcript

[00:01:04] Introduction

Henry Suryawirawan: Hello, Dr. Dan, Dr. Leo. Good to see you in Tech Lead Journal podcast. Welcome to the show.

Leo Porter: Oh, thank you for having us. We’re excited to be here.

[00:01:11] Career Journey

Henry Suryawirawan: So I always love to start my conversation by asking my guest to actually share a little bit more about your career. If you can maybe mention your highlights or turning points that we all can learn from, that will be great.

Daniel Zingaro: Thanks, Henry. I’d be happy to. I think the beginning might not be too surprising. So I started in university in computer science and I did my undergrad degree in computer science. And then, I started a grad program. So I started a master’s program and I was doing something called formal methods. So like normally, when people want to get confidence in their programs, they run them with a bunch of test cases, like, you know, as many test cases as they can come up with.

But formal methods is different. Formal methods, you try to prove mathematically that the program is correct. And I was pretty interested in that, although I could kind of tell as I was working on it that it was just extremely difficult for me. And I don’t mean to say, oh, if, you know, if you’re not getting something right away, you just give up.

Like, that’s not what I mean. But I mean, there were some extremely impressive people in this field and I was happy to be part of it. But I also just sort of realistically realized that I wasn’t gonna be able to make a huge impact in that area, but I was still having a great time with it. And I guess that’s all that mattered to me.

But then my supervisor one day happened not to be able to teach. I think he wasn’t feeling well, so he called me sort of last minute and said, can you cover my class for me? It was a compilers class. And I was worried cause I hadn’t taught a class before, but I, you know, I gave it a shot. And that was the turning point for me going from this kind of abstract research to education research. So I taught that lecture and then it was over for me. I was just, how can I start teaching more and studying education? So it was a complete shift at that point. I think to this day my supervisor would probably say that his biggest mistake was not getting himself to class that day, because we were planning on working together further in that area. And it didn’t happen.

And then, maybe in 2010 or 2011 or so, another big career change happened, which is I met Leo at a conference. We were both at an education conference and just sort of met up there and immediately just had a lot in common. You know, not just about our research, but just hobbies and sports and video games and, I think, just worldview. And so we connected immediately, and we’ve worked on dozens of papers since then, and our book most recently. It’s a career highlight. It’s a real honor to be working with him in so many capacities.

Leo Porter: So I have a bit of a non-traditional career path in that I did my undergrad in computer science, actually switched into computer science as a major. And then I did four years as an officer in The United States Navy. I was a navigator on a guided missile destroyer. And so a lot of my lessons about leadership and building teams and ethics actually come from my time there. After I finished my time in the Navy, I went back for a PhD in computer architecture. Did lots of processor design, and very similar to how Dan described formal methods, I enjoyed the work.

But it was really when I started teaching and being in front of the classroom that I got the most excited. Right towards the end of my PhD, I started shifting into computer science education research, and it started really with a colleague, Beth Simon, kinda introducing me to Dan. And as Dan kind of pointed out, it was just an incredibly fortuitous meeting, because I didn’t know the methods of computing education at the time. Dan was in an education PhD. And it was really through the two of us finding our research direction together that we did tons of really productive work. And I’m super appreciative to Dan for all we’ve done.

I mean, in terms of kinda the big stuff we’ve done in our careers, we’ve investigated how effective this pedagogy called peer instruction is in computer science classes and done most of the main research on that topic, at least up to 2017 or so. And then other people kind of took over the research from there. We built this assessment of how people learn basic data structures that’s actually validated and is used by the community, called the Basic Data Structures Concept Inventory. The two of us used machine learning to predict which students are likely to succeed and fail. And we became pretty accurate about predicting student success very early in a quarter.

Daniel Zingaro: And just in terms of our book that happened because I happened to hear about some of these generative AI tools like ChatGPT and GitHub Copilot. And I tried them out and I immediately was worried about our introductory programming courses. And I thought that everybody’s gonna be panicking and it did end up happening like that. Everybody was panicking trying to figure out what to do. And so I had got on a call with Leo and I, I said, Leo, there are these tools we have to look at these. Maybe we should write a book, somebody has to. And, uh, it didn’t take Leo very long, maybe 15 minutes or so of playing with the tools himself before he agreed. He was like, yeah, I mean, somebody’s gotta do this for the community. Just to give some direction, some perspective on what was happening.

Leo Porter: And then, as Dan said, he called me and said, hey, I started playing with these LLM things and they are terrifying for trying to teach programming, cause they’re solving essentially all the tasks we used to give students. And so we’re gonna need to change how we teach introductory programming. And as Dan said, it took me probably about 15 minutes playing with this thing before I went, oh my gosh, we’re in serious trouble. We then sat down and started figuring out how we were gonna build a class that would adjust to the fact that these amazing tools are available. And then as we kept going with that, we said, well, wait a second. We need a book to help structure the class. And that’s when Dan kind of suckered me into writing a book. And then we had a lot of fun writing after that. So it was great.

Henry Suryawirawan: Thank you for sharing your story. I think one thing that I really picked up on is how you started writing this book, Learn AI-Assisted Python Programming. You decided in about 15 minutes, and that’s really cool. So today we are gonna talk a little bit more about what you’ve done in terms of doing the research, and also about your experience playing around with these LLM AI assistants, and maybe the impact on people learning programming.

[00:07:21] Sponsor

Henry Suryawirawan: Hey, thank you for being part of the Tech Lead Journal community. This show wouldn’t be the same without your ears, and you are the reason this show exists. If you’re loving TLJ and want to see it keep on growing, consider becoming a patron at techleadjournal.dev/patron or buying me a coffee at techleadjournal.dev/coffee. Every little bit helps fuel the research, editing, and sleepless nights that go into making the show the best it can be. Thanks for being the best listeners any podcast could ask for!

And now, let’s get back to our episode.

[00:07:55] AI Assistant

Henry Suryawirawan: So let’s start probably at the beginning, just to level set our understanding. What actually is an AI assistant? You know, some people have heard about Copilot. But maybe you can describe what an AI assistant is?

Daniel Zingaro: So I guess I would define an AI assistant as a piece of software that helps you get work done more efficiently. But the key to it is the way you communicate with it. So typically when people think about using computers, they think you need to be very rigid in how you communicate with a computer. And that’s what we would say when we teach programming courses, right? Every symbol matters, every space matters, every, you know, key press that you make is important. And that’s the language of computing, right? It’s very precise.

What an AI assistant allows you to do is communicate in English or any natural language. And there are way more people in the world who know natural languages compared to programming languages. And so the hope and the goal is that people be able to use their own language and have the computer translate that. Cause computers still can’t run or do anything with these languages. Like English, for example. It’s the only one I know. So I, I keep talking about it. But still it has to be translated into something a computer can work with. That’s what we’re trying to automate with these AI assistants. It’s like making it so that people can communicate in their language and have it automatically translated over to lower level stuff that the computer understands.

Leo Porter: And so I would expand and just say what’s great about these AI assistants is it’s almost a step in the natural progression of making programming and interacting with computers easier for humans. And so we’ve seen this evolution from having to write Assembly code, or actually before that, you had to push buttons on a machine to make it do things. And then writing Assembly code was a huge improvement with the stored program computer. And then we moved to the point that we could start writing in higher level languages that were more English readable than Assembly, compiled down to Assembly, and actually then the Assembly ran. And then over time, we’ve developed more and more advanced languages that become easier to express our goals in the language available to us.

Now, what’s unclear to us is whether or not these LLMs are gonna be just the next language in which we interact. Right now, you can interact with them and get working code fairly often, but it’s not always correct. And so it’s not quite the same as a compiler which is deterministically gonna be correct. But it does seem like the next step in a really nice evolution of making it easier and more accessible to write software.

Henry Suryawirawan: Right. So I think that’s a very interesting thing. I’ve not played around with all these Copilot tools a lot. I mean, in my day-to-day role, because of the nature of my job. But one thing that I think very interesting when I heard developers using it, right? It’s like it seems to improve their productivity. So just like what you said, right, it could be the next evolution of how we write software.

[00:11:04] How AI Assistants Affect Students

Henry Suryawirawan: And I think in your lecturing role as well, I think it will be different now when you teach programming to new students. Maybe tell us a little bit: how do you find the difference now that there are these LLM AI assistants and students are learning programming with them? Does that make it much easier or does it actually make it harder? Is there any kind of stark difference that you can tell?

Leo Porter: Oh geez. This is kinda a long answer, so I’ll give it a go. So we’ve been experimenting with how do you teach new learners how to program in the presence of LLMs. And I’ll be upfront that we don’t have all the answers yet by any means. I think that’s gonna be probably a decade before we actually really know from the research what the best way is to teach students, now these new tools are available. But I can say a few things, which is that students very quickly recognize how powerful the tools are. I actually demoed it in my class, the very first class, and they all were, I got a gasp from the crowd. They couldn’t believe that I was just basically writing the code for them.

And so the question then is what are we teaching the students? What changes here? And I think there’s still a lot to teach students. It’s just that the scope has shifted and the skills have shifted. We also have to say what our goal is. Are we training the next software engineers or are we training someone who works in business or data science or accounting to be able to write software that does something useful for them? And I think those are actually slightly different audiences in terms of what we want to teach. What we’ve done is we’ve kind of realized that the skills that you need to interact with an LLM are actually fundamental programming and software development skills.

So the general workflow in working with an LLM is you give it some, you, you have some desired function that you wanna write, you know you want to accomplish some small task. And you then describe that task and then the LLM is gonna generate code for you. Now, the code it’s gonna generate may not be right. It may not even be close to kinda address what you want. If you read through it, you can quickly recognize this isn’t what I want. And then you can pull up basically a window of saying, are there other solutions that are good for me. And so already students need to know how to read code, understand what it’s doing, potentially be able to trace code, and be able to pick from multiple code examples which one’s gonna work.

And then the next step for them, once they’ve kinda picked one that they think works is write tests. Because you can’t trust the LLM. And this is actually a point that I think is a really encouraging piece for new students is testing has been a point where new students struggle. They tend to write code, and by definition, assume it’s right, which is basically the opposite of what we all do as software engineers, right? You write the code and you gather tons of evidence for it being correct before you have any faith in it. And so the students, I think because it’s coming from a machine that they know that makes mistakes, they’re actually more willing to test. And so that’s a research question we haven’t, I don’t have the data to support that yet, but I, I have suspicions that they’re more willing to test, when it’s coming from a tool that they know can make mistakes.

So then you test the code. Writing good tests is super important and something that we haven’t taught as well as we should have, candidly, in the past. And then once they’ve tested it, now they know that piece of code is working. However, the one catch is sometimes the code doesn’t work and sometimes you can’t get the LLM to give you the exact right answer. And so there’s still this last step of being able to debug. And so you have to teach explicitly how do you modify code that’s slightly buggy to do what you want. And so we’re still teaching debugging skills and that also is a fundamental skill, right? So, so far, your listeners are probably saying, well, what’s different here? But there is a shift onto reading and modifying and testing code away from looking at a blank screen and writing code from scratch, which is what we used to do the most of.

Daniel Zingaro: Yeah. And Leo, maybe I, I can just say, and this may not be obvious to, um, people who have been programming for a while, but the syntax of a programming language is something that takes some students weeks and some students just don’t get past it. It’s extremely stressful for students. And also artificial, right? Like the only reason we need all this syntax and, you know, maybe for this next little bit of discussion, people can think about some horrendous syntax. Like the way that like C function pointers are defined or something. Like the only reason it’s like that is because the compiler has to be able to unambiguously understand what your goal is.

And if you’re getting stuck on syntax, some of us do program for the sake of programming. I get that. A lot of people probably listening to this podcast just love programming, present company included. But there are many people who just want to do something with the code, right? For example, many of us may not care how our appliances work. Like if my microwave makes my popcorn for me, I’m happy. And that’s how some people are with code, right? They don’t need to know what exactly each, you know, piece of syntax is doing. What’s super exciting for me and Leo about LLMs is I think for the first time, we see a future where people who don’t know how to program could be afforded some of the same benefits as people who do.

Henry Suryawirawan: Thanks for the explanation of how it changes the dynamics now for people learning programming languages. I think with the skillset that you mentioned, you know, people still need to be able to read the code, need to be able to test it, need to be able to debug it, and maybe also to be able to express the task that they want to solve, right? Because the LLM cannot just solve everything in one go. You will probably need to play around and do a little bit of back and forth before you come to a perfect solution, so to speak.

[00:16:46] Problem Decomposition Skill

Henry Suryawirawan: But one thing I think I wanna highlight about this tool is that there’s a risk of not getting it right the first time, right? I mean, even if you ask the same questions, it might spit out different answers, right? Maybe explain a little bit of the underlying why. Why does the tool not seem to be deterministic? That’s the first thing. And maybe a little bit about LLMs, how they work, for people to understand.

Leo Porter: So I really appreciate you pointing out that you can’t just give a large task to an LLM in one go. And that’s actually a super important skill that we’re now teaching in our classes that candidly we did not in the past. So what we used to do in the past, and this is common across pretty much all of computing education, and I bet many of your listeners had this in their classes, is students are given essentially just a function. The function’s basically perfectly described, because you need to be able to auto-grade it. And so every possible case is covered in that description, right? And then they have to just fill in the code for that function. Now, LLMs do that incredibly well. And so the shift now is if I give a fairly vague task, like a large project to work through, how do they break apart that large project into smaller tasks that the LLM can then help them solve?

Now, for your audience and all of us as software engineers, this is problem decomposition. This is probably one of the most important skills we learn as software engineers. And it used to be that we didn’t teach that to new students learning how to program until much later in their careers. And now it’s actually front and center, incredibly important to learn in your first programming class, because that’s how you have to interact with these LLMs. And so in my class this last fall, I had students doing things that they would’ve never been able to do in a previous CS-1 class. I gave projects like find a data set on Kaggle, ask a question of the data, and then write the software to answer that question.

That’s way beyond the scope of what we’d ever ask in a CS-1, especially given the scope of what many of my students did, where they often did really nice visualizations and pulled in really interesting data sets from the domains that they cared about. Some of them had interactive programs where you could interact with it and ask, I want to see the relationship between… it was a stroke data set, and you could say, I wanna see the relationship between age and stroke, and it would actually plot age against stroke in really clever ways. These are things, again, you’d never have seen in a CS-1; it’s really open-ended and the students are doing all the problem decomposition on their own. And so we’re really excited about teaching this skill. And if I’m a little bit reflective, I’m a bit disappointed that we as a community stopped prioritizing that so early in students’ careers. It really should be a first-order priority.

Henry Suryawirawan: You mentioned problem decomposition. I think that’s a really important skill for any programmer, with or without an AI assistant, right? It’s being able to break down problems, or even requirements, into small tasks, into modules, designs, classes, and things like that. And decompose them such that we can make good software, right, that is maintainable, rather than just one big function that does everything in one go. So I think problem decomposition definitely is very important.

[00:19:47] How LLMs Work

Henry Suryawirawan: And I wanna come back to the question earlier about LLMs, because they are non-deterministic so far. Maybe you can explain why they are non-deterministic? How do they work underneath, so that people actually understand? I mean, it’s not gonna replace programmers in one day, right?

Daniel Zingaro: Yeah. Thanks, Henry. It’s a really important point and it’s also counterintuitive for a lot of computing people, because, you know, we all joke about shutting your computer off, turning it back on if the thing you’re trying to do doesn’t work. But I think many of us go under the assumption, in our day-to-day computing lives, that computers are deterministic. If you do something, you’re gonna get the same response. I mean, there are always counter examples of this and race conditions and stuff, but overall that’s how we feel about computing. If you write a program and you run it, you’re gonna expect that if you run it a second time, the same thing’s gonna happen. And if it doesn’t, you probably start thinking about, oh, I, I, maybe I have a memory allocation bug or some sort of like transient behavior problem in my program.

But, Henry, as you mentioned, LLMs are inherently non-deterministic. So you ask for some code and you get the code. And then you ask a second time, and you’ll get a different code. And you ask a third time, and you’ll get a different code. And this is, first of all, kind of challenging as an instructor. Leo reported recently that it makes it very difficult to plan sometimes, because typically, you know, our lectures are sort of scripted in some ways where we need to demonstrate specific things in our lecture. And it’s very hard to do that when you don’t know what the LLM is going to respond in real time in class. So there is that.

But on the other hand, there’s also a benefit, believe it or not, of being non-deterministic. And that is because these things can make mistakes, imagine how frustrating it would be if you asked it for some code and it gave you some code and it was wrong. And then you were like, well, okay, now what do I do? Do I try it again? And in this case, you don’t want it to be deterministic, right? ‘Cause then you’re just gonna get the same wrong code every time. So the fact that it is not deterministic means that you have a chance, even if the most probable response is wrong, maybe you can ask again or look at maybe the top 5 or top 10. And maybe you could pick out the correct code from that list.

And this is a skill, right? This is a skill that our students or learners did not need before, but they do now, right? Because the first response may not be correct, students have to know how to go through the list of potential solutions and figure out which ones are perhaps not correct immediately, and which ones are worth further testing, further consideration. And so for that reason, Leo and I are very careful to continue to teach the programming language. So it’s true our book and Leo’s students are working with LLMs throughout the course. But at the same time, we are still teaching the students Python, because right now Python is the language that we teach in the introductory computing courses, and it’s still an important part of the loop.

And so we’re not yet at the point where you can give English, or whatever natural language, instructions to the LLM and be done. You’re still getting back Python code. And so learners still need to understand and work with Python, not at the syntax level, like we said. We’re spending less time on the low-level syntax details, but they still need to understand Python. And one of the reasons is so that when this non-determinism is happening, they can look at and evaluate a bunch of different solutions for which ones may be correct.
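To make that screening skill concrete, here is a minimal sketch in Python (the book’s language). It is not from the book: the `ask` callable and the function name `largest` are hypothetical stand-ins for whichever assistant and prompt you are using. The point is simply that quick tests, not the first answer, decide which candidate to keep.

```python
def passes_quick_tests(candidate_source: str) -> bool:
    """Screen one candidate: define it, then check a few known inputs.
    Assumes the prompt asked for a function named `largest` (hypothetical)."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        largest = namespace["largest"]
        return largest([3, 1, 2]) == 3 and largest([-5, -2, -9]) == -2
    except Exception:
        return False  # syntax error, wrong name, or wrong behavior


def first_working_candidate(ask, prompt: str, attempts: int = 5):
    """Because responses are non-deterministic, ask several times and return
    the first candidate that survives the quick tests. `ask` is whatever
    callable returns one candidate's source code per call."""
    for _ in range(attempts):
        candidate = ask(prompt)
        if passes_quick_tests(candidate):
            return candidate
    return None
```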

[00:23:36] Prompt Engineering

Henry Suryawirawan: So I think the interactions that you mentioned, right? Asking back and forth: if the first solution is probably not quite right, you ask again, and you ask again, back and forth, until you find the right solution. I think this comes to the term prompting, right? Many people would have heard of it by now. Prompt engineering is kind of like a new job, even. So tell us about this prompting. I think it’s a new skillset that everyone needs to learn in order to get the best out of an AI assistant. What do you feel about this new skillset?

Leo Porter: So prompt engineering is just the task of writing a prompt in such a way that the LLM will give you back the response you want. And what is tricky there, when we’re teaching people who don’t know how to program to start, is that the LLMs do very well if you describe problems in a technical language. So if you say, I want this function to find the largest value in this list, right? There I’m using terms that we know as computer scientists, right? I wanna make sure I describe it as a list if it’s in Python, or an array if it’s in Java, and things like that, right? And I’m saying I wanna find the maximum value, so I’m specifically saying exactly the behavior I want. I may describe it even better and say something like, write a function that returns the largest value in the parameter list. Now I’ve really specified the behavior, and the LLM does a much better job with that. The problem is I’m using keywords that you have to teach learners. And so there is still this task of teaching them how we would speak about these functions.

And I think, I’m by no means an expert on how LLMs work. However, they are reading from large code bases and they’re learning from those large code bases. So in a sense, what we’re trying to get them to do is just give that function header that a human would’ve written to describe their function. And if you can write one that’s very close to the behavior you want, the model’s gonna find something very similar in its training set and then generate code that’s gonna be paired with that.
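As a rough illustration of the kind of prompt Leo describes, the function header and docstring below are written in that technical vocabulary, and the body is the sort of code an assistant will typically fill in. This is a sketch for illustration, not output copied from Copilot or from the book.

```python
def largest(values):
    """Return the largest value in the parameter list `values`."""
    # The docstring above is the prompt: it names the parameter, calls it a
    # list, and states the exact behavior wanted, so the model has little to guess.
    result = values[0]
    for value in values[1:]:
        if value > result:
            result = value
    return result
```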

The other thing you can do with prompts is there’s a whole bunch of ways in which you can basically tell the LLM to behave in a particular way. And Dan has a lot more experience with that. So I’ll let him take it from there.

Daniel Zingaro: Yeah, we had a good time near the end of our book going into these other prompt interaction patterns. People have started to catalog the different ways of interacting with an LLM. It kind of reminds me of the object-oriented patterns, you know, Henry, like the observer pattern and model-view-controller and the visitor pattern, all these patterns that people have identified and documented over the past several decades. And people are starting to do that now with LLMs. I probably could have kept going and going about it in our textbook, but I managed to control myself. I think I only talked about a couple. But they’re very interesting.

And so for example, one of them is what if you don’t know what information the LLM needs to perform your task? And this links back to what Leo just said. You have to be very precise sometimes in your natural language. Hopefully not as precise as you do with programming, but like Leo said, you still need to know a lot of the terminology that you might not know. And so one thing you can do is you can use this flipped interaction pattern where you ask the LLM to ask you questions.

I think the example we have in the book is you want a function to validate a password. And you might not know how to ask for such a function. So what you can do is ask the LLM to ask you for all the information it needs, and once it’s done asking you, it will write the function. So one of its first questions to you might be, okay, what are the parameters? And you might be like, well, I don’t know what the parameters are. And then it will tell you what the parameters are, and hopefully then you’ll be able to answer that question. I mean, there is a risk here of going down a rabbit hole that you don’t understand. And so we need to balance these prompt patterns against teaching the fundamentals of programming.
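For readers who have not seen the flipped interaction pattern in action, here is a hedged sketch of how the password example can play out. The opening prompt and the specific rules below are illustrative assumptions, not the book’s exact wording; the code is one plausible function the exchange might converge on once the questions have been answered.

```python
# Opening prompt (natural language, not code), roughly:
#   "I need a Python function that validates a password. Ask me one question
#    at a time about anything you need to know, and only write the function
#    once I've answered all of your questions."
#
# After answering its questions about the parameter and the rules, the
# exchange might converge on something like this (the rules are made up here):

def valid_password(password: str) -> bool:
    """Return True if the password satisfies the agreed rules."""
    long_enough = len(password) >= 8
    has_upper = any(ch.isupper() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return long_enough and has_upper and has_digit
```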

But there’s another pattern that I find kind of interesting, which is called the persona pattern. A lot of educators are using this to good effect right now: you can ask the AI to act like a specific kind of person. So what educators are doing is using the persona pattern and saying, okay, LLM, you are a CS-1 instructor. And that conveys a lot of information, like, don’t use advanced programming concepts that have not been taught yet. Or the persona could be, you are a student in an introductory computer science course. Things like that, so that you try to scope the types of responses that you get. You try to change the types of responses from the default ones, because there are a lot of situations where the default responses might include code that students have not seen before, or types of code that you don’t want them to see yet, like not in the scope of the course. And it’s kinda amazing to me how much leverage you can get by just telling the LLM how to behave in the upcoming interactions. So it’s definitely an ongoing area of research, and I’m definitely listening carefully to what’s going on there.

Henry Suryawirawan: Right. Very interesting to hear some of the patterns, right? I’ve read your book as well, so I find it very, very interesting. For people who might have used an AI assistant like ChatGPT or Bard, they would have seen these patterns as well. Some people also share, if you wanna do something, here are the catalogs of prompts that you can use to solve the problem, right? I think the same thing applies for programming. The flipped interaction pattern, the persona pattern, those are definitely interesting. So it’s not just one-way, where we ask a question and the AI assistant gives a solution. You can also be creative and sometimes use it differently, right? So thanks for mentioning the patterns. I would love to see more patterns in the future. I think we’ll leave it to those creative people to come up with them, right?

[00:29:29] Automating Tedious Tasks

Henry Suryawirawan: So maybe you can share from your experience so far using the tool, you know, cracking the code, right? Are there any techniques that are probably a little bit less utilized for now, but that people could try so that they can actually see the true power of LLMs in their day-to-day workflow?

Daniel Zingaro: It’s a good question, Henry. So I think a lot of people, if they’ve played with these tools at all, are kind of used to asking ChatGPT or GitHub Copilot for code. And that’s a great use case. Something else Leo and I have learned is you can also use these tools to ask for libraries or modules that you might be able to use to make your task easier. So the chapter we have in our book, it’s called automating tedious tasks. And it’s amazing to me how many libraries in Python are available to help you, if you didn’t know about these libraries. Like, I’ll just pick a random example.

We have an example in the chapter about automating the tedious tasks where you’ve got two huge directories of images. And the backstory in the book is maybe they came from different phones. So like your partner has a bunch of pictures on their phone and you have a bunch of pictures on your phone and there are duplicates, because you’ve been sending them back and forth. And I think everybody listening kind of knows what kind of mess you can get yourself into. And so the idea is, we want to remove the duplicate pictures.

And this, I think, sounds like a super daunting task until you realize that you can ask Copilot or ChatGPT, “Hey, here’s a task I wanna perform. Is there a Python module or library that I can use?” It will come back and tell you about the libraries that are available that might help you. For example, something that tells you if two pictures are the same picture, like all the pixels are the same. And then you can ask Copilot for clarifications. You can say, is this module built in? Is it something I have to install? Are there other alternatives? And actually, I want to throw it over to Leo for a second, because in your class, at the end of 2023, I think you managed in one lecture to do an example of adding up all the word counts in like a ton of documents. And this is something I’m assuming you would not have done in a normal CS-1. Is that accurate?

Leo Porter: Absolutely! I think it’s hard for students to work with new libraries. And so being able to have that conversation with the LLM is really clean. It gives you nice examples. I meant for this to be a whole lecture. And what happened was I ran long from the previous lecture. And we just had the quick conversation with the students about, I’m asking Copilot, what’s a good library for me to find out how many words are in a Word document? It gave me a great answer. It gave me some starter code to work with. And within, I think, about 15 minutes that I actually spent with my students, we’d solved the problem. And that’s just way beyond the scope of what we normally teach in a CS-1. I completely agree with you, Dan.

Daniel Zingaro: Yeah. And again, it gets at that higher level of abstraction, right? Like a lot of listeners have probably had to dig into API docs, reading about how functions are called. Oh, this thing takes five parameters. The first one is a pointer to a pointer to a string. The second one is blah, blah, blah. And all you want to do is just use this thing. And the sample code is out there, and these LLMs can access it. And I just found the example really cool, Leo, that something we would not have even attempted in a previous introductory computing course. And we wouldn’t have attempted it, because we would’ve wasted too much time learning or showing students how to use the library, how to call these functions correctly. And now we can just do it.

Leo Porter: And I think it speaks to the resilience of the students, right? When they learn this way, when they’re working with a whole bunch of different libraries, working through a whole bunch of different examples from the LLM, I think they become more resilient than when we just give them small code snippets that perform specific tasks in specific domains. Jumping from domain to domain, I think, helps them a lot in building kind of a robust understanding.
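For readers curious what the duplicate-photo task mentioned above can reduce to, here is a minimal sketch using only Python’s standard library. It treats two files as duplicates when their bytes hash identically, which is the simplest version of the problem; the book’s own solution and choice of libraries may well differ.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash a file's contents so byte-identical photos get the same digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_duplicates(dir_a: str, dir_b: str) -> list:
    """Return (photo in dir_a, duplicate in dir_b) pairs with identical contents."""
    seen = {}
    for path in Path(dir_a).iterdir():
        if path.is_file():
            seen[file_digest(path)] = path

    duplicates = []
    for path in Path(dir_b).iterdir():
        if path.is_file():
            digest = file_digest(path)
            if digest in seen:
                duplicates.append((seen[digest], path))
    return duplicates
```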

[00:33:30] AI Ethical Issues

Henry Suryawirawan: Right. So I think another important thing about AI assistants, right? Leo mentioned it a little bit: they are trained on a large code base. So if you’re using Copilot, most likely it is trained on GitHub repositories, right? I think there’s a risk about copyright. There’s a risk about just bluntly copying from those repositories. What kinds of risks have you seen so far from the introduction of these tools?

Leo Porter: Henry, that’s a fantastic point about the ethics of using these tools. And I think there are a few directions we can take this discussion. The first is the ownership of that code base that they used to learn from. I don’t think we as a society have figured out, first, how we should view this ethically, and second, how we should view this legally. We are obviously building tools that can help empower people. And so in some light, we would say this is a good, ethical thing. But we do have to ask how the tools are built and who isn’t benefiting from their code potentially being taken, or things like this. So that’s kinda the first concern.

The second concern would be copyright. Are the LLMs commonly parroting code which might be under someone’s ownership? It is hard to assess, particularly for the kind of small pieces of code that LLMs tend to be able to generate well. But occasionally, I mean, I’ve seen it in my interactions with Copilot, occasionally it will generate an author name in its recommendations to me. And then, clearly, I don’t know if that’s the actual author or if it’s just doing next-word prediction, and it happened to say author and then predicted some words after that. But it does give you some doubts about where this came from and whether or not we have ownership over it. And so we said fairly early on in our book, this hasn’t been resolved legally yet. Feel free to use these tools for your own use, but if you were to go try to build a company off the software that you’re writing, you should be a little careful until these laws get resolved.

And then the third piece I’d say, in terms of the ethics, is bias in the models. And we’ve seen this across artificial intelligence: models reflect biases within society. So if you ask for a list of names, it will probably give you a list of Caucasian male names on the first try. And you have to ask the question, why would it do that? Like, why is that its default? It’s obviously learning from a code base where those names are probably more represented, but that’s not a good sign for students who are coming in and are not from those groups.

What I think is important to do, since I don’t think, again, we’ve worked out these issues as a society, is to bring the readers of our book and the students in our class into this conversation and say, these are the ethical concerns with these models. And have a direct conversation about it. And be frank about what we know and what we don’t know. I think the fear is that if we aren’t, if we kinda pretend these models don’t exist, and we try not to let the students use the models, and they go on to use them on their own, they’re gonna run into these issues. And so it’s better for us to teach them upfront than to just leave them blind on it.

Daniel Zingaro: It also just sort of seems, just to keep going off what Leo just said, it also seems a little dishonest, to me as a teacher, to not show students these tools. As soon as we do, then what Leo said comes into the picture, right? Then we cannot pretend that these issues don’t exist. But there are people who try, you know, to pretend these tools don’t exist and ban them so the students can’t use them in their courses. And I totally get why. It’s a very upsetting thing that has happened. I don’t mean upsetting in terms of, you know, making me sad. I mean in terms of upsetting the status quo of how courses are taught.

And it’s very tempting to just try to pretend these tools away. But the tools are out there and our students are going to be using them. And I think, more importantly, they’re going to be using them when they get their next co-op position or their next industry job. Or at the very least, they’re gonna be asked about these tools at future companies and asked about their opinions of them. And I just have to super agree with Leo on this. We need to be teaching these ethical concerns. We may not have solutions to them, but I don’t think a solution is to try to scare students away from using these tools, or to somehow try to prevent them from using these tools, ‘cause it’s never gonna happen.

And I think it’s more useful if we teach the tools along with the concerns that we have. I think it goes without saying, but we have a lot of work to do, right? The issues that Leo just mentioned are not small. There’s a reason that they’re at the beginning of our book and not at the end, right? These are not a, “oh, by the way, these things are gonna reproduce cultural norms.” No! This is a big deal, right? We can’t just say, “oh, look at appendix A for the problems.” This is not appendix A stuff. This is chapter one stuff. So Leo, you know, in your course, you talk about these early on. But I don’t think that means we can’t use these tools. I think it actually makes it more likely that our students will use these tools appropriately.

I think one of the worst things we can do is introduce students to these tools and then not help them understand what the costs are. Cause I even think once students understand what’s going on, they’ll be on the lookout for this. And they won’t just accept whatever the LLM tells them as the correct answer, right? So we’re trying to balance the fact that they’re out there and students are gonna be using these tools, with also training students to understand the deficits. And who knows, our students might be the ones who end up in positions where they can make these kinds of improvements. Like, you know, students are potentially a couple of years away from graduating and being able to inform how these tools are deployed and how these tools are used. So I definitely think that this is a very important part and a new component of an introductory computer science course.

Henry Suryawirawan: So thank you for highlighting these potential risks of using AI. I think it’s not just for coding or programming, right? It’s a bigger conversation: responsible use of AI, copyright, ownership, and bias, for example. All of this is a new thing, right? People are trying to grasp it, and some countries are also trying to come up with guidelines. But I think you are right. Banning it altogether might not be the wise idea. We have to adapt to this tool. And as users of these AI assistants, at the end of the day, when you use the code and apply it to your system, it is also your responsibility to make sure that the thing you applied is correct, because it might potentially affect other people’s lives as well.

[00:40:08] AI Replacing Developers

Henry Suryawirawan: So I think one related question about using this tool in our day-to-day life: definitely, people are afraid of being replaced. Many people think that, oh, we don’t need so many developers anymore. We can probably cut down the number of people that we have in the companies, right? The potential is there for people to think that we may not need so many developers anymore. What’s your take about this? I know it’s probably hard to know the actual impact. But what’s your take on some people being afraid of, okay, AI is gonna take over the world and, you know, replace so many people?

Leo Porter: Okay. I think if any of your listeners spend a little time with Copilot, their fears will be quickly taken away. So I mean, these tools are fantastic. They do great things, but they make mistakes. And you realize very quickly there are still essential skills that are required to use them properly. And so I don’t think we as programmers are gonna go away. And so that’s kinda the first takeaway. The second is, out of our jobs as software engineers, and you know this far better than I do, but as software engineers, only a fraction of your time is spent coding. A lot of your time is spent thinking, how should I lay out the interfaces? How do I work with the other software within the company? How do I make sure I’ve got really clear requirements for my code? Like all of these things are the really big problems that still humans have to wrestle with. And I’m not seeing LLMs taking that away from us anytime soon. At least, for now.

Daniel Zingaro: And Leo, just to add, I think if you look back at the evolution of computing, I wonder if people had the same discussion when Visual Basic came out. You know, where you could drag and drop components onto a form. I wonder if, back in 1995, people were saying, oh, well, that’s it. We have rapid application development. Remember that term? It was called RAD, I think. And I wasn’t really around. I was a kid having a good time. But I think people back then were probably saying the same sorts of things, right? Like, oh, look at this. We can develop these applications by dragging and dropping.

And I think these advances, I don’t know if they lead to more or fewer jobs, but I think it’s likely that it’s gonna be a steady state, and perhaps we’ll be more productive with what we’re able to do. Just to reiterate what Leo said, I don’t think they make jobs in programming go away.

I should also add, and Leo, I wonder what your opinion is on this, most of what we’ve been talking about and reading has been for introductory programming. I don’t know if we know what the impact will be on industry level projects. We know people are using these tools in industry, um, and we know they’re more productive. But I, I don’t know if we know whether there are more or fewer jobs or if there will be in the future. I just, I have a feeling that the results will be that the existing programmers are just gonna be more efficient.

Leo Porter: I agree. I suspect there’s gonna be a shift, like with all the other major advancements in technology. When Python came out, we didn’t say, oh, okay, well, we need fewer people to write code. It was, oh geez, we can write larger software, or more quickly do data analysis, or now deal with the influx of big data. We’ve just adjusted and done bigger and better things as the technology got better. And so, naively, I think that’s the case here. But I do think there’s gonna be a bunch of research on this topic in the next 10 years, probably in the software engineering community.

Daniel Zingaro: Mm-Hmm.

Henry Suryawirawan: Yeah. Maybe a few things that I pick up from the industry point of view, right? There are people saying that it improves their productivity, maybe like 30-40%. Maybe the gap between junior and senior might be smaller now, because the juniors might be able to take on more advanced and complicated problems. But I agree with Leo that writing code is not the only job of software developers, right? They still need to understand requirements. And we know in the industry, a lot of times, requirements are vague or not well specified, right? So it’s the software developer’s job to actually translate that into a good, proper design.

And also, don’t forget about evolving the code, right? Writing it in a maintainable way. Writing it in such a way that it can scale. Those are things I still haven’t heard that LLMs can do for us. For example, if you tell them, build me a few microservices that can interact with these kinds of APIs, I think that will be too big a task for an LLM to solve. But maybe one day it will. Happy to see that future. But for now, my take as well is that we have to be able to live with it, and leverage it to improve our productivity so that we can move on to solve bigger and bigger problems, just like what Leo said, right?

Leo Porter: And I’d add to that, it’s not great at writing efficient code. So if you say, no, this is an inefficient algorithm, could you use dynamic programming to solve this? At least in my experience, it hasn’t done very well. And then I did try it in a really specialized class I teach on writing high-performance software that’s architecture aware. So like, knowing about caches, writing super efficient code, extracting cache locality, things like that. And it did terribly. Like, I asked it to write a ‘blocked’ matrix-matrix multiply, and it could not do that in any way. So I think, for the advanced code, there’s still a lot of room for us as software engineers to develop that ourselves.
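For context on the task Leo mentions, a ‘blocked’ (tiled) matrix multiply just restructures the three classic loops so the work proceeds on small sub-blocks that fit in cache. A rough Python sketch of the loop structure follows; Python is obviously the wrong language for real performance work, so this only shows the shape of the blocking being described, not an architecture-aware implementation.

```python
def blocked_matmul(A, B, block=64):
    """Multiply square matrices A and B (lists of lists) using loop blocking.
    Each (ii, jj, kk) step touches only block-by-block tiles of A, B, and C,
    which is what gives the cache locality in a real C or assembly version."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for kk in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        a_ik = A[i][k]
                        for j in range(jj, min(jj + block, n)):
                            C[i][j] += a_ik * B[k][j]
    return C
```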

Daniel Zingaro: Yeah, I guess what I find kind of interesting about the discourse right now is, because it’s so new, people want to be able to make these claims. Like, LLMs are crap, right? Or, LLMs are amazing. And, you know, it’s very early, so people are gonna make these kinds of claims right now. But I guess I’m more interested in what happens when the dust settles. And of all the polarizing opinions right now, I don’t think any of them are gonna end up being what actually happens, right? Like, is every software engineer gonna be fired? No. Are we going to have a different number of software engineers? Probably, right?

To some extent, at least. I don’t know if it’s gonna be more or fewer. But I think there are many, many opinions right now. Leo, what’s that statement you have about overestimating the effects of technology?

Leo Porter: Oh, there’s this famous quote, lemme see if I can track it down. But there’s a famous quote, which is that we tend to overestimate the effects of technology in the short term and underestimate them in the long term.

Daniel Zingaro: Yeah, and so perhaps that’s what’s happening. And that’s the kind of stuff that grabs headlines too, right? So, I mean, we’re still in the throes of this thing, where it’s very difficult right now to separate hype from what’s actually happening. I guess I look forward to getting to the point where we have more research backing, because until then, it’s fun and everything, but it’s just people talking about things we don’t really know the answer to until the research gets done.

Leo Porter: And I’d add that it’s Amara’s Law, that’s the name of it. So Amara’s Law is: “we tend to underestimate the effects in the long run, but overestimate them in the short term.”

Daniel Zingaro: Mm-Hmm.

Henry Suryawirawan: Yep. So one thing is clear, for sure, right? If you rely too much on LLMs, we are not there yet, right? In your book, you also mention it is not an expert; it is actually trained from existing code bases, right? So, for example, if you wanna solve a new problem, maybe quantum computing, let’s say, it may not even be able to give you a proper solution, right? So let’s not forget about that. We still need to use our judgment as humans when applying what the LLM is suggesting to us in our software.

[00:47:34] A More Equitable Society

Henry Suryawirawan: So maybe one last point I would like to ask, since you are also part of the university, teaching students, right? You mentioned equity and opportunity earlier, for people to learn about programming and computer science, right? Only a limited number of people get those opportunities. Now, with the introduction of AI assistants, potentially more people will be able to get into computer science and learn programming. Maybe the syntax problem will soon become less of a challenge. So what is your take on this, creating a more equitable kind of society for people to learn computer science?

Daniel Zingaro: Yeah, thanks, Henry. So this is something that Leo and I have been thinking a lot about, and we’re excited by the possibilities here. But we don’t want to say anything too early, because, again, we don’t know what’s gonna end up happening. But just to summarize for everybody, the deal is that people who already have prior programming experience, they, and I mean it’s unsurprising, tend to do better in introductory CS courses. So if they had more opportunities in high school, for example, or, you know, their parents had access to computing or courses, or their parents directed them into this field, then they tend to perform better.

And I wouldn’t necessarily have a problem with this, except that these opportunities are not evenly distributed. They’re made more accessible to dominant groups. And so this gap in prior experience leads to a gap across different types of students, which is obviously not okay. And so what we’re hoping, and the research is ongoing, is that because there’s a reduced emphasis on syntax when using LLMs, the gaps in prior experience, which will certainly still exist, will perhaps not lead to the gaps in outcomes that we’ve been seeing in introductory CS courses.

So again, there are a lot of caveats here. One of them, for example, is that maybe the students with privilege are going to be using LLMs earlier than other students. And then they’ll have prior experience using LLMs, too. And that may convey an advantage just like prior programming experience does right now. I guess our hope stems from the fact that learning syntax is so difficult and such a barrier for so many students. And maybe, with these LLM skills, the gap can be made smaller more quickly. I want to ask Leo to jump in here too, ‘cause this is a question that’s definitely worthy of multiple discussion points.

Leo Porter: Oh, absolutely. I think you’ve summarized the issue really well. I think there are a couple of other reasons for optimism. And again, I’m being very cautiously optimistic; as Dan pointed out, we have to do the research. But I mentioned earlier the kind of status quo of how we assess students in computer science classes, and it’s solving these really small functions that aren’t particularly exciting, to be quite frank. And there’s been a whole bunch of work within the community that has shown that students from demographic groups that are currently underrepresented in computing tend to want to see that their work is gonna help society. It’s gonna be for the societal good. And they wanna see that computing can serve that good.

I think when we move to LLMs, unless you wanna keep these outdated assignments that the LLM solves for you, you have to move to these kind of open-ended, large projects, which is what we were using in our class, and then students can pick the domain that matters to them. And then it can be something that’s meaningful to them personally. And I think if you can do that, we’re gonna bring in a broader audience of people who are interested in computing, ‘cause they see how it matters to them as people. So that’d be the first reason for optimism.

And then the second one is one where I’m also cautiously optimistic. There’s been a whole bunch of really interesting research, already started by members of our community, on how we can turn these AI assistants into tutors. Essentially, intelligent tutoring systems. How could we help, through prompt engineering, through really careful crafting of the introductory prompts? How can we make it such that when a student is struggling, they don’t have to wait for the instructor’s next office hours? They can just have a quick conversation, and they’re gonna get mostly correct answers, which is hard with LLMs, right? You gotta get correct answers. And encouraging answers, ones that will encourage them to keep trying. How can we get them the help they need when they need it? ‘Cause if there’s a gap in terms of how much support different groups need, making sure everyone has lots of support will help everyone, but it will help disadvantaged groups more.

Daniel Zingaro: Yeah. And Leo, it’s not impossible that this happens, in case people out there are skeptical. Leo and I, of course, are disinterested as well, slash skeptical, because we’re scientists. But there is precedent for something good happening here. And Leo, specifically, I’m thinking about the way that we teach our classes. So, for example, using student discussion in classes through something called peer instruction seems to be able to reduce this gap.

Leo Porter: Yeah, it seems that these techniques, like active learning, disproportionately help students who are underprivileged. And so it helps everyone, kind of like a rising tide lifts all boats, but the folks who are struggling are raised more. You see a larger impact for those struggling groups.

Daniel Zingaro: Yeah. And that’s because the new supports are there, right? In the case of peer instruction, it’s, I think, perhaps partially a community aspect: now they have more students around who can help them kind of catch up. And so this is the hope, right? People are already referring to LLMs as like one-on-one tutors. And I’m not willing to go there quite yet, but I think that’s the dream, right? Like Leo said, the dream is that they can reduce the time delay between having a question and getting an answer. Because if we can reduce that to zero, just imagine that any question a student has could be answered immediately, that bodes well for students catching up, right?

A lot of the time, I think the limiting factor is just resources, right? Like, I only have office hours once a week, for example. So if a student gets stuck before then, maybe they have to wait for me to get them unstuck, and maybe they can get unstuck sooner with LLMs, and then catch up. So again, this is just kind of the hope right now. Maybe in a few years we can revisit this and say, yes, we were right, or no, we were not. But for now, it’s definitely something we’re interested in.

Leo Porter: I think that brings up a really good point about comparison groups, which is where I’ve kind of shifted my thinking about what we are comparing against. And so I’ll give a couple of examples here. One is, you’ll hear folks say, we can’t change what we’re teaching in our introductory courses right now, because students are learning the fundamentals. And they’ll start kinda hammering on how great their current CS-1 class is. But the evidence is that of students finishing an introductory programming class, the majority can’t find the average of the positive numbers in a list. That’s like a super easy task for computer scientists, for software engineers. And the majority of students can’t do that at the end of CS-1. So we need to make sure that we’re very clear about what we’re comparing against. What’s happening now isn’t successful for everyone.

The second piece, and this is what you made me think of with the tutors, is that we’ve done a whole bunch of research out of my lab finding that both students and tutors have significant incentives to essentially just give away the answer, and just fix the problem for the student right there. Basically acting as human debuggers without actually teaching the process. And so I think when we imagine that human tutor interacting with a student, we imagine a great teacher like Dan. Like, sitting down and going back to step one and diagnosing the problem and giving them the right instruction at the right time to address their misconceptions. When the reality is it’s mostly students giving this kind of tutoring help. And they’re maybe not giving the best instruction. And so we have to be honest with ourselves about what these AI assistants are being compared against. And then we can actually do a fair comparison.

Henry Suryawirawan: Right. It’s like what you said: we can be cautiously optimistic about this kind of equitable future, right? So I’m really looking forward to more chances, more opportunities for people.

[00:55:58] 3 Tech Lead Wisdom

Henry Suryawirawan: It’s been a great conversation so far, right? I think we would have a lot more topics if we didn’t cut it short for now. I have one last question before we wrap up, which I normally ask all my guests. I call it the 3 technical leadership wisdom. You can think of it as advice for people to learn from you. Maybe you can share your version of the three technical leadership wisdom.

Leo Porter: Henry, I love that question, and if you don’t mind, I’ve got a slightly long answer for my first one. I had a really close colleague who was just a fantastic cyclist. His name is Allan Snavely, here at UC San Diego, and he was part of a race. He was a fantastic cyclist, and they were in the second pack. Anyone who follows cycling knows that you ride in packs. And the first pack is up ahead of them; they can’t catch them. But at one point along the race, the front pack seems to go in the wrong direction. And Allan pretty quickly realizes, wait, that’s actually not the direction to the finish. What are they doing? And so he steers the second pack towards the finish, and it’s the one race that he ever got to win, because the main pack went off in the wrong direction. And whenever he tells that story, I always get a kick out of it.

But it reminds me, from a leadership perspective, that it’s important to be good, it’s important to be fast, it’s important to be productive, but it’s just as important, or even more important, to know where you’re going. And so I spend a lot of time with my group and with my lab making sure we have a vision for where we’re going, that we are going in the right direction.

Daniel Zingaro: Thanks, Leo. That’s a powerful one, especially for researchers like us to remember. I have another one that relates to research too, which I think is actually even more important now for people who are not researchers as well, because of the LLM discourse right now. And that is: always test assumptions, or always be aware of the assumptions that people are making. And I bring this up specifically now, because I think we’re at the beginning of the flood of research and commentary that’s gonna come out about LLMs. I mean, obviously this applies to everything, right? Always take the time to understand where the writer is coming from or where your own assumptions are coming from.

But especially now, I just want to caution that people are going to be making sweeping statements about LLMs. Leo and I read a lot of research around LLMs, and often, you know, if you’re a busy researcher, a busy professor, you can get some summary of a paper by reading the abstract. It’s not a great practice, but if you’re very busy, you can get a sense of what the paper’s doing. I don’t think this necessarily works for LLM papers. There are so many assumptions baked into the experiments that people are doing right now. We can’t even agree anymore on the right skills that we want students to have when they’re working with LLMs.

And so I think we’re seeing a lot of papers with headlines like, LLMs suck, or LLMs are amazing, or whatever. But I think we need to dig beneath the headlines to see exactly what’s going on, especially in a new area like LLMs, where there are so many assumptions that have not even been written down yet that people might be making.

Leo Porter: That’s a brilliant point, Dan! And we see it with the new papers; they’re coming in very quickly. And because people are in such a race to get the research done, it’s really important for us to go to the methods and actually read the paper fully. And I know you’re fantastic at that. And so I hope all the other practitioners, all the people teaching programming, do the same thing: spend their time making sure they understand the studies that have been done.

Daniel Zingaro: Yeah, and it’s not even that anybody who’s involved is being deceptive. I think everybody’s being super honest about what’s happening. But the assumptions, I think, are so new that we’re not even necessarily writing them down. Like, if we’re not being careful enough, we might be making assumptions about LLMs. So, for example, I could just think in my head, okay, students still must know syntax. And maybe that’s true, maybe it’s not true, but it might be so obvious to me one way or the other that I just might not even take it into account in my research. And this is one of the most dangerous things for researchers, right, Leo? An assumption that is so apparently obvious that you don’t even question it. Or, even worse, you don’t even write it down. And I think we, as a community, are at risk of doing this right now, because everything is moving so quickly.

Leo Porter: Exactly. We’ve been studying how to teach programming for the last 40 years. And so we have so many assumptions built in about that. I think just even the assumption of what the end goal is. Like, is syntax an end goal of an intro programming class? We don’t know. I think there’s gonna be a whole bunch of discussion about that.

Daniel Zingaro: Yeah. Or like does it make sense to compare what students learn with LLMs against what they learn without LLMs? Like what do you compare, right? What’s important? Like I don’t think we know the answers to these questions. So I guess I’m asking more questions than I’m answering, which I don’t think was what Henry wanted for this section. Yeah, yeah. And Leo, you have one more I think you wanted to share.

Leo Porter: Oh yeah, absolutely! So the last piece, and this is gonna be me honestly just taking from the great wisdom that Henry’s already shared previously with some of his guests, is that I believe everything’s done with people. Like, Dan and I work fantastically together, and I love working with my lab.

And so I believe very fervently in the notion of empowered teams. And the message from Marty Cagan really resonates with me. I first heard it from Monty Hammontree at Microsoft. And it’s just a really powerful message: you wanna make sure your teams are empowered to be able to do the work they want to do and solve important problems.

And I think Dan and I have both seen this as PhD advisors: empowering our PhD students to find their own path is probably one of the best things we get to do as faculty. Watching them go from not really knowing what they wanna study initially, with us really close to them on every project they do, to five or six years later, when they are essentially running their own research program and we’re just giving occasional advice. And so for the tech leaders out there who’ve been listening to Marty Cagan’s message of Empowered Teams, I think it applies to more than just software engineering teams. I think if you empower all the people who work with you, you end up in a better place.

Daniel Zingaro: Yeah, it’s like tutoring, right, Leo? One-on-one work is really the most powerful work you can do. I’ll take my classes of 300 or 400 or whatever, and I’ll do my best, but you can’t match a small team empowered to do great work, so I totally agree.

Henry Suryawirawan: So, yeah, for people who want to learn more about this AI assistant from your book, are there any resources or a place where they can find you online?

Leo Porter: Yes, if people are interested in getting our book and trying to learn how to write software with the aid of an AI assistant, they can just look for our book on Amazon. It’s available in all countries. And we candidly are very open to feedback. This is a very, very new space. As readers work through it, we would appreciate emails or comments on LinkedIn letting us know how they’re finding the book and what we can do better for a second edition.

Daniel Zingaro: Thanks for organizing this for us, Henry.

– End –