Table of Contents
- I. Intro
- II. Lessons from Meta
- III. Joining Anthropic
- IV. The Origins of Claude Code
- V. Boris's Claude Code Workflow
- VI. Parallel Agents
- VII. Code Reviews
- VIII. Claude Code's Architecture
- IX. Permissions and Sandboxing
- X. Engineering Culture at Anthropic
- XI. Claude Cowork
- XII. Observability and Privacy
- XIII. Agent Swarms
- XIV. LLMs and the Printing Press Analogy
- XV. Standout Engineer Archetypes
- XVI. What Skills Still Matter for Engineers
- XVII. Book Recommendations
Intro
GERGELY: You were the first ever TypeScript book with O'Reilly.
BORIS: Yeah, I found that book translated in Japanese in this little town in Japan. That was just the coolest moment. And then I realized I don't remember TypeScript at all.
Now we're at the point where Claude Code writes, I think, something like 80% of the code at Anthropic on average. I wrote maybe 10–20 pull requests every day; Opus 4.5 and Claude Code wrote 100% of every single one. I didn't edit a single line manually.
Andrej Karpathy posted that he's never felt as behind as a programmer as he does now. This is something I really struggle with. The model is improving so quickly that ideas that worked with the old model might not work with the new one.
One metaphor I have for this moment in time is the printing press in the 1400s, because there was a group of scribes who knew how to write. Some of the kings employing the scribes were themselves illiterate. And if you think about what happened to the scribes: they ceased to be scribes, but now there's a category of writers and authors. These people now exist, and the reason they exist is that the market for literature just expanded a ton.
GERGELY: What happens when you join one of the top AI labs in the world and your first pull request gets rejected? Not because the code was bad, but because you wrote it by hand. This is exactly what happened to Boris Cherny when he joined Anthropic. Boris is the creator and engineering lead behind Claude Code. Before joining Anthropic, he spent 7 years at Meta where he led code quality across Instagram, Facebook, WhatsApp, and Messenger, and was one of the most prolific code authors and code reviewers at the company.
In today's episode, we cover how Claude Code went from a side project to one of the fastest-growing developer tools, and the internal debate at Anthropic about whether to release it at all; Boris's daily workflow of shipping 20–30 pull requests a day with zero handwritten code, and how code review works when AI writes everything; why Boris believes we're living through a time as transformative as the printing press; and which engineering skills matter more now, and which ones do not.
If you want to understand how one of the people closest to AI coding agents actually builds software today and what that means for the rest of us engineers, this episode is for you.
GERGELY: How did you get into tech, software engineering, and coding in general?
BORIS: It starts a while back. I think there were kind of two parallel paths that crossed. So, when I was maybe 13 or something like this, I started selling my old Pokemon cards on eBay. And I realized that on eBay, you can actually write HTML. I was looking at other people's Pokemon card listings and I realized some of them have big colors and fonts and stuff like this. And then I discovered the blink tag. And if I put the blink tag on my listing, I could sell my card for, you know, 99 cents instead of 49 cents or whatever. So I kind of learned HTML this way, and then I got an HTML book and learned more.
And then the second thing — this was also, I think, sometime in middle school. We had these old TI-83 graphing calculators that we used for math. And what I realized is I could get a better answer on the math test if I just programmed the answers into my calculator. So I wrote these little programs with the answers. Then the tests got harder, so I had to program solvers instead of hardcoding the answers, because I didn't know what the coefficients and stuff would be ahead of time. And then the math got more advanced the next year, so I had to drop down from BASIC to assembly just to make the programs run a little bit faster.
GERGELY: Oh wow. So like in high school you dropped down to assembly.
BORIS: I think this was middle school or high school, maybe 8th or 9th grade or something like this. Then the thing I realized is everyone in my class was starting to figure out that I had the solver, and they got kind of jealous, so I bought this little serial cable so I could give it to them too. And then on the next math test, everyone in the class just got A's. And the teacher was like, what's going on? Eventually she realized it. She was like, okay, you get away with it once — now knock it off.
But for me, it was very practical. In school I studied economics — I actually dropped out to do startups — and I never thought that coding would be a career at all. It was always very practical to me: coding is a means to build things, to make useful things.
This startup — the first one — my friends and I were trying to get weed, and so we started this weed review startup. We made a website, called different dispensaries, and tried to get weed samples so we could review them. And it actually kind of blew up. Then I got more interested in the fact that, at the time, no one was testing this stuff, so I got into chemical testing and chemical analysis. After this I did a bunch of other startups, and then I joined YC actually pretty early — I was the first hire of this YC startup in Palo Alto.
GERGELY: How did you decide to go to one startup after the other?
BORIS: Kind of vibes, vibes I'd say because you know, startups it's never a linear path. You always kind of pivot, pivot, pivot. You have to figure out what the market wants and what users want. And it's never the thing that you think. You always try a thing, but the idea is always a hypothesis and then almost always you have to pivot once, twice, three times.
BORIS: You know, at this medical software company — it was called Agile Diagnosis, an early YC company, back in maybe 2011, 2012, something like that — it was medical software for doctors. The idea was that there are these clinical decision protocols, and they vary a lot hospital to hospital. There was one hospital in Chicago that had a really great protocol specifically for cardiac symptoms. And so we were like, wouldn't outcomes be great if every hospital in the US used the same protocol? So we tried to standardize it, and we made this decision tree software for doctors to use. The team was just a few of us, a pretty small team, and I wrote some of the software. It ran in a web browser — this was back in the Internet Explorer 6 days, because that's what hospitals were using — and I wrote this SVG renderer for the visual decision tree. We launched it, and then we had a DAU chart, and the DAUs were flat, and we couldn't figure it out. At the time we were based in Palo Alto and we were piloting it with a few hospitals, including UCSF. I rode a motorcycle back then, so I rode up to UCSF and shadowed doctors for a couple of days just to see how they actually used it.
And I realized that doctors don't actually have time to sit down and use a computer. You're seeing a patient, then you have maybe 5 minutes until the next patient. In those 5 minutes you have to walk down the hall, go to the computer station, and open up this totally legacy computer. By the time it boots up, that's like 3 minutes. Then you open up Internet Explorer 6 — that takes like 30 seconds. Then you have to open up this app that we built and sign in, and your 5 minutes are up. You don't even have time to use it.
And so we rewrote everything to run on Android, and they still weren't using it. The thing we realized is doctors walk around with a bunch of residents behind them. It's a social situation, right? What matters is that they're seen as an authority; they don't want to be seen on their phones. And then we pivoted again. At that point, we were like, okay, maybe the doctor isn't the target user — maybe we wanted it to be used by nurses or X-ray technicians or something like this. At that point, I left, because I was like, "This is actually pretty far off from what I wanted to do."
Finding product-market fit — this is the most fun thing for me, because it's always surprising. You can't have one big idea, because the idea is probably going to be wrong. So you form hypotheses, you follow them down, and you see what's right.
GERGELY: Also, I find it so interesting how you're telling this story, because I feel like with a lot of startup successes, we only hear the success story — the path of how it went. But first of all, a lot of startups are like this. And second of all, what struck me is you were hired as a software engineer, right? And this was back before "product engineer" or anything like that was a term, which we're now talking about. But you rode your motorbike and you went there and you shadowed the people, and you understood how they were using it, why they weren't using it, getting ideas. I feel this is what made a great software engineer back then, and even today, right? It doesn't seem to me that you were focused on a technology — you were focused on the outcome.
BORIS: Yeah. I mean, look, there are different kinds of engineers and different ways to do it. Even on our team right now, I look at an engineer like Jared Sumar, and he's just an incredible technical mind. He understands systems better than anyone I've met. And you need people like this — people with this kind of depth. For me, engineering has always been a practical thing. I've always been a generalist, and it doesn't matter if I'm doing design or engineering or user research or whatever.
My first job I ever had — I think I was 16, and I just wanted to buy an electric guitar. So I started freelancing. I was like, "Okay, I guess I'll make websites." I think Fiverr wasn't a thing back then, so there were some other freelancing websites. I put up a website and started bidding on stuff. And my first paycheck, I spent the entire thing on an electric guitar. But it was very practical, right? Because in this kind of setup, you have to do the engineering, the accounting, the design; you have to talk to customers. It's just always been like that for me.
Lessons from Meta
GERGELY: After a couple of these startups, you ended up at Facebook, now called Meta, and you spent seven years there. Can you talk us through what you worked on there and what you learned? You also had very remarkable career growth — four promotions over seven years. What did you take away from that experience?
BORIS: Yeah, so I started on Facebook Groups. That was the first team I worked on — Vlad Klesnikov hired me. I think he's actually still at Facebook, on some other team now. And it was cool, actually — there was a big group of people I worked with who were these kind of early JavaScript people too, and I did a bunch of JavaScript stuff. It's funny, I kept crossing paths with these people. Vlad worked on Bolt.js, which was the framework that powered Ads Manager and was later replaced by React.js. And later on there were a bunch more people like this.
But anyway so I was working on Facebook Groups. I was really excited about it because of this mission of connecting people to their community. This is the thing that drew me in. And at the time I was a big Reddit user. I became a Reddit user back when I was a teenager because I didn't know anyone else that coded. Even in college, I didn't really know anyone that coded. And honestly, I was always kind of embarrassed about it because I thought it was this nerdy thing. And I thought it was kind of this thing that I knew how to do, but I wanted, you know, I wanted to be like a cool kid and like I couldn't tell people that I coded. It was very nerdy.
And at some point I discovered some programming community on Reddit and I was just shocked — like there's other people that are into this thing. It's like such a weird hobby. It's so niche and it was just so exciting to find like-minded people like this and get this connection. And so I just wanted to work on this. I wanted to kind of contribute to this in some way.
So I worked on Facebook Groups for a while. Then, you know, there were a bunch of different projects — I won't get into the details of any of these. Eventually I became the tech lead for Facebook Groups and kind of grew into this, and the org grew. The work really changed: it changed from building to a lot of doc writing and coordination and delegating to others. The culture was changing at the time — this early Facebook culture was disappearing. The docs were coming in, the alignment meetings were coming in. There was a lot more work around foundational stuff like privacy and security — things where, honestly, early on a lot of corners were cut in order to grow. At some point you just have to pay that debt, and that was the time when that happened.
Then I spent a few years at Instagram after, and that was also a funny story. My wife got a job offer and she was just really excited about it and she came to me and was like, "Hey, like I got this offer but we're going to have to move. Is that okay?" And I was like, "Yeah, that's fine." You know, like I work in tech, we can work remotely anywhere. Where's the job? And she was like, it's Nara. And I was like, where's that?
GERGELY: Nara is like rural Japan. And this was — different time zone as well.
BORIS: Different time zone. Yeah. It was like 12 hours or something different. Yeah, this was like 2021. And then I tried to kind of find a team that would sponsor me because there were these kind of arcane HR rules about the time zone you have to be in and the team you have to be collocated with and so on. And so there was a little nascent team for Instagram in Tokyo and Will Bailey was running this team. He was also the guy that made Instagram Stories and so he was my manager for a while and so we decided to grow that team together and I worked remotely from Nara and then most of the team was in Tokyo.
And during this time I started hacking on Instagram, and the stack was just insane. Facebook was the single best web-serving stack in the world. The way that everything was optimized — from the Hack language to the HHVM runtime to GraphQL as the transport layer to the client libraries like Relay and React — it was just amazing. There was no other dev stack in the world that was this good; it was just fully optimized.
And then I went to Instagram, and it's, you know, Python where the type checker didn't work and click-to-definition didn't work. It was this hacked-together Django plus a bunch of the Cython runtime, and just nothing really worked.
So I came to Instagram, I joined the Labs team in Japan and the idea was to find the next big thing for Instagram. We tried some stuff but what I very quickly realized is that I was just not effective at working on the stack because it was such a terrible stack and so I just went and started working on Dev Infra because we needed to fix it.
And there were a few projects that we worked on. One was migrating from Python into the big Facebook monolith. Another was migrating from REST to GraphQL. And these projects are actually still in progress — these are things that take hundreds of engineers many years to do. It's a big codebase; it's a big migration.
GERGELY: Now it's much faster.
BORIS: Yeah. With these tools that we have, the AI tools — and migrations are a pretty good use case for them.
GERGELY: Yeah. It's like the perfect use case for it.
BORIS: And then I just started getting deeper into this. By the time I left Instagram, I was working on dev infra and leading a bunch of these migrations. That's also where I crossed paths with Fiona Fun, who is now the manager for the Claude Code team. I worked with her, and she was just such an amazing leader, with this incredible depth and history in tech. I just thought there was no better manager for this team.
And then I also started working on code quality. And so the work on Instagram kind of expanded a bit. And by the time I left, I was leading code quality for all of Meta. And so I was responsible for the quality of the codebases across Instagram, Facebook, Messenger, WhatsApp, Reality Labs, kind of all these codebases.
At Meta, there was this program called Better Engineering. I think it started sometime around 2016 or 2018, when Zuck mandated that every engineer at the company spend 20% of their time fixing tech debt.
GERGELY: Oh, interesting.
BORIS: And we called this Better Engineering. Some of it was kind of bottom-up, where a team knows best the tech debt it has to fix, and some of it was top-down, where you need to do very big migrations — migrating to new language features, new frameworks, things like this. And at Facebook scale, there were tens of thousands of these migrations every year.
And so I just started leading all this, and I realized very quickly that it needed a little more order to it. There were no goals; no one knew what the outcomes were; there wasn't any tracking. So we developed a bunch of stuff. One of the ideas was a centralized way to prioritize the different code quality efforts. The second was figuring out the impact of code quality on engineering productivity, which turned out to be significant.
GERGELY: How did you measure — what did you find there?
BORIS: There was a bunch of stuff. I think some of this has been published — I don't know if all of it has — but essentially you do causal analysis and causal inference. That's the methodology: you try to figure out what the factors are that make engineers more productive. Some of it is code quality; some of it is outside of code quality. For example, Meta's move back to the office instead of work from home was partially driven by this, because we found some fairly strong correlations that we thought were causal. But quality actually contributes a double-digit percentage to productivity.
GERGELY: It turns out it matters even at the biggest scale. It's kind of comforting to hear, because I think it's rare to have a place where you actually measure this, but I think we feel it: when you have a clean, modular codebase, it's easier to work with. And could it also be easier for an LLM to work with? My hunch would be yes, it should be, right? There's just very little data, but that's the feeling I would have.
BORIS: Yeah, I think a lot of the big companies have published about this. Like I think Facebook published something. Microsoft publishes a bunch about this, Google does. But yeah, totally. If every time that you build a feature, you have to think about do I use framework X or Y or Z — these are all options that you can consider because the codebase is in a partially migrated state where all of these are around the code somewhere. As an engineer, you're going to have a bad time. As a new hire, you're going to have a bad time. As a model, you might just pick the wrong thing and then the user has to course correct you. So actually the better thing to do is just always have a clean codebase, always make sure that when you start a migration you finish the migration. And this is great for engineers and nowadays it's great for models too.
Joining Anthropic
GERGELY: And then you joined Anthropic. I've heard this story — which you can confirm or give more color to — that your first pull request was rejected by Adam Wolff.
BORIS: He was my ramp buddy. So I joined Anthropic — I was trying to figure out what to do next, and I met a bunch of people at all the different labs, and Anthropic was just the obvious choice for me because of the mission. That's the thing that personally mattered most to me. And also, just seeing all this change that's happening, it's important to have some sort of framework to think about it, and to think about our role in it.
I'm also a really big sci-fi reader — that's definitely my genre. I'm a big reader; I have, you know, a giant bookshelf at home and stuff. And I just know how bad this thing can go, and I felt like this is a place that has serious thinkers — people taking this very seriously and thinking about what we can do to make this thing go better.
So when I joined Anthropic, I did a bunch of ramp-up projects — just various stuff I was hacking on — and I wrote my first pull request by hand, because I thought that's how you write code.
GERGELY: That used to be how you write code.
BORIS: That used to be how you write code. But even at the time at Anthropic, there was this thing called Clyde, and it was the predecessor to Claude Code. It was super janky — it was Python, it took like 40 seconds to start up, it was research code, it was not agentic. But if you prompted it very carefully and held the tool just right, it could write code for you. So Adam rejected my PR and was like, "Actually, you should use this Clyde thing instead." And I was like, "Okay, cool." It took me like half a day to figure out how to use the tool, because you have to pass in a bunch of flags and use it correctly. But then it spat out a working PR. It just one-shotted it.
GERGELY: Oh, and this was like 2024.
BORIS: This was like September 2024, August, something like that. And I think for me, this was my first "feel it" moment at Anthropic because I was just — oh my god, like I didn't know the model could do this. Like I was used to these kind of tab completions, line level completions in an IDE. I had no idea that it could just make a working pull request for me.
The Origins of Claude Code
GERGELY: And then when you joined Anthropic — we've covered this in a deep dive, but could you recap briefly how Claude Code came to be, out of what seemed like a side project or just a cool hack?
BORIS: So yeah, I started hacking on a bunch of different stuff. I was working on some things in product. I worked on reinforcement learning for a little bit, just to understand the layer under the layer I was building on. This is still advice I give to a lot of engineers: always understand the layer under. It's really important, because it gives you depth and a few more levers to pull at the layer you actually work at. This was the advice 10 years ago; it's still the advice today. But the layer under is a little different now. Before, it was like: understand the JVM if you're writing Java, understand the JavaScript VM and frameworks if you're writing JavaScript. Now it's: understand the model.
So I was hacking on a bunch of different stuff. Some things shipped, some things didn't. At some point I just wanted to understand the public Anthropic API, because I'd never used it before. And I didn't want to build a UI — I just wanted to hack something up quickly, because we didn't have Claude Code back then; we were still writing code by hand. So I wrote this little batch tool, and all it did was hit the Anthropic API. It was essentially a chat-based application, but in the terminal, because that's what AI used to be.
And you know, I still think about it like this: engineers are the first adopters. So when we started to move from conversational AI to agentic AI, it took a little bit, but engineers understood it pretty quickly. I think now, when you ask non-engineers what AI is, they'd say it's conversational AI — a chatbot or something. And that's why I'm actually very excited for Cowork, this new product that we launched, because it's going to bring the same thing that engineers saw very early to everyone else.
But when I think about Cowork, I think back to this moment we're talking about. Very early on, Claude Code originally wasn't Claude Code — it was a chatbot, because that's what I thought AI was. But we had to figure out what the next thing was. So at the time I built this chatbot. It was somewhat useful, but it was just a chatbot.
And the next thing I tried was tools. Tool use had just come out, and I didn't know what it was, so I was like, let's experiment. I gave it a single tool, which was the bash tool, and I didn't know what to do with it. So I asked it — I actually didn't know if it could even do this — I asked it, "what music am I listening to?" And it just wrote a little AppleScript program, using osascript or whatever, to open up my music player and query what it was playing. It just one-shotted this with Sonnet 3.5.
This is actually my second "feel it" moment, very quickly after the first one. And the model just wants to use tools — that's just what I realized. Like this thing, if you give it a tool, it will figure out how to use it to get the thing done.
And I think at the time, when I think about the way people were approaching AI and coding, everyone essentially had this mental model: you take the model and you put it in a box, and you figure out the interface — how do I want to interact with this model? What do I need it to do? Essentially, if you have a program, you stub out some module, stub out some function, and you say, "Okay, this is now AI." But otherwise, the rest of the program is just a program.
And this is just not the way to think about the model. The way to think about it is: the model is its own thing. You give it tools. You give it programs that it can run. You let it run programs; you let it write programs. But you don't make it a component of a larger system in that way.
And I think, you know, this is a version of the bitter lesson. The bitter lesson is a very specific framing, but there are many corollaries to it, and this is one of them: just let the model do its thing. Don't try to put it in a box. Don't try to force it to behave a particular way.
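To make the contrast concrete, here is a minimal sketch of the "model drives, program serves" shape Boris describes — a loop whose only job is to execute the model's tool calls and feed results back, rather than embedding the model as a stubbed-out component of a larger program. This is an illustration, not Claude Code's actual architecture: the `model` callable and its `("bash", command)` / `("done", answer)` return format are hypothetical stand-ins for a real LLM client.

```python
import subprocess

def run_bash(command: str, timeout: int = 30) -> str:
    """The single 'bash tool' exposed to the model: run a shell
    command and return whatever it printed."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr

def agent_loop(model, user_prompt: str, max_turns: int = 10) -> str:
    """Minimal agent loop. `model` is a hypothetical callable that sees
    the transcript and returns either ('bash', command) to request a
    tool run, or ('done', answer) to finish. The loop never decides
    anything itself -- it only executes the model's tool calls."""
    transcript = [("user", user_prompt)]
    for _ in range(max_turns):
        kind, payload = model(transcript)
        if kind == "done":
            return payload
        output = run_bash(payload)           # let the model run programs
        transcript.append(("tool", output))  # feed results back, nothing more
    return "max turns reached"
```

The "model in a box" approach would instead hard-code the control flow and call the model only to fill in one blank; here the control flow itself belongs to the model, which is the shape the bash-tool anecdote above illustrates.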
GERGELY: One of the first ways you saw it was giving it tools, giving it access to the bash and then later to the file system and then to more tools. Right.
BORIS: That's right. We gave it bash first — I say "we," but it was just me for the first three months; then the team grew. So it was bash, then file edit — that was the second one.
GERGELY: And one of the interesting things we talked about last time, for the deep dive: when you built it and it started to actually write code with the tools it had, you had an internal debate inside Anthropic — should we just keep it to ourselves? Because it suddenly spread across engineering and was making all of you a lot more productive, right?
BORIS: Yeah, that's right. In the end, the decision was to release it so that we could study safety in the wild. And you know, I keep coming back to the word safety: the reason Anthropic exists as a lab is safety. It's the reason it was founded; it's the reason it exists. If you ask anyone at Anthropic why they chose it, it's because of safety.
And so if you think about model safety, there's different layers at which to think about it. There's kind of alignment and mechanistic interpretability. This is at the model layer. Then there's evals and this is kind of like putting the model in a petri dish and synthetically studying it. And then you can study it in the wild and you can see how it actually behaves. You can see how users talk about it. You can see what are the risks in the wild and you actually learn a lot this way. And by doing this we've been able to make the model much safer. So in hindsight it was totally the right decision.
GERGELY: It's amusing to hear about it from your perspective because from the outside what I saw and what a lot of engineers saw is like — oh, Anthropic released Claude Code, wow! For the first release with — I believe it was with Sonnet 4, was it — did it come out with Sonnet 4 originally or Sonnet 4.5?
BORIS: I think that was the general availability in February. But it was a research preview before that.
GERGELY: Yeah, but when it came out, my feeling was: oh, this thing can write code pretty well. And over time it became a lot more capable. So from our perspective, it was this really capable coding tool that we just started to adopt and use in increasingly productive ways. And it has become, I believe, one of the fastest-growing developer tools. I'm always surprised to hear that it actually comes from research and the goal of understanding how people use the model — because some startups have been trying to build developer tools deliberately to get adoption, and yet this research tool is getting a lot more adoption.
BORIS: I mean this is — Anthropic, we're a research lab, we're a safety lab, and product is this kind of thing tacked on to the side. Product exists so that we can serve research better and so we can make the model safer. And this is kind of how we think about everything.
There was also this funny moment early on when we had this launch review and were deciding whether to launch it. I remember the moment because we were in the room — I think there was Mike Krieger, there was Dario, there were some other folks — and we were deciding what to do. We were looking at the internal adoption chart, which was just vertical — it was insane. Nowadays it's just 100%: every technical employee at Anthropic uses Claude Code every day. It's pretty much 100%.
For nontechnical employees it's also getting quite close to 100%. It's increasing very quickly — like half the sales team uses Claude Code. And I think that's increasing. It's just crazy.
Dario had this question: how did it grow this fast? Are you forcing people to use it? And I was like, no — we offer this tool, people vote with their feet, and we just let people use the tool they prefer.
GERGELY: Yeah, they chose it. You don't seem like the person who's actually forcing people to use your tool.
BORIS: Yeah. I mean the way we did it, we just launched the thing and then we just listened to the users and we talked to people, we saw how they use it, we followed up, we made it better. And yeah, I mean now we're at the point where Claude Code writes I think something like 80% of the code at Anthropic on average. And it writes all of my code for sure.
GERGELY: Yeah. And for you this started — I think you first mentioned it in November — when it started to write all of your code. When did that switch come, and what happened to make you trust it to write your code? And how much do you review that code, for example?
BORIS: So the switch was instant when we started using Opus 4.5. This was before it came out — we were dogfooding it for a little bit — and it happened right away. It's just a much more capable model. I found that I didn't have to open my IDE anymore, so I uninstalled it — I actually did that like a month later, because I didn't even realize I wasn't using it anymore.
GERGELY: Yeah, a lot of us had similar experiences once Opus 4.5 was out in public, and especially over the winter break. I had a similar experience. I realized that, if I'm being honest with myself, it writes code as good as I would have written in a stack I'm very familiar with — my codebase, my side projects — and a lot better than I could for codebases or technologies I'm not as familiar with.
BORIS: Yeah. I'll be honest, it writes better code than I do.
GERGELY: I don't want to go there. I still like to keep my pride, but probably true.
BORIS: Yeah. I realized this because also in December, I was traveling a little bit. I was on a coding vacation. We were talking about this before, but I went to Europe. We were just in a different time zone kind of nomading around. And it was so fun because I was just coding all day every day, which is my favorite thing to do. And I wrote maybe, you know, like 10–20 pull requests every day, something like that. Opus 4.5 and Claude Code wrote 100% of every single one. I didn't edit a single line manually. And I realized at the end of that month, Opus introduced maybe two bugs whereas if I had written that by hand that would have been, you know, like 20 bugs or something like that.
Boris's Claude Code Workflow
GERGELY: Can we talk about your development workflow? You have written threads about this, which is awesome — it's on social media, on Threads and on X. But can you tell us how you use Claude Code today, in terms of parallelism and the tips and tricks that you and the team have learned and shared?
BORIS: Yeah, I mean look there's no one right way to use Claude Code. So I can share some tips and things but I think the wrong conclusion to draw would be to just copy these and use it. The way we build Claude Code is we build it to be hackable because we know every engineer's workflow is different. There's no one way to do things. There's no two engineers that have the same workflow.
GERGELY: It's like every engineer's workstation setup, right? Keyboards, monitor placement, all that. Everyone has it differently.
BORIS: Yeah. We're craftspeople, right? You choose your tools. We care deeply about it. So there's no one right way to do it.
So for me, the way that I do it generally is I have five terminal tabs. Each one of them has a checkout of the repository. So it's five parallel checkouts. And usually I'll kind of round-robin and start Claude Code in each one. Almost every time I start in plan mode. So that's like Shift+Tab twice in the terminal.
And I also overflow as I run out of tabs because there's only so many terminal tabs. I used to use web a lot for this — like claude.ai/code, that's the place that I overflow to. Nowadays I actually use the desktop app. It's more convenient. So Claude Code, it's been in our desktop app for many months. It's just a code tab in the Claude app. And I actually really like it because it has built-in work tree support. So that's existed for a while. And that's quite nice for parallelism. So you have multiple — you don't need multiple checkouts. You just have one and then we automatically set up Git work trees for you. So you get this kind of environment isolation.
The reason I do that is I actually just really hate fiddling with Git work trees on the command line because it's kind of fiddly.
GERGELY: Like you need to know the cd and git worktree commands — for those who are not as familiar with it: instead of having a separate local folder, it almost checks out a separate branch that you can work on separately, with the complexity only showing up at merge time.
BORIS: That's right. Imagine that you have a folder but you have maybe like — Git makes five copies of that folder in a way that's very cheap and kind of easy to throw away. So you get this kind of isolation. It can work in parallel and the Claudes don't interfere.
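In concrete terms, the worktree setup Boris describes looks roughly like this on the command line (the repo here is a throwaway demo so the commands actually run; in real use you'd start from your own checkout):

```shell
# Demo in a throwaway repo (real usage: your own checkout).
repo=$(mktemp -d)/demo
git init -q "$repo" && cd "$repo"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# Create a linked worktree per task, each on its own branch.
# Worktrees share one object store, so they are cheap to create.
git worktree add ../demo-task1 -b task1
git worktree add ../demo-task2 -b task2

git worktree list   # main checkout plus both worktrees

# When a task is merged, throw the worktree away.
git worktree remove ../demo-task1
git branch -d task1
```

Each worktree is an isolated working directory, so a Claude running in one can't step on files another is editing.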
GERGELY: Yeah. So you now have support for this — I think you recently added native support — but for your workflow you just stuck with the old way of checking out separate folders, right?
BORIS: Yeah, exactly. I actually find over time I'm using the desktop app more and more for this. Just because I don't need these separate checkouts and I just have a bunch of Claudes running in parallel and I don't have to think about it.
The other surprise hit is the iOS app for me. Every day I start — like I wake up and I just start a few agents on my phone.
GERGELY: Oh, the native one.
BORIS: Yeah, the native one. It's just the Claude app. It's the code tab in the Claude app and it's the same exact Claude Code. Yeah, except it runs in the cloud.
GERGELY: It runs in the cloud. Yeah. So you have to kind of configure the environment.
BORIS: Luckily, our environment is pretty simple. And we just use hooks for it. So you just use the session start hook and configure it. This is kind of one of the benefits of making Claude Code really hackable — it's very easy to do this kind of configuration.
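A session-start hook of the kind Boris mentions lives in a project's `.claude/settings.json`. This is a sketch of the shape as I understand it from the hooks documentation — check the current docs for the exact schema, and treat the `npm install` command as a stand-in for whatever your environment needs:

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "npm install" }
        ]
      }
    ]
  }
}
```

Because the hook is just configuration checked into the repo, the same environment setup runs whether the session starts on a laptop or in a cloud sandbox behind the mobile app.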
And this is something honestly I would never have predicted because I code on a computer. If you told me six months ago I'd be writing — I don't know, a third, I haven't pulled the data, maybe like a third or half, something like this — of my code on a phone. That's crazy. But that's what I'm doing today.
Parallel Agents
GERGELY: And you're using parallel agents. At what point did you start using them, and how has it changed your work? One thing I notice in myself: I don't really use that many parallel agents — maybe two at a time. But I'm someone who likes to be in charge, and Claude is a tool you can follow along with. It tells you what it's doing. There's also, for example, learn mode, which shipped a lot earlier, where you can actually follow along — it gives you tasks. I feel that staying in one tab and following along with the model is pretty fast as well; I can keep in touch. I'm assuming at some point you must have worked this way too — so what happened when you changed to parallel? Do you feel you're losing any control, or does it not really matter that much?
BORIS: Yeah, I think there are two modes, or two workflows, to think about. When you're new to a codebase, I highly recommend learn mode. It's awesome. I highly recommend it for people onboarding to the Claude Code team, people onboarding to Anthropic. For people that haven't tried it: you do /config in Claude Code, you pick the output style, and you can choose "learn" or "explanatory." We usually recommend explanatory because that tends to be better for new codebases you haven't been in before.
For me, once you're familiar with the codebase you just want to be productive, right? Like you just want to ship as much as you can and you want to kind of be effective doing that. So the role really switches. I don't really go deep into tasks anymore. I start a Claude in plan mode. I'll have it kick something off.
With Opus 4.5, I think it got there. With 4.6, it just really really does it. Once there is a good plan, it just — it will one-shot the implementation almost every time. So, the most important thing is to go back and forth a little bit to get the plan right.
So, what I do is I start one, I enter plan mode, I give it a prompt. As it's chugging along, I'll go to my second tab and I'll start the second Claude also in plan mode. Get it chugging along. Then go to the third tab, go to the fourth one. Then maybe I'll go back to the first one when I get notified that it's done.
GERGELY: And then — do you have notifications on or do you turn them off?
BORIS: I actually operate in both modes. Sometimes I do like focus mode on the Mac. So I just have it off, but also sometimes I use the system notifications.
GERGELY: And you're very, very productive with PRs. It was very visible — even around the holiday breaks, on social media, you were responding to — I think someone reported a bug or a feature request, I'm not sure which — and an hour or two later it was done, because you did it. You've also talked about the number of pull requests you do in a day — not to show off, just as context. What does a pull request typically involve in terms of complexity? Are some super trivial, or are some actually larger pieces of work as well?
BORIS: Yeah, pull requests — each one varies a lot. Sometimes it's a few lines, sometimes it's a few hundred or a few thousand lines. They're all just very very different.
It's changed so much. Like back when I was at Instagram, I think I was one of the top two, maybe top three most productive engineers at Instagram just by volume of code written.
GERGELY: Oh wow.
BORIS: So I've always — for me, I've always just coded a lot. This is a way I can express myself, and it's just the way my brain thinks. And now I just get to do it. But I think with Claude Code, the number of PRs sort of under-tells what's happening, because for people who were very productive in the old days, before AI assistants, a lot of the code was probably code migrations or something like that. People who shipped 20–30 PRs every day — a lot of that was pretty much one-liners, or migrating A to B, or whatever.
Nowadays I ship 20–30 PRs every day but every PR is just completely different. Some of them are thousands of lines, some of them are hundreds, some of them are dozens, some of them are one-liners. None of these are kind of code migrations because actually Claude just does those and I don't need to be part of that.
Code Reviews
GERGELY: Shipping this much code or being this productive — the obvious question that comes up for any software professional is the review. The way teams used to work — and I'm not sure if Instagram did this but a lot of other companies did — is you make a pull request, you put it up there, there's a mandatory human reviewer. At Google there's actually two because there's one on code quality as well. How has this workflow changed? How does the Claude Code team think about code review and how has it changed over time?
BORIS: Yeah, I'll start by talking about how code review used to work for me. So the way that I used to do it is — I also used to be one of the most prolific code reviewers.
GERGELY: Oh, okay. So both.
BORIS: Yeah. Right. And that's actually one of the benefits of being in a different time zone. Like I'm not super human. I just didn't have any meetings.
And the way that I approached code review is: every time I had to comment about something, I would drop it in a spreadsheet and describe the issue. Let's say someone named a function parameter badly — I would put that in a spreadsheet. If someone used a bad React pattern, I would put that in a spreadsheet. And then over time I would tally up the spreadsheet, and any time a particular row had more than three or four instances, I would write a lint rule for it — just automate it.
And so that's what it used to look like for me. I've always tried to automate myself away because there's just so many things to do. And this is one of our superpowers as engineers — we are able to automate all of the tedious work. There's very few other fields where you're able to do this thing. This is a thing uniquely that we're able to do. And this is a thing that I've just always enjoyed because it gives me more free time and I get to do the work I actually enjoy.
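The spreadsheet-to-lint-rule loop can be sketched in miniature. Suppose one recurring review comment is "remove stray console.log calls"; a hypothetical check (this is illustrative, not Meta's or Anthropic's actual tooling) might be:

```typescript
// A recurring review comment, turned into a mechanical check.
// Hypothetical rule: no stray console.log calls in committed code.
interface LintFinding {
  line: number; // 1-based line number of the finding
  message: string;
}

function lintNoConsoleLog(source: string): LintFinding[] {
  const findings: LintFinding[] = [];
  source.split("\n").forEach((text, i) => {
    if (/\bconsole\.log\s*\(/.test(text)) {
      findings.push({
        line: i + 1,
        message: "remove console.log before committing",
      });
    }
  });
  return findings;
}
```

Once a comment shows up three or four times in the tally, a rule like this runs in CI instead, and the reviewer never has to type it again.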
And so today the way this looks is a little different, but it mirrors this a little bit. So when Claude Code writes code, it generally will run tests locally. And this is something Claude just often decides to do when it's relevant or it'll write new tests. So you kind of do this verification. When we make changes to Claude Code, Claude will also test itself. So it'll launch itself kind of in a subprocess. It'll verify itself and it'll test itself end to end.
GERGELY: This is for your internal Claude Code implementation. So you have a test suite so it can test itself.
BORIS: Yeah, that's right. That's right. But it'll literally launch itself just in a bash process and kind of just see like — hey, do I still work?
GERGELY: Wow. Okay.
BORIS: So it'll do this and this is something that we just didn't code in — like it just with Opus 4.5 especially, it just sort of spontaneously started doing this. It just wants to check.
So we do this and then we also run Claude in CI. So this is the Claude Agent SDK in CI. So every pull request at Anthropic is code reviewed by Claude Code. And that actually catches maybe like 80% of bugs, something like this. And it's the first round of code review. Claude will automatically address some of these. Some of them it'll leave to a human because it's not sure what to do. There's always an engineer that does the second pass of code review. And there always has to be a person in the loop approving the change.
GERGELY: So on the team before anything goes into production, if you will, an engineer does look at it.
BORIS: Yes.
GERGELY: As you're thinking of code review — would you do this for every type of project, or is this specifically because you know this has real-world impact and people depend on it? Let me put it the other way around: can you see places where you would just not have an engineer review code? What situations would those be?
BORIS: I think it depends how it's used — yeah, I'd agree with that. If you're building some personal side project, you can just yolo straight to main. Even before AI you would not have reviewed it. You just trust yourself, ship to production, or SSH into production and make some changes.
GERGELY: That kind of stuff. Right.
BORIS: Exactly. The very first versions of Claude Code that were internal — I committed straight to main. But then as soon as you have users — and for Anthropic our main customer base is enterprises — this is what we care about the most. For safety reasons, security is really important, privacy is important. These are all related. It's also very important for our customers. And so because this is an enterprise product, it has to be secure. We have to make sure that it meets a certain bar. So we definitely use a lot of automation, but at least for now, there has to be a human in the loop just to make sure.
GERGELY: One thing that's just known about LLMs is that they're nondeterministic. And by putting the LLM in as a reviewer — Claude doing a review — it will give good feedback, but you cannot be sure that even if it's capable of catching an issue, it will necessarily catch it. Are you doing anything deterministic in this loop? For example, linting is very deterministic, as you well know. Have you thought of marrying some of these ideas — are you using, for example, linters on the codebase, or have you found no need for them?
BORIS: Yeah, absolutely. We have type checkers, we have linters, we run the build. Claude is actually so good at writing lint rules. What I do now — I used to tally stuff up in a spreadsheet — is when a coworker puts up a pull request and I think "this is lintable," I'll just tag Claude on their PR: please write a lint rule for this. You just run /setup-github in Claude Code and it installs the GitHub app, which makes it so you can tag Claude on any pull request or any issue. I use this every single day. Very useful.
So you want these deterministic steps. Also though there are ways to get Claude to be a little bit more deterministic. So for example, you can do best of N. You can have it do multiple passes and this is actually quite easy to do. So the code review skill that we use internally — it's open source and it's available in the Claude Code repo — and all we do is we launch parallel agents to do stuff and then we launch parallel deduping agents to check for false positives. But essentially best of N — the way you implement it is all you say is "Claude, start three agents to do this" and that's it.
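The best-of-N shape is easy to sketch outside of any particular SDK. Here `reviewer` and `dedupe` stand in for real model calls — the names and signatures are mine, not the internal skill's:

```typescript
type Agent = (prompt: string) => Promise<string[]>; // returns findings

// Fan out N reviewer agents in parallel, pool their findings, then
// run a deduping pass over the pooled list. The dedupe step is where
// repeats and false positives get filtered out.
async function bestOfNReview(
  reviewer: Agent,
  dedupe: (findings: string[]) => Promise<string[]>,
  diff: string,
  n: number,
): Promise<string[]> {
  const rounds = await Promise.all(
    Array.from({ length: n }, () => reviewer(`Review this diff:\n${diff}`)),
  );
  // Pool every round's findings into one flat list for the dedupe pass.
  return dedupe(([] as string[]).concat(...rounds));
}
```

In the real skill the dedupe pass is itself a set of parallel agents checking for false positives; in prompt form the whole thing reduces to "start three agents to do this," as Boris says.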
Claude Code's Architecture
GERGELY: How does Claude Code work in terms of architecture? So as an engineer, how can I imagine its setup? We covered some of this in the deep dive and I think you told me that you had some pretty complex ideas when you started and you just simplified a lot of it.
BORIS: Yeah. It's very simple — there's not much to it. There's a core query loop. There are a few tools that it uses. We delete tools all the time, we add new tools all the time; we're always experimenting. So there's this core agent part, then there's the end-to-end part, and then there's actually a ton of different pieces around security — making sure that everything Claude Code does is safe, and that there's a human in the loop when it's needed.
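As a rough illustration of what a core query loop looks like (this is a toy, not Claude Code's actual code — the model and the tools are mocked):

```typescript
// Minimal agent loop: ask the model, run any tool it requests, feed
// the result back, repeat until the model answers in plain text.
type ToolCall = { tool: string; input: string };
type ModelTurn = { text: string } | { call: ToolCall };
type Model = (transcript: string[]) => ModelTurn;

function queryLoop(
  model: Model,
  tools: Record<string, (input: string) => string>,
  userPrompt: string,
  maxTurns = 10,
): string {
  const transcript = [`user: ${userPrompt}`];
  for (let i = 0; i < maxTurns; i++) {
    const turn = model(transcript);
    if ("text" in turn) return turn.text; // final answer, loop ends
    const result = tools[turn.call.tool](turn.call.input);
    transcript.push(`tool(${turn.call.tool}): ${result}`);
  }
  throw new Error("ran out of turns");
}
```

Everything else — permissions, sandboxing, classifiers — wraps around a loop of roughly this shape; the tools themselves stay swappable.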
GERGELY: And by safety, do you mean as a user when it's doing stuff on my computer — or also as Anthropic monitoring use cases that could be deemed unsafe?
BORIS: Yeah, there's kind of a couple versions of this. Safety — there's many, many layers. And for things like safety and security, there's no one perfect answer. So it's always a Swiss cheese model. You just need a bunch of layers and with enough layers, the probability of catching anything goes up. And so, you just have to kind of count the number of nines in that probability and pick the threshold that you want.
And so for something like prompt injection for example, we do this generally at three different layers. So let's think about something like web fetch. So Claude fetches a URL and it reads the contents of that web page and then it does something in Claude Code. So one of the risks for something like this is prompt injection. Maybe there's an instruction on that website to be like — hey Claude, delete all the folders — or something like that.
So we think about this in a number of ways. The most basic way is it's an alignment problem. And so Opus 4.6 is the most aligned model we've ever released because we've taught the model how to be more resistant to prompt injection. And so you can read about this on the model card and I think it was part of the release.
The second part is that we have classifiers at runtime where if there is a request that seems to be prompt injected, we block it and we just make the model try again.
And then the third layer is for something like web fetch, we actually summarize the results using a sub agent and then we return that summary back to the main agent. So again, this kind of reduces the probability of prompt injection.
And so you can kind of see how this isn't just one mechanism. It's a layer — and by having a bunch of these different layers, it just reduces the probability a lot.
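Two of those layers can be sketched together. This is purely illustrative — the classifier and the summarizer here are stand-ins for model calls, and nothing about the real implementation is implied:

```typescript
// Swiss-cheese layering for web fetch: a runtime classifier can veto
// a suspicious page, and a subagent summarizes the untrusted content
// so the raw bytes never reach the main agent's context.
function guardedWebFetch(
  fetchPage: (url: string) => string,
  looksInjected: (content: string) => boolean, // runtime classifier
  summarize: (content: string) => string,      // subagent pass
  url: string,
): string {
  const raw = fetchPage(url); // untrusted content
  if (looksInjected(raw)) {
    return "[blocked: possible prompt injection]"; // model can retry
  }
  return summarize(raw); // only the summary flows back to the agent
}
```

Each layer is fallible on its own; stacked, the probability of an injected instruction surviving all of them drops sharply.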
GERGELY: One interesting technical choice you've also mentioned is RAG — retrieval-augmented generation. You mentioned how an earlier version of Claude Code used a local vector database to speed up search, and you later threw this away. Can you talk about that? Was this another example where, I guess, the model got better?
BORIS: Yeah, I mean this is one of those things where we try so many different things. We try so many different tools and just statistically most of them we throw away. Even something like the spinner in Claude Code — I think it's gone through like a hundred iterations, I want to say.
GERGELY: Oh, just the spinner.
BORIS: And out of those we've landed maybe like 10 or 20 in production and 80 of them I probably just threw away because it didn't feel good enough. So just statistically almost all the code we write we throw away because it's just so easy to write this code and try stuff and see what feels good.
So for something like RAG, we tried a bunch of different approaches early on. The first one was RAG for retrieval because I was just reading up on how people were doing retrieval and it seemed like all the papers were talking about RAG. And so the way I did it was a local vector database. I think it was written in TypeScript and it just lived on the user's machine. And then I was using some embedding model in the cloud to compute the embeddings before storing it.
And that worked pretty well, but there are a lot of issues with RAG. For example, I was finding that the index drifted out of sync with the code. If I write a local function, it's not yet indexed, so RAG isn't going to find it. There's also the question of how exactly the index is permissioned. Who can access it? I can access it — but how do we encode that in permission policies? How do we make sure no one else can access it? How do we make sure that if there's a rogue IT person within the company, they can't access someone else's data? It's really, really important that we think about this.
And so we just decided like it was sort of working, but it also has a lot of downsides. And so we tried a bunch of other stuff. One of them was just using the model to kind of index everything recursively. That was kind of a cool idea. There was another version where we just tried glob and grep. We tried a bunch of different stuff.
It turned out that agentic search just outperformed everything. And when I say agentic search, this is a fancy word for glob and grep. That's all it is.
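Glob and grep as agent tools fit in a few lines. This sketch works over an in-memory file map so it's self-contained; the real tools hit the filesystem (and typically shell out to fast search utilities):

```typescript
// "Agentic search" in miniature: glob over paths, grep over contents.
// The agent composes these itself — glob to find candidate files,
// grep to find candidate lines, then it reads the hits.
type Codebase = Record<string, string>; // path -> file contents
type GrepHit = { path: string; line: number; text: string };

function glob(files: Codebase, pattern: RegExp): string[] {
  return Object.keys(files).filter((p) => pattern.test(p));
}

function grep(files: Codebase, pattern: RegExp): GrepHit[] {
  const hits: GrepHit[] = [];
  for (const path of Object.keys(files)) {
    files[path].split("\n").forEach((text, i) => {
      if (pattern.test(text)) hits.push({ path, line: i + 1, text });
    });
  }
  return hits;
}
```

Nothing to index means nothing to drift out of sync and nothing extra to permission — the tools see exactly the files the user can already see.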
GERGELY: Nice. So the model both got good enough and you realized that it can use these tools pretty efficiently.
BORIS: Yeah. And this was partially inspired, honestly, by my experience at Instagram, because at Instagram click-to-definition didn't work — the dev stack was just borked like half the time. So what engineers did instead, say you're looking for the definition of the function "foo": instead of click-to-definition, you would use the global index, which is quite good at Meta, and search for "foo" followed by an opening parenthesis. And this worked pretty well.
GERGELY: And it's funny because this works for the model pretty well too. Interesting how one idea from one area can come to the other.
Permissions and Sandboxing
GERGELY: One of the more advanced parts of Claude Code that we've also previously talked about is the permission system. Can you talk about what was complex about it? And also you recently open-sourced sandboxing, right?
BORIS: Permissioning is really complex. Like everything else that has to do with security, it's a Swiss cheese model. There are a number of classifiers that run to make sure the command is safe. And there's also static analysis that we do to make sure the command is safe.
As a user, you can also allow-list particular patterns that you know to be safe. So for example, some standard Unix utilities we pre-allow because we know they're read-only, because we know they can't exfiltrate your data or anything like this. So we just won't prompt you for permission.
But actually quite few tools fall into this category, because even something like the find command — there's a way to execute arbitrary code as part of that command, because there are flags you can use for this. Or even something like the sed command — there are ways to use it. There's just all this arcana about these various Unix utilities where they're actually not as safe as you think.
And so we want to be fairly conservative about what we allow by default. As a user, though, you can configure an allow list. You can say, for example, these patterns are allowed and these patterns are not. We let you define that, and we also check the allow list to make sure it's safe.
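A toy version of that kind of allow-list check, assuming nothing about the real implementation — it mainly shows why a prefix match alone isn't enough, using find's escape-hatch flags as the example:

```typescript
// Illustrative allow-list check: auto-approve a command only if its
// binary is on the allow list AND it trips none of the flags that
// turn a "read-only" tool into arbitrary code execution.
const allowedBinaries = ["ls", "cat", "grep", "find"];
const escapeHatches = [/-exec\b/, /-execdir\b/, /-delete\b/]; // find's sharp edges

function needsPermissionPrompt(command: string): boolean {
  const binary = command.trim().split(/\s+/)[0];
  if (!allowedBinaries.includes(binary)) return true; // unknown tool: ask
  if (escapeHatches.some((f) => f.test(command))) return true; // escape hatch: ask
  return false; // conservative read-only use: auto-run
}
```

The real system layers classifiers and static analysis on top of pattern matching like this; one check alone is never the whole story.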
GERGELY: Yeah. And then you have this neat permission system where every time you run a command that needs permission, you can decide to run it once or run it for either this session or whatever it makes sense or just globally allow going forward. Right.
BORIS: That's right. This is a funny artifact. This was actually in the very very first version of Claude Code. This is the way permissions worked. This was the very first release. This was like September 2024, the first internal release.
I remember at the time we weren't sure whether agentic safety could even be solved. And so there was actually a lot of pushback internally from safety teams because they were like — okay, you can't just let the model run bash commands, that's unsafe. So what do you do? This is not a solvable problem so we can't launch this.
I brainstormed with Ben Mann and Ben — he started the Labs team. He's one of the founders at Anthropic. He's actually the person that hired me to Anthropic. We just came up with permission prompts as the way to do this. You put the — if you're not sure, just ask the human and they can decide.
Engineering Culture at Anthropic
GERGELY: I wanted to ask how software engineering is done in general at Anthropic. One of the first questions — a more formal one from the outside — is titles, or the lack of them. Everyone at Anthropic has the same title: Member of Technical Staff. Why did this happen, and what does it result in — basically no titles, right? Except for one?
BORIS: I think it's kind of an acknowledgement that everyone is just figuring stuff out. And if you kind of squint and look at the work people are doing, it's all quite similar and quite generalist. If you talk to the average software engineer, they might not just be doing coding. They might also be doing a little design. They might also be talking to users. They might be writing their own product requirements. They might be writing software and also doing research. They might be writing product code and also infrastructure code.
At Anthropic there's a lot of generalists. This is also from my background. This is one of the reasons that I gravitated towards it. And I think "Member of Technical Staff" just kind of encodes this in the way that people talk to each other even if they don't know each other.
Without this title the default would have been — I see your name on Slack and under your name it says "software engineer." And then I'm like, well okay, I guess you're the coding person then. So I'm not going to ask you product questions. But when everyone's title is "Member of Technical Staff," by default you assume everyone does everything. And so it kind of inverts this relationship between people even if you don't know each other well yet.
In a way, it's kind of this like optimism built into the structure. I think it's also a glimpse of the future because I think this is where software engineering is going. I think this is where every discipline is going — more of this generalist model.
GERGELY: It definitely feels like it in software engineering. I heard this funny comment by Marc Andreessen about a Mexican standoff happening in the tech world: the designers are saying they're now doing PM and engineering work, the engineers are saying they're doing design — everyone thinks they're doing the others' work, standing there like "I'm doing your work as well." When the reality is everyone's role is expanding — mostly thanks to AI, because it makes it easier for an engineer to do product work, or for a product person to do engineering work, and so on.
BORIS: On just what you've said — I remember back in June or July of last year, I walked into the office, and there's a row of data scientists that sat right next to the Claude Code team, at least at the time. Our data scientist for the Claude Code team had Claude Code up on his monitor and was using it, and I was like — this is interesting, you're a data scientist, why are you using a terminal? You didn't even have Node.js installed — we depended on Node.js back then. I asked, "Are you dogfooding it? Are you just trying to figure out how this thing works?" He said, "No, no, I'm using it to run queries." He was just using it to run SQL, and it had little ASCII visualizations in the terminal.
And then the next week the entire row of data scientists had Claude Code running on their computers. And this expanded and so if you look at the team today on the Claude Code team — everyone codes. The engineers code. Our engineering manager codes. Designers code. Data scientists code. Our finance guy codes. Everyone on the team codes.
And I think part of it is Claude Code just makes it so easy. So you don't really have to understand the codebase. You can just dive in and make small changes quite easily. But I think another thing is people are able to use Claude Code to do their jobs more — whether it's financial forecasting or data science or whatever. And by doing this, it's actually quite an easy crossover to just use it to write a little bit of code also. So it's just a way to dip your toe in the water.
GERGELY: One other interesting thing about how you work — Cat Wu was talking about this. I guess the title is the same, but people might gravitate toward a role a bit more, and I understand she's a little more on the product side. You said that PRDs just aren't really written inside Anthropic. PRD — product requirements document. It's a well-known artifact across big tech, and increasingly at larger startups, where you write a spec: the idea is that you write down your thoughts, people align, you send it over, and now you know what to build. But apparently you're not doing much of this, or any at all.
BORIS: Some of this I think is because Anthropic is still a startup. So you don't actually have to align with that many people usually. You can just kind of talk about it or do it in Slack or whatever. But yeah, also part of it is Cat used to be an engineering manager. She's extremely technical and I think this is the way that our product team thinks about it too — better send a PR.
GERGELY: You do a lot of prototyping instead. That's also something from when we talked about how you were building Claude Code early on — you actually had a whole thread about it. I think you did 15 or 20 prototypes for the to-do list, all of them interactive and working. And what surprised me, compared to my past tech experience: you said you did this in about a day and a half — all 20. Tried them out, got a feel for them. Which is incomprehensible to me — it would have taken a week or two, and people wouldn't have done 20, they would have done three.
BORIS: Yeah.
GERGELY: So are you seeing this? Is there an increase in prototyping and building and showing instead of writing things?
BORIS: Absolutely. I mean on our team the culture is we don't really write stuff. We just show.
It's a little hard to reflect back on the time before because I think now just prototyping everything is so baked into the way that we build. Just everything is prototyped multiple times.
Like we launched Agent Teams earlier this week. This is our implementation of swarms. It's very exciting because it just lets Claude do more work for longer, more autonomously. You have a bunch of different uncorrelated context windows and you have this kind of communication between agents. They can just do more.
This is something that Daisy, Suzanne, Karen, and other folks on the team prototyped for months — they tried probably hundreds of versions all in all before they got a user experience that felt really good. It was just really, really hard to get right. There's no way we could have shipped this if we had started with static mocks in Figma or with a PRD or something like that. It's a thing you have to build, you have to feel, you have to see how it feels.
GERGELY: And to me, one of the big takeaways even from that was: we should probably prototype more, be more daring, and let go of our priors about how long it takes to build a prototype or who needs to build it. Back then it was always an engineer who needed to build it, but that's probably not true anymore.
BORIS: Yeah, that's right. I mean, we're in this world right now also where we just don't know what the right answer is. I think back in the old way of building, the cost of building was high and so you had to actually spend a lot of effort to aim very carefully before you take your shot because after you take your shot, it's very hard to course correct. You can only take so few shots.
But now it's changed. The cost of building is very low. But also we don't know where we're aiming. So we just have to try and we have to see what feels good. And it's just very very exploratory. And I think also a big part of it is humility where personally I'm wrong like half the time I'd say. Like most of my ideas are bad. At least half of them are bad. And I don't know which half until I try it.
GERGELY: And you get feedback from others as well sometimes.
BORIS: That's right. It's like I have to try it myself and then I have to see what others think because my intuition does not always match others.
GERGELY: When you were showing these prototypes of how the tasks were built, you told me that your process was always: you built the prototypes, you first looked at them yourself, tried them out, got a feel for them, and then for the ones that felt good, you showed them to others. Sometimes they'd give you feedback like "nah, this doesn't work," and sometimes, when it felt good, you shared it even more broadly. So it's a mix: sometimes you can decide on your own, sometimes you get feedback, and eventually some good ideas come out of it.
BORIS: Yeah, and there are a lot of examples of this. Like, we launched this kind of condensed view for file reads and file search, just because the model is so agentic now. I felt like half the screen was these file reads, and I actually don't care — it read a file, I don't really care what's in it. And so we condensed this down to make the output a little more readable.
I really liked it after probably 30 prototypes or something like that. It took so much effort to make it feel really good and clean. We rolled it out to employees at Anthropic for about a month, had everyone dogfood it, and I fixed probably another dozen bugs and tweaks based on all that feedback. We launched it externally and almost all users liked it, but there were a few users that didn't, because they wanted more expanded output.
And so on the GitHub issue I was just going back and forth with people to be like — what don't you like — and people gave a lot of feedback. I shipped another version. Then some people liked it, some people didn't. And so I iterated again and kind of made it good. And it's actually I think almost there where people can configure it the way that they want, but still the default is really good. But this is just the process. We get it right some of the time. We have to learn from our users. We want to hear from people so we can get it right.
GERGELY: Do you use ticketing systems for your work where you capture like — all right, here's the work I want to do — or do you just pretty much do the work as it comes in?
BORIS: So at Anthropic, we leave it up to teams. On the Claude Code team, we leave it up to every person. Different people use this differently. For example, I don't use a ticketing system. Some people like to use Asana or Notes or something like this.
One of the coolest things I saw — this was maybe 3 months ago. We launched plugins, and the way we launched that is: Daisy had a very early version of swarms, and for a weekend she let the swarm run. She told it: your job is to build plugins. You have to come up with a spec, then make an Asana board and split the work into tasks, and then all the different agents have to build it. She set up a container, set up Claude in dangerous mode, and let it run for the entire weekend. It spawned a couple hundred agents. They made 100 tasks on the Asana board, and then they implemented it. And that's pretty much the version of plugins that we shipped.
These kinds of coordination systems used to be for humans, but I think nowadays they're just as much for models.
Claude Cowork
GERGELY: Let's talk about Claude Cowork. It's one of the very impressive recent releases — it looks great. I tried it out. It's inside Claude: you have the Cowork tab there, and it's a much more visual way of running agents and interacting with them. One of the surprising things I heard is that it was built in 10 days. Can you take us through what it took to build it, and what that actually means? Was it 10 days from the idea, or from the decision to build it? And how big was the team building it?
BORIS: The team was really small. It was just a few people.
For a long time, we felt that there was some product to be built for non-engineers. The reason we felt this is that, for a long time, some of the people using Claude Code were non-engineers. And in the product world, when you see latent demand — when you see people jumping through hoops to use a product that was not designed for them — that's a really good sign it's time to build another product that is built just for them.
There's all these people on Twitter — there's this one guy that was using Claude Code to monitor his tomato plants. I just love this. He had a webcam set up, and every day Claude was monitoring it and going, "Oh my god, I'm so happy that our plant is budding." It was so happy that the tomatoes were growing.
There was someone that was using Claude Code to recover photos off of a corrupted hard drive and it was his wedding photos.
You know, like I said, our entire finance team at Anthropic uses Claude Code. Our sales team uses Claude Code. So there's just all these people that are non-engineers that were using it.
And at that point Claude Code — it's available in a lot of form factors, right? We started in a terminal, then we expanded and we added support for IDEs. So we have extensions for every VS Code based IDE, every JetBrains based IDE. There's also iOS and Android apps. There's the desktop app. There's web. Then there's Slack and GitHub apps. So we kind of expanded to all these places to make Claude Code easier for engineers.
But ultimately none of these are built still for non-engineers. And so Claude Code evolved a lot, but it still felt like there's a gap and there's a product that could make this even easier for people.
And so for the last couple of months, the team was kind of hacking around, asking: what is the right product? At some point, someone came up with this idea — what if we just take Claude Code and add some guardrails? For example, Cowork comes with a virtual machine. This is one of the many ways we make sure it's really safe, especially for nontechnical users that don't want to read bash commands to figure out what it's doing. And they were hacking on this. I think it was something like 10 days end to end. It was fully built with Claude Code. And then we shipped it.
GERGELY: And can you give us a sense of the complexity behind an app like this? And if we can walk through what parts needed to be built — because from the outside it's a little bit hard to tell. Is this just a nice UI wrapper that's a few hundred lines of code? I'm being obviously provocative here. Or behind the scenes is it actually a really complex piece of software? And the reason I ask is — Uber is a great example where people look at the app, it looks really simple. I've worked there and I know it's really really complex because you don't see a lot of the complexity. There's a lot of regional things. There's a lot of backend things that are all hidden. So from just looking at it, Claude Cowork — it's hard to tell how much of this is additional business logic that needed to be carefully thought out versus it's actually just a nice thin wrapper on top of the model.
BORIS: In some places, I think there's less complexity than you would think. In some places, there's more complexity.
So on the product side, it's quite simple because it's just the Claude desktop app. So you download the Claude app. It's a single desktop app. It has a tab for Cowork, it has a tab for Code, it has a tab for Chat. So it is just one app and we were able to inherit a lot of that product logic. There's some UI rendering code under the hood. It's just the same Claude Code running. It's the same Claude Agent SDK that powers Claude Code.
A lot of the complexity actually is about safety because we know — like I said — we know the user is nontechnical and so we just want to make sure they have a good experience. So for example, if someone launches the app and then they delete a bunch of family photos, that's really not good and so we wanted to make sure that we protect against this — so you can't accidentally do that.
And so that's where a lot of the guardrails came from. So there's a bunch of classifiers running on the back end. This is for safety and extra mitigations for things like prompt injection and risks like this around security. On the front end there's an entire virtual machine that we ship. There's a bunch of operating system level integrations to make sure people don't accidentally delete things. So just around safety there's a lot there.
And then we also had to rethink the permission system, because we inherit the permission system from Claude Code. For Cowork, a big part of the value is not just running locally but using all of your tools the way Claude Code uses them. But the thing is, for nontechnical users, your tools aren't really available as CLIs. Some of them are available over MCP. Many of them are available in a browser.
And so Cowork is really really good when you pair it with a Chrome extension. And this is the way that I usually use it. So for example, I use it every week to do project management for the team. We have a spreadsheet that tracks at a really high level what everyone's working on. And this is my personal way of project managing. Other people use Asana, other people use Notes or whatever. For my own tasks, I don't use anything, but for the team overall, I have this spreadsheet and I have Cowork kind of check in and I just ask Cowork every week — hey, can you look at the rows for any status that has not been filled out? Can you just ping the engineer on Slack?
And so it'll open one tab in Chrome for the spreadsheet. It'll open another tab with Slack and then it'll just start messaging engineers in Slack and it just one-shots it. There's like one engineer's name for some reason it can't autocomplete. But everything else it just gets.
And so, from a safety point of view, we also thought pretty deeply about this Chrome extension — how it works and how its permissioning model should interact with the local permissioning model. So there's also a bunch of code to make sure that all feels smooth.
GERGELY: And what's the tech stack behind this? I assume a lot will be similar to the Claude app, but is it Electron, TypeScript, those kind of things or something else?
BORIS: Yeah. Yeah, just Electron and TypeScript. Actually, some of the people working on it are early Electron folks. So Felix, who's the creator of Cowork, worked on Electron. He helped build it.
GERGELY: Oh, amazing. And Cowork launched macOS-only. What was the reason for choosing this platform first, and for supporting only this platform for now?
BORIS: Yeah, so Windows is coming soon — probably by the time this podcast comes out, we will have Windows support. We just wanted to start early and start learning. It's like everything we do at Anthropic, and it's kind of the way I told my own story: one of the things I like about Anthropic is that it really matches the way people here think. Back to this point — we don't have high certainty about the things we build, and our intuition is often wrong, so we have to learn from users, figure out what people actually want, spend a lot of time listening to people, and understand the feedback deeply.
This is the way we build product, and so we always launch a little before it's ready. We did this for Claude Code — when we launched Claude Code initially, it didn't even support Windows either. It didn't support a lot of different stacks, and over the following weeks we added support for every stack. Now Claude Code supports every single stack — Windows, whatever weird Linux distro you use, macOS. We support everything. And so for Cowork we also wanted to launch early. We wanted to start with Mac, as that was just the starting point, but yeah, it's going to support everything.
Observability and Privacy
GERGELY: One thing you mentioned is getting feedback. I'm curious both for Claude Code and for Claude Cowork. How do you go about things like observability, monitoring when you're rolling out? Do you use any feature flags? And I'm more interested in — did you build custom tools for this or did you decide to use certain vendors? Especially for observability, I'm sure that this is both important but it also sounds like pretty high scale in terms of the number of users — this will not be a small operation.
BORIS: Yeah, there's some off-the-shelf vendors that we use. There's some custom code that we use. So it's actually a mix of both. There's nothing too surprising about it.
There's one thing about Anthropic that's kind of interesting — because we're an enterprise company and we care a lot about privacy and security, we can't see people's data. And so if someone reports a bug, I actually can't pull up your logs to kind of see what's going on. A lot of work goes into figuring out how to log events and things like this in a privacy-preserving way. This is just very important to the way that we operate.
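As an aside, a minimal sketch of what privacy-preserving event logging can look like, in TypeScript. This is a generic pattern with made-up names, not Anthropic's actual pipeline: the logger records *that* an event happened, plus coarse metadata like sizes and outcomes, and never the user's content.

```typescript
// Hypothetical event logger: records that something happened,
// never what the user's content was.
type TelemetryEvent = {
  name: string;                                 // e.g. "file_read", "tool_error"
  timestamp: number;
  metadata: Record<string, number | boolean>;   // coarse, non-identifying values only
};

const log: TelemetryEvent[] = [];

function track(name: string, metadata: Record<string, number | boolean> = {}): void {
  log.push({ name, timestamp: Date.now(), metadata });
}

// The caller reports sizes and outcomes, never the content itself.
function onFileRead(contents: string, succeeded: boolean): void {
  track("file_read", { bytes: contents.length, succeeded });
}

onFileRead("secret user data that must never be logged", true);
```

With this shape, a bug report can be correlated with "file reads started failing around this timestamp" without anyone being able to pull up the file itself.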
GERGELY: For Cowork, what kind of learnings have you had so far? It's been out for I think a few weeks now. Did you see something unexpected? Are you shaping the product based on feedback that you're getting?
BORIS: Yeah. Every day the team is landing so many fixes. The most surprising thing is just how much people are loving it. To be honest, when Claude Code first came out, it actually wasn't an overnight hit. This is something people think it was, but it was sort of a slow takeoff at the beginning. And I think the first big inflection was in May when we released Opus 4 and Sonnet 4. That's when it really clicked and that's when our growth became exponential.
But at the beginning, it was sort of a research preview. People didn't really know how to use it. Some people got it immediately, but most people didn't. It took a little while.
For Cowork, it's a much steeper growth trajectory than Claude Code was at the beginning. So it's just been an instant hit. And that's actually been very surprising. I didn't really expect that.
Agent Swarms
GERGELY: One of your new releases, which came out just very recently — it was I think yesterday or the day before when we're recording this podcast — was Agent Teams. And as I understand it, the idea with Agent Teams — agent swarms — instead of a single agent, you can have a lead agent and it can delegate to its different teammates. How did you start experimenting with this and how did you decide to ship it?
BORIS: We're always doing experiments, right? There are all sorts of ways to get more mileage out of Claude Code. One way is extending context. Another way is auto-compacting context, so you get essentially infinite context — that's what we have right now. Another way is using sub agents, so you have multiple agents working together. There are just a lot of different approaches to getting a little more mileage out of the context window.
There's this one idea called uncorrelated context windows — that's what we call it. The idea is you have multiple context windows, but they essentially start fresh, so they don't know about each other. A correlated context window, by contrast, is when the model does a task and then you have it do a second task in that same context window. In that case, the second task knows about the first one, because it's in the same window.
But for something like a sub agent, it's uncorrelated, because the main agent prompts the sub agent but the sub agent's context window is fresh. Besides that prompt, it doesn't know what's in the parent context window. You can actually see this a little bit in, for example, sub agents versus skills: when you run a skill or slash command, it sees the parent context window, whereas a sub agent doesn't. So it's uncorrelated.
There are some cases where you want that context and some cases where you don't. And there's this interesting thing: throwing more context and more tokens at the problem gives you better results when the windows are uncorrelated. It's actually a form of test-time compute.
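The correlated/uncorrelated distinction can be sketched in a few lines of TypeScript. This is a toy model with made-up names, not the actual Claude Agent SDK API: a follow-up task in the same window sees the whole history, while a sub agent starts from a window containing only the prompt the parent hands it.

```typescript
// Toy model of context windows; names are illustrative, not the real SDK.
type Message = { role: "user" | "assistant"; content: string };

interface ContextWindow {
  messages: Message[];
}

// Correlated: the second task is appended to the same window,
// so it "sees" everything the first task produced.
function runInSameWindow(ctx: ContextWindow, task: string): ContextWindow {
  ctx.messages.push({ role: "user", content: task });
  return ctx;
}

// Uncorrelated: a sub agent gets a fresh window containing only the
// prompt the parent hands it; the parent's history is deliberately unused.
function spawnSubAgent(_parent: ContextWindow, prompt: string): ContextWindow {
  return { messages: [{ role: "user", content: prompt }] };
}

const parent: ContextWindow = {
  messages: [
    { role: "user", content: "Refactor the auth module" },
    { role: "assistant", content: "...done, here is the diff..." },
  ],
};

const followUp = runInSameWindow(parent, "Now add tests"); // sees the diff
const sub = spawnSubAgent(parent, "Add tests for auth");   // sees only its prompt
```

Because each sub agent spends its own window on its own slice of the problem, spawning several of them is a way of throwing more tokens at a task without the windows polluting each other.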
And for something like Teams, we've been experimenting with this for a while. I think since maybe like October or September or something like this. And it really just felt like with Opus 4.6, it clicked — where the model figured out really how to use this. And sometimes you see these kind of cute exchanges where the agents are talking to each other and they're discussing something and it's just very cool to see. It's very humanistic in a way. But there's other times where you just get very good results.
And so we had a bunch of internal evaluations, for example, where we have Claude build something very very complex, something more complex than what a single Claude would build. And we saw the results just really really improved with Opus 4.6 with Teams. And that's why we felt it's the right time to release it.
We also wanted to be careful. And the reason you have to opt into it, the reason it's a research preview, is it uses a ton of tokens because it's just a bunch of Claudes that are running. Not everyone wants this all the time. So just excited to see how people use it and to hear the feedback. It's something you want for fairly complex tasks. You don't probably want this for every task.
The main Claude decides the roles for the sub Claudes. We don't have a regimented way to do this. It's context specific. I wouldn't say there's one right way to do it. I think actually a lot of the magic of this comes out of this idea of uncorrelated context windows. It's less about the specific configuration of the agents. But it's something that people should experiment with. I don't think there's a one-size-fits-all.
GERGELY: Have you seen use cases — even, I know it's still research — but have you seen use cases where it looks promising, this approach, the swarm approach?
BORIS: Well, you know, like I said before, plugins were fully built with swarms. There's a bunch of other features since that were built in this way. So yeah, I think for anything where you see a single Claude struggling, swarms can help.
LLMs and the Printing Press Analogy
GERGELY: It's interesting to look at. Talking about change in general — with Andrej Karpathy, you had a really interesting exchange back in December, where he posted that he's never felt as behind as a programmer as he does now because of the progress with AI. And then you shared the story about how you started to debug a memory leak the old-fashioned way and then Claude just one-shotted it. I think it was a reflection of how everyone is feeling — that things are changing so fast. Over the holiday break I started to feel that things have really shifted. How did you come to terms with this, or start to embrace this change?
BORIS: This is something I really struggle with. The model is improving so quickly that the ideas that worked with the old model might not work with a new model. The things that didn't work with the old model might work with a new model. And it's weird because there's just not a lot of other technologies like this. So I just don't really have a lot of experience to draw on to figure out how I should approach this.
And it's been this new skill that I've had to learn. In a way, it's like you just always have to bring this beginner mindset. Honestly, I'm using the word humility a lot, but you always just have to bring this kind of intellectual humility because all these ideas that were bad before are now good and the inverse. I think that's honestly something I constantly have to remind myself about.
And it's funny — back in the old world, when someone tried an idea again that we'd tried in the past and it didn't work, usually the feedback was: why are you doing this again?
GERGELY: Yeah. We used to call it a bit of gatekeeping, but it was somewhat valid: someone came and said, "why don't we do microservices," and someone else said, "we tried it and it didn't work." And if you'd tried it a year or two or three years ago, that was kind of valid, right? Because not much had changed.
BORIS: Yeah, that's right. And something with microservices, it's funny because every 10 years it goes in and out of style. But yeah, now it's I think the first time ever where it's actually not crazy to just try the same idea every few months because the model improves and it just works.
And I actually see this with engineers on the team. Like new people that are newer to the team, people that are newer to engineering, sometimes do things in a better way than I do. And I just have to look at them and I have to learn and I have to adjust my expectations.
An example of this: when we release features, sometimes I'll screenshot myself using them on X or on Threads or wherever, just to talk about it. But recently Tar, our devrel guy — he actually codes a lot, he's amazing — started automating this. He's having Claude Code generate its own videos for its launches. And this is something I thought might be possible, but it's not something I would have tried, because I wouldn't have thought the model was ready. He just did it, and it just kind of worked.
GERGELY: One thing that I've felt just a bit odd about — and I think a lot of developers can relate — is I've come to terms with this starting from Opus 4.5. And also similar models like I think GPT 5.2 gave me similar vibes as well. The models have been just really good at writing code. And I realize that I don't think I will hand-write the code when I want to get stuff done. If I actually want to get the pleasure of writing, I can still do it. But one thing I reflected on is it's just been so much effort to get good at coding.
I remember when I started from hacking around to going into university to learning C and C++ — and it was just bloody hard. And actually going through my first few jobs where I started to become better at it, I became better at debugging. And there's a point where a lot of my identity was tied to being good at coding. That's how we used to get jobs or higher paying jobs. When I was an engineering manager, when we designed the interview loop at Uber, we talked with managers about what we need to screen for. And we talked — well, what do developers do most of their time? About 50% of the time they code. Therefore, we placed about 50% of the signal on coding.
So there was a lot of things tied into coding because it is just hard. I think we all know that it takes grit. It takes some level of intelligence to get good at it. And there's a sense of loss of — well, I think it's great on one end that the model can do it. But it feels that something really quickly got taken away that I don't think I personally thought it would happen this quickly. And I think a lot of other people are feeling — some people move on a bit easier, but there's definitely this sense of grief.
How did you think about it? Because again, you're an example — you wrote so much code at Facebook, and outside of it too. I know it was just a tool for getting things done, but not many people could do what you did. And now the models can work as well as you did — if not better. That's the challenge.
BORIS: Yeah. I think it's something that used to be a thing that we do as software engineers. It's becoming a thing that everyone is able to do.
There was a moment — when I started coding, it was a very practical thing and it was a way to get things done. And at some point I just fell in love with the art of coding and languages and the tools themselves. And at some point I kind of fell down this rabbit hole. I wrote a book about a programming language.
GERGELY: TypeScript. You wrote the first ever TypeScript book with O'Reilly.
BORIS: Yeah. It was funny actually. There was this amazing moment for me in my little town in Japan. I went to the bookstore and I found that book translated in Japanese. In this tiny town. And that was just the coolest moment. And then I actually realized I don't remember TypeScript at all because I was only writing Python for a couple years at that point.
And at some point I started the biggest TypeScript meetup in the world. That was in SF. And I got to meet a lot of my heroes. There was Kris Kowal, who wrote "A General Theory of Reactivity." There was Ryan Dahl, the guy that made Node.
That was one of the first times I went really deep into a community, and into the language and the tools themselves. And for something like TypeScript, there's this beauty in the type system, because Hejlsberg is just brilliant — the idea of conditional types, the idea that anything can be a literal type. There are these very deep ideas that even the most hardcore functional languages don't have. Even something like Haskell doesn't go this far. Anders took it and pushed it much further than it had been pushed before. And Joe Palmer and a bunch of other folks explored a lot of these ideas.
And I think for them it was also very practical, right? Because they had these large untyped JavaScript codebases. How do you gradually migrate to something typed? And you have to come up with these very beautiful ideas to do this.
For me, Scala was another rabbit hole I fell into, in this functional programming world. And still, when I write code and when the model writes code, I always think in types first. That's what matters — the type signature. Getting that right matters more than the code itself.
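The TypeScript features mentioned here, conditional types and literal types, look roughly like this. A small sketch with made-up names, in the types-first spirit: the signature pins down the behavior before any implementation exists.

```typescript
// Any value can be a literal type: "GET" here is a type, not just a string.
type Method = "GET" | "POST";

// A conditional type: the result type depends on the input type.
type ResponseFor<M extends Method> = M extends "GET"
  ? { body: string }
  : { ok: boolean };

// Types-first: callers are fully typed from the signature alone.
function request<M extends Method>(method: M): ResponseFor<M> {
  // The cast is the one ugly spot; everything outside stays precise.
  return (method === "GET" ? { body: "" } : { ok: true }) as ResponseFor<M>;
}

const a = request("GET");  // typed as { body: string }
const b = request("POST"); // typed as { ok: boolean }
```

The compiler narrows the return type per call site, which is exactly the kind of expressiveness mainstream typed languages historically lacked.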
So there is this beauty to it. There's an art to it for sure. But in the end it's a practical thing and in the end this is a thing that we use to build things. It's a means to an end. It's not an end in itself.
I think one metaphor I have for this moment in time is the printing press in the 1400s, because that moment was actually quite similar, right? There was a group of scribes that knew how to write. And as I understand — of course, we never lived then, but as I imagine it — it was an arduous process to learn. You needed training, you needed the equipment, you probably needed some sponsorship or to be selected, and you needed practice, because you had to produce the same thing over and over again. Few people could do that. And I assume it was either high prestige or highly paid, or who knows — let's assume it was. But then the printing press came along.
GERGELY: Yeah.
BORIS: Yeah. And at least in Europe, a lord or a king or someone like that had to employ you, and you had to go through years of training. There was this class of scribes that knew how to write, employed by someone like that. Often the king or queen themselves were not literate. So it was a very, very niche skill — less than 1% of the population in Europe was literate back then.
And then the printing press came out, and here's what happened. The cost of printed material went down something like 100x over the next 30–50 years. The quantity of printed material went up something like 10,000x over the next 50–100 years. That was the first effect. Literacy took a while to catch up — global literacy eventually went up to something like 70%, but that took another 200–300 years, because learning to read is just very hard. Learning to write is hard. It takes a lot of effort. It takes an education system. It takes infrastructure — paper and ink, and the free time to do this instead of working on a farm. So it took an early stage of industrialization to actually get there.
But this effect — taking a thing that was locked away in an ivory tower and making it accessible to everyone — none of the things around us would exist today without it. If we weren't literate, if the people that built this microphone weren't literate, it would have been very hard to have a modern economy. None of these things would exist. And I think about how, back then, if people had to predict what would happen when the printing press came out, no one would have predicted that the microphone would become a thing. So I feel like this is the best analog for the moment we're in right now.
GERGELY: Yeah, it's interesting that you say some of the kings employing the scribes were illiterate — because if we're being honest with ourselves, we have business owners who know what they want to build, and they employ software engineers because they themselves cannot write code. And I think we like to mock the CEOs who come to the team — maybe even with a drawn prototype or a whiteboard sketch — saying "this should be easy," when of course they don't understand how difficult it is.
There seems to be a bit of an analogy: there's a person who knows what they want, but until now they needed to hire a specialist who could build it, and there's always that disconnect between the idea and the builder. And just like with the printing press — if the king could actually read and write his own letters, he wouldn't need that middleman, and things would become more efficient. Of course, for the scribe that's not necessarily the best news. But smart scribes could adapt — someone needs to write the books, run the press, and so on.
BORIS: Yeah, exactly. And if you think about what happened to the scribes, right? They ceased to become scribes, but now there's a category of writers and authors. These people now exist. And the reason they exist is because the market for literature just expanded a ton.
GERGELY: And I guess also if we think about back then, a scribe's work was read by a few people. And with the printing press, an author — there's a lot more authors and some of them are not really read but some of them have wider reach than they could imagine. There's new careers that exist because of that.
BORIS: Yeah, I love the analogy. And the most exciting thing for me is it's just so impossible to say today what will happen after this transition happens. Just the economy as we know it would not have existed without it.
GERGELY: So what's next? What is the thing that we can't even predict today that will exist because anyone can do this?
BORIS: Well, we cannot predict, but I think we can look at what is working right now.
Standout Engineer Archetypes
GERGELY: If you look around in your environment — whether that's your team or people across Anthropic, software engineers or builders or members of technical staff, however we call them — who to you are the standouts? What are they doing? What skills have they built up? And how have they changed the way they work?
BORIS: It's hard to name individuals because honestly this is just the strongest — these are the strongest people I've ever worked with in my career. There's all sorts of different archetypes.
There's some people that are really amazing prototypers. So — take something from zero to 0.5. Just figure out what are some cool ideas? What is the technology unlock?
There's other people that are amazing at finding product-market fit. So kind of 0.5 to 1 or maybe zero to 1.
There's other people that span different disciplines. And I'm just seeing more and more of these people — like I said — people that span product engineering and infrastructure engineering, or product and design, or design and engineering. I think I'm just seeing a lot more of these hybrids.
GERGELY: What's a belief that changed from last year to this year? Something that you either believed or a conviction that you had that you've either revised or completely threw away.
BORIS: I think one thing I wasn't sure about is how big a problem safety is, to be totally honest. I joined Anthropic because — like I said — I read a lot of sci-fi and I know how bad this thing can go if it goes bad. But it wasn't something I was sure about. Seeing it from the inside, and then seeing the new risks that have arisen in the last year, just makes me much, much more worried about it. So it was an important shift for me. Now it's just the most important thing for me — how do we make sure this thing goes well?
What Skills Still Matter for Engineers
GERGELY: I think it's safe to say you were a really great software engineer even before all the AI things started. And you seem to be a very productive engineer — part of a team, of course, but also individually. What are some skills from before — from being a software engineer — that are still as valuable, or maybe even more valuable, than before? And which ones are maybe not as valuable anymore and best left behind?
BORIS: Okay, so the stuff that's best left behind is maybe very strong opinions about code style and languages and things like that. I can't wait to get past these endless language debates and framework debates, because the model can just use whatever language and framework. And if you don't like it, it can just rewrite it for you. So it just doesn't matter anymore.
I think something that still matters a lot today is being methodical and hypothesis-driven. This matters in product design — in a world where everything is being disrupted, we need to figure out what to build next, and that's something everyone is thinking about. But it also matters for day-to-day engineering. Take debugging — you just have to be very methodical about it. The model can do this and it can help a lot, but we're still at a transition point where you need to have the skill yourself. I don't know if you'll still need it in 6 months.
Other skills that I think are more valuable are being curious and being open to doing things beyond your swim lane. If you're working in engineering but you really understand the business side, you can just build really awesome products. And I think the next big product after Claude Code — whatever the startup is that becomes the next trillion-dollar company — might just be one person with a cool idea whose brain is able to think across engineering and product and business, or design and finance and something else. People are going to become more and more multi-disciplinary, and this will be more and more rewarded. So in some ways I think this will be the year of the generalist.
I think the other skill that's actually being rewarded is having a short attention span.
GERGELY: That's being rewarded now?
BORIS: Oh yeah. You know, teenagers are using TikTok and all this stuff, and in some ways I think it's dangerous for society — you want people who can think deeply, who can contemplate ideas and aren't just moving on to the next thing very quickly. But in some ways this is the year that rewards it — it's like the year of ADHD — because the work for me has become jumping between Claudes. It has become managing Claudes, so it's not so much about deep work. It's about how good I am at context switching, at jumping across multiple different contexts very quickly.
GERGELY: Could I add one thing to all you said — adaptability. Because you say it's ADHD and you can jump across contexts, but earlier you were also very good at focusing deeply on one thing. And what strikes me about you — and maybe this is true for other people as well — is that you're very open to adapting your working style and seeing what works well at this stage, especially when things are changing. I think the one thing we can be sure of is that whenever the next model comes out, it'll change again. And you need to be curious and open to adapting how you work, right?
BORIS: Yeah.
Book Recommendations
GERGELY: And as closing, what's a book or books that you would recommend?
BORIS: I've gone down a rabbit hole with Liu Cixin — he's the Three-Body Problem guy, but he actually has a lot of other really great books. I really love his short stories; he has a couple of collections of them. I'm a big fan.
And for people who want a little bit harder sci-fi, I really love Accelerando by Charles Stross. This is a book I would totally recommend. It's essentially the product roadmap for the next 50 years: it starts with takeoff beginning to happen and the AI singularity, and it ends with these group lobster consciousnesses orbiting Jupiter. It's just amazing. And the thing I think it really captures is the pace — this quickening, quickening, quickening pace. It really matches the feeling right now.
And then on the technical side, I would strongly recommend Functional Programming in Scala. Even if language choice doesn't matter as much anymore, there is an art to functional programming that just teaches you how to code better. It'll teach you how to think in types. If you read this book, what's really important is to do the exercises as well. I've gone through and done all of them probably three times over, and it's just amazing. It really knocks this idea of functional types into your head, and it becomes a thing you can't stop thinking about.
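As a tiny illustration of what "thinking in types" means in practice (a sketch of mine, not an example from the book, and in TypeScript rather than Scala), here is an Option type that encodes the possibility of absence in the type system, so the compiler forces every caller to handle the failure case:

```typescript
// An Option type as a discriminated union: the "missing value" case
// is part of the type, so callers cannot forget to handle it.
type Option<A> = { kind: "some"; value: A } | { kind: "none" };

const some = <A>(value: A): Option<A> => ({ kind: "some", value });
const none: Option<never> = { kind: "none" };

// map transforms the value if present, and passes "none" through.
const map = <A, B>(opt: Option<A>, f: (a: A) => B): Option<B> =>
  opt.kind === "some" ? some(f(opt.value)) : none;

// A total function: instead of throwing on division by zero,
// the failure is visible in the return type.
const safeDiv = (a: number, b: number): Option<number> =>
  b === 0 ? none : some(a / b);

console.log(map(safeDiv(10, 2), (n) => n + 1)); // some(6)
console.log(map(safeDiv(10, 0), (n) => n + 1)); // none
```

This mirrors Scala's `Option` type that the book builds up from scratch: the idea is the same in any language with algebraic data types, which is part of why the functional style transfers even when the language itself doesn't matter.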
GERGELY: Boris, thank you so much. This was awesome.
BORIS: Yeah, thanks Gergely.
GERGELY: This was a really interesting conversation and the thing that I keep coming back to is Boris's printing press analogy. The idea that medieval scribes were this tiny elite who could write, employed by kings who themselves were often illiterate, and that we software engineers might be in a similar position today. We are the scribes. We spent years mastering this craft. And now the printing press is arriving.
But what Boris told me is that the scribes did not disappear. They became writers and authors and the entire market for written work expanded beyond anything anyone could have predicted.
I do find this hopeful and also appreciate that Boris didn't sugarcoat it.
The other thing that struck me is just how differently the Claude Code team builds software. No PRDs, no mandatory ticketing system, designers and data scientists and finance people all writing code, and building dozens or hundreds of prototypes before shipping a feature. And Boris is shipping 10 to 20 pull requests a day without editing a single line by hand. And there are different verification systems in place — Claude Code reviewing its own code, automated lint rules, best-of-N passes, and human code review.
If you've enjoyed this podcast, please do subscribe on your favorite podcast platform and on YouTube. A special thank you if you also leave a rating on the show. Thanks and see you on the next one.
Transcript source: The Pragmatic Engineer — Building Claude Code with Boris Cherny. YouTube. Formatted for readability.