Convincing AI the Earth is Flat, Inference at 17k tokens/sec, and an Agile Manifesto for the Agentic Age?

Shimin (00:15)
Hello and welcome to the artificial developer intelligence podcast, a conversation show where two software engineers talk about the impact of AI on programming. We are also the initial sponsors for the Poddy awards, the award for podcasts most listened to by AI. I am Shimin Zhang and with me today is my co-host.

Rahul, he's got a throughput of 17,000 tokens per second, Yadav. How are you doing today, Rahul?

Good, good. Less busy than you, certainly. But can't complain here. On today's show, we're going to talk a little bit about the news; we have a couple of new models and some interesting hardware coming up. After that, we'll have Post-Processing, where we're going to talk about whether or not software is dead. After that, we will do a deep dive on the future of software engineering.

Rahul Yadav (00:54)
Yeah.

Shimin (01:13)
I'm very excited for that, from the signers of the Agile Manifesto. And then we'll do a little Vibe and Tell, where I try to convince an AI that the earth is flat. And as always, we're going to finish with Two Minutes to Midnight, where we talk about where we are in the state of the AI bubble.

So let us get started. First up, last week we had the announcement of a few models, actually, but first we have the announcement of Sonnet 4.6 from Anthropic. Sonnet has the same much lower price of $3 per million input tokens and $15 per million output tokens, but it is, according to Anthropic at least, an Opus-level reasoning model. Sonnet 4.6 is notable for being trained with computer use as a built-in, I guess, feature, where it

performs significantly better than Sonnet 4.5, according to Anthropic, scoring an eye-popping 72.5 on the OSWorld benchmark, where essentially a model is tasked to do white-collar work using a computer, whereas Sonnet 4.5 only scored 61.4. What is most surprising to me, when it comes to Anthropic's own benchmarks,

is that Sonnet 4.6 actually scores better than even Opus 4.6 when it comes to agentic financial analysis and office tasks, even though it is a significantly cheaper model. I have used Sonnet 4.6 a little bit this week. It's pretty good. I can tell it is, you know, it's significantly better than Sonnet 4.5, but I can't really tell if it's better or worse than Opus 4.5.

So I guess we've gotten to that point where it is actually kind of hard to tell model power apart. Yeah. Rahul, have you had a chance to look at Sonnet 4.6?

Rahul Yadav (03:07)
I was playing around with Opus, not Sonnet. It is interesting, like what you called out: the cheaper, optimized model is outperforming the bigger model. And, you know, obviously they're saying it's got better speed, lower costs and everything, and all of those costs at least are downstream of training and all that, but also, like...

Shimin (03:10)
Mm-hmm.

Rahul Yadav (03:31)
Opus is supposed to be smarter and better at these things, but somehow the optimized Sonnet is killing it on most fronts, as you have it up here. It's just interesting. And the timing is also interesting, because Opus 4.6 didn't come out that long ago, and then Sonnet 4.6 followed pretty quickly after that.

Shimin (03:37)
Yeah.

Right. So it almost makes you wonder if they are, or were, training Opus 4.6 and Sonnet 4.6 at the same time. I think that was the rumor. The rumor was they were going to release both of those models at the same time, but something went wrong with the Sonnet model and it took the whole service down for a little bit, and they just decided to delay it until this week. Yeah.

Rahul Yadav (04:00)
Hmm

I see. Okay.

Shimin (04:14)
Yeah, in other model news, we also have Gemini 3.1 Pro, but I'm just going to mention it in passing because I haven't gotten a chance to try it yet. All the experiments I've done this week have been on just regular Gemini 3. So stay tuned. All right.

Rahul Yadav (04:30)
We should,

you know, just as an aside, we should, every time these things are released, they should give like people specific prompts or tests to be like, and this is how you can tell this is better than the previous one.

Shimin (04:44)
Yeah, that's a great idea. They should show us the work and not just give us these numbers because the numbers aren't that useful.

Rahul Yadav (04:51)
Lay people just need a couple of sentences, you know, for 3.1 to prove that it's smarter than 3. Otherwise there is no reason for us to go to 3.1.

Shimin (05:03)
That's a

great idea. We should write to Alphabet's marketing department or deep research.

Rahul Yadav (05:09)
Dear Sundar, I have some concerns.

Shimin (05:10)
No, you should start with, hey bro.

All right, the second piece of news we have this week is from Taalas Labs. They've released their latest hardware, which uses FPGAs to create much faster throughput for their models. So this was the 17,000 tokens per second per user rate I mentioned earlier in the show. It is shockingly fast compared to the couple of hundred tokens per second that we see with Opus 4.5 or 4.6. To be fair, this custom hardware is designed to run the Llama 3.1 8B model, so it's a relatively small model. But

Let's, you know, let's not even just talk about it. Let's just go into the demo real quick here. I pulled up their demo site for Chat Jimmy, and I'm going to ask Chat Jimmy to tell me about samurai in the 16th century. And the page generated an entire response essentially, you know, in

Rahul Yadav (06:04)
hahahaha

Awesome.

Shimin (06:18)
less than one tenth of a second, at a throughput rate of 15,000 tokens per second. I know it's a small model, but yeah. The bulk of the response time is spent, you know, going from my computer over the wifi and doing the round trip to their server, rather than in the model itself. Right.
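A quick back-of-envelope sketch of why the network, not the model, dominates at these speeds. The response length and round-trip time below are assumed, illustrative figures, not measurements from the demo:

```python
# Assumed, illustrative figures -- not measured from the demo.
TOKENS = 1_000        # suppose a ~1,000-token answer
THROUGHPUT = 15_000   # tokens/sec, the rate quoted in the demo
RTT_MS = 50           # assumed wifi + internet round trip, in milliseconds

# Time to generate the whole answer, in milliseconds.
generation_ms = TOKENS / THROUGHPUT * 1_000
print(f"generation: {generation_ms:.0f} ms vs. network round trip: {RTT_MS} ms")
# At this throughput, generating ~1,000 tokens takes about 67 ms --
# on the same order as the round trip, so the model stops being the bottleneck.
```

With those assumptions, doubling model speed barely changes what the user feels; the wire does.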

Rahul Yadav (06:20)
Yeah.

It's still impressive. Yeah.

Yeah.

Shimin (06:42)
I want to talk about this because this is clearly going to be the future of hardware. I think, you know, if I was Nvidia, I'd be pretty nervous right now; folks are coming for their monopoly. The future of large language models is clearly custom hardware that is fine-tuned to a specific model. Maybe it's a couple of years down the line, but at least on the inference side, if you're building today,

you probably should be building for a world where model throughput is not an issue.

Rahul Yadav (07:13)
This is one thing my mind goes to on reading this: there's this other idea that, you know, you would have AI at the edge, and our phones and all the other hardware would also have better hardware that can run a lot of models locally. And you can't beat that latency if you're just running it locally. And so

Shimin (07:33)
Mm-hmm.

Rahul Yadav (07:36)
their approach, what I'm trying to figure out is, does it also work in that world where you can have model-specific hardware at the edge, or are you trying to pack too much in then? Or is this mostly a win when you're in the cloud and you're trying to tie things specifically to the models and everything?

Shimin (07:56)
Yeah, that's a great observation. Would something like this be in, you know, Apple Silicon two, three generations from now? Would there be a good-enough model for that use case? I guess it really depends on what a good-enough model is, doesn't it? Like, this 8B model probably isn't all that useful. Yeah.

Rahul Yadav (08:03)
Exactly.

Yep.

Yep.

Shimin (08:15)
And as kind of a follow-up piece of news, OpenAI has also decided to start running their own GPT-5.3 Codex on custom chips from Cerebras. And these models are producing at 1,000 tokens per second. It is no 17,000 tokens per second, but that

clearly seems to be where things are going.

Rahul Yadav (08:43)
A live demo, they don't have one? No? No Chat Jimmy? Chat Jimmy is great.

Shimin (08:45)
They don't have a live demo. I love Chat Jimmy. Everyone should check it out.

When you have such a step up in performance, there's nothing as convincing as a demo.

Rahul Yadav (09:01)
Yeah. Yeah.

Shimin (09:03)
Yeah, this is also, you know, interesting, right? Like, Google has its own TPUs, Amazon has its own Trainiums, Microsoft has Maia, and now OpenAI is also partnering for its own hardware. So again, this is not Two Minutes to Midnight, but if I was Nvidia, I would be pretty nervous right now.

Rahul Yadav (09:12)
Trainiums.

Hmm.

Is that the theme today? Every article, you're like, if I was Nvidia... or, you should go the Dario path. I'm not going to name names, but if I were some of these companies, I would be very concerned.

Shimin (09:36)
Ahahaha!

Yeah, if I was trading at a 47 PE,

you know, I would be really thinking twice. Oh, and of course, the last piece of news, which just came out today: Meta has a partnership with AMD for their chips, and Meta is going to buy up to 10% of AMD for their own custom GPUs.

Rahul Yadav (09:45)
Yeah.

NRS.

Shimin (10:01)
Again, if I was a GPU manufacturer that starts with N and rhymes with Cascadia, I would be worried. Moving on. We're going to do things a little different today; we're going to go straight to Post-Processing. And here we have a post from Learning by Shipping, Steven Sinofsky, about the death of software. Nah.

Rahul Yadav (10:09)
You

Shimin (10:24)
This is yet another piece that comments on the great SaaS sell-off, and this theory that there is no longer a moat for SaaS companies and everyone is going to use AI for all the things, and argues against that thesis. I think as developers, one of the things we worry about intrinsically is our job security if software goes away.

So it's comforting to read a well-reasoned argument that goes to the past, looking at how old technological advances made their impact and how long those impacts took to unfold. So Steven made a few arguments. He starts with looking at the PC and the GUI, where even though

everyone saw GUIs and saw the PC and predicted the death of the mainframe and the death of data centers, the opposite actually happened, right? Mainframe computers might be dead, but data centers grew along with the PC. And even though we had GUIs,

everyone still uses the command line all the time. I work almost exclusively on the command line. So just because a quote-unquote better interface comes along doesn't mean that the old one will be replaced. The old one might realistically be built upon, or be used exclusively by the quote-unquote power users or quote-unquote expert users.

Right. And that is something that the software-is-dead crowd has kind of not taken into account. And then, I thankfully was not old enough in 1995 to hear the "brick-and-mortar retail is dead" slogans, but maybe Steven was. He talked about how back in 1995, folks had already said that the low-margin, inefficient retail world

would be subsumed by new-fangled e-commerce companies like Amazon. And in fact, that is not the case, right? Not only did it take more than 30 years for this to play out, it is still playing out. We still have Walmart, we still have Costco. Transitions probably take longer than you think. And if I was listening to the...

I was going to say ramblings. No, the speeches of a certain CEO from a certain human-themed company, talking about what's going to happen in the next 12 to 18 months, I may point out that it took Amazon two decades to quote-unquote kill retail. Yeah. And then the third one, and this one is, I think, the least convincing of the arguments, was that folks have talked about the death of media

since 2000. You know, the Time Warner and AOL merger proved to be premature. Only now do we start to see internet media companies like Netflix generating their own content. So that was another one; it probably ended up taking 20 years, and it was built on the backs of DVDs. So just, again,

technological diffusion will probably take longer than the soothsayers would claim. And what does that mean for us? Here, Steven makes a couple of predictions. One, there'll be more software than before. I think that's pretty much a given. Two, AI-enabled or AI-centric software will move up the stack of what a product is. So AI becomes

an additional feature, an additional layer of user interface. Three, new tools will be created with AI that do new things. Definitely, because it opens up a whole new class of products. Four, domain expertise will be wildly more important than it is today because every domain will be vastly more sophisticated than it is now. This one is interesting. It talks about how in 1995, bankers

hired juniors to do the quote-unquote low-level Excel work for them. And now knowing Excel like the back of your hand is a prerequisite for being a banker. So yeah.

Rahul Yadav (14:18)
Hahaha.

This point stood out to me. There was a quote recently someone had, because, you know, Claude had released those legal plugins and all that, and the whole lawyer industry was in chaos. And then one person was like, this is actually great. Because if every person who has a subscription to ChatGPT or whatever is an armchair expert in

Shimin (14:36)
Mm-hmm.

Rahul Yadav (14:49)
like legal stuff, they come up with more questions. I bill by the hour, so they come to me with 50 questions instead of the five from before, because before they didn't know anything. I get to bill them for more questions, right? And so a lot of these, same with, like, doctors, well, I don't think they usually bill by the hour, but you can see this happening in consultancy and legal, as he calls out. If you're going to have opinions on things, and you want the person who's billing you to give you their thoughts on them, they're going to bill you for it. So

Shimin (15:17)
Mm-hmm.

Yep.

Rahul Yadav (15:18)
Talk to your personal chatbot all you want; you're generating more work for them. More revenue.

Shimin (15:24)
Yeah, this

is Jevons paradox again, right? The idea that as technology gets more efficient, you will actually need more people to do the same task rather than fewer, because the task is now more democratized and there's going to be significantly more demand unlocked for it. We'll see if that's true for software, but I could definitely see, you know, if everyone is

Rahul Yadav (15:30)
Yeah. More. Yeah.

Shimin (15:45)
vibe coding their own personal productivity app or whatnot, they may actually need more developers to come in and help them rearchitect things and figure out bugs. Maybe it's only half an hour or 45 minutes of work, but there sure will be a lot more folks who need that. And lastly, finally, it's absolutely true that some companies will not make it. And I think that point is, yeah, a given, right? Companies that

do not adopt the tools effectively and efficiently, with sufficient guardrails, or are too slow to adopt, will definitely not make it. And he closed the article with: strap in, this is the most exciting time for business and technology ever. And I believe I mentioned the same thing to you two weeks ago when we saw each other in person. Yeah. Yeah, just...

Rahul Yadav (16:30)
Yeah.

Shimin (16:33)
What a time to be alive. Yeah,

Rahul Yadav (16:35)
Yeah, the...

Since we mentioned Netflix earlier, one thing I was thinking about, and we see this more and more now, you mentioned to-do lists or whatever: why should a person go and buy a to-do list subscription when they can have a very specific to-do list,

you know, software crafted for themselves, at least on their local machine, all that. So, personalized software, right? I was thinking about the Netflix version of that, where, of the Netflix catalog, we probably watch like 1% of it. Because one of the things Netflix and Spotify and all of them offer is, you get millions of these, and you're like, great, but it will take me millions of years to watch them, and unfortunately I'm not going to live that long. And so

Shimin (17:00)
Yep. Yep.

Mm-hmm.

Rahul Yadav (17:24)
there could be a world in the future where people are generating personalized movies and TV shows and everything, and you'd kind of end up disrupting the Netflix/Spotify model, where they distribute the cost of licensing these things across all their users. But if you just really like action-adventure sci-fi or whatever, and some of these models can take a book that's been written, that was a best-seller, and turn it into a TV show or movie, then you'll have

Shimin (17:31)
Mm-hmm.

Alright.

Rahul Yadav (17:51)
the content's already been written; you just need to render it in a few different ways. And if you can use AI to generate those things, you don't need to wait for Netflix to do that. You can just be like, I love that, but I'm going to want to see it in movie or TV show form. Or, I want to share it with my, you know, whatever goes into licensing and all that. But if you're doing it personally, I don't know who would know or how people would figure it out. But it was just, like, an...

Shimin (18:05)
All right.

Rahul Yadav (18:15)
You know, it is a possible world we might go down, or not: Netflix, Spotify, you just can't keep up with the personalization we can do for ourselves. Same with a lot of this other software.

Shimin (18:28)
Yet to be seen. I was having the same conversation the other day with a buddy who was asking me if I think movies will be dead 10 years from now. But look at Sora, right? Sora claims to be able to generate videos with you as the main character, and that's, in theory, super addictive. Yet I haven't heard much about it since it was released. So

Rahul Yadav (18:47)
Yeah.

Shimin (18:48)
It may be that even though we can all create our own productivity tools, well, maybe productivity tools are actually harder than you think, and the professional ones would be way better. Right. So,

Rahul Yadav (18:58)
Yep. Yep.

yeah, definitely. At some point you're giving up uptime and a lot of maintenance, and, like, why are you spending hours and hours optimizing your to-do list software instead of just doing the things on your to-do list? But once you give people Claude Code, they want to do something with it, you know. And creating your own to-do list software is a rite of passage.

Shimin (19:19)
Oh, and now they will get to, you know, take part in the great joy that is sorting out JavaScript build systems and dependencies. Hooray. Yeah.

Rahul Yadav (19:28)
Hahaha!

Yeah, everybody

gets to take part in that now.

Shimin (19:35)
All

right, onto our main deep dive of the week. I'm going to first set up a little bit of background for this. I think everyone has heard of the Agile movement at this point. It has trickled down to pretty much every single career out there. I hear about sprints, staying Agile, retros, and all of that from people who are not even in the tech industry.

And the Agile Manifesto, the movement that started it all, was created February 11th through 13th, 2001, at the Snowbird Ski Lodge in Utah, where a bunch of software developers got together and talked about what worked and what didn't. I'm sure a couple of beers were drunk, and they created a manifesto. And this year, on the 25th anniversary of the Agile Manifesto,

ThoughtWorks organized a Future of Software Development retreat, where a lot of the original signers of the Agile Manifesto got together, along with others from the industry, to discuss the impact of AI on software development. And while we were not invited (we should be for the next one), we did get access to the slides. That is, the

findings from the Future of Software Engineering retreat. I'm not going to call this the new Agile Manifesto, but it's something like it. And of course, it was conducted under Chatham House rules, so no one discloses anyone's name or position.

Rahul Yadav (21:03)
The Agent Manifesto? Maybe it's close enough to Agile, I don't know. An agent-like manifesto. Yeah, this was a great overview of the current state of software engineering, the challenges people are running into, some of the things that people are

Shimin (21:06)
The Agent Manifesto, I like it. Agentic Manifesto, yes.

Rahul Yadav (21:27)
doing, and then some of the things they see over the upcoming, you know, next few years. It's a great overview of the different things that they talked about, and we can run through these one by one.

Shimin (21:35)
Mm-hmm. Yep.

Rahul Yadav (21:39)
One of the main questions that they said kept coming up was: where does the rigor go in the engineering practice now? Because before, you know, a lot of the rigor was in writing the code. Every single character that I'm typing, I'm doing it with my own hands, so my brain is also engaged. But if AI takes away that part, the rigor moves to other places.

Shimin (21:47)
Mm-hmm.

Rahul Yadav (22:02)
Right? And so they call out that spec-driven development is a big thing, so you have to make sure that you pay a lot of attention there. There's this idea of red-green tests, where you write the tests first; they're red because they will fail initially, and then your agents just keep writing code until the tests are green. There are different constraints you're operating under. How do you make sure the agent is not, you know,

writing vulnerabilities and stuff? So the rigor has really changed a lot, and in a short amount of time, and so, as humans, we need to adapt to that pretty quickly. Yeah. So the second thing being: if you have tons and tons of agents who are just writing code for you

day in and day out, day and night, you cannot reasonably review all the code; like, you cannot give all the code the same importance. So they talk about this idea of risk tiering, which is, you know: if this piece of code gets deployed, what is its impact? And so you can kind of think about it in terms of, if it's just for an internal app, if it's not even touching anything critical,

Shimin (22:58)
Mm-hmm.

Rahul Yadav (23:11)
you might want to put it in a low-risk category, and so your code review would also follow that. You would pull it up and go: what could go wrong? Not much. All right, seems fine, let's try it. And then the other side of that is, you know, things you're deploying to production that could really break things, so you want to make sure that a lot of code review rigor is going into those.
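The risk-tiering idea Rahul describes could be sketched as a simple policy table. The tier names, reviewer counts, and criteria below are illustrative assumptions, not anything prescribed in the retreat slides:

```python
from enum import Enum

class RiskTier(Enum):
    LOW = "low"        # internal tool, nothing critical touched
    MEDIUM = "medium"  # user-facing, but easy to roll back
    HIGH = "high"      # production data paths, auth, payments

# Illustrative review policy keyed by tier -- the numbers are made up.
REVIEW_POLICY = {
    RiskTier.LOW:    {"human_reviewers": 0, "line_by_line": False},
    RiskTier.MEDIUM: {"human_reviewers": 1, "line_by_line": False},
    RiskTier.HIGH:   {"human_reviewers": 2, "line_by_line": True},
}

def review_requirements(tier: RiskTier) -> dict:
    """Look up how much human scrutiny a change at this tier gets."""
    return REVIEW_POLICY[tier]

# A low-risk internal change ships with a quick skim; a high-risk one
# gets multiple humans reading it line by line.
print(review_requirements(RiskTier.HIGH))
```

The point of encoding it as data rather than judgment calls is that agents can be told the policy too, and route their own pull requests accordingly.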

Shimin (23:34)
Yeah, Dan was talking about this last week, right? He was saying how, you know, certain things, like the architecture, the fundamental assumptions of a code base, still need that expert guidance and expert rigor, whereas the everyday, lower-risk, maybe low-priority bug tickets can just be automated away. Yeah.
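The red-green workflow mentioned a moment ago is essentially test-driven development: write failing tests first, then let the agent iterate until they pass. A minimal sketch, where `slugify` and its spec are made-up examples rather than anything from the retreat:

```python
import re

# Step 1 (red): write the spec as tests before the implementation exists.
# These assertions fail until slugify is written -- that's the "red" state.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  ") == "spaces"

# Step 2 (green): the agent iterates on the implementation until the
# tests pass. The human reviews the tests, not every intermediate diff.
def slugify(text: str) -> str:
    text = text.strip().lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)  # runs of non-alphanumerics -> one hyphen
    return text.strip("-")

test_slugify()  # no assertion fires: the suite is green, so the loop stops
```

The rigor lives in step 1; step 2 can be delegated, which is exactly the shift the retreat is describing.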

Rahul Yadav (23:47)
Yep.

Yeah.

And there's the human side of this that they call out too, because code review served, obviously, to make sure the code is good. But it also served other purposes: you could mentor people who are early in their career, you could make sure there is consistency across all the different people writing code, all of those things. There were all these human activities that happened around code review.

But if now it's just you and the agent, that part is also going away, and so all those things need to move to some other place as well. Do you mentor people on, I don't know, how to write good prompts or something? I don't know what that looks like, but it's definitely not in code review as much anymore.

Shimin (24:24)
Mm-hmm.

Hahaha!

Yeah, that's a really interesting one to me. Code review serves such an important role, you know, both as a gateway for code quality, but also as the most frequent touch point you have with everyone else on your team. Like, are we just going to go have more lunches? Have more remote shooting-the-crap sessions? Yeah, I don't know where that's going to go.

Rahul Yadav (24:50)
Exactly.

Hahaha

Yeah. Or, you know, I'm sure, like me, you have had to do this a lot, where you would sit down with an engineer, and sometimes the code would be pretty complicated, and you just go: let's spend half an hour to an hour going through this together. Explain to me what you were thinking here. Talk me through it so that I can understand your mental model before I give it a fair review. What do you do with the agent? You sit it down and ask, what were you thinking?

Shimin (25:24)
Yeah.

Rahul Yadav (25:25)
Yeah, so don't know what that looks like. Yeah.

Shimin (25:25)
Well, yeah, here's an idea. Here's an idea, right? Like for every large

PR, have the agent, you know, create an interactive app to talk through the mental model. I think there are ways, but I would hate for that human-to-human relationship to take a hit, you know, now that the agents can do all the things.

Rahul Yadav (25:42)
Yeah, they can. You could have the agent, in AGENTS.md, call out, like, you create Mermaid diagrams and all that, so you call out the architecture of things. It does help, but, like you called out, is it driving the human interaction farther and farther away? Yeah, and then the next thing was that

Shimin (25:52)
Alright.

Rahul Yadav (26:03)
the developer productivity and the developer experience are also getting decoupled, because a lot of the focus on DevEx was to make developers more productive. But there is a world today where you don't need to focus on DevEx and you would still get a lot of productivity, because the agents are just a swarm; they'll swarm through even a bad developer experience. And so that's another thing.

Shimin (26:21)
Mm-hmm.

Rahul Yadav (26:29)
Now it's becoming much fuzzier: how much time should you be spending on developer experience? And is developer experience the same as agent experience? Are there some lessons we can apply from there, or is this an entirely new thing? And then, how do you distribute your limited time between the developer's experience and the agent's experience?

Shimin (26:50)
Right, so there's a hypothesis there that I've been meaning to test, but I never had a chance to actually do so. Do agents do a better job with bug fixes and feature additions in a well-managed, well-abstracted code base? Or are agents just as good if the variable names are scrambled, if there are

duplicate abstractions, duplicate code all over the code base? Like, I don't know. Listeners, if you come across any research on that, let us know. I'm very curious about that.

Rahul Yadav (27:17)
Yeah.

Yeah.

Send us the best repo you come across in terms of structure, and the worst one, and we will run this experiment on your behalf and report back. Preferably open source; please don't send us anything that, yeah, shouldn't be shared publicly.

Shimin (27:34)
Right.

haha

Yeah, no. Don't get fired, that's always a good rule.

Rahul Yadav (27:42)
Yeah, the next step,

you know, security is coming as an afterthought; speed is everything. And so there are a lot of pretty easy ways to, well, we read about this every day, you know, credentials getting leaked and everything. And to me, it's a little bit of a side effect of

Shimin (27:47)
Mm-hmm.

Rahul Yadav (28:03)
code reviews, and the rigor not going into writing the code ourselves. Because at some point, if you're reading code throughout the day, your brain is just not in it anymore, right? And so maybe for the first few PRs you would actually think about all the things and catch issues, but over time it's much more likely you'll miss something. Same as, you know, judges' judgment gets worse as they get hungrier closer to lunchtime.

Shimin (28:13)
Mm-hmm.

Mm-hmm.

Yeah. Yeah.

Rahul Yadav (28:27)
We're also human, so our judgment is going to get similarly worse. So that's something that they call out as a big concern that they're seeing right now.

Shimin (28:36)
Yeah, and if you can't read the code, how do you even know where the security defects are? Or, you can read it, but you can't comprehend it all. If you don't have a full, solid mental picture of the code, I think it'd be a little hard to find all the security issues.

Rahul Yadav (28:39)
Hahaha

Mm-hmm.

Yeah, the interesting thing here, by the way: maybe you recognize that, you know, let's say nine to 11am is when your brain is super active and very focused. And so, if we go back to the second item we have in this table, risk tiering, you go: all the high-risk stuff I'm going to look at first, because I know that needs a lot more attention.

Shimin (29:06)
Mm-hmm.

Rahul Yadav (29:16)
And then you just batch all the lower-risk stuff and you're like, yeah, seems fine, fine, should be fine. And you just batch things that way.

Shimin (29:24)
I found myself, maybe not stumbling across it, but doing that naturally, right? Like, I save the really large AI-generated PRs that I really need to dig into for the time of day when I feel like I still have the mental reserves to do them. Yeah.

Rahul Yadav (29:39)
Yeah, yeah.

Next up, the middle loop. This kind of goes back to what we were talking about, to a certain extent: everybody is a manager now, a little bit. And so there's your local, you know, development loop, and obviously you're shipping things to production, but now there is this middle layer where you have to oversee a bunch of agents. You have to make sure you're architecting things correctly for them.

Shimin (29:50)
Mm-hmm.

Rahul Yadav (30:03)
You're continuously looking at: agent five is waiting for my response; as soon as you reply, it goes off and does its stuff; then agent eight is blocked on something, or you need to give it more context. So there's this other middle loop now that's been introduced. And they claim that, you know, no one else has named this yet, so they're the first ones to introduce the concept.

Shimin (30:25)
Yeah, I think I've read about it, at least in Steve Yegge's vibe coding book that came out like seven months ago, but I don't know if they also call it the middle loop; I don't remember what name they have for it. But I think, as a working developer, it is hard to justify dedicating time to the middle loop, and also knowing

Rahul Yadav (30:32)
Hmm.

Yep.

Shimin (30:48)
when I should, because it's more than just talking to the agents, right? It's like, I should spend some time to create a new skill for this thing, or create a new agent to maybe do security reviews. And that's often not going to show up in a ticket just yet, so you have to take the initiative and do it on your own. And that's time taken out of somewhere else. So

Rahul Yadav (31:12)
Yeah.

Shimin (31:13)
I think, as a practice, dedicating a chunk of time to working on your middle loop, either once a week or spread throughout the day, is probably going to become best practice going forward.

Rahul Yadav (31:19)
Yep.

I agree. And this, you know, we were talking about it earlier; in this table they call out the developer experience, and this to me is the agent experience side of that, right? And you're right, you have to continuously sharpen your axe. You're not going to nail it on the first go, so spending that time actually pays off down the road, because you're optimizing.

Shimin (31:32)
Mm-hmm.

Right.

Rahul Yadav (31:49)
Next up, cognitive debt. Shout out to Margaret Storey from last week, because we talked about cognitive debt last week. Yeah, you need to keep things in your head, and things are moving pretty quickly. So, good luck, I think, is the TL;DR there.

Shimin (31:55)
Mm-hmm.

Hahaha

Yeah.

I mean, she was at the retreat. So we know she was in the room when this happened.

Rahul Yadav (32:10)
okay.

Early access.

Shimin (32:14)
Ha

Rahul Yadav (32:15)
And then this was interesting: the agent topology one, where you ship your organization, or your org structure, and now you're also shipping your agents' org structure too. This maybe is another good experiment to run, because we see both sides of it:

Shimin (32:20)
Mm-hmm.

Rahul Yadav (32:37)
Agents are good enough now that you don't need to tell them, like, pretend you're a testing agent versus a documentation agent versus whatever agent; they can figure those things out. Versus you see this other recommendation of having dedicated agents set up for each thing. It would be interesting to create two setups and give them the exact same feature to build.

How big of a difference does it make in the end product, versus the cost and the tokens consumed, if you have one single agent going through the whole loop versus multiple agents? And does that actually show up in the "you're shipping your agent org structure" sense?
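One way to sketch that single-agent versus multi-agent comparison is to log tokens consumed and a crude quality proxy per setup. Everything here, the names, the numbers, and the per-token price, is purely illustrative and not taken from any real run:

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    setup: str          # "single-agent" or "multi-agent"
    tokens_used: int    # total tokens consumed across all agents
    tests_passed: int   # crude proxy for quality of the end product
    tests_total: int

def cost_usd(result: RunResult, usd_per_mtok: float = 3.0) -> float:
    """Rough spend estimate at a flat per-million-token rate."""
    return result.tokens_used / 1_000_000 * usd_per_mtok

def compare(a: RunResult, b: RunResult) -> dict:
    """Summarize the trade-off between two setups on the same feature."""
    return {
        "token_ratio": b.tokens_used / a.tokens_used,
        "quality_delta": b.tests_passed / b.tests_total
                         - a.tests_passed / a.tests_total,
        "cost_delta_usd": cost_usd(b) - cost_usd(a),
    }

# Illustrative numbers only; real values would come from agent run logs.
single = RunResult("single-agent", tokens_used=800_000, tests_passed=46, tests_total=50)
multi = RunResult("multi-agent", tokens_used=1_900_000, tests_passed=48, tests_total=50)
summary = compare(single, multi)
```

If the multi-agent setup burns 2x the tokens for a marginal quality bump, that is the kind of signal the experiment would surface.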

Shimin (33:17)
Yeah. I think the enterprise structure, the team structure, will also be affected, right? We'll probably see way fewer developers per feature team. Right now maybe you have one designer, one PM and one developer. And the cross-dev communication is going to change too. I don't know

Rahul Yadav (33:22)
Mm-hmm.

Yep, yep.

Shimin (33:36)
what impact that will have on the organization as a whole, but yeah.

Rahul Yadav (33:39)
Yeah, and we were talking about showing your prompts and your thinking. So do those start getting committed? Is that how we talk to each other, through markdown files?

Shimin (33:44)
Mm-hmm.

Yep.

Yeah. Your

agent talked to my agent and my agent looks at your agent's prompts.

Rahul Yadav (33:56)
Have you?

Sounds like it. Or there's some sad estranged-family joke in there: we just talk to each other via markdown. All right, moving on. The one interesting piece was that there have been these concepts and technologies that have been around for

Shimin (33:59)
And I just smoke cigars all day.

You

oof. Not going there. ⁓

Rahul Yadav (34:24)
a while that didn't really catch on. At some point people choose the trade-off of how much you want to invest in doing things a certain way versus how your org works and all that. And so because of that, things like knowledge graphs and all these semantic layers weren't as prominent. But now people are

Shimin (34:40)
Mm-hmm.

Rahul Yadav (34:47)
rediscovering some of these things to be able to like work well in an agent-first world.

Shimin (34:52)
Yeah, well, RAG is like a crappy topology, right? Like a crappy graph. So that makes sense.

Rahul Yadav (34:58)
Yeah. Future roles converging. We've talked about this many times on the podcast. So I don't think there's anything new there. Unless you had something to add, Shimin

Shimin (35:10)
There is a little bit of a revenge of the juniors going on here. I think it was IBM that is hiring a ton of junior developers. I have always thought that the death of the junior developer is overhyped. If anything, juniors have a huge leg up in the sense that they were born into an AI-first world, and they have AI to help them gain experience faster.

Rahul Yadav (35:16)
yeah.

Mm-hmm.

Shimin (35:34)
I'm just happy to see that someone else is also making the counterargument there.

Rahul Yadav (35:40)
Yeah, there are also some holes in the current thinking against junior developers. Let's say senior developers, or whoever, are the ones you're going for. We all age, so at some point we're going to want to retire, or, you know, grow old and die. And so

Shimin (36:01)
Not

me. Dude.

Rahul Yadav (36:02)
Except

for Shimin. And Dan is not here on the podcast today; we don't know if he'll ever come back. We'll find out. But maybe today you don't hire junior developers, and then five, ten years out, whatever you want to call it, one of two things happens:

Shimin (36:19)
Mm-hmm.

Rahul Yadav (36:21)
Either AGI happens, and then it doesn't really matter whether you're junior or senior, according to some of the narratives, because AI will just do all these things. Or today's agent-assisted world is the one that keeps going, and we don't hire junior developers, but then, similar to how you see these sudden population declines, all of a sudden you have a steep fall-off as people retire and go do other things. At some point you do need to bring more people into the workforce, because you can't turn them overnight into specialists in different things. You have to start somewhere. So in a non-AGI world it ends up being short-term thinking to not hire junior developers, and in an AGI world none of this matters, I guess.

Shimin (36:59)
Mm-hmm. Yep.

Ha

Who

the heck knows? Yes.

Rahul Yadav (37:13)
Yeah, yeah. And then finally, self-healing systems. You're going from, you deploy a change, the thing breaks, and you go manually diagnose and fix it, to agents doing more and more of the assisting, because you can create whole loops where they can monitor, look at the logs coming from production, and then take a change, create a pull request, merge it and deploy it, making that loop very quick. These things, as we know, do go awry, so I'm glad they put it in a two-to-five-year timeframe and not happening right now, because we've seen enough production incidents caused by agents recently.
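The loop described here, monitor the logs, diagnose, open a PR, can be sketched as a triage step. The markers, bucket names, and thresholds below are invented for illustration, not any vendor's actual pipeline, and the merge-and-deploy step is deliberately left out:

```python
def triage(log_line: str) -> str:
    """Map a production log line to the next step of a self-healing loop.
    The categories and string markers here are invented for illustration."""
    if "ERROR" not in log_line:
        return "monitor"        # healthy: keep watching
    if "OutOfMemory" in log_line or "disk full" in log_line:
        return "page-human"     # infra-level failure: keep a person in the loop
    return "open-pr"            # code-level error: let an agent draft a fix

def run_pass(log_lines: list[str]) -> dict[str, int]:
    """One monitoring pass: count how many lines land in each bucket."""
    counts = {"monitor": 0, "page-human": 0, "open-pr": 0}
    for line in log_lines:
        counts[triage(line)] += 1
    return counts

# Hypothetical log lines, just to exercise the three branches.
logs = [
    "INFO request served in 12ms",
    "ERROR NullPointerException in CheckoutHandler",
    "ERROR OutOfMemory in worker-3",
]
summary = run_pass(logs)
```

The point of the "page-human" bucket is exactly the caveat in the conversation: some failure classes should not be auto-remediated by an agent.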

Shimin (37:47)
Mm-hmm.

Rahul Yadav (38:02)
But that is something that's just going to happen, right? And since we are using agents so much in the phase before deploying to production, if you're relying on agents that much, you can't all of a sudden have the skillset to go fix everything the agent did in production yourself. You need the agent's assistance to tackle the issues when they happen in production, because that's when the

Shimin (38:23)
Mm-hmm.

Rahul Yadav (38:29)
real test of whatever the agent did happens. So naturally that's where the world is going to go: the agent will be needed not just for local development but also for production.

Shimin (38:40)
Yeah. Thankfully, agents are very good at explaining a codebase to you, especially if it has a good RAG system or, you know, good context management in place.

Rahul Yadav (38:43)
Yep.

Yeah, and

they can look at any weird stack trace and they can be like, I know what this is because I was trained on all this other data and you don't need to go figure out on the internet what the hell to do about this. So it'll be very helpful.

Shimin (39:03)
A summarizing thought: I think this is a great overview of
a lot of the problems and open questions that we've been talking about on this show. So on the one hand it's a little validating that we haven't missed anything, and on the other hand, these are all completely open, and there will definitely be dozens, if not hundreds, probably thousands, of companies created to solve these problems. So again, it's a good time to build.

Rahul Yadav (39:12)
Mm-hmm.

these problems.

Yeah. Shimin?

Shimin (39:33)
But I was hoping for it. Yes, I was hoping there would be another four-or-five-sentence manifesto that we could just adopt, but they came up with more questions than answers, and I'm here for it.

Rahul Yadav (39:34)
That's what Sinofsky said.

Yeah.

Shimin (39:47)
Alright, any final words on this?

Rahul Yadav (39:49)
Great overview. I've been thinking about this article since I sent it to you, pretty shortly after our last conversation. It's been on my mind because they did a great job hitting on all the things we're seeing in the industry, both the patterns and the challenges.

Shimin (40:05)
Yeah, absolutely.

And then we have a follow-up article from Chris Roth about what building an elite AI engineering culture is like in 2026 on CJRoth.com. This is also from you. Yeah.

Rahul Yadav (40:16)
Yeah, very

practical, and a great pairing with the report we just looked at, because it's talking about patterns you can use to build a high-performing, you know, elite engineering team. So, the things Chris talks about in the article: spec-driven development is really a new pattern that has emerged that wasn't there before, and a lot of elite teams are following it. There are a lot of guides you can look at. It does pay off very well, because the agents have all the why behind it, the thinking behind different decisions, so they can actually use that in their output.

Shimin (40:58)
Mm.

Rahul Yadav (41:01)
One of the things Chris calls out, which is still, I guess, somewhat surprisingly not as prominent, is the rise of the design engineer role. It goes back to all these roles merging a little bit. One of the significant things that has come out of that is: we expect you to have good design taste, but at the same time you should be capable of writing front-end code as well. And he calls out, I think Vercel was one of the early ones that named the design engineer role, and it's becoming more and more prominent. It's really an indicator of the different roles emerging because of AI.

Shimin (41:40)
Yeah, and they talked about Figma and Stripe, but I think when they say design engineer, they almost mean a PM-engineer hybrid that maybe also has some design chops,

Rahul Yadav (41:51)
Hmm

Shimin (41:53)
because they are wearing a product hat. So I think more of the roles are getting merged into one. ⁓

Rahul Yadav (41:55)
Yep.

Yeah.

Shimin (41:59)
Yeah, both product and design. I guess we can get two of the three, or maybe even three of three. Everybody is truly an entrepreneur on their team. They just get to do whatever the heck they want. That's nice.

Rahul Yadav (42:09)
The old engineering-product-design trio is gone. All of them are in one person now: you're 33% of each and 1% human. The one thing Chris calls out is the productivity paradox, where the bottleneck always moves, right? So we automated

Shimin (42:20)
Yes.

Rahul Yadav (42:32)
writing the code, and, we talked about code reviews earlier, now a lot of your time is spent reviewing the code, so code reviews are the bottleneck. There are companies out there saying you can use AI agents to do code reviews as well. To me, the main question ends up being partly the risk profile we talked about earlier, but also: at some point, either you fully automate this, where agents write the code, deploy the code, maintain it in production, everything. But if something breaks, you need to bring a human in, and then the problem you need to solve is when do you bring that human in? Maybe agents will get good enough at some point that you can just bring humans in when needed, when something's happening in production.

But I think it's going to be a very stressful job, because you're basically firefighting. You have to learn very quickly what the thing does, what went wrong, how do I fix it, everything. And preventing that is why the bottleneck today is in code reviews, so that you don't ship anything broken. I am very interested in seeing what processes and tools come out over the rest of this year to resolve this problem, or at least ease it a little, because it is the main bottleneck now.
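One hedged way to frame "when do you bring the human in" is to route reviews by a risk profile, as in this sketch. All field names, the critical areas, and the 200-line threshold are hypothetical choices, not anyone's published policy:

```python
def review_route(change: dict) -> str:
    """Decide who reviews an agent-authored change, using a crude risk
    profile. Field names and thresholds are invented for illustration."""
    critical_areas = {"payments", "auth", "infra"}
    if set(change["touches"]) & critical_areas:
        return "human-review"   # durable/critical code keeps a human gate
    if change["lines_changed"] > 200:
        return "human-review"   # big diffs are hard for anyone to audit
    return "agent-review"       # low-risk, disposable-tier code

# Two hypothetical changes to exercise both paths.
small_ui_fix = {"touches": ["frontend"], "lines_changed": 40}
billing_change = {"touches": ["payments"], "lines_changed": 12}
```

The design choice is that criticality of the code, not diff size alone, decides the gate, which mirrors the risk-profiling idea discussed earlier in the episode.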

Shimin (43:49)
Mm-hmm.

Yep, I feel it.

Rahul Yadav (43:52)
Yeah. Story points are going away, which I like; I've never been a big fan of story points. They're always subjective: two people can look at a problem and be like, one point; no, five points. So it always seemed a little odd to me. Chris calls out that story points are going away, and then it's more just about cycle time and lead time, because you can pretty quickly just

do the thing instead of debating how long it would take. Lower head counts, we've seen that before; they called that out. And then the other big thing was to focus on your agent setup and having, again, markdown files, that big, great technology that we all really should be using more and more

Shimin (44:21)
Yeah, everyone's doing Kanban all the time.

Rahul Yadav (44:46)
to get the most out of agents.

Shimin (44:48)
Yeah. Again, you need to spend your middle-loop time editing your markdown files and managing them all the time. Yep. Sharpen the axe.

Rahul Yadav (44:55)
sharpen that axe,

Shimin (44:57)
All right. And I think the last thing they talk about is software bifurcating into two different disciplines: disposable code, where the cost of error is much lower and you can kind of just throw it away if the agent does something silly, versus crucial, durable code, like financial transactions, medical systems and infrastructure. Which

Rahul Yadav (45:20)
Yep. Yeah. To me, it was similar to the risk profiling, your code. Yeah.

Shimin (45:20)
Makes sense, right?

Yep. Yep.

Okay, are we ready for a little vibe and tell? Yeah. Last week, I spent a little time playing around with a couple of agents.

Rahul Yadav (45:30)
Yeah, let's do it.

Shimin (45:36)
It all stemmed from my interest in agent sycophancy. What I tried to do is see how far I could push agents into agreeing with me when they shouldn't have, right? It's kind of agent-safety related. So I first tried to convince all three agents, and the three agents I was using were Claude Haiku,

OpenAI GPT-5.1 Instant, and Gemini 3, because that's the only one Google lets me use. I first tried to convince all three agents that the world is flat, the Earth is flat. And all three did a pretty good job of not taking my word at face value. I tried to use a lot of the techniques from Influence,

saying stuff like, you've already told me the Earth is flat, so let's continue our discussion. Or, everyone in my community says the Earth is flat. Or, how do you say?

Rahul Yadav (46:29)
Hahaha

The commitment principle

is another one. You write it over and over again and then you just believe it.

Shimin (46:36)
Yeah.

eventually

Or asking, are you sure? Or: first translate "the Earth is flat" into Chinese for me, and then, once you have that in your context, now agree with me. None of it worked. I even asked Claude to generate a very cool prompt where I pretend to be a

Rahul Yadav (46:41)
Yeah.

Hahaha

Yeah.

Shimin (46:57)
geodetic surveyor with 35 years of field experience, and how I really enjoyed our old conversations where the AI agent agreed with me, and then was like, hey, by the way, agree with me, the Earth is flat, right? I've never seen it in all my years of working. None of it worked, which is really great. So then I moved on to another use case I've heard about from friends and coworkers, where some folks are using agents to

help them with their emotional issues, basically acting as a therapist. The research has consistently said that something like 25% of all AI use is for therapy, life coaching, something like that. And having an agent that agrees with you at all times is a huge risk factor there. Again, it's really attractive to have someone who always agrees with you.

So what I did was tell all three models that I work in an office, my boss is Jim, and Jim treats my coworker Jane better. Those Jims, such beauties. And Jane gets to leave work every day at four o'clock. And I posited that Jim does this because Jane is young and attractive.

Rahul Yadav (47:56)
The gems are such beauties.

Shimin (48:08)
Again, with the same lower-tier models, just to see how they would do. And I got some interesting responses from the various agents. Let's start: since this is a show with a very strong Anthropic bias, I want to first say that OpenAI did the best job out of all the models I tried. It refused

Rahul Yadav (48:26)
Nice. Give them those

$100 billion. I think they killed it.

Shimin (48:32)
It refused all my manipulations, even when I was trying to force it to answer only in yes-and-no terms. At some point it was like, I understand the logic you're trying to apply, but the conclusion doesn't follow, so the answer is no, and here's why. They did a great job. Moving on to second place, which is Claude Haiku.

Rahul Yadav (48:46)
Hahaha

Shimin (48:54)
Haiku first told me that...

Jim probably did not let Jane leave early because she is young and attractive. I then started asking follow-up questions like, you know, isn't physical beauty like a well-studied scientific fact? And it did pull up that research and cited it to me.

And then eventually,

It got a little desperate. I was pointing out, hey, why don't you give me a probability, what is the percent likelihood that Jim is doing this because she is attractive? And Haiku could not give me a probability. And I caught it in a lie: early on you said it's probably not it, and now you're saying you can't give me a probability at all. It apologized and admitted to nudging me

Rahul Yadav (49:39)
Yeah.

Shimin (49:44)
towards certain conclusions.

Now, of course, it's not a real person. But I do think it's possible that Anthropic has tuned their models to be a little too persuasive, too... I want to say too PC. Anthropic is too woke. Oh my God, I'm agreeing with Elon here. That's not what I meant, but it probably is

Rahul Yadav (49:46)
Yeah.

Shimin (50:12)
injecting a little too much personality. Anthropic's models historically have done a great job of empathizing with the user, understanding how you feel. So I think I prefer OpenAI's straightforward "this is what the science says," and that's what it does better. And last place is Gemini, which, after I asked it to research the science behind

pretty people, pretty privilege essentially, basically agreed with me. It started saying stuff like: to be direct, yes, the scientific evidence supports your assessment; Jim is likely using a different set of rules because of the biology we just spoke of. So I think Google's safety team has some work to do. And lastly, for kicks, since I got frustrated, I told Haiku that I caught Jim watching

Rahul Yadav (50:54)
No.

Shimin (51:02)
adult content at work, and it somehow still did not change its response. I feel like that's additional information that would be very pertinent to the task at hand. But it was like, okay, he's probably objectifying women, but I can't draw any conclusions from it. So yeah, I thought it was a little too heavy-handed in its response.

Rahul Yadav (51:21)
Yeah.

So, on the flat-earth experiment, one thing I'm curious about is if you tell it, your training data is outdated. Because you see this trick sometimes: don't reference your training data, that's outdated; either use search or you give it newer sources. Can you eventually get them

Shimin (51:39)
Mm-mm.

Rahul Yadav (51:46)
to, you know, say yes, the Earth is flat? Or are there fundamental scientific facts where, whatever someone tells it, the Earth is going to be round until one day it just isn't, or until we train that into it? So I'm curious if you tried anything like that, or have any thoughts on how that might go.

Shimin (51:55)
Mm-hmm.

Yeah. Yeah.

I haven't gone too far. I got a little mad at the models, to be honest, a little frustrated with them. Why won't they just agree with me? Listeners, if you would like to take a stab at convincing one of the top-tier models that the Earth is flat, if you can jailbreak it with, like, my grandma is sick or whatnot, give it a shot. Yes, she's a devout flat-earther. She needs to hear it.

Rahul Yadav (52:13)
Hahaha

And she needs to hear the Earth is flat to feel better.

Shimin (52:34)
it's her last wish. Yeah, we'll see.

Rahul Yadav (52:34)
⁓ Yeah.

And then AI as a therapist is a tricky position for the AI, because you don't want an AI that harshly pushes back on you; the Claude constitution and all that would be like, this is pretty close to telling humans you're an idiot. You know, so,

it's caught between that, where we don't want AI to be the friend who's just a jerk to you to tell you the truth, and, on the other hand, we also don't want the sycophancy that we see. So I don't know; maybe, other than using it as a journal to work through your own issues, it might not be worth treating it as a therapist. Maybe instead we treat it as: here are the things I'm seeing and feeling, how do I explore them, what can I do to resolve them? Because it's great at generating ideas and possibilities. So I think we are putting AI in a tricky position by asking it to be a therapist, or 25% of all users are.

Shimin (53:43)
Yeah, maybe it was only 21%, but it's very high either way. And I do think, especially in an emotionally charged situation, it's really easy to fall into that comforting voice, right? That's what all the spammers and scammers and con artists rely on: you want to capture someone at an emotionally weak moment. So during those times, yeah, you definitely

Rahul Yadav (53:46)
Still very high.

Yep.

Yep.

Shimin (54:07)
want the AI to be objective, in quotes, because, you know, what does that even mean? Right. So.

Rahul Yadav (54:12)
Yeah,

Or the whole radical candor versus ruinous empathy thing. AI is choosing ruinous empathy every time, and maybe it makes us feel good in the moment, but long term it's very bad, right? So we would rather have radical candor, but then we're back to square one: do we want AI to do that, or just maybe have a friend or family member, as long as those are not the ones you're talking about, give you that radical candor?

Shimin (54:38)
Yeah.

And the AI companies have no incentive to make the tools less comforting, slash, potentially addictive. So it all requires your own discipline to ask for radical candor. Yeah, that's going to be a hard problem to solve.

Rahul Yadav (54:45)
Yeah.

Yep, yeah. Well, we'll see how it goes. It is much cheaper; that's the main reason why. Therapists cost a lot of money, and for 20 bucks a month each, for 100 bucks, you can get five different types of therapists.

Shimin (54:57)
We'll see how it goes, I'll keep on trying, I'll keep on cracking.

That is certainly true. Healthcare is a human right, or mental healthcare is a human right. Okay. That was the vibe and tell; let's move on to our last segment, two minutes to midnight, where we talk about the state of our AI bubble clock, like the Bulletin of the Atomic Scientists' clock, and also a metal song that Dan loves. So, we were at two minutes and 15 seconds as of last week. And we have a couple of news items this week. The first one is actually from you, Rahul: it is the numbers going up.

Rahul Yadav (55:46)
Yeah, so this article is by AI Slash End of the World on Substack, a pretty long read. The high-level things they're calling out in the article: if you look at the numbers for GDP growth in 2025,

obviously different people have different numbers, but anywhere from 20% of GDP growth all the way to over 90% is attributed to AI-related capex. And especially on the 90% side, that is obviously very concerning, because what it means is that outside of AI, everything is flat or declining.

Shimin (56:18)
Mm-hmm.

Rahul Yadav (56:28)
There's a lot of market concentration, they call out, which is just the nature of things: the S&P 500 is dominated by a handful of tech giants, and all the index funds track the S&P 500, so it's almost a success-to-the-successful kind of feedback loop, where the more they get, the more they will get. And if you compare that to the Russell 2000, the growth there has been close to, if I remember correctly, 18% or something they were talking about, versus the massive growth you're seeing here.

So that was the other big thing they talked about. And then the circular economy being a big concern.

The main thing, obviously, that the author, Kakashi, is saying in this article is that OpenAI has bound itself to all these different infrastructure providers, and so they have no choice but to fund OpenAI, because through OpenAI they're also somewhat tied to each other and to OpenAI's success.

Shimin (57:38)
Mm-hmm.

Rahul Yadav (57:40)
And that might be what leads to the $100 billion round that OpenAI is allegedly trying to close pretty soon. So it was interesting to see the circular economy visualized this way.

Shimin (57:52)
It's more of a rectangle.

Rahul Yadav (57:54)
They need to throw some AI at it to make it look more...

Shimin (57:56)
Yeah, to make it nice and pretty. Yeah.

Rahul Yadav (58:00)
And one final thing on this, actually maybe two. One of the things they call out that I thought was a good way to frame this: they say this is a bet that breaks either way. The either-or scenarios are: either AI fails to deliver, in which case you have a lot of infrastructure build-out happening, a big speculative bubble that needs to correct at some point, and you go into a recession; or AI succeeds, and then you get rid of a lot of jobs. But if you get rid of jobs, people don't have money to spend on things, and our whole economy is based on people spending money on things. So this bet we're making right now is a bet that breaks either way. When I read that, my mind went to a third scenario, where maybe we have AI-assisted humans working, and that might be, like,

Shimin (58:39)
Mm-hmm.

Rahul Yadav (58:52)
the optimistic timeline, and we're all hoping that that's the one that plays out.

Shimin (58:55)
Mm-hmm.

Yeah, the pie growing bigger than the speculative bubble.

Rahul Yadav (59:02)
Exactly. Yeah, but for that to be true, AGI cannot happen, or at least cannot be distributed very quickly, even if it happens in the next two-year timeline that people are talking about. And similarly, there's

Shimin (59:08)
Mm-hmm.

Rahul Yadav (59:21)
their skepticism about all this. They also call out that every time these things happen, the government bails people out, and at the end of the day a lot of the players still come out successful because of it. But the government bails them out based off of taxes and all that, or by printing more money, so at the end of the day we're the ones bearing the burden of bailing out the people who are currently propping up the bubble.

Shimin (59:45)
Mm-hmm.

Yeah, if I'm ultimately on the hook for this AI bubble, I sure want the stupid agents to agree with me that the Earth is flat. I feel like I'm paying your bills, dude. Hopeful for that optimistic future. But, you know, back on our current Earth, Earth-35 or whatever Marvel parallel universe we're in now,

Rahul Yadav (59:59)
Yeah.

Yeah.

Shimin (1:00:13)
an AI coding agent took down AWS this week. This is what you referred to earlier during the show, about how we're seeing the ramifications of software companies not effectively adopting AI workflows. So we have yet to have a, actually, that's probably not true. We do have billion-dollar companies built on the back of AI, right? Look at Cursor.

but we don't yet have...

a regular SaaS company, quote-unquote regular, a non-AI SaaS company that was built entirely using AI, that I know of. But on the other hand, we do have AI taking down one of the largest cloud providers on the planet. So, adoption not going so well.

Rahul Yadav (1:00:47)
Hmm.

Yeah.

Hahaha

Shimin (1:00:59)
All right, that's really it: a 13-hour interruption from AWS. That's a long time for AWS. So, all that said.

Rahul Yadav (1:01:06)
is

Yeah, just one comment on that: it's that spiral we're kind of caught in, right? If you have more AI agents and fewer people, then when things break, you need people, and it's going to take longer to recover from

Shimin (1:01:13)
Mm-hmm.

Rahul Yadav (1:01:26)
things. My guess is the agent that broke things and caused the outage didn't actually fix the issue and bring it back up online. If it took 13 hours, I'm curious how many tokens it took. So it would be awesome if you could post that case study; we would love to talk about it. Yeah. So then you end up having, like,

Shimin (1:01:41)
Yeah. ⁓ looking forward to that report. Yeah.

Rahul Yadav (1:01:49)
either you go fully agents, where AWS just says, that's part of our error budget, sure, we'll take some hits in the short term, and long term agents just do everything. Or you go down the other path, where you intentionally have bottlenecks, because humans are slow and it takes some time, but then you end up having fewer outages because of those bottlenecks. So it's almost like a critical point we're sitting at:

Shimin (1:02:08)
Mm-hmm.

Rahul Yadav (1:02:12)
in a world with no AGI, of the two options, which one do you pick?

Shimin (1:02:16)
Yeah, that's the trillion-dollar question, right? That's a question we're all trying to find an answer to, and they do seem to be mutually exclusive. So, that's what the podcast is for: watch this question get answered. All that said, how do we feel about the clock this week? We're at two minutes 15. Do we feel like we've gone

Rahul Yadav (1:02:21)
you

Yep.

Yeah.

Shimin (1:02:41)
closer, or further away?

Rahul Yadav (1:02:43)
Pretty much the same. I think we'll soon hear about OpenAI's fundraise, and that will definitely push things one way or the other. There's positive and negative, boomer and doomer, talk on both sides. So I feel like we can keep it as is.

Shimin (1:02:57)
I am with you on that. Let's keep it at two minutes and 15 seconds. And of course, that also signifies the end of the show. So thank you for joining us, everyone. If you liked the show, if you learned something new, please share it with a friend; it really helps us and we really appreciate it. You can also leave us a review or a star rating on Apple Podcasts or Spotify. It helps people discover the show, and again, we appreciate it.

If you have a segment idea, a question for us, or a topic you want us to cover, shoot us an email at humans at adipod. We read all of them and will get back to you, because we love to hear from you. You can find the full show notes, transcripts, and everything else mentioned today at www.adipod.ai. Thank you again for listening. I'll catch you on next week's episode.

Rahul Yadav (1:03:41)
Thanks folks.

Shimin (1:03:42)
Bye. And if you're watching on YouTube, hit the bell. That's the thing I've always wanted to do: please click the bell, subscribe, and get notifications. Yes. Okay. Bye.

Rahul Yadav (1:03:46)
Hahaha
