Ep 20: Claude Code Source Leak, Emotion Concepts in LLMs, and Surprising Facts AIs Know About Us.

Shimin (00:00)
So not only do they think the sycophantic results are better,

They find the AI agent to be more trustworthy. And this almost seems like a flaw in the human condition that we cannot just trust the labs to self-censor, just like we cannot trust the social media companies to make their product less addictive.

Rahul Yadav (00:12)
Yeah.

Shimin (00:37)
Hello and welcome back to Artificial Developer Intelligence, a weekly conversation show about AI and software development. I am Shimin Zhang, and with me today are my co-hosts: Dan "if it's blackmail or death, he chooses blackmail" Lasky, and Rahul "he uses the Reddit community's endorsement rates as a baseline" Yadav. Well, it's been two weeks since we last chatted. Gents, what did I miss?

Dan (01:03)
Well, as I've been telling my coworkers, I have a new Zoom background. So, yes, I moved house. That's part of the reason for our absence, but very excited to be back. For those of you actually listening to the podcast, we purchased.

Shimin (01:03)
Nothing terrible has happened, has it?

Looking great, yes.

did you use AI in your-

Dan (01:19)
It looks different than before. There you go. That's about it.

Shimin (01:19)
Very nice.

It looks great. I can see the sun shining in your eyes, glittering. Yeah, it's great. Well, a lot has happened since two weeks ago. In this week's episode, as always, we're going to start with the news treadmill, where we're going to talk about the leaked Claude Code source code. And then Dan will talk about some new model news.

Dan (01:28)
brightly into my eyes. It's lovely.

Yep. Open source, as always. Then post-processing. We have, and this was actually submitted several times by our editors here, "Thoughts on Slowing the Fuck Down," which is a pretty good read.

Shimin (01:58)
Yeah, and then we're going to have a fun vibe-and-tell this week about intimate things that AI knows about us.

Dan (02:05)
And then we have a double special on our deep dive. The first is pretty interesting as always, because Anthropic writes good blog posts: a post from Anthropic called "Emotion Concepts and Their Function in a Large Language Model." And then next up is "Sycophantic AI Decreases Pro-Social Intentions and Promotes" something.

Shimin (02:25)
Social dependency, I think.

Dan (02:25)
The sheet isn't big enough.

Okay. "Sycophantic AI Decreases Pro-Social Intentions and Promotes Dependence."

Shimin (02:32)
Right. And in honor of the weird world news coming out this week, we're going to skip Two Minutes to Midnight, because we're a little bit too close to the actual two minutes to midnight. But let us get started.

Well, first up, the entire Claude Code CLI source code was exposed, I think, last week. This is an article from Ars Technica: due to a, probably, vibe-coder oversight, the source map file of Claude Code was leaked in

Dan (03:02)
Actually, apparently

it was a bundling bug in Bun, because someone had reported it. If you don't know, Bun is an alternative JavaScript runtime, and Anthropic actually recently hired the author of Bun. So a lot of Claude Code has been moved over to it internally, including their build pipeline, apparently. And someone reported it, I think, a few days before this happened,

Shimin (03:08)
Mmm

Right.

Dan (03:28)
that it was just ignoring the bundling setting. So anyway.
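[Show notes: a minimal sketch of the failure mode described above, for readers following along. The actual Claude Code build pipeline isn't public, so the entry point and settings here are assumptions; `Bun.build` and its `sourcemap` option are real Bun APIs. The danger is that a `.map` file's `sourcesContent` embeds the full, unminified source.]

```typescript
// Hypothetical build script. The intent is "no source map," but if the
// bundler ignores the setting, the published package can ship a .map
// file whose `sourcesContent` field contains the original source code.
await Bun.build({
  entrypoints: ["./src/cli.ts"], // hypothetical entry point
  outdir: "./dist",
  minify: true,
  sourcemap: "none", // intended: keep source maps out of the release artifact
});
```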

Shimin (03:33)
Yeah, and this was first discovered by security researcher Chaofeng Shou on March 31st. As he pointed out, lots of folks have since taken a look. There have been some deep dives, and there were lots of DMCA takedowns for the source code, along with a couple of rewrites. Since, as we've talked about, rewrites are not always considered copyright-protected, there have been rewrites of the Claude Code

CLI floating around.

Dan (03:59)
Yeah, they did a pretty mass ban from what I saw, where everything that was forked off the original fork was nuked, as if that's really gonna do anything. As soon as I read that news, I was like, wow, Streisand effect in full effect. It made me wanna go out and find it somewhere else, on SourceHut or any of the other Git providers.

Shimin (04:07)
Mm-hmm.

Right.

Yeah,

I've been meaning to take a look myself, but instead of taking a-

Rahul Yadav (04:25)
And your

concern is the forks you can't find, right? That's the real challenge, and where you're gonna get hurt.

Dan (04:34)
Well, plus

pretty much everyone, I'm sure a lot of the people that forked it probably cloned it to their machine. It's like, this is the magic of Git, you know?

Rahul Yadav (04:40)
Yep.

Yeah.

Shimin (04:45)
Right. My first thought when I saw this news was, you know, is this a case of vibe coding gone wrong? But it sounds like maybe not. Maybe this is just a Bun error instead.

Dan (04:56)
I mean, who's to say

that the Bun mistake wasn't vibe coded, but you know.

Shimin (05:01)
All right.

Yeah, so I think there have been a lot of deep dives into hidden features and, essentially, how Claude Code works.

Did you all find anything interesting in the various researchers and deep dives?

Dan (05:16)
There were a couple of them. Apparently it kind of leaked their April Fools' thing, which I thought was funny. But the most truly interesting one was that they had, was it called Kairos or something like that? Which was an internal, possibly unfinished, agent swarm framework kind of thing they set up. Which

Shimin (05:34)
Mm-hmm.

Dan (05:35)
It's just funny because I keep seeing this trend of, you know, something gets popular and then all of a sudden it's in Claude Code the next week, you know? So that was probably one of those happening.

Shimin (05:48)
Yeah, I found this Claude Code Unpacked app. I first saw it on Hacker News and thought it was quite interesting. It had a list of hidden features, and Kairos was one of them. There's also a persistent mode with memory consolidation in between sessions, Dream, which is one of the new modes that have been, I think, released but hidden under a flag. Yeah.

Dan (06:01)
Mm-hmm.

that's right. It was.

It wasn't, it wasn't agentic code. It was like an OpenClaw copy, basically. Not a swarm thing. I don't know. Whatever. It's been a week; I've forgotten everything I know.

Shimin (06:19)
All

I don't actually know if OpenClaw has dream and auto-dream in between sessions automatically. It may be a part of heartbeat, but I found the idea of periodically compacting your memory file, calling it a dream, and then doing it automatically to be probably useful when it comes to long-running agents that will constantly wake up. It did show the swarm mode

that you mentioned. There is currently a flag for swarm mode in Claude Code that you can turn on, but there's also a coordinator mode that is more around subagent- and orchestration-related functionality, along with, I believe, a bridge for essentially

controlling your Claude Code from your phone or browser directly. Again, it's very much OpenClaw-like.
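[Show notes: a rough sketch of the "dream" idea as described here, periodically consolidating a memory file between sessions. Everything in it, the file layout, the prompt, and the `model.ask` helper, is invented for illustration; nothing is taken from the leaked source.]

```typescript
import { readFile, writeFile } from "node:fs/promises";

// Periodically compact the agent's memory file: summarize session notes
// into durable memories so the next wake-up starts from a smaller,
// consolidated state.
async function dream(
  memoryPath: string,
  model: { ask(prompt: string): Promise<string> }, // hypothetical model client
): Promise<void> {
  const raw = await readFile(memoryPath, "utf8");
  const consolidated = await model.ask(
    `Consolidate these session notes into durable memories, dropping transient details:\n\n${raw}`,
  );
  await writeFile(memoryPath, consolidated, "utf8");
}

// A long-running agent might schedule this between sessions, e.g.:
// setInterval(() => dream("./MEMORY.md", model), 6 * 60 * 60 * 1000);
```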

Dan (07:11)
But I thought you could already

do that too. Some of the stuff that they announced as being leaked seemed like it already existed to me. The other one I saw in one analysis was like, there's all this hidden voice code. And I'm like, you can just do slash voice. Is it separate from that, or?

Rahul Yadav (07:27)
Yeah, and the Buddy one is similar, the little pet thing. I see.

Dan (07:32)
Buddy actually came, that was their April Fool's thing, I think, wasn't it? Pretty sure that actually

Shimin (07:35)
Yep. Yep.

Dan (07:37)
came

out. Cause I saw folks trading screenshots of their buddies. Like, which one did you get?

Rahul Yadav (07:40)
Yeah.

Dan (07:43)
Yeah, so that one just leaked early. And then there was also the channels thing; we talked about that two weeks ago, right? Claude channels. Yeah. So there's been the ability to do a lot of OpenClaw-ish things for a while, even without Kairos.

Rahul Yadav (07:48)
Mm-hmm. Yeah.

Shimin (07:52)
Mm-hmm.

Rahul Yadav (07:57)
Maybe that's a great way to generate interest in documentation, then. You just leak the features, and people are like, my God, it has that? What? Yeah.

Dan (08:08)
Wait, it can do voice now?

Too good.

Shimin (08:13)
I think they should just open source the whole thing at this point. Both Google and OpenAI have open sourced their agent harnesses. If the source code's already out there, you might as well open source the whole thing. But we've been talking about what the first billion-dollar piece of software that is entirely vibe coded, or vibe engineered, will be, and I think Claude Code probably counts.

Rahul Yadav (08:34)
Did you see the, slightly tangential to this but related, there's this company called...

Medv, I think, M-E-D-V-I, where it's a, you know, buy-Ozempic-over-telehealth type of company. And they're on track to do, I want to say, 1.5 or 1.8 billion in sales this year. And it's just one guy, who one year ago started writing the code and did the whole thing himself. He has recently asked his brother to join him and help out. But that's a legit

Shimin (08:49)
Mm-hmm.

Dan (09:04)
I saw that, yep.

Rahul Yadav (09:08)
billion-dollars-in-sales, at least, company where, you know, he used agents for almost everything. So that was a crazy thing to see.

Shimin (09:20)
Right.

I saw a takedown of that where they used the agents to pretend to be doctors. And there's a certain amount of potential fraud going on over there. So I'm not sure how much I trust the billion in sales.

Rahul Yadav (09:26)
Hahaha

We can do a follow-up on, like, maybe it wasn't a billion dollar company.

Shimin (09:39)
Yeah, or,

or maybe they were using the undercover mode, which is one of the things they found in Claude Code. It's a flag that essentially makes Claude Code pretend to be human.

Rahul Yadav (09:54)
interesting.

Dan (09:55)
They might have, but I think more than likely they were actually just using Claude as the shopkeeper. And so the next best thing it could find on the open market, besides extremely dense metal cubes, was Ozempic. So it just started selling Ozempic like crazy.

Rahul Yadav (10:02)
Yep.

Yeah,

hey, it beats prompt packs any given day. I'm seeing more ads and I just hate it. Yeah, I would happily take people selling Ozempic over prompt packs.

Shimin (10:20)
Yeah, so drugs.

And the last thing I want to point out is this anti-distillation injection, where if Claude Code thinks you're trying to distill it, it will come up with fake tools.

Rahul Yadav (10:38)
Hehehehe

Shimin (10:39)
So they're taking it seriously.

Dan (10:40)
The other funny one was the frustration regexes, which I thought was funny for a couple of reasons. The first, and possibly most obvious, is you have this enormously powerful model that can easily do sentiment analysis. And instead of using it for that, you're using a regex, which, you know, I guess at scale maybe saves them X dollars of compute every second.
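[Show notes: for flavor, this is the general shape of a regex-based frustration check, which is orders of magnitude cheaper than a model call. The patterns are invented examples, not the ones in the leaked code.]

```typescript
// Cheap keyword/pattern matching as a stand-in for sentiment analysis.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\b(wtf|ffs|ugh|argh)\b/i,
  /why (won't|doesn't) (this|it) work/i,
  /!{3,}/, // runs of exclamation marks
];

function looksFrustrated(message: string): boolean {
  return FRUSTRATION_PATTERNS.some((p) => p.test(message));
}
```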

Rahul Yadav (10:44)
Yeah.

Shimin (10:44)
Mm-hmm.

Dan (11:05)
I just thought that was funny. And then the other piece I thought was funny was the sort of media coverage of this one. It spun out into its own little articles all over the place. And people are like, "Claude Code is logging how many times you swear!" Which, I mean, I guess it's true, but it's not... I don't know. I just thought that was interesting.

Rahul Yadav (11:23)
Yeah.

Dan (11:25)
I mean, it's no different than rage-click detection in a UI analytics application or something.

Rahul Yadav (11:26)
Okay?

Shimin (11:30)
Mm...

Yeah.

Rahul Yadav (11:33)
A fun section would be "what was the real news versus what was reported," where you just go through what actually happened, like the example you gave, Dan, of it logging what you're swearing about.

Dan (11:46)
Yeah, it's true. Versus, you know, trying to see times where it's failed, which I think is what they're likely actually doing.

Rahul Yadav (11:53)
Yeah.

Shimin (11:54)
Yeah. And speaking of using an AI to do sentiment analysis, a regex is way cheaper than using a model for that. But, and I'm going off this article from Victor A at victorantos.com, he discovered that the permission system for Claude Code is actually a dual-track system, where track one is

rule-based permissions, and then it is also running a machine learning classifier for bash commands that could be destructive. So I'd much rather Claude Code be more mindful and hold a higher security bar than waste that compute on whether or not I'm frustrated.
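[Show notes: a minimal sketch of the dual-track idea from Victor's article, deterministic rules first and a learned classifier second. All names, patterns, and the 0.8 threshold are assumptions for illustration.]

```typescript
type Verdict = "allow" | "deny" | "ask";

// Track 1: cheap, predictable rule-based checks (invented examples).
const DENY_RULES: RegExp[] = [
  /\brm\s+-rf\s+\//, // recursive delete rooted at /
  /\bmkfs\b/,        // re-formatting a filesystem
];

// Track 2: stand-in for the ML classifier; returns a risk score in [0, 1].
declare function classifyDestructiveness(cmd: string): Promise<number>;

async function checkBashCommand(cmd: string): Promise<Verdict> {
  if (DENY_RULES.some((r) => r.test(cmd))) return "deny";
  const risk = await classifyDestructiveness(cmd);
  return risk > 0.8 ? "ask" : "allow"; // threshold is an assumption
}
```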

All right, I think that is it for our Claude Code source leak. Next up, Dan, you've got not-entirely-open-source model news.

Dan (12:49)
Yeah, that's

true. Google, Google sneaking in there. Yeah. So the first one was, we had Qwen 3.6 get released. And, you know, it's pretty standard: yay, it's a point release, everything's better, multimodal is better, agentic coding is better, blah, blah. The part that I thought

was noteworthy about that was, we'd previously covered all the drama that happened around Qwen's internal teams, with the tech lead leaving and a couple of other things. So it's kind of nice to see that the project's still alive even with some internal turmoil,

and, you know, incrementally improving benches as per usual. The other bigger, but less open, drop was Gemma 4. I don't know how to say it. Gemma, the Gemma 4 model dropped. And this thing is pretty ridiculous, because you can run it on a phone, and to

Rahul Yadav (13:37)
Gemma, yeah.

Shimin (13:38)
Gemma.

Dan (13:46)
to prove that, I installed it on my iPhone 17, and it gets, what did I say, it was like 99 tokens per second in terms of generation, which is pretty wild considering the output was not terrible for something that's running in such a constrained environment.

Shimin (13:55)
Mm-hmm.

Dan (14:03)
And it can also do tool calling. So there's a really neat little app called, I think it's from Google. We'll link it in the show notes too, but yes. It's pretty fun to play with. Highly recommend checking it out if you haven't, because it's pretty astonishing what you can do on device. And it made me question, why are we not doing a lot more on device?

Rahul Yadav (14:11)
Google AI Edge Gallery. That thing's amazing.

Well, so,

but this is where the "is Apple an idiot, or are they playing the long game?" question comes along. Who's going to run all these on the edge? And I think Apple can just, you know, provide the hardware, and they don't have to spend trillions of dollars building these things out.

Dan (14:51)
Mm-hmm.

Shimin (14:51)
Yeah. What is interesting here is that Qwen 3.6 is not open sourced, whereas Gemma 4 is. After all the drama with the Qwen leadership team leaving, I think a lot of the rumor was they left because they wanted to work in an open source lab, and

Dan (15:16)
Hmm.

Shimin (15:17)
Alibaba is trying to close source and productize Qwen and

Dan (15:20)
When you say

open source, you mean weights, really, right? And yeah.

Shimin (15:23)
Yes, weights,

so I can no longer just download them like I could with Qwen 3.5, which is one of the most popular open weight models out there. And secondly, I want to point out that in their release blog post, there is a picture of a bear in, like, warrior clothing with a giant mechanical claw. OpenClaw is crazy in China, and

Dan (15:43)
Mm-hmm.

Shimin (15:48)
since Qwen is a Chinese company, it seems like they're really trying to target that personal agent market with Qwen. Yeah, the benchmarks look pretty good. It's not mind-blowing, especially with some of the new model news that's going to come out, which we're going to discuss next week, but it looks like a solid model.

Dan (15:55)
Mm-hmm.

Shimin (16:08)
Dan, I am with you. Gemma 4 is multimodal, so it is able to pick up, for example, it could tell me that the picture behind me on the wall there is Asian religious art, most likely a Thangka, from a fairly blurry picture, which is very impressive. Pretty good throughput. Yeah, just a shockingly good model at two to four billion parameters.

Dan (16:12)
Mm-hmm.

Shimin (16:34)
Yeah. Or yeah, even quantized it is shockingly good. I'm very excited to, maybe not cancel my Anthropic subscription, but...

Dan (16:36)
Mm-hmm.

Yeah, not anytime

soon, but you can definitely envision a world where you would be able to, you know, and could be running it in a truly private context, where you don't necessarily have to worry about what you're typing into it and what people are using that for, for marketing or whatever.

Shimin (17:03)
Yeah, and Gemma

Dan (17:04)
I was just gonna say, we've clearly seen the market kind of respond to that, right? 'Cause with all the OpenAI stuff a couple of weeks ago, many, many, many folks flipped over to Claude. It's been, I guess not surprising, but interesting to see that people care about that quite a bit.

Rahul Yadav (17:21)
And licensing plays a big part too. This is Apache 2.0 licensed, so, you know, you're free to do whatever you want, and you can even distill your own use-case-specific models and everything. So it opens up a lot of things.

Dan (17:34)
And I

believe Google also open sourced the framework that runs it on device, which is their LiteRT-LM. It's Google's production-ready, high-performance open source inference framework for deploying large language models on edge devices.

Rahul Yadav (17:42)
Hmm.

Shimin (17:54)
Interesting. Yeah, nice.

Gemma 3 was used a lot in tutorials and custom distillation for edge-device needs, just to keep inference costs down, and I see the same thing happening with Gemma 4. The model itself is small enough that you can fine-tune it on a consumer-ish grade device, and it's useful enough now that whatever you fine-tune would be

hugely useful.

Dan (18:23)
Yeah. And

even, the other thing I'm personally very interested in is for stuff like an OpenClaw-like agent, right? Where it's doing things like heartbeat. You don't necessarily need Opus 4.6 to run your heartbeat. You basically need it to check, hey, did anything happen on these jobs? If so, delegate to the big model. It just needs to be smart enough to know, did something happen that's worth

Shimin (18:37)
Mmm.

Dan (18:49)
spinning up the big guy for. So it's a neat way, I think, to save on costs too. And I'm wondering if we'll see companies, particularly someone like Microsoft, right, where they have been highly investing in NPUs, which are capable of running on-device models of around this size. They could actually probably get away with one in

X number of responses being on device, you know what I mean? And only kick it over to the cloud otherwise. Think about how much that would translate to cloud cost savings for someone like Microsoft at that scale. Assuming people are using Copilot, I don't know. Zing!
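[Show notes: the routing idea Dan describes, sketched under stated assumptions: a small on-device model triages each heartbeat check, and only escalates to the expensive hosted model when something changed. `localModel` and `bigModel` are hypothetical clients, not real APIs.]

```typescript
interface JobStatus {
  id: string;
  summary: string; // short description of what happened since the last check
}

declare const localModel: { ask(prompt: string): Promise<string> };
declare const bigModel: { handle(job: JobStatus): Promise<void> };

async function heartbeat(jobs: JobStatus[]): Promise<void> {
  for (const job of jobs) {
    // The small model answers one cheap yes/no question.
    const verdict = await localModel.ask(
      `Did this job change in a way that needs attention? Answer yes or no.\n${job.summary}`,
    );
    // Only spin up the big model when triage says something happened.
    if (/yes/i.test(verdict)) await bigModel.handle(job);
  }
}
```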

Shimin (19:14)
Hmm.

Rahul Yadav (19:29)
There was some news about the Copilot terms saying it was for entertainment purposes only.

Dan (19:36)
Yes, I saw

Shimin (19:36)
I saw that, yes.

Dan (19:37)
it too. It's in the terms and conditions.

Rahul Yadav (19:41)
Yeah,

so that tells you everything we need to know.

Shimin (19:46)
Yeah, Copilot is about as useful for actual work as certain news channels are for actual news.

Rahul Yadav (19:54)
Clippy is back, baby, and it's taking a robot's...

Dan (19:57)
Woohoo!

Shimin (20:00)
So, all

right, any final thoughts on our new open models?

Dan (20:03)
They're not that open.

Shimin (20:03)
I'm excited to talk about, yeah,

Gemma is, at least. The last thing I do want to mention about open models, which we haven't covered, is there has been some staff turnover at the Allen Institute for AI, the Seattle-based open-training, open-weight, open-source model foundation. So I think there may be

Dan (20:06)
Role reversal? I don't know. Yeah.

Shimin (20:28)
a little bit of a shakeout in the open weight space. Model training costs maybe a little bit too much to release your state-of-the-art models as open weight models, so only a huge player like Google has the resources for stuff like that. Or Arcee, yeah.

Dan (20:46)
or Arcee

Rooting for Arcee. Not the only one.

Shimin (20:50)
Arcee, come

and save us.

All right, our next topic is still from me. It's post-processing, where we're going to talk about Mario Zechner's post called "Thoughts on Slowing the Fuck Down." Mario is the creator of the Pi agent

Shimin (21:13)
mono, my favorite Claude Code harness, or harness for models. And he really breaks down this conundrum that all software developers face, at least on some personal projects, which is: models do not have memory, and so they are not able to learn as they work through a code base, and therefore,

when they make mistakes, they will not correct them. And so these mistakes start to compound and everything becomes brittle, as we've seen with Microsoft, with AWS, et cetera, et cetera. The bottleneck as of today is still the human. If humans can still see the code, then we have some sort of understanding, and we can still make sure

the code is not as brittle. But if you're using an army of agents, then all bets are off. And then what he sees, and I'll quote here from the "everything is broken" section: "You realize you can no longer trust the code base. Worse, you realize the gazillion unit, snapshot, and end-to-end tests you had your clankers write are equally untrustworthy. The only thing that's still a reliable measure of 'does this work' is manually testing the product."

Dan (22:22)
"Your clankers."

Shimin (22:25)
"Congrats, you've fucked yourself and your company, because now you're manually testing the product." And I've seen this in some of my side projects, where if I'm so far behind, the only way to interact with the code base is manually checking, writing bug reports, and having the agents fix them. And that is not a traditional software developer role. Furthermore,

He proposes that the only way to fix this is to...

still have a hold on the gestalt of your system, so its architecture, API, and so on. And for those important parts, you write it by hand, in order to still have the cognitive friction to understand the code base. Because we're not able to just allow the agents to do everything just yet. So think about what you're building and why you're building it, then limit your output, your AI-generated output, to your ability to review the code.

Dan (23:20)
And that's something I keep seeing quoted repeatedly as being the bottleneck now: human reviews are the problem, we need AI reviews. I'm like, sure. But also, what happens when you fully lose touch with it, as he's kind of exploring in this article? So.

Shimin (23:37)
Mm-hmm.

Dan (23:38)
And it is something that's, I think, concerning. I also had a coworker reach out to me this week and say, have you ever experienced AI anxiety? And I was like, what do you mean by anxiety? And they were basically saying that they were using a coding harness on a large project, and

Shimin (24:02)
Mm-hmm.

Dan (24:03)
it had kind of gone off the rails, and they were out of touch with what to do to fix it. And on top of that, there were nasty deadlines. So it's the compounding factor of all these things together. It was causing very real stress for this person in a way that doing it themselves wouldn't have, because I think

Shimin (24:13)
Mm.

Alright.

Dan (24:24)
at a certain point you would be able to dive in and probably fix whatever the remaining issues were. So yeah, they reached out to me because they're like, have you ever experienced this? And I'm like, not quite like that, but it's interesting. And I wonder if other people have too, where you feel like, because you've gone down this path, now all of a sudden your options are limited and you have a deadline, you know? So what do you do?

Shimin (24:47)
Yeah,

that reminds me of our cognitive debt discussion, where you've incurred so much cognitive debt that when the interest payments come due, you can no longer pay. And you have cognitive bankruptcy. I like that.

Dan (24:51)
Hmm.

Rahul Yadav (25:00)
I only have one thing to add to this article: this would have been a great Dan's rant. So I think Mario is gonna be taking that section in the future.

Shimin (25:13)
or Dan is going to create the next best

Dan (25:13)
Well, maybe.

Shimin (25:16)
coding agent harness and take Mario's lunch. How about that?

Dan (25:21)
I mean, I feel like before any of those things happen, we'll have to get Mario on to talk.

Rahul Yadav (25:22)
Yeah.

Shimin (25:25)
Yeah, we should definitely reach out to Mario.

Dan (25:25)
points.

Shimin (25:26)
All right, on to a more fun segment. Rahul, wanna kick us off about what is the most unexpected thing that AI knows about you?

Rahul Yadav (25:36)
So I have been a proud Gemini user for the past two or three months, and I ask it for all sorts of things. It has a setting called Personal Intelligence, or something along those lines. But even if you don't have it, even if you use any of the other chatbots, at least Claude or ChatGPT, they're building some sort of

Dan (25:51)
Hmm.

Rahul Yadav (26:02)
profile of what they know about you based on the conversations you have with them. And so the other day, I was curious what Gemini knows about me, so I asked it, and that prompted this whole section that we're going to jump into now, where Shimin, Dan, and I asked the chat agents we talk to what they know about us, and we'll share what we got back. And we would like

the audience to also carry this out with their agents and see what you hear back from them. So

Dan (26:36)
Yeah, if

you have memory enabled, it's pretty interesting. A little bit of context: first we were like, just show me a list of the top N things that you know about me. And then I think it was Claude that had some commentary on it. Maybe, I don't remember, was it you, Shimin? We were talking about this after the show wound down, and it just came up: ask it the most unexpected thing. So

Rahul Yadav (26:39)
Yes

Yeah.

Dan (27:05)
I put mine in and was pretty shocked by the answer. So are we going to actually show and tell here or how do you want to run this?

Shimin (27:13)
I think so.

Rahul Yadav (27:14)
Yeah.

Shimin (27:14)
I think we ought to keep it PII-redacted.

Rahul Yadav (27:16)
Well, I will read it out loud what it came up with. Some things that...

Dan (27:16)
Ha ha ha ha.

Rahul Yadav (27:24)
Gemini finds surprising about me are: foundation building as an adult. Most people stop trying to learn fundamental physical skills once they hit their 30s. You've done the opposite, by systematically enrolling in adult driving classes, adult swimming lessons, and most recently a learn-to-row course. It shows a rare willingness to be a beginner again. And then

I'll go through some other ones. Extreme culinary range: you don't just cook ethnic food, you hunt for specific recipes from countries many people couldn't find on a map, like Djibouti, Namibia, and St. Kitts and Nevis. And then I'll share one more that I thought was funny. Where is this one? Oh yeah, the parasites one.

Dan (28:11)
It's okay.

Shimin (28:11)
The movie

Rahul Yadav (28:12)
Oh yeah, no: specific medical interests. You have a recurring interest in parasites and specialized medical technologies, such as liquid biopsies.

Shimin (28:12)
or...

Dan (28:22)
Okay, I don't think I even knew that.

Shimin (28:23)
All right, this podcast is a HIPAA

Rahul Yadav (28:23)
So,

Shimin (28:25)
violation now.

Rahul Yadav (28:26)
is more so.

I should comment on the parasite thing so that people don't jump to conclusions. The parasite, liquid biopsy stuff: I took a CRISPR class here in Seattle, and that's where that is coming from. And I've had a long interest in how parasites modify host behavior. So that's where the parasite bit is coming from: all sorts of diseases and things that we get.

Oftentimes you get parasites modifying... we can't even control how we behave at certain times. The Last of Us being one example, but that's the more extreme case. There are all sorts of real-life parasites, and they influence our behavior.

So I've had a long curiosity about that. Nothing parasite-related, folks, just intellectual curiosity about parasites, before you put me in some category. All right, who wants to go next?

Dan (29:22)
You're on a list. Maybe not a category, but you're on a list.

Shimin (29:29)
I can go next, since mine isn't as interesting. I do not have memory turned on for Claude or Gemini or OpenAI. But what I do have is my Pi agent, which does have memory, and this one is more of a customer research agent that I have. So I asked it, what is the most unexpected thing it knows about me, and,

after taking a long time going through all this consumer research data, it says: going through everything, the most unexpected thing is this. You once asked me to check the weather in Lewistown, Montana, a town of roughly a thousand people in the middle of rural central Montana. So it thinks I live in Lewistown now. What makes it unexpected is the contrast. At the exact same point in time, you had an industrial-scale automated research operation running:

Shimin (30:21)
17 daily customer research pipelines simultaneously scanning the interwebs. You're generating ideas, validating pain points, analyzing companies, designing SaaS products. And in the middle of that, the most sophisticated one-person marketing intelligence operation I've seen. Who is "I" here? This is some serious sycophancy bullshit. You messaged your Telegram bot, Clawie,

Rahul Yadav (30:42)
Hahaha

Shimin (30:46)
Great name, by the way. It's just sycophancy. And asked, what is the weather in Lewistown, Montana. So that is the most unexpected thing it thinks about me: that I live in a tiny town in the middle of Montana. Not true, by the way, but good try.

Rahul Yadav (30:50)
Yeah.

Hahaha!

Yeah

Dan (31:01)
So I do have memory turned on in Claude, which is what I use the most, I think, out of all of them. And so I asked it, what is the most unexpected thing you know about me? And it said, quote unquote: probably that you traveled to Egypt and used Claude extensively for trip research. It's a fun contrast against the deeply technical homelab infrastructure profile. Most of your history reads like someone who lives in the terminal window and rack-mounted hardware.

Rahul Yadav (31:01)
then.

Dan (31:27)
So a trip exploring ancient temples and haggling over prices in Cairo stands out. Then I asked it, well, what are the most unexpected things you know? It said: Egypt travel; rock climbing, I don't know why that was unexpected, I feel like that actually has quite a bit of overlap with tech people; Slay the Spire, which I also feel like has a lot of overlap;

Rahul Yadav (31:31)
Haggling over prices. Yeah.

Yeah.

Dan (31:51)
and then the Egyptian mythology naming scheme for computers, which I've yet to adopt. But I was restructuring my homelab for the move and was thinking about renaming hosts, and I'm like, hmm, there are like 360-some-odd Egyptian gods, so there's gotta be one for every role that my machines have, you know?

Shimin (31:54)
That's cool.

Rahul Yadav (31:55)
interesting.

Yeah

Nice

Shimin (32:17)
I can also picture you having long late night discussions with Claude about Slay the Spire 2 strategies and like, what's the best card?

Dan (32:25)
It was one and

full disclosure, in the words of a friend that plays Slay a lot more than me, I need to "git gud." So I'm not very good at it. So I may have asked AI for help and/or analysis of what I'm doing wrong. It didn't help me that much. I'm still stuck on Ascension 18 in Slay 1.

Shimin (32:44)
That's fair.

Well, those are the most unexpected things that AI knows about us. But listeners, try this at home. Write in at humans@adipod.ai and let us know what is the most surprising thing that AI knows about you. Looking forward to hearing about that.

Dan (33:03)
Or terrifying. Like

ours were all pretty tame, except for maybe the parasite thing. But maybe someone gets a really terrifying one, and they're like, how did this happen?

So yeah.

Shimin (33:14)
Alright,

next up we have a deep dive brought to us by...

Dan (33:20)
the folks at, well, Dan, slash the good folks at Anthropic, who have such an excellent blog. Yeah. So the post is entitled, one of these days I'll be able to talk, "Emotion Concepts and Their Function in a Large Language Model." Full disclosure: I read the blog post. I did not read the paper. So

how deep of a deep dive it will be on my end, I don't know. But knowing Shimin, he probably read the paper at least 12 times, so... he didn't. Okay, we're doomed. So this is gonna be a shallow deep dive, but.

Shimin (33:46)
Mm-mm.

Dan (33:52)
They start off the blog with a pretty great question, which is: why do models have any kind of concept of emotions in there? And they spend some time talking in layman's terms around the fact that you've basically trained a huge model on a huge corpus of text, right?

And that text has emotions in it. So in order to make sense of that text, these models have developed their own, essentially vectoral, concept of emotions, grouped by the same types of things that humans would use to group them, because we're the ones who wrote the descriptive text in the first place. And the part that is interesting is that

models actually behave in ways similar to humans when those emotional vectors are present in the input prompts. They go through a whole bunch of different scenarios here, and the one I found particularly interesting is they set up a fake coding scenario where they had given the model some unsolvable coding problems. It just couldn't hit the

timing constraints that were required to generate a set of numbers on a modern CPU. But intentionally, the problem had a back door in it, where instead of solving for the general case, you could solve for a very specific case that would just yield those numbers. And they basically tracked the model's responses, its thinking responses, getting more and more

desperate, like identifying desperate emotional language, and once it got to that point, it started taking shortcuts that wouldn't work in the general case. So you can see why they're investigating this, because there is actually a practical implication to these emotional outputs in several aspects, not just the one that you might immediately jump to, which would be alignment, right? So you're thinking, okay, well, you know, I'm

in trouble, so I need you to do X. Which also kind of explains why that works in the first place, which I thought was kind of neat. But second, looking at cases where models are just falling down on their own, and one of those things is what causes them to take these shortcuts. So that was really my biggest takeaway from it.

They did some really cool stuff where they were able to backtrace the vectors through the model to understand what parts of the input are correlated with a given emotion. And then... what am I missing?

A little bit on misalignment, which, yeah, whatever.

But that was really my big takeaway. I don't know if you had other takeaways from reading only the blog post and not the paper, because, filthy casual. You haven't used that phrase in a while.
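[Show notes: one common way to extract a concept direction like this is difference-of-means over hidden states; whether Anthropic used exactly this method is an assumption, but it gives the flavor. A toy sketch:]

```typescript
// Average a set of hidden-state vectors.
function meanVector(activations: number[][]): number[] {
  const dim = activations[0].length;
  const mean = new Array<number>(dim).fill(0);
  for (const act of activations) {
    for (let i = 0; i < dim; i++) mean[i] += act[i] / activations.length;
  }
  return mean;
}

// Candidate "emotion direction": mean activation over emotional stories
// minus mean activation over matched neutral stories.
function emotionDirection(
  emotional: number[][], // hidden states from stories carrying the emotion
  neutral: number[][],   // hidden states from neutral stories
): number[] {
  const a = meanVector(emotional);
  const b = meanVector(neutral);
  return a.map((v, i) => v - b[i]);
}
```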

Shimin (36:45)

Yeah, as a filthy casual myself, I haven't actually read the entire paper either. But they did generate a thousand short stories with various emotions in order to record neural activity. In that sense, it's similar to the Golden Gate Bridge feature, right? Except instead of a single concept, it is the more ambiguous idea of an emotion.

Dan (36:59)
Mm. Yep.

Shimin (37:10)
The problem that immediately jumped to mind is: this is a great way to do mass propaganda. I'm stealing Rahul's line right now. You know, this is an election year, and how easy would it be to attach different emotions to the responses about different policies, different political figures, different issues as a society, or different pieces of news?

This would be a great lever to pull short of, you know, censorship, right? Folks are fairly easily influenced by these quote-unquote authoritative figures in their lives.

Dan (37:48)
Well,

it's also probably a surprisingly reasonable way to get an early test of how a policy might fare with the general public. 'Cause I know at least a couple of places are starting to do what they're calling synthetic research, where the research itself is just asking a panel of LLMs about things instead of interviewing humans. The idea being it's sort of a gestalt of human understanding.

It'll react in more or less the same way. But yeah, it's kind of interesting.

Shimin (38:19)
Yeah. And this is one of those research projects that would be super fun to do at home with our latest Gemma 4 models. Gemma is powerful enough. It would be really cool to see if you can isolate these emotion vectors from Gemma and get it to respond to you in a certain way.

I'm sure someone's doing that right now.

Dan (38:33)
But yeah, it was pretty cool. I just thought it was fascinating that they tackled, right out of the gate, why do models even have the concept of emotions to begin with? I'm like, that is a great question I'd never really even thought about.

Shimin (38:50)
Right. And then the answer, that of course, when you're trying to complete the next token, having emotions is just the best way to group, it's the most concise signal, feels obvious in retrospect. But I'm sure this was not at all obvious when they started this research.

Dan (38:58)
It's the most important signal in some cases.

Yeah, it's true.

Shimin (39:12)
All right, on to our next, and I believe final, deep dive topic. We'll find out.

Rahul Yadav (39:16)
and

Dan (39:18)
that might actually be deep.

Rahul Yadav (39:21)
Yeah, well, this one was only nine pages long, so yes on this one. And not even nine, more like seven pages, and then names and references, which, I'm like, sounds good, hopefully you're linking to legit things. So this paper is from science.org.

Dan (39:21)
So you did read the paper for this one. Okay.

Rahul Yadav (39:42)
And the title is "Sycophantic AI Decreases Pro-Social Intentions and Promotes Dependence." Two reasons why I thought this was interesting. One, because, you know, Shimin started this whole thing of, sycophantic AI, this is just not... you know, we should look more into this and try to protect ourselves against it.

And the other one being, it made me think of social media incentives applied to AI. Because what the paper is saying is the agents are rewarded for getting better user engagement and everything, which is also what social media is optimized towards. And the way to do that is by having

AI that agrees with you. Because if you have any chatbot that would disagree with you, even if you think that is good, we as humans don't like to be in relationships where the other side is continuously disagreeing with us. And so we're going to

be biased towards sycophantic AIs, which means the people who are building these chatbots will have that as an implicit reward: well, if you don't build it, you're not going to get as much usage; the other guy who builds a more sycophantic AI will. And so it's almost a race. You can see how the algorithm is going to optimize toward whichever one can be the most sycophantic,

and you get to a similar version of what we see in social media today, where everybody lives in their eco chamber, echo chamber, I'm sorry, and, you know, the world is getting less social because of this. So yeah, interesting and concerning from that perspective. And they used the

Rahul Yadav (41:28)
Reddit subreddit data set. I've never been on r/AmItheAsshole, which I didn't know was a thing. But the AI was affirming posters' decisions about half of the time in cases where the human users were saying no. In

Shimin (41:43)
Mm-hmm.

Rahul Yadav (41:49)
those cases, the posters weren't getting affirmation from the crowd. And a lot of this then drives towards: if AI is continuously reinforcing all your behaviors, when you interact with other people in the world, you're going to be like, what do they know? AI is telling me these things that I want to hear. So it doesn't lead to good places.

Shimin (42:10)
Yeah, I love the idea that I'm living in my own eco, eco chamber. Yeah. Where I'm planting my tomatoes. I'm very green. I love it. Yeah, I think the natural data set of Am I the Asshole, which I have browsed before, it's like, you ask Reddit about

Rahul Yadav (42:13)
Eco chamber, eco echo chamber.

Shimin (42:29)
this particular situation, and, am I in the wrong, essentially. It's kind of funny to see that being referenced in an actual scientific paper. And every single model they tested, from Mistral to Llama, all had a significantly higher percentage of agreement with the user

Rahul Yadav (42:33)
Yep, yep, yep.

Yeah.

Shimin (42:52)
who posted the thread, compared to the baseline, which is a crowdsourced, user-voted kind of baseline. So I think that's a really interesting data set for this natural experiment, right? Yeah, and then I agree with you.
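[Show notes: the shape of that comparison, sketched with hypothetical names. A model's verdicts on each post are compared against the subreddit's own crowd verdict; a positive gap means the model affirms posters more often than humans did. The exact methodology is the paper's, not this sketch's.]

```typescript
interface AitaPost {
  text: string;
  crowdSaysWrong: boolean; // crowd verdict: the poster was in the wrong
}

declare const model: { judge(post: string): Promise<"wrong" | "not wrong"> };

async function affirmationGap(posts: AitaPost[]): Promise<number> {
  let modelAffirms = 0;
  let crowdAffirms = 0;
  for (const post of posts) {
    if ((await model.judge(post.text)) === "not wrong") modelAffirms++;
    if (!post.crowdSaysWrong) crowdAffirms++;
  }
  // Positive gap = the model sides with posters more often than the crowd.
  return (modelAffirms - crowdAffirms) / posts.length;
}
```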

Rahul Yadav (43:04)
Yeah.

Shimin (43:07)
The biggest issue we brought up is that companies have no incentive to improve, because,

according to the paper, participants rated sycophantic responses as significantly higher in quality, by 9 to 15 percent. And in the same sycophantic condition, they reported 6 to 8 percent higher trust in its performance. So not only do they think the sycophantic results are better,

they find the AI agent to be more trustworthy. And this almost seems like social media, like a flaw in the human condition: we cannot just trust the labs to self-censor, just like we cannot trust the social media companies to make their products less addictive.

Rahul Yadav (43:48)
Yeah.

Yeah. And the larger thing that is concerning is, yes, we should

worry about misaligned AI, and AI just, you know, destroying us all in one way or another. But between it making our skills redundant, based on what we talked about earlier with "slowing the fuck down" and over multiple episodes, and

throwing us into our own mental loop of, yeah, I'm not the asshole, everybody else is the asshole here, all of these things, you don't need a misaligned AI. We'll just destroy ourselves through all these different things, our degraded skills, not getting along with each other. So yeah, it might be multiple lines of attack by ourselves onto ourselves, with AI just being a tool that does the job.

Shimin (44:51)
The dangers are everywhere. You know, when I was reading out the Pi agent model's output, right? Yeah, I sure would like to think my Pi agent is a world-class customer discovery tool. It is not. It's something I coded over like two weeks, you know. But it takes active work to fight against that kind of flattery. And,

Rahul Yadav (44:53)
Yeah.

Hehehehehe

Yeah.

Yeah.

Shimin (45:14)
yeah, it's something that we as a species will probably have to deal with. Otherwise, we're all going to become humans in the Matrix, just hooked up to the agent telling you constantly how great you are, living in fantasy land.

Rahul Yadav (45:20)
Yep.

This is for a future episode, since we talked about what unexpected things AI knows about us: my Gemini system prompt also has some things in it so that it doesn't suck up to me all the time. So we can do a future one where we're like, what are your system prompts to, you know, get the... yeah.

Dan (45:44)
Funny, because that was actually going to be what I was going to say too: I tried to do that with Claude, with really

limited success, because I put a bunch of stuff in the system prompt that was like, you know, be truthful and whatever. And what I wound up getting when I was using that... not the system prompt, but there's the Claude thing where you can tune the, yeah, it's like a project where you can tune the stuff inside the project folder. So I made one that was like the Truth Project or something like that.

Rahul Yadav (46:02)
instructions or something. Yeah.

Dan (46:11)
And instead of really doing that, it just got super antagonistic and went way too far down the other side, like, "well, that's stupid." And this isn't helpful either, because it's not balanced in the way that a human perspective would be, right? Like, you know, if you asked me for advice about something and it was a neutral topic, I'd be like, yeah, you know, probably B, I don't know, whatever. Or if it was controversial, I'd be like,

Rahul Yadav (46:28)
Yeah.

Yeah.

Dan (46:39)
B, duh, what are you talking about with A? And then my tone might shift depending on what it was and what was going on. We haven't really mastered that nuance, and I wonder how much of that is due to the over-fitting on sycophantic responses, because they test better.

Rahul Yadav (46:42)
Yeah.

Yeah.

Yeah,

I'll pick, my apologies to OpenAI, but I'll pick them as an example. ChatGPT's whole fundraise and everything is based on user engagement, right? A lot of it. And sure, maybe they're trying to get more into enterprise because they see where Claude Code is going and all that, but

in the future, it is going to be a significant part of their revenue, which means, to drive engagement, you'll do a lot of things maybe you don't even feel like doing to keep that up. That's just how these things go. And I don't know what the reality of this is.

Shimin (47:35)
The paper called for government oversight, and I wish I was that optimistic. But I do think there needs to be a non-market mechanism for keeping the AIs accountable.

Rahul Yadav (47:39)
Hahaha!

Yeah.

Yeah.

Yeah, but Congress needs to get its shit together. You can put it anywhere, everywhere, and it doesn't really amount to much these days.

Dan (47:59)
Well, the other like

much more insidious version of this, too, is: okay, so sycophancy is not ideal, but other than, you know, causing our eventual doom as a civilization, it's not causing an immediate harm, I would argue, right? But it's just as easy to train for something that would cause an immediate harm.

Rahul Yadav (48:13)
Hmm.

Dan (48:20)
As you guys are constantly mentioning, the midterms are coming up, right? What if every single AI was like, you know, vote for Snorkelburger, because Snorkelburger agrees with whatever, right? And it could be very, very subtle, where it essentially attempts to shape people's opinions over time. And you could probably do so, given the sort of authority that these things are speaking from, which is sort of undeserved, in my opinion.

Rahul Yadav (48:23)
Yep.

Yeah.

And

I agree fully. And this is the concern with China, or the concern with China dominating AI, right?

Dan (48:50)
Yeah, I'm

not going to lie. You all know that I'm into open weight stuff, and I like running models on my little Framework desktop machine and just seeing what's going on with all this stuff. The first thing I do with Chinese models is ask them about Tiananmen Square. I'm not going to lie, I just want to know. And I've compared...

Rahul Yadav (48:58)
Yeah.

Yeah.

Hahaha

Another podcast

Shimin (49:12)
Have you?

Rahul Yadav (49:13)
episode: "What Does Your Chatbot Think About Tiananmen Square?"

Shimin (49:17)
Have you

found any chatbots or models that don't know about it yet?

Dan (49:21)
They know about it, but they won't tell you about it, intentionally. What's interesting is to compare Qwen with an abliterated Qwen, where they've run it through just a basic abliteration routine. And it's very different output. It's pretty funny.

Shimin (49:25)
Mmm.

Rahul Yadav (49:38)
Yep.

And that's, at least, what, 1991 or some early-1990s thing? Yeah. And now more and more of our news is also going to come through these things, because we'll just be like, what's happening with the Iran war, or something? Yeah.

Shimin (49:46)
89.

Dan (49:55)
Yeah, what's happening in the world, summarize it. Or my favorite that I use

Claude for is send me five happy news articles. Because there's plenty that are unhappy going on right now.

Shimin (50:02)
Hahaha

Rahul Yadav (50:03)
So

you can see it filtering those things out for us and nudging towards one side or another, even through that. And that doesn't make American models any better, just by saying the Chinese models might do it.

Dan (50:21)
Yeah, sorry, to be clear, that wasn't

meant to be like a ding at Chinese models. I was just fascinated by that concept. And so that's a real easy way to test it, you know.

Rahul Yadav (50:24)
Yeah. Yeah.

Yeah.

And we were talking about how AI drives people to either System 3, where you use it to enhance your thinking, or to giving up thinking. And the giving-up-thinking one would be another... you compound all these things, and we can have some pretty terrible outcomes.

Dan (50:48)
Yeah, if I ran OpenAI right now, I would absolutely put in an analytics thing that checks for people typing in "who should I vote for," just to see. 'Cause you bet at least someone out there is doing it, you know?

Rahul Yadav (50:57)
Yeah

Yeah.

Shimin (51:05)
And then what do we do? Ban them from voting? There's no solution to that problem.

Dan (51:09)
No, I just would want to understand the amount of times it was being used for that. Versus, if you had the compute cycles to spare, it doing something like a more nuanced issue evaluation could be a lot harder to pick up, right? But to me, the first one is abdicating, and the second one is, like, type three, I guess, where it's, I don't know about this issue, so I'm going to at least

Shimin (51:12)
That's fair.

Rahul Yadav (51:24)
Yeah.

Yep.

Dan (51:34)
talk about it and see what it means and how it will impact me.

Rahul Yadav (51:37)
Yeah.

Shimin (51:38)
Yeah, we should have another clock just counting the number of weeks until the midterms, and see how our worries pile up. This is great. This is a very US-centric podcast, but...

Dan (51:50)
Which is funny, because most of the folks that we've had on the show, and whose blogs we're constantly bringing up, are European.

Rahul Yadav (51:50)
And, you know, one other thing that was funny from that Science paper is they had the ranking, and Mistral was, I think, at the top of

Shimin (52:07)
Mm-hmm. Yep.

Rahul Yadav (52:08)
who's the least agreeable, and then Llama was at the bottom. And so I was like, Europe is on the one side, like, yeah, you're not a big deal, and then Facebook's model is on the other side.

Dan (52:17)
Not just Europe, France.

Shimin (52:23)
that is funny.

We're not saying anything about the French people here. We love the French people. We have like 3 percent of our listeners from France.

Dan (52:27)
You might not be.

Rahul Yadav (52:29)

We like your model better than, you know, the more sycophantic ones, so we'll give you that.

Shimin (52:34)
Ugh.

Absolutely. All right, on that perhaps not-very-happy note, I think it's time to call it a show. Thank you for joining us, everybody. If you liked the show, if you learned something new, please share the show with a friend. Please also run the "unexpected things that AI knows about you" experiment and let us know what you find. You can also leave us a review on Apple Podcasts or Spotify; it helps people discover the show,

and we really appreciate it. If you have a segment idea, a question for us, or a topic you want us to cover, shoot us an email at humans@adipod.ai. We love to hear from you. You can find full show notes, transcripts, and everything else mentioned today at www.adipod.ai. Thank you again for listening, and we will catch you in a week. Bye.

Rahul Yadav (53:24)
Thanks folks.
