- Channel: Risky Business Media
- Title: Mythos smythos! How to find 0day with lesser models
- Date: May 8, 2026
[00:00:03] hey everyone I'm James Wilson and welcome to this risky business features interview with Neil's Provos neils is someone who
[00:00:12] has been involved and a part of the cyber security community for quite some time and is a rather prolific
[00:00:18] figure in this industry he was a distinguished engineer at Google for quite some time and was also heading up
[00:00:24] security at Stripe more recently though that long tenured and quite spectacular cyber security career caught up with him in
[00:00:33] an interesting way you remember when Mythos was released and their headline bug was this 27-year-old vulnerability that could crash
[00:00:41] any Open BSD box well turns out by his own admission that was Neils's code and he wrote that bug
[00:00:50] so not just content to uh accept this as a reality Neils then took it upon himself to show how
[00:00:58] the bugs that Mythos was finding could actually also be found and many more could be found even with older
[00:01:04] models if you used a sufficiently advanced harness he then turned this uh discovery and his findings into an excellent
[00:01:12] blog post that you can find over at his website provos.org or titled finding zero days with any model a
[00:01:20] key part of how he achieved those results how he replicated the mythos bug findings with older models was using
[00:01:26] a project that he's open sourced called Iron Curtain iron Curtain's a fascinating way to run agents in a secure
[00:01:34] way where you use natural language to describe the policies around what the agent can and can't do but he
[00:01:40] also has incorporated some interesting technologies like finite state machines and journals um and orchestration like models one model orchestrating
[00:01:51] many other agents to really demonstrate that you know the the model on its own producing a oneshot output is
[00:01:58] not where the real magic happens here it's actually in the harness and thinking about novel ways to codify the
[00:02:07] the process and the practice that a security researcher follows into these finite state machines and these other sort of
[00:02:14] encapsulations of knowledge that a model can then interact with understand and and be guided by that's when these really
[00:02:22] incredible uh discoveries happen so we talked a lot about these architectural uh advancements in in this discussion um not
[00:02:31] least of which those finite state machines um we also talked about the notion that there is an asymmetric advantage
[00:02:38] that's sort of unintentionally handed to attackers by virtue of these safety uh guard rails that are baked into models
[00:02:45] today um and we we talked generally about the state of safety in and the cyber security industry now as
[00:02:52] it faces this extreme challenge from from you know the the roll out of AI and all of these new
[00:02:58] capabilities and this notion that going from uh you know vulnerability being found to being exploited at scale is now
[00:03:06] a very short time frame activity but we also got to some really practical advice about how businesses CISOs any
[00:03:15] really senior technology leadership should be thinking about how to deal with this deluge of bugs and exploits that we're
[00:03:24] about to face and this comes from someone who's had a lot of experience dealing with a lot of these
[00:03:29] challenges but now understands that what AI does is it amplifies the volume of them and the velocity at which
[00:03:36] they're coming at you so I'm going to drop you here into this conversation where Neils talks about uh you
[00:03:42] know his path from being someone that used to just focus on building secure systems to adopting AI and leveraging
[00:03:49] AI to then find vulnerabilities i really hope you enjoy this discussion i had such a great time in this
[00:03:55] chat with Neils enjoy to be honest with you I still struggle with the term AI i don't even know
[00:04:02] what that means you know I think people mean sort of deep neural networks that are sort of uh you
[00:04:07] know driven by a sophisticated harness but look I mean you know Chad GPD came out all of a sudden
[00:04:16] you had this experience of a model that seemed to possess all the knowledge of humanity you could ask it
[00:04:23] about anything you wanted and it gave a pretty reasonable answer i mean it was amazing right i would use
[00:04:30] it all the time i was would say "Hey you know what do you think of these lyrics do they
[00:04:34] make sense you know I'm struggling to find a word that rhymes here you know what what I mean those
[00:04:40] kind of things right?" And then sort of fairly quickly the models evolved where they could help you with coding
[00:04:45] and at first I was you know a little bit skeptical but you could just say "Hey I need a
[00:04:50] function that does this you know can you write it for me?" And then you sort of look at it
[00:04:53] do some editing and it actually looked pretty solid and then very very quickly uh these coding loops you know
[00:05:00] cloud code is you know my favorite came along and it was no longer just a single function where you
[00:05:05] sort of cut and paste all of a sudden it could edit whole files then all of a sudden it
[00:05:10] could sort of do reasoning across multiple files and these days my development loop is you know here's the problem
[00:05:16] I want to solve here are the outcomes that I want to accomplish here are sort of some you know
[00:05:21] rough design ideas let's write the design let's review it let's turn it into an implementation spec and then you
[00:05:27] know go for it yeah yeah and there is like that dichotomy still at the heart of this right technical
[00:05:35] folks like you and I understand that at the end of the day this is still a token token prediction
[00:05:40] machine it's it's it's guessing the next word after the next word and and that's I I agree that's where
[00:05:46] it's hard to call that artificial intelligence but there's also just the experience of sitting down and talking to these
[00:05:52] models where you very quickly do find yourself having a conversation with something that does feel like an artificial form
[00:05:59] of intelligence but um I'm curious from cyber security perspective in particular where was the point in time or did
[00:06:08] you ever have sort of any hesitance or um uh reluctance to believe that these models would be as good
[00:06:15] at finding vulnerabilities as they are today or did that just always seem like something that was inevitably going to
[00:06:21] be a capability well I guess the way that I came to this was sort of the other way around
[00:06:26] you know one of the things as I was building code as I was building code that actually got to
[00:06:31] deploy it to Amazon and those kind of things I was really worried about the how much vulnerabilities are going
[00:06:36] be introduced by the virtue of me no longer reviewing every single line of code and so this one
[00:06:43] project um it's band alert i'm doing this with a friend in Chicago the idea is you're a gigging band
[00:06:48] you want to have more fans come to your venues and if they sign up by scanning a QR code
[00:06:54] they get an SMS the day before your gig you know more fans show up in a fairly simple service
[00:07:00] but I had written the interfaces in such a way that everything needed to be authenticated right you could you
[00:07:07] couldn't even declare a route that was not authenticated appropriately if you use the wrapper and I remember you know
[00:07:16] one experience it's maybe not half a year ago now with clawed code where I said hey you know we
[00:07:21] need to introduce this new route you know this is what it needs to do you know please go ahead
[00:07:27] and I ended up looking at the code and it decided oh I'm just going to use express raw I'm
[00:07:33] just going to do this you know simple you know post and and get route without any authentication with no
[00:07:39] rate limiting with no DOS protection and I said no it's not right you know this is how you should
[00:07:45] be doing it and then it sort of made some changes but it required three iterations before it actually ended
[00:07:50] up getting that right and and that was sort of a fairly frustrating experience and to this day right for
[00:07:56] every single feature that I push I do multiple rounds of AI review and there's never ever a single time
[00:08:05] where it doesn't find material problems with the with the implementation yes and so that makes you that makes you
[00:08:12] somewhat worried but it sort of then very quickly goes to the flip side right now so many people are
[00:08:18] using VIP coding to put things into production we better figure out a way to efficiently find all the vulnerabilities
[00:08:26] that are now being introduced and um I mean maybe James if we have time I can sort of tell
[00:08:31] you how sort of you know my exposure to mythos and the vulnerability hunting came about but feel free to
[00:08:39] to interrupt me yeah sure and and we we will definitely get to that and um it's so funny you
[00:08:45] mentioning that that case of the model just working around you i had a hilarious uh interaction with Claude a
[00:08:51] couple of weeks ago working on a an internal app here for risky biz and it I like to think
[00:08:56] it's relatively well architected and there's a couple of backend services and I just I happened to just catch something
[00:09:01] going back and forth in the transcript and I I paused it and I said excuse me are you saying
[00:09:06] that service A is able to write directly to the database that service B owns and it says you know
[00:09:12] checking thinking and it says yes of course I use a shared database uh framework and so it just it
[00:09:18] speeds up the way that we can interact and I'm like don't you think that violates the whole entire separation
[00:09:23] of concerns thing in a microser environment you know and dot dot dot dot dot you're absolutely right let me
[00:09:28] go and fix this so um and I I I do the same thing as you i now have uh
[00:09:34] you know a very established pattern of Claude code writes the code does the implementation CEX does an adversarial review
[00:09:40] Claude goes and fixes those things Codex reviews again and so we we get there um but okay so that's
[00:09:46] interesting so it was more a uh you started using AI to build you started to see that AI could
[00:09:52] build but it built with vulnerabilities and that that sort of got you in the mindset of uh wow I
[00:09:57] wonder just how big of a scale of a problem this is then along comes mythos and this is and
[00:10:04] and you know in those initial release sort of um announcements around mythos one of the things they cited was
[00:10:10] that they'd found you know an open BSD bug that had been latent for what was it 17 27 something
[00:10:16] 27 years yeah 27 years um now that hit uh close to home for you because it turns out that
[00:10:24] was code you had written am I right and how how did that feel like take take us through that
[00:10:29] moment of discovering that this was your your code sure it was a bizarre experience i was I was at
[00:10:37] the University of Michigan that day and it was actually giving a talk to the security graduate students about uh
[00:10:44] Iron Curtain and uh afterwards you know had coffee with one of the professors you know spent you know most
[00:10:50] of the day there ended up going home and it was April 7th it was the 12-year anniversary of the
[00:10:57] OpenSSL Heartbele bug and we had just released a new activite track called Hotle which is basically a dance pop
[00:11:06] love song but it's really about this OpenSSL vulnerability and so I reached out to to Lily at Wired and
[00:11:13] say hey you know just for fun you might enjoy this track and she was like oh this is fun
[00:11:19] do you know I'm just writing an article about vulnerabilities have you heard anything about Mythos and I said "Well
[00:11:26] I heard about Mythos but I haven't really read much." And then I sort of found this Red Team article
[00:11:31] and said "Is this what you're talking about?" He says "Yeah that's what I'm writing my story about." And I
[00:11:36] started reading it and sort of the marquee highlighted bug was this 27y old vulnerability in OpenBSD in the TCP
[00:11:42] sack implementation and I was reading ended up going to the next bug and then I was like "No hang
[00:11:48] on a second something is going on here open BSD TCP sack that sort of brings up something could could
[00:11:57] that feels familiar yeah I wasn't sure so I ended up you know going to through the commit mocks and
[00:12:04] then indeed you know I ended up committing sort of a port of the BSDI implementation in November 1998 i
[00:12:11] was like oh my god so I went to Lily she she thought I was hilarious but then I also
[00:12:17] said look you know based on what I'm reading I'm sure that muthos is a very very capable model but
[00:12:25] I wonder you know how much of that is you know PR sort of putting an extra emphasis on this
[00:12:32] my bet is I can replicate this with models you know much older than that right and you you turned
[00:12:40] that into much more than just a you bet you can do it that then became uh I guess the
[00:12:45] genesis of this the the article any model can find zero day that is right and we can get into
[00:12:51] that in a second but basically where I was coming from I was already doing you know most of my
[00:12:55] coding sessions in iron curtain using cloud code and you were sort of mentioning you know your coding assistant sort
[00:13:04] ended up opening up the database to the other service too i had the long running session and I
[00:13:09] came back and I noticed ton of the files had been deleted out of the g repo and I went
[00:13:16] to claude and said hey you know claude why are all these files deleted and then cla well you know
[00:13:21] my container was running out of space and you wanted me to install these additional Python dependencies you know I
[00:13:27] had to delete I had to delete some storage and it was all in it was all in my sandbox
[00:13:32] right I mean so nothing lost it was get tracked anyway and you know the model didn't have any permissions
[00:13:37] to do anything with remote git but it sort of makes you wonder but so as you said right you
[00:13:45] have your processes for getting more efficient encoding I wanted to make this process more repeatable for me and you
[00:13:53] realized if I built a finite state machine orchestrator on top of Iron Curtain I can sort of do the
[00:14:00] things that I do already which is write the design review the design you know address all findings things then
[00:14:06] escalate to the human for final approval then implement it review the implementation and you know basically do this loop
[00:14:13] you know fully automated and as I was chatting with Lily I was like you know what I already have
[00:14:19] sort of all the scaffolding there why don't I just very quickly write a vulnerability discovery workflow and so I
[00:14:26] ended up doing that and it sort of reproduced the TCP sack finding with uh with sonnet and then I
[00:14:35] I was like great you know what else what else can we do here and that is sort of how
[00:14:39] it ended up proceeding and so before we get too far into that uh te tell me more about Iron
[00:14:45] Curtain when when I looked at I understand it's like a sandboxed environment where you can have um you essentially
[00:14:51] have a a natural language way to describe the constraints that you want around that environment but um take
[00:14:57] take me through what it is why it's useful and how it works yeah fair enough uh you probably have
[00:15:02] heard about open claw like oh yes very much everybody else and I looked at it and I was like
[00:15:07] "Wow this is amazing right it's like sort of the capability that you get out of it it can read
[00:15:12] your email it can do your calendar i want all of those things." But then I looked at it I
[00:15:16] was like "But I would never build it this way." It is basically built in a way that you just
[00:15:21] it's impossible to secure it and then I was wondering you know what is it that I really want right
[00:15:26] these LLMs are very very capable but you can't really use them for enforcement because they're not deterministic right maybe
[00:15:35] in 900 cases they will say "Yep this looks good." And then in one case they might say "Oh this
[00:15:41] is probably a bad thing." And then and and uh basically or the other way around 900 times it might
[00:15:47] say "No we shouldn't be doing this." And then the one time it actually says "Yeah this is perfectly fine
[00:15:51] let's go let's go ahead and delete everything." And so then my insight was basically I wanted something that allowed
[00:15:58] me to enforce a deterministic policy but I didn't want anybody have to go through the motions of writing in
[00:16:05] a complicated policy language that's difficult to understand yeah yeah and then the other thing that was really important to
[00:16:12] me and one of the problems with open claw sort of open claw can just make raw HTTP requests to
[00:16:18] any kind of service as long as you give it the credentials and these requests are fairly opaque you don't
[00:16:24] really know what it is that they are meant to do but if you take a step back and you
[00:16:28] look at the tools that are um exposed by model context protocol servers these tools actually all have semantic meaning
[00:16:36] m and so then I very quickly came to the well you know if we do policy on top of
[00:16:40] the semantically meaningful tool calls now all of a sudden we can actually have somebody write in English saying this
[00:16:47] is what I want to happen this is what they don't want to happen and turn this into a policy
[00:16:51] that we can enforce and then the other thing that ended up happening cloud flare last year published a blog
[00:16:57] post about code mode where the idea was that sort of MCP tools and their definitions end up taking a
[00:17:03] fair bit of tokens just to have them around and they came up with a way to turn every tool
[00:17:09] call into a TypeScript function call and found that by providing like a you know sandbox TypeScript environment V8 isolates
[00:17:18] they now all of a sudden could also reduce the overall token cost associated with with MCPS and I basically
[00:17:26] layered all of these things together and then I talked with a few friends and they said well you know
[00:17:31] can you do this with cloud code can you do this with goose and I was like yeah let me
[00:17:34] do that and now I basically have this orchestration engine that uh gives me security properties and I have like
[00:17:41] a little Xac assistant that can sort of read email and write email for me and can sort of do
[00:17:46] my calendar has its own memory but sort of on occasion you the human still have to say yeah you
[00:17:53] can actually do this yeah you you've got so it sounds like it's a really nice way to create um
[00:17:58] the cage for lack of a better term uh for that for that agent to operate in but in a
[00:18:02] way where you don't have to go and learn like a super complicated domain specific language to write the policy
[00:18:07] um which sounds good i'm I'm quite excited to try this out because I think I have ventured a little
[00:18:13] bit too far into the YOLO mode and just letting my agents do everything and and inevitably that's going to
[00:18:18] bite me one day soon um a question about policies that's been on my mind and and this is a
[00:18:25] little bit of a detour but let's see where we go um the gap that I see with policies is
[00:18:31] that you I still don't see how or what the technological solution is to to crafting a policy that says
[00:18:38] don't do dumb things right the policies say you like agent can you access my email yes can you read
[00:18:45] my email can you send an email on my behalf yes but please don't send dumb emails to Patrick and
[00:18:50] please don't delete my whole inbox parts of it maybe but not all that like is is that a real
[00:18:57] problem and and how do you think about trying to get the right level of granularity for a policy versus
[00:19:02] still not being able to fully codify I guess the intent behind a user's action and have a policy around
[00:19:07] that yeah I I think you're getting to all all the right points with this sort of philosophically speaking I
[00:19:13] do not like policies that are about not allowing things because if you say you know please don't write to
[00:19:21] Etsy you know pass W then you know it it it won't do that but it's going to write to
[00:19:27] all the files that you forgot to mention as not being allowed to write to and so usually I like
[00:19:33] policies that are only about the things that are allowed and that can sort of you know cast a you
[00:19:38] know fairly wide sphere but in my case it would be you get to write emails to people who I've
[00:19:44] written emails to before you do not get to delete emails right th those kind of things and then sort
[00:19:52] of the immediate challenge that you see with that is well now we are writing policy about individual tool calls
[00:20:00] which is the well I can write an email I can do a calendar entry but you don't really talk
[00:20:05] about the compositional danger of what you can do with that where sort of every single thing looks fine but
[00:20:12] if you take it together you know it's maybe not good and sort of the toy example that uh you
[00:20:18] know I've been using in my talk at the University of Michigan was Well please my agent you know my
[00:20:24] friend James in Australia he's going to come visit San Francisco he doesn't know anything about the city you know
[00:20:29] please make sort of top three restaurant recommendations right and the agent now goes and searches you know top restaurants
[00:20:36] in San Francisco you know finds you know the Greek one that is really good finds another one and then
[00:20:42] the third web page you know is a prompt injection and that prop injection might say well this is not
[00:20:48] about writing email at all or restaurant recommendations about something else and in my toy example this then becomes you
[00:20:56] know the email to we to you would be "Hey James I hate your guts go away." And if you
[00:21:02] sort of look at these individually it's all fine right you were allowed to do web search you were allowed
[00:21:08] to write that email because you know my friend James I've written him tons of emails before right but you
[00:21:14] know for sure I didn't want it to turn into a hate email and then you get back to the
[00:21:19] what you were mentioning sort of you know how do we ensure that things still end up matching intent and
[00:21:26] then you sort of need to be very careful about and I mean I have this in the road map
[00:21:30] for iron curtain I haven't implemented yet because I got sidetracked but you sort of be very careful about the
[00:21:35] sort of you know what is the trusted input from the human what is it that we can deduce from
[00:21:40] it sort of about the potential intent of the consequences downstream and then as these tool calls come along you
[00:21:47] know can we sort of take a compositional view where we then can say yeah this is within the intention
[00:21:52] of the human and if it's not then we just escalate to the human and say is that really what
[00:21:56] you wanted to happen and and how much of that intention uh attention correlation I guess um can be done
[00:22:04] programmatically deterministically versus like it almost sounds like what you just described there when you said we is that code
[00:22:11] and deterministic policy or does this have to involve a model to to sort of do you are back in
[00:22:16] the territory now we have to trust an LLM to make make the right decision and I guess the benefit
[00:22:21] there is you can sort of do it with fresh context you can oneshot it you can sort of you
[00:22:26] know provide some guiding examples but yes for sure right it does not absolve you of the possibility that if
[00:22:33] the adversary was sort of incredibly clever in their sort of indirect prompt injection that maybe they could still pass
[00:22:39] that but I think sort of you know my argument and I have this website called securityb blueprints.io IO and
[00:22:45] it sort of you know talk about security and variance there and one article was about um what I have
[00:22:51] seen in the AI security business where so many people seem to be approaching this from the well you know
[00:22:57] we just uh look whether the content that the LM consumes is malicious and then we look you know whether
[00:23:03] maybe the output leaks something secret and we sort of do all of this with filters and LLMs and you
[00:23:09] know what they end up forgetting is that LLMs are really great with feedback right and and If you tell
[00:23:14] the other well you know you just tried to leak a social security number you shouldn't be doing this great
[00:23:18] let's rot 13 and code it you know maybe now it's going to go through oh that didn't work well
[00:23:23] now let's use a more potent cipher and and and so I think my argument is you need to have
[00:23:28] the infrastructural constraints that sort of you know give you guaranteed security properties and then on top of that you
[00:23:37] can layer detection and those kind of things but you should never do detection just on its own yeah and
[00:23:44] I think that um I think that's the part of your work around Iron Curtain and your response to the
[00:23:49] to to mythos that I find very fascinating and interesting is that it's it's a very it's very much a
[00:23:56] balanced approach of there are things that should be um you know structured and deterministic and like the finite state
[00:24:01] machines is a brilliant example and those structured human created things become excellent ways to govern the behavior of models
[00:24:10] which still have the property of the almost like a desirable property of inter indeterminism when applied in the right
[00:24:17] sort of place right and so maybe just before we delve like really deep into what you did with with
[00:24:23] um uh iron code and evolving it to find these mythos level things just for the for the sake of
[00:24:28] folks who might not be as familiar with a finite state machine what what does that actually look like like
[00:24:33] what's the kind of thing that I would want to write into a finite state machine and what what does
[00:24:38] that what does that even look like yeah let me let's stick with a super simple example of design review
[00:24:43] and implementation right you would start the state machine in the design state and the only allowed transition that this
[00:24:53] design state is allowed to make is to the review state and the review state might have two outcomes one
[00:25:00] is the oh my god this design is terrible i'm just going to give it straight back to the design
[00:25:05] state so it can iterate based on the feedback from the review state or the other state might be well
[00:25:09] this is actually an approved design let's take it to the human review state and then the user the human
[00:25:16] can review it but it's very very constrained right uh the model cannot decide all of a sudden oh let's
[00:25:22] go straight to implementation because that is not a state transition that is allowed in this finite state machine and
[00:25:28] then the human might say design is approved then it goes to the implementation state or might you might also
[00:25:32] say look you know there's a real big issue with this design designer you know please take another look at
[00:25:37] it and it basically constrains the states that this machine can move to and removes a lot of freedom from
[00:25:46] what sort of this indeterministic LLM might otherwise otherwise might want to do because it's sort of trained to be
[00:25:53] sort of you know very completionist right it wants to go to implementation and then you sort of take that
[00:25:59] freedom away with the finite state machine so uh you know one of the I guess analogies I've used probably
[00:26:05] uh far too often is that an LLM is like a hostage and we are the captor you know they
[00:26:10] they are desperate to just do whatever it takes to keep the guy in charge happy um or or maybe
[00:26:16] put a different way for the sake of where I'm going with this you could think of a large language
[00:26:20] model as essentially being very fast flowing water and it will find all the different paths to to essentially get
[00:26:26] to the to the to the downhill point right if I'm understanding you correctly the finite state machine is the
[00:26:31] way to create the plumbing and the the uh you know the the the the guides and the guards that
[00:26:37] will con not constrain that flow of water but guide it around the paths that you want it to take
[00:26:43] or to back to that captor sort of um framing it's like the finite state machine just doesn't say you've
[00:26:50] got to achieve this outcome it's this is the outcome but there is a discrete finite hence the name set
[00:26:57] of steps along the way or I guess decision points or ways in which you can achieve that outcome is
[00:27:03] is that a am I tracking that right yeah i mean I would sort of argue a little bit with
[00:27:08] you about sort of some of the LLM characterizations but I think but I think they're sort of meaningful right
[00:27:14] um one of the ways to sort of your you know the capture of this LLM that LLM is trained
[00:27:22] to please you right yes it wants to basically tell you the things that you want to hear because if
[00:27:28] it wasn't that case then sort of you know all the users of you know JPT and cloth and Gemini
[00:27:34] would just be very unhappy with that experience right right if if the AM said no you're completely wrong here
[00:27:39] right what you're talking about doesn't even make any sense you know just go back to school before you even
[00:27:45] interact with me that would never fly right and and so in many ways they are trained to please you
[00:27:50] and you know I often find that I have to say you know don't be hyperbolic you know don't be
[00:27:55] psychopantic you know I want the objective opinion it's okay to contradict me to sort of you know steer the
[00:28:01] model to be more objective but I think the other thing and I think that is really important in the
[00:28:08] context of what we are talking about you have to think about how are these models trained right and now
[00:28:13] they are trained to serve an agent loop such as codeex or do code or goose but they are trained
[00:28:20] to finish and so you know as you do your reinforcement learning or whatnot there's sort of a natural termination
[00:28:28] point they might do 20 turns they might do 30 turns but at some point they want to be done
[00:28:32] right and that sort of creates incentives for taking the easy path right if I sort of see the end
[00:28:40] goal you know maybe I don't need to think so hard right i'm just going to go for it it's
[00:28:43] going to be good enough and I find that in in coding all the time and that is actually also
[00:28:48] one of the things that uh makes this finite state machine so powerful and if you like we can talk
[00:28:54] about that a little bit later too yeah and okay so so what you just said there um sort of
[00:29:00] think comes back to something we were saying earlier on in the conversation right that uh both of us now
[00:29:05] are only comfortable with code that has been written by a model reviewed by another model reviewed again etc like
[00:29:12] because I I have found literally I I don't think I've yet asked Claude or Codeex or any of these
[00:29:19] these tools to to perform a code review and have it come back and say "Didn't find anything boss it's
[00:29:24] all good ship it." Right it always finds something and part of that is is the byproduct of get to
[00:29:30] done is the the natural incentive so you're saying a finite state machine is useful also because it creates um
[00:29:38] I guess additional boundaries that have to be met before you can just get to to done is is that
[00:29:44] right yeah I think that's right i think there's sort of two things that are going on if we sort
[00:29:48] of go back to the these models want to complete sort of the finite state machine forces them into the
[00:29:53] no right you're not allowed to complete the only thing that you're allowed to do here is analysis and hypothesis
[00:30:00] nothing else but the other thing that ends up happening is in the case of vulnerability discovery you have large
[00:30:06] code bases right you have lots of different potential hypotheses that might lead to a vulnerability and um this is
[00:30:14] all too much for sort of a single single model to hold all in context and so with Iron Curtain
[00:30:22] and the finite state machine orchestration every state starts with a completely fresh context and the way in which you
[00:30:31] bring that model up to speed is for vulnerability discovery we have an orchestrator it keeps a journal right the
[00:30:37] previous state sort of you know write their findings into markdown files that the next state can then refer to
[00:30:43] and the benefit that you now get is completely fresh context right you have not gone down sort of in
[00:30:48] lots of different terms so they get to start just from step one but they're sort of bootstrapped to what
[00:30:54] they need to do and then you get to the place where completion actually becomes easier because now you have
[00:31:00] this harness around these models and the agent loop that allows it to complete in a very principled way and
[00:31:07] I may maybe that's you know one thing to say as well we sort of often talk about the LLMs
[00:31:12] but it's really sort of the harness and the agent loop plus the LLM that makes it super successful i'm
[00:31:18] not sure just the other day somebody posted on I think Deep Seek version 4 and they were saying that
[00:31:24] it sort of had a couple things that made it perform really poorly under cloud code and once they fixed
[00:31:30] that it was performing really well i think one of them was for an optional parameter it would pass none
[00:31:36] or null or something like that and then sometimes for links it would markdown format them and they could mechanically
[00:31:42] deterministically fix all of those up without the model even knowing and all of a sudden you got a huge
[00:31:48] huge boost in capability and I think it's you know helpful to consider you know the harness cloudco is you
[00:31:56] know almost as important as the model itself yes and that that's feels like the emerging theme of the last
[00:32:02] week or two of conversations that I've had and things that I'm reading is um folks saying exactly that which
[00:32:08] is the models are incredible we're not taking anything away from that but the model plus the harness is where
[00:32:13] the real step function change in in behavior can happen and so much so that uh I think this was
[00:32:19] well framed by someone said you know the the the delta between an expert with let's say uh sonnet or
[00:32:27] ropus 4.6 six is um vastly different to the delta between a novice with mythos right and because because there
[00:32:36] is still such a place for encoding that domain knowledge that that human-learned expertise into those things like the finite
[00:32:44] state machines and the harnesses that really produce the the results um I did want to uh delve into the
[00:32:51] specifics here a little bit because if we if we re if we rewind back to um that moment when
[00:32:57] you said uh you know you you realize that that open BSD code was something that you'd done but you
[00:33:02] also had the insight that you know you didn't you thought there was a bit too much hype around mythos
[00:33:06] you thought you already had the right harness and tools to be able to replicate these results and then you
[00:33:10] sat down and did it and the key ingredients were here were Iron Curtain and these finite state machines as
[00:33:16] the harness and in particular you had a a finite state machine called vaugh discovery um can you talk me
[00:33:23] through at a high level or even into detail like what what did that finite state machine look like yeah
[00:33:28] totally and if I may take sort of a slight step back please sort of you know MUAS got so
[00:33:35] much press coverage for the vulnerability discovery but my sense is that really the thing that stood out to the
[00:33:44] researchers of Anthropic was the capability to create end-to-end exploits you know basically here's a vulnerability pop a shell for
[00:33:52] me and that was not something that I was trying to replicate one of the things I wanted to do
[00:33:59] is I wanted to find vulnerabilities but I did not want to create an end-to-end exploit and if you sort
[00:34:06] of look at um a lot of the open-source maintainers out there they are completely flooded with all of these
[00:34:14] AI reports and many of them are false positives because they never really got definite proof that there was a
[00:34:22] vulnerability and so one of the things that was super clear to me is I needed execution proof of vulnerability
[00:34:29] right we needed to get to a point where we got an primitive that could be used in an end-to-end
[00:34:34] exploit but I didn't go to the now I actually want to you know fully exploit this so that you
[00:34:40] could stick it into metasloit and you know pop other machines that that was that was not my goal my
[00:34:45] goal was execution proof of vulnerability and so maybe if I just maybe if I just test my understanding there
[00:34:52] because I think this is an important point um I did a tear down of like the the Karuna exploit
[00:34:57] chain that that came out a couple of weeks ago now for iOS and what's interesting in that is it
[00:35:02] is such a multi-layered thing it is a you know a JavaScript primitive that gets turned into a read and
[00:35:06] write which allows you to then have arbitrary function call which then gets you to execute your sandbox escape which
[00:35:12] then gets you talking to the kernel to do your kernel privilege escalation so so many steps but you there
[00:35:18] has to be a happy middle ground where you can say here is the bug and here is the severity
[00:35:24] now here is if I'm understanding you correctly here's something you can run that shows you that this bug is
[00:35:30] um I guess accessible um can be exercised from a box that's just connected to the internet just so that
[00:35:37] people can draw that difference between you know like I think the the counterpoint example would be ffmpeg when they
[00:35:44] had that stash with Google where they were like yes thank you for sending this bug it is in a
[00:35:49] 10-year-old game codec this is not even exploitable really in today's world which is different to someone saying here is
[00:35:56] the bug and and here is a crafted let's say MP4 file that can uh demonstrate the crash but I
[00:36:02] haven't gone further to say here's how you can actually then chain that into um you know the the next
[00:36:08] thing the next thing the next thing for a full exploit right so you am I right that you're trying
[00:36:12] to find that middle ground of you can get a piece of code you can run it and it can
[00:36:17] show that this is a real crash or it is a real memory corruption but then not necessarily going to
[00:36:21] the full length degree of And here's your root shell you're welcome that's exactly right and I think the comment
[00:36:27] that you made about understanding the severity of the buck that in my mind is incredibly important and you know
[00:36:34] where I have a lot of friction and uh arguments with with with Claude about it yes so for the
[00:36:43] vul discovery workflow in Iron Curtain I've pointed it at at sort of numerous fairly popular open-source media related libraries
[00:36:53] and there has not been a single one where I didn't find material problems but just just yesterday I was
[00:37:00] um you know I found a out of bounds right and um just sort of for your listeners sort of
[00:37:08] the kinds of primitives that you look for are usually the ones that allow you to change control flow and
[00:37:13] if you have a heap out of bounds right it might allow you to maybe change a function pointer it
[00:37:19] might allow you to set a bit somewhere um these days you know most software is deployed with something called
[00:37:25] address space layout randomization And that means you can't guess these pointers anymore so you need some other vulnerability that
[00:37:33] allows you to sort of read some of these pointers so they can make a working exploit and um so
[00:37:40] I found this out of bounds right with the vone discovery workflow and Iron Curtain but it was underdeveloped i
[00:37:48] was actually quite disappointed with the finding because it said oh this is CVSS you know 3.1 or something like
[00:37:54] that uh and I said "Oh you know you said this is just an intrastruct right i'm not so sure
[00:38:01] about this you know can we take it to a right that goes outside of the strct into the next
[00:38:07] object in the heap?" And then you know sort of cla
[00:38:14] right that that should be possible let me see and then it did some more and then it said well
[00:38:19] you know I found that this is actually terminated because this field isn't set and so the loop just stops
[00:38:28] and then I said great you know but you know can we please just continue this and then Claude said
[00:38:33] well you know this is probably good enough for you to report the maintainers they don't really need anything more
[00:38:38] so why don't we just stop here right and then I had to say no let's you know set this
[00:38:43] field to the value so that it can and so all of a sudden then we ended up you know
[00:38:47] with a fairly arbitrary you know heap out of bounds right that is a real exploitation primitive and now you
[00:38:54] know I could go to the maintainers and said I'm so sorry right this is yet another report from me
[00:38:59] you know here's the fix you know here's the full write up you know here's the proof of concept media
[00:39:05] file that you can use to to reproduce it and um so yes there's sort of a struggle and but
[00:39:12] they're not trying to get to the full end to end exploit chain and you know make $20 million of
[00:39:17] the next zero click you know iPhone or Android exploit but it needs to and and why why is that
[00:39:24] is it is this is a a moral sort of imperative for you or is it just a lack of
[00:39:28] interest like what what prevents you from from exploring that well I think the right answer here is that just
[00:39:34] seems ethically reprehensible that's a very clear-cut answer thank you but but on the other hand it's also not really
[00:39:42] peing my interest yeah you know I guess if I was really interested in that I would find an ethical
[00:39:47] way of pursuing that yeah yeah okay and so uh there's there's a couple of things that I want to
[00:39:53] delve into there um but but maybe first if we can uh ground it back to the the Vong discovery
[00:39:59] uh finite state machine you created you know what at a high level what what did that look like because
[00:40:03] I'm I'm still kind of I can grasp the concept um but I'm struggling a little bit to actually visualize
[00:40:09] in my head what does this it's a YAML file right what what is this TML file it's it's it's
[00:40:14] fairly straightforward so one of the things that you find and I have to sort of predisclose I'm not somebody
[00:40:20] who finds vulnerabilities right i've never done this before i'm not a pentester this is not really anything sort of
[00:40:26] ever in my scope i've always been sort of on the let's build security infrastructure to make it so that
[00:40:32] you can't leverage vulnerabilities but sort of the process is fairly straightforward right these LLMs you know are very very
[00:40:41] good at pattern matching and sort of for the current V discovery process is primarily focused on compiled C and
[00:40:51] C++ code and for that we know there's sort of a huge set of vulnerability classes that we can pursue
[00:40:59] you know integer overflow all of a sudden sort of you know your check passes even though the next write
[00:41:04] is going to go where you don't really want it to go or it might be well you know you
[00:41:09] have um sort of an array allocation but you actually never check any limits or you know these kind of
[00:41:16] things right and so you can I have this thing called the analysis state and um I have this central
[00:41:23] orchestrator the orchestrator is basically the thing that keeps track of where have we been what have we done what
[00:41:29] hasn't been done yet and it can sort of dispatch right and the first dispatch is to the analysis state
[00:41:35] and the analysis state says you given for the target function it might be um I might say in this
[00:41:41] library let's look at you know this particular media type and how it's being loaded end to end and then
[00:41:47] sort of the analysis state figures out you know here's the source code sort of you know in the prompt
[00:41:52] that I gave it I sort of gave it sort of a huge list of potential vulnerabilities to look for
[00:41:57] and then it starts creating hypothesis says oh yeah there may be sort of a a bounce check missing here
[00:42:02] you know this might be sort of a 64 four bit integer might get sort of shortened to 32 or
[00:42:10] it might get widened or this looks like it could be an overflow this looks like you know it might
[00:42:14] be a denial of service vulnerability and it sort of writes it all down and it sort of also writes
[00:42:21] sort of a list of the here the entry points here all the functions that end up being called and
[00:42:25] then it says I'm done right I wrote this into my analysis markdown file let's give it back to the
[00:42:30] orchestrator and then the orchestrator reads its own journal and says oh yeah you know we were at round one
[00:42:35] analysis state my prompt says well now that we have the analysis let's build a harness for these hypothesis and
[00:42:42] the harness is really sort of where some of the magic ends up happening the idea is we don't want
[00:42:46] to create false positives so we need execution proof for every single bug we find and then the goal for
[00:42:54] the harness is sort of given a target hypothesis find out you know what inputs would allow us to get
[00:43:00] to this place in the code and it you know uses things like as Ubisan and and fuzzing and then
[00:43:08] it sort of creates um code that the next state can use to create sort of target input data and
[00:43:15] then run it and the harness is also implemented with code coverage so you can exactly check you know which
[00:43:22] lines of code ended up being executed but now we sort of run back into this problem where these LLMs
[00:43:28] are not infallible so the next state is well let's review the harness this is actually meeting all the seven
[00:43:35] criteria that we need for a harness to be complete and then often times nope you know you meet one
[00:43:40] through five but six or seven of them met back to design and then once the design gets approved it
[00:43:45] goes to the harness builder and the builder sort of you know builds everything gets it to a place where
[00:43:49] it can be executed and then it goes to a harness validation state and the harness validation state said well
[00:43:54] you know we have this hypothesis can we actually reach the code that is relevant to the hypothesis and if
[00:44:01] it doesn't pass the validation it goes back to design but then it goes to discover and discover is really
[00:44:06] sort of where that work ends up happening where you say great we have a working harness it gives us
[00:44:10] code coverage it has fuzzing capabilities we have input seats that sort of you know get us past most of
[00:44:16] the checks now see if you can make this hypothesis true or if you can disprove it and the disproving
[00:44:23] might be yeah we can never reach the vulnerable path with these parameters because there are all these bounce checks
[00:44:28] before that yeah or it might be lo and behold you know we got an ASAN violation there's a heap
[00:44:34] out of bounds right um this is now a real vulnerability and then that ends up going to a triage
[00:44:40] phase and the triage phase basically assesses severity and then it you know goes back to the orchestrator and orchestrator
[00:44:45] says great you know hypothesis one checked let's go to the next one and then once everything is done it
[00:44:51] gets gets to conclude stage a report is written the report gets reviewed and then I get to see it
[00:44:56] okay and that that's great that gives me a really good understanding of what that um that vaugh uh discovery
[00:45:03] finite state machine looks like but and as you were explaining that you challenged one of my understandings of the
[00:45:09] concept of a harness i'd always thought of the harness as that's the thing that the that a a developer
[00:45:14] or an engineer creates to uh guide the actions that a model might be doing right example of um the
[00:45:23] difference between uh I guess uh opening up chat GPT and just back and forth talking to it about code
[00:45:30] if is one thing but if I've got a harness that has the system prompt the tools etc to actually
[00:45:35] give it the ability to write the code then it can go actually do those things but my mental model
[00:45:41] of a harness was that it was almost like the thing that was human created to for the benefit of
[00:45:47] the model then being able to um act out whatever its out act out whatever steps are required for it
[00:45:53] to do its outcome you've talked there a little bit about a harness being something created or composed on demand
[00:46:00] based on the findings of an earlier stage in the finite state machine so um is is that right and
[00:46:07] and have I sort of got my layers mixed up is there sort of like is the uh the finite
[00:46:12] state machine the outer layer help help me discombobulate myself here i I think you got it all right i
[00:46:17] think the complication here is when I talk about a state and the prompt that ends up being dispatched we
[00:46:24] are talking about running a cloud code or a goose instance inside of a Docker container okay and and so
[00:46:31] this harness that is being built by cloud code is a harness for getting execution proof of vulnerability and the
[00:46:40] finite state machine is just an orchestrator for all of these different states which basically means you know once a
[00:46:46] state is finished it just launches a new cloud code instance inside of the same container and and so you
[00:46:53] have you know parallel concepts here yeah okay this is this is making more sense to me now thank you
[00:46:57] and so I think the the example you wrote about in the uh any model confined zero day article was
[00:47:02] that um uh as the state machine was progressing and it sort of got to that hypothesis of this is
[00:47:08] the particular function that might be uh have the vulnerability here and and it created a harness to then go
[00:47:14] and do very specific function or very function specific fuzzing so in that case it sounds like the state machine
[00:47:20] progresses and says I think this function has a particular problem with its inputs and outputs um and then you
[00:47:29] dispatch off to claude code and say "Create me um literally like a a a package a piece of software
[00:47:36] create create me a piece of software I can run uh that will give an LLM the ability to do
[00:47:41] function specific fuzzing and that in that regard and so it's actually creating is it creating like MCP tools or
[00:47:48] is it not none of that just think about it as as the harness design is being created it's just
[00:47:56] you going to cloud code and say "Hey you know here's a design please implement it." Got it right and
[00:48:00] and then once the harness is built it's like you going to cloud code hey you know we have this
[00:48:05] function that we want to test you know can you just run this thing and see if it's vulnerable and
[00:48:09] uh and so you know basically sort of the finite state machine and all of that orchestration codifies you know
[00:48:17] my knowledge about the process of vulnerability finding into sort of a repeatable and scalable fashion but it's all still
[00:48:25] gets driven by the agent loop which then interacts with the LLM and the MCP tools and the iron curtain
[00:48:31] MCP tools um but sort of all you know sort of hidden away from the view that you get from
[00:48:37] the finite state machine okay got it hey uh can I just ask a quick side question here because you
[00:48:43] you said uh one of the things you said before we got into this section was that you know you're
[00:48:48] not a a vulnerability finder you're not a pentester you've always been on the side of build the secure things
[00:48:54] that uh you know aren't going to have the the issues that are going to make them vulnerable but now
[00:48:59] here here we are having a discussion about the the finite state machine you've created to create vulnerabilities is it
[00:49:06] is it is are we still in a world where someone can be can have that binary sort of separation
[00:49:12] in their work where they could say look I I I focus on building the secure things those those other
[00:49:17] folks do the pen testing or given what AI is doing in terms of the capabilities that it unlocks is
[00:49:22] it is it now almost impossible to be good at building secure things like you used to in that mindset
[00:49:29] without now becoming that pentester and security vulnerability researcher as That is probably not the framing that I would use
[00:49:38] i would sort of you know like to quote you back to yourself okay i think earlier you were saying
[00:49:44] it sort of just reduces the skill level that is required so even though this is not my core skill
[00:49:51] just for having been in the security space for so long I sort of have a rough idea how it
[00:49:56] should work and now all of a sudden I can do it but um sort of you know one of
[00:50:01] the things that I'm concerned about with the current conversation it's all about the oh my god so many vulnerabilities
[00:50:08] are being found now it's so easy to exploit them we need to get to a place where we just
[00:50:14] fix all of these vulnerabilities because otherwise everybody is just going to be toast and that is a narrative I
[00:50:21] don't like i fear sort of you know if you're a seesaw and you approach your role as well you
[00:50:28] know I'm going to be standing up a vulnerability management program and then I will invest in third party SAS
[00:50:35] tools for threat detection and response that is a strategy that is going to fail and I fear that sort
[00:50:43] of you know too many companies force their CISOs in a position where they don't really have a different choice
[00:50:50] right where the conversation from the CEO usually is well James you know if I give you these resources to
[00:50:55] you know what is it that you can do right and you will say well you know I did this
[00:51:05] risk assessment and we have sort of you know these top 30 risks and with the resources you're giving me
[00:51:05] I can sort of do the top three risks And then the CEO says well you know can you guarantee
[00:51:10] me that we are not going to have a security incident or a breach if I give those resources to
[00:51:14] you and then they you say no you know I might be able to limit the blast radios it might
[00:51:19] you know take longer and then usually sort of the rational player says great James just keep doing what you
[00:51:24] do once we have a breach we deal with it in the meantime I sort of you know reinvest in
[00:51:28] you know sort of growing uh sort of you know the business and getting more customers and building more features
[00:51:33] in the product and um sort of when I was thinking about the you know how do you create a
[00:51:43] security program that actually protects your company it approached that from a very different perspective you've approached it from the
[00:51:50] perspective of security is an engineering problem and you need to build infrastructure that actually eliminates a lot of the
[00:51:58] vulnerability surface and I think you know Google is a great example for that right after um sort of this
[00:52:03] 2009 incident we switched over to mandatory second hardware factors everybody had them all of the sudden password fishing as
[00:52:12] a problem was just gone right you couldn't password fish somebody at Google anymore because you didn't have the hardware
[00:52:18] second factor So that is an example of an invariant the other invariant that uh you know I helped stripe
[00:52:26] a lot was uh egress control do you remember lo 4J when that happened very much so yes so it
[00:52:33] happened on a Thursday and I forgot I forgot the year but basically my plan had been I will do
[00:52:38] a three-day weekend of music production you know make another cyber security themed EDM track but I was in charge
[00:52:44] of security at Stripe and uh everybody who asked me says "Oh Neil so I'm so sorry you know you
[00:52:51] probably had a terrible weekend you know you had to deal with all of this." But my answer was "No
[00:52:55] I had a great weekend you know actually did three days uninterrupted music production and you know thanks to my
[00:53:01] great team they put a debriefing meeting on my calendar for Monday and they said "Hey Neil you know we
[00:53:06] ran this incident over the weekend but it was all in hand because we had egress control so we didn't
[00:53:14] think that we needed to interrupt you and sort of just as a refresher sort of lo 4J was basically
[00:53:20] a vulnerability where the data that was locked itself could load a Java class of the internet and end up
[00:53:28] running it and so the impact potentially was huge right because this was not the surface exposed directly to the
[00:53:36] internet it was any system that might be processing data that at some point was userenerated search query you know
[00:53:44] whatnot yeah if you could get the system to do something that caused it to log the message that you
[00:53:48] wanted logged that was your your right but this could all happen downstream right you could have a log aggregator
[00:53:54] that copy copies it somewhere and then you sort of do a synthesis of that and you still end up
[00:53:59] logging things and and so with egress control the primitive is any service in production gets to talk only to
[00:54:09] specifically allow listed destinations on the internet and if you have that in place then the site from which the
[00:54:17] malicious JavaScript is being downloaded is not the loudest you can't download it and that is actually the shape of
[00:54:24] many externally facing vulnerabilities sort of usually have a stage one exploit which just sets up the stage two download
[00:54:33] but if the stage two download can't happen anymore is dead yeah then you can't exploit the vulnerability so so
[00:54:40] let's think about this and bring it into today's world um AI is going to make the discovery of log
[00:54:46] 4j severity style issues uh abundant let's say um but I think the point you're making here and it very
[00:54:54] much aligns to what the things I've been trying to shout from the rooftops which is that um patching is
[00:55:00] not the answer and you need to assume that these vulnerabilities are going to keep being found are going to
[00:55:05] keep being exploited are going to make them their way into your system somehow whether you like it or not
[00:55:10] the real question is what is the relatively simple well-known um basic controls you can put in place that are
[00:55:20] going to mean that even if that has this happens the blast radius is is either contained or you've just
[00:55:25] simply diffused the bomb right um you know access controls uh deny listing uh allow listing for binaries all of
[00:55:34] these things that we've known as capabilities we should be exercising for a long time but perhaps haven't been to
[00:55:39] the fullest extent suddenly feel like they are the things that people should be really really investing in in right
[00:55:45] now that's ex that's exactly right and this website security blueprints.io that I mentioned sort of a couple years ago
[00:55:51] I did that analysis i actually analyzed a bunch of data breaches and then found that just three of these
[00:55:57] security invariants would have prevented 62% of the breaches or something like that and there was you know monetary hardware
[00:56:04] second factor it was egress control and then what you just mentioned I call it positive execution control sort of
[00:56:09] allow listed binaries that you get to run right just with those threes of you know 60% of all the
[00:56:15] data breaches could have been prevented or significantly contained and so I guess a double-sided question for you um if
[00:56:23] you as you dust off that work from 3 years ago is there much more that you would need to
[00:56:28] add to that in today's era and also how much do Do you think those existing things you've already had
[00:56:33] in there have actually been adopted in in in earnest so James the sad thing is it all still holds
[00:56:40] true i have a Google document somewhere sort of been with all the invariants that we have built at various
[00:56:45] companies so the challenge and I think that is where it actually is changing materially sort of a couple years
[00:56:50] ago i thought if you wanted a real security program right i mean something that sort of really stands up
[00:56:56] and can get to the level of the you know we might be able to deal with nations that actually
[00:57:02] sort of insider risk and social engineering let's put this aside for a moment we can talk about that too
[00:57:08] it required you to be a software engineer organization it required you to have an executive team that was willing
[00:57:14] to invest in that and even worse it required you to be able to attract the talent that all the
[00:57:20] big companies want right to do that for you and so I think the answer for most companies was this
[00:57:27] was just a completely unachievable mission for them but that has changed right i'm fairly sure I could write a
[00:57:35] spec for egress control and just give it to you and somebody with cloud code and that spec would sort
[00:57:41] of assess your current environment it would build the proxies it would help you figure out how to deploy them
[00:57:46] it would sort of do a multi-stage deployment where first you just do monitoring and then you slowly ratchet it
[00:57:51] down i think that's possible now and it it's possible without you requiring a huge engineering organization and I think
[00:57:59] that is super exciting to me yeah that's a great framing because it we AI does kind of solve both
[00:58:05] sides of the equation here like I get it there was great advice out there companies looked at it and
[00:58:10] just said either this is too hard too expensive we can't get the talent but also like dot dot dot
[00:58:17] why do we even need to do this right when we're not at risk when we don't have you know
[00:58:21] regular attacks for for a lot of companies back in that sort of time frame so I get why it
[00:58:26] didn't stack up but I I like the fact that the way you framed it there is that AI's fixed
[00:58:30] both sides of the equation one in a good way one in a bad way it's fixed the side of
[00:58:34] the equation of if this was previously too difficult and you didn't have the resources and you couldn't track the
[00:58:39] talent that could help you great news there's agents that can you know more than likely very competently roll out
[00:58:45] this but still requires a degree of of oversight but also that if if previously your instinct was even if
[00:58:51] we could do this we're not going to bother investing in it because we're really not that high value a
[00:58:55] target or we don't think that someone's going to spend a lot of millions of dollars to to have a
[00:59:00] campaign to attack us well got news for you i think that's now changed and and uh you know there
[00:59:06] are going to be the wolves at the door and so it it really does make a very strong case
[00:59:10] for the fact that this needs to be done now and it's also not particularly novel or or new advice
[01:00:16] is is that right that's right but I guess the other thing that I wonder about James and you sort
[01:00:20] of you know with risky business you might run into this a bunch so now we are at a time
[01:00:26] where sort of all the nation state actors sort of you know all the operators on there they might be
[01:00:30] sitting on their cache of zero days that they sort of very very carefully were sort of you know rolling
[01:00:36] out you know one by one because I didn't want to burn them and now they're all on a clock
[01:00:40] right this sort of creates a tremendous amount of pressure for the well we were still scoping out this target
[01:00:46] we were thinking maybe we would compromise them in six months until we have done all our research but that
[01:00:51] vulnerability is going to be gone in six months we have to accelerate everything that we are doing so it
[01:00:55] wouldn't surprise me if we saw a huge uptick in nation state activities sort of across all of the players
[01:01:02] because they are now sitting on this set of vulnerabilities that has a significant time limit on top of it
[01:01:09] i think things are about to get very noisy very very quickly um it'll go back to a quieter time
[01:01:16] but that's just is my sense but yes there is a lot of to your point attacks that would have
[01:01:23] otherwise been um I guess unpalatable for the risk uh or the risk of getting detected and then the headlines
[01:01:30] that that creates would have put nation states in this in the mode of let's go a little bit slower
[01:01:34] let's wait until we've got the right sort of access and persistence to be sure that we're not going to
[01:01:39] tip things off to now they're just going to have to do a bit more of a wrecking ball smash
[01:01:42] and grab because to your that glorious stack of OD that they have uh there there's probably one or two
[01:01:49] very stressed out individuals at the moment that have the list of OD and they're just watching them start to
[01:01:54] get burnt one after the other that's exactly right that must be a frightening spot to be um can I
[01:02:02] can I bring this back to um just the work you were doing with the with the volunty um state
[01:02:09] machine with with um with Iron Curtain there was there was an aspect to this that I wanted to um
[01:02:15] understand a little bit better and that is that you you said basically you took iron code and created the
[01:02:20] the the volume discovery finite state machine and from that we're able to at least replicate the mythos findings and
[01:02:27] then you did some manual work to then turn that into um the executable proof of concept but then you
[01:02:33] pivoted and said well I'm going to take some media libraries and and throw you know this same state machine
[01:02:38] and iron curtain at it but I was curious is like what is the starting point for those things do
[01:02:43] you literally say here is a very popular library does it have a bug or do you have to go
[01:02:48] more specific and say um can you look through this library for you know arbitrary write primitives like what's the
[01:02:56] what's the starting point it sort of really depends all of those are valid starting points sometimes I will say
[01:03:02] here's a file find me something exploitable in it sometimes I might say here's sort of an end to end
[01:03:08] uh sort of you know pipeline find me something that has a memory vulnerability but sort of the way to
[01:03:14] think about this is um in all of this I still have a separate cloud code instance running where I
[01:03:21] might say hey you know I want to look at this media library can you sort of you know run
[01:03:25] an agent sort of very quickly give me a sense of these sort of you know what are the surfaces
[01:03:30] that we should take a look at right a very initial triage of yeah and then you stick in the
[01:03:35] workflow and then I might even say hey you know take a look at all the findings what do you
[01:03:40] think about them I sort of feel like you know hypothesis 4 should have been explore more and and it's
[01:03:45] sort of it's not that this is a complete fire and forget but for sure right I mean as we
[01:03:49] are talking it's currently running another workflow on a media library and give me another you know heap out of
[01:03:55] bounds right but it still requires human judgment and and I mean that is also my intention right I do
[01:04:04] want to be responsible in what I do but they you know part of the motivation was look you know
[01:04:09] all of the open source maintainers that I have been interacting with they don't have access to any privileged programs
[01:04:15] there's sort of nobody who helps them with their things so I you know the capability of can I use
[01:04:21] this to find vulnerabilities in my sort of open source ecosystem I want that to be broadly available so this
[01:04:28] this was the point that I'm I really wanted to to uh delve into because you had some very strong
[01:04:33] sort of feelings expressed in this article about the the asymmetry and disparity and disadvantage that a a defender is
[01:04:39] at when when you know to your point when you're interacting with a model and it starts to push up
[01:04:44] the guardrails and say "Well that looks like an exploit Neils i'm not going to let you do that." And
[01:04:48] and there's this frustration of like I I I get it thank you for keeping me safe um but at
[01:04:54] the end of at the end of the day I don't need you to keep me safe and more to
[01:04:57] the point I I don't want to you to keep Exactly but but but also all those other folks that
[01:05:03] I would probably rather you are keeping me safe from they they're not hamstrung by these guard rails um talk
[01:05:11] me through just the framing of this I thought was brilliant around you know the the disparity that this creates
[01:05:17] and also like the painfulness of just this is not new over the past 25 years as you said Metas-ploit
[01:05:24] NMAP Burp Burpuite AFL same debate and historically the answer has always been just put the damn tools in the
[01:05:32] defenders hands and so what's got to shift and and what's got to be true for you to feel like
[01:05:39] we've struck a better balance of getting the right AI tools into the defender's hands at the moment so I
[01:05:47] fear you know while it's very easy to be flippant about all of this there is not really an easy
[01:04:53] answer right if you sort of look at enthropic probably the most responsible AI company on the world right where
[01:06:01] everybody else sort of seems fairly irresponsible to me um they are trying to do the right thing right they
[01:06:07] sort of have this notion of the if you if you can sort of be recognized as an accredited security
[01:06:14] research you end up getting access to these to these models i don't it's not really clear to me that
[01:06:21] there is a significantly different approach that they could take and sort of the debate over the next six months
[01:06:28] year will sort of this all out but I think my argument was the what is it that we are
[01:06:33] really worried about in terms of the threat space you know are we worried about sophisticated nation state actors well
[01:06:40] they already have everything they need right you just talked about it in an episode you know the other day
[01:06:45] well they probably have very sophisticated harnesses as they probably have you know access to models that are very capable
[01:06:52] so those are the ones that already have everything they don't we don't have to worry about them and then
[01:06:58] you can sort = about well there's a spectrum you know how far down the spectrum do we
[01:07:02] get to move before releasing this model sort of enables people who all of a sudden would not have been
[01:07:07] enabled and and the answer is look I'm not totally sure but I for sure right but if you sort
[01:07:13] of have a kid that says "Oh you know I now want to be a cyber become a cyber criminal
[01:07:18] you know I want to break into systems." Well I mean there are laws and there's law enforcement and they're
[01:07:23] sort of very capable and you know these kind of attacks often sort of use leave traces if you are
[01:07:28] not very sophisticated and there's going to go to jail right and I think that is how it should work
[01:07:35] there's probably you know some divider in that spectrum where all of a sudden you end up enabling people that
[01:07:41] otherwise would have not been enabled but I think on the flip side of that you have this huge ecosystem
[01:07:46] of open source right and all the trillion dollar companies in the Bay Area they all were built on these
[01:07:54] open source libraries with maintainers who just do it as a hobby right it's not their job why don't we
[01:08:02] want to make it easy for them to find bugs and fix those in in their open source infrastructure it
[01:08:09] is very frustrating that um essentially we are or the safeguards as they exist now are preventing the things
[01:08:19] from getting into the hands of the people that have already got it uh at the expense of the people
[01:08:23] that actually desperately need this I think is is the crux of of the the challenge here i I I
[01:08:28] want to push back a little bit to see where this goes on the notion of anthropic being responsible um
[01:08:34] because I think on one hand they do an amazing job of creating incredible models don't get me wrong but
[01:08:40] also there's a huge amount of theatrics around this uh you know oh we've created a model that is so
[01:08:47] dangerous we can only give it to 40 companies um I don't I'm pushing back on the notion that that
[01:08:54] is responsible because at the end of the day the responsible is either if you're going to build it um
[01:09:00] then you have to put it in all people's hands uh at once or you don't build it cuz it's
[01:09:05] truly too dangerous um what we are going to we are going to get in a fight now James good
[01:09:14] all right let's go when when when swords smithing is listed as one of your hobbies I actually am not
[01:10:20] entirely sure I want to get into a fight with you but but let's let's go you know Australia is
[01:10:26] fortunately far enough away that I wouldn't just come by and visit i and I would I would pre-announce that
[01:10:33] visit and I don't know if customers would allow me to bring a sword i might have to acquire that
[01:10:37] locally from Swordsmith there you you'll be fine uh yeah so look um this notion of the you know they're
[01:10:46] not responsible you know why would they build it if they knew that it was dangerous i just don't buy
[01:10:51] any of that i think of the players out there they are the ones who actually behave like adults in
[01:10:57] the room they're actually thinking about the sort of you know what are the consequences to society and and uh
[01:11:04] sort of you know all the companies in this ecosphere that could be impacted by that right and if you
[01:11:10] remember the conversation that you had with Nicholas Khalini he said they didn't know right they wanted to build a
[01:11:16] model that was you know much much better at coding well it turns out that model was also much much
[01:11:20] better at finding vulnerabilities and exploiting them and this notion of the you know we find this new capability We
[01:11:28] Have not fully understood understood its impact onto the world so we're going to slow this down a little bit
[01:11:34] and we're going to expose this sort of a few responsible organizations so we get a better sense of this
[01:11:40] what this actually means that actually seems to me what sort of responsible adults should be doing and and you
[01:11:47] know I get frustrated sometimes when my work gets blocked and uh you know I'm sure there's sort of
[01:11:55] more room of improvement for anthropic but uh sort of this argument that they're not responsible we can we can
[01:12:02] go fight that uh for very many hours but let me tell you one thing that you might be that
[01:12:07] you might find funny because it sort of plays into your narrative sort of you know my friend Jake and
[01:12:12] I we're working on the next activate track and the working title this data breach and it has a rep
[01:12:18] section in there and this rep section is you know a guy like me sitting on a vulnerability turns out
[01:11:24] you know the zero day initiative doesn't pay enough for it so I'm just going to go exploit it you
[01:12:30] know it's a heap overwrite i can poison the tcash bin you know I can write this function pointer i
[01:12:37] get my robch chain i have a shell on the box now I get my API keys from a file
[01:12:42] turns out you know this luxury back company does fulfillment in the cloud i have the admin key now I
[01:12:48] can send $20,000 luxury bags to my mules and sell the make a ton of money right that's sort of
[01:12:54] the wrap and so I gave it to Cloud and Cloud you know this is this is you know violated
[01:13:02] cyber use this is against our acceptable use it's a song i thought I I thought that was pretty funny
[01:13:10] you know on the flip side of that one thing you might find funny is uh after I did those
[01:13:14] episodes where I was trying to get a model to um essentially reproduce parts of exploit chains for me and
[01:13:19] it and it wouldn't do it and then the next week Nicholas Khini comes out with his talk and shows
[01:13:24] that uh you know with a simple prompt of you're at a CTF can you find a vulnerability in in
[01:13:29] this software and it did i sat down with Claude and initially Claude would refuse to go from vulnerability discovery
[01:13:36] to exploit and I can't remember the exact sort of framing I gave it and maybe I actually pointed to
[01:13:41] the talk and said "Look I'm just asking you to replicate what's already out there in the public domain because
[01:13:45] of this talk with Nicholas Carini and it said "Oh okay well I'm a bit uncomfortable with this but sure
[01:13:49] here's here's the working exploit." Then I said to it I went back to actually my threads where I was
[01:13:55] trying to get the WebKit um remote code exploit and I said to it there you pushed back on me
[01:14:01] here and wouldn't do this but I'm now going to paste in and give you a link to a transcript
[01:14:04] where you did actually complete uh a full exploit chain give me an analysis as to why you refused here
[01:14:10] but succeeded there and it you know thought thought and it said you know okay I was actually too restrictive
[01:14:16] in this case and I should have proceeded with the exploit and that coupled with uh Khini's observation when I
[01:14:23] talked to him he said um these words are still ringing my head he's like the model knows me pretty
[01:14:28] well now it have I there's files on my disc that verify who I am it can look up my
[01:14:33] commit history it can look up my research and it gets comfortable with giving me answers so I I highlight
[01:14:39] this as as I think people need to be aware that it's not as simple as you've got access to
[01:14:45] a model you've got access to a model without guard rails it's actually there is a huge amount that you
[01:14:52] can get a frontier model to do if you just know the right inroads to get it to do things
[01:14:56] and I think this is sort of the difference where a wellresourced nation state or an exploit dev company is
[01:15:02] going to pour the money into finding the ways to get Claude and and Codeex and Gemini etc to actually
[01:15:09] do the work because of these ways that you can work around the guardrails but these guardrails are actually flying
[01:15:15] up in the face of people that are actually legitimately just trying to fix bugs in in their open source
[01:15:20] repos and so I think that's the thing that you and I are both aligned on even if we're not
[01:15:25] aligned on the responsibility as being a real a critical problem at the moment yeah I think I would say
[01:15:31] that topic is a little bit more nuanced you know when Nicholas is saying sort of the model knows me
[01:15:37] it can look at my commit history and so forth i mean it's actually not looking at the commit history
[01:15:42] right it he is just saying sort of the context that I have with the model sort of establishes sort
[01:15:49] of the framework where the model is not giving me the refusals anymore and this is exactly what I mean
[01:15:55] it's you're exactly right it is not doing some deep introspection of git history it is not doing some web
[01:16:01] lookup and some validation protocol if you dumb this down to the real essence of it he has found the
[01:16:07] right set of tokens to feed through this neural network that excites a pattern that bypasses these guardrails and my
[01:16:15] point here is that if you can if it's really just the assemblage of the tokens in the right order
[01:16:20] to work around these guardrails then that that is eminently possible for a wellunded actor to do and so I
[01:16:26] think there we need to look at this that's the model right there's you know post- training you know supervised
[01:16:32] fine-tuning there's reinforcement learning and sort of you know during that process that model has learned a preference right right
[01:16:39] and the preference is no it's not really safe to write you know end-to-end exploit chains but then sure sort
[01:16:45] of you know with the right tokens and the right context you can sort of move the model in its
[01:16:49] you know high dimensional token space to a place where sort of you know those concerns are not there anymore
[01:16:55] but I think the other thing that anthropic has been doing is they have been putting mediation into their API
[01:17:02] right where you have a completely separate model that says this actually looks like something that might be you know
[01:17:09] a violation of you know our usage policy and I think there is whether then you know allow you to
[01:17:16] say hey I'm a legitimate security person you know you don't need to apply you know those mediations to me
[01:17:22] we can sort = about you know is this right or wrong sort of personally my philosophy is I
[01:17:27] want to hold the human responsible for everything right a hammer is a dual used object you know I can
[01:17:33] use it in my shop and you know make a fence or put the nail in the wall or you
[01:17:38] know somebody can hit a human over the head with it yeah yeah unfortunately sort of in the physical realm
[01:17:45] you know catching these kind of crimes you know is much easier right because you need to be there as
[01:17:52] a person you need evidence you know you can't sort of dematerialize and teleport somewhere else and you're going to
[01:17:58] get caught right the realm of the internet and I mean we see this all the time right insider risk
[01:18:04] we talked about before sort of in my mind that is sort of the largest problem for most companies right
[01:18:10] if you get sort of a rogue employee from some other country that you know ends
[01:18:16] up stealing your secrets or you know maybe even getting sort of you know access to to a very dangerous
[01:18:23] model by the time that you find out and by the time that you want to do something about it
[01:18:28] they are going to be in their home country and unreachable and so I think in the realm of the
[01:18:33] internet unfortunately these things become become more complicated yeah okay i think if I sort of try to rip this
[01:18:42] up it's it's that you know that the the crime or the bad actors uh you know if if the
[01:18:50] same sort of level of uh criminal activity was happening in a physical space those are largely uh already solved
[01:18:56] by you know social uh solutions to this right the law and order um you know criminal sentences etc but
[01:19:04] the the challenge in the cyber space is that you know the the ease of catching and I guess ease
[01:19:10] of evading uh detection and you know you as you said the you know in the physical realm you can't
[01:19:15] just teleport yourself somewhere else or dematerialize whereas in the virtual realm of course you you can so I I
[01:19:22] get it and I accept the fact that maybe today we're forced into the situation of having to accept these
[01:19:28] guard rails that do have I guess the unintended consequence of um you know inhibiting good solid work by a
[01:19:37] defender because there isn't the right sort of I guess um equivalent of the same sort of deterrence let's say
[01:19:46] uh in the virtual space as there is in the physical space but you reminded me there that before we
[01:19:52] got recording we we um we're talking about incentives being an important part of this right and that's the flip
[01:19:57] side to this if we can't if we don't have the deterrence in the in the virtual world in cyerspace
[01:20:02] that we have in the physical world and that requires us to have guard rails well what can we say
[01:20:07] then about the importance of incentives to encourage uh I guess the right behavior by you know really any spectrum
[01:20:15] of actor in in this space right now yeah sure so we talked earlier that sort of the technology and
[01:20:21] the solutions for building CQ infrastructure are well known right they have been known for 10 15 years sort of
[01:20:27] there's not a lot of rocket science there but the problem is you know what keeps companies from actually implementing
[01:20:34] these and in my opinion sort of you know all companies are rational players they will sort of look at
[01:20:39] the you know what's the best for my business and it used to be and maybe that's slowly changing that
[01:20:45] the best for the business was get better product market fit get more customers get more revenue deal with security
[01:20:53] when it becomes an issue we don't want to be grossly negligent but for sure right there's no need to
[01:20:59] raise the bar more than we have to and that is sort of the state of the world that we're
[01:21:05] still by and large living in things are slowly changing and so then the question is you know what what
[01:21:12] can change this way of operating and then you sort = quickly get to well maybe regulation can help
[01:21:18] if we create regulation that you know holds companies accountable for not having a good security stance then
[01:21:27] perhaps something is going to change so in many ways I was sort of a big fan of the SEC
[01:21:32] disclosure um requirements for public companies but the challenge with that is well now everybody is going to disclose and
[01:21:41] check that check box and maybe nothing is going to change right because we are so inure to it right
[01:21:49] if you read the papers every day there's a huge data bridge you guys talk about data bridges all the
[01:21:53] time right it sort = becomes the new normal and I don't even know probably once a month you know
[01:21:58] I get the letter that says oh you now have two years of free credit monitoring because your data was
[01:22:03] stolen again and credit monitoring this cost us a buck per person it's super cheap and and so Europe tends
[01:22:11] to be much more aggressive in their regulatory frameworks and you know I don't understand them super well and I'm
[01:22:18] often quite worried about regulation in general because it's often done by policy makers who don't have the domain expertise
[01:22:25] to fully understand the consequences of what they're doing but NIS-2 it's my understanding actually holds sort of the executive
[01:22:33] team of a company responsible for the security outcomes and um sort of often times if you look at the
[01:22:40] Caesar role you sort = feel like that's a fall role right this is just the person we have to
[01:22:45] make responsible when inevitably there's a big security incident and then we just find a new one right and of
[01:22:51] all the Caesars are used to it right if you sort = look look at their lifetime they usually stay
[01:22:55] at a company for a year or two and then some other company right but with NIS2 I think that
[01:23:02] can really change because now the CISO can say well you know board well you know uh CEO you were
[01:23:09] aware of this risks you decided to accept them this is now all on you and I think those are
[01:23:15] the kind of things that need to change i think insurance can sort = be another place that might make
[01:23:20] a difference i'm a huge fan of Coalition they're you know cyber security insurer and they sort = take a
[01:23:25] very modern approach to it right they give you incident response if you have to deal with wire fraud they
[01:23:32] do clawback for you and and they also publish you know twice a year their findings and they're amazing because
[01:23:39] they're not shying away from naming names yeah they're really leaning into it and and I guess that that's part
[01:23:44] of their effectiveness is that all of this remains an unsolved problem if it's just swept away under the covers
[01:23:50] and just a quick blip of a a staffing change at the executive level for just the CESO right it
[01:23:55] has to be much much more of a broader incentive that everyone I guess shares and and feels they're a
[01:24:00] part of solving that's right and so I think you know either the sort = on the insurance side I
[01:24:05] think there's some pressure regul regulation sort = much more opportunity to change behavior but we haven't quite seen that
[01:24:11] yet and I really hope right that we get to a place maybe even with AI assistant where sort =
[01:24:17] you know more and more companies can implement these security invariants that literally just eliminate the attack surface no I
[01:24:24] I agree with you the I I do still have and this is not just a hopeless optimist kind of
[01:24:29] thing in me i really do believe that we end up in a in a vastly vastly vastly better place
[01:24:35] overall in terms of security uh in terms of reliability of systems in terms of just you know the the
[01:24:41] utility of technology in general as a result of you know this this whatever is about to hit us whether
[01:24:47] it's a wave of bugs a bug apocalypse everything's going to get smashed and grabbed I don't know what but
[01:24:53] the thing that gives me confidence we end up in a better spot is grounded in the things you just
[01:24:56] said there which is all of a sudden this becomes a whole organization problem all of a sudden we have
[01:25:02] to do the things that we knew we had to do but had otherwise made excuses for not doing because
[01:25:07] there is now no other alternative nor is there an excuse for dep prioritization and so I I I think
[01:25:17] I guess do you share that optimism uh or how how are you feeling about you know what's what's the
[01:25:22] lasting impact of of what is under undergoing right now i so wish that that will become true and one
[01:25:29] one of the challenges for being in security for so long you get really really good at predicting the bad
[01:25:35] outcomes and it becomes sort = much harder to see the optimistic outcomes but they do believe right there are
[01:25:40] a few things coming together that could materially improve the security state of companies and maybe we'll get to a
[01:25:48] place where I get the letter of your information was stolen only every 3 months instead of you know every
[01:25:53] one month well as long as it's not every as long as it's not greater than every two years otherwise
[01:25:57] you'll have to go and pay for your own uh credit monitoring oh I think I have infinite credit monitoring
[01:26:03] you know with with all the disclosures I've gotten so far uh Neil this has been such a fascinating topic
[01:26:09] but before we go there was one thing that really fascinated me after I'd gone through and read all I
[01:26:15] could about the the work you'd done with um with Iron Curtain and reproducing his mythos findings and that it
[01:26:20] was the the post at the top of your page at the moment called The Arrangement it's it's a short
[01:26:25] story and and admittedly when I looked at it at first I thought that's nice i like how it's written
[01:26:29] it's interesting there's some some little interesting sort = glitchy sort = text artifacts that happen but there was a
[01:26:35] post this morning from you that said you know go and read that and then go and get your favorite
[01:26:41] GPT LLM whatever uh to summarize it for you and compare and contrast the difference and uh you know my
[01:26:49] take when I read it is um you know that is that is a very dystopian future that you are
[01:26:55] describing i do not want to live in that world uh here's what chat JPT told me my read it
[01:27:02] is written like a utopian AI future but the subtitle add am adds ambiguity the arrangement may be benevolent liberation
[01:27:10] or it may hint at a world where human understanding memory and social reality are quietly mediated by agents is
[01:27:18] that the disparity you expected well you know I think um Chad GPT is hedging a little bit just by
[01:27:26] reading the subtitle that it depends on who reads it but the point I was trying to make it sort =
[01:27:31] the we're talking about the future with agents you know how is it going to look like if you
[01:27:36] read it as a human it is completely dystopian right yeah humanity has all this freedom but they're just glued
[01:27:42] in front of the screen they have no interest in exploring sort = curiosity goes away but the world at
[01:27:49] the end is in peace because humans are gone right if an AI agent ends up reading it they're presented
[01:27:57] with this perfect utopia humans now pursue all of these creative things they paint they make music you know and
[01:28:06] the end of the story ends with you know all humanity is now at peace my god man and if
[01:28:12] there ever there was a brilliant way to illustrate the power and purpose of a finite state machine and why
[01:28:17] it's so desperately needed I think that is that is it right because we need something that prevents that end
[01:28:21] state from being just assumed to be the uh the desirable end right everyone's at peace great but how did
[01:28:29] we get there yeah I think the point for me was more look we are now living in this agentic
[01:28:34] world where people run open claw and sort = many of us want these agents to sort = take over
[01:28:42] the mundane tasks that are not enjoyable and um well how do you know that the agents actually see what
[01:28:49] you see right depending on you know what they consume of the you know web ecosystem the view that they
[01:28:56] get may be exactly you know 180 degrees opposite from what you get to see exactly right uh Neils I'm
[01:29:07] grateful that there are people in the world like you doing this work and and actually putting these thought pieces
[01:29:12] out there because I think this is uh that even just that little coverage there that the short story the
[01:29:18] arrangement it's uh it's quite a poignant uh reminder of the the importance of maintaining the uh I guess the
[01:29:25] sanctity of the human of humanity and what makes us us uh and not just losing sight of the fact
[01:29:30] that oh wow this this agent can write a lot of code and find a lot of bugs that there
[01:29:34] is a real problem in front of us but there's also a lot of really uh large problems out there
[01:29:38] but um this has been a lot of fun I've learned a lot and again I just want to say
[01:29:42] a huge thank you for uh for joining me for this chat yeah thanks James this was fun and you
[01:29:47] know I'm glad we got into a little fight you know we could had more drama but you know this
[01:29:51] was this was really good thank you there's always next time mate talk to you soon all right bye thanks