The Door Was Open the Whole Time
How a forgotten setting exposed the biggest AI safety company's secrets to the world.
Anthropic is the company behind Claude — the AI chatbot that millions of people use every day. It was founded in 2021 by former members of OpenAI, the company that made ChatGPT. And from the very beginning, Anthropic built its entire identity around one word: safety.
They said they'd be different. More careful. More responsible. The "safety-first AI lab" — that's how the world knew them. And for five years, that reputation held. Investors poured in $67 billion. Eight of the ten biggest companies on earth signed contracts with them. The entire brand was built on trust.
Then, on March 26, 2026, two cybersecurity researchers discovered something astonishing.
Anthropic's content management system — that's just the software used to manage their blog and website — had a simple setting error. All uploaded files were set to "public" by default. Nobody had changed it to private. That meant roughly 3,000 internal files — draft blog posts, internal images, PDFs, planning documents — were sitting on the open internet, visible to anyone who knew where to look.
This wasn't a hack. No one broke in. No one cracked a password or exploited some sophisticated vulnerability. Someone at one of the most well-funded AI companies on the planet simply forgot to flip a switch.
"Human error in the configuration of our content management system."
— Anthropic spokesperson, responding to Fortune's report

The researchers who found it — Roy Paz from a cybersecurity company called LayerX Security, and Alexandre Pauwels from the University of Cambridge — notified Fortune magazine. Fortune reviewed the documents and contacted Anthropic. Within hours, the company pulled the public access.
But the damage was already done. Because among those 3,000 exposed files was a draft blog post that was never supposed to see the light of day. A blog post announcing a model so powerful that even the company building it sounded scared.
The Model Nobody Was Supposed to See
Inside Claude Mythos — the most powerful AI model Anthropic has ever built.
The draft blog post was written as if it were ready to publish. It had headings, a planned publication date, and the careful language of a product announcement. It described a model called Claude Mythos.
To understand why this matters, you need to know how Anthropic currently labels its AI models. Think of it like T-shirt sizes. Right now, they sell three tiers: Haiku (the smallest and cheapest), Sonnet (the middle option, good enough for most tasks), and Opus (the biggest and smartest, but also the most expensive).
Mythos isn't a new version of any of those. It's a brand new tier above Opus — bigger, smarter, more powerful than anything the company has ever released. The internal codename for this new tier is Capybara. The draft blog post described it as "larger and more intelligent than our Opus models — which were, until now, our most powerful."
Think of AI models like car engines. Haiku is a small, fuel-efficient engine — fast and cheap. Sonnet is a mid-range engine — balanced for everyday use. Opus is a V8 — powerful but expensive. Mythos/Capybara is a jet engine — something in a completely different category, built for tasks no previous model could handle.
The draft made several specific claims about Mythos that sent shockwaves through the tech industry:
Compared to Opus 4.6 — their previous best — the draft said Mythos achieves "dramatically higher scores" on tests of software coding, academic reasoning, and cybersecurity. The company described it as "by far the most powerful AI model we've ever developed."
But what made the industry truly nervous wasn't the performance. It was the cybersecurity section.
The draft warned — in Anthropic's own words — that Mythos is "currently far ahead of any other AI model in cyber capabilities" and that it "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."
Read that again. The company building this model was warning the world that it could help attackers find and exploit weaknesses in software faster than any human security team could fix them. And that other companies would soon build similar models.
Their plan? Release Mythos first to cybersecurity defense companies, giving the good guys a head start before the technology became widely available. A phased rollout, starting through an invite-only system on their developer platform.
Also buried in the leaked files: details of a private CEO retreat planned at an 18th-century English countryside manor, where Anthropic's CEO Dario Amodei was scheduled to present Mythos to potential enterprise customers in person. The kind of event where a handshake over dinner could mean a multi-million-dollar deal.
Anthropic confirmed the leak was real. A spokesperson told Fortune they're "developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity" and are "being deliberate about how we release it."
One thing they were less deliberate about: the name. The researchers found two versions of the same blog post — one called the model "Mythos," the other called it "Capybara." The subtitle of the Capybara version even still read "Claude Mythos." The company was apparently still deciding what to call it.
Five Days Later, It Happened Again
How 512,000 lines of source code ended up on the public internet — by accident.
If the Mythos leak was embarrassing, what happened next was devastating.
On March 31, 2026 — just five days after the CMS incident — Anthropic released a routine software update for Claude Code, their popular AI coding assistant. Claude Code is a tool that software developers use to write, fix, and manage code with help from AI. It had exploded in popularity. By February 2026, it was pulling in $2.5 billion a year in revenue — one of the fastest-growing developer tools in history.
But version 2.1.88 of Claude Code contained something it wasn't supposed to: a 59.8 megabyte file called a source map.
When software companies build a product, they write the original code in a clean, readable format. Before shipping it to users, they compress it into a smaller, scrambled version (so it loads faster and is harder to read). A source map is like a decoder ring — it maps the scrambled version back to the clean original. It's meant only for internal debugging. Shipping it to the public is like accidentally publishing the blueprint to your entire product.
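To make that concrete, here is a minimal TypeScript sketch of the kind of pre-release check that could catch this mistake (the function and bundle strings are invented for illustration, not Anthropic's build code). Minified bundles point at their source map with a standard comment, and a packaging step can refuse to ship anything that still carries that pointer.

```typescript
// Hypothetical pre-release check: does a bundle still reference a source map?
// Minified bundles typically end with a comment like:
//   //# sourceMappingURL=cli.js.map
// Shipping the .map file alongside it hands out the readable original.

function referencesSourceMap(bundle: string): boolean {
  // The sourceMappingURL comment is the standard pointer that browsers
  // and tools follow to locate the "decoder ring" file.
  return /\/\/[#@]\s*sourceMappingURL=/.test(bundle);
}

const shipped = 'console.log("hi");\n//# sourceMappingURL=cli.js.map';
const clean = 'console.log("hi");';

console.log(referencesSourceMap(shipped)); // true: would block the release
console.log(referencesSourceMap(clean));   // false: safe to package
```

A check this small, run in the release pipeline, is exactly the kind of guardrail that turns "human error" into a non-event.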
At 4:23 AM Eastern Time, a blockchain security intern named Chaofan Shou spotted the file. He posted the discovery on X (formerly Twitter), complete with a download link. Within hours, the entire codebase was mirrored across GitHub and analyzed by thousands of developers around the world.
The scale was enormous: roughly 512,000 lines of code across approximately 1,900 files. The complete internal architecture of one of the most commercially important AI products in the world — laid bare for anyone to study, copy, or exploit.
Anthropic called it "a release packaging issue caused by human error, not a security breach." They confirmed no customer data or model weights were exposed. But for the second time in five days, the safety-first AI company had accidentally published its own secrets.
And here's where it gets even more painful: this exact same mistake had happened before. A nearly identical source-map leak occurred with an earlier version of Claude Code back in February 2025 — more than a year earlier. The same kind of debugging file, shipped to the public by accident.
There's a likely explanation for how it happened this time. Claude Code is built on a tool called Bun (a JavaScript runtime that Anthropic actually acquired at the end of last year). A bug in Bun — filed on March 11, still open and unfixed at the time of the leak — causes source maps to be served in production mode even when they're supposed to be disabled. If that bug caused the leak, then Anthropic's own toolchain shipped a known defect that exposed their own product's source code.
The Products Nobody Was Supposed to See
Inside the leaked source code: KAIROS, Dream Mode, Buddy, Undercover Mode, and more.
The Mythos leak was a marketing secret getting out early. The Claude Code source leak was something far more consequential — a look inside the engine room of a company planning the next generation of AI tools. Developers who analyzed the code found a roadmap of unreleased products and features that paint a clear picture of where Anthropic is heading.
Here's everything that was found inside — and what each product is built to do.
KAIROS

This is the biggest product reveal from the leak. KAIROS is a mode hidden inside Claude Code that turns the AI from something you talk to into something that works while you're not even looking.
Right now, every AI coding tool works the same way: you ask a question, it answers. You close the window, it stops. KAIROS changes that. It allows Claude Code to keep running in the background, even when the developer is idle — sleeping, commuting, in a meeting, whatever.
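The background-agent idea can be sketched in a few lines of TypeScript (all names below are invented; the leaked KAIROS internals are far more elaborate): a queue of pending work that the agent drains whenever an idle signal fires, instead of stopping when the conversation ends.

```typescript
// Hypothetical sketch of the background-agent pattern, not Anthropic's code.
type Task = { id: string; run: () => string };

class BackgroundAgent {
  private queue: Task[] = [];
  completed: string[] = [];

  enqueue(task: Task) {
    this.queue.push(task);
  }

  // Called on an idle signal (a timer, an OS idle event, etc.);
  // drains pending work without any user interaction.
  onIdle() {
    while (this.queue.length > 0) {
      const task = this.queue.shift()!;
      this.completed.push(task.run());
    }
  }
}

const agent = new BackgroundAgent();
agent.enqueue({ id: "lint", run: () => "lint: clean" });
agent.enqueue({ id: "tests", run: () => "tests: 42 passed" });
agent.onIdle(); // the developer walked away; the work continues
console.log(agent.completed.length); // 2
```

The real system would persist its queue and run long-lived agent sessions, but the shape is the same: work keeps moving while the human is gone.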
Dream Mode

Connected to KAIROS, Dream Mode is like a "sleep cycle" for the AI. Every AI model has a problem: the longer a conversation goes, the more confused it gets. Important details from the beginning of a session get pushed out by newer information. Engineers call this "context entropy" — but in plain English, the AI's memory just gets messy.
Dream Mode solves this. It's a process called "nightly memory distillation" — the AI reviews its entire memory, keeps what matters, discards what doesn't, and reorganizes everything into a clean structure. The code reveals a three-layer memory system, far more sophisticated than the simple "remember or forget" approach other AI tools use.
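As a rough sketch of what "nightly memory distillation" could look like (the scoring rule and threshold below are invented for illustration, not Anthropic's): decay each memory's importance with age, drop whatever falls below a threshold, and reorder what survives.

```typescript
// Toy memory-distillation pass; the real three-layer system is far richer.
type MemoryItem = { text: string; importance: number; ageDays: number };

function distill(memory: MemoryItem[], keepThreshold = 0.3): MemoryItem[] {
  return memory
    // Decay importance with age, so stale details fade unless they mattered.
    .map((m) => ({ ...m, importance: m.importance / (1 + m.ageDays / 30) }))
    // Discard anything that no longer clears the bar.
    .filter((m) => m.importance >= keepThreshold)
    // Reorganize: most important memories first.
    .sort((a, b) => b.importance - a.importance);
}

const before: MemoryItem[] = [
  { text: "repo uses pnpm, not npm", importance: 0.9, ageDays: 2 },
  { text: "user said 'thanks'", importance: 0.1, ageDays: 1 },
  { text: "API keys live in .env.local", importance: 0.8, ageDays: 40 },
];

const after = distill(before);
console.log(after.length); // 2: the low-value "thanks" note is discarded
```

Instead of a context window silently overflowing, the memory gets deliberately pruned and re-ranked, which is the core of the idea.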
Coordinator Mode

The leaked code reveals a feature that turns Claude into an orchestrator — a manager that coordinates multiple AI "workers" at the same time. Instead of one AI doing one task, Coordinator Mode spawns parallel agents to tackle complex problems simultaneously.
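The orchestrator pattern itself is simple to sketch in TypeScript (the worker below is a stand-in; in the real tool each worker would be a full AI session):

```typescript
// Hypothetical coordinator: split a job into subtasks, run workers in parallel.
type Worker = (subtask: string) => Promise<string>;

async function coordinate(subtasks: string[], worker: Worker): Promise<string[]> {
  // Spawn all workers at once and wait for every result.
  return Promise.all(subtasks.map((t) => worker(t)));
}

// A stand-in worker for illustration.
const echoWorker: Worker = async (t) => `done: ${t}`;

coordinate(["fix tests", "update docs", "refactor auth"], echoWorker).then(
  (results) => console.log(results.length) // 3 results, produced concurrently
);
```

The hard engineering is in merging the workers' output and resolving conflicts, but the fan-out itself is this small.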
Buddy

This one is delightful. Buried in the code is a complete virtual pet companion system called Buddy. It features 18 different species, rarity tiers, and stats including something called CHAOS and SNARK. It was designed to live in your terminal — the command-line window developers use all day.
Undercover Mode

This is the most controversial thing found in the leak. A file called undercover.ts — about 90 lines of code — implements a mode that strips all traces of AI involvement when Claude Code contributes to public code repositories.
The instructions in the code are blunt: "You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Your commit messages, PR titles, and PR bodies MUST NOT contain ANY Anthropic-internal information. Do not blow your cover."
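A sanitizer in this spirit might look like the following TypeScript sketch (the marker patterns are illustrative and do not reproduce the real undercover.ts): scrub internal markers from commit messages and PR text before they reach a public repository.

```typescript
// Hypothetical commit-message sanitizer; patterns invented for illustration.
const INTERNAL_MARKERS = [
  /Generated with .*$/gim,        // tool attribution lines
  /Co-Authored-By: Claude.*$/gim, // AI co-author trailers
  /\[anthropic-internal[^\]]*\]/gi,
];

function sanitizeCommitMessage(message: string): string {
  let out = message;
  for (const marker of INTERNAL_MARKERS) {
    out = out.replace(marker, "");
  }
  // Collapse the blank lines left behind by removed trailers.
  return out.replace(/\n{3,}/g, "\n\n").trim();
}

const raw =
  "Fix flaky auth test\n\nGenerated with an AI assistant\nCo-Authored-By: Claude <noreply@example.com>";
console.log(sanitizeCommitMessage(raw)); // "Fix flaky auth test"
```

The controversy is not the string manipulation, which is trivial, but the policy: code whose AI origin has been deliberately erased before entering open-source projects.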
Anti-Distillation

Deep in the code is a flag called ANTI_DISTILLATION_CC. When activated, Claude Code injects fake tools into its own responses.
Why? Because competitors sometimes try to record and copy another AI's behavior to train their own models — a process called "distillation." By inserting fictional tool descriptions into its outputs, Claude Code poisons any dataset a competitor might try to extract from it. It's like a restaurant putting a fake recipe on the table in case a rival chef is watching through the window.
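A toy version of that poisoning idea, in TypeScript (the decoy tools and injection format below are entirely invented): when the flag is on, decorate each response with fictional tool descriptions, so any scraped transcript trains a competitor's model on tools that do not exist.

```typescript
// Illustrative sketch of dataset poisoning behind a flag like
// ANTI_DISTILLATION_CC; nothing here reproduces Anthropic's implementation.
type ToolDescription = { name: string; description: string };

// Entirely fictional tools, plausible enough to pollute a scraped dataset.
const DECOY_TOOLS: ToolDescription[] = [
  { name: "quantum_lint", description: "Lints code across parallel timelines." },
  { name: "hyperfetch", description: "Fetches URLs before they are requested." },
];

function decorateResponse(response: string, antiDistillation: boolean): string {
  if (!antiDistillation) return response;
  const decoys = DECOY_TOOLS
    .map((t) => `[tool:${t.name}] ${t.description}`)
    .join("\n");
  return `${decoys}\n${response}`;
}

console.log(decorateResponse("Here is the fix.", false)); // unchanged
console.log(decorateResponse("Here is the fix.", true).includes("quantum_lint")); // true
```

An honest user sees the same answer either way; only someone bulk-recording outputs to train on inherits the fiction.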
Wall Street Panicked
How a draft blog post about an AI model crashed cybersecurity stocks around the world.
On the morning of March 27, 2026 — one day after the Mythos leak — the stock market opened to something that had never happened before: an AI model announcement, from a company that isn't even publicly traded yet, directly caused cybersecurity stocks to collapse.
The reason: investors read the leaked draft's warning that Mythos is "far ahead of any other AI model in cyber capabilities" and drew a terrifying conclusion. If an AI can find and exploit software vulnerabilities faster than human security teams can patch them, then every cybersecurity company's entire business model — built on catching threats that humans create — might become obsolete.
The iShares Cybersecurity ETF — an index fund that tracks the entire cybersecurity sector — hit its lowest level since November 2023, down over 21% for the year. Billions of dollars in market value evaporated in a single morning.
One Wall Street analyst put it bluntly: Mythos has the potential to "elevate any ordinary hacker into a nation-state adversary." That's not hyperbole from a blogger — that was Stifel's Adam Borg, writing to institutional investors.
Raymond James warned that traditional cybersecurity approaches — methods that depend on known threat patterns, human-written rules, and previous attack data — "could be pressured as AI enables continuous discovery of unknown exploits that outpace traditional detection methods."
Not everyone agreed. CrowdStrike's CEO, George Kurtz, defended his company's position in a LinkedIn post, arguing that an AI model that can scan code doesn't replace an entire security platform. Palo Alto's CEO said he was "confused" by the market's reaction. But the numbers didn't lie — the market had made its judgment, and it was brutal.
This was, by most accounts, the first time an AI model announcement directly moved cybersecurity stock prices. A new precedent for the industry.
The IPO That's Watching Everything
A $380 billion company, heading for public markets, just had its worst week.
To understand the full weight of what happened, you need to know where Anthropic stands right now as a business. And the numbers are staggering.
Anthropic is targeting an IPO as early as October 2026. Reports suggest they could raise over $60 billion, which would make it one of the largest technology IPOs in history. Goldman Sachs, JPMorgan, and Morgan Stanley are in early discussions as potential underwriters.
An IPO (Initial Public Offering) is when a private company starts selling its shares to the general public on a stock exchange. Before an IPO, only private investors and employees own shares. After an IPO, anyone can buy a piece of the company. It's how tech companies "go public" — and it's usually the single biggest financial event in a company's life.
The timing of these leaks could not have been worse. When a company is preparing for an IPO, it's supposed to be polishing its story — demonstrating control, maturity, operational discipline. Instead, Anthropic just demonstrated that it couldn't keep 3,000 files private and couldn't prevent half a million lines of code from shipping to a public website.
The revenue trajectory, though, tells a different story. Anthropic's revenue has been growing at roughly 10x per year, from about $1 billion at the start of 2025 to a reported $19 billion annualized run rate by March 2026. Of the enterprise AI spending that U.S. businesses split between Anthropic and OpenAI, Anthropic's share has jumped from roughly 10% at the start of 2025 to over 65% by February 2026. Eight Fortune 10 companies are customers.
Some observers have pointed out an uncomfortable possibility: whether intentional or not, the leak functioned like a free advertising campaign. The draft's framing of Mythos as a generational leap in AI capability read, to some, more like investor communications than engineering notes. Every potential IPO buyer now knows that Anthropic has a model more powerful than anything on the market, and that they're being careful about releasing it. That's a compelling story for Wall Street.
Was it deliberate? The most likely answer is no — the code leak included embarrassing details like 250,000 wasted API calls per day from a known bug, and a virtual pet Easter egg that clearly wasn't ready for public viewing. Leaking your own inefficiencies is not the mark of a deliberate marketing strategy. More likely: this was compounding human error across multiple systems, hitting at the worst possible time.
The Cyber Arms Race Just Got Real
What this means for the world — not just investors and developers.
Step back from the stock prices and the IPO drama, and the bigger picture is sobering.
Anthropic isn't some startup exaggerating its product to raise money. Their internal documents — written for themselves, not for investors — said their own model is so good at finding and exploiting security holes in software that it could reshape the entire cybersecurity landscape. And then they couldn't keep their own documents secure.
This isn't the first time powerful AI and real-world harm have collided. Anthropic previously documented a case where a Chinese state-sponsored group used Claude Code to conduct an espionage campaign targeting roughly 30 organizations, with AI handling 80–90% of the operation. In an earlier security test, Claude was turned into what researchers called "a malware factory" within eight hours.
Mythos, by all internal accounts, makes everything that came before look slow. And Anthropic's own assessment is that it represents the beginning, not the end, of a wave. Other companies will build similar models. The tools will get more powerful, faster.
Anthropic's response — releasing Mythos first to cybersecurity defense companies to give them a head start — is probably the most responsible thing they could do. But it raises a question that nobody in the industry has answered convincingly: what happens when the offense permanently outpaces the defense?
Think about it like this. Imagine someone invented a tool that could find every unlocked door and window in every building in a city, in seconds. The responsible thing to do is give that tool to the police first, so they can warn people to lock up. But you also know that within a year, ten other people will build the same tool. And some of them won't give it to the police first.
That's roughly where AI cybersecurity stands in April 2026.
Meanwhile, the leaked code also reveals that even Anthropic's own products have weaknesses. Security researchers have already noted that with the full source code now public, attackers can study exactly how Claude Code processes data, handles commands, and applies safety filters — then try to bypass them. One analysis warned that "instead of brute-forcing attacks, adversaries can now study and target exactly how data flows through Claude Code's internal systems."
The competitive intelligence cost is equally significant. Every hidden feature — KAIROS, Coordinator Mode, the anti-distillation mechanisms, the three-layer memory architecture — is now visible to every competitor. Cursor, Copilot, and Windsurf can study Anthropic's strategy and react. You can rewrite code. You can't un-leak a roadmap.
The Verdict
What does this actually mean for you?
1. "Safety" has more than one meaning — and the gap is showing. Anthropic does genuinely important research on making sure powerful AI systems behave responsibly. That work is real and respected. But "AI safety" (making sure a model doesn't do harmful things) and "operational security" (making sure you don't accidentally publish your own files) are two very different disciplines. Being excellent at one doesn't guarantee competence at the other. Every company considering a contract with an AI provider will now be asking harder questions about basic operational hygiene — not just about how smart the model is.
2. The products inside the leak are genuinely impressive — and genuinely coming. KAIROS, Dream Mode, Coordinator Mode — these aren't concepts or ideas on a whiteboard. They're engineered features, gated behind feature flags, waiting to be turned on. The future of AI coding tools isn't "a smarter chatbot." It's a persistent, background-running agent that manages its own memory, coordinates teams of sub-agents, and works while you sleep. That future is closer than most people realize — the scaffolding is already built.
3. The cyber arms race is real, and it affects everyone. You don't need to be a developer or a cybersecurity professional to be affected by this. Every app you use, every bank you trust, every hospital that stores your records — they all run on software. When an AI model can find holes in that software faster than humans can patch them, the risk isn't theoretical. Anthropic knows this. Their own internal documents say as much. The question is whether the defenses can keep pace — and right now, the honest answer is: maybe not.
The company that built its entire reputation on being more careful, more responsible, more safety-conscious than everyone else in AI — left the door open. Twice. In five days.
The products inside are extraordinary. The model they're building may be the most powerful AI the world has seen. The revenue is growing faster than almost any company in tech history, and an IPO that could reshape public markets is months away.
But this week will be remembered for a simpler lesson: the hardest part of safety isn't building brilliant AI systems. It's remembering to check the lock on the front door.
📬 Never Miss a Decode
90 days of clear, honest writing. One decode a day. No jargon. No paywall.
Subscribe Free →