Nate B Jones explains Claude Code-work and argues task queues are replacing chat interfaces
A solo presentation by Nate B Jones of AI News & Strategy Daily on the launch of Claude Code-work, Anthropic's new file-system-based general-purpose agent tool.
Summary
Nate B Jones opens by recounting how Anthropic built and shipped Claude Code-work in just ten days after observing developers using Claude Code — a terminal-based coding agent — to organize expense receipts, sort photos, and manage downloads folders rather than write code. He argues this speed of observation-to-launch is itself a competitive advantage, and that the underlying architecture of Claude Code turned out to be the first truly general-purpose agent. The bulk of the episode is devoted to explaining why Co-work represents a fundamental shift from chat interfaces to task queues, why its file-system-first design gives it structural advantages over browser-based agents, and why its architecture is specifically designed to counter AI-generated work slop. Jones closes with a live screen-share demo showing Co-work building a PowerPoint presentation, analyzing a calendar, and scanning for duplicate files — all running in parallel.
FULL TRANSCRIPT
The Ten-Day Launch: What the Timeline Reveals
Nate B Jones: Ten days. That's how long it took Anthropic to build and ship Claude Code-work after they noticed something their product team was not expecting. Developers were using their own coding tool to organize expense receipts. And really, that story of the timeline matters more than anything else about the launch of Claude Code-work this week. It's not the expense receipts that are interesting. It's that the timeline reveals how Anthropic and AI-native organizations operate, and how that operational velocity is becoming as much a competitive advantage as the models themselves.
Here's what happened. Claude Code launched as a terminal-based agentic coding tool. Engineers used it to write software, debug production issues, refactor legacy code bases. The tool sat in the terminal because that's where developers live. And it worked because the underlying architecture (a sandboxed agent that could read files, write files, execute plans, and loop humans in on progress) had turned out to be a genuinely reliable model for production work. Anthropic's internal data shows a 67% increase in merged pull requests per engineer per day. Engineers don't inflate those numbers for fun. If engineers were using it, it was because it was useful.
But then the Claude Code product team noticed something in the usage patterns. People were not just writing code. They were pointing Claude Code at folders full of receipts, full of photos, and asking it to produce expense spreadsheets, to categorize photos from the family vacation. They were asking it to organize messy downloads directories. They were using a coding tool for research synthesis, for transcript analysis, for file management — anything that could be expressed as "here are some files, here's what I want, make it happen."
Now, it's easy to think that a product manager would treat this as scope creep. Instead, Anthropic shipped the same underlying agent architecture you get with Claude Code, now wrapped in a UI that doesn't require anyone to be technical at all. Ten days from observation to launch.
Why the Speed Matters More Than the Feature
Nate B Jones: But here's what makes this more interesting than pure speed. People have been asking for exactly this capability for a while. The moment Claude Code demonstrated what agentic AI could do in a terminal, non-technical users started saying, "I'd love to get access to something similar, but I'm not a coder." But demand alone doesn't tell you whether the capability is actually going to work. What Anthropic was looking for was validation, and they got it, both from their own product data showing developers already using Claude Code for those tasks, and from what they saw over the holidays: people using general-purpose Claude Code agents to do everything from growing their tomato plants to building sensors for their homes, from writing and shipping production software to building their own to-do lists and other things that help you brief and get ready for your day.
When they saw all of those different use cases emerging, it became undeniable that what they were sitting on was perhaps the first truly general-purpose agent.
Compare their speed of response to classic enterprise software timelines. Claude Code is running at billions of dollars in run rate. In a traditional large company, a feature request would typically go through months of reviews before anyone writes a line of code. Obvious market demand would have to be approved, docs would have to be written. It's just not like that at Anthropic. They turned around, said "we're going to build it," used Claude Code to build it, and then built Co-work in about a week and a half.
This matters because the AI race is no longer just about models. It's about who can observe user behavior, recognize what's actually working, and rapidly ship responses before competitors jump in and grab the market.
The Visibility Gap: Non-Technical Users Could See the Capability But Not Access It
Nate B Jones: If you were anywhere near tech Twitter over the 2025 holidays, Claude Code was all over your timeline. Engineers were posting about their productivity gains. Founders were building entire products in a weekend. A thread from a Google principal engineer hit five and a half million views because Jaana Dogan said that, in one coding session with Claude Code, she had prototyped the product she had spent an entire year on with her team at Google. Helen Lee Kupp, a mom who voice-records ideas during morning walks and is not a developer, was writing about how she figured out how to use Claude Code anyway to build what she wanted.
So it's not that Claude Code was a secret. The story was getting out and people were figuring out how to use the terminal despite themselves. And that's exactly the problem. Non-technical users could see the capability. They could watch engineers accomplish in hours what used to take days. They could read the threads. But it takes a special kind of non-technical user to jump into the terminal, look at the blinking cursor, not get intimidated, and just go with the text. The capability was really visible in testimonials from all kinds of people, but the access was not.
What gradually emerged over the last month or two is a conviction that what was special about Claude Code wasn't the code part at all. The underlying capability — an AI that can read your files, understand your instructions, make a plan, and execute a multi-step workflow — works for almost anything expressible as a task with inputs and outputs. The "code" ended up being a constraint for branding, an insistence on something that isn't true for general-purpose work.
What Claude Code-work Actually Is
Nate B Jones: Co-work keeps all the best of Claude Code — same architecture — and puts it in a friendlier package. You can point it at a folder using an interface. You just click and point. You can describe what you want in a chat and walk away. It makes a plan, shows you the plan, executes the plan autonomously, and loops you in on the progress, just like Claude Code does, but you're not in the terminal. You can queue up multiple tasks and let Claude work through them in parallel, which feels less like a conversation and more like leaving multiple messages for a co-worker.
I think this is a very 2026 experience. Instead of saying "I'm going to have a long-running iterative chat and try to prompt everything exactly right," it's going to look more like: "I have six different things I want to do. I'm going to type in six different messages and get six different threads going, and the agent is going to work on all of them at once."
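The "six messages, six threads" pattern described here can be sketched in a few lines of Python. This is only an illustration of a task queue running in parallel, not how Co-work is built; `run_task` and `run_queue` are hypothetical names, and `run_task` is a stand-in for whatever an agent would actually do with one task.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(description: str) -> str:
    """Hypothetical stand-in for an agent working one task to completion."""
    return f"done: {description}"


def run_queue(tasks: list[str], max_workers: int = 6) -> dict[str, str]:
    """Submit every task up front, then collect results as each one finishes."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the task description that produced it.
        futures = {pool.submit(run_task, task): task for task in tasks}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    queued = [
        "build launch presentation",
        "assess calendar load",
        "find duplicate downloads",
    ]
    for task, outcome in run_queue(queued).items():
        print(task, "->", outcome)
```

The contrast with chat is that all tasks are handed over before any result comes back, and results arrive in completion order rather than in the order they were typed.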
File System vs. Browser: The Strategic Difference
Nate B Jones: Here's where the strategic picture gets interesting. Microsoft Copilot is a coding agent — it lives in the browser, in the cloud. Google Workspace AI lives in the browser and the cloud. There are other tools. Do Anything is a great example of a new tool that came out in 2026 — it lives in the browser. The interaction surface is web applications. The value proposition is "we navigate websites on your behalf."
Co-work is different because it operates at the file system level and can also use the browser. The interaction surface is the folders on your local machine plus anything it can touch on the web. The value proposition is that it processes the work artifacts that are already in your world, and anything you can touch on the web. That's pretty powerful.
In a sense, these aren't directly competing paradigms; they're complementary. Anthropic knows that: Co-work integrates with Claude in Chrome precisely in order to bridge those modes. And the file-system-first design reflects a specific thesis about where your leverage as a worker actually lives.
Browser agents are really constrained by the adversarial nature of the web. The web is designed for humans. Sites can block them. CAPTCHAs can stop them. Login flows break them all the time. Every interaction ends up being mediated by interfaces that are designed for people, maintained by companies that are interested in selling to people, and that have no particular interest at this time in making life easier for AI agents — although that may soon change. The error surface is enormous because you're navigating systems that you can't control.
I will say these web agents have made enormous progress in getting more accurate at navigating the web and in reliably asking you to intervene. I see that across not just Claude in Chrome but across the Atlas browser system, across Comet, across others as well.
On the other hand, file-system agents operate in territory that is entirely yours. Your files don't have bot detection. Your folders don't require authentication — most of them. The agent can read, write, and execute with permissions that you explicitly grant. The environment is cooperative rather than adversarial. And that's a huge difference.
The strategic implication is simple, but it kind of pops out once you look at it. Browser agents will always be a little bit brittle for high-stakes tasks because the web fights back. The web is adversarial because it needs to be, from a security perspective. File-system agents can be robust because your local machine is not adversarial. Your local machine is friendly.
Anthropic's bet is that long-term, most valuable knowledge work ends up living in your files — your docs, your spreadsheets, your notes, your receipts, your recordings, stuff that gets on your hard drive or in your Google Docs. And that processing these artifacts is where the real productivity leverage sits long term.
Of course, they added web browsing and you can use it in Co-work. I tried it — it works really well. All you have to do is ask Co-work to do a task, make sure you provide it the appropriate login directly in Chrome, and you'll see a handy little yellow tab group that belongs to Claude and you're off to the races. It's not that Claude is limiting web access. It's more that Claude is recognizing that the leverage comes from owning a friendly place where work happens, which is your file system — a non-adversarial space that Claude can touch easily.
The Desktop-Native Agent Wars of 2026
Nate B Jones: This may force Microsoft's hand. Neuron Daily came out with a prediction that Microsoft will have to launch a desktop-native general agent to compete. I actually think they're underselling it. I think everybody is going to launch a desktop-native general agent in 2026. This is the year of the desktop-native general agent wars, because everybody is going to get disintermediated by this effectively handy little inbox where you can do work.
Wouldn't you rather be in one place and say, "Get me my briefing for the day. Get me these three metrics I care about from my dashboards. Make sure my presentation is ready and give it a final polish"? And it's all done in one place. You don't have to switch between PowerPoint and Tableau and whatever else you're doing. Claude for the first time offers that kind of promise with Co-work. That's why this is such a huge deal. This is a cruise missile aimed at the heart of knowledge work. Everything you do as a knowledge worker is about file ins and file outs. It's about modifying information. And for a long time in 2024 and 2025, you chatted with something and then you had to take those inputs and outputs and put them somewhere else. Not anymore. You can actually directly interact with them.
The Anti-Slop Architecture
Nate B Jones: Now, the immediate question I have — and I bet you have — is how does that relate to concerns about sloppy work? We've had a lot of concerns, especially in late 2025, about people just throwing AI work that they didn't check and didn't pay attention to kind of over the wall and saying "Good luck." That's not good citizenship. It doesn't help you build a community. It doesn't help you in your career. It's slop and it's bad.
The interesting thing about Co-work is that it's designed to be anti-slop. It doesn't mean you can't misuse it — you can — but it's designed to be more thoughtful. And this deserves some unpacking because the anti-slop thesis is much more interesting than I first thought. The more I dug into Co-work, the more I saw that thoughtfulness underneath.
Ultimately, the work slop crisis isn't about AI being bad at writing. It's about AI making it frictionless to produce very passable-looking output that shifts the cognitive burden — the real thinking you need to do — just down the street. The person receiving the AI-generated memo now has to do the thinking the sender skipped. If you generate your PRD and don't look at it, the engineer has to think about it instead of the PM. The result is communication that looks like work but functions as a tax on attention. A study by BetterUp quantified this at nearly two hours spent per piece of work slop received, which adds up to a lot of lost productivity.
Co-work's design makes several specific bets against this pattern.
First, unlike a chat, the core output of this tool is an artifact, not a text blob. When you ask Co-work to process your expense receipts into a spreadsheet, it produces an Excel file with working VLOOKUP formulas and conditional formatting — not a CSV that you then clean up, not markdown you have to copy-paste. The output is the deliverable. This matters because work slop typically lives in the gap between the AI-generated draft and the usable work product. Co-work tries to close that gap by producing files that don't require a human cleanup pass. If you can define your intent well enough, Claude Code — now dressed up as Claude Code-work — is able to do a good enough job to get it all the way done. And of course, that depends on your ability to define intent well, which is one of the key skills of 2026.
The second thing to call out is that the architecture is borrowed from a context where slop is immediately fatal. Claude Code users are typically writing software, often production software. If the output required constant cleanup, engineers would just drop it. Anthropic's thesis is that the same architecture that produces trustworthy code can produce trustworthy, anti-slop knowledge work. Software engineers who already trust Claude Code enough to ship what it produces are going to be okay using Claude Code-work for knowledge work — and more importantly, the rest of us will too. Even if we haven't had the experience of shipping code with Claude Code, we can understand that the difference between slop and not slop is about work quality, and we can appreciate the finished, polished quality of the artifacts you tend to get out of Co-work.
The third anti-slop element is subtle but important. Co-work keeps you in the steering loop rather than the editing loop. The interface is designed around task delegation with very visible progress: you literally see checkmarks down the side. It's not about prompt-response cycles. You don't just prompt it and see more text appear. You describe an outcome, Claude makes a plan, you see the plan, and you can redirect mid-execution.
One of the nice things Anthropic added here is that you can send a message to the agent in the middle of executing and just hit a button marked Queue, and the agent will pick up your piece of context and add it into the long-running work without interrupting itself. This fixes a major blind spot I've seen in a lot of AI tooling, where you have to either interrupt a valuable piece of work or wait for it to finish to add an important piece of context. Not with Claude Code-work. As long as you can describe an outcome, Claude can write the plan, you can see the plan, you can redirect it. The cognitive work is on you, but it happens at the top. It's the steering work, it's articulating what you want, not downstream cleaning up what you got.
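One way to picture adding context without interrupting the work is a plan loop that checks an inbox between steps. The sketch below assumes that design purely for illustration; nothing in the talk says how Co-work implements it, and `execute_plan`, `inbox`, and the step names are all hypothetical.

```python
import queue


def execute_plan(steps, inbox):
    """Run steps in order; before each step, fold in any queued user notes.

    `steps` is a list of (name, action) pairs. A running step is never
    interrupted: notes sent during a step are picked up before the next one.
    """
    context = []
    log = []
    for name, action in steps:
        # Drain everything that arrived while the previous step was running.
        while True:
            try:
                context.append(inbox.get_nowait())
            except queue.Empty:
                break
        action()
        log.append((name, tuple(context)))
    return log


if __name__ == "__main__":
    inbox = queue.Queue()
    steps = [
        # The first action simulates the user sending a note mid-step.
        ("research", lambda: inbox.put("add a slide on non-obvious insights")),
        ("outline", lambda: None),
        ("build slides", lambda: None),
    ]
    for name, notes in execute_plan(steps, inbox):
        print(name, notes)
```

In this toy version the note sent during "research" is visible to "outline" and every later step, while "research" itself runs to completion untouched.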
A fourth element: the file-system sandbox forces specificity, and this is a safety feature I really like. You cannot vaguely ask Co-work to help with your expenses. You must point it at real folders that contain real files. You manually touch the mouse and say "Please add expenses folder." This constraint means the AI must operate on real work artifacts rather than just generating content randomly in a vacuum. The input is concrete, and the output has something it can attach to and be faithful to. This is going to reduce hallucination.
There's a fifth element that's easy to miss. The task queue model changes the social dynamics of AI-assisted work. In chat-based AI, you're constantly prompting, evaluating, prompting, evaluating — back and forth. The rhythm encourages fast and shallow interactions. It's like batting a tennis ball back and forth. Co-work's design encourages deeper thought — deeper thought about what you want, deeper thought about what you're willing to step away from and let Claude work on for a while. The AI is not waiting for your next message anymore. It's executing a plan. This shifts the cognitive load from "what do I prompt next?" to "what do I actually need done?" — which is by far the more interesting question. And that requires thoughtfulness. And thoughtfulness is anti-slop.
Will all of this actually solve work slop? It's too early to tell — it just came out this week. But this is the kind of anti-slop architecture we need to see more of.
Jaana Dogan's Thread and What It Means
Nate B Jones: I want to get a little more into the story I mentioned briefly earlier about Jaana Dogan, who is a Google principal engineer and who posted the thread that got five and a half million views. What she said was: "I'm not joking and this isn't funny. We've been trying to build distributed agent orchestrators at Google since last year. There are various options. Not everyone's aligned. I gave Claude Code a description of the problem. It generated what we actually built last year in an hour."
Now, it turned out that what Claude Code built was a prototype, not the full production code. I don't want to overstate the promise. But the idea that Claude Code could look at the problem set, independently derive the correct solution, and begin to prototype that quickly should not be underestimated. That is still a very meaningful step toward what we would typically describe as artificial general intelligence.
This same power is now available in Co-work. Co-work is just a nice user interface dressed up over Claude Code. So if you've had friends telling you that you ought to use Claude Code and you've been resisting — "I'm not a terminal person" — use Claude Code-work now. It's in the Max plan for now, and that's only available for individuals. It's an alpha. It's in the expensive plan. But Anthropic historically brings things down-market quickly — into enterprise, into teams. I am trying to give you a sense of what you can actually do with it so that you can understand it.
Where This Is All Going in 2026
Nate B Jones: I want to get at where this tells us we're going in 2026.
First, I think this is showing us that the chatbot was a transitional form. It existed because LLMs could generate text before they could reliably execute plans. I don't think that's true anymore. Claude Code has proved that agentic execution works not just for software engineering but for everything else. And if that hypothesis holds, several things follow.
One: task queues are going to start to replace chat interfaces in 2026. And that's much more than a UX change. The Co-work model — where you queue up tasks, let Claude work through them in parallel, and get notified on completion — is closer to an email or a ticketing system than a conversation. But the deeper shift is in the relationship between the human and the AI. Chat interfaces position the AI as a respondent: you ask, it answers, you ask again. Task queues position the AI as your worker: you delegate, it executes, you review. This is not about asynchronous versus synchronous interaction. It's about whether you're having a conversation with the AI or managing it like an employee. The management framing changes what kinds of tasks feel appropriate to delegate, how much context you provide upfront, and how you evaluate the output. People manage workers differently than they converse with their advisers. As AI interfaces shift toward the management model, I would expect the way we use AI to shift accordingly.
Two: verification is going to continue to be a scarce skill. The second-order effects on organizational structure of everybody having Claude Code-work have not been thought through at all. When AI can execute multi-step workflows in parallel across multiple threads across the whole organization, the bottleneck shifts to knowing whether the output is correct and whether you formed the task correctly. What Jaana Dogan was talking about applies more broadly. The tool amplifies people who already know what they're doing while potentially misleading people who don't. This is why AI fluency is such a critical piece in 2026.
Consider what this means for how teams are structured. Junior roles have traditionally served as execution layers — you give them well-defined tasks, they complete them, and senior people review them. If AI handles execution, we're going to continue to see pressure on junior roles. Firms that are not creative are going to say they don't need juniors. Firms that are more creative are going to say they need AI-native juniors who can teach them new patterns of work. Organizations that figure out how to develop domain expertise and anti-slop mechanisms in an AI-augmented environment are going to have a very significant competitive advantage over those that accidentally eliminate their career development pipeline by overindexing toward killing their junior roles. And that's going to be a temptation, because the power of this system is addictive. It's hard to step away from. You can do so much with the Co-work interface.
Three: I think the file-system and browser convergence is inevitable, but the way we get there matters. Co-work plus browser automation covers most knowledge work in principle. The next step is going to be seamless handoffs — how do you start with files, push to web services, pull results back to files, share with a colleague? The integration points between file-system agents and browser agents are going to break. I know my Google Calendar has trouble recognizing Claude even when I give it a login — it works sometimes, it doesn't work other times. I think that might be intentional on Google's part. Whoever is able to solve these integration problems is going to be able to get a unified execution layer in place that will unlock a ton of productivity. My guess is that this will take a little bit longer than people expect, because the hard part isn't making any type of agent work in isolation — it's making them work together reliably enough that users don't have to think about what mode they're in.
Safety: Prompt Injection and the Sandbox
Nate B Jones: That brings us to a safety piece. How safe are these? I get asked this a lot. Anthropic's safety disclosure is worth looking at more closely because it's unusually direct, and the implications cut in multiple directions.
Anthropic warns about prompt injections right up front. Prompt injections are attempts by attackers to alter Claude Code-work's plans through content it might encounter on the internet. What they share is that they've built defenses against prompt injections, but that they cannot promise it will always be safe. It looks like they've built an intermediate summarization stage between the raw internet input and what the agent actually receives to complete the task. If that's the case, it gives us a sense of how the Anthropic team is thinking about multi-layered defenses against prompt injection. You can imagine it as a series of walls, trying to keep hostile bots and hostile actors out.
In the short term, cautious enterprises may decide that anything with a prompt injection warning is too risky. But honestly, I kind of doubt it, because the promise of accelerating tasks that used to take days into hours or less is so great that people are willing to trade that. And in practice, as someone who has used Claude Code a fair bit and now Claude Code-work, the instincts that AI has are pretty solid. It asks you for permission when it wants to touch website pages and interact with them. It does not tend to take actions like login or payment unless you specifically authorize it. And even then, on high-consequence actions like payments, it usually says "you've got to do this — I can't do this." The constitutional AI principles that the Anthropic team built into Claude help Claude make good common-sense choices in the wild and woolly world of the internet.
The file-system sandbox also helps. If you are mounting files locally, you are putting them into a safe and secure container. Let's say I have my receipts — the actual receipt file can be in my receipts folder on my hard drive. If I copy that folder into my sandbox, I can manipulate it, do things on it, and it's very low consequence because it's a copy in a secure container and I'm not touching the core folder. Now, this doesn't mean that Claude can't touch your folder. Just because it mounts it in a sandbox and containerizes the folder doesn't mean it doesn't touch your hard drive — it does. It can make changes to your files. That's part of the value. But the idea that you are securely containerizing the area of operation matters a lot when you are building with a tool that is even potentially vulnerable.
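The "work on a copy" idea in the receipts example can be sketched with the Python standard library. This is an assumption-laden illustration, not a description of Co-work's sandbox: `work_on_copy` and `rename_receipts` are hypothetical names, and a temporary directory stands in for the secure container.

```python
import shutil
import tempfile
from pathlib import Path


def work_on_copy(source: Path, transform) -> Path:
    """Copy `source` into a fresh temp directory and apply `transform` there.

    The original folder is only read, never modified.
    """
    sandbox_root = Path(tempfile.mkdtemp(prefix="sandbox_"))
    sandbox = sandbox_root / source.name
    shutil.copytree(source, sandbox)
    transform(sandbox)
    return sandbox


def rename_receipts(folder: Path) -> None:
    """Example transform: prefix every .txt file with 'receipt_'."""
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(folder.glob("*.txt")):
        path.rename(path.with_name(f"receipt_{path.name}"))


if __name__ == "__main__":
    original = Path(tempfile.mkdtemp()) / "receipts"
    original.mkdir()
    (original / "jan.txt").write_text("coffee 4.50")

    sandbox = work_on_copy(original, rename_receipts)
    print("original:", sorted(p.name for p in original.iterdir()))
    print("sandbox: ", sorted(p.name for p in sandbox.iterdir()))
```

Copying back is left out on purpose: as the passage notes, a tool that is granted write access to the real folder can still change it, so the copy only protects you for as long as the work stays inside it.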
Two Signals to Watch
Nate B Jones: If I were looking to the future, I'd watch for two big signals. The first is how quickly Microsoft, OpenAI, or Google respond. If any of them ships something in the next two to three weeks or the next month, my sense is not only does the competitive picture remain open, but everyone is seeing signals on the ground that this is enough of the future of work that they have to pay attention.
The other thing I would look at is unit economics and pricing. We are in a world blessed with so many models. Do we start to see Claude Code-work come down into more economical price tiers — perhaps with a less capable model, perhaps with a limited number of max queries, whatever it takes? Ultimately, I think the incentive to give everyone these kinds of tools is very high, as long as users can show that they use those tools to produce useful products, and as long as companies can be confident that the touches on the web and the integrations with the rest of corporate systems are secure enough that the work can be usefully done, saved, and secured. I fully expect those kinks to be worked out as Anthropic inevitably pulls this over into their teams and enterprise products.
I'll close with a deeper question. What happens when a product team can observe a user behavior on Monday and ship a fully-fledged product on Thursday? That's the thing that keeps sticking with me. I started with that and I keep thinking about it. This took ten days to build. They built it with Claude Code.
Live Demo: Claude Code-work in Action
Nate B Jones: Now I'm going to show you what it looks like. This is Claude Code-work.
You see that they're giving you affordances right away — suggestions. You can create a file, you can crunch data, you can make a prototype, you can send a message. Yes, it will really send a message. You can prep for the day or organize files. That's just a preview. This progress bar is where you'll see actual plans getting made. The artifacts section is where you'll see artifacts getting made.
Let me give you an example of what we could do here. I'm going to type: "Please produce a full PowerPoint describing the launch of Claude Code-work. Conduct all the necessary research you need to do so. And when it's complete, please place it in my downloads folder as a PPTX file." Then I go to files, choose my downloads folder, allow Claude to change it, and that's it. I just tell it to go.
And you see how it's starting to get into this. You're going to start to see a progress bar being made here. Notice that it's using those Claude skills we've talked about before. Now we have a plan. It's already researched Claude Code-work details — check. I can ask a question or recommend a change right here. I can read the PPTX skill documentation and change the way it makes a PowerPoint. And it's now designing a presentation structure and aesthetic. I can give it feedback on the aesthetic right here. You see how different this is from chat. Before, in chat, I would have had to say "wait, stop — I want it to be like a modern presentation." Not anymore. I can just adjust it.
It's giving me a suggested slide structure. I'm going to say: "Please add a note on non-obvious insights and implications to the presentation." And it's right in the middle of the work — I'm just going to throw it in. You can see where it's working. It's got a shared CSS file it's working on. You can see the context it's got. It's now starting to create the slides. I love the transparency here.
And if you want to do something else, you can immediately slip over, open up a new task, and say: "Can you please look at my Google Calendar and give me an assessment of how busy I am and what would be the most useful shift to my daily ritual to prepare more effectively for work." And this is all happening in the background — Claude is still working on the other presentation. I can just start this one off. I have my Google Calendar open in my browser. It's going to continue doing its analysis.
We can go back and check in on all of the work Claude is doing. I have multiple agents running — Claude is doing research on Claude Code-work to build me a slide presentation, and the same Claude Code-work is also working to analyze my schedule. You can do five, six, seven of these.
I asked it to be a little impersonal so I don't reveal people's private information, but it talks about how I'm busy, how I need to defend my breakfast block, how I need to defend my wake window, and how having time to work out every day is a good thing. I will be honest — these are not absolutely groundbreaking assessments. The thing that's significant is I can do this in parallel, looking at the calendar, and it'll give me assessments at the same time it's working on my PowerPoint deck. That's the thing I want you to grab hold of.
And yes, it's still working on the PowerPoint deck. You can actually see all of the different artifacts it's created along the way.
Let's start a new task. Now I'm looking for duplicate files in downloads — where have I got extra files? It has access to the downloads folder because I gave it access at the beginning of the task, and it's just running. Still working on creating slides. I'll go back to the downloads.
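For readers curious what a duplicate scan involves, one common approach is to group files by a hash of their contents. The sketch below shows that technique under the assumption that "duplicate" means byte-for-byte identical; the demo does not show which method Co-work used, and `file_digest` and `find_duplicates` are hypothetical names.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder: Path) -> list[list[Path]]:
    """Group files under `folder` by content; keep groups with 2+ members."""
    by_digest = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    downloads = Path(tempfile.mkdtemp())  # stand-in for a real downloads folder
    (downloads / "report.pdf").write_bytes(b"quarterly numbers")
    (downloads / "report (1).pdf").write_bytes(b"quarterly numbers")
    (downloads / "photo.jpg").write_bytes(b"pixels")

    for group in find_duplicates(downloads):
        print([p.name for p in group])
```

Hashing every file is simple but reads all the data; a larger scan would typically compare file sizes first and hash only the candidates that match.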
This is what the future of work looks like. It looks like jumping back and forth between these different tabs. You can see what it's running. Now it's copying the PowerPoint to the downloads folder. Look at that. It gives me my sources — all the things it looked at. And it's going to give me a handy little button to open in PowerPoint.
And yes, it really did make the PowerPoint. It made it from scratch. You can go through and see the key features, how it works, real-world use cases, availability and pricing, non-obvious insights — which it added in the middle of the work — and the bigger picture. This was all done in the middle of doing three or four other things.
This is what I mean by the future is here. There was no code in what I described. It was just asking the AI agent to do stuff for you, and it did it.