Nate B Jones explains how AI "skills" have evolved from personal tools into organizational and agent-callable infrastructure
Solo presentation by Nate B Jones on the AI News & Strategy Daily channel.
Summary
Nate B Jones delivers a comprehensive update on AI "skills": markdown-based instruction files that give LLMs structured, reusable context for completing tasks. He argues that since Anthropic introduced skills in October 2025, the technology has shifted from a personal productivity tool to organizational infrastructure, and that the primary caller of skills has moved from humans to AI agents. He explains that Anthropic, OpenAI, and Microsoft have converged on skills as an open standard, making them a cross-platform substrate that works in Claude, ChatGPT, Copilot, Excel, and PowerPoint. Jones walks through how to build effective skills, the most common failure points, and how agent-first design requires a fundamentally different approach to skill construction, including treating outputs as contracts and designing for composability across multi-agent handoffs. He closes by announcing a community skills repository as part of the OpenBrain GitHub project.
Key Takeaways
FULL TRANSCRIPT
Background: What Has Changed Since Skills Launched in October
Nate B Jones: Anthropic launched skills back in October, and what has changed since then in the rest of the world of LLMs, in agents, in OpenClaw, has shifted how we think about skills. But I don't think we've really caught up on that, because most of the time when we're talking about agents we talk about OpenClaw. What we don't realize is that skills are becoming the substrate for a lot of the correct, persistent, accurate, predictable outcomes that businesses — and frankly we as people — need to get stuff done. But we keep thinking of skills as those individual things that launched back in the fall.
So this is really a "get you up to speed on agent-readable skills" video. If this is a new concept to you, stick around. We're going to talk through the key changes that have unlocked since October, and we're also going to be talking about practical ways you can build skills. And yes, I have a skills repo. It's not just going to be my skills — we're going to have folks from the community throwing skills in there, and it's going to be a place where we can start to learn together how to build skills that help us all get meaningful work done.
The Four Big Changes Since October
What changed first? Number one, the big trend: skills went from personal configuration six months ago to organizational infrastructure today. Back in October, a skill was something you built for yourself. You typed in the prompt, you did the thing. Now, team and enterprise admins are rolling out skills workplace-wide. They're treated as a single upload. They're version-controlled. They're available in the sidebar and callable inside Excel, inside PowerPoint, inside Claude, inside Copilot. Your organization's methodology is no longer "individuals carry skills in their head." It's now "how can we start to think about skills as something that is agent-readable and human-readable across the entire infrastructure layer?"
Second, the caller of skills has changed. I think we slept on this one. Humans were the caller of skills in October, by and large. Now the majority of skills are called by agents. Why? Because agents can make hundreds of skill calls over the course of a single run. We humans were calling maybe a few skills in a particular conversation. The math just doesn't math for humans. We need to start thinking about our skills as agent-first.
Third, skills are not a developer thing. This is something that people kind of had their minds blown by when I talked about skills when they first came out, and I'm just going to underline it. They're not meant to just live in the terminal. They're not meant to just be skills to execute code with. They are meant to be things that you can use for the rest of your business life and frankly the rest of your personal life. And big companies are agreeing. Anthropic and Microsoft have a partnership to bring skills to Copilot. You have skills appearing when OpenAI makes releases because skills is now an open standard. Fundamentally, you need to start thinking of skills as a common infrastructure vehicle that is just going to underlie a lot of the way AI works for the foreseeable future.
Fourth, skills becoming a cross-industry standard means that we need to think about the way alpha works in the age of AI a lot differently. We're used to the concept of alpha being closed source, where open-source stuff just isn't valuable. I talked a few days ago about the idea that one of the places where you see extraordinary value if you're an engineer right now is by open-sourcing a project you're building that is in the agent AI space — because then everybody can see it, see that you're high-grade talent, it functions as a resume, and then they make you a big acqui-hire offer. In the same way, you would think that skills as markdown would be something that a lot of people would want to keep closed source. But what you see is that people are trading skills like they're trading baseball cards at camp. We're all learning together. We're figuring out how to make skills work for our agents as a community — and I don't just mean my community, I mean the internet as a whole.
We are learning how to make a lowly markdown file actually function as an agent-callable context layer for the work that we want to get done. We have to learn it collectively because the best practices are discoverable, not known. When I got a CD-ROM from Microsoft in the '90s and it had the entire program printed on it, the program was known — you got the instruction book. With LLMs, we all discover the instruction book together. And that goes for how we use them with tools and skills as well. We all discover it together, and so it works faster if we discover it in a community.
What a Skill Actually Is
So that's what I want to talk about. I want to talk about specific examples of people who are using skills today, how they're using them in ways that make sense in this agentic future, how you start to construct agentic skills, some of the things that I am seeing that no one else is talking about and kind of why that is, and then I want to get into some concrete actionable steps — things you can do to level up your skills practice.
First, just the ten-second version. A skill is a folder with a text file in it. That is it. It has one required file: just `SKILL.md`. And it just has two parts. It needs to have a little bit of metadata at the top, and it needs to have your methodology and instructions below.
Now, in this video there are some gotchas — I'm going to get into some of the gotchas that you run into when you start to build these files, and things that we know break based on the community of learning we've had over the last six months. But that's the simple version. All it does is encode a series of plain English instructions that give an LLM context to do something useful for you with a particular set of inputs in a predictable way. This is a very simple primitive. It's simple, and yet it has so much power, because you can make a skill about just about anything.
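To make the "folder with one file, metadata on top, instructions below" shape concrete, here is a minimal sketch in Python. It assumes the metadata is a small front matter block of `key: value` lines between two `---` markers, with `name` and `description` fields, and it writes the file as `SKILL.md`. The skill name, the sample text, and both helper functions are hypothetical and exist only for illustration.

```python
from pathlib import Path
import tempfile

# Illustrative skill text: metadata block on top, methodology below.
SKILL_TEXT = """---
name: competitive-analysis
description: Produces a competitor comparison table in markdown. Use when asked to "analyze our competitors".
---

# Methodology

Explain the reasoning, the output format, and the edge cases here.
"""


def split_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (metadata, body).

    Assumes simple `key: value` metadata lines between two `---` lines
    at the very top of the file (an assumption of this sketch).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' metadata block")
    end = lines.index("---", 1)  # raises ValueError if the block never closes
    metadata = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


def write_skill(root: Path, folder_name: str, text: str) -> Path:
    """Create <root>/<folder_name>/SKILL.md and return its path."""
    skill_dir = root / folder_name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(Path(tmp), "competitive-analysis", SKILL_TEXT)
        metadata, body = split_skill(path.read_text(encoding="utf-8"))
        print(metadata["name"], "|", body.splitlines()[0])
```

The parser is deliberately tiny. A real project would more likely use a YAML library for the metadata block, but the two-part structure is the point here.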
Real-World Production Patterns
So you might wonder: what are people making with these skills? The most common production pattern in Claude right now is what I would call the specialist stack. A developer can drop a folder full of skills into a project. One skill might turn vague instructions into a PRD. Another one decomposes the PRD into GitHub issues. Another one helps you write the tests for the code. You get the idea. The whole concept is that the skill takes a lot of the nuance and the pain out of prompting — which is something I called out at the top when skills came out — because this loosens up a lot of the requirement around strict prompting that we had in 2025. The developer who drops that skill pack in as their specialist substrate basically tells the agent in Cursor, "Hey, build me this feature," and then Cursor can invoke the skills with their chosen LLM and just get to work. In other words, the agent doesn't need specialist direction because the specialist direction is in the file.
Now, you can take this right out of the developer context and do a lot of other things with it. I want to give you an example of a real estate GP known as Texas Paintbrush on X, who built the same pattern for operations at his business. He has over 50,000 lines of skills across 50 repositories covering rent roll standardization, comps analysis, cash flow handling, handoff protocols between team members, and the agent running. What's beautiful about it is yes, the agent can run and call those skills and predictably do operations in his business. But it turns out that writing all that stuff down also helps the humans. When you onboard someone new, there's a fantastic context layer of skills for them to dig into to understand what's going on. The methodology doesn't live in someone's mind anymore — it lives in a repository.
And it gets more sophisticated than that. You can have orchestrator skills, and more sophisticated teams are building them now. This one's documented all over Reddit. You can have a skill called something like "orchestrator" that analyzes any incoming request and then spawns different sub-agents to take care of that request based on skills that it learns to call from that master orchestrator skill. It might tackle research. It might tackle coding. It might tackle UI or docs. A single high-level request for an agent can get reliably farmed out to a bunch of sub-agents to get work done, because the orchestrator skill makes that predictable.
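As a rough illustration of the orchestrator pattern, the sketch below maps an incoming request to the kinds of sub-agents it would spawn and the skills each would load. In practice the routing lives in the plain-English orchestrator skill and the model does the matching. The keyword table, the skill names, and the `plan` function are all hypothetical stand-ins so the example can run on its own.

```python
# Hypothetical mapping from work type to the skills a sub-agent would load.
ROUTES = {
    "research": ["research-synthesis"],
    "coding": ["prd-writer", "issue-decomposer", "test-writer"],
    "docs": ["docs-writer"],
}

# Hypothetical keywords standing in for the model's judgment.
KEYWORDS = {
    "research": ("research", "compare", "investigate"),
    "coding": ("build", "implement", "fix"),
    "docs": ("document", "readme", "explain"),
}


def plan(request: str) -> dict[str, list[str]]:
    """Return the sub-agents to spawn and the skills each should load."""
    text = request.lower()
    chosen = {
        kind: ROUTES[kind]
        for kind, words in KEYWORDS.items()
        if any(word in text for word in words)
    }
    # Fall back to a research sub-agent when nothing matches.
    return chosen or {"research": ROUTES["research"]}


if __name__ == "__main__":
    for kind, skills in plan("Build the export feature and document it").items():
        print(kind, "->", skills)
```

The useful property is that one high-level request fans out into a predictable set of sub-agents, each with a known skill pack.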
The beautiful thing about skills becoming a substrate is that when you start to work with them, you get the benefit of the entire ecosystem coalescing around them. They work the same way in Excel as they work in PowerPoint, as they work in Copilot, as they work in Claude, as they work in ChatGPT. Everybody uses them, and so it's worth it.
Skills vs. Prompts: The Compounding Advantage
Now, if we circle back to the prompt pattern — I talked a little earlier in this video about the idea that in October, prompts became somewhat less important for individual tasks because we could get predictable results by taking our best work and packaging it into skills. Still today, people will take examples of their best work and say, "Please turn this into a skill so I can produce this output reliably." Well, here's the thing I want to underline for you. We have had six months of this. The people who have been building with skills have been compounding them, because you can improve your skills. You can say, "Okay, this isn't right. Please update your skill file with X or Y because I don't like this." You're honing and refining what that skill can produce. And the people who have been prompting all along are just copying and pasting the same stuff.
In other words, skills compound for you. Skills compound by the weight of industry investment in the ecosystem and by the weight of your own commitment to having a predictable pattern for doing something and writing it down. Prompts don't compound in the same way. Prompts are excellent — there is still value in learning how to prompt well, no doubt about it. But prompts are becoming the basic 4×4 building block of Lego for the rest of the world. You still have to have the specialized Lego blocks to build the rest of the castle that you want to have. In the same way, you're going to need to figure out how to go from just prompting to skills that you can reuse.
How to Build a Skill That Works
So if this is a new concept for you, we're going to leapfrog you through and get you to agent-readable skills and some of the common pitfalls along the way.
How do you build a skill that works? Number one — most important thing — the description is where most skills go to die. What makes a bad description is vagueness. If you write "it helps with competitive analysis," that tells Claude absolutely nothing useful. It's too diffuse to match to anything very specific, and it triggers on anything tangential. It's just not very helpful. A good description names the document types or artifact types it produces. It includes actual trigger phrases like "analyze our competitors" or "who are the players in this market." It states what the output should look like. Anthropic's own guidance is actually very explicit here: on average, skills tend to undertrigger versus overtrigger, and so they want you to write descriptions that make the skill pushy, so Claude is confident to use it.
Now, here is one of those gotchas I was talking about. A technical constraint worth knowing is that a skill description must, must, must stay on a single line. If a code formatter breaks the skill description across more than one line, Claude will not read it correctly. Claude will not read the second line, and you're going to be in trouble.
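The description advice above can be turned into a small pre-commit style check. This is a sketch under assumptions: it expects a `description:` field in a front matter block, treats an indented line right after it as a wrapped description, and uses a hypothetical list of vague openers. The trigger phrases come from the examples in the talk.

```python
import re

# Illustrative, not exhaustive: phrasings that say nothing specific.
VAGUE_OPENERS = ("helps with", "assists with", "a skill for")


def lint_description(skill_text: str, trigger_phrases: list[str]) -> list[str]:
    """Return a list of problems found in the description field."""
    lines = skill_text.splitlines()
    idx = next(
        (i for i, line in enumerate(lines) if line.startswith("description:")),
        None,
    )
    if idx is None:
        return ["no description field found"]

    problems = []
    value = lines[idx][len("description:"):].strip()
    if not value or value in ("|", ">", "|-", ">-"):
        problems.append("description is empty or uses a multi-line block")
    # An indented line right after the field means the value was wrapped.
    if idx + 1 < len(lines) and re.match(r"\s+\S", lines[idx + 1]):
        problems.append("description continues onto a second line")
    lowered = value.lower()
    if any(opener in lowered for opener in VAGUE_OPENERS):
        problems.append("description is vague; name the artifact it produces")
    if not any(phrase.lower() in lowered for phrase in trigger_phrases):
        problems.append("description contains none of the expected trigger phrases")
    return problems


GOOD = (
    "---\n"
    "name: competitive-analysis\n"
    'description: Produces a markdown competitor matrix. Use when the user says "analyze our competitors" or "who are the players in this market".\n'
    "---\n"
)
BAD = (
    "---\n"
    "name: competitive-analysis\n"
    "description: It helps with competitive\n"
    "  analysis.\n"
    "---\n"
)

if __name__ == "__main__":
    print(lint_description(GOOD, ["analyze our competitors"]))
    print(lint_description(BAD, ["analyze our competitors"]))
```

Running a check like this after any formatter touches the file is one way to catch a description that was silently wrapped.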
Now we come to the next part: the methodology body. This is where you say, once you invoke the skill, what are we going to do with it? It needs five things.
First, it needs reasoning, not just steps. Give Claude your frameworks, give it your quality criteria, give it the principles behind your decisions. A skill that only has linear procedures is a very, very brittle skill. It's going to break when it hits a case that it doesn't recognize. Reasoning helps Claude generalize in this domain.
Number two, you need a specified output format — not "produce a summary," that's too vague. Is it markdown? Is it an Excel? Is it a PDF? Does it have exact fields or sections you want to cover? Be specific here or you're going to regret it.
Three, please give explicit edge cases to Claude. Everything that a human handles through common sense, you need to write down. Do not assume that Claude is going to work like an experienced human and just know those edge cases. Claude will not do that. You need to write your edge cases down.
Number four, make sure you give Claude an example to pattern-match against so it knows what good looks like. That's why you can have more than one file in the skills folder.
And I know this is going to sound counterintuitive because I just listed a bunch of things — keep the skill lean. A short skill that fires reliably is going to outperform a long skill with competing instructions. You need to be disciplined to recognize when enough is enough. Under most circumstances, you should not be spending more than 100 or 150 lines in your core Claude skills file. You can have a couple of examples in other files in the folder, but it should not be a big folder that Claude has to get into and bloat up its context window with.
You should be investing roughly 80% of your attention in that description field to make sure it triggers, and most of the remainder in being very clear with the general-purpose reasoning and making sure that Claude understands what to do and how to reason across this body once it accurately triggers. Everything else, the edge cases and the examples, can go into the last few percentage points. Don't overdo those extras, because the description and the reasoning are the things that cause Claude to trigger accurately and reason well. And by the way, I say Claude because it's native to Claude, but that's the same for ChatGPT, the same for Copilot, the same anywhere you're going to invoke it. You need to be clear in your trigger and you need to be clear in the general-purpose reasoning so that the LLM knows how to reason across the space.
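The checklist for the methodology body (reasoning, output format, edge cases, an example, and staying lean) can be sketched as a quick review function. The heading names it looks for are hypothetical conventions, not a required layout, and the 150-line ceiling is just the upper end of the guideline mentioned above.

```python
# Hypothetical heading names; a team would pick its own conventions.
REQUIRED_SECTIONS = ("reasoning", "output format", "edge cases", "example")
MAX_LINES = 150  # upper end of the 100 to 150 line guideline from the talk


def review_body(body: str, max_lines: int = MAX_LINES) -> dict:
    """Report the line count and any expected sections missing from a skill body."""
    lines = body.splitlines()
    headings = [
        line.lstrip("#").strip().lower() for line in lines if line.startswith("#")
    ]
    missing = [
        section
        for section in REQUIRED_SECTIONS
        if not any(section in heading for heading in headings)
    ]
    return {
        "line_count": len(lines),
        "too_long": len(lines) > max_lines,
        "missing_sections": missing,
    }


if __name__ == "__main__":
    draft = (
        "# Reasoning\nWhy we compare on price and positioning.\n"
        "## Output format\nA markdown table.\n"
        "## Edge cases\nFewer than two competitors.\n"
    )
    print(review_body(draft))
```

A check like this only confirms the sections exist and the file is short. Whether the reasoning is any good still needs the kind of testing described in the next section.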
Testing Skills for Agent Pipelines
Now, here's the thing that people aren't paying attention to. Remember how I said one of the biggest changes for skills is that they are now more agent-callable than human-callable? That means failures are different now. In the past, when you saw something drift as a human, you could correct it right then and adjust the skill. Now, the agent is going to try to use the skill to get a job done, and there may be no recovery loop if the agent gets it wrong — and that can be very expensive.
So one of the things you need to do, especially if you are considering using agents to drive skills, is to start quantitatively testing the performance of your skills to make sure they are ready for agents. You need to have a test suite that you run against your skill. You need to change it. You need to have a basket of tests. The more seriously you take your agent pipeline, the more seriously you should take the ability of your agents to call useful tools. And skills are king among useful tools. You should be able to give your agents skills that are battle-hardened. They should be tested and they should be quantified.
If you don't know what that means, here's what I'm describing: you need to run a basket of tests, quantify the results, change the skill and bump its version number, and then rerun the same tests to see if it does a better job. Skills don't always change in predictable ways. The wording in skills triggers certain parts of a transformer model's latent space and enables it to respond in ways that are hard to predict. So when you start to mess around with how to say "beautiful" for a PowerPoint, for example, you may need to run through three or four different wordings to get the exact response set that matches your company's aesthetic, even with examples. Take the time to do it, because if you get it right and you're producing 100 PowerPoints a week as a company, it's going to save you a lot of time.
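Here is one way the test, quantify, change, retest loop might look in code. The harness itself is generic: it scores a skill description against a basket of cases. The `keyword_stub` function is a loudly hypothetical stand-in for a real model call, which is where the actual trigger decision would come from. It only exists so the sketch runs without any API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Case:
    prompt: str
    should_trigger: bool


def score(
    description: str,
    cases: list[Case],
    triggers: Callable[[str, str], bool],
) -> float:
    """Fraction of cases where the trigger decision matched the expectation."""
    hits = sum(triggers(description, case.prompt) == case.should_trigger for case in cases)
    return hits / len(cases)


def _words(text: str) -> set[str]:
    """Lowercased words longer than three characters, with punctuation trimmed."""
    return {w.strip('.,"?').lower() for w in text.split() if len(w) > 3}


def keyword_stub(description: str, prompt: str) -> bool:
    """Toy stand-in for a model call: fires when two or more words overlap."""
    return len(_words(description) & _words(prompt)) >= 2


CASES = [
    Case("Can you analyze our competitors in the CRM market?", True),
    Case("Who are the players in this market?", True),
    Case("Draft a birthday message for my aunt", False),
]

VERSIONS = {
    "v1": "Helps with competitive analysis.",
    "v2": 'Produces a competitor matrix. Use for "analyze our competitors" '
          'or "who are the players in this market".',
}

if __name__ == "__main__":
    for version, description in VERSIONS.items():
        print(version, round(score(description, CASES, keyword_stub), 2))
```

Swapping `keyword_stub` for a function that calls your model of choice keeps the rest of the harness unchanged, so each wording change gets a number you can compare against the previous version.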
Designing Skills Specifically for Agents
I want to go a little further on agent-based design, because I don't think we name this enough. If you're designing skills specifically for agents, you're not just testing them — you're starting to think about agents as the primary caller. And that changes how you think about the structure of the sections of the skills.
The description becomes a routing signal, not a label. You are basically telling the agent through that little description where it should go in the workflow. Your description should contain wording that matches the outcome the agent has been given to look for in its goal. You need to tie that together more specifically.
Number two, agents need contracts. They're goaled against contracts. They think in terms of contracts. You need to frame the output of the skill as a contract. Think of this, if you're a developer, as an API contract: this is the SLA, this is what this particular thing gives you, these are the controllable fields, etc. In the same way, the agent needs to look at the skill and say, "This is what I'm going to get with this skill. This is what I won't get. This is what this skill will allow me to accomplish. This is where I can go against a particular goal with this skill." That is what I mean by a contract. It's essentially a declarative agreement that the agent can easily discover about the skill, which allows the agent to make a correct choice confidently.
Third — and we didn't think about this in October because we didn't have agents the same way — composability needs to be at the core of agent-first skills. Don't think of the skill as solving a problem per se. Think of the skill as needing to produce an output that will need to be handed off down the chain to an agent or sub-agent that's doing something else with it. If you're going through a business process where a ticket is having to go through multiple steps and an agent is having to process it, you need to think about at each step: if the agent calls a skill, is the output generated by the agent working with that skill something that is correct to hand to the next agent? Think through the end-to-end experience of agents and skills, because if you don't — if you just think about it as one output — you're likely to have breaks in your handoffs.
Last but not least, and I say this a lot: hard-wiring matters. If you are trying to hardwire agentic behavior, please use scripts. Don't use skills. Skills are just plain English. Agents will respect them. Agents will often follow them. But if you really want to hardwire, go more deterministic. Go into the scripting world. And don't be shy about it. It doesn't mean you're bad at AI. It just means you know what AI can do. Part of why agents are powerful is they are general-purpose tools to solve larger sets of problems. That doesn't mean we can't invoke deterministic tools along the way as part of our overall solution.
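The contract and handoff ideas in this section can be sketched as data plus two deterministic checks, which is also a small example of reaching for a script where certainty matters. Everything here is hypothetical: the `Contract` shape, the field names, and the two ticket-handling skills are assumptions made for illustration, not part of any skills standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Declares what a skill consumes and what it guarantees to return."""
    skill: str
    requires: frozenset[str]
    provides: frozenset[str]


def violations(contract: Contract, output: dict) -> set[str]:
    """Fields the contract promised that the output does not contain."""
    return set(contract.provides) - set(output)


def broken_handoffs(chain: list[Contract]) -> list[tuple[str, str, set[str]]]:
    """List fields each skill needs that the previous skill never provides."""
    gaps = []
    for upstream, downstream in zip(chain, chain[1:]):
        missing = set(downstream.requires) - set(upstream.provides)
        if missing:
            gaps.append((upstream.skill, downstream.skill, missing))
    return gaps


# Hypothetical two-step ticket pipeline.
TRIAGE = Contract(
    skill="ticket-triage",
    requires=frozenset({"ticket_text"}),
    provides=frozenset({"ticket_id", "category", "priority"}),
)
DRAFT_REPLY = Contract(
    skill="draft-reply",
    requires=frozenset({"ticket_id", "category", "customer_tier"}),
    provides=frozenset({"reply_markdown"}),
)

if __name__ == "__main__":
    print(violations(TRIAGE, {"ticket_id": "T-1", "category": "billing"}))
    print(broken_handoffs([TRIAGE, DRAFT_REPLY]))
```

In this example the chain check flags that `draft-reply` expects a `customer_tier` field that `ticket-triage` never produces, which is exactly the kind of break in a handoff that is cheap to find before an agent runs the pipeline unattended.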
Skills for Teams: The Three-Tier Model
So that's how we think about agents and skills. How do we think about teams and skills? This is actually important because we're doing teams with humans and agents together now. Our teams are now composed of a mixture of artificial intelligence and humans for a lot of our business process. In that world, what skills do is they act as immediately actionable context. I am not the person who's going to sit here and tell you your whole context layer needs to be skills — that's obviously incorrect because so much context isn't skills-shaped. But where you need stuff done, and so much of work is about processing and going on to the next thing, skills are often a really handy way to document that, as long as you do so in a way that an agent can call a particular correct skill in a context-efficient way and get a particular correct result, and as long as humans can also read it — which is one of the powerful things about markdown. It's both agent-readable and human-readable.
I want to suggest that high-performing teams have three tiers for the way they handle skills.
Tier one are what I call standard skills. They're pretty consistent across the organization. Brand voice goes in here. Formatting rules, approved templates — you get the idea. The things we do the same all the time. Those skills are very consistent, and in team and enterprise accounts in a lot of AI platforms including Claude — and I think Copilot does this too — you can provision those skills widely. You can say, "Our brand voice is this. This is our brand voice skill. Everybody use it." Makes perfect sense. A lot of people are doing it. If you're not doing it, think about doing it.
Number two, methodology skills. That's the second tier. This is how your org or your team performs high-value work and you want to do it predictably. It's like how you structure your client deliverables, or how the senior practitioners tend to get their work done — what makes their craft tick. Think of tier-two skills, these methodology skills, as: what are the things you would want to communicate to a new hire that would take them months to learn otherwise? That's a good example of something that should be a tier-two skill. And by the way, that is not something that is easy for an enterprise admin to roll out, because enterprise admins do not tend to be privy to the kinds of skills that are tier-two, high-craft methodology skills. Those tend to live inside senior practitioners' heads on individual teams across your company, and they need to get out of those heads and into something that is more shareable — because that is often where there's a lot of alpha. If you can have the practitioner skill from the most skilled product person on the team, the rest of the PM team would benefit. Ditto engineering, ditto customer success — you get the idea.
Now tier three, and this is something that a lot of us are doing: personal workflow skills. Things that we do that are sort of under the desk, that help us with our day-to-day. We need something that is maybe team-legible at best, but we're not actually surfacing this for org productivity. One of the things I want to caution you about when you come to tier-three skills is that it will be tempting to keep them just on your laptop — just under the desk, just in a downloads folder somewhere. Please try not to do that. The reason why is that you don't know when you're going to be on vacation, or you're going to be sick, or something's going to happen at work, and you're going to wish somebody could use the tool the way you designed it to be used and get the job done. Instead, they're going to have to dig around and swear and ask for your password and who knows what to get the skill to work.
You want to think more and more systemically. Think of your world as essentially actions or processes that skills can capture reliably, that humans or agents can read. And then ask: what is the level of access you want for that expertise? Skills essentially encode expertise.
The Community Skills Repository
Now, you might be wondering: when there are so many skills marketplaces, when there are lots of skills GitHubs, why should Nate start another one? It's a fair question. I think the simplest answer is this: we have lots and lots of skills. If you are a technical engineer, we have lots and lots of sort of domain starter-pack skills. What I think we're missing is domain-specific skills for solving real problems. So when I give examples like the Texas property guy who's doing rent rolls analysis — that is a domain-specific skill. That is something you are not likely to find kicking around a random GitHub repo. And so I think one of the interesting opportunities, because we have such a wide variety of domains represented inside the community, is to say: let's all get together and let's share the skills that we are finding useful, that really add an extraordinary amount of value for us, and then let's all trade our baseball cards and get the skills that we need and start to learn from each other.
This is exactly the approach that we took when we put OpenBrain over on GitHub, and this is actually going to live as a section of OpenBrain. So if you use OpenBrain, this is going to integrate right into OpenBrain for you and it's going to be super easy. It'll be in the same GitHub repo and you can call it in.
Simon Willison wrote back in October that he thought skills were going to be a bigger deal than MCP. I think that he may be right, but right now I don't think we have the fluency to make that happen. And part of why I'm making this video is I think that we are in particular sleeping on where skills need to evolve to, and where our fluency at creating skills needs to evolve to, in order to get to a point where we have skills that are a truly actionable context substrate for agents and humans.
That's where I want to go, and that's why I'm creating a community skills repository. It's just a practitioner library. It has knowledge work. It's organized by workflow type. There'll be an agent-readability bar applied consistently for every skill in there, so they'll be vetted. We're going to get into stuff like competitive analysis, financial model review, deal memo drafting, research synthesis, meeting synthesis — stuff that's very, very specific. And we're going to be as widespread in our coverage as possible so that you can invoke from the command line the skill that you care about, add from that skill pack only the skills that you need, and feel confident that you've got something that has some of that agent-readability bar built into it.
Actionable Next Steps
Now I want to leave you with a few actionable tips, because I think it's often easy to get lost here.
Number one: if you are struggling with where to start on skills, if all of this GitHub repo stuff feels over your head, look at something that you have repetitively done and ask yourself: if I do this once a week, twice a week, three times a week, can I get this turned into a skill? Then talk with your AI, your preferred AI. Honestly, they can all do skills now. Ask it to help you make a `SKILL.md` from the conversations you've had, feed it that info from those conversations, tell it what you thought went well, what you thought didn't, and be off to the races.
Now, if you're a little farther along — if you're like, "No, I get it, I'm at the GitHub repo, I can't wait" — great. Think on your terms about your agent-readable skills and ask yourself: are you thinking through those handoffs? Are you thinking through the eventual output? Are you structuring your skills more around the idea of a contract? Those are things that I think we are sleeping on right now for skills, and I think we need to take them more seriously.
If you are looking at the teams or enterprise level, think about the tiering I talked about. Is this an individual scale? Is this something that represents the expertise of the best person on your team? Is it more something that is just a brand standard that needs to be everywhere in the org? That shapes how you deploy it and how you think about it.
I am going to come back to something I mentioned earlier in this video. The thing that matters about skills is that they compound. Skills essentially represent a learned record of successful execution of a workflow that an agent or a human can follow. And if you continually evolve it as you get better at doing that thing, you are going to have a rememberable way for future, smarter agents and for your team in the future to execute that skill — to execute and build without going back to prompting. You are going to free yourself from copy-paste hell.
And that's really what skills do. Prompts just sort of evaporate once they're gone from the conversation. They're gone. You have to repaste them. You have to dig in the prompt library. Skills are what persists. And so getting them right — especially in a world where agents are now calling skills more than ever — that matters.
I hope these tips on skills are useful. I did not find a lot when I dug around for this, and so many of the tips on skills are focused on individual productivity. I wanted to cover the whole gamut here. I wanted to get into teams. I wanted to get into agents. I wanted to get into specific things that people are finding break and things that people are finding work.