podProse

Podcast transcripts, polished for reading


Stop accepting AI output that "looks right." The other 17% is everything and nobody is ready for it. | AI News & Strategy Daily | Nate B Jones Transcript

The case for treating AI rejection as a core professional skill

Nate B Jones argues that the most valuable AI skill is knowing when to say no to AI output — and systematizing those rejections into institutional knowledge.

Summary

In this solo presentation, Nate B Jones of AI News & Strategy Daily makes the case that the most undervalued AI skill is not prompting, workflow design, or model selection — it is the ability to reject AI output, articulate precisely why it falls short, and encode that judgment into durable, reusable constraints. He argues that while frontier AI models now match experienced professionals' output roughly 70% of the time on well-specified tasks (citing OpenAI's GDPVal benchmark), the remaining gap is entirely determined by human capacity to recognize and name what is wrong. Jones contends that most organizations are generating rejections constantly but allowing them to evaporate into email threads and chat windows, and that this represents the largest structural gap in the current AI tool ecosystem. He closes by describing a constraint library system he has built to capture rejections at the point of work, and argues that encoded domain judgment — not model choice — will be the primary competitive moat for organizations going forward.

Key Takeaways

Rejection, not generation, is the scarce AI skill. AI can now produce strategy decks, competitive analyses, product specs, and working code before lunch. The bottleneck is no longer production — it is the human capacity to evaluate output and identify what is wrong with it.

OpenAI's GDPVal benchmark reframes the AI capability story. Frontier models beat or tie professionals with an average of 14 years of experience 70% of the time on head-to-head tasks, at 100 times the speed and less than 1% of the cost. Jones argues the more important question is what determines the outcome for the remaining 30% — and what happens when the 70% that "looks right" fails in production.

Rejection has three distinct sub-skills that can be developed. Recognition (detecting that something is wrong, built through domain experience), articulation (explaining why it is wrong in a way that produces a usable constraint), and encoding (making that constraint persist beyond the moment of rejection) are each learnable and each currently underdeveloped across most organizations.

Encoded rejections compound into institutional competitive advantage. Jones uses Epic Systems and Bloomberg as examples of organizations that built structural moats not through superior technology but through decades of encoding domain constraints — clinical workflows and financial data logic respectively — rejection by rejection, failure by failure.

Most organizations are losing their rejections entirely. Constraints articulated in the moment of rejection evaporate into email threads and Slack messages, causing the same mistakes to recur. Jones argues the capture must happen inside the conversation, as a side effect of the rejection already being performed, rather than requiring a context switch to a separate tool.

Domain experts are becoming more valuable, not less, because of AI. The person who has reviewed 2,000 deals and can feel when something is off is now the most important person in the building, because AI multiplies their recognition capacity by roughly 10x — but only within the boundary of their expertise. Outside that boundary, AI multiplies confidence rather than expertise, which Jones describes as worse.

The junior talent pipeline is at risk. The breakdown of senior-junior mixing in knowledge work means juniors are not developing recognition through exposure. Jones argues that a shared constraint library, accessible via MCP server, could help juniors access accumulated senior judgment and accelerate the development of taste that would otherwise take years.

The competitive moat for organizations is encoded taste, not model choice. As AI models commoditize, the durable advantage will be the depth and specificity of an organization's constraint library — the encoded domain judgment that makes AI output reliable in a particular domain and that competitors cannot replicate simply by subscribing to the same APIs.

FULL TRANSCRIPT

The most valuable AI skill is saying no

Nate B Jones: Your most valuable AI skill is actually saying no. When was the last time you said no to AI?

I say no all the time. Not no to using AI, but no to AI output that sucks, that's not good enough. And I realize I reject much more AI-generated work than I accept — because it has the wrong framing, because it has sloppy reasoning, because it has confident-sounding analysis that would not survive contact with anyone who actually understands the domain. I send it back. I explain why.

And by the way, this is not necessarily a function of superpowered prompting. You can prompt really well, and if you know your domain, if you have high taste, you're still saying no a lot. When I want to see if someone is good at AI, I check how much they say no. And then I say no again, because the explanation I gave last time might have gotten the model from 90 to 95%, but my bar is higher.

I believe learning to say no is one of the missing skills in this whole nebulous judgment-and-taste category that everyone's waving at. It's not necessarily prompting. It's not workflow design, although those are valuable. It's not even model selection. The ability to look at AI output and say "this is wrong, and this is why" is a huge component of how you identify quality.

So I want to argue that we should treat rejection as the real AI skill. I want to argue that every skilled rejection creates institutional knowledge that did not exist before, and that we as individuals and we in our teams can compound those rejections into much more durable constraints if we start to pay attention to them — and we haven't been.

Why patterns of rejection are going uncaptured

I want to suggest to you that you probably don't track the patterns that you say no to. But there are patterns. If you start to keep those patterns, if you start to find ways to scale your nos, you start to map the dimensions of rejection as a core competency. You start to think about: what does recognition of something bad look like, systematically? What does articulating what needs to change about something bad look like, systematically? What does encoding that into a system that you can then hand to AI and say, "I've said this before, I don't need to say it again, because you now see my nos as a system" — what does that look like?

And yes, we're going to get to that.

Taste is a scalable asset. People love to say that, but what people don't talk about is that taste, if it's locked inside your human brain, ends up stressing you out. Because you look at this pile of AI-generated output that's gone up 10x or 100x or 1,000x, and you're like, I'm supposed to have taste across all of this, whatever that means. You're not going to get there if you don't learn how to both reject what's bad and systematize your rejection so that you can start to scale it.

Your rejections are more valuable than your prompts. Yes, I said it.

What happens in the moment of rejection

So what happens in this moment of rejection? The person applying domain expertise the AI doesn't have is able to identify a specific gap between "hey, this looks right" and "this actually is correct," and they can articulate a constraint that wasn't an explicit rule before they said it.

So a strategy partner might send back an AI-generated competitive analysis and say, "Look, where's our proprietary insight on customer switching costs here? Any firm with access to the same model could have produced the framing that I'm seeing here." And there you go — you're differentiating the firm's work from commodity output.

Or a loan officer who rejects a covenant tracking prototype might say, "You can't treat a debt service coverage ratio the same as a minimum net worth requirement. Those have completely different monitoring triggers." And they're just specifying business logic that no requirements doc is going to capture.

An editor might kill a draft saying, "The thesis is buried in paragraph four. You've got to lead with provocation." Now we're encoding an editorial standard.

I'm including a range here — you'll notice I didn't lead with tech. I included a loan officer, a strategy partner, an editor, because this is about knowledge work. This is about anything we do with computers.

These kinds of rejections are not null. They're not void. They're actually knowledge creation events. We're not capturing them, mostly, but they are. And the thing I'm not hearing anyone say is: how do we take these and scale our nos? How do we take these and let this moment of rejection become a valuable moment that scales in our AI workflow? Because the knowledge created in that moment — the constraint, the rule, the encoded taste — that compounds if you let it. But right now, for almost everybody, it evaporates. It lives in an email thread. It lives in a chat window. It lives in a Slack message. Nobody really captures it. Nobody compounds it. And so the same rejection happens tomorrow when the deck comes back and the strategy partner says, "Well, our proprietary insight on customer switching costs is missing." Again.

Why generation skills are no longer the bottleneck

Most of the AI skill market is focused on generation skills. If you take a course, it's all about what are your generative AI skills — do you prompt, do you design workflows, do you select tools? Maybe it's multi-model orchestration. And they're all typically aimed at: can you produce more? Can you produce faster?

Look, production has been solved. You do need those skills, but fundamentally that is not the bottleneck. AI can generate a strategy deck or a competitive analysis or a product spec or a working application before lunch. The generation side is now effectively a commodity.

OpenAI's GDPVal — the most rigorous measurement we have against actual knowledge work — shows frontier models beating or tying professionals with an average of 14 years of experience 70% of the time on head-to-head comparisons, and they do it 100 times faster for less than 1% of the cost. And these tasks are created by those professionals, graded by those professionals, in a double-blind manner. And the models are still winning most of the comparisons.

And everyone tends to read this as either a story of AI capability or a story of workforce demise. Both of these are much less interesting readings. The more interesting reading is: if AI now matches your best people's output 70% of the time on well-specified tasks, what determines the rest of the story? What happens to the 70% that looks right on the surface but leaves the lab and doesn't hit production correctly? What happens to the 30% where the AI just completely whiffs?

The answer is the same in both cases. Someone has to look at the output and know.

And it turns out that the way you look at the output and know is by saying the word no a lot. It's by looking at a product spec and saying, "This doesn't reliably encode business intent." It's by saying, "This code demos beautifully but it's not going to actually work in production, and here's why." "This analysis is technically correct but there's no 'so what' here."

And no good metric of AI capability measures this. So I want to suggest that if you're operating at that frontier, if you're trying to figure out how AI can be practically applied, then look at rejection. Look at no as a skill, and look at it as a competency that you break down into multiple dimensions. Almost nobody's looking at this and measuring it deliberately.

The three dimensions of rejection as a skill

Recognition is the first dimension — the ability to detect that something is wrong. This is the part that depends on domain experience. It's hard to shortcut this. Junior analysts will not catch flawed regulatory assumptions without the deep experience that senior analysts have. Loan officers might not spot those covenant logic errors because they haven't seen enough deals. You get the idea. Recognition is the product of years of practice. And it's the reason experienced domain experts are becoming more valuable, not less, as AI floods every organization with lots and lots of output.

So the person who's reviewed 2,000 deals and can just feel when something is off is becoming the most important person in the building, not despite AI but because of it. And recognition is the dimension most enhanced by AI. A domain expert with very strong recognition and access to AI tools can evaluate ten times the output that they could before. The leverage is multiplicative, but it only works inside the boundary of their expertise. Inside that boundary, AI is a force multiplier. But outside of it, AI multiplies confidence, not expertise. And that's worse.

Articulation is another key skill in this larger rejection skill set — the ability to explain why something is wrong in a way that produces a usable constraint. "This isn't right" — that's just a rejection. "This isn't right because you're treating all of these requirements identically, and the PRD actually needs to be structured this way" — that's a constraint. It's the difference between taste that stays in someone's head and taste that we can start to encode and share with the team and begin to apply at scale.
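The difference between a bare rejection and a constraint can be made concrete as a small data structure. What follows is only an illustrative sketch, not anything described in the talk: the `Rejection` class, its field names, and the `is_constraint` helper are all hypothetical, and the wording of the example rule is invented, because the talk does not say how the PRD should be structured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rejection:
    """One 'no' to an AI output. All field names are hypothetical."""
    output_kind: str               # e.g. "PRD" or "competitive analysis"
    verdict: str                   # what the reviewer said in the moment
    reason: Optional[str] = None   # why the output is wrong
    rule: Optional[str] = None     # what to do instead next time


def is_constraint(r: Rejection) -> bool:
    """A rejection is reusable only when it names both a reason and a rule."""
    has_reason = bool(r.reason and r.reason.strip())
    has_rule = bool(r.rule and r.rule.strip())
    return has_reason and has_rule


# "This isn't right" on its own: a rejection, but nothing anyone can reuse.
bare = Rejection("PRD", "This isn't right")

# The same verdict with the why and the what-instead spelled out.
# The rule text is an invented placeholder.
articulated = Rejection(
    "PRD",
    "This isn't right",
    reason="All of the requirements are treated identically",
    rule="Group requirements by type and handle each type separately",
)

print(is_constraint(bare), is_constraint(articulated))
```

Under these assumptions, only the second record could be handed to a teammate or a model and applied again without the original reviewer in the room.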

This is a learnable skill, but almost nobody's teaching it. Ironically, GDPVal's methodology illustrates why this is important. Every task in GDPVal went through five rounds of expert review. Every review had a rejection event — an expert looking at a task and saying, "This is not representative enough," or "This isn't clear enough for evaluation." That iterative refinement through rejection is what made the benchmark interesting and useful. The expert taste therefore didn't just evaluate the AI — it built the evaluation infrastructure. Articulation is what turns taste from a personal attribute into something that the organization can use as an asset.

Encoding is the practice of making that constraint persist beyond the moment of rejection. And that's another skill in this cluster. This is where everything tends to break down right now, because you have someone who articulates a constraint and it lives in an email, and then next quarter a different team will make the same mistake because the constraint was never captured anywhere durable and the reasoning has to be recreated from scratch. And the time that you're spending is getting burned on that same fight with the AI over and over again.

Andrej Karpathy's framework, that AI systems improve fastest where success can be verified, has very direct implications here. The verification infrastructure that enables AI improvement does not just magically emerge. It's built, often out of encoded rejections. When you have a test suite with multiple acceptance criteria, when you have quality gates, when you have business rules in your system that work, it means someone looked at outputs and said no with enough precision that the no could be made permanent. What you need to do is think about scaling that to the daily practice of rejection that every good AI practitioner does today.
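As a rough illustration of a "no" made permanent, the editor's rejection from earlier ("The thesis is buried in paragraph four") could be turned into an automated quality gate. This is a minimal sketch under loud assumptions: the `Check` type, `thesis_leads`, and `run_gate` are hypothetical names, and the check assumes the thesis sentence is supplied explicitly, since detecting a thesis automatically is a much harder problem.

```python
from typing import Callable, List, NamedTuple


class Check(NamedTuple):
    """One encoded rejection, expressed as a pass/fail test (hypothetical)."""
    name: str
    passes: Callable[[str], bool]
    message: str


def paragraphs(text: str) -> List[str]:
    """Split a draft into non-empty paragraphs on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def thesis_leads(thesis: str) -> Check:
    """Build a check that the given thesis appears in the first paragraph."""
    def passes(draft: str) -> bool:
        paras = paragraphs(draft)
        return bool(paras) and thesis.lower() in paras[0].lower()

    return Check("thesis-first", passes, "The thesis is buried. Lead with it.")


def run_gate(draft: str, checks: List[Check]) -> List[str]:
    """Return the message of every failed check; an empty list means pass."""
    return [c.message for c in checks if not c.passes(draft)]


checks = [thesis_leads("rejection is the real AI skill")]
buried = "Some background.\n\nMore background.\n\nRejection is the real AI skill."
leading = "Rejection is the real AI skill.\n\nSome background."

print(run_gate(buried, checks))   # one failure message
print(run_gate(leading, checks))  # no failures
```

The design choice worth noting is that each check carries its own message, so a failed gate returns the reviewer's original wording instead of a bare failure.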

How encoded rejections compound into institutional advantage

Here's where things get interesting. If you start to do this — if you start to properly encode your rejections as durable, reusable constraints — then you are now building a flywheel. You're not really scaling experts; you're scaling the encoded residue of expert judgment. You're scaling the outputs of human judgment. And this starts to compound across an organization's footprint.

So if you're a consulting firm that encodes partner rejections across thousands of engagements, you're effectively building a repeatable institutional bar for quality at that firm that no competitor can replicate by just subscribing to the same AI model APIs. If you're a media company, you can capture editorial judgment across thousands of pieces and start to develop a taste that helps individual editors scale their attention.

The companies that have been doing this, often informally for years, already tend to dominate their markets. Epic Systems did not win in healthcare by having better technology. It won by spending decades encoding clinical workflows from thousands of hospitals into a deeply integrated platform. And the result, decades later, is very clear. Clinical workflows were not always easy to discover. The development team from Epic had to go onsite, shadowing doctors, watching workflows, absorbing domain constraints to build the systems that clinicians need. And the result is a system that can handle over 300 million patient records and is so embedded in clinical operations that switching costs are structural. It's the ultimate system of record. And the moat here is not the software — it's the encoded judgment about what the software needs to get right, built rejection by rejection, workflow by workflow, failure by failure, across thousands of hospitals.

This is a lesson that people who are panicking about software as a service need to learn. When you have a system that is built essentially out of encoded taste at scale, to the point where it becomes structural for businesses, it is very difficult to rip out. Bloomberg did the same thing in financial data. When you have a vertical SaaS company that really owns a niche, you're doing some version of this.

What's new is that AI makes the encoding cycle much faster than it used to be. AI can generate a provocation. The expert can reject it. The rejection can get encoded. The library can grow. And the ratio of expert hours spent to constraints encoded improves every single cycle because you're saving all of those nos.

The structural gap in the AI tool ecosystem

But the infrastructure to make all of this possible has really not been there in the age of AI. As far as I can tell, almost nobody is talking about how you scale your nos. This is not a small oversight. It's the largest structural gap in the AI tool ecosystem. All of the organizations using AI are generating rejections at the grassroots level — individual users see them all the time. And every single one of those, almost without exception, is falling on the floor.

And the right solution is not a separate tool. It's not a spreadsheet. It's not a database. It's not a dashboard, because people won't context switch. I believe the capture has to happen where the work happens — inside the conversation, as a side effect of the rejection you're already performing.

And I believe that so strongly I built something for it. If you head over to the Substack, I've put together a solution that is designed, at least at the personal or small team level, to enable you to start to log and encode your rejections without ever leaving your chat tool — via MCP server and a database. It's not that fancy. You can set it up on your own, I'm sure. But for those who want a quick head start, I've got it. I want to stop dropping as many rejections on the floor as I've been doing personally, as others have been doing. It became so much of a pain point for me I had to build something to solve it. And I realized, as I looked around, nobody else was looking at it this way.
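For readers who want to picture the database half of such a setup, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the kit described above, and it leaves out the MCP server entirely. The table layout and the function names (`open_library`, `log_rejection`, `constraints_for`, `as_prompt_preamble`) are assumptions made for illustration, as is the wording of the example rule.

```python
import sqlite3
from typing import List, Tuple


def open_library(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a constraint library. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS constraints (
               id INTEGER PRIMARY KEY,
               domain TEXT NOT NULL,
               rule TEXT NOT NULL,
               reason TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def log_rejection(conn: sqlite3.Connection, domain: str, rule: str, reason: str) -> int:
    """Store one articulated rejection and return its row id."""
    cur = conn.execute(
        "INSERT INTO constraints (domain, rule, reason) VALUES (?, ?, ?)",
        (domain, rule, reason),
    )
    conn.commit()
    return cur.lastrowid


def constraints_for(conn: sqlite3.Connection, domain: str) -> List[Tuple[str, str]]:
    """Return (rule, reason) pairs for a domain, oldest first."""
    rows = conn.execute(
        "SELECT rule, reason FROM constraints WHERE domain = ? ORDER BY id",
        (domain,),
    )
    return rows.fetchall()


def as_prompt_preamble(conn: sqlite3.Connection, domain: str) -> str:
    """Format stored constraints as text to prepend to a future request."""
    pairs = constraints_for(conn, domain)
    if not pairs:
        return ""
    lines = [f"- {rule} (why: {reason})" for rule, reason in pairs]
    return "Constraints from past rejections:\n" + "\n".join(lines)


conn = open_library()
first_id = log_rejection(
    conn,
    "lending",
    "Monitor debt service coverage and minimum net worth covenants separately",
    "They have completely different monitoring triggers",
)
preamble = as_prompt_preamble(conn, "lending")
print(preamble)
```

In this sketch the loan officer's rejection is logged once and then comes back as a preamble for the next lending request, which is the "I've said this before, I don't need to say it again" behavior in its simplest form.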

Implications for hiring, talent development, and junior career paths

And this has some really interesting implications if you're in hiring or talent upskilling. Because one of the things that you get for free if you start to create a taste bar like this and make it accessible via MCP server is you start to enable people who would otherwise not have access to the taste of a very senior person. They can get that just by hitting the MCP server and coming back with it. They can get that perspective.

This means that junior positions where people develop recognition can be accelerated. Juniors can get a sense of "does this hit the bar or not" very quickly, and they can start to access the accumulated context of taste, and the accumulated ability to articulate, that their seniors have. That is what we have been losing in our talent development and career ladders for the past few years.

It's one of the reasons that juniors are in crisis. We don't have the mixing we used to have between juniors and seniors in various disciplines, and we need to fix it. Because how is work going to get done in ten years if no one has learned anything?

So part of what I'm doing with something like a constraint library — which is kind of what I call this — is I'm trying to jumpstart the idea that taste is scalable, taste is a learnable skill, taste is something not only that AI can learn but humans can learn too. And because taste is individual, typically, to a particular team, a particular firm, you're going to need to build this for yourself. I can't give you a universal taste library because everybody has their own perspective inside their business. But I can show you how to build one and give you a kit for it and let you do it quickly. That part I can do.

And I think it's really important to think about it intentionally, regardless of whether you want to grab my kit or do it yourself, because if you're not thinking about your constraints, you're basically telling your AI: I'm willing to just iteratively work with you and waste my hours. And you're telling your juniors on your team: it's okay, we're not really going to teach you, and we're just going to hope you learn by osmosis on Zoom calls. And that doesn't work.

The frontier of AI value is identical to the frontier of your organization's taste

So Andrej Karpathy's framework — what you can verify, you can automate — has a corollary that should keep you up at night. The frontier of AI value is identical to the frontier of your organization's taste. Where your capacity to verify quality extends, AI can create more value. Where it doesn't, AI can generate more risk. Not theoretical risk, by the way, but compounding silent risk — of an organization that generates more and more and more while understanding less, and eventually doesn't know what's going on because the output kind of looks good but you've forgotten where the bar is.

So your anti-slop strategy is not "be more careful." It's not "give people more lectures." It's not even "write better prompts." It is developing, and institutionalizing, and eventually automating the human skill of rejection so that the human skill of rejection can scale. It's not that we see a world where humans aren't going to have taste — it's actually quite the opposite. We love that humans have excellent taste, and we want to give them more time to look at different kinds of mistakes. Because one thing I'm fairly confident of is that we're not going to run out of things to have a high quality bar about.

What this means for executives, managers, and individual contributors

So if you're an exec, the competitive moat here is not which AI vendor you choose. Models are getting commoditized. The moat is going to be the depth and durability of your organization's encoded taste — the constraint library that makes AI output reliable in your domain. You're going to have to audit it and ask yourself: where are the domain experts? Are those rejections being captured? Are they evaporating? You're going to have to start treating encoded domain judgment as an asset class, because it is one.

And if your job is managing a team, you're going to have to create space for articulation. When someone rejects AI output, you should be challenging them to explain why and to socialize that skill set. It's an investment. A team that can articulate rejections is building a shared understanding of quality that persists across projects, across personnel changes, across tool migrations. Whereas a team that is just silently and individually fixing AI output is not really growing at all.

And if you're an individual contributor, your most valuable professional development isn't learning the newest tool — tools are going to change. It's deepening your ability to recognize when something isn't working, practicing the ability to articulate what is wrong and how to fix it, and then making sure that you are part of, or helping to stand up, a system where that taste can scale.

And if you're an entrepreneur, I would challenge you to think about building a product that enables us to start to scale taste. Look at how we can take these rejections and scale them out. And I'm going to say it can't be a special new pane of glass. It can't be a special new website. We're not going to context switch. We're not going to tool switch. This needs to be seamless. This needs to be something that does not cost us attention, since attention is scarce enough.

We all have the job of learning to say no to new and more interesting things. And that means we have the job of articulating what is wrong with our current AI outputs at a high level of fidelity, storing those judgments somewhere, and then using them both to improve AI outputs and to train folks who are earlier in their careers. If we don't do that, we're going to regret it.

That's the job now. So much else is becoming commoditized. But being able to say "this is what's really excellent" — that's something you can get excited about, because you can raise your taste bar.

Polished transcript of AI News & Strategy Daily | Nate B Jones. All views are those of the original speakers.

Published by @maverick
