The AI Prompting Mistake Costing You Hours Every Week (10 Prompts to Fix It) | AI News & Strategy Daily | Nate B Jones Transcript

Polished transcript · AI News & Strategy Daily | Nate B Jones · 5 Dec 2025 · 10m · @maverick

AI model selection strategy: choose by task, not by workflow

Nate B Jones argues that the key to effective AI use is selecting models at the task level, not the workflow level.

Summary

Nate B Jones of AI News & Strategy Daily argues, in a solo episode, that most people approach AI model selection incorrectly. The central claim is that asking "which model should I use for my workflow?" is the wrong question; the right question is which model to use for each individual atomic task within a workflow. He argues that workflows are composed of many discrete microtasks, each of which may be best served by a different model, and that treating the whole workflow as a single unit of work is the primary reason AI automations stall, loop, or hallucinate. He also argues that investment in AI follows an exponential rather than linear return curve, meaning those who pay more and know how to use multiple models get disproportionately greater value.

Key Takeaways

  • The workflow is the wrong unit of analysis. Most people assign entire workflows to a single AI model, but workflows are made up of many atomic tasks — cleaning data, finding context, reasoning, transforming formats, producing artifacts — each of which may require a different model for best results.
  • Scoping is the root cause of automation failure. When AI automations stall, loop, or hallucinate, the problem is usually not the model — it's that the unit of work was not correctly scoped. A model cannot fix a badly scoped task.
  • Model selection requires fingertip feel, built through practice. Knowing which model to use for a given task comes from deliberately testing multiple models on real work and honestly comparing results. There is no shortcut to this.
  • Different models have different strengths at the task level. Nate gives a concrete example using a product requirements document (PRD): Gemini for synthesizing customer stories, Gemini with visual capability for UI analysis, ChatGPT 5.1 in thinking or pro mode for roadmap alignment, and Claude Opus 4.5 for constructing the final document.
  • AI investment returns are exponential, not linear. Spending more on AI plans and knowing how to use them does not produce proportionally more value — it produces disproportionately more value, because higher-tier plans offer better intelligence access and higher usage limits.
  • The "which model" question is getting harder, not easier. As more models enter the market with varying capabilities and pricing, the decision of which model to use for a given task is becoming more complex, not simpler — which itself represents an area where human judgment and expertise will remain valuable.
  • Casual vs. serious AI use requires different approaches. For casual use, picking any single model and paying a flat monthly fee is fine. For serious, repeatable, high-quality work, a single-model approach is insufficient.

Full Transcript

    The wrong question most people ask about AI models

    Nate B Jones: I get asked all the time, "Nate, which model would you use for this or that workflow?" That's really the wrong question. And I want to spend this video talking about asking the right question so you can get farther on the AI work that you're doing.

    Don't ask which model should I use for my workflow. Instead, think about the atomic level of the task. Ask which model should be used for your task. And if you put up your hands at this point and say, "Nate, I'm asking the same thing — a task is a workflow," no, it's not. Tasks are bits of workflow. They're like Lego bricks inside a workflow. And the reason I'm insisting on that level of detail is because if we're not that honest about the individual pieces inside our workflows, we're not going to be able to pick the right model for the job.

    If you want reliability and speed and accuracy and the right model for the task, you have to be honest about how messy your data is. You have to be honest about how many steps the task requires and what the final output needs to look like. Most people just want to be told the answer. And that's why their automations fail. There's not a shortcut.

    Why model selection is getting harder in 2025

    People keep asking, "Will people have jobs?" This is an example of where we're going to have jobs. We're going to have jobs because "which model should I use" is a really hard question. And it is getting harder in 2025, not easier. Do you know why? Because there are more and more models to choose from, more and more levels of intelligence, more and more unit economics to factor in. Even if you're a consumer and you don't care about cost per token, there are more consumer models to choose from. You can choose Kimi K2 Thinking, or Claude, or ChatGPT, or Gemini, or Grok. You name it.

    The real problem for most people is that they have difficulty getting to a level of clarity about what they plan to do with the work that they assign the LLM. The workflow is too big a unit of work. Most workflows consist of something like a dozen different microtasks, and people tend to want to assign the entire thing to the AI.

    Now, I have to give credit to the model makers here. They are doing their very level best to give us models that can take that level of vague assignment and make it work. They're working really hard on that. I saw huge progress with Claude Opus 4.5 in particular, because it can take a big messy task like "make a deck out of this mess" and it will just work away until it gets done. Same thing with vibe coding in Opus 4.5 — it just works away and knocks down bugs until it gets done. And so we are getting to the point where, for some consumer applications, if you just hand the model a bunch of stuff, it will produce something at the end of a workflow, which is in and of itself kind of amazing.

    But if you want predictability, if you want repeatability, if you want high quality and high consistency, then what you need is to think in terms of the task.

    Breaking down a workflow into atomic tasks

    So I'm going to give you some examples. These tasks are very common, and they occur across multiple workflows. The more you can see workflows as composed of Lego bricks (interchangeable pieces, tasks that we repeat over and over with different inputs), the better off you're going to be at finding the right model.

    Cleaning data — that's a great example of a Lego brick. Finding context — another great Lego brick. Inferring missing pieces from a pattern — that's a Lego brick. Reasoning — that's a Lego brick. Transforming format from A to B, checking correctness, producing an artifact, handing it off to the next step, passing the data along, making a plan to get something done — these are all individual tasks.

    So when someone says "my workflow," if we actually want to automate it, I ask myself: which of several LLMs do we need to get involved with here? Because you might pick one LLM for cleaning the data and a different one for reasoning. You don't often need a very fancy model for cleaning data unless the data is really dirty.
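The idea of picking a model per brick can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TaskKind` names, the `messy_input` flag, and the "light"/"strong" tiers are hypothetical and do not come from the talk. The one rule taken from the transcript is that data cleaning rarely needs a fancy model unless the data is really dirty.

```python
from dataclasses import dataclass
from enum import Enum


class TaskKind(Enum):
    # A few of the "Lego bricks" named in the talk.
    CLEAN_DATA = "clean_data"
    FIND_CONTEXT = "find_context"
    REASON = "reason"
    TRANSFORM_FORMAT = "transform_format"


@dataclass(frozen=True)
class Task:
    kind: TaskKind
    description: str
    messy_input: bool = False  # hypothetical flag: is the input really dirty?


# Assumption for illustration: only reasoning defaults to the stronger tier.
HEAVY_KINDS = {TaskKind.REASON}


def pick_tier(task: Task) -> str:
    """Return a hypothetical model tier ("light" or "strong") for one task."""
    if task.kind in HEAVY_KINDS:
        return "strong"
    # Cleaning data rarely needs a fancy model unless the data is really dirty.
    if task.kind is TaskKind.CLEAN_DATA and task.messy_input:
        return "strong"
    return "light"


# One workflow, broken into atomic tasks.
workflow = [
    Task(TaskKind.CLEAN_DATA, "normalize a tidy CSV export"),
    Task(TaskKind.CLEAN_DATA, "untangle free-text survey answers", messy_input=True),
    Task(TaskKind.REASON, "decide which segment to prioritize"),
    Task(TaskKind.TRANSFORM_FORMAT, "convert notes to JSON"),
]

tiers = [pick_tier(t) for t in workflow]
print(tiers)
```

The point of the sketch is that the decision is made once per task, so two cleaning steps in the same workflow can end up on different tiers.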

    And this is why people keep trying to throw a single agent at a 14-step process and then wonder why it stalls, why it loops, why it hallucinates. A model is not going to magically fix a badly scoped unit of work. A model will not repair something and make it work if you didn't scope it correctly to begin with. The unit has to be the task. If we want to win the workflow from an AI automation perspective, if we want AI automations that work, it starts with understanding our tasks.

    A practical example: writing a product requirements document

    So ask yourself, if you're doing a particular piece of work — and this by the way is not just for AI engineers, it's for anybody — what is the real sequence of irreducible atomic units of work here?

    What am I doing when I write a product requirements document? Well, I have to synthesize information from 50 different customer stories. Then I have to study the current UI and extract an understanding of where the feature would go. Then I have to think of three different ways the feature could go based on the three different insights I've gotten from the customer stories. Then I have to align that with the roadmap. And then — you see what I mean, right? These are all individual atomic units of work for just one flow around writing a PRD.

    And the trick is, if you're picking models, I find it's better to pick the model that goes with that unit of work. So if we play that back again: I would use Gemini right now to synthesize those customer stories — it's especially good with synthesizing video. I would use Gemini with its visual capability to study the UI and identify places and ways to put the new feature in. I would probably use ChatGPT 5.1 in thinking mode or pro mode to think about the relationship between the roadmap and the proposed idea. I would probably use Opus 4.5 to construct the PRD document once all of those inputs are in place. And I could use other tools — ChatPRD exists for a reason, it's a great tool, and you can use specialized tools in some of these instances.
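The PRD breakdown above could be written down as a small pipeline, roughly as follows. This is a sketch, not a real integration: the model labels are plain strings taken from the talk rather than API identifiers, and `fake_call` is a hypothetical stand-in for whatever client would actually call each model. The step about proposing three feature directions is left out because no model is named for it.

```python
from typing import Callable

# (task, model label) pairs as described in the talk. The labels are plain
# strings for illustration, not real API model identifiers.
PRD_STEPS = [
    ("synthesize customer stories", "Gemini"),
    ("study the current UI", "Gemini (visual)"),
    ("align with the roadmap", "ChatGPT 5.1 (thinking)"),
    ("construct the PRD document", "Claude Opus 4.5"),
]


def fake_call(model: str, task: str, context: str) -> str:
    """Hypothetical stand-in for a real model call; it only records what ran."""
    return f"[{model}] {task}"


def run_pipeline(steps, call: Callable[[str, str, str], str]) -> list[str]:
    """Run each atomic task with its own model, passing earlier outputs along."""
    outputs: list[str] = []
    for task, model in steps:
        context = "\n".join(outputs)  # everything produced so far
        outputs.append(call(model, task, context))
    return outputs


results = run_pipeline(PRD_STEPS, fake_call)
for line in results:
    print(line)
```

Keeping the task list separate from the call function means a model can be swapped for one step without touching the rest of the flow.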

    But if you want to get more fluent at AI, if you want to get more fluent at model selection, it starts with understanding the task.

    Building fingertip feel for models through deliberate practice

    I broke out that PRD example so you can see how I'm taking apart the task and picking a particular model for each piece based on my fingertip feel for the models. And if you want to know how to get to a fingertip feel for the models, the simple answer is it goes right back to the task. It goes back to giving the model a job.

    I know that I trust Gemini with customer stories because I have tried the same job with Claude, and I've tried it with Grok, and I've tried it with ChatGPT, and I've tried it with Kimi, and I know that Gemini does a better job. It reads the whole messy context, does a fine job with it, and synthesizes in a way that allows me to read and understand it clearly. I have a fingertip feeling for Gemini in that particular area. And that comes from practice and it comes from deliberate exposure across models.

    How to think about budgeting for AI models

    This gets back to a little tip for you when you think about budgeting for models and how much you're willing to pay. When people ask me, "Nate, what is the model for this workflow?" — behind that question, they are often asking, "Nate, where do I spend my 20 bucks? Where do I spend the money I'm choosing to invest in AI? And can I just pick one?"

    As much as I wish the answer was yes, I don't think the answer is yes. The answer is not: pick this one model and it will just work for you. If you're doing casual work with AI, you can forget everything I just said, because you can pick any model, pay your 20 bucks, and it will just work for you. If you're doing serious work with AI, that answer does not work. It just won't, because you need the specialty characteristics of different models to do the serious work.

    And I think increasingly, if you look at return on investment for whatever amount you're paying for AI, it follows an exponential return on investment curve. In other words, you get X return on 20 bucks a month for one model and you're happy. If you invest more, and you know how to use it and push on it, and you're doing what I'm describing — picking the task, pushing as hard as you can, paying for two models — your budget might be a hundred bucks a month in some cases. Maybe it goes as high as 300 bucks a month. You're paying more. It feels like a lot more. But your return on investment is not linearly higher — it is exponentially higher.

    This is what I wish I could convey. The reason why the fancy plans work is because people who invest in the fancy plans get disproportionately more value. If you get 2x the value from investing in the 20-buck plan, you're going to get 10x the value from investing in the fancy plan, if you know how to use it, because the limits are higher and the intelligence access is better.

    And I'm not going to deny it: there's absolutely a correlation effect. The people who are willing to pay more are typically the people who know how to use the AI better. And that is another massive driver.

    The honest answer to "which model should I use?"

    So if you are asking me, "Nate, which model do I choose?" — I'm going to come back and say: as much as you can, as much as you are willing to, if you want to lean in on AI fluency, think about it in terms of the task. Then think about how much model power you can apply to that particular task and which model specializes in that task.

    And if you want to know how to get there on a fingertip feel, I've written up a lot on Substack about this and which model I pick. I even have a prompt on picking the right model for a task between Gemini, ChatGPT, and Claude. So there are resources available, but I don't want to hide the ball — you also need to practice. You need to touch the models a lot. You need to touch as many different models as you can, give them real work, compare the difference, and use your honest judgment to say: this sucks, this doesn't suck, this sucks less, this is worth doing. And that's how you get very rapidly to a sense — as in the PRD example I described — of which model you'd use for any given task.
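One way to make that practice habit concrete is to keep a simple log of your own judgments and let it suggest a pick per task. The sketch below assumes a 1-to-5 rating scale (1 for "this sucks", 5 for "worth doing"). The model names and scores are made-up placeholders, not results from the talk.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings from running the same real task through several models.
# Each row is (task, model, rating). All values here are invented examples.
ratings = [
    ("synthesize customer stories", "model_a", 4),
    ("synthesize customer stories", "model_b", 2),
    ("synthesize customer stories", "model_a", 5),
    ("construct the PRD document", "model_a", 3),
    ("construct the PRD document", "model_b", 5),
]


def best_model_per_task(rows):
    """Average the ratings per (task, model) and keep the top model per task."""
    scores = defaultdict(lambda: defaultdict(list))
    for task, model, score in rows:
        scores[task][model].append(score)
    return {
        task: max(by_model, key=lambda m: mean(by_model[m]))
        for task, by_model in scores.items()
    }


picks = best_model_per_task(ratings)
print(picks)
```

The log does not replace judgment. It only records the comparisons you have already made so the pick for a given task is based on more than memory.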

    I hope this is helpful. I wish the answer were as easy as me recording 30 seconds of video and saying "always use ChatGPT." It is not. That is just not truthful. And so I hope this honest answer is helpful if you're trying to figure out which model to use and how to think about which model to use.


    Polished transcript of AI News & Strategy Daily | Nate B Jones. All views are those of the original speakers.