What to hire for when AI is doing more of the work

I have been reading Adrian Tchaikovsky’s Children of Memory, which at one point stages a philosophical debate about raven consciousness. The ravens in the novel are doing things that look like intelligence. The question the book keeps circling is whether it actually is intelligence.

It made me think of AI, so I went and looked at the research. What I found is a useful lens for one of the most practical questions CEOs are wrestling with right now: what do you actually hire for when AI is doing more of the work?

What the raven research actually found

Ravens can plan ahead. A 2017 study found they prepared for tool use and bartering with delays of up to 17 hours - outperforming chimpanzees and four-year-old children under some conditions. They deceive competitors about food caches, accounting precisely for what each rival can and cannot see. By four months of age, they match adult great apes across a broad cognitive test battery.

But a separate 2018 meta-analysis looked at corvids dropping stones into water tubes to retrieve food - one of the most cited demonstrations of animal problem-solving - and concluded that trial-and-error learning, not genuine causal reasoning, explained most of the impressive results.

The scientists genuinely disagree. Ravens produce outputs that are hard to distinguish from reasoning. Whether reasoning is what is actually happening remains unresolved.

The same question, asked about AI, is equally open. Whether large language models genuinely understand or produce sophisticated pattern-matching that looks like understanding is one of the most contested questions in AI research right now. Both ravens and AI produce outputs that can be indistinguishable from reasoned thought. The mechanism behind both is disputed.

This matters for something more practical than animal cognition.

What this means for how AI fails

The thing ravens cannot do is critique their own problem-solving. They get the answer they get. If the approach was wrong, they have no mechanism to identify that - only the outcome eventually tells them.

AI has the same limitation, surfaced differently. It does not know what it does not know. It produces answers with the same confidence whether the question is well within its capability or at the edge of it. It rarely volunteers uncertainty unprompted, and it rarely flags that the question was subtly wrong or that an important consideration was missing from the prompt.

The people getting genuinely better results from AI are not the ones who prompt more carefully. They are the ones who push back on the output, reframe the question when something feels off, and take responsibility for the final call. That gap - between what the AI produces and what is actually right - is exactly where human judgment lives.

The research on what AI is doing to critical thinking

Two studies are worth knowing about here, because they point in the same direction.

Microsoft Research published findings in 2025 showing that the more confidence people had in AI’s ability to perform a task, the less critical thinking effort they applied. Not a little less. Measurably less, in ways that showed up in the quality of decisions.

A separate study found a decline in critical thinking ability linked to AI reliance, with the effect strongest in younger users and those who used AI most intensively.

Taken together, these findings have an obvious interpretation: AI is eroding the skill that matters most. Give people a capable assistant and they stop developing their own judgment.

But the research misses something important.

Critical thinking is not just a skill AI might erode. It is the skill AI requires most to use well. The value in any AI workflow does not come from generating the output. It comes from interrogating it - spotting what the model missed, identifying where the framing was wrong, knowing when to push back and when to trust the result. That is not something the AI does. That is the human layer, and it matters more as AI does more, not less.

How AI changes what you should hire for

AI has genuinely commoditised a significant portion of domain knowledge. Things that used to require years of accumulated expertise - drafting documents, synthesising research, generating structured analysis - can now be produced at reasonable quality with the right prompt and the right model.

That changes what experience actually signals. Years in a role used to correlate with both accumulated judgment and accumulated knowledge. Now they correlate much more purely with judgment. The knowledge component has been partially unbundled by the tools.

The implication for hiring is not that domain expertise does not matter. It is that the bar for what constitutes useful expertise has changed. Knowing the right answers matters less when AI can generate plausible answers quickly. Knowing which questions to ask, when to trust the output, and when to override it - that is what is becoming the differentiator.

For any leadership team building or rebuilding right now, the practical question is: are your hiring criteria measuring the thing that AI cannot replicate? Or are you still optimising for domain knowledge that AI has partially replaced?

What critical thinking actually looks like in an AI workflow

It is worth being specific, because “critical thinking” can sound abstract in a way that makes it hard to hire for.

In practice, it looks like this:

Interrogating the output before acting on it. Not accepting the first answer. Asking the AI to steelman the opposite view, identify what it might have missed, or explain its reasoning. The people who do this consistently get materially better results than those who take the initial output at face value.

Recognising when the question was wrong. Sometimes the most important insight from an AI session is that the question you asked was not the right question. That recognition requires you to hold a model of what you were actually trying to solve - not just accept the solution to what you typed.

Knowing the limits of the model’s knowledge. AI does not always signal uncertainty. Knowing which topics or question types are likely to produce confident-but-wrong answers is a judgment that comes from experience with the tools, not from any particular domain.

Taking responsibility for the final call. This sounds obvious. In practice, there is a strong pull toward deferring to whatever AI has produced - particularly when it is well-written and internally consistent. The humans getting the best results are the ones who treat AI output as an extremely capable first draft, not as a verdict.

None of these are technical skills. They are habits of thinking that either exist in a person or do not - and that strengthen or weaken depending on how that person uses AI.

How to assess critical thinking in AI-era hiring

Hiring for critical thinking in an AI context is not fundamentally different from hiring for critical thinking in general. The signals are the same. What changes is the framing.

A few approaches that surface the relevant difference:

Give candidates an AI-generated analysis and ask them to critique it. The ones who look for what is missing - not just what is there - are the ones you want. Watch for whether they accept the framing of the analysis or question it.

Ask about a time AI gave them a wrong or misleading answer, and what they did. People who have developed the habit of interrogating output have these stories. People who use AI as a black box often do not.

Ask what they do when AI and their own instinct disagree. There is no right answer to this - both instinct and AI can be wrong. But how someone reasons through that tension tells you a lot about their judgment.

The underlying question in all three is the same: does this person treat AI as a tool they direct, or an authority they defer to? The former is the hire.

Final thought

AI has changed what domain knowledge is worth. It has not changed what judgment is worth. If anything, judgment has become more valuable, not less - because the output that requires judgment is now produced faster and in higher volume than any human team can manually check.

The hiring implication is not subtle. If your criteria have not updated since AI became part of how people work, you are likely optimising for the capability AI has partially replaced - and underweighting the one it has made more important.

Ravens are impressive. But they cannot tell you when they are wrong.

Frequently asked questions

What skills matter most in the AI era?
Critical thinking and judgment have become more valuable as AI takes on more domain knowledge tasks. Specifically: the ability to interrogate AI output, identify flawed framing, and take responsibility for the final call. These are the capabilities that determine whether someone uses AI well or is effectively replaced by it.

Should I still hire for domain expertise?
Yes, but the bar has changed. Domain knowledge matters less as a standalone qualification now that AI can produce plausible answers quickly. What matters more is whether someone has the judgment to know when to trust those answers and when to override them. Experience correlates with judgment - but they are not the same thing, and it is worth being explicit about which one you are hiring for.

How do I assess critical thinking in AI-era hiring?
Give candidates AI-generated content to critique, ask them about times AI was wrong and how they handled it, and ask how they navigate disagreement between AI output and their own judgment. You are looking for people who treat AI as a tool they direct, not an authority they defer to.

Is AI making workforces less capable over time?
There is evidence from Microsoft Research and others that heavy AI reliance reduces critical thinking effort. The counterpoint is that critical thinking is also what makes AI valuable to use - so organisations that design workflows around it, rather than replacing it, get compounding returns while others get compounding atrophy.

Connected Paths works with CEOs on the organisational and operational changes that make AI adoption compound.