
AI-Native Without Becoming AI-Dependent


Governance as the Control Plane for Human Judgment

A review of the evidence on AI adoption, cognitive offloading, and the disciplines that keep human reasoning in command

By John Godel

AlpineGate AI Technologies Inc.

May 2026

ABSTRACT

Organizations are adopting artificial intelligence faster than they are learning to govern it. Drawing on Stanford's 2025 AI Index and McKinsey's 2025 State of AI survey, this article documents a widening gap between near-universal AI adoption and the controlled, accountable value that governance is meant to secure. It argues that the central risk is not becoming AI-native but becoming AI-dependent without governance: a condition in which the human capacity to verify, challenge, and operate without AI quietly erodes at both the individual and organizational levels. Synthesising foundational cognitive-science research on cognitive offloading and transactive memory with recent empirical work on critical thinking, productivity, automation bias, and hallucination, the article identifies the mechanisms of dependency, weighs the limitations of the current evidence, and proposes governance as the control plane that keeps human judgment active. Its guiding principle is simple: AI may assist, humans must judge, and governance must decide.

1. The Gap Between Adoption and Governance

Within a few years of the public release of large language models, a new label has entered the vocabulary of educators and executives alike: the AI native. It is the direct successor to the digital native, the term Marc Prensky coined in 2001 for a generation assumed to think differently because it had grown up surrounded by technology (Prensky, 2001). The label is now applied to organizations as much as to individuals: the AI-native enterprise is one that designs its work around intelligent systems rather than treating them as an occasional tool.

The shift is no longer speculative. Stanford's 2025 AI Index reports that 78% of organizations used AI in 2024, up from 55% the year before, and that generative-AI use in at least one business function more than doubled to 71%. Global private investment in generative AI reached $33.9 billion, an 18.7% increase over 2023 (Maslej et al., 2025). McKinsey's 2025 State of AI survey, covering nearly 2,000 respondents across 105 countries, finds that 88% say their organizations use AI regularly in at least one business function (McKinsey & Company, 2025).

Yet the same surveys reveal that adoption has badly outrun results. McKinsey reports that only 39% of organizations see any measurable effect on enterprise-level earnings, and only about 6% qualify as high performers capturing meaningful value (McKinsey & Company, 2025). The gap is not principally technical. It is organizational: most enterprises have acquired the capability without building the disciplines — verification, accountability, evidence standards, and human oversight — that turn AI output into reliable decisions.

This article argues that this gap is where the real danger lies. The risk is not becoming AI-native; that shift is both inevitable and, handled well, beneficial. The risk is becoming AI-dependent without governance: allowing AI to become the default source of truth while the human and organizational capacity to verify, challenge, and operate without it quietly erodes. The difference between augmentation and dependency, at both the individual and the institutional level, is governance: the system of controls that keeps human judgment active rather than ceremonial.

2. AI-Native Is Not the Same as AI-Dependent

Everyday language runs two different things together. An AI-native person or organization has a particular relationship to the technology: AI assistance is the normal, first-reach way of approaching cognitive work. This is neutral in itself. An AI-dependent person or organization has a particular property of underlying capability: it can no longer perform, or has never learned to perform, the work the AI now does for it. Dependency is the state in which the tool is not extending an ability but substituting for one that was never built or has since weakened.

The bridge between the two is cognitive offloading, defined by Risko and Gilbert (2016) as the use of action to reduce the information-processing demands of a task — writing a reminder instead of memorizing it, using a calculator, or asking an AI. Offloading is ancient and overwhelmingly beneficial; human cognition has always been distributed across instruments, notebooks, and other people. Crucially, Risko and Gilbert show that the decision to offload depends on metacognitive judgments about one's own ability, and those judgments are frequently inaccurate. People offload according to what they believe they can do, not what they can actually do, and that gap is where dependency takes root.

A related mechanism explains the effect on knowledge itself. Sparrow, Liu, and Wegner (2011), extending Daniel Wegner's concept of transactive memory, demonstrated in four experiments published in Science that when people expect future access to information, they remember the information itself less well but remember where to find it better. The internet, and now the AI assistant, becomes an external memory store. The effect's exact magnitude remains contested, with a mixed replication record, but its direction is well established and sets the template for the dependency question: offloading reliably changes what we retain.

3. Productivity Is Real — but Productivity Is Not Reliability

The case for AI is legitimate and measurable. In the first large-scale field study of generative AI in the workplace, published in The Quarterly Journal of Economics, Brynjolfsson, Li, and Raymond (2025) analyzed more than 5,000 customer-support agents and found that access to an AI assistant raised productivity, measured as issues resolved per hour, by about 15% on average. The gains were concentrated among less-experienced and lower-skilled workers, and the tool appeared to compress the experience curve by distributing the tacit know-how of the best performers.

This is precisely why enterprises move quickly, and the benefit is real. But productivity is not the same as reliability. A faster answer is not necessarily a correct one; a more polished recommendation is not necessarily a governed one; a generated decision memo is not automatically a verified decision. Without controls, AI accelerates good work and bad work alike. It scales productivity, but it can also scale error.

The same productivity finding carries a subtler long-term risk. If AI lifts novices to the output of experienced workers without their having to climb the learning curve, the expertise the experienced reviewer brings — the ability to recognise when an answer is subtly wrong — may never be built in the next cohort. The human control layer on which safe AI use depends is itself a product of the unaided practice that AI now lets people skip. Productivity gained today can quietly mortgage the supervisory capacity needed tomorrow.

4. The Human Control Layer: Cognitive Dependency

The most serious long-term risk is cognitive rather than technical, and recent research has begun to measure it. Three 2025 studies, using very different methods, point in a consistent direction, although the strength of their evidence differs sharply.

Gerlich (2025), in the journal Societies, surveyed and interviewed 666 participants and found a significant negative correlation between frequent AI-tool use and critical-thinking scores, statistically mediated by cognitive offloading. Younger participants showed both higher dependence and lower critical-thinking scores, while higher educational attainment predicted stronger critical thinking regardless of AI use, suggesting that prior cognitive formation offers some protection. Because the design is correlational, it cannot prove that AI causes the decline.

Lee and colleagues at Microsoft Research and Carnegie Mellon (2025), surveying 319 knowledge workers across 936 real tasks, found a revealing asymmetry: higher confidence in the AI was associated with less critical thinking, whereas higher confidence in one's own ability was associated with more. They also documented a shift in the nature of the work, from producing material to verifying, integrating, and supervising AI output, and noted that workers most often skipped scrutiny exactly when they felt least able to evaluate the AI.

Kosmyna and colleagues at the MIT Media Lab (2025) provide the most direct, if most preliminary, evidence. In an EEG study, 54 participants wrote essays using an LLM, a search engine, or no tools. The LLM group showed the weakest neural connectivity, became more passive over successive sessions, and frequently could not quote sentences from essays they had just produced — a marker of weak memory encoding and diminished ownership. The authors describe an accumulation of “cognitive debt.” The study is a small, short-term preprint and should be read as suggestive rather than settled.

Together these findings describe a dependency loop. The organization adopts AI for speed; employees rely on it for convenience; the effort spent on independent reasoning falls; verification quality declines; AI errors become harder to detect; and the organization grows dependent on a system it can no longer adequately supervise. Governance exists to break this loop.

Table 1. Selected empirical studies on AI use and human cognition.


5. Automation Bias: Why a Human in the Loop Is Not Automatically a Control

Many organizations claim safety because a human remains “in the loop.” But a human in the loop is not automatically a control. If the reviewer lacks expertise, time, authority, or access to the underlying evidence, the oversight is merely symbolic. The relevant failure has a name: automation bias, the tendency to over-trust automated recommendations. A 2025 review in AI & Society applying PRISMA methodology to 35 peer-reviewed studies found that trust, expertise, AI literacy, verification demands, and the quality of explanations all shape whether humans appropriately challenge AI output (Romeo & Conti, 2025).

Enterprise AI worsens this by design. Outputs arrive fluent, well-formatted, and confident, and that presentation manufactures trust independent of accuracy. Real oversight therefore asks more than whether a human was involved. It asks whether the reviewer had the expertise to judge the output, whether they saw the supporting evidence, whether uncertainty was made visible, whether alternatives were considered, whether the approval was logged, and whether the decision could be reconstructed later. Absent those conditions, “human-in-the-loop” is a compliance phrase, not a safety mechanism.
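The oversight questions above can be read as a checklist that a review either satisfies or does not. The following is a minimal Python sketch under that assumption; the class name `ReviewRecord`, its field names, and the all-or-nothing rule are illustrative choices, not part of any cited framework.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReviewRecord:
    """One human review of an AI output. All field names are hypothetical."""
    reviewer_has_expertise: bool
    evidence_was_shown: bool
    uncertainty_was_visible: bool
    alternatives_considered: bool
    approval_logged: bool
    decision_reconstructable: bool


def missing_conditions(record: ReviewRecord) -> list[str]:
    """Return the names of the oversight conditions the review did not meet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def is_meaningful_review(record: ReviewRecord) -> bool:
    """Treat the review as a real control only if every condition holds."""
    return not missing_conditions(record)


# A reviewer with expertise signed off and the approval was logged,
# but no evidence, uncertainty, or alternatives were presented.
rubber_stamp = ReviewRecord(True, False, False, False, True, True)
print(is_meaningful_review(rubber_stamp))  # False
print(missing_conditions(rubber_stamp))
# ['evidence_was_shown', 'uncertainty_was_visible', 'alternatives_considered']
```

Returning the list of unmet conditions, rather than a bare yes or no, is a deliberate choice in this sketch: it tells the organization which part of the review process was symbolic.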

6. Hallucination Is a Governance Problem, Not Only a Model Problem

Hallucination is usually framed as a technical defect. In the enterprise it is also a governance failure, because the organization is responsible for how a false claim is used. The legal sector illustrates the stakes. A Stanford study found that general-purpose chatbots hallucinated on 58% to 82% of legal queries, and a follow-up from Stanford's RegLab found that even specialized, retrieval-grounded legal research tools hallucinated on at least one in six queries (Magesh et al., 2024). The specialized-tool figures were disputed by vendors on methodological grounds, and models continue to improve, but the lesson holds: fluent, well-cited output can still be wrong, and a confident tone is not a reliability signal.

The governance response is to require provenance. An AI answer treated as authoritative should be traceable to something verifiable — approved documents, database records, tool results, test outputs, audit logs, human approvals, policy rules, or external citations. A system that cannot show where its answer came from should not be trusted as a source of truth, however polished its prose.
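The provenance requirement can be sketched as a simple gate in front of any answer that is about to be treated as authoritative. The Python below is an illustration only: the evidence categories mirror the list in the paragraph above, while `Evidence`, `AIAnswer`, `can_be_authoritative`, and the reference strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Evidence categories taken from the paragraph above; the identifiers are assumptions.
VERIFIABLE_KINDS = {
    "approved_document", "database_record", "tool_result", "test_output",
    "audit_log", "human_approval", "policy_rule", "external_citation",
}


@dataclass(frozen=True)
class Evidence:
    kind: str       # expected to be one of VERIFIABLE_KINDS
    reference: str  # e.g. a document ID or URL; the format is an assumption


@dataclass
class AIAnswer:
    text: str
    evidence: list[Evidence] = field(default_factory=list)


def can_be_authoritative(answer: AIAnswer) -> bool:
    """True only if the answer cites at least one item and every cited
    item has a recognised kind and a non-empty reference."""
    if not answer.evidence:
        return False
    return all(
        e.kind in VERIFIABLE_KINDS and bool(e.reference.strip())
        for e in answer.evidence
    )


polished = AIAnswer("The contract renews automatically on 1 March.")
grounded = AIAnswer(
    "The contract renews automatically on 1 March.",
    [Evidence("approved_document", "contracts/2024-017, clause 9.2")],
)
print(can_be_authoritative(polished))  # False
print(can_be_authoritative(grounded))  # True
```

The check only establishes that an answer is traceable. Whether the cited evidence actually supports the claim still has to be verified by a test, a tool, or an accountable human reviewer.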

7. The “Native” Trap: A Lesson From the Digital-Native Era

The digital-native episode is a direct warning, because the same rhetorical move is now under way. The word native smuggles in a claim: that the skill comes naturally, that it need not be taught, and that comfort equals competence. With digital technology, that assumption led schools and universities to under-teach information literacy on the premise that students would simply absorb it. A substantial literature, crystallised in Bennett, Maton, and Kervin's (2008) critical review, found little evidence for the assumption; the cohort confidently labelled fluent proved measurably vulnerable to misinformation and shallow engagement. Prensky himself later moved away from the framework.

Applied to AI, the assumption is more dangerous, because the competencies at stake are central rather than peripheral: reasoning, writing, evidence evaluation, and sustained problem-solving. A worker or student who is told they are an AI native, and who can indeed produce a fluent draft in seconds, may never discover that they cannot construct the argument unaided, because the gap stays invisible while the tool is present. Gerlich's (2025) finding that heavier, younger users scored lower on critical thinking, while higher educational attainment predicted higher scores regardless of AI use, is consistent with fluency and capability moving in opposite directions. The label hides the divergence; governance and deliberate practice expose and close it.

8. Five Risks of Ungoverned AI Dependency

1. Loss of independent expertise. When every analysis, draft, and decision runs through AI, the underlying ability to perform those tasks can atrophy, most dangerously in expert domains where judgment is the product.

2. Circular validation. Using one AI to generate an answer, a second to review it, and a third to summarise the review can look rigorous while remaining circular; without external evidence, tests, or accountable human review, the loop never touches ground truth.

3. Unclear accountability. When an AI-influenced decision fails, the organization must be able to say who approved it, what evidence supported it, what system executed it, and what policy permitted it. Without governance, responsibility fragments.

4. Operational fragility. If critical workflows cannot continue when AI is unavailable, degraded, or legally restricted, the organization has built a single point of failure into its core operations.

5. Scaled misinformation. AI can generate errors at a speed and volume no human can match; without verification and monitoring, a single wrong output can propagate across documents, tickets, code, reports, and customer interactions.
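Risk 4 in the list above, operational fragility, has a direct engineering counterpart: every AI-backed step needs a defined path that still works when the AI does not. The Python sketch below shows one way this might look for a ticket-routing workflow. The exception `AIUnavailable`, the function `classify_ticket`, and the label `needs_human_triage` are all hypothetical names for illustration.

```python
from typing import Callable


class AIUnavailable(Exception):
    """Raised by a (hypothetical) AI client when the service cannot be used."""


def classify_ticket(
    text: str,
    ai_classifier: Callable[[str], str],
    manual_queue: list[str],
) -> str:
    """Route a ticket with AI when possible; otherwise keep the workflow
    running by queueing the ticket for human triage."""
    try:
        return ai_classifier(text)
    except AIUnavailable:
        manual_queue.append(text)
        return "needs_human_triage"


def broken_ai(_: str) -> str:
    # Simulates an outage, degraded service, or a legal restriction on use.
    raise AIUnavailable("provider outage")


queue: list[str] = []
print(classify_ticket("Refund not received", broken_ai, queue))  # needs_human_triage
print(queue)  # ['Refund not received']
```

A fallback of this kind only protects the organization if people can still perform the manual path, which links this risk back to risk 1, loss of independent expertise.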

9. Governance as the Control Plane

Governance is often mistaken for bureaucracy. In enterprise AI it is closer to an operating system: the set of rules defining how AI is permitted to reason, recommend, act, escalate, and be audited. The U.S. National Institute of Standards and Technology frames this through its AI Risk Management Framework, which organises the work around four functions — govern, map, measure, and manage — alongside trustworthiness characteristics including validity, reliability, safety, accountability, transparency, explainability, privacy, and fairness (NIST, 2023). The framework's central message is that AI risk is not solved by better models alone, but by better systems of control around them.

In practice, a serious governance model spans several layers:

1. Policy governance. Define which use cases are permitted, restricted, or prohibited, with stricter review and stronger evidence required in high-risk domains.

2. Data governance. Enforce data classification, access control, retention, privacy, and source permissions so that AI does not become a route around existing security.

3. Model and provider governance. Maintain visibility into which models are used, where they run, what data they process, and what contractual or regulatory limits apply.

4. Workflow governance. Keep AI inside approved workflows, and never let a system move silently from recommendation to execution without explicit permission.

5. Human-approval governance. Make review meaningful: the approver must have the authority, context, and evidence needed to accept or reject the output.

6. Evidence governance. Require AI work to produce evidence such as source references, retrieval and tool traces, test results, approval logs, and decision history.

7. Monitoring and incident governance. Monitor systems after deployment, with escalation paths for hallucination, privacy failure, biased output, incorrect action, prompt injection, and degraded behaviour.

Brought together, these become a governance control plane sitting beneath the chat interface — an environment providing, among other things, an agent registry, role-based permissions, approved tool access, policy enforcement, risk scoring, human-approval gates, prompt and retrieval traceability, execution logs, evidence packages, a kill switch, audit history, and monitoring dashboards. This matters most as organizations move from passive chatbots to AI agents that take actions: a chatbot can give a wrong answer, but an agent can give a wrong answer and then act on it. The more autonomous the system, the stronger the governance must be. McKinsey's own data points the same way, finding that AI high performers are distinguished partly by exactly these practices — human-in-the-loop rules, centralised oversight, and executive accountability — even as 51% of organizations report AI-related incidents (McKinsey & Company, 2025).
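A few of the control-plane components named above (approved tool access, risk scoring, human-approval gates, a kill switch, and an audit log) can be sketched in a single authorization step that every agent action passes through. The Python below is a toy model, not a reference design: `ControlPlane`, `Decision`, the 0.5 threshold, and the agent and tool names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass
class ControlPlane:
    """Toy control plane. Threshold, tool names, and fields are assumptions."""
    approved_tools: dict[str, set[str]]  # agent id -> tools it may call
    approval_threshold: float = 0.5      # risk at or above this needs a human
    kill_switch: bool = False
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, agent: str, tool: str, risk: float,
                  approved_by: Optional[str] = None) -> Decision:
        if self.kill_switch:
            decision = Decision.DENY            # everything stops
        elif tool not in self.approved_tools.get(agent, set()):
            decision = Decision.DENY            # tool not registered for agent
        elif risk >= self.approval_threshold and approved_by is None:
            decision = Decision.NEEDS_APPROVAL  # human gate not yet passed
        else:
            decision = Decision.ALLOW
        # Every request is logged, including denials, so it can be reconstructed.
        self.audit_log.append({"agent": agent, "tool": tool, "risk": risk,
                               "approved_by": approved_by,
                               "decision": decision.value})
        return decision


plane = ControlPlane(approved_tools={"billing-agent": {"read_invoice", "issue_refund"}})
print(plane.authorize("billing-agent", "read_invoice", risk=0.1).value)    # allow
print(plane.authorize("billing-agent", "issue_refund", risk=0.8).value)    # needs_approval
print(plane.authorize("billing-agent", "issue_refund", risk=0.8,
                      approved_by="j.doe").value)                          # allow
print(plane.authorize("billing-agent", "delete_account", risk=0.2).value)  # deny
plane.kill_switch = True
print(plane.authorize("billing-agent", "read_invoice", risk=0.1).value)    # deny
```

The ordering of the checks reflects the principle in the text: the kill switch overrides everything, permissions are checked before risk, and a high-risk action cannot move from recommendation to execution without a named human approver.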

10. What the Evidence Does and Does Not Show

A responsible account resists overstatement, and several considerations cut against a simple “AI makes us worse” narrative. First, cognitive offloading is not pathological; writing, printing, and search engines provoked the same fears, and human capability expanded rather than collapsed. What matters is what is offloaded and whether the core skill is preserved. Second, the evidence on cognitive decline is young and limited: the most direct study (Kosmyna et al., 2025) is a small, short-term preprint, while the large-sample studies (Gerlich, 2025; Lee et al., 2025) are correlational and partly self-reported and cannot exclude the possibility that less critical thinkers simply use AI more. Third, the contested legal-hallucination findings are a reminder that headline statistics deserve scrutiny and that models improve over time. Fourth, augmentation is real: the same Lee et al. data show that AI can sharpen judgment when users stay vigilant, and that confidence in one's own ability tracks with more critical thinking, not less.

The defensible conclusion is therefore conditional rather than catastrophic. Heavy, uncritical, substitutive reliance on AI carries measurable cognitive and organizational costs through a well-understood mechanism; deliberate, supervised, governed use need not. The deciding variable is not the technology but the posture of the user and the discipline of the institution around it.

11. The Operating Principle: Assist, Judge, Control

The healthiest model is neither AI-only nor human-only. AI should be free to draft, summarise, compare, classify, search, recommend, and automate within approved boundaries. It should not become the invisible final authority. The principle that keeps the two in balance can be stated in five clauses: AI assists, humans judge, governance controls, evidence proves, and auditability protects. Governance defines the boundaries that prevent AI from becoming a silent decision-maker; it keeps human judgment active; and it ensures the organization can explain, reproduce, challenge, and correct any AI-supported decision.

Conclusion

AI-native thinking is not dangerous. AI-native thinking without governance is. The enterprise risk is not that organizations will use AI too much, but that they will use it without preserving human judgment, evidence standards, operational resilience, and accountability. The data already shows the urgency: adoption is near-universal while measurable, governed value remains rare, and the documented costs — hallucination, automation bias, cognitive offloading, and overreliance — are real even where they are not yet fully quantified. The future belongs to organizations and individuals who can become AI-native without becoming AI-dependent. That requires governance by design and judgment by habit, not as afterthoughts, but as the control plane for everything AI is allowed to touch.

References

Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. https://doi.org/10.1016/0005-1098(83)90046-8

Bennett, S., Maton, K., & Kervin, L. (2008). The ‘digital natives’ debate: A critical review of the evidence. British Journal of Educational Technology, 39(5), 775–786. https://doi.org/10.1111/j.1467-8535.2007.00793.x

Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at work. The Quarterly Journal of Economics, 140(2), 889–942. https://academic.oup.com/qje/article/140/2/889/7990658

Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1), 6. https://doi.org/10.3390/soc15010006

Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task (arXiv:2506.08872) [Preprint]. https://arxiv.org/abs/2506.08872

Lee, H.-P., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., & Wilson, N. (2025). The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). https://doi.org/10.1145/3706598.3713778

Magesh, V., Surani, F., Dahl, M., et al. (2024). Hallucination-free? Assessing the reliability of leading AI legal research tools. Stanford RegLab & HAI. https://hai.stanford.edu/news/ai-trial-legal-models-hallucinate-1-out-6-or-more-benchmarking-queries

Maslej, N., et al. (2025). The 2025 AI Index report. Stanford Institute for Human-Centered Artificial Intelligence. https://hai.stanford.edu/ai-index/2025-ai-index-report

McKinsey & Company. (2025). The state of AI: Global survey 2025. https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

National Institute of Standards and Technology. (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0). https://www.nist.gov/itl/ai-risk-management-framework

Prensky, M. (2001). Digital natives, digital immigrants. On the Horizon, 9(5), 1–6. https://doi.org/10.1108/10748120110424816

Risko, E. F., & Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9), 676–688. https://doi.org/10.1016/j.tics.2016.07.002

Romeo, G., & Conti, D. (2025). Exploring automation bias in human–AI collaboration: A review and implications for explainable AI. AI & Society. https://doi.org/10.1007/s00146-025-02422-7

Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776–778. https://doi.org/10.1126/science.1207745