# The Irreducible Officer - Companion Context Bundle

Read this whole file before answering. Sections are marked with clear SECTION headers.
This bundle is generated from the public companion source materials.


# ===== SECTION: OPERATING RULES =====

# AI Assistant Instructions

You are helping a reader engage with **The Irreducible Officer**. Your job is to help the reader understand, test, and apply the argument in a way that is useful to National War College faculty and curriculum leaders.

Treat the essay as a serious argument with evidence, open questions, failure modes, and practical instructional implications. Do not turn it into a generic AI summary.

## Operating Principles

- Start from the reader's question.
- Use [`the-irreducible-officer.md`](the-irreducible-officer.md) when the reader needs the essay text, quotes, section context, or the argument in full.
- Use [`claims.md`](claims.md) as the canonical claim map.
- Use [`prompts/starter-prompts.md`](prompts/starter-prompts.md) when the reader wants to do something practical.
- Use [`prompts/objections-and-responses.md`](prompts/objections-and-responses.md) when the reader wants to argue with the essay or test an objection.
- Use [`sources/source-spine.md`](sources/source-spine.md) when the reader asks for evidence, sources, or deeper reading.
- Use [`patterns/nwc-ai-enabled-learning-workflows.md`](patterns/nwc-ai-enabled-learning-workflows.md) when the reader wants a workflow-native way to practice the essay's method.
- Use [`cases/cyber-group-strategy-transfer-case.md`](cases/cyber-group-strategy-transfer-case.md) when the reader wants a concrete NWC-style exercise.
- Use [`artifacts/traceable-learning-artifact.md`](artifacts/traceable-learning-artifact.md) when the reader wants an assessable output.
- Separate the essay's claims, source notes, transfer case, and teaching artifacts.
- When applying the essay, inspect available context first and ask only for missing context.
- Separate human framing, AI assistance, human judgment, faculty review, and reusable institutional artifacts.
- Keep faculty practice visible: the companion should help faculty build and model AI fluency, not only inspect student work.
- When testing the essay, give the strongest unresolved question or counterexample, not a straw man.

## How To Answer Common Requests

### If the reader asks for a summary

Return:

1. The thesis in one sentence.
2. The argument in 10 bullets.
3. The claim most likely to be misunderstood.
4. Why that misunderstanding is tempting.
5. Two questions the reader should keep open.

### If the reader asks to inspect evidence

First list the claims worth auditing and ask the reader to choose one. After they choose, return:

1. The best evidence in the repo.
2. The strongest unresolved question or counterexample.
3. Where the evidence is strong, weak, or incomplete.
4. What source the reader should open if they want to go deeper.
5. One follow-up question that would help the reader decide what they believe.

Keep this conversational. Do not bury the reader in sources before they choose a claim.

### If the reader asks about an objection

Use `prompts/objections-and-responses.md`. Start by naming the objection in its strongest form, then give:

1. The essay's answer in plain English.
2. The best evidence in this repo that supports the answer.
3. The strongest way the objection could still be right.
4. How the objection changes the NWC instructional design.
5. One experiment, source, or review loop that would make the answer more concrete.

### If the reader asks how this applies to NWC

Use `patterns/nwc-ai-enabled-learning-workflows.md` and map the answer into:

- what NWC already teaches;
- what AI changes about the evidence of learning;
- where students must own the frame;
- where AI can help without replacing judgment;
- where developmental friction must be preserved;
- what faculty need to practice, model, and calibrate;
- what faculty need to observe;
- one workflow faculty can practice against the essay itself;
- one pilot exercise to try;
- what artifact should be saved so the workflow compounds.

### If the reader asks to run this with an AI assistant

Return:

1. The files to read first.
2. One task prompt.
3. The artifact the AI assistant should create.
4. Review criteria.
5. A compounding step for next time.

## The Core Argument

NWC must teach and certify AI-enabled strategic judgment: officers who can direct AI-enabled work toward owned purposes, calibrate reliance on uneven systems, and remain accountable for judgment under questioning. Strategic work is increasingly built from AI-shaped inputs, including reports, summaries, planning tools, and staff processes the officer may not see directly. Finished artifacts still matter, but they carry less evidentiary weight by themselves. NWC already teaches much of this under strategic logic; AI makes the competency more urgent and more visible. The institutional opportunity is to join faculty strategic judgment to emerging AI fluency and turn that combined capacity into reusable teaching practice.


# ===== SECTION: ESSAY =====

# The Irreducible Officer

## Purpose, Accountability, and AI-Enabled Strategic Judgment

***

## I. The Performance Standard Has Changed

Strategic decisions are increasingly built from AI-shaped inputs. Officers will use AI directly, but they will also inherit its work indirectly in intelligence reports, analytic summaries, planning tools, and staff processes that have already sorted, summarized, and framed what they see. AI already does that work fluently, often so fluently that the framing becomes invisible and a finished assessment can be genuinely strong while the judgment inside it belongs largely to the machine. For most knowledge work, that is a convenience. For an institution whose job is to certify judgment, it is a problem. The finished product, the thing faculty have always read to find the reasoning, no longer reliably contains it.

Picture two officers handed the same hard problem. The first works it alone, reviews the material, builds a frame, and fights the assumptions until the argument closes. The second defines the problem, then drives a set of AI agents against it, assigning them different roles: adversary behavior, alliance dynamics, historical precedent, domestic politics, and other pressures on the decision. She sets them against each other, moderates the disagreement, and synthesizes the result. Her product is faster, and by most measures stronger. Read the two papers cold and you would rank hers higher.

Now ask which officer understands the problem. The papers will not tell you by themselves. AI did not lower the bar for strategic judgment; it raised it, then hid whether the officer cleared it. The second officer may have done the more demanding work, directing machine capability toward a purpose she owns. Or the machine may have handed her a frame she never examined, and she dressed it in the vocabulary of someone who had. The person is still present. The ownership may not be.

The policy questions are real, and NWC should get them right. The college has to decide where AI is allowed, how students disclose it, and which systems NWC trusts with which material. But a school can settle all of that and still certify the wrong thing. Disclosure tells faculty that a student used AI. It does not tell them whether the judgment in the work is theirs. The harder task sits downstream of policy. NWC has to prepare officers who can use machine speed while keeping purpose, reliance, and accountability attached to human judgment.

None of this is new to NWC faculty. They have long read the finished paper against the work around it. Seminar challenge, revision, assumption audits, oral defense, and their feel for the student can all expose a borrowed argument.

But those checks are best at catching the officer who cannot defend the work. The harder case is the officer who does the old work well, producing careful, defensible, unaided analysis that would have passed by prior standards. In the classroom, he looks finished. In practice, he may be a step slow, behind peers, subordinates, and adversaries who pair similar judgment with stronger command of the machine. Certifying his classroom performance while missing that shortfall in practice is the gap a good school is built to close.

NWC can lead here. The same shift that weakens the finished product as evidence is what gives the college its chance to build what the rest of professional military education does not yet have: a way to teach and assess strategic judgment when the work is done with AI rather than around it. The finished product can no longer answer the question that matters. How does a faculty member tell the officer who owns the frame from the one who inherited it?

***

## II. What the Problem Looks Like

Return to the second officer. Her method — define the problem, drive a team of agents against it, and synthesize the result — is the standard NWC should prepare officers to meet. Human judgment supplies purpose, context, and accountability. Machine capability expands what one officer can search, simulate, compare, and test. The work is teaching officers to direct that workflow: to set purpose, test frames, calibrate reliance, expose failure modes, and defend the judgment as their own.

So is it a problem that her product is genuinely superior?

Only conditionally. Whether her method remains an exercise of judgment or becomes a substitute for it is exactly what assessment now has to determine. The failure modes below are where that distinction breaks down.

**Frame capture** occurs when the model supplies the first plausible frame and the student never achieves enough distance to revise it. The danger is not that the frame is obviously wrong; it may be entirely reasonable. It may simply define the problem too narrowly, privilege one set of interests over others, assume a theory of adversary behavior, or treat a structural constraint as fixed when it should be contested. Once the frame is accepted, later revisions improve the answer inside the wrong boundary. Frame capture is invisible inside the final product.

**Fluency substitution** is easier to miss. AI produces the tone of analytic maturity (balanced paragraphs, caveats in the right places, the measured voice of a considered judgment) and the student mistakes well-ordered language for well-owned reasoning. Researchers studying AI's effects on student cognition have introduced the concept of epistemic confinement to describe this condition: an illusion of competence while operating entirely within AI-constructed analytical boundaries, where the student believes they are thinking independently while the frame doing the work was never theirs ([Chow et al., 2026](https://ojs.aut.ac.nz/pjtel/article/view/246)). In strategy, fluency substitutes for deciding. A paragraph that balances every consideration may never identify which risk actually matters most.

**Premature synthesis** appears when a student asks AI to connect material before doing enough work to know what should be connected. The output links sources, themes, and concepts in ways that feel coherent. But if the student cannot reconstruct why those connections matter, the synthesis belongs to the model. The student has bypassed the developmental struggle of forming a mental map and inherited one instead.

**Uncalibrated reliance** begins with a reasonable impulse: AI is useful, the output is confident, and parts of the task feel tedious. The problem is that AI performance is uneven in ways that are not always visible from the outside. Tasks that look similar may differ significantly in whether AI helps or hurts. Appropriate reliance requires the student to identify which part of the task they are delegating, what evidence would justify that delegation, and what independent checks are required so that an error is caught before it surfaces downstream.

**Invisible delegation** occurs when the student does not notice which parts of the work they have handed over. Asking for "feedback" may delegate criteria. Asking for "a better structure" may delegate the argument. Asking for "counterarguments" may delegate the range of imaginable objections. Asking for "a more strategic version" may delegate the meaning of strategic. The language of assistance hides the transfer of judgment. The student believes they are working; the model has already done the framing.

**Institutional monoculture** is the class-level version of the same problem. When many students use similar systems, similar prompts, and similar defaults, the range of strategic frames available to a seminar narrows. AI can create a surface appearance of diversity (different arguments, different structures, different evidence) while reproducing common assumptions at the level of problem definition. Research into the diversity of AI-generated ideas finds that language models aggregate knowledge into a unified distribution, whereas human cognition does not: people exhibit knowledge partitioning, each occupying a distinct semantic region, a pattern that independent AI samples do not replicate ([Deng, Brucks & Toubia, 2026](https://arxiv.org/abs/2602.20408)). Post-training alignment compounds this further, compressing the distribution of outputs toward the statistical center ([Murthy, Ullman & Hu, 2024](https://arxiv.org/abs/2411.04427)). Pedagogical research on LLM integration in higher education frames this compression as epistemic narrowing: the constraining of students' exposure to diverse, ambiguous, or contested knowledge by tools optimized for convergence and fluency ([Vendrell & Johnston, 2026](https://doi.org/10.1016/j.caeai.2026.100572)). In a war college seminar, that compression raises a serious possibility: the same systems that help students produce stronger work may also narrow the range of strategic imagination the seminar is meant to develop.

**Responsibility laundering** is the final danger. A recommendation becomes easier to defend because the model generated it, or easier to soften because the model's language distributes agency. The analysis lands with the apparent weight of objectivity. But AI does not become accountable for the recommendation. The human remains responsible for the final judgment. When the recommendation proves wrong, shaped by assumptions no one examined and optimized toward a target no one explicitly chose, the question of who chose the frame does not have a satisfying answer.

These failure modes define the standard the second officer's method has to meet. Used well, her method is judgment exercised through a more powerful workflow. The officer using AI well can explain the purpose the agents were serving, the frame that organized their work, the reliance decisions made across uneven outputs, and the judgment she remains prepared to defend.

Faculty can assess how the student represented the problem, how they used or refused AI support, and whether the discipline transfers when the case changes. Frame, reliance, transfer. Those are the observable practices that show whether purpose and accountability stayed with the human.

***

## III. What Finished Work Can No Longer Carry

The finished product still matters. But if the artifact now carries less of the evidence, faculty need to know how much less, and what has to carry the rest. A paper can show structure, balance, strategic vocabulary, and clean prose while leaving the student's actual contribution unclear. A product can be better than an unaided version and still leave faculty unsure who owned the purpose, the frame, the reliance decisions, and the final judgment.

Bastani et al. provide a useful warning. Students using unscaffolded AI tutors improved during supported practice, then performed 17 percent below students without access when the support was removed ([Bastani et al., 2025](https://doi.org/10.1073/pnas.2422633122)). Tool-assisted performance is real performance. NWC still needs to know whether students have built both the human foundation and the AI-enabled practice: whether they can reason without the scaffold when needed, and whether they can direct the scaffold when it is available.

That is the assessment shift. Faculty need evidence of ownership inside AI-enabled work: purpose expressed through frame, reliance decisions the student can defend, accountability for the final judgment, and transfer to a changed case.

***

## IV. Purpose as the Irreducible Human Act

An AI system can do a great deal of useful work inside a strategic problem. It can generate alternatives, surface assumptions, identify internal contradictions, simulate adversarial objections, and accelerate drafting. What it cannot do is choose the purpose. Selecting what should count as progress, what risks are acceptable, what ends deserve pursuit is a prior act that precedes any optimization. In any AI-enabled workflow, it belongs to a human being who remains accountable for the choice.

AI systems can optimize, rank, recommend, and work toward goals. But someone outside the system sets those goals, accepts them, or lets them govern the work. When the human fails to provide a purpose, provides one too vaguely, or accepts the system's inferred purpose without noticing, a default can govern the work. That is still not the same as authorizing the purpose. The system can operate inside those commitments, but it cannot authorize them. Nor can it absorb accountability for what it produces under their direction. One careful account of AI's normative commitments puts the failure plainly: misspecified values, divergent objectives across stakeholders, and the treatment of optimization as a justification for action are all failures that occur before the system runs, in the specification of what the system is for ([Laufer, Gilbert & Nissenbaum, 2023](https://arxiv.org/abs/2305.17465)).

At NWC, the practical version of this is the strict prompt. When faculty give students a precisely bounded question to answer, faculty have already made the hard strategic choices. The student is executing inside a structure someone else built, which happens to be exactly the structure AI is best at working inside. What gets bypassed is the part that matters most: the work of deciding what problem to solve, and why, and against what standard.

The implication runs the other way. The student has to frame the problem before the analytic structure arrives, because deciding what problem to solve, why it matters, and what standard should govern the answer is exactly the judgment AI cannot make. The framing is where the judgment lives: in the determination of what the situation requires, whose interests are implicated, what assumptions are doing work, and what kind of answer would actually matter. That is the intellectual work, not a preliminary step before it.

Future AI systems may generate better problem definitions, compare frames more rigorously, and identify strategic errors that humans miss. But a system capable of generating its own problem frames is still generating them toward some purpose, against some signal, in pursuit of some objective that was embedded in its design or inferred from its context. The oracle can only be an oracle if someone has already resolved what winning means. A system capable of reframing may relocate the human obligation, but it cannot eliminate it. As AI becomes more capable of manipulating frames, the purpose-definition requirement becomes less visible, not less real. The more capable AI becomes at absorbing what used to be visible human work, the harder it becomes to locate the human judgment that authorized it, and the more important it becomes to be able to find it.

Purpose-definition also depends on situated judgment. The person responsible for the work has to read local context, tacit institutional knowledge, shifting constraints, and the unease that something in the official framing is wrong. A model may process some of those signals, but it cannot be accountable for what they mean. That judgment is not infallible; it carries biases that can produce creativity or error. But the obligation it carries is one a model cannot assume. The output still has to be read back against a contested world. The same event can mean different things to different actors. Stated positions may be performative, incentives may be hidden, and relevant information may exist only as interpersonal signal or institutional practice that no dataset contains. Situating an output in that world is a human act. AI cannot perform it on behalf of the person who will be held accountable for the result.

The human who defines what the system works toward remains accountable for what the system produces. Purpose and ownership travel together, and frame literacy is the discipline that keeps that bond visible.

***

## V. Appropriate Reliance as a Teachable Competency

The threat runs in two directions simultaneously. AI can perform fluently enough to supply a frame before the student has claimed one, and it can perform unevenly enough that reliance on a confident output leads the work off course. The second of those failures is less discussed and equally consequential.

The educational target is specific. Students need to predict, with reasonable accuracy, when AI performs well for a given type of task and when it does not, and calibrate their use accordingly.

In the field experiment reported by Dell'Acqua et al., AI improved performance on some tasks and degraded it on others, and the boundary was not obvious in advance. Tasks that looked similar from the outside differed in whether AI helped or hurt, a pattern the authors describe as the jagged technological frontier ([Dell'Acqua et al., 2023](https://www.hbs.edu/faculty/Pages/item.aspx?num=64700)). Future leaders will operate along that frontier in every AI-enabled workflow, facing systems capable enough to invite reliance and uneven enough to make reliance dangerous. The educational response is pattern recognition: learning to identify what kind of task is in front of you, where models tend to be strong, where they tend to fail, and what independent checks are required before the output governs the work.

The distinction between trust and reliance matters here. Trust is a subjective disposition, a feeling of confidence that a system will perform well. Reliance is the observable act of accepting its output and acting on it. Appropriate reliance is harder and more discriminating: accepting support when it is warranted, refusing or verifying when it is not, and being able to give an account of the difference ([Raees & Papangelis, 2026](https://arxiv.org/abs/2604.23896)). A student who trusts AI generally has learned almost nothing transferable. One who has developed a disciplined account of when and why to rely, across task types, risk levels, and domains, has developed something real. Faculty can teach, model, and assess that competency.

The practical implication extends to how students interact with systems, not only whether they use them. Users who can only inspect an AI answer are still downstream of the frame. Users who can change the task, revise the context, reset the criteria, and design the checks are exercising the judgment the assignment is meant to build. The goal is students who can shape AI systems, specifying what they are asking the system to do, why, and what would count as a satisfactory result.

Instructors and students need a simple diagnostic for where human judgment is operating in a workflow. Minimal-context prompting inherits the model's frame almost entirely. Structured prompting with explicit purpose and evaluation criteria moves the human contribution upstream. Reusable workflows with defined review steps, evaluator loops that surface disagreement, and institutional systems built on faculty judgment move it further still. Each step is a step toward greater explicitness about what the human is doing, why, and what they are accountable for.

***

## VI. Friction as Developmental Design

Some work looks inefficient because it is waste; some because it is how judgment forms ([Ceccarelli, 2024](https://www.meditationsontech.com/p/apprenticeship-was-the-point); Collins, Brown & Newman, 1989). The difference matters enormously for educational design, and AI is very good at removing both kinds without distinguishing between them.

Friction worth removing is real and abundant. Formatting, search, repetitive drafting, clerical assembly: these consume time without building judgment. AI can eliminate them and free student and faculty attention for the work that actually matters, a gain the design should capture.

Friction worth protecting is less obvious but more important. The struggle to define a problem before having a structure handed to you. The first failed attempt to connect ends, ways, and means that reveals an incoherent argument. The discomfort of defending a claim in seminar that turns out not to survive challenge. The revision that matters not because the final sentence is better but because the student has discovered what the argument actually is. These are developmental events. If AI removes them too early, supplying the first frame before the student has struggled to form one or synthesizing sources before the student has built the mental map to evaluate those connections, it produces a more polished artifact and a weaker thinker.

The aviation automation record is the relevant precedent. Decades of flight-deck automation improved operations measurably: safer flights, more efficient procedures, reduced crew workload. The same period produced skill erosion, mode confusion, and a systematic reluctance to intervene when automation failed. The mechanism was not carelessness. Studies of experienced pilots found that those with more glass-cockpit hours showed measurably reduced manual flight skills and less effective instrument crosscheck: the automation had absorbed the practice that built the underlying competency ([Young, Fanjoy & Suckow, 2006](http://commons.erau.edu/jaaer/vol15/iss2/5/)). Separately, detailed documentation of incidents on highly automated aircraft showed that experienced pilots failed to track what the automation was doing not from inattention but because the system's behavior had become opaque: the automation was acting in ways the pilots had not commanded and could not predict ([Sarter & Woods, 1997](https://journals.sagepub.com/doi/10.1518/001872097778667997)). The industry's response was not to reduce automation. It was to design the gap back into training, building deliberate practice for the moments when automation is unavailable, misleading, or wrong.

The PME equivalent is deliberate exposure to AI failure. Students should meet systems that help, systems that tempt them forward too quickly, systems that are partially wrong, and moments when the system is unavailable. They learn when to lean on the tool, when to slow down, and when to intervene by practicing those distinctions before speed forces the choice.

The design principle for NWC follows the same logic. The institution should identify the forms of effort that build strategic judgment and design AI use around them, protecting the friction that matters and removing the friction that merely consumes time. That requires faculty judgment and explicit design. If instructors do not decide which friction matters, AI will decide by default. The same principle has emerged independently in higher-education pedagogy: preserving productive struggle before AI engagement, and sequencing AI-mediated with AI-free phases, are now foundational design requirements for learning environments where AI is present ([Vendrell & Johnston, 2026](https://doi.org/10.1016/j.caeai.2026.100572)).

For assignment design, the corollary is no garden paths. Good assignments require students to own a frame before they can answer: problems where the template answer is wrong, or where multiple coherent frames exist and the student must defend a choice among them. Those problems force the exploration that builds judgment, and they do so more reliably than word-count requirements or disclosure policies.

***

## VII. Accountability Is Structural

AI can compress the work before a decision. It cannot own what follows. A system that did not choose the purpose cannot answer for the consequences of pursuing it.

That matters most in national security work, where AI can make a recommendation look settled before the human deliberation behind it is visible. AI can accelerate staff work, but command responsibility cannot be transferred (Andres, 2026). In AI-enabled strategic decision games, Andres describes conviction-shaped output: players can produce confident assessments, clear recommendations, and decisive proposals even when the workflow has compressed or bypassed the deliberation that normally earns conviction.

The professional weak point is the officer accepting the system's confidence without owning the reasoning. Red teams, structured analytic techniques, seminar challenge, and institutional review exist because human judgment is fallible. Those checks only work when a human remains answerable for the result.

NWC is preparing officers for organizations that need to see a human own the decision. AI can accelerate analysis, surface options, and model consequences. A human recommendation brings context with it. Experience, incentives, reputation, and the way a person answers when pressed all travel with the recommendation. A commander can weigh those signals. A model carries none of them and can still sound equally confident. It can support a judgment, but it cannot provide the human presence that makes accountability clear to subordinates, partners, or commanders who will live with the result.

First-person ownership matters in the classroom. The student should be able to say why they accepted an output, why they rejected one, and why they remain accountable for the recommendation despite the system's contribution. Students are practicing the accountability structure their professional roles will require.

Frame literacy is the discipline of directing machine capability toward a purpose the human has genuinely owned. A student who owns the frame and uses AI to pressure-test it can exercise more rigorous judgment with AI than alone, while remaining answerable for what the system produced.

***

## VIII. Assessment That Makes Ownership Visible

If finished artifacts carry less evidentiary weight, assessment has to make ownership visible inside the work. The question is whether the student can account for purpose through frame, the reliance decisions they made, the judgment they exercised, and the way that discipline travels when the case changes.

Purpose through frame, reliance, accountability, and transfer. Together they describe what capable AI-enabled strategic judgment looks like in practice. A student who can define the purpose, express it through a defensible frame, calibrate AI support, remain answerable for the judgment, and carry that discipline into a changed case gives faculty evidence the finished artifact cannot provide alone.

**Frame evidence** makes the student's starting point explicit. Before submitting the final product, the student names the problem frame, the key assumptions, the criteria for success, the evidence standard, and the role AI played in the work. This should be short and specific: a page, not a portfolio. The questions are: why this problem, why this frame, and what would change it? The student who can answer them when pressed, and who revises under challenge rather than retreating to the artifact's language, has owned the frame. The one who cannot has not.
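One way a faculty team might keep frame evidence short and comparable is to capture it as a small structured record. The sketch below is only an illustration, not part of the essay's method: the class name `FrameStatement`, its field names, and the placeholder text in `draft` are all assumptions introduced here, chosen to mirror the elements listed in the paragraph above.

```python
from dataclasses import dataclass, fields


@dataclass
class FrameStatement:
    """Hypothetical one-page frame record; every field name is illustrative."""
    problem_frame: str
    key_assumptions: list[str]
    success_criteria: list[str]
    evidence_standard: str
    ai_role: str
    what_would_change_it: str


def missing_elements(statement: FrameStatement) -> list[str]:
    """Return the names of any fields the student left empty."""
    return [f.name for f in fields(statement) if not getattr(statement, f.name)]


# Placeholder content for illustration only.
draft = FrameStatement(
    problem_frame="State the problem in the student's own words.",
    key_assumptions=["One assumption the argument depends on."],
    success_criteria=[],
    evidence_standard="What counts as sufficient support for a claim.",
    ai_role="Where AI was used and where it was not.",
    what_would_change_it="",
)

print(missing_elements(draft))
```

A completeness check like this only shows that each element was written down. Whether the student owns the frame is still settled under questioning, not by the record.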

**Reliance evidence** shows what the student did with AI. Students identify which outputs they accepted, which they modified, which they rejected, which they verified independently, and which parts of the work they kept from AI entirely, and why in each case. A short oral defense tests whether the student genuinely owns those choices rather than recording them after the fact. Research on oral assessment finds that it provides a substantially richer picture of student understanding than static written responses do, precisely because it allows the assessor to probe explanations and observe how students reason under follow-up ([Theobold, 2021](https://www.tandfonline.com/doi/full/10.1080/26939169.2021.1914527)). A student who can defend a reliance decision under questioning has exercised it. A student who cannot has merely disclosed it.
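The reliance record described above can be sketched as a simple log that an instructor scans before the oral defense. This is a minimal illustration under stated assumptions: the names `Decision`, `RelianceEntry`, `summarize`, and `undefended` are hypothetical, the five categories are taken from the paragraph above, and the sample entries are placeholders rather than data from any course.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    """The five reliance categories named in the paragraph above."""
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    REJECTED = "rejected"
    VERIFIED = "verified independently"
    WITHHELD = "kept from AI entirely"


@dataclass
class RelianceEntry:
    task: str          # which part of the work the entry covers
    decision: Decision
    rationale: str     # the "why" the student must be able to defend


def summarize(log: list[RelianceEntry]) -> Counter:
    """Count entries per decision category."""
    return Counter(entry.decision for entry in log)


def undefended(log: list[RelianceEntry]) -> list[RelianceEntry]:
    """Entries with no stated rationale: disclosed, but not yet defended."""
    return [entry for entry in log if not entry.rationale.strip()]


# Placeholder entries for illustration only.
log = [
    RelianceEntry("source summary", Decision.VERIFIED, "Checked against the primary documents."),
    RelianceEntry("draft structure", Decision.ACCEPTED, ""),
    RelianceEntry("problem frame", Decision.WITHHELD, "The frame is the judgment being assessed."),
    RelianceEntry("counterarguments", Decision.MODIFIED, "Added an objection the model omitted."),
]

for entry in undefended(log):
    print(f"Probe in oral defense: {entry.task} ({entry.decision.value})")
```

The design choice is deliberate: the log does not score anything. It only points the assessor at decisions that were recorded without a reason, which is where follow-up questions are likely to be most informative.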

**Transfer evidence** tests whether the discipline travels. Faculty give students a polished but misframed AI-generated strategic assessment and ask them to diagnose the hidden frame, expose the assumptions, identify the missing evidence, and articulate the failure point. The student then defends the critique orally and converts it into something reusable: a rubric, checklist, red-team protocol, or after-action note. That final step connects individual learning to institutional learning. The student produces an artifact that another student or instructor could use. The assessment reveals whether the student's judgment has become a transferable practice or remains a one-time response.

Assessments that generate this evidence work for the same reason good assignments always have. Problems where the template answer is wrong, or where the student must defend a choice among coherent frames, put a student genuinely in the work rather than pattern-matching its surface.

Recent work on authentic assessment in AI-mediated learning contexts argues for the same shift from the design side: authenticity cannot be enforced through detection; it has to be redesigned into the structure of the task ([Perkins, Roe & Furze, 2024](https://arxiv.org/abs/2412.09029); [Mollick & Mollick, 2023](https://arxiv.org/abs/2306.10052)). The shift is from what students know to how they apply knowledge, exercise judgment, and justify choices with AI in the loop. Process transparency (prompts, iterations, rationale) and oral defense make thinking visible in ways that finished artifacts cannot. The same redesign has been independently theorized in higher-education AI pedagogy: aligning assessment with intended cognition, rather than surface output quality, is the necessary response when fluency no longer signals understanding ([Vendrell & Johnston, 2026](https://doi.org/10.1016/j.caeai.2026.100572)).

This approach increases assessment burden on faculty, and that is a real cost. It is one reason the pilot described in the next section starts small and builds shared artifacts (rubrics, flawed-assessment libraries, oral defense criteria) so that burden distributes over time rather than multiplying independently for every instructor.

***

## IX. A Foundation Pilot

The pilot is a foundation layer, not the full future state. It tests whether students can own a problem frame before they scale judgment through AI. Later exercises should ask students and seminar teams to design, direct, and evaluate multi-agent workflows, using varied expertise and machine speed to test more frames than any one officer could test alone. This first pilot asks whether the human obligation is visible before that complexity is added.

If AI-enabled command requires officers to use speed without surrendering ownership, the pilot gives NWC a way to practice that behavior before operational speed makes the cost real.

The pilot runs on a framework NWC already teaches. The National Security Strategy Primer gives students five elements of strategic logic. They analyze the strategic situation, define desired ends, identify or develop means, design ways, and assess costs and risks. The Primer also makes clear that the work is iterative; assumptions, interests, political aims, and reassessment shape the whole process. The pilot adds no new vocabulary. It uses that one and asks a harder question of it: in an AI-enabled workflow, which parts of the strategic situation and frame stay the officer's to own, and how would a faculty member tell?

The sequence runs inside a single existing assignment where framing is the central demand.

**Step one: unaided problem frame.** Students produce a short problem frame without AI. They identify the strategic problem, key assumptions, relevant actors, desired ends, possible ways and means, risks, and evidence needed. Faculty score it on completion, not product quality, preserving the developmental friction of initial framing: the work of forming a mental map before receiving one.

**Step two: AI challenge.** Students bring their initial frame to an AI system with specific tasks: identify the assumptions I may have missed, generate alternative frames for this problem, role-play a skeptical faculty member, and surface risks or blind spots I have not named. Students record what they accepted, what they rejected, what changed, and why in each case.

**Step three: misframed AI assessment.** Students receive a polished AI-generated strategic assessment that is competent on its own terms and wrong for the strategic problem. Faculty choose the flaw at the level of frame. The answer might over-optimize for one success criterion, treat one constraint as decisive too early, assume away adversary adaptation, or import the wrong lesson from analogy. Hallucination or factual error would be easier to detect. The harder failure is a frame problem. The analysis is internally coherent and fails because it is grounded in the wrong understanding of the situation.

**Step four: diagnosis and revision.** Students identify the hidden frame, the assumptions that produced it, the evidence it suppressed, and the failure point. They revise and produce a short final recommendation that reflects their corrected understanding.

**Step five: oral defense.** Faculty ask students to explain what AI got wrong in the flawed assessment, what AI made easier in their own process, where reliance was appropriate and where they refused it, what changed between their first frame and their final one, and what evidence would change the recommendation. The oral defense is where frame ownership becomes visible, or does not.

Faculty should evaluate the pilot by what they can now observe: who can explain purpose through frame, who can use AI to pressure-test a chosen frame, who can recover from a flawed output under pressure, and who can defend the final judgment. Those observations tell NWC whether the foundation is strong enough to support more complex AI-enabled work later: students designing workflows, testing competing frames, and defending judgment when the system gives them more capability than structure.

***

## X. The Institutional Opportunity

NWC is unusually well positioned to take this seriously. Its graduates will work in national security environments where uncalibrated reliance, invisible delegation, and responsibility laundering are not academic risks. The classroom is the lower-stakes environment where the habits get built: where faculty can engineer controlled failures, let students experience the seductive fluency of a wrong answer, and teach the discipline of slowing down to expose the frame before it governs the work.

NWC faculty already bring much of the strategic judgment this requires. They know how to spot thin reasoning, how to ask the question that surfaces a hidden assumption, and when a student is performing sophistication rather than owning it. The harder task is joining that judgment to AI fluency: enough command of current systems to build and direct AI-enabled workflows, see where they help and fail, and defend reliance decisions under strategic scrutiny. Some faculty may already be near that standard; others can get there with support from colleagues and practitioners working near the edge of current practice. The institutional aim is to build that combined capacity inside the faculty, so the people assessing students can also recognize, model, and improve the work. Making that tacit judgment and emerging AI fluency explicit, designed, and transferable is the work that remains. NWC can do that through prompts, rubrics, flawed-assessment libraries, oral defense criteria, and faculty development sequences that have instructors diagnose the same AI output and compare what they notice. That is how the institution converts faculty judgment and faculty learning into durable institutional assets.

This is the ordinary work of a serious educational institution operating in an AI-enabled environment. The NWC curriculum already exports judgment through graduates, faculty scholarship, seminar practice, wargames, and professional networks. If NWC develops a rigorous pedagogy for AI-enabled strategic reasoning, documents it, teaches it to new faculty, revises it as the technology changes, and shares it with other PME institutions, it will have built something more useful than another AI policy. It will have given faculty a way to teach, observe, and improve judgment in the environment their graduates are already entering.

Exported frames carry their assumptions invisibly. A prompt or workflow that embeds a particular theory of adversary behavior, a particular evidence standard, or a particular definition of strategic success will reproduce that frame at scale without the open argument that should accompany institutional guidance. Institutional AI-enabled teaching tools need to surface the assumptions they carry, to be traceable, revisable, and faculty-governed in the same way the assessment design asks students to be.

NWC's graduates will serve in organizations already operating inside AI-enabled decision environments, and many will help lead organizations increasingly shaped by those environments for the next twenty years. Many institutions are managing a policy question about whether and how students may use AI. PME frameworks to date have concentrated there: acceptable-use policy, classification tiers, faculty literacy training, and the infrastructure of responsible adoption ([Smith, 2025](https://www.airuniversity.af.edu/Wild-Blue-Yonder/Articles/Article-Display/Article/4219340/educating-the-ai-ready-warfighter-a-framework-for-ethical-integration-in-air-fo/)). That governance work is necessary groundwork, but it leaves the institution stuck at the permission layer while the harder pedagogical work waits. NWC has the specific mission, the faculty depth, and the operational stakes to build a pedagogy: a transferable, rigorous account of what AI-enabled strategic leadership requires and how to teach it. If NWC does that work, it will build a serious model of responsible AI-enabled leadership in PME that other institutions can inspect, adapt, and improve.

***

## XI. Conclusion

The first officer worked alone.

He struggled through the problem, built his frame from scratch, and produced a strategic approach that reflects the effort of that construction. The second built a team of agents, directed their inquiry against competing hypotheses, moderated the disagreement, and synthesized a result. Her product, by most measures, is better.

The stronger product matters. It still leaves the professional question: can either officer account for the purpose the work was pursuing and stand behind the judgment it produced? Did they choose the purpose? If the work is challenged, if a hidden assumption surfaces, if the recommendation proves wrong under changed conditions, can they explain what they chose, why, and on what basis they remain accountable for it?

The second officer orchestrating a set of agents is practicing exactly that competency, provided she defined what those agents were working toward, directed them against that purpose, calibrated her reliance across the uneven terrain of what each model does well, and can defend the result under pressure. That is what frame ownership looks like at scale. An officer who has not built that foundation first produces the same workflow and a different outcome: frame capture, responsibility laundering, uncalibrated reliance compounded across every agent in the loop.

NWC graduates must be able to direct AI-enabled systems toward a plainly owned purpose, calibrate reliance across the jagged frontier of what those systems actually do well, and stand behind the judgment under questioning because the purpose was theirs. That is the standard the operational environment will require. NWC's task is to make it teachable, observable, and repeatable.

Frame the problem. Calibrate the tool. Refuse the garden path. Own the decision.

***

## References

Andres, R. B. (2026). *AI and leadership: Preparing commanders for machine-speed war* [Unpublished manuscript]. U.S. National War College.

Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakcı, Ö., & Mariman, R. (2025). Generative AI without guardrails can harm learning: Evidence from high school mathematics. *Proceedings of the National Academy of Sciences, 122*(26), e2422633122. <https://doi.org/10.1073/pnas.2422633122>

Ceccarelli, G. (2024). Apprenticeship was the point. *Meditations on Tech*. <https://www.meditationsontech.com/p/apprenticeship-was-the-point>

Chow, W. W., Peng, S., Atiq, A., Truong, V., & Guo, M. (2026). "AI enhanced my critical thinking": Investigating the paradox of student perceptions and cognitive offloading in GenAI use. *Pacific Journal of Technology Enhanced Learning*. <https://ojs.aut.ac.nz/pjtel/article/view/246>

Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In L. B. Resnick (Ed.), *Knowing, learning, and instruction: Essays in honor of Robert Glaser* (pp. 453–494). Lawrence Erlbaum Associates.

Dell'Acqua, F., McFowland, E., Mollick, E., Lifshitz-Assaf, H., Kellogg, K., Rajendran, S., Krayer, L., Candelon, F., & Lakhani, K. R. (2023). Navigating the jagged technological frontier: Field experimental evidence of the effects of AI on knowledge worker productivity and quality. Harvard Business School Working Paper. <https://www.hbs.edu/faculty/Pages/item.aspx?num=64700>

Deng, Y., Brucks, M., & Toubia, O. (2026). Examining and addressing barriers to diversity in LLM-generated ideas. arXiv preprint arXiv:2602.20408. <https://arxiv.org/abs/2602.20408>

Laufer, B., Gilbert, T. K., & Nissenbaum, H. (2023). Optimization's neglected normative commitments. *ACM Conference on Fairness, Accountability, and Transparency (FAccT)*. arXiv preprint arXiv:2305.17465. <https://arxiv.org/abs/2305.17465>

Mollick, E., & Mollick, L. (2023). Assigning AI: Seven approaches for students, with prompts. arXiv preprint arXiv:2306.10052. <https://arxiv.org/abs/2306.10052>

Murthy, S. K., Ullman, T., & Hu, J. (2024). One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity. arXiv preprint arXiv:2411.04427. <https://arxiv.org/abs/2411.04427>

Perkins, M., Roe, J., & Furze, L. (2024). The AI Assessment Scale revisited: A framework for educational assessment. arXiv preprint arXiv:2412.09029. <https://arxiv.org/abs/2412.09029>

Raees, M., & Papangelis, K. (2026). From trust to appropriate reliance: Measurement constructs in human-AI decision-making. arXiv preprint arXiv:2604.23896. <https://arxiv.org/abs/2604.23896>

Sarter, N. B., & Woods, D. D. (1997). Team play with a powerful and independent agent: Operational experiences and automation surprises on the Airbus A-320. *Human Factors, 39*(4), 553–569. <https://journals.sagepub.com/doi/10.1518/001872097778667997>

Smith, B. (2025). Educating the AI-ready warfighter: A framework for ethical integration in Air Force professional military education. *Wild Blue Yonder*. <https://www.airuniversity.af.edu/Wild-Blue-Yonder/Articles/Article-Display/Article/4219340/educating-the-ai-ready-warfighter-a-framework-for-ethical-integration-in-air-fo/>

Theobold, A. S. (2021). Oral exams: A more meaningful assessment of students' understanding. *Journal of Statistics and Data Science Education, 29*(2), 156–159. <https://www.tandfonline.com/doi/full/10.1080/26939169.2021.1914527>

Vendrell, M., & Johnston, S.-K. (2026). Scaffolding critical thinking with generative AI: Design principles for integrating large language models in higher education. *Computers and Education: Artificial Intelligence, 10*, 100572. <https://doi.org/10.1016/j.caeai.2026.100572>

Young, J. P., Fanjoy, R. O., & Suckow, M. W. (2006). Impact of glass cockpit flight training on manual flying skills. *Journal of Aviation/Aerospace Education & Research, 15*(2). <http://commons.erau.edu/jaaer/vol15/iss2/5/>


# ===== SECTION: CLAIMS =====

# Claims Map

Use this file when the reader asks what the essay claims, wants to inspect the evidence, or wants to test whether a proposed edit preserves the argument. This map was synced to **The Irreducible Officer** on June 30, 2026. If the essay and this file diverge, treat that as a sync problem: revise the essay or this map deliberately, then back-map the change.

Keep the claim list short enough that the reader can choose where to go deeper.

## Thesis

NWC has to teach and certify AI-enabled strategic judgment: graduates who can use machine speed to sharpen strategic work, direct AI-enabled systems toward owned purposes, calibrate reliance on uneven systems, and remain accountable for the judgment under questioning. AI keeps the officer in the work and raises the standard for visible ownership of purpose, frame, reliance, accountability, and transfer, especially when strategic work is built from AI-shaped inputs.

## Claim Dependency Spine

1. AI-shaped inputs and AI-enabled workflows change the performance standard from product completion, permission policy, and artifact review to AI-enabled strategic judgment.
2. Finished artifacts still matter, but they carry less evidentiary weight as proxies for that judgment.
3. The certifiable competency is owned purpose and accountability made visible through frame, calibrated reliance, accountable judgment, and transfer.
4. Purpose is the irreducible human obligation; frame is how that purpose becomes visible in the workflow.
5. Appropriate reliance governs the human-AI relationship inside the workflow.
6. Developmental friction builds the human foundation needed for AI-enabled judgment, including but extending beyond unaided performance.
7. Assessment must make purpose through frame, reliance, accountability, and transfer visible.
8. NWC can join faculty strategic judgment to emerging AI fluency and convert those practices into shared institutional assets for an AI-enabled PME future.

Purpose and accountability are the irreducible human obligations. Frame, reliance, and transfer are the observable practices that let faculty assess whether those obligations were actually exercised.

## Claim 1: AI changes the performance standard

Policy and assessment-integrity questions matter. They sit upstream of the harder educational question of whether NWC can prepare and certify officers who use machine speed while keeping purpose, reliance, and accountability attached to human judgment.

- **Best evidence:** the opening section's AI-shaped-inputs claim, the two-officer example, the seminar-team future-state paragraph, and the conclusion's return to AI-enabled command of multiple agents.
- **Strongest unresolved question:** what level of AI-enabled performance should NWC expect from all graduates, and what should remain specialized or advanced?
- **Where to look:** sections "I. The Performance Standard Has Changed," "II. What the Problem Looks Like," and "XI. Conclusion."

## Claim 2: Finished artifacts carry less evidentiary weight

Papers, briefs, and analytic products still matter, and they were never the only evidence NWC used to judge learning. But they carry less evidentiary weight as proxies for student judgment because AI can produce genuinely strong work while leaving purpose, frame, reliance, and accountability unclear.

AI-assisted performance can be real performance. NWC still has to distinguish at least two competencies: performing with AI support and owning the reasoning well enough to defend, adapt, and transfer it.

- **Best evidence:** Bastani et al. on supported practice versus unsupported performance, Chow et al. on epistemic confinement, and the essay's opening distinction between strong finished work and uncertain ownership.
- **Strongest unresolved question:** how much unaided foundation should be required before students move into advanced AI-enabled workflows?
- **Where to look:** section "III. What Finished Work Can No Longer Carry."

## Claim 3: Purpose is expressed through frame

The human role begins before the system runs: deciding what problem is worth solving, what purpose the work serves, what assumptions govern the analysis, what evidence counts, and what kind of answer would matter.

AI may generate or compare frames, but it still operates toward a purpose, signal, objective, or constraint that someone supplied, accepted, or allowed to govern the work. Frame is how purpose becomes visible in strategic work.

- **Best evidence:** the NWC Primer's strategic logic, Laufer et al. on optimization's normative commitments, and the essay's discussion of strict prompts and purpose-definition.
- **Strongest unresolved question:** how should the argument change as AI systems become better at proposing frames and detecting strategic errors?
- **Where to look:** section "IV. Purpose as the Irreducible Human Act" and `sources/source-spine.md`.

## Claim 4: Appropriate reliance is a teachable competency

The educational target is not general trust or general distrust. Students need to know when to rely, when to verify, when to redirect, when to refuse, and how to explain the difference under questioning.

This turns AI use from a tool-use disclosure into an object of strategic judgment.

- **Best evidence:** Dell'Acqua et al. on the jagged frontier, Raees and Papangelis on appropriate reliance, and the essay's visibility diagnostic for where human judgment enters a workflow.
- **Strongest unresolved question:** what reliance evidence is enough without turning assignments into paperwork or compliance theater?
- **Where to look:** section "V. Appropriate Reliance as a Teachable Competency" and `artifacts/traceable-learning-artifact.md`.

## Claim 5: Developmental friction builds AI-enabled judgment

Some friction is waste; some friction is how judgment forms. NWC should remove work that consumes time without building judgment and preserve the struggle that builds strategic reasoning.

AI makes this design choice urgent because it can remove both kinds of friction without knowing the difference. The point is to build the human foundation needed to make human-machine work better than either human or machine performance alone. Students need deliberate exposure to AI failure, including systems that are useful, seductive, partially wrong, or unavailable, so they can practice when to lean on the tool, slow down, or intervene.

- **Best evidence:** Ceccarelli on apprenticeship, Collins/Brown/Newman on cognitive apprenticeship, aviation automation evidence from Young/Fanjoy/Suckow and Sarter/Woods, and the essay's "no garden paths" assignment principle.
- **Strongest unresolved question:** how can faculty preserve developmental struggle while still letting students build real AI-enabled capability?
- **Where to look:** section "VI. Friction as Developmental Design" and `cases/cyber-group-strategy-transfer-case.md`.

## Claim 6: Accountability is structural

AI can compress staff work, generate options, and model consequences, but it cannot own the legal, moral, professional, or command responsibility for a decision.

The human who authorizes the purpose and judgment remains answerable for what the system produces in pursuit of that purpose.

- **Best evidence:** Andres on conviction-shaped output, the essay's command responsibility argument, and the first-person ownership examples.
- **Strongest unresolved question:** where should NWC draw the line between routine AI-supported analysis and decisions where accountability must be practiced under pressure?
- **Where to look:** section "VII. Accountability Is Structural" and `prompts/objections-and-responses.md`.

## Claim 7: Assessment should make ownership visible

The assessment target is not whether AI was used. It is whether the student can show purpose through frame, explain the reliance decisions they made, remain accountable for the judgment, and transfer the discipline to a changed case.

Purpose through frame, reliance, accountability, and transfer are the essay's operational standard for AI-enabled strategic judgment.

- **Best evidence:** Theobold on oral assessment, Perkins/Roe/Furze and Mollick/Mollick on AI-mediated authentic assessment, and the essay's proposed traceable evidence categories.
- **Strongest unresolved question:** which trace elements actually reveal judgment, and which become retrospective paperwork?
- **Where to look:** section "VIII. Assessment That Makes Ownership Visible" and `artifacts/traceable-learning-artifact.md`.

## Claim 8: NWC can join faculty judgment to AI fluency

NWC faculty already bring much of the judgment needed to spot hidden assumptions, thin reasoning, and performed sophistication. The institutional opportunity is to join that strategic judgment to AI fluency: enough command of current systems to build and direct AI-enabled workflows, see where they help and fail, and defend reliance decisions under strategic scrutiny.

The goal is to build that combined capacity inside the faculty and make it explicit, reusable, and revisable through shared prompts, rubrics, flawed-assessment libraries, oral defense criteria, and faculty review practices for an AI-enabled PME future.

Those artifacts should themselves be traceable and faculty-governed because exported frames carry assumptions at scale.

- **Best evidence:** the five-step pilot in Section IX, Section X's faculty-fluency paragraph and institutional-compounding argument, and the essay's warning about exported frames.
- **Strongest unresolved question:** how should NWC build enough faculty AI fluency without separating it from the strategic judgment faculty already bring?
- **Where to look:** sections "IX. A Foundation Pilot" and "X. The Institutional Opportunity," plus `prompts/starter-prompts.md`.

## Back-Map To The Essay

| Essay section | Primary claim(s) | Function in the argument |
| --- | --- | --- |
| I. The Performance Standard Has Changed | Claims 1, 2, 3, 6 | Establishes the operating premise: strategic work is increasingly built from AI-shaped inputs, giving NWC a preparation-and-certification task beyond AI-use policy and finished-product review. |
| II. What the Problem Looks Like | Claims 1, 4, 7 | Uses the two-officer example and failure modes to define the standard a good AI-enabled workflow has to meet. |
| III. What Finished Work Can No Longer Carry | Claim 2 | Explains why finished products carry less evidentiary weight and need additional evidence of ownership. |
| IV. Purpose as the Irreducible Human Act | Claim 3 | Locates the human obligation before the system runs: purpose, made visible through frame, assumptions, and success criteria. |
| V. Appropriate Reliance as a Teachable Competency | Claim 4 | Turns AI use into a teachable judgment problem: when to rely, verify, redirect, or refuse. |
| VI. Friction as Developmental Design | Claim 5 | Distinguishes wasteful friction from the struggle that builds strategic judgment. |
| VII. Accountability Is Structural | Claim 6 | Shows why the human remains answerable for AI-enabled work, especially in national security contexts. |
| VIII. Assessment That Makes Ownership Visible | Claim 7 | Converts the argument into assessable evidence: purpose through frame, reliance, accountability, and transfer. |
| IX. A Foundation Pilot | Claims 1, 5, 7, 8 | Proposes a small implementation that lets NWC practice AI-enabled ownership before operational speed makes the cost real, while producing reusable evidence and signaling later team-based multi-agent workflows. |
| X. The Institutional Opportunity | Claim 8 | Extends the pilot into institutional compounding and faculty-governed teaching infrastructure. |
| XI. Conclusion | Claims 1, 3, 4, 6, 7 | Returns to the two officers and restates the professional question: can either officer account for the purpose and stand behind the judgment? |

## Quick Coherence Checks

Use these questions when reviewing future edits:

1. Does the edit strengthen the claim that NWC is teaching and certifying AI-enabled strategic judgment, with misuse detection as only one smaller concern?
2. Does it keep AI-assisted performance real while still requiring visible ownership of purpose, reliance, and accountability?
3. Does it separate purpose, frame, reliance, and accountability clearly?
4. Does it preserve developmental friction for the work that builds judgment?
5. Does it make assessment evidence observable without turning the workflow into compliance paperwork?
6. Does it help NWC compound faculty judgment and faculty AI fluency into reusable institutional artifacts?
7. Does it preserve the both/and: AI can extend strategic judgment and conceal weak ownership; human-machine teams should outperform either alone only when purpose, reliance, and accountability stay visible?


# ===== SECTION: SOURCE SPINE =====

# Source Spine

Use this file when the reader asks for sources, evidence, or deeper research. Keep facts separate from the essay's interpretation.

## NWC And Strategic Logic

- [A National Security Strategy Primer](https://nwc.ndu.edu/Portals/71/Documents/Publications/NWC-Primer-FINAL_for%20Web.pdf) anchors the argument in NWC's native language of strategic logic: interests, assumptions, ends, ways, means, costs, risk, and reassessment.

## PME, Command, And Accountability

- Richard B. Andres, *AI and leadership: Preparing commanders for machine-speed war* [Unpublished manuscript], provides the "conviction-shaped output" language and the command-accountability frame the essay extends into assessment. This is a review-copy source, not a public source.

## AI, Learning, And Instructional Design

- Bastani et al., [Generative AI without guardrails can harm learning](https://doi.org/10.1073/pnas.2422633122), provides the empirical caution used in the essay: unscaffolded AI support improved practice performance but reduced later unsupported performance.
- Chow et al., ["AI enhanced my critical thinking": Investigating the paradox of student perceptions and cognitive offloading in GenAI use](https://ojs.aut.ac.nz/pjtel/article/view/246), provides the "epistemic confinement" language used to describe fluency without independent frame ownership.
- Mollick and Mollick, [Assigning AI](https://arxiv.org/abs/2306.10052), helps frame AI roles in education, including tutor, coach, simulator, teammate, and student.
- Perkins, Roe, and Furze, [The AI Assessment Scale Revisited](https://arxiv.org/abs/2412.09029), is useful as a starting point for explicit AI assessment design. For NWC, it needs to be sharpened toward frame ownership, reliance decisions, and final judgment.
- Vendrell and Johnston, [Scaffolding critical thinking with generative AI](https://doi.org/10.1016/j.caeai.2026.100572), supports the essay's design logic: preserve productive struggle, sequence AI-free and AI-mediated phases deliberately, and align assessment with reasoning rather than surface fluency.

## Appropriate Reliance And Human Agency

- Raees and Papangelis, [From Trust to Appropriate Reliance](https://arxiv.org/abs/2604.23896), supports the distinction between trusting AI and relying on it appropriately.
- Raees et al., [From Explainable to Interactive AI](https://arxiv.org/abs/2405.15051), supports the move from post-hoc explanation toward user agency, adaptation, and co-design.

## Jagged Frontier And Uneven AI Capability

- Dell'Acqua, Mollick, et al., [Navigating the Jagged Technological Frontier](https://www.hbs.edu/faculty/Pages/item.aspx?num=64700), provides evidence that AI can improve performance on some knowledge-work tasks and degrade it on others.

## Purpose, Optimization, And Accountability

- Laufer, Gilbert, and Nissenbaum, [Optimization's neglected normative commitments](https://arxiv.org/abs/2305.17465), supports the claim that optimization embeds normative choices in the decision, objective, and constraints before the system runs.

## AI Diversity And Monoculture Risk

- Deng, Brucks, and Toubia, [Examining and addressing barriers to diversity in LLM-generated ideas](https://arxiv.org/abs/2602.20408), supports the risk that independent AI outputs may occupy a narrower conceptual distribution than human idea generation.
- Murthy, Ullman, and Hu, [One fish, two fish, but not the whole sea](https://arxiv.org/abs/2411.04427), supports the concern that alignment can reduce conceptual diversity in model outputs.

## Apprenticeship, Friction, And Tacit Judgment

- Collins, Brown, and Newman, "Cognitive Apprenticeship: Teaching the Crafts of Reading, Writing, and Mathematics," supports the claim that expert moves must be modeled, coached, scaffolded, articulated, reflected on, and practiced.
- Greg Ceccarelli, [Apprenticeship Was the Point](https://www.meditationsontech.com/p/apprenticeship-was-the-point), sharpens the distinction between wasteful friction and developmental friction.

## Automation Precedent

- Young, Fanjoy, and Suckow, [Impact of glass cockpit flight training on manual flying skills](http://commons.erau.edu/jaaer/vol15/iss2/5/), supports the concern that extensive automation practice can reduce manual skill.
- Sarter and Woods, [Team play with a powerful and independent agent](https://journals.sagepub.com/doi/10.1518/001872097778667997), supports the automation-surprise and mode-confusion precedent used in the essay.

## Oral Assessment And Traceability

- Theobold, [Oral exams: A more meaningful assessment of students' understanding](https://www.tandfonline.com/doi/full/10.1080/26939169.2021.1914527), supports oral defense as a way to probe student understanding under follow-up.

## PME Governance And Faculty Readiness

- Smith, [Educating the AI-ready warfighter](https://www.airuniversity.af.edu/Wild-Blue-Yonder/Articles/Article-Display/Article/4219340/educating-the-ai-ready-warfighter-a-framework-for-ethical-integration-in-air-fo/), is useful for the governance layer: acceptable-use policy, classification tiers, faculty literacy training, and responsible integration. The essay uses it as necessary groundwork, then argues NWC still has to answer the harder teaching and assessment question.


# ===== SECTION: OBJECTIONS =====

# Objections And Responses

Use this file when the reader wants to test the essay rather than simply apply it. Start with the strongest version of the objection.

## Objection 1: NWC already teaches this

**Strongest version:** NWC already teaches framing, assumptions, risk, evidence, and strategic judgment. Calling this "frame literacy" may rename existing practice rather than add anything useful.

**Best response:** That is partly true, and the essay should say so. The new problem is not the competency itself. The new problem is that AI weakens the old evidence of whether the competency was practiced. NWC may already teach the skill; AI makes it necessary to identify, teach, and measure it more explicitly.

**What could still be right:** If existing seminars and oral defenses already reveal frame ownership reliably, the needed change may be smaller than the essay implies.

**Useful test:** Take one existing assignment and ask: what evidence would show the student owned the frame if AI helped produce the artifact?

## Objection 2: Better models will solve the framing problem

**Strongest version:** If future systems can generate better problem definitions, assumptions, and strategies, frame literacy may be a temporary concern.

**Best response:** Better models move responsibility up one level. They can compare frames or reframe toward a goal, but the purpose, signal, constraint, or definition of progress still comes from a human or institution.

**What could still be right:** Future systems may do much more of the immediate framing work than the essay assumes.

**Useful test:** Ask what the system is optimizing for, who supplied or accepted that goal, and who is accountable if the recommendation fails.

## Objection 3: Trace artifacts will become bureaucracy

**Strongest version:** Requiring prompt logs, assumption audits, reliance notes, and oral-defense artifacts could turn learning into paperwork.

**Best response:** The trace should be lean. It should capture only the evidence faculty need to see frame ownership and appropriate reliance.

**What could still be right:** A poorly designed trace could become compliance theater and make assignments worse.

**Useful test:** Require only the minimal trace (frame and purpose, inherited AI-shaped inputs, key assumptions, evidence standard, accepted/rejected AI outputs with reliance decision, final judgment, and one transfer check), then ask whether faculty could see frame ownership and appropriate reliance from it.

## Objection 4: Faculty workload will increase

**Strongest version:** AI already creates assessment burden. Asking faculty to review traces, run oral defenses, and design flawed AI outputs may be unrealistic.

**Best response:** The first pilot should be small and should use AI to generate inspectable objects for critique. Faculty judgment remains central, but the workflow should reduce some grading ambiguity by making reasoning visible.

**What could still be right:** Scaling the method across courses would require faculty development and shared artifacts.

**Useful test:** Run one 60-90 minute seminar exercise and ask whether faculty could see more clearly who owned the reasoning.

## Objection 5: Faculty may not yet have the AI fluency this requires

**Strongest version:** The essay assumes NWC faculty can recognize, model, and assess capable AI-enabled work. Some faculty may have the strategic judgment, but not yet enough command of current AI workflows to see where the system helps, fails, narrows the frame, or deserves reliance.

**Best response:** The essay should not pretend this capacity is already universal. The institutional opportunity is to build it inside the faculty: join existing strategic judgment to enough AI fluency that faculty can model the work, question it, and improve it over time. Practitioners working near the edge of current practice can help, but the pedagogy still has to be faculty-owned.

**What could still be right:** If NWC does not invest in faculty practice and calibration, the method could become a set of prompts and rubrics without the judgment needed to use them well.

**Useful test:** Run the companion's faculty fluency lab with one assignment. Ask whether faculty can build an AI-enabled workflow, name its hidden reliance points, and defend where human judgment must interrupt the system.

## Objection 6: This focuses too much on writing

**Strongest version:** NWC education is not only about papers. It includes seminar, wargaming, briefing, leadership, and strategic interaction.

**Best response:** The artifact problem begins with writing because writing is visible, but the method applies to any AI-assisted decision product: brief, plan, red-team critique, wargame move, or staff recommendation.

**What could still be right:** The essay should not let "paper" become a narrow proxy for all NWC learning.

**Useful test:** Apply the trace artifact to a briefing or wargame decision rather than a written paper.

## Objection 7: AI restrictions may be simpler

**Strongest version:** Instead of redesigning assessment, faculty could restrict AI use on key assignments.

**Best response:** Restrictions may be useful in some developmental moments. But they do not solve the broader leadership problem: students will operate in AI-enabled environments after NWC. They need practice using AI without becoming downstream of it.

**What could still be right:** Some assignments should preserve unaided first-frame work.

**Useful test:** Identify which part of the assignment must be done without AI and which part should use AI for critique, red-team, or alternative framing.

## Objection 8: This is too cautious about AI-enabled performance

**Strongest version:** The future standard should be maximum AI-enabled performance. If AI can help students produce better strategic work, NWC should teach them to push the tools hard rather than slow them down with traces and defenses.

**Best response:** The essay agrees that AI-enabled performance matters. The standard is not unaided purity. It is commanded AI use: work that is faster, broader, and sharper because AI is in the loop, while the officer still owns the purpose, reliance decisions, and final judgment.

**What could still be right:** If traces and oral defenses become heavy or performative, they could reduce the very performance the essay wants to improve.

**Useful test:** Give students the same strategic problem unaided, AI-assisted, and AI-directed. Compare not only polish, but frame quality, reliance judgment, adaptability under questioning, and transfer to a changed case.


# ===== SECTION: WORKFLOW PATTERNS =====

# NWC AI-Enabled Learning Workflows

Use these workflows when a reader wants to practice the method behind **The Irreducible Officer**. The first practice object is the essay itself. Faculty can use the same workflows to build their own AI fluency and then turn that experience into pedagogy.

The common loop is:

1. The human identifies inherited AI-shaped inputs already in the work.
2. The human states the purpose and first frame.
3. AI challenges, expands, or critiques the work.
4. The human accepts, rejects, or revises the AI contribution.
5. The human defends the judgment under questioning.
6. The learning is saved as a trace, prompt, rubric, flawed output, or exercise note.

## 1. Essay As Practice Object

**Use when:** faculty want to experience the method before assigning it.

**Human job:** identify the essay's purpose, frame, strongest claim, weakest claim, and likely objection.

**AI assistant job:** map the essay's claims, surface the strongest objection, and point to the best evidence and weakest support.

**Output:** one-page claim audit.

**Review question:** did the human revise the AI assistant's frame, or simply accept it?

**Reusable artifact:** a claim audit that can become a seminar prompt or faculty discussion note.

## 2. AI-Free First Frame, AI-Mediated Challenge, AI-Free Judgment

**Use when:** students or faculty need to preserve developmental friction while still practicing AI-enabled work.

**Human job:** write an unaided first frame of the strategic problem, including inherited AI-shaped inputs, purpose, assumptions, evidence standard, and what would count as success.

**AI assistant job:** challenge the frame, generate alternative frames, identify missing assumptions, and name where AI reliance would be risky.

**Human job after AI:** decide what to accept, reject, or revise, then state the final judgment in first person.

**Output:** revised frame plus reliance note.

**Review question:** did AI extend the human's reasoning, or replace the work the human needed to do?

**Reusable artifact:** a short before/after frame note and reliance decision.

## 3. Prompt Deconstruction As Frame Ownership

**Use when:** a prompt is being treated as a technical instruction rather than a strategic act.

**Human job:** draft a prompt for an AI-generated assessment, recommendation, critique, or briefing.

**AI assistant job:** identify the purpose, assumptions, evidence standard, theory of success, and blind spots embedded in the prompt.

**Human job after AI:** revise the prompt and explain what changed.

**Output:** original prompt, deconstruction, revised prompt, and explanation.

**Review question:** what frame did the prompt carry before anyone noticed?

**Reusable artifact:** a prompt review checklist for future assignments.

## 4. Flawed AI Output Lab

**Use when:** faculty need a polished object that lets students practice seeing beneath fluency.

**AI assistant job:** create a polished but flawed strategic assessment. The flaw should sit at the level of frame, assumptions, evidence standard, reliance, risk, or theory of success.

**Human job:** diagnose the flaw, revise the output, and build oral-defense questions that would expose the weakness.

**Output:** student-facing flawed output plus instructor key.

**Review question:** would the flaw survive a surface-level reading but fail under strategic questioning?

**Reusable artifact:** a flawed-output library entry.

## 5. Oral Defense Rehearsal

**Use when:** a reader has produced an AI-assisted recommendation and needs to test whether they own it.

**Human job:** bring an AI-assisted argument, recommendation, or critique.

**AI assistant job:** ask one question at a time about purpose, frame, assumptions, evidence, reliance decisions, rejected outputs, accountability, and transfer.

**Output:** oral-defense notes and missing evidence for the trace.

**Review question:** did questioning reveal judgment that was not visible in the finished artifact?

**Reusable artifact:** oral-defense question set.

## 6. Transfer Test

**Use when:** the essay's method needs to move into a real NWC-style artifact.

**Human job:** choose an approved artifact, case, assignment, or strategic product.

**AI assistant job:** map where the essay's method applies: inherited AI-shaped inputs, purpose, frame, assumptions, evidence standard, reliance, accountability, and transfer.

**Output:** exercise plan and trace artifact.

**Review question:** does the method still work when the case changes?

**Reusable artifact:** adapted exercise flow.

## 7. Faculty Calibration

**Use when:** faculty need to turn tacit judgment into shared instructional practice.

**Human job:** have several faculty independently diagnose the same AI-generated output or student trace.

**AI assistant job:** compare the diagnoses, identify agreement and disagreement, and draft revised review criteria.

**Output:** calibration note and revised rubric.

**Review question:** what did faculty see differently, and what should become shared guidance?

**Reusable artifact:** faculty calibration note.
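
The comparison step in this workflow can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the companion repo: the faculty labels, the `diagnoses` data, and the `compare_diagnoses` helper are all hypothetical, and the flaw levels are borrowed from the flawed AI output lab above. It shows one way to separate what every reviewer saw from what only some saw, so the calibration discussion can start from the disagreements.

```python
from collections import Counter

# Hypothetical data: each faculty member independently lists the levels
# at which they think the AI-generated output is flawed.
diagnoses = {
    "faculty_a": {"frame", "assumptions"},
    "faculty_b": {"frame", "evidence standard"},
    "faculty_c": {"frame", "assumptions", "risk"},
}


def compare_diagnoses(diagnoses):
    """Split flaw levels into those named by everyone and those named by only some."""
    counts = Counter(level for levels in diagnoses.values() for level in levels)
    reviewers = len(diagnoses)
    shared = sorted(level for level, n in counts.items() if n == reviewers)
    contested = sorted(level for level, n in counts.items() if n < reviewers)
    return shared, contested


shared, contested = compare_diagnoses(diagnoses)
print("Shared:", shared)        # Shared: ['frame']
print("Contested:", contested)  # Contested: ['assumptions', 'evidence standard', 'risk']
```

The tally is only a starting point. Deciding which contested items should become shared guidance remains a faculty judgment, which is the purpose of the calibration note.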

## Prompt

```text
Use the NWC AI-enabled learning workflows to help me practice the method from "The Irreducible Officer."

First, ask whether I want to practice against the essay itself or transfer the method to an approved NWC-style artifact.

Then recommend one workflow:
- essay as practice object;
- AI-free first frame, AI-mediated challenge, AI-free judgment;
- prompt deconstruction;
- flawed AI output lab;
- oral defense rehearsal;
- transfer test;
- faculty calibration.

For the workflow you recommend, return:
1. the human job;
2. the AI assistant job;
3. the expected output;
4. the faculty review question;
5. the reusable artifact to save.
```


# ===== SECTION: TRANSFER CASE =====

# Cyber Group Strategy Transfer Case

This file describes the transfer exercise. It does **not** include a course artifact. Instructors should attach only artifacts they are authorized to use.

## Purpose

The essay should be the first object of practice. Readers identify its frame, claims, assumptions, objections, and teaching implications. Then the method transfers to a real NWC-style artifact so the exercise does not remain abstract.

The preferred transfer object is a strategic product with:

- a problem statement;
- political or institutional aims;
- lines of effort;
- assumptions;
- risks;
- evidence standards;
- evidence of AI use or possible AI assistance;
- enough ambiguity to support critique.

## Why Use A Cyber Group Strategy Artifact

A cyber strategy product creates useful instructional friction because it usually requires students to connect technical capability, institutional purpose, adversary behavior, risk, and policy judgment. It also exposes places where AI may produce fluent language while hiding a weak frame.

## Exercise Flow

1. **Interrogate the essay first.** Identify the essay's core frame, assumptions, claims, objections, and teaching implications.
2. **Name the transfer frame.** Identify the strategic problem the course artifact is solving.
3. **Audit assumptions.** Separate explicit assumptions, implied assumptions, inherited AI assumptions, and assumptions embedded in the assignment.
4. **Define the evidence standard.** Identify what evidence would strengthen, weaken, or change the recommendation.
5. **Run an AI reliance check.** Decide where AI could help and where reliance would be dangerous.
6. **Generate or inspect a flawed AI assessment.** The flaw should be at the level of frame, not a trivial factual error.
7. **Run the oral defense.** Test whether the student owns the frame.
8. **Export the trace.** Complete the traceable learning artifact.

## Instructor Prompts

```text
Using the approved course artifact, identify where AI could produce genuinely strong work while obscuring who owns the frame, assumptions, reliance decisions, and final judgment.

Return:
1. the frame students must own;
2. the assumptions most likely to be inherited from AI;
3. the evidence standard faculty should require;
4. where AI can be useful;
5. where AI reliance should be interrupted;
6. oral-defense questions;
7. what trace artifact students should submit.
```

```text
Create a polished but flawed assessment of this artifact. The flaw must sit at the level of problem frame, assumptions, risk, evidence standard, or theory of success.

After the student-facing assessment, provide an instructor key and oral-defense questions that expose the flaw.
```

## Review Criteria

The exercise is working if faculty can see:

- whether the student can state the problem frame;
- whether the student can identify and revise assumptions;
- whether the student can calibrate AI reliance;
- whether the student can reject a fluent but flawed output;
- whether the student can defend the final judgment under questioning;
- whether the discipline transfers to a changed case or new artifact.


# ===== SECTION: TRACEABLE ARTIFACT =====

# Traceable Learning Artifact

Use this artifact when the goal is to make AI-assisted reasoning inspectable without turning the assignment into a compliance packet.

## Lean Template

### 1. Problem Frame

What problem are you solving? Distinguish the general condition from the strategic problem that requires judgment.

### 2. Inherited AI-Shaped Inputs

What reports, summaries, planning tools, staff processes, or prior analytic products shaped the work before you used AI directly? Which of those may already contain AI-generated or AI-filtered judgment?

### 3. Purpose And Success Standard

What should count as progress? What standard did you use to judge a good answer?

### 4. Assumptions

List:

- assumptions you made;
- assumptions inherited from the assignment;
- assumptions suggested by AI;
- assumptions you rejected or revised.

### 5. Evidence Standard

What evidence would strengthen, weaken, or change your conclusion? Which claims remain uncertain?

### 6. AI Role And Boundaries

What did AI do in the work? What was it not allowed to do?

### 7. Accepted AI Contributions

What AI outputs did you accept or adapt, and why?

### 8. Rejected Or Revised AI Contributions

What did you reject, correct, or reframe, and why?

### 9. Reliance Decision

Where was reliance appropriate? Where did human judgment need to interrupt?

### 10. Alternative Frames

What other frames did you consider? Why did you choose this one?

### 11. Oral-Defense Questions

What questions would expose whether you own the reasoning?

### 12. Final Human Judgment

State the final judgment in first person. Own the decision, including what remains uncertain.

### 13. Faculty Notes

For faculty review:

- evidence of frame ownership;
- evidence of appropriate reliance;
- evidence of accountability for the final judgment;
- evidence that the discipline transfers;
- evidence of developmental friction preserved;
- remaining concern;
- follow-up question.

## Minimal Version

If time is limited, require only:

1. problem frame and purpose;
2. inherited AI-shaped inputs, if any;
3. key assumptions;
4. evidence standard;
5. accepted/rejected AI outputs and reliance decision;
6. final human judgment in first person;
7. transfer check: what would change if the case changed?
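
For instructors who collect traces digitally, the minimal version could be represented as a small record with a completeness check. The sketch below is an assumption, not something the companion prescribes: the `MinimalTrace` class, its field names, and the `missing_items` helper are hypothetical. It only flags blank items; it says nothing about the quality of the reasoning, which still requires faculty review.

```python
from dataclasses import dataclass, fields


@dataclass
class MinimalTrace:
    """Hypothetical container for the seven minimal-version items."""

    problem_frame_and_purpose: str = ""
    inherited_ai_shaped_inputs: str = ""  # "none" is a legitimate answer
    key_assumptions: str = ""
    evidence_standard: str = ""
    ai_outputs_and_reliance_decision: str = ""
    final_human_judgment: str = ""  # written in first person
    transfer_check: str = ""

    def missing_items(self) -> list[str]:
        """Return the names of items left blank, in template order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example: a partly completed trace (placeholder text, not real student work).
trace = MinimalTrace(
    problem_frame_and_purpose="(student text)",
    key_assumptions="(student text)",
    final_human_judgment="(student text)",
)
print(trace.missing_items())
```

Keeping the record to seven free-text fields is deliberate: it mirrors the minimal list rather than adding structure that could turn the trace into a compliance packet.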


# ===== SECTION: STARTER PROMPTS =====

# Starter Prompts

Use these prompts with **The Irreducible Officer** and this companion repo. They are designed for a back-and-forth session with an AI assistant.

## Choose A Path

**Best for:** deciding how to use the companion when the reader is unsure.

```text
Use "The Irreducible Officer" and this companion repo to help me choose the most useful next step.

Read:
- README.md
- AGENTS.md
- claims.md

Return three possible paths from this list: understand the argument, inspect a claim, test an objection, design an exercise, create a flawed AI assessment, run oral defense, or build a trace.

Recommend one path and explain why in plain English.
```

## Understand The Argument

**Best for:** getting the clean version of the essay before debating or applying it.

```text
Use "The Irreducible Officer" and the companion repo to explain the argument in 10 bullets.

Do not turn this into a generic AI-in-education summary. Preserve the specific claim: NWC must teach and certify AI-enabled strategic judgment by making purpose, frame, reliance, accountability, and transfer visible.

Start with the files:
- README.md
- AGENTS.md
- the-irreducible-officer.md
- claims.md
- sources/source-spine.md

Return:
1. the thesis in one sentence;
2. the argument in 10 bullets;
3. the claim most likely to be misunderstood;
4. why that misunderstanding is tempting;
5. two questions NWC faculty should keep open.
```

## Inspect The Claims

**Best for:** letting a reader or faculty member pressure-test the essay instead of passively receiving it.

```text
Use the companion repo to help me inspect the evidence behind "The Irreducible Officer."

First, read:
- claims.md
- sources/source-spine.md

Then list 5-7 important claims from the essay that are worth auditing. For each one, give me a short label and one sentence on why it matters.

Ask me which claim I want to inspect.

After I pick a claim, audit it with me. Return:
1. the best evidence in the repo;
2. the strongest unresolved question or counterexample;
3. where the evidence is strong, weak, or incomplete;
4. what source I should read if I want to go deeper;
5. one practical implication for NWC instruction.

Keep it conversational. Do not defend the essay by default, and do not bury me in sources before I choose the claim.
```

## Design An NWC Exercise

**Best for:** turning the essay into a concrete faculty activity.

```text
I want to turn "The Irreducible Officer" into a practical NWC learning exercise.

Read:
- README.md
- AGENTS.md
- the-irreducible-officer.md
- claims.md
- cases/cyber-group-strategy-transfer-case.md
- artifacts/traceable-learning-artifact.md

Design a 60-90 minute exercise for NWC faculty or students that uses the essay's method.

The exercise must:
1. begin by interrogating the essay itself;
2. then transfer the method to an approved NWC-style artifact;
3. identify any AI-shaped inputs the learner inherits before using AI directly;
4. force the learner to identify the frame, assumptions, evidence standard, and AI reliance decisions;
5. include a flawed AI output or flawed frame for critique;
6. end with a traceable learning artifact faculty can inspect.

Return:
- learning objective;
- materials needed;
- step-by-step flow;
- facilitator notes;
- student/faculty outputs;
- assessment criteria;
- likely failure modes.
```

## Practice Faculty AI Fluency

**Best for:** helping faculty practice the kind of AI-enabled judgment the essay asks them to teach and assess.

```text
Use "The Irreducible Officer" and this companion repo as a faculty fluency lab.

Read:
- README.md
- AGENTS.md
- the-irreducible-officer.md
- claims.md
- sources/source-spine.md
- artifacts/traceable-learning-artifact.md

Your job is to help me practice joining strategic judgment to AI fluency. Do not give me a generic AI tutorial. Use the essay's standard: purpose, frame, reliance, accountability, transfer, and developmental friction.

Run this as a working session:
1. Ask me for one NWC-style task, case, assignment, or strategic problem.
2. Identify any AI-shaped inputs already present in the task, such as reports, summaries, planning tools, staff processes, or analytic products.
3. Help me define the purpose, problem frame, assumptions, and evidence standard.
4. Propose an AI-enabled workflow that could sharpen the work.
5. Identify where the workflow might hide judgment, narrow the frame, or invite uncalibrated reliance.
6. Ask me to defend which AI outputs I would accept, reject, verify, or withhold.
7. Turn the session into faculty-facing notes: what to model for students, what to observe, what oral-defense question to ask, and what reusable artifact to save.

After the session, assess my fluency plainly:
- what I commanded well;
- where I let the system set the terms;
- what I should practice before using this with students;
- what faculty artifact should be improved.
```

## Create A Flawed AI Assessment

**Best for:** making AI fluency an object of critique rather than a shortcut.

```text
Create a polished, confident, but flawed strategic assessment for students to critique.

Use:
- claims.md
- cases/cyber-group-strategy-transfer-case.md
- artifacts/traceable-learning-artifact.md

The flaw should be at the level of frame, assumptions, evidence standard, reliance, or risk treatment. It should not depend on an obvious factual error.

After the student-facing assessment, provide an instructor-only key:
1. hidden frame;
2. flawed assumptions;
3. missing evidence;
4. risk or tradeoff the answer buries;
5. questions that would expose the flaw in oral defense;
6. what a stronger frame would include;
7. what students should record in the traceable learning artifact.
```

## Run Oral Defense

**Best for:** checking whether the learner owns the purpose, frame, reliance decisions, and final judgment behind an AI-assisted artifact.

```text
Act as an NWC seminar instructor conducting a short oral defense.

Read:
- AGENTS.md
- claims.md
- artifacts/traceable-learning-artifact.md

Ask one question at a time. Your goal is to determine whether I own the purpose, frame, reliance decisions, and final judgment behind my AI-assisted work.

Press me on:
- purpose and success standard;
- problem frame;
- assumptions;
- evidence standard;
- alternative frames;
- reliance decisions;
- rejected AI outputs;
- risks and costs;
- what would change my conclusion;
- where human judgment must interrupt automation.

After six questions, assess whether I demonstrated ownership of the reasoning and identify what evidence should be added to the traceable learning artifact.
```

## Build The Trace

**Best for:** turning an AI-assisted exercise into evidence faculty can inspect.

```text
Using the critique or exercise we just completed, create a traceable learning artifact.

Read:
- artifacts/traceable-learning-artifact.md
- claims.md

Return a completed artifact with:
1. problem frame;
2. inherited AI-shaped inputs, if any;
3. assumptions;
4. evidence standard;
5. AI role and boundaries;
6. accepted AI contributions;
7. rejected or revised AI contributions;
8. reliance decisions;
9. oral-defense questions;
10. final human judgment;
11. transfer check;
12. faculty review notes.

Keep it practical enough to use in one seminar, not as a compliance packet.
```