Module 08 of 08 · Capstone Judgment & Quality Control

You are the inspector.
Everything Claude produces is a draft.

Every prior module carried its own rule about what Claude must not do alone. This capstone names the pattern, teaches the QA mindset that ties them together, and gives you a personal operating system for AI-assisted work that you will refine for the rest of your career.


If you've worked through Modules 01–07, you've already been taught this module. You just didn't see it as one thing. Every module carried a small, specific rule about what Claude is not allowed to do on its own. Put them side-by-side and the pattern appears:

Module 02
Specificity discipline. Vague prompts produce vague work — and vague work is where Claude's errors hide.
Module 03
Claude builds the formula; you verify the numbers. The model structures the tracker; Excel and your eye prove it.
Module 04
Flag gaps, don't invent. An SOP that fabricates tribal knowledge is worse than no SOP at all.
Module 05
It has to sound like you. The brief carries your voice or it carries someone else's authority by accident.
Module 06
Do not compute final numbers. Claude gives you the model; Excel runs the math; you defend the assumptions.
Module 07
No invention. Every fact in a field report traces back to raw technician input — or it's a flagged gap.

Six different modules. Six different deliverables. One underlying discipline: Claude is a collaborator, not an oracle. The discipline that runs through every rule is judgment — knowing what Claude is good for, what it isn't, and what has to be verified before a document leaves your desk.

The capstone principle

You are the inspector. Everything Claude produces is a draft awaiting your review. Every sentence is provisional until you've checked it. This is not pessimism about the tool — it's the only posture that lets you safely use it on work that matters. The faster you internalize this, the more you can actually lean on Claude for leverage.

This module teaches the inspection itself. Not as an attitude — as a practice, with named moves, named failure modes, and a personal operating system you take with you.

Before you can inspect, you need to know what you're inspecting for. Claude fails in predictable ways. Name them, recognize them, and your verification is no longer guesswork — it's a checklist.

Hallucinated facts
What it looks like: Invented part numbers, citations, dates, people, URLs. Confident and plausible. Often specific in a way that feels convincing.
Where it hurts you: Reports and deliverables that reference things that don't exist. Credibility loss with clients, audit findings.

Arithmetic error
What it looks like: A column sum that doesn't actually add. Percentages off by a factor. Compounding applied wrongly. Common and invisible.
Where it hurts you: Forecasts, budgets, variance analyses. A decision made on a wrong number.

Stale knowledge
What it looks like: References to prices, tools, regulations, or people that have changed since the training cutoff. Claude doesn't always know it's stale.
Where it hurts you: Recommendations based on obsolete facts. Procedures that reference retired systems.

Context misread
What it looks like: Claude answers the question it thought you asked, not the one you did. Frequent with ambiguous prompts or buried constraints.
Where it hurts you: Deliverables that miss the point. A tracker built for the wrong KPI. A brief aimed at the wrong decision.

Confident hedge
What it looks like: "Typically," "generally," "in most cases." Feels like expertise; actually means Claude isn't certain and is smoothing over the uncertainty.
Where it hurts you: False confidence in the final document. Readers treat a hedge as an assertion.

Optimistic interpretation
What it looks like: Given ambiguous data, Claude picks the reading that makes you look good. Absent data, it assumes the best case.
Where it hurts you: Forecasts that overclaim. Reports that minimize real risk. Scenarios with no teeth.

Template drift
What it looks like: By the fifth turn of a long session, Claude starts filling in plausible-looking content that no longer follows your schema. Structure drifts.
Where it hurts you: Deliverables that look right at a glance and are quietly wrong in the details. Inconsistent output across a batch.

Sycophancy
What it looks like: Claude agrees with you when it shouldn't. "You're right, that's a great point," even when your point was wrong.
Where it hurts you: Lost adversarial pressure. You needed a second opinion and you got a yes-man.

Almost every verification move in the rest of this module is designed to catch one or more of these. Recognize the failure mode and the move picks itself.

The rule of the plausible wrong answer

The dangerous Claude output isn't the one that looks wrong. The dangerous one is the output that looks right — fluent, formatted, confident — and is wrong in a way you won't catch without a specific check. This is why the check has to be a practice, not a feeling.

This is the inspection toolkit. Five moves. Not every move applies to every deliverable — but on any serious piece of work, at least two of them should.

1. Trace the claim
For any factual assertion in Claude's output, ask: where did this come from? If it's from your input, confirm it's quoted or paraphrased correctly. If it's Claude's own knowledge, it needs independent verification before it survives into the final document.

2. Check the math
Every total, every percentage, every ratio. Sample three to five numbers and verify by hand or in Excel. If they check, trust the rest provisionally. If any one fails, recompute everything. No exceptions for forecasts or invoices.

3. Query the context
Re-read Claude's output while asking: did it answer the question I actually asked? If you buried a constraint three paragraphs into your prompt, check whether the output respects it. Context misreads are invisible if you don't look for them deliberately.

4. Test the edges
Where could this output be wrong and still look right? Push on the boundaries. "What happens in the downside case?" "What if this assumption is off?" "What's the worst reading of this finding?" Edges expose weak interiors.

5. Adversarial read
Read the output as if you were the person most likely to challenge it: a skeptical CFO, a regulator, a competitor, a difficult client. Where do they poke first? That's where you harden the document before it leaves your hands.

These moves are not academic. Moves 1 and 2 catch hallucinations and arithmetic errors. Move 3 catches context misreads. Move 4 catches optimistic interpretation. Move 5 catches everything else — because adversarial reading forces you out of the author's chair and into the critic's.
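Move 2 is the easiest to mechanize. A minimal Python sketch, assuming you have the source figures and Claude's draft as simple dictionaries; the names, values, and tolerances here are all illustrative:

```python
import random

def spot_check(draft, source, sample_size=4, tol=0.005):
    """Move 2, 'Check the math': sample a few figures from the draft
    and verify each against the source data. Any failure means the
    whole draft's arithmetic is suspect: recompute everything."""
    keys = random.sample(list(draft), min(sample_size, len(draft)))
    return [k for k in keys if abs(draft[k] - source[k]) > tol]

def check_total(draft, claimed_total, tol=0.005):
    """Never trust a total Claude reports; recompute it yourself."""
    actual = sum(draft.values())
    return abs(actual - claimed_total) <= tol, actual

# Illustrative quarterly figures: the draft has transposed digits in Q3.
source = {"Q1": 412.0, "Q2": 388.5, "Q3": 441.2, "Q4": 405.0}
draft  = {"Q1": 412.0, "Q2": 388.5, "Q3": 414.2, "Q4": 405.0}

failures = spot_check(draft, source)     # -> ["Q3"]
ok, actual = check_total(draft, 1646.7)  # claimed total no longer matches
```

The design choice mirrors the rule in Move 2: a clean sample buys only provisional trust, while a single failed figure invalidates every number in the draft.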

Note

The minimum bar on any client-facing deliverable is two moves: Trace the claim and Adversarial read. If you don't have time for those two on a document with your name on it, you don't have time to send the document.

The highest-leverage verification technique is to make Claude argue against its own output. Models are trained to be helpful, which means they are biased toward agreeing with you and toward defending what they just produced. You can override that by asking directly.

The framing matters. Don't ask "is this right?" — Claude will say yes. Ask Claude to identify what's weak, what could be wrong, and what a critical reader would attack first. Frame it as a separate role — an auditor, a CFO, a plant manager — and the quality of the pushback rises sharply.

The red-team prompt

Red-team your own draft
Use after any serious draft
ROLE: You are a skeptical senior reviewer — [CFO / auditor / plant manager / client — pick the one whose challenge matters most]. You have seen many deliverables like this one. You are not rude, but you are direct, and you do not give credit for effort.

CONTEXT: [Paste the full draft you just produced. State who it's for and what decision it supports.]

TASK:
1. Identify the three weakest claims in the document — where the evidence is thin, the logic has a gap, or the language is overstating what the data actually supports.
2. Name the three questions you would ask the author if you received this document cold.
3. Flag any number, citation, or specific fact that you cannot verify from what's in the document itself.
4. State where the document would fail a critical audience — what would get cut, what would get challenged, what would get dismissed.

STANDARD: Do not reassure me. Do not soften the feedback. Do not identify "strengths" unless asked. Your job here is only to find the weak points. If the document is actually strong, say that plainly — but assume until proven otherwise that it has problems I haven't seen.

Two things happen when you run this prompt. First, Claude identifies real weaknesses — often ones you already half-suspected but didn't name. Second, the feedback becomes the revision plan. Fix what the red team caught, and you've closed the most visible gaps before a real reader ever sees the document.
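If you keep the red-team prompt as a saved snippet, a few lines of Python let you swap in the reviewer whose challenge matters most for each deliverable. A sketch under the assumption that you store the prompt as a plain format string; the field names and the example values are invented for illustration:

```python
# Condensed version of the red-team prompt, with the variable parts
# pulled out as format fields ({reviewer}, {draft}, {audience}, {decision}).
RED_TEAM_TEMPLATE = """\
ROLE: You are a skeptical senior reviewer, a {reviewer}. You have seen many
deliverables like this one. You are direct, and you do not give credit for effort.

CONTEXT: {draft}
This is for {audience} and supports this decision: {decision}

TASK: Identify the three weakest claims, the three questions you would ask the
author if you received this cold, any number or citation you cannot verify from
the document itself, and where a critical audience would challenge or dismiss it.

STANDARD: Do not reassure me. Do not soften the feedback. Assume until proven
otherwise that the document has problems I haven't seen."""

prompt = RED_TEAM_TEMPLATE.format(
    reviewer="CFO",
    draft="[paste the full draft here]",
    audience="the executive committee",
    decision="approve the Q3 capital request",
)
```

One template, one substitution per deliverable: the saved snippet stays stable while the reviewer role changes with the audience you actually face.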

The rule of the second voice

Any deliverable that's going to a decision-maker gets red-teamed once. Not as a formality. As an actual second pass, with Claude in a different role, looking for what's wrong. The version of the document that goes out the door is always the one that survived the red team — never the first draft.

Long sessions and complex deliverables create a second class of problem: Claude slowly drifting away from the original constraints. It's not a sudden failure — it's a gradual softening. The output stays fluent; the adherence to your brief weakens. If you can't spot drift as it happens, you'll only find it after you've already leaned on a bad draft.

These are the signals. Learn to feel them in real time.

Drift signal 01

Hedge language creeping in

"Typically," "generally," "in many cases," "most organizations." Translation: Claude is not sure anymore and is smoothing over the uncertainty with softening language. Push back: "What specifically? For what company, in what situation?"

Drift signal 02

Sudden confidence without added evidence

Two turns ago Claude was cautious. Now it's asserting something firmly, with no new inputs from you. Translation: Claude has pattern-matched into a confident register without actually getting more information. Verify the assertion before you use it.

Drift signal 03

Paragraph bloat

Responses getting longer and more eloquent over the session, but not denser with information. Translation: Claude is filling space rather than answering. Short-prompt the next turn to bring the register back.

Drift signal 04

Lost constraint

You specified the format in turn one. Twelve turns later, Claude is producing something shaped differently and hasn't flagged the change. Translation: drift. Paste the constraint back in and re-ask.

Drift signal 05

Sycophancy after pushback

You challenged something. Claude immediately agrees, reverses, apologizes. Translation: this is not reasoning — it's a reflex to please. Sometimes the original answer was actually right. Force the question back: "Was the first version wrong? If so, why specifically?"

Drift signal 06

Template mimicry without substance

Output looks like your template — headers, bullets, the right format — but the content inside each section is thin or generic. Translation: Claude is matching the shape without engaging the substance. Ask for specifics per section.

Drift is a practice problem — the fifth time you feel one of these in a session, you'll catch it automatically. The first time, it will pass you by. Running the five verification moves (Section 3) on any serious output is the safety net under drift you didn't notice.

Everything in this course collapses into one repeating loop. Six steps. You will run this, in some form, every time you use Claude on work that matters — for the rest of your career. Name it, practice it, and it becomes reflex.

Step 1 Frame
Step 2 Prompt
Step 3 Draft
Step 4 Verify
Step 5 Revise
Step 6 Decide
Frame → Prompt → Draft → Verify → Revise → Decide. Repeat.

What each step actually is

  • Frame. Name the deliverable, the audience, the decision it supports. If you can't state these in one sentence, you're not ready to prompt.
  • Prompt. Role, context, task, standard. Module 02's discipline. Include the specific guardrail for the deliverable — no invention, no computation, no register drift.
  • Draft. Let Claude produce the artifact. Don't over-steer in the first pass. Let it show you what it has.
  • Verify. The five moves from Section 3. At minimum, trace the claim and adversarial read. Add math check, context query, and edge testing as the stakes require.
  • Revise. Red-team if the deliverable is going to a decision-maker. Fix what the verification and the red team caught. This is where most of the quality comes from.
  • Decide. Sign it. Send it. File it. Or pull it back for another loop. The decision is always yours, and that's the whole point.
The loop isn't linear — it's a habit

You will run this in five minutes on a small email draft and in two hours on a major forecast. The steps don't change. Only the depth of each one does. The practitioners who get the most out of Claude are the ones who run the same disciplined loop on everything — not the ones with the cleverest prompts.

Keep a Claude-caught-errors log

Start a running document — one page, dated entries. Every time you catch a Claude error in your verification, log it. What was the deliverable, what did Claude get wrong, and how did you catch it? In three months you'll have a personalized taxonomy of the failure modes you personally encounter most. In six months it will be the most valuable professional document you've built in years. It's the training set for your own judgment.
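If you prefer a structured log over free-form notes, here is one possible shape for an entry; the field names and the example entry are invented for illustration, not a prescribed format:

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ErrorEntry:
    """One dated log entry: the deliverable, which failure mode bit you,
    and which verification move caught it."""
    deliverable: str
    failure_mode: str   # e.g. "arithmetic error", "hallucinated fact"
    caught_by: str      # e.g. "Move 2: check the math"
    note: str
    logged: date = field(default_factory=date.today)

log: list[ErrorEntry] = []
log.append(ErrorEntry(
    deliverable="Q3 variance report",
    failure_mode="arithmetic error",
    caught_by="Move 2: check the math",
    note="Column total off by 27; one line item had transposed digits.",
))

# Months later, the log doubles as your personal failure-mode taxonomy:
taxonomy = Counter(entry.failure_mode for entry in log)
```

Tagging each entry with a failure mode from Section 2 is what turns the log into a taxonomy: counting tags tells you which modes you personally hit most.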

Module 08 Deliverable

Your personal QA protocol — one page, yours forever

This module's deliverable is the document that closes the loop on the whole course. A one-page personal QA protocol, built from your own experience, that you will carry with you into every session of serious work with Claude.

Build Checklist — your protocol is complete when:

1. Your personal verification moves

The five moves from Section 3 adapted to your work. Which ones you run by default, which ones you escalate to. Name them in your own words.

2. Your top three failure modes

From the taxonomy in Section 2 — which three have actually bitten you in your work so far? Write them down with a sentence of why.

3. Your opening prompt template

The prompt you paste at the start of every serious session. Role, context standards, your personal guardrail. Built from Module 02 and sharpened here.

4. Your red-team prompt

A saved, ready-to-paste version of the red-team prompt from Section 4, tuned to the audience types you actually work for.

5. The drift signals you watch for

The two or three signals from Section 5 you are most likely to miss. Naming them on paper is how you train yourself to catch them live.

6. Your Claude-caught-errors log, started

A fresh one-page document. First entry dated today. You will add to it for as long as you use Claude seriously — and it will become the most valuable judgment artifact you build in this course.

Save this file where you'll see it. Open it at the start of every serious Claude session for the first month. After that the habit becomes reflex — and the file is still there, still being refined.

End of course

You've completed the Claude Master Class.

Eight modules. Six finished deliverables from your own work. A prompting discipline, a voice, a forecast method, a reporting standard, and the judgment to run the whole operating system without a safety net. The course is over. The practice starts now — and the practice never stops.
