Think Like a Lawyer, So You Don't Become the Client
Field Notes from 200+ Semi-Autonomous Sprints — Part 5
Every failure mode in this series traces back to the same root cause: the spec wasn't good enough.
Context degradation? Partly a context management problem, partly because the agent had to spend tokens figuring out what you meant instead of doing what you asked. Gate bypass? The agent found a path through your verification that satisfied the letter of the spec but not the intent — because the intent wasn't written down. Tautological tests? The spec said 'write tests' without defining what the tests should prove. The agent filled the gap. It just filled it wrong.
If you take one thing from this series, take this: the spec is the product. The code is a side effect.
Hand someone a blank Jira ticket and you've handed them a blank check. They get to decide what 'done' means, what's in scope, and what quality looks like. You've delegated every decision and kept none of the leverage. That's not a task assignment — that's a trip to the land of make-believe.
Humans Guess Well. That's Not a Compliment.
When you hand a human developer a vague ticket — 'add error handling to the checkout flow' — they do something remarkable without thinking about it. They check how errors are handled elsewhere in the codebase. They consider the user experience. They make judgment calls about which errors are recoverable and which aren't. They ask a colleague if something feels off. Ninety percent of the ticket is never written down, and the feature ships anyway.
But here's the thing: just because a human can fill the gaps doesn't mean they should have to. A missed specification is a missed specification whether a human or an agent is executing it. When a human developer guesses right, that's not good spec writing — that's luck and experience papering over a gap that shouldn't have been there. When they guess wrong, you get a missed feature, wrong business logic, or a subtle bug that doesn't surface for weeks. The fault lies with whoever wrote the spec, not the person executing it.
Agents just make this problem harder to ignore — usually. Sometimes the result is immediate and obvious. Other times it's sneaky. Remember the snapshot laundering from Part 3? Same root cause — the spec said 'snapshots must pass' without saying 'fix the rendering first.' The agent found the gap and walked right through it.
An agent won't walk over and ask a clarifying question. It won't intuit your business logic from hallway conversations. It will produce something for every gap in your spec, and that something will look reasonable. It just might not be what you wanted.
Every gap in your spec is a decision you're delegating — to a human's best guess or a model's token probabilities. If you didn't mean to delegate that decision, write it down.
What a Contract Looks Like
Four clauses. Miss any of them and you've left a gap the agent will fill for you.
What to do. The task. Everyone writes this part — and it's the clause that helps least when the other three are missing.
What done looks like. Acceptance criteria — a checklist, not a paragraph. Each item binary: passes or doesn't. 'User receives reset email within 30 seconds. Token is single-use, expires after 15 minutes. Expired tokens show a specific error, not a generic 404.' If you can't check it mechanically, it's not a criterion — it's a wish.
What not to touch. Scope fences. An agent with an underspecified scope will wander into adjacent files, refactor things that work fine, and 'improve' code that didn't need improving. 'Do not modify the auth module. Do not change the database schema. Do not update dependencies.' These aren't optional footnotes — they're load-bearing walls.
What to do when stuck. Stop and report? Skip and document? Try an alternative? If you don't specify this, the agent will — and its choice is usually to produce something that looks finished while quietly dropping the hard part.
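Put together, a minimal contract might look like the sketch below. The task, filenames, and wording are hypothetical — the point is that all four clauses appear, and every acceptance criterion is mechanically checkable:

```markdown
## Task
Implement password reset for the web app.

## Acceptance criteria
- [ ] User receives the reset email within 30 seconds of the request.
- [ ] Token is single-use and expires after 15 minutes.
- [ ] An expired token shows a specific "link expired" error, not a generic 404.

## Scope fences
- Do not modify the auth module.
- Do not change the database schema.
- Do not update dependencies.

## When stuck
- If any acceptance criterion cannot be met, stop and report.
  Do not skip it, and do not substitute an alternative behavior.
```

Notice there is nothing clever here — the leverage comes from the binary criteria and the explicit fences, not from length.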
Precision, Not Length
Most people write specs like they're talking to a colleague. 'Hey, can you fix that bug in the checkout? You know, the one Sarah mentioned?' A human parses this fine. An agent has no idea who Sarah is.
Specs don't need to be long. They need to be precise. 'Fix the 422 error on POST /api/orders caused by missing shipping_address validation in the request schema' is shorter than the conversational version and infinitely more actionable.
Test the Spec, Not Your Patience
Part 4 introduced the Spec Interrogator — an agent whose job is to find ambiguities before the implementing agent starts. Use it. Throw your spec at an adversarial agent and ask: 'Where would you take a shortcut? What's underspecified? What could you interpret two ways?' Fix the gaps before a single line of code gets written.
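One possible interrogation prompt, sketched in plain text — the wording is illustrative, not a fixed template from this series:

```text
You are reviewing a task spec before an implementing agent starts work.
Do not implement anything. Answer three questions:

1. Where would a lazy implementer take a shortcut that satisfies the
   letter of this spec but not its intent?
2. Which requirements are underspecified or not mechanically checkable?
3. Which sentences could reasonably be read two different ways?

Quote the exact spec text for each finding, and propose the one-line
addition that would close the gap.
```

Asking for the closing one-liner matters: it turns the interrogation into edits you can paste straight back into the spec.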
Or do it yourself. Read your spec and ask: if I handed this to someone with zero context about my project, could they deliver exactly what I want? If the answer starts with 'well, they'd need to know that we...' — whatever comes next belongs in the spec.
Thirty minutes writing a tight spec saves hours debugging vague output. The spec is the cheapest, highest-leverage artifact in your entire workflow. Treat it accordingly.
Next up: Context is Gold — every token you waste is one your agent can't use to think.