The best prompt you'll ever write is the one you delete.

An agent that waits for you to remember to ask it is a toy. Wire a deterministic event to a headless run that reads your rules, and the routine context work fires on its own — with your conventions already baked in.

You open the agent, paste the diff, and type: “Draft a changelog entry for this release, group by feature, skip the dependency bumps, match the tone of the last one.” It does a decent job. You copy the output, tweak two lines, ship it. Smooth.

Then three weeks pass and you forget. The release goes out with a changelog that just says “various fixes,” because the step that produced the good one lived entirely in your head and your fingers. The agent was never the bottleneck. You were the trigger — and you misfired.

This is the part nobody benchmarks. Everyone evaluates agents on prompt quality: how well does it answer when you ask it well? But the most expensive failure isn’t a bad answer. It’s the answer that never got requested because the human who had to remember was busy, on leave, or three context-switches deep. The real unlock isn’t a better prompt. It’s removing the prompt.

A trigger is a hook with a wider blast radius

Inside a session, you already trust deterministic wiring. A hook fires on a fixed event — a file write, a tool call, a commit attempt — and runs code you control, no model judgment in the loop. PreToolUse blocks a command; PostToolUse runs the formatter. The agent doesn’t decide to lint; the event causes the lint.

A trigger is the same primitive aimed outward. Instead of “after the agent edits a file, run this,” it’s “after this event happens in the world, run the agent.” The event is the prompt. A merged PR, a cron tick, a new issue label, a failing nightly build — each is a fact about your repo that should cause context work, and right now that work waits on your memory.

The mechanism that makes this possible is headless mode: the agent runs with no chat window and no human turn-taking, as a one-shot invocation with the prompt baked in and an exit code at the end.

# .github/workflows/changelog.yml: the event is the prompt
on:
  release:
    types: [published]
jobs:
  draft-notes:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - run: |
          claude -p "Draft CHANGELOG.md for the diff since the last tag.
          Follow the conventions in CLAUDE.md exactly." \
            --allowedTools "Bash(git log:*),Edit" \
            > notes.md
      - uses: peter-evans/create-pull-request@v8
        with: { commit-message: "docs: changelog draft", branch: changelog-bot }

No human typed that prompt at release time. The release was the prompt.
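
The exit code is what lets CI treat the agent like any other command. A minimal sketch of using it, assuming a hypothetical helper named `run_to_file` (not part of any tool): it keeps the draft only when the command succeeded and actually wrote something, so the next step never opens a PR around an empty file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command and keep its stdout only if the command
# exited 0 AND produced non-empty output. All names here are illustrative.
run_to_file() {
  local out="$1"; shift
  local tmp
  tmp="$(mktemp)"
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"          # success: publish the draft
  else
    rm -f "$tmp"              # failure or empty output: leave nothing behind
    echo "headless run failed or produced no output; nothing written" >&2
    return 1
  fi
}

# Intended use in the workflow step above (not executed here):
#   run_to_file notes.md claude -p "Draft CHANGELOG.md ..." \
#     --allowedTools "Bash(git log:*),Edit"
```

A non-zero return fails the step, which is the behavior you want from a run nobody is watching.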

Without rules, an unattended run is a loaded gun

Here’s the question that should bother you: if no human is in the loop to tweak two lines and fix the tone, what stops the unattended run from producing confident garbage? In an interactive session, you are the error-correction layer. Remove yourself and the agent’s contextlessness is no longer caught — it’s committed, signed, and shipped.

This is the gap in its purest form. The agent is broad and fast but doesn’t know your changelog groups by feature, that you skip dependency bumps, that “fix” entries get a ticket reference. A human reviewer carries that knowledge in their head and applies it live, every time. Take the human out of the loop and that knowledge leaves with them — unless you wrote it down first.

So you write it down. Rules — the CLAUDE.md / AGENTS.md your agent reads on every run — are what turn a trigger from reckless into reliable. The trigger supplies the when; the rules supply the how, with your conventions.

# CLAUDE.md — changelog conventions
When drafting CHANGELOG.md:
- Group entries under ## Features, ## Fixes, ## Internal.
- Skip commits matching `^(chore|deps|bump)`.
- Each Fix links its ticket: `- Foo no longer crashes (#1423)`.
- Match the voice of the existing top entry. Never invent a version number;
  read it from package.json.

That file is the whole difference. Hand the same trigger to an agent with an empty CLAUDE.md and you get a generic bullet list. Hand it to one that reads those eight lines and you get your changelog — at 2 a.m., with nobody watching, in a PR you can review on your own clock instead of producing on the release’s clock.
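
A rule written this precisely is also one you can spot-check by hand. As a rough sketch, here is the skip pattern from the rules file applied to commit subjects with plain `grep`; the helper name `filter_subjects` and the `LAST_TAG` variable are assumptions for illustration, not something the agent or any tool provides.

```shell
#!/usr/bin/env bash
# Illustrative check of the "skip commits" rule from CLAUDE.md.
# `filter_subjects` is a hypothetical helper name.
filter_subjects() {
  # Drop subjects that start with chore, deps, or bump.
  # grep exits 1 when nothing survives the filter; that is not an error here.
  grep -Ev '^(chore|deps|bump)' || true
}

# Intended use (not executed here; LAST_TAG is an assumed variable):
#   git log --format=%s "$LAST_TAG"..HEAD | filter_subjects
```

If the draft PR lists a commit this filter would have dropped, the run ignored the rule, and you know that without rereading the whole diff.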

A changelog draft is forgiving — worst case it’s bland and you rewrite it. The payoff climbs when the forgotten work is also the work humans are worst at: tedious, judgment-heavy, easy to skim past at 5 p.m. on a Friday. Dependency review is the canonical case. A bot opens a PR bumping a package two minor versions. You glance at it, see the tests pass, and click merge — because reading a changelog and cross-checking it against your own call sites is exactly the kind of careful, boring work a tired human waves through.

Same primitive, harder problem. The event is “a dependency PR opened.” The standing instruction is the thing you’d do if you weren’t tired.

# .github/workflows/dep-triage.yml
on:
  pull_request:
    paths: ["**/package.json", "**/*.lock"]
jobs:
  triage:
    if: github.actor == 'dependabot[bot]'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          claude_args: >
            --allowedTools "Bash(git diff:*),Read,Grep"
            --max-turns 15
          prompt: |
            A dependency bump opened this PR. Read the version diff, then grep
            for every place we import the changed package. Cross-check the new
            version's breaking changes against how we actually call it. Post one
            comment: risk level, the exact call sites that could break, and a
            one-line verdict. Follow CLAUDE.md. If nothing in our code touches
            the changed surface, say so in one line and stop.

Dependabot already tells you a version moved. What it can’t tell you is whether your code cares — and that’s the entire gap this site exists to close. The agent is broad: it can read a foreign package’s changelog faster than you can. It’s contextless: it has no idea you only ever call two functions from that library. The grep over your own imports is where the broad-but-shallow agent borrows the narrow-but-deep knowledge that normally lives in a senior engineer’s head. The PR supplies the when; the diff and the rules supply the how, against our actual usage. You stop merging dependency bumps on faith and start merging them on a read you didn’t have to remember to do.

Notice what changed between the two examples and what didn’t. The trigger moved from release to pull_request. The tools moved from Edit to read-only Grep. The output moved from a file to a comment. The shape — deterministic event, headless run, rules-supplied conventions, a human-reviewable artifact at the end — is identical. That stability is the point: once you’ve wired one trigger, every other one is the same four moves with different nouns.

The objection writes itself: this is just brittle automation that’ll page me at 2 a.m. when it misfires. It would be — if the trigger shipped. It doesn’t. Look again at both examples: one ends at a pull request, the other at a comment. Neither merges, deploys, or replies to a customer. The trigger drafts; a human still commits. You haven’t removed the review gate, you’ve moved it off the critical path — the work is waiting for you instead of you being the bottleneck the work waits on.

But “it only drafts” doesn’t make an unattended run safe by itself. Take the human out of the loop and you also remove the human’s failure-handling, and that loss shows up in three specific places.

Duplication. A network blip makes CI retry the job. Now you have two identical changelog PRs, or the same triage comment posted twice. The agent didn’t malfunction — retries are normal — but nothing checked whether the work already existed. The fix isn’t a cleverer prompt; it’s a deterministic guard, the same spirit as a PreToolUse hook: before opening a PR, check for an existing bot branch keyed to this commit SHA; before commenting, check for a prior comment from this run. Idempotency is config, not model judgment, and it has to be your job to add — the model will cheerfully do the work a second time.
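
One way to sketch that guard, assuming the bot stamps every comment it posts with a hidden HTML marker keyed to the commit SHA. The marker format and the function names `run_marker` and `already_posted` are hypothetical; the point is only that the check is a string match, not a judgment call.

```shell
#!/usr/bin/env bash
# Hypothetical idempotency guard. The marker format and names are illustrative.

# Build a hidden marker identifying the work done by one job for one commit.
run_marker() {
  local job="$1" sha="$2"
  printf '<!-- %s:%s -->' "$job" "$sha"
}

# Read existing comment bodies on stdin; succeed (exit 0) if any of them
# already carries the marker, meaning this work was posted before.
already_posted() {
  local marker="$1"
  grep -qF -- "$marker"
}

# Intended use in a workflow step (not executed here; assumes the `gh` CLI
# and that PR_NUMBER and HEAD_SHA are set by the workflow):
#   marker="$(run_marker dep-triage "$HEAD_SHA")"
#   if gh pr view "$PR_NUMBER" --json comments --jq '.comments[].body' \
#        | already_posted "$marker"; then
#     exit 0   # already triaged for this commit; do nothing
#   fi
```

For the guard to work, the comment the run posts has to include the same marker, so a retry finds it and stops.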

Runaway spend. Interactively, you notice a loop going sideways and hit escape. Unattended, the meter just runs, and the cloud API has no soft limit to save you. So you bound it from the outside: cap --max-turns so a stuck run fails instead of spending, wrap the job in a wall-clock timeout, route routine work to a cheaper model, and limit concurrency so a flood of triggers can’t fan out into a flood of billed runs. None of these make the agent smarter; they make its worst case cheap.
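
The wall-clock bound can be sketched with the `timeout` command from GNU coreutils, which is an assumption about the runner (it is not present on every platform). The wrapper name `run_bounded` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: bound any unattended command by wall-clock time.
# Relies on GNU coreutils `timeout`, which exits with status 124 when the
# limit is hit. It sends TERM first, then KILL 10 seconds later if needed.
run_bounded() {
  local limit="$1"; shift
  timeout --signal=TERM --kill-after=10s "$limit" "$@"
}

# Intended use in CI (not executed here; --max-turns as used in the
# dependency example above, PROMPT is an assumed variable):
#   run_bounded 10m claude -p "$PROMPT" --max-turns 15
```

In GitHub Actions the same bounds also exist at the workflow level: `timeout-minutes` on a job or step caps its runtime, and a `concurrency` group keeps a burst of events from running in parallel. The turn cap bounds the agent, the timeout bounds the process, and concurrency bounds the fan-out.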

Noise. An agent that comments on every PR — even when there’s nothing to say — trains your team to scroll past it inside a week, and a gate everyone ignores is worse than no gate. Give it a false-positive budget: the dependency prompt above ends with “if nothing touches the changed surface, say so in one line and stop.” Better still, let it exit silently when the risk is negligible. The discipline that makes a trigger trustworthy is the same as the rules that make it competent — write down not just what to do, but when to stay quiet.
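
Staying quiet can be wired the same deterministic way. A minimal sketch, assuming you add a convention to the prompt or rules file that the reply begins with a line such as `RISK: none` or `RISK: high`; that convention and the function name `should_comment` are hypothetical, not something the agent does on its own.

```shell
#!/usr/bin/env bash
# Hypothetical noise gate. Assumes the rules tell the agent to begin its
# reply with a line like "RISK: none", "RISK: low", or "RISK: high".
# Reads the agent's output on stdin; exit 0 means "post it", 1 means "stay quiet".
should_comment() {
  local first_line=""
  IFS= read -r first_line || true
  case "$first_line" in
    "RISK: none"*) return 1 ;;  # nothing to say: exit silently
    "RISK: "*)     return 0 ;;  # a real finding: post it
    *)             return 0 ;;  # unexpected format: surface it rather than hide it
  esac
}

# Intended use (not executed here; triage.md is an assumed file holding the reply):
#   if should_comment < triage.md; then
#     gh pr comment "$PR_NUMBER" --body-file triage.md
#   fi
```

The fallback branch is a design choice: when the output does not match the convention, posting it is the safer failure, because a silent gate that swallows malformed output hides exactly the runs you most need to see.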

The work that earns this is the work you keep forgetting

Not everything wants a trigger. One-off questions belong in a chat. The candidates for triggers share a shape: deterministic event, repeating need, context-heavy output, low tolerance for being forgotten.

Work with that shape is context work you’d happily do if you remembered, structured by rules you’d happily write once. The trigger is the part that survives your memory.

The boundary matters as much as the pattern, because a trigger pointed at the wrong work quietly costs more than it saves. Skip it when the when isn’t deterministic — “when the design feels off” has no event to bind to, so it stays a conversation. Skip it when the output’s natural endpoint is an irreversible action: auto-merging, deploying, or replying to a customer puts the agent’s contextlessness past the last point a human can catch it, which is the exact failure the draft-then-review shape exists to prevent. And skip it for genuine one-offs. The wiring — the workflow file, the rules, the guard against duplicate runs — is a fixed cost you pay once and amortize over every future firing. A question you’ll ask exactly once never repays it; that belongs in a chat window, where the prompt is cheap and disposable. The rule of thumb: if you can’t name the event and you won’t see the work again, you don’t have a trigger, you have a task.

Start with one. Take the prompt you retype most often — the changelog, the triage, the doc sync — and ask what event in your repo always precedes you typing it. Wire that event to a headless run. Move the standing instructions out of your fingers and into a rules file. Then delete the prompt from your habits, because the habit is now the machine’s.

An agent you have to remember to ask is a smarter rubber duck. An agent that fires on the event that should cause the work, carrying your conventions into a run nobody is watching, is a coworker. The difference is one wire and one file.

For the per-tool mechanics, see Hooks for the in-session event model, Headless & CI for unattended one-shot runs, and Rules for the conventions that make an unwatched run trustworthy.