An Agent That Can't Forget Is Worse Than One With No Memory

A rules file is write-once and rots; auto-memory saves corrections but on its own judgment, in its own store. Add the human-gated curation layer the vendors leave open: a command that counts corrections and proposes durable edits, a hook that resolves conflicts, so corrections compound instead of contradicting.

An Agent That Can't Forget Is Worse Than One With No Memory

You correct the agent for the third time this week: “We migrated off that ORM in March — stop generating Sequelize models.” It nods, fixes the file, and forgets by the next session. The correction cost you ninety seconds and bought you nothing durable. Tomorrow you’ll pay it again.

So you reach for memory. You let the agent persist facts across sessions, append-only, and feel briefly clever. Then one morning it confidently writes Sequelize models and cites a stored note saying you use Prisma — both retrieved, both presented as truth. You didn’t fix the forgetting problem. You upgraded it into a contradiction problem.

A rules file is write-once; the work is write-often

Section titled “A rules file is write-once; the work is write-often”

Your rules file — AGENTS.md, CLAUDE.md, whatever your tool reads on startup — is the canonical place to write context down once so the agent stops asking. That’s exactly its strength and exactly its failure mode. It captures what you decided to tell it, the day you decided to tell it, and then nobody curates it. Decisions reverse. The file doesn’t.

AGENTS.md
- ORM: Sequelize # written January
- Use class components # written 2022, nobody deleted it

Six months later that file is an archaeology dig. The agent reads “Sequelize” with the same confidence as the line you wrote yesterday, because to the model there is no yesterday — within one file, line 3 and line 30 carry equal authority, and nothing marks one as reversed. (The loading system does have a precedence order — managed rules override user, which override project, which override the local file, closest-wins — but that resolves which file you read, not which line inside it is stale.) A write-once artifact can’t track a write-often reality. The rot isn’t a bug in rules; it’s the absence of a curation loop around them.

The bloat compounds the problem. AGENTS.md is now a governed open standard — released by OpenAI in August 2025 and, as of December 2025, stewarded under the Linux Foundation’s Agentic AI Foundation, read by 60,000-plus repositories across roughly twenty-five tools — but the standard says nothing about pruning. Anthropic’s own guidance is to keep a memory file under ~200 lines, because adherence drops as it grows; frontier models reliably follow only the first ~150–200 instructions before later ones start sliding. Your archaeology dig isn’t just confusing the reader of the file — past a certain length it’s quietly not being followed at all.

There’s a sharper version of the same failure. Despite the standard, Claude Code reads CLAUDE.md, not AGENTS.md, unless you explicitly @-import the latter. A repo that standardized on AGENTS.md and assumed every agent would pick it up has a curated, true context file that one of the most common agents silently ignores. That’s the write-once-nobody-curates thesis in its purest form: the file exists, the file is right, and nobody checks that it’s actually being loaded.

Memory without forgetting is a slow-motion hallucination engine

Section titled “Memory without forgetting is a slow-motion hallucination engine”

The obvious fix is the wrong one. Everyone optimizes recall — bigger memory stores, better retrieval, more facts surfaced. Vendors have started shipping forgetting too: background consolidation that prunes stale memories, context-editing that drops aged content out of the window. What stays rare is forgetting you can audit — principled, user-visible precedence where you can see which fact won and why. And steel-manning the recall camp for a second: yes, an agent that remembers your stack across sessions is genuinely useful, and append-only is the safest write pattern because you never lose data.

But never losing data is the problem when the data reverses. An append-only memory store keeps both “we use Sequelize” and “we migrated to Prisma,” and retrieval has no principled reason to prefer the newer one. So here’s the question worth sitting with: if you can’t tell the agent to forget, what stops a stale fact from being retrieved as current truth? Nothing does. The mistake is now durable and confidently cited — strictly worse than an agent with no memory at all, which at least asks.

Build the loop: propose durable edits, resolve conflicts, gate on a human

Section titled “Build the loop: propose durable edits, resolve conflicts, gate on a human”

Worth being precise about what already ships before you build anything. There are two files in play and they have different owners. You write the rules file — CLAUDE.md, AGENTS.md — and it’s static until you touch it. The agent writes its own auto-memory, and that one evolves: Claude Code’s auto-memory already saves notes based on your corrections and preferences, and it triggers when the agent catches itself repeating a mistake. So correction-saving is not the void this post once implied. The void is the curation discipline between the two files — and that’s the layer the vendors leave open for you to own.

The fix is two small mechanisms in that gap, and a non-negotiable human in the middle.

First, a slash command you write that proposes durable additions instead of letting corrections evaporate. What auto-memory gives you is a qualitative judgment call — it saves when it decides a correction was worth keeping. What you add is a deterministic trigger and an explicit human gate: the third time you correct the same thing in a session, that’s not a judgment call, that’s a countable signal of a missing rule. The command counts, reads the correction, and drafts a one-line edit for you to approve — promoting a transient correction into the canonical rules file, not just the agent’s private store.

Terminal window
# /remember — proposes a rules edit, never writes silently
/remember
Detected 3 corrections about the ORM this session.
Proposed edit to AGENTS.md:
- ORM: Sequelize
+ ORM: Prisma (migrated 2026-03; do not generate Sequelize)
Apply? [y/N]

Note the diff replaces the stale line. It doesn’t append a contradiction beneath it. The command’s job is to keep the write-once file honest by turning your repeated corrections into curation events.

Second, a hook that fires on conflict — when a session asserts a fact that contradicts a stored one. Instead of silently carrying both, the hook surfaces the collision and applies one rule: the newer fact wins, pending confirmation.

# pseudocode hook: on memory write, detect + resolve contradiction
def on_remember(new_fact, store):
clash = store.contradicts(new_fact) # same key, different value
if clash:
print(f"CONFLICT: stored '{clash.value}' ({clash.date}) "
f"vs new '{new_fact.value}' ({new_fact.date})")
print("Newer fact taken as source of truth. Confirm? [y/N]")
if confirm():
store.supersede(clash, new_fact) # mark old as superseded, don't delete
else:
store.add(new_fact)

supersede is the whole trick. You don’t destroy history — you demote it, so the old fact stops being retrievable as current while staying auditable. That’s forgetting done responsibly: not amnesia, precedence.

Be honest about what that pseudocode can and can’t catch, though. contradicts here is doing the easy version — same key, different value, ORM: Sequelize versus ORM: Prisma. Most real reversals aren’t that tidy. “We use REST” and “the new billing service is gRPC-only” don’t share a key; they’re a scoped contradiction the string match sails straight past. A correction like “stop assuming every table is soft-deleted — that was only the legacy schema” has no prior line to supersede at all; it’s a fresh constraint that quietly invalidates a dozen unstated assumptions. The detector will miss these, and that’s fine — the deterministic trigger was never meant to be a perfect contradiction oracle. Its job is to put a cheap, frequent prompt in front of the human, who is the only one who can tell a scoped exception from a global reversal. The machine surfaces the candidate; the judgment stays yours.

Third, and this is the load-bearing part: the human approval gate. Both mechanisms propose; neither writes unattended. That gate is the only thing standing between “memory that compounds” and “memory that drifts into confident nonsense.” An agent allowed to silently rewrite its own durable context will, eventually, persist a hallucination and then cite it to you as established fact. The asymmetry is the whole reason to bother: a missing fact makes the agent ask, which costs you a sentence; a wrong fact retrieved as truth makes the agent act, which costs you a debugging session and your trust in the store. Unattended writes optimize for the cheap failure and quietly manufacture the expensive one. The approval prompt is cheap insurance against your memory store becoming a generator of fiction. Keep it under your permissions policy, not the agent’s discretion.

The loop earns its keep on a long-lived codebase with a team and a decision history — a context that reverses, repeatedly, over months. Aim it at the wrong target and the curation becomes its own kind of rot.

Don’t promote ephemera. A correction like “use port 3001 this time, 3000 is taken” is true for an afternoon, not a quarter; bake it into the rules file and you’ve manufactured a stale line the moment you reboot. The deterministic trigger counts repetition across sessions for exactly this reason — a fact worth a durable edit is one you’ve re-corrected on Tuesday and again on Friday, not three times inside one debugging spiral about the same flaky test. Three corrections can be one missing rule, or they can be three faces of a problem that evaporates the instant you fix the underlying bug. The counter only flags the candidate; the human gate is what tells those two apart.

And on a solo throwaway prototype, skip the apparatus entirely. The rot thesis assumes the file outlives your memory of writing it. If the project won’t, an un-curated AGENTS.md is perfectly fine — the cheapest curation loop is a short-lived project. The discipline scales with the lifespan of the context, not with the existence of an agent.

The third correction stops being wasted breath — it becomes a one-line edit you approve in two seconds, and the agent never generates a Sequelize model again. A later session that contradicts a stored fact doesn’t quietly poison the store; it raises a flag, takes the newer truth, and demotes the old one with a paper trail. Corrections compound into a sharper rules file instead of accumulating into a self-contradicting one.

This is the gap the whole site is about, viewed through one primitive. The agent is fast and broad but contextless — it has no yesterday, no sense that a decision reversed. You have the context but you’re tired of repeating it. The loop is the bridge: it converts your scarce, expensive corrections into the agent’s cheap, durable context, without letting the durability calcify into rot.

Recall is easy. Every vendor ships it. Even forgetting now ships — automatically, on the agent’s judgment, inside its own store. What no vendor ships, and what this loop is, is audited forgetting: a human-gated curation layer sitting between the file you write and the memory the agent writes itself. Keeping a rules file true over time is a context-engineering discipline, and it’s precisely the part the agent can’t do for you unsupervised — because the judgment about which yesterday is now wrong is yours, not the model’s.

For the mechanics: write the canonical context in Rules; propose edits with a slash command; resolve conflicts and gate writes with a hook.