agents.cli

#evals

1 post tagged #evals.

DISCOVERIES • May 29, 2026

'It worked when I tried it' is not a test for a non-deterministic system.

Treat prompts and rules like code: keep a golden dataset of inputs with known-good outputs, run it headless on every change, and gate it with a hook that fails the build when results fall below baseline. The eval that blocks the merge is the one that prevents regressions.

Sourabh Kushwah
11 MIN READ
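
The gate described in the summary can be sketched roughly as follows. This is a minimal illustration, not the post's actual implementation: the dataset contents, the `BASELINE_PASS_RATE` threshold, the exact-match scoring, and the `fake_agent` stand-in are all assumptions made so the example runs on its own. In practice `run_agent` would wrap whatever headless invocation your agent supports.

```python
from typing import Callable

# Hypothetical golden dataset: inputs paired with known-good outputs.
GOLDEN_CASES = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "3 * 3", "expected": "9"},
    {"input": "10 - 7", "expected": "3"},
    {"input": "8 / 2", "expected": "4"},
]

# Assumed threshold; a real baseline would come from a recorded earlier run.
BASELINE_PASS_RATE = 0.75


def pass_rate(run_agent: Callable[[str], str], cases: list[dict], trials: int = 3) -> float:
    """Fraction of (case, trial) runs whose output matches the known-good output.

    Each case is run several times because a single passing run says little
    about a non-deterministic system.
    """
    passed = 0
    for case in cases:
        for _ in range(trials):
            if run_agent(case["input"]).strip() == case["expected"]:
                passed += 1
    return passed / (len(cases) * trials)


def gate(run_agent: Callable[[str], str],
         cases: list[dict] = GOLDEN_CASES,
         baseline: float = BASELINE_PASS_RATE) -> int:
    """Return a process exit code: 0 at or above baseline, 1 below it."""
    rate = pass_rate(run_agent, cases)
    print(f"pass rate {rate:.2f} (baseline {baseline:.2f})")
    return 0 if rate >= baseline else 1


def fake_agent(prompt: str) -> str:
    """Stand-in for a headless agent call; deterministic so the sketch runs."""
    answers = {"2 + 2": "4", "3 * 3": "9", "10 - 7": "3", "8 / 2": "5"}
    return answers.get(prompt, "")
```

A hook or CI step could call `sys.exit(gate(my_agent))` so that a non-zero exit code blocks the merge. Exact string matching is the simplest possible scorer; real outputs often need a looser comparison.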
agents.cli

Notes on context engineering, AI coding agents, and the repo knowledge developers need to write down once.

© 2024 AGENTS.CLI, SYSTEM_STABLE_v2.4.0