The README for AI: How to Write an AGENTS.md That Ships

AGENTS.md is the README for AI assistants: a practical, technical guide to structure, layering, boundaries, and maintenance for better AI coding outcomes.

Most engineering teams now have two READMEs, whether they admit it or not:

  1. README.md for humans.
  2. A scattered set of half-remembered prompts for AI assistants.

That second one is where quality collapses. The assistant runs the wrong tests, edits files it shouldn’t, and wastes tokens rediscovering context your team already knows.

AGENTS.md fixes that. It gives your coding assistant a predictable operating manual: what this repo is, how to validate work, what boundaries matter, and where not to touch.

I’ll be direct: if your team uses AI coding tools daily and still relies on chat-only instructions, you are paying a productivity tax every single week.

AGENTS.md is the README for AI

Who this is for: engineering teams using AI coding assistants in active repos.
Not for: teams expecting prompt-only workflows without repeatable runbooks and boundaries.

The agents.md format defines a simple, open markdown convention for agent instructions. OpenAI Codex docs explicitly describe AGENTS.md discovery and precedence from global to local folders. GitHub Copilot docs now support AGENTS.md alongside repository and path-specific instruction files. Anthropic’s Claude Code memory model uses the same core principle: instruction hierarchy plus locality.

Different tools, same pattern: persistent, versioned instructions beat repeated ephemeral prompting.

That’s why I frame it this way: README.md explains your project to humans; AGENTS.md explains your project to AI.

What a high-signal AGENTS.md must contain

The best files are operational, not philosophical. If a line does not change behavior, remove it.

1) Scope and safety boundaries

Start with what the assistant may do. Then define approval gates. Then define hard “never” rules.

  • Always: modify src/ and tests/, run required validation commands.
  • Ask first: new dependencies, migrations, infrastructure or CI workflow edits.
  • Never: touch secrets, rotate keys, or alter generated/vendor output.

This single section eliminates a huge class of costly mistakes.
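The Always / Ask First / Never split also lends itself to automation. A minimal sketch, assuming hypothetical rule patterns and a made-up `classify` helper (none of this comes from a real tool), shows how the same boundaries could back a pre-commit or review check:

```typescript
// Hypothetical sketch: boundary rules as data, with the strictest verdict winning.
type Verdict = "always" | "ask-first" | "never";

const rules: Array<{ pattern: RegExp; verdict: Verdict }> = [
  { pattern: /^(src|tests)\//, verdict: "always" },
  { pattern: /^\.github\/workflows\//, verdict: "ask-first" },
  { pattern: /^(vendor|generated)\//, verdict: "never" },
  { pattern: /\.env(\..*)?$/, verdict: "never" },
];

function classify(filePath: string): Verdict {
  // "never" beats "ask-first", which beats "always".
  const verdicts = rules.filter((r) => r.pattern.test(filePath)).map((r) => r.verdict);
  if (verdicts.includes("never")) return "never";
  if (verdicts.includes("ask-first")) return "ask-first";
  if (verdicts.includes("always")) return "always";
  return "ask-first"; // unknown paths default to requiring approval
}

console.log(classify("src/index.ts")); // always
console.log(classify("vendor/lib.js")); // never
```

Defaulting unknown paths to "ask first" keeps the failure mode safe: the assistant pauses instead of guessing.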

2) Runnable command sequence

Agents need exact commands, in exact order, with exact flags. Avoid “run tests” language.

Good:

pnpm install
pnpm turbo run lint --filter web
pnpm turbo run test --filter web
pnpm turbo run build --filter web

Also include known caveats: required services, environment assumptions, and common failure modes.
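"Exact order" matters because later steps are meaningless after an earlier failure. A minimal sketch of that contract, with a hypothetical `runSequence` helper (the injectable runner is purely for illustration, not part of any real agent tool):

```typescript
// Hypothetical sketch: run validation commands in order, stopping at the first failure.
import { execSync } from "node:child_process";

const validation = [
  "pnpm install",
  "pnpm turbo run lint --filter web",
  "pnpm turbo run test --filter web",
  "pnpm turbo run build --filter web",
];

// An injectable runner keeps the sequence logic testable without a real toolchain.
function runSequence(
  commands: string[],
  run: (cmd: string) => void = (cmd) => execSync(cmd, { stdio: "inherit" }),
): { ok: boolean; failedAt?: string } {
  for (const cmd of commands) {
    try {
      run(cmd);
    } catch {
      return { ok: false, failedAt: cmd }; // later steps never run after a failure
    }
  }
  return { ok: true };
}
```

Reporting `failedAt` mirrors what you want from the agent itself: not just "validation failed," but which exact step failed.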

3) Repository wayfinding

Add a compact map of the system so the assistant can navigate with purpose:

  • where API entry points live
  • where domain logic belongs
  • where tests are expected
  • where lint/typecheck/CI configs are stored

This reduces exploration overhead and makes edits more deterministic.

[Figure: AGENTS.md structure overview — high-signal structure: scope, commands, style examples, and boundaries.]

4) Style examples, not style lectures

One real “good” snippet and one “bad” snippet is better than fifteen abstract bullets. Models imitate patterns. Show patterns.

Pull examples from your own codebase. Synthetic examples help less than you think.
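To show the shape of such a pair (these snippets are illustrative stand-ins, not from a real repo — yours should come from your own codebase):

```typescript
// Good: small pure function; the agent can imitate this shape everywhere.
function applyDiscount(totalCents: number, percent: number): number {
  return Math.round(totalCents * (1 - percent / 100));
}

// Anti-pattern: mutable module state plus calculation plus I/O in one function.
let lastTotal = 0; // hidden state shared across calls
function applyDiscountAndLog(totalCents: number, percent: number): number {
  lastTotal = totalCents * (1 - percent / 100); // mutation mixed with logic
  console.log("discounted:", lastTotal); // side effect buried in a calculation
  return lastTotal;
}
```

The comments matter as much as the code: they tell the model *why* one shape is preferred, not just which one.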

5) Layering strategy for monorepos

Use root AGENTS.md for shared baseline rules. Add local AGENTS.md files near subprojects where workflows diverge. Keep local files short and specific.

This aligns with documented precedence behavior in Codex and with memory locality patterns in Claude Code.
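The nearest-folder principle is easy to reason about as code. A minimal sketch, assuming a hypothetical `agentsFilesFor` helper (tools implement their own discovery; this only illustrates the ordering):

```typescript
// Hypothetical sketch of nearest-folder precedence: collect AGENTS.md paths
// from the repo root down to the working directory. Later entries are more
// local, so they override earlier ones on conflicts.
import * as path from "node:path";

function agentsFilesFor(repoRoot: string, workingDir: string): string[] {
  const rel = path.relative(repoRoot, workingDir);
  const segments = rel === "" ? [] : rel.split(path.sep);
  const files = [path.join(repoRoot, "AGENTS.md")];
  let current = repoRoot;
  for (const seg of segments) {
    current = path.join(current, seg);
    files.push(path.join(current, "AGENTS.md"));
  }
  return files; // apply in order; the last (most local) file wins
}

console.log(agentsFilesFor("/repo", "/repo/apps/web"));
// ["/repo/AGENTS.md", "/repo/apps/AGENTS.md", "/repo/apps/web/AGENTS.md"]
```

This is also why local files should stay short: they only need to state the deltas from the root baseline.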

[Figure: Instruction layering model — global defaults + repository policy + nearest-folder overrides.]

A practical template (battle-tested and boring on purpose)

# AGENTS.md
## Project Overview
- Stack: Node 22, TypeScript, PostgreSQL 16
- Apps: /apps/web, /apps/api
- Shared packages: /packages/*

## Boundaries
- Always: edit /apps/* and /packages/*, add/update tests
- Ask first: db migrations, dependencies, CI changes
- Never: secrets, keys, vendor/generated assets

## Validation
1. pnpm install
2. pnpm turbo run lint --filter web
3. pnpm turbo run test --filter web
4. pnpm turbo run build --filter web

## Architecture map
- API routes: /apps/api/src/routes
- Domain services: /packages/domain
- Frontend pages: /apps/web/src/pages
- Tests: /apps/web/tests + /apps/api/tests

## Style guide
- Prefer small pure functions
- Keep orchestration in services, not controllers/components
- Use descriptive names over abbreviations

## Good example
// real snippet from this repo

## Anti-pattern
// real snippet showing what to avoid

## PR expectations
- 3-bullet summary
- risk + rollback note when relevant
- list executed validations

Common failure patterns I see in real teams

Failure #1: Instruction bloat

A giant 500-line AGENTS.md feels comprehensive, but it dilutes signal and increases contradiction risk. Keep root guidance compact; push specifics down to local files.

Failure #2: No ownership model

If nobody owns AGENTS.md, it rots. Assign ownership to the same people who own build/test reliability.

Failure #3: Static instructions in a moving repo

Toolchains evolve. If commands are stale, trust drops and engineers stop relying on agents.

Failure #4: Treating prompts as memory

Anything repeated in chat should be promoted into AGENTS.md. Repetition is your refactoring signal.

How to roll this out in one sprint

  1. Day 1: create a root AGENTS.md with boundaries + command runbook.
  2. Day 2-3: run actual agent tasks; log missteps and missing context.
  3. Day 4: add repo map and style examples from real code.
  4. Day 5: add 1-2 nested AGENTS.md files for high-variance subfolders.
  5. Day 6-7: validate against real work; remove fluff; tighten language.

This is lightweight process, not bureaucracy. You’re encoding operational knowledge once, then compounding it.

Mini case study

One team moved recurring chat instructions into AGENTS.md and added strict command runbooks. Within two weeks, they cut invalid command executions and reduced prompt length because the assistant stopped asking for repeated context.

Quality metrics worth tracking

  • Agent task completion rate without manual correction
  • Mean time from prompt to merge-ready diff
  • Number of invalid command executions per week
  • Incidents caused by boundary violations
  • Prompt length needed to achieve acceptable output
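Two of these are trivial to derive from a per-task log. A minimal sketch, assuming a hypothetical `TaskRecord` shape (how you capture the log is up to your tooling):

```typescript
// Hypothetical sketch: compute completion rate and invalid-command count
// from a simple per-task log.
interface TaskRecord {
  completedWithoutCorrection: boolean;
  invalidCommandRuns: number;
}

function metrics(log: TaskRecord[]) {
  const completionRate =
    log.filter((t) => t.completedWithoutCorrection).length / log.length;
  const invalidRuns = log.reduce((sum, t) => sum + t.invalidCommandRuns, 0);
  return { completionRate, invalidRuns };
}

const week: TaskRecord[] = [
  { completedWithoutCorrection: true, invalidCommandRuns: 0 },
  { completedWithoutCorrection: true, invalidCommandRuns: 1 },
  { completedWithoutCorrection: false, invalidCommandRuns: 2 },
  { completedWithoutCorrection: true, invalidCommandRuns: 0 },
];
console.log(metrics(week)); // { completionRate: 0.75, invalidRuns: 3 }
```

Even a spreadsheet-grade version of this is enough to see whether an AGENTS.md change helped within a week or two.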

If AGENTS.md is working, these numbers move in the right direction quickly.

[Figure: Observe–update–validate loop — observe failures, update instructions, validate on real tasks.]

Opinionated defaults I recommend

  • Keep root file around 100-150 lines.
  • Put command sections above style sections.
  • Use explicit Always / Ask First / Never language.
  • Include one real code example per major language/framework.
  • Document command prerequisites (services, env vars, versions).
  • After every agent-related incident, update AGENTS.md in the same PR.

Final take

Most teams assume AI reliability is mostly a model selection problem. In day-to-day engineering, it’s usually a context engineering problem.

AGENTS.md won’t magically fix poor architecture. But it will make good models dramatically more consistent, cheaper to run, and easier to trust.

So yes: AGENTS.md is the README for AI assistants. Treat it like production infrastructure, because that’s exactly what it becomes once your team depends on agents.
