Without persistent memory, agents re-solve problems, forget constraints, and make the same mistakes across sessions. ZO treats memory as required infrastructure: every session begins with a read and ends with a write.

The four memory files

Every project gets a memory/ directory with four canonical files (or .zo/memory/ in the portable layout):

STATE.md

Current phase, last checkpoint, agent statuses, blockers, next steps. Overwritten at each session end. Atomic-write protected.
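Atomic-write protection is typically the write-to-temp-then-rename pattern, so a reader never observes a half-written STATE.md. A minimal sketch (the function name and temp-file prefix are illustrative, not ZO's actual API):

```python
import os
import tempfile

def atomic_write(path: str, content: str) -> None:
    """Write content to path atomically: readers see the old file or the new one, never a partial write."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Temp file must live in the same directory (same filesystem) for rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, prefix=".state-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # force bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp_path)
        raise
```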

DECISION_LOG.md

Append-only audit trail. Every architectural decision, gate passage, scope change. Each entry has a timestamp, type, title, decision, rationale, alternatives considered, outcome.
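A hypothetical entry carrying those fields might look like this (timestamp, title, and content invented for illustration; the exact layout ZO uses may differ):

```markdown
## 2025-01-14T09:32Z — [architecture] Store engineered features as Parquet

- **Decision:** write feature tables as Parquet, not CSV.
- **Rationale:** smaller on disk; schema travels with the data.
- **Alternatives considered:** CSV (status quo), SQLite.
- **Outcome:** adopted; loaders updated in the same session.
```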

PRIORS.md

Domain knowledge accumulated through running ZO. Each prior references the failure that triggered it. After 23 sessions, ZO has 34 documented priors.

sessions/

Per-session summary files (session-NNN-YYYY-MM-DD.md). Written at session end. Captures what was attempted, what shipped, what’s next.

Portable memory: .zo/

Project memory lives in the delivery repo, not the ZO repo. This is by design:
  • The ZO repo is public (open source). It can’t contain client/project state.
  • The delivery repo is private, committed to git, already travels between machines.
  • git pull on a new machine brings code AND state.
Layout in the delivery repo:
delivery/
├── .zo/
│   ├── config.yaml              # portable project config (committed)
│   ├── local.yaml               # machine-specific (paths, GPU info — gitignored)
│   ├── memory/
│   │   ├── STATE.md
│   │   ├── DECISION_LOG.md
│   │   ├── PRIORS.md
│   │   ├── snapshots/           # phase snapshots (every gate PROCEED)
│   │   └── sessions/
│   ├── plans/
│   │   └── <project>.md         # the plan, also lives here
│   └── experiments/
│       └── exp-NNN/             # Phase 4 experiment trail
└── src/, models/, reports/, ...
ZO’s own platform memory (memory/zo-platform/) is the only memory tracked in the public ZO repo. It captures what ZO learned generically, never project specifics. See PR-024 / PR-028 / PR-030 for the confidentiality enforcement story.

Cross-machine: zo continue --repo

Move a project from a Mac dev box to a Linux GPU server:
# On Mac: commit and push
cd /path/to/delivery
git add .zo/ && git commit -m "feat: ZO state checkpoint" && git push

# On GPU server: clone and resume
git clone <delivery-repo>
cd <delivery>
zo continue --repo $(pwd)
The CLI auto-detects the .zo/ layout, loads project context, and resumes from the recorded phase. Machine-specific paths (data location, GPU details) are re-detected via zo.environment.detect_environment() and written to .zo/local.yaml (gitignored).

Semantic recall

DECISION_LOG.md accumulates fast: a long-running project can have 200+ decision entries. The semantic index lets agents query it in natural language:
# Inside a Claude Code session:
/memory:recall "what did we try last time for feature selection?"
How it works:
  • src/zo/semantic.py embeds each DECISION_LOG entry (1 vector per entry, summary derived from title + outcome)
  • Queries cosine-match against the summary embedding
  • The full entry is injected into context (not just the summary)
  • Storage: SQLite at {memory_root}/index.db
  • Embeddings: fastembed (optional dependency; falls back to word-overlap if missing — see DECISION_LOG entry from session 2)
This optimises for context-window density: 3 highly relevant full decisions beat 10 noisy fragments.
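The ranking step can be sketched under the word-overlap fallback (the `recall` function and in-memory entry list are illustrative; the real implementation in src/zo/semantic.py persists to SQLite and prefers fastembed vectors):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Word-overlap fallback: a bag-of-words 'vector' built from the summary."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(query: str, entries: list[dict], k: int = 3) -> list[str]:
    """Match the query against each entry's summary (title + outcome),
    but return the FULL entry text for context injection."""
    q = embed(query)
    ranked = sorted(
        entries,
        key=lambda e: cosine(q, embed(e["title"] + " " + e["outcome"])),
        reverse=True,
    )
    return [e["full_text"] for e in ranked[:k]]
```

Note the asymmetry: similarity is computed over the compact summary, but the payload injected into context is the whole entry.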

The session lifecycle

1. Session start

Agent reads STATE.md (current phase, blockers). Queries semantic index for relevant past decisions. Loads only the spec files needed for the current task.
2. Work

Agents execute. Every architectural decision is appended to DECISION_LOG.md immediately (not batched). Comms events log to JSONL.
3. Failure → prior

If anything fails, the post-mortem protocol fires: document the failure, classify root cause (missing_rule / incomplete_rule / ignored_rule / novel_case / regression), fix the symptom, update the rule that allowed it, verify the fix prevents recurrence.
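The five root-cause classes could be modeled as a simple enum attached to each new prior (a sketch; ZO's actual types and field names are assumptions here):

```python
from dataclasses import dataclass
from enum import Enum

class RootCause(Enum):
    MISSING_RULE = "missing_rule"        # no rule existed that would have caught this
    INCOMPLETE_RULE = "incomplete_rule"  # a rule existed but did not cover this case
    IGNORED_RULE = "ignored_rule"        # a rule existed and was not followed
    NOVEL_CASE = "novel_case"            # genuinely new situation
    REGRESSION = "regression"            # a previously fixed failure recurred

@dataclass
class Prior:
    prior_id: str            # e.g. "PR-NNN"
    summary: str
    root_cause: RootCause
    triggering_failure: str  # reference back to the failure that earned it
```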
4. Phase completion

At every gate PROCEED, a PhaseSnapshot is written to {memory_root}/snapshots/. Captures the phase’s full context for the next-phase Lead.
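A phase snapshot can be as simple as serializing the phase's context to a timestamped file under snapshots/. A sketch, assuming a JSON payload and this filename scheme (both illustrative):

```python
import json
import os
from datetime import datetime, timezone

def write_phase_snapshot(memory_root: str, phase: str, context: dict) -> str:
    """Persist the phase's full context for the next-phase Lead to load."""
    snap_dir = os.path.join(memory_root, "snapshots")
    os.makedirs(snap_dir, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = os.path.join(snap_dir, f"{phase}-{stamp}.json")
    with open(path, "w") as f:
        json.dump({"phase": phase, "context": context}, f, indent=2)
    return path
```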
5. Session end

Session summary written to sessions/. STATE.md updated with final phase, last completed subtask, blockers. DECISION_LOG.md flushed. Semantic index re-indexes new entries.

Phase-aware context resets

Planning, building, and maintenance are separate conversation contexts. When transitioning from planning to building, the orchestrator closes the planning context and opens a fresh building context, loading only:
  • STATE.md (current state)
  • DECISION_LOG.md (recent + semantic-matched older decisions)
  • PRIORS.md (relevant priors)
  • The previous phase’s snapshot
This prevents accumulation of irrelevant reasoning and keeps token costs predictable.
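Assembling the fresh context from that minimal set could look like the following (illustrative; the real loader trims DECISION_LOG.md to recent plus semantic-matched entries rather than reading the whole file):

```python
import os

LOADED_ON_RESET = ["STATE.md", "DECISION_LOG.md", "PRIORS.md"]

def fresh_context(memory_root: str, snapshot_path: str) -> dict:
    """Build a new phase context from memory files plus the previous
    phase's snapshot; nothing else carries over from the old conversation."""
    ctx = {}
    for name in LOADED_ON_RESET:
        path = os.path.join(memory_root, name)
        ctx[name] = open(path).read() if os.path.exists(path) else ""
    ctx["snapshot"] = open(snapshot_path).read()
    return ctx
```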

Self-evolution in practice

The 34 priors in memory/zo-platform/PRIORS.md are the cumulative output of this protocol. A few examples:
  • PR-001 — claude --print --dangerously-skip-permissions exits immediately. Captured after a tmux pane stayed blank during MNIST testing.
  • PR-005 — Aspirational rules without enforcement are dead letter. Captured after a documentation cascade was repeatedly ignored despite being written in CLAUDE.md.
  • PR-028 — Project memory belongs in the delivery repo, not the platform repo. Captured after a Mac → GPU server transfer broke zo status.
  • PR-034 — PyTorch MPS tensor extraction returns garbage under pytest. Captured after CIFAR-10 oracle tests failed mysteriously despite the same code working in training.
Each prior was earned by a real failure. The same mistake never happens twice.

Next

zo continue

Resume a paused project on the same or a different machine.

Self-evolution protocol

The full post-mortem and rule-update protocol.