Context Window & Compaction

Every model has a context window: the maximum number of tokens it can see in a single request. Long-running chats accumulate messages and tool results; once the history approaches that limit, Mayros compacts older history to stay within it.

What compaction is

Compaction condenses the older part of a conversation into a single summary entry and keeps recent messages intact. The summary is stored in the session history, so future requests use:

  • The compaction summary
  • Recent messages after the compaction point

Because the summary is written to the session's JSONL history, compaction persists across requests.
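
As a rough illustration of how a request's context could be assembled from such a history, the sketch below reads JSONL entries, finds the most recent compaction entry, and returns that summary plus everything recorded after it. The entry shape (a "type" field set to "compaction", plus a "summary" field) and the function name build_context are assumptions made for this example, not Mayros's actual schema or API.

```python
import json


def build_context(jsonl_text: str) -> list[dict]:
    """Return the entries a request would use under the assumed schema.

    If the history contains a compaction entry, the result is the latest
    summary followed by every entry recorded after it. Otherwise the full
    history is returned unchanged.
    """
    entries = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

    # Locate the most recent compaction entry (hypothetical "type" marker).
    last_compaction = None
    for index, entry in enumerate(entries):
        if entry.get("type") == "compaction":
            last_compaction = index

    if last_compaction is None:
        return entries

    # Everything before the compaction point is represented by the summary.
    return entries[last_compaction:]
```

Entries before the compaction point stay in the file in this sketch; they are simply no longer sent to the model.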

Configuration

Use the agents.defaults.compaction setting in your mayros.json to configure compaction behavior (mode, target tokens, etc.).

Auto-compaction (default on)

When a session nears or exceeds the model’s context window, Mayros triggers auto-compaction and may retry the original request using the compacted context.
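
The trigger-and-retry flow can be pictured with the minimal sketch below. It is not Mayros's implementation: the function run_turn, the ContextOverflow exception, the reserve_tokens margin, and the count_tokens, compact, and send callables are all hypothetical stand-ins chosen for illustration.

```python
class ContextOverflow(Exception):
    """Raised by the hypothetical send() when the prompt does not fit."""


def run_turn(history, context_window, count_tokens, compact, send, reserve_tokens=1024):
    """Send one request, compacting first if the window is nearly full.

    Returns (reply, history_used). All callables are supplied by the caller.
    """
    # "Nears": the history plus a margin reserved for the reply no longer fits.
    if count_tokens(history) + reserve_tokens >= context_window:
        history = compact(history)

    try:
        return send(history), history
    except ContextOverflow:
        # "Exceeds": the provider rejected the request, so compact once
        # and retry the original request with the compacted context.
        history = compact(history)
        return send(history), history
```

A single retry keeps the sketch simple; if the compacted context still overflows, the second ContextOverflow propagates to the caller.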

You’ll see:

  • "🧹 Auto-compaction complete" in verbose mode
  • /status showing "🧹 Compactions: <count>"

Before compaction, Mayros can run a silent memory flush turn to store durable notes to disk. See Memory for details and config.

Manual compaction

Use /compact (optionally with instructions) to force a compaction pass:

/compact Focus on decisions and open questions

Context window source

The context window is model-specific. Mayros reads each model's limit from its definition in the configured provider catalog.

Compaction vs pruning

  • Compaction: summarizes older history and persists the summary in JSONL.
  • Session pruning: trims old tool results only, in-memory, per request.
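
To make the contrast concrete, here is a minimal sketch of the pruning side: it builds a per-request copy in which older tool results are replaced by a placeholder, and it never touches the stored history. The message shape (dicts with "role" and "content"), the keep_last parameter, and the placeholder text are assumptions for illustration; the actual pruning rules are described in /concepts/session-pruning.

```python
def prune_tool_results(messages, keep_last=2, placeholder="[tool result trimmed]"):
    """Return a per-request view with old tool results trimmed.

    Only messages whose role is "tool" are affected, and the input list
    and its dicts are left unmodified (in-memory only, nothing persisted).
    """
    tool_indexes = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
    # Keep the most recent tool results; everything earlier counts as old.
    old = set(tool_indexes[:-keep_last]) if keep_last > 0 else set(tool_indexes)

    pruned = []
    for i, message in enumerate(messages):
        if i in old:
            # Build a new dict so the stored message keeps its full content.
            pruned.append({**message, "content": placeholder})
        else:
            pruned.append(message)
    return pruned
```

Compaction, by contrast, would write a summary entry back to the session history, so its effect carries over to every later request.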

See /concepts/session-pruning for pruning details.

Smart Compaction

When the memory-semantic plugin is enabled, compaction goes beyond summarization. The CompactionExtractor analyzes the conversation before compaction and extracts structured knowledge into Cortex:

  • Conventions: coding patterns and rules observed in the conversation (e.g., "always use strict TypeScript")
  • Decisions: architectural choices with rationale (e.g., "chose QuickJS over Node VM for isolation")
  • Changes: code modifications made during the session
  • Findings: debugging insights and discoveries (e.g., "the timeout was caused by missing await")
  • Error patterns: recurring errors and their resolutions

Extraction runs automatically via the before_compaction hook. Extracted knowledge is stored as typed RDF triples in Cortex, which makes it available for recall in later sessions.

This means that even after compaction removes the original conversation details, the key knowledge is preserved in the knowledge graph.
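
To give a feel for what "typed RDF triples" might look like, the sketch below turns one extracted item into subject-predicate-object triples. Everything named here is hypothetical: the Triple class, the to_triples function, the "ex:" prefix, and the predicate names are illustrative only and do not describe the vocabulary Cortex or the CompactionExtractor actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


# The five knowledge categories listed above, as hypothetical type names.
KINDS = {"convention", "decision", "change", "finding", "error_pattern"}


def to_triples(session_id: str, index: int, kind: str, text: str) -> list[Triple]:
    """Describe one extracted item as a small set of typed triples."""
    if kind not in KINDS:
        raise ValueError(f"unknown knowledge kind: {kind}")
    node = f"ex:{session_id}/knowledge/{index}"
    return [
        Triple(node, "rdf:type", f"ex:{kind}"),              # the item's type
        Triple(node, "ex:text", text),                       # what was learned
        Triple(node, "ex:fromSession", f"ex:{session_id}"),  # provenance
    ]
```

Because each item carries its type and originating session, a later session could query for, say, all decisions without needing the original transcript.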

See Project Memory for details on how extracted knowledge is used.

Tips

  • Use /compact when sessions feel stale or context is bloated.
  • Large tool outputs are already truncated; pruning can further reduce tool-result buildup.
  • If you need a fresh slate, /new or /reset starts a new session with a new session ID.