Token Economy

The Token Economy extension tracks LLM usage costs across session, daily, and monthly windows. It supports soft and hard enforcement modes, a built-in model pricing catalog, and an LRU prompt cache for estimating savings.

Budget tracking

The BudgetTracker maintains three independent cost counters:

ScopeResetsPersistence
sessionOn each new sessionIn-memory only
dailyMidnight (auto-rollover)~/.mayros/token-budget.json
monthlyFirst of month (auto-rollover)~/.mayros/token-budget.json

Each counter tracks total USD spent, and compares against its configured limit. The status for each scope is one of ok, warn, or exceeded.

Warning threshold

When usage reaches the warnThreshold (default: 80% of the limit), the status changes to warn. A context message is injected into the prompt via the before_prompt_build hook.

Enforcement modes

Soft enforcement (default)

When a budget is exceeded, a polite message is prepended to the prompt asking the agent to wrap up. The agent can still make LLM and tool calls.

Hard enforcement

When a budget is exceeded, a system-level STOP message is injected into the prompt. After a configurable grace period (default: 3 tool calls), subsequent tool calls are blocked entirely via the before_tool_call hook.

typescript
// Hard enforcement flow:
// 1. Budget exceeded → warn message injected
// 2. Agent gets gracePeriodCalls more tool calls
// 3. After grace period → tool calls return { block: true, reason: "..." }

Model pricing catalog

A built-in catalog provides fallback pricing (USD per 1M tokens) for models from Anthropic, OpenAI, and Google. User-configured cost entries take priority over the catalog.

ProviderModelsContext windows
AnthropicOpus 4.6, Sonnet 4.6, Sonnet 4.5, Haiku 4.5, 3.5 Sonnet, 3.5 Haiku200K
OpenAIGPT-4o, GPT-4o Mini, o1, o1-mini, o3-mini128K–200K
GoogleGemini 2.0 Flash, Gemini 2.0 Pro, Gemini 1.5 Pro1M–2M

Each entry includes input, output, cacheRead, and cacheWrite rates plus context window size.

The catalog supports prefix matching: a model ID like claude-sonnet-4-6-20260301 matches the claude-sonnet-4-6 entry.

Prompt cache

The PromptCache uses an LRU eviction strategy with TTL-based expiry:

  • Key: SHA-256 hash of provider + model + systemPrompt + prompt
  • Capacity: configurable maxEntries (default: 256)
  • TTL: configurable ttlMs (default: 300,000 ms / 5 minutes)

The cache is observational — it does not skip LLM calls. Instead, it tracks cache hits and estimates how much would have been saved if the response had been reused. This provides visibility into prompt repetition without altering agent behavior.

Budget persistence and rollover

Budget state is persisted to ~/.mayros/token-budget.json (configurable via persistPath). The file is flushed:

  • Every 30 seconds during an active session
  • On session end
  • On service stop

Rollover is automatic: if the persisted dailyDate does not match today, the daily counter resets. If monthlyKey does not match the current month, the monthly counter resets.

Agent tools

ToolDescription
budget_statusShow session, daily, and monthly cost vs limits
budget_set_limitSet or update a budget limit at runtime
budget_model_usagePer-model cost and token breakdown
budget_pricing_catalogList the built-in pricing catalog

Hooks wiring

HookAction
session_startLoad persisted budget, rollover, init tracker
llm_outputAccumulate cost, update prompt cache
llm_inputCheck prompt cache for observational tracking
before_tool_callHard enforcement: block tool calls after grace
before_prompt_buildInject budget warnings or stop messages
session_endFlush budget to disk, clear cache

Configuration

typescript
{
  sessionLimitUsd?: number;      // Per-session limit (default: unlimited)
  dailyLimitUsd?: number;        // Daily limit (default: unlimited)
  monthlyLimitUsd?: number;      // Monthly limit (default: unlimited)
  warnThreshold: number;         // 0.0–1.0 (default: 0.8)
  persistPath: string;           // Default: "~/.mayros/token-budget.json"
  enforcement: "soft" | "hard";  // Default: "soft"
  gracePeriodCalls: number;      // Tool calls allowed after exceed (default: 3)
  cache: {
    enabled: boolean;            // Default: true
    maxEntries: number;          // Default: 256
    ttlMs: number;               // Default: 300000
  }
}

CLI

bash
mayros budget status             # Print budget summary table
mayros budget set <scope> <usd>  # Set a budget limit
mayros budget reset [scope]      # Reset counters (session|daily|monthly|all)
mayros budget models             # Per-model cost breakdown
mayros budget pricing            # Show built-in pricing catalog
mayros budget cache              # Prompt cache stats