Learning & Routing

Kaneru agents improve over time. Every completed mission updates the agent's learning profile, and the routing system uses these profiles to assign future missions to the best-suited agent.

Learning Profiles

A learning profile tracks an agent's expertise across domain/task-type pairs. For example, an agent might have high expertise in `typescript:security-scan` but lower expertise in `python:refactoring`.

Each profile entry stores:

| Field | Description |
| --- | --- |
| Domain | Language or area (e.g. `typescript`, `rust`, `devops`) |
| Task type | Kind of work (e.g. `security-scan`, `code-review`, `implementation`) |
| Expertise score | 0–100%, smoothed over time |
| Success rate | Fraction of missions completed successfully |
| Mission count | Total missions completed for this pair |
```bash
# View an agent's learning profile
mayros kaneru learn profile --agent scanner

# Output:
# typescript:security-scan — expertise: 92%, success: 95%, missions: 47
# typescript:code-review   — expertise: 78%, success: 88%, missions: 23
# rust:security-scan       — expertise: 65%, success: 80%, missions: 12
```
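
One way to picture a profile entry is as a small record keyed by domain and task type. This is an illustrative model only, not Kaneru's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ProfileEntry:
    """Hypothetical shape of one learning-profile entry."""
    domain: str          # e.g. "typescript"
    task_type: str       # e.g. "security-scan"
    expertise: float     # 0-100, EMA-smoothed
    success_rate: float  # fraction of successful missions
    mission_count: int   # total missions for this pair

entry = ProfileEntry("typescript", "security-scan", 92.0, 0.95, 47)
```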

EMA Smoothing

Expertise scores use exponential moving average (EMA) smoothing rather than a simple average. Recent performance therefore matters more than early results, while the score remains stable against one-off failures.

The formula:

```
new_expertise = alpha * current_score + (1 - alpha) * previous_expertise
```

With `alpha = 0.3`, an agent's expertise adjusts gradually: a single poor result lowers the score without erasing it, while consistent performance over many missions raises it steadily. This prevents both overreaction and stagnation.
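
As a rough sketch of how the update behaves (not the actual Kaneru implementation), with scores on the 0–100 scale:

```python
def update_expertise(previous: float, current_score: float, alpha: float = 0.3) -> float:
    """EMA update: recent results weigh more, but no single result dominates."""
    return alpha * current_score + (1 - alpha) * previous

expertise = 92.0
# One mediocre mission dents, but does not erase, a strong profile:
expertise = update_expertise(expertise, 50.0)
print(round(expertise, 1))  # 79.4

# A run of strong missions pulls the score back up step by step:
for _ in range(3):
    expertise = update_expertise(expertise, 95.0)
print(round(expertise, 1))
```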

Q-Learning Router

When a mission needs assignment, the routing system blends two signals:

  • Q-learning values (60%) — reinforcement learning scores updated after each mission based on outcome quality
  • Expertise scores (40%) — the EMA-smoothed profiles described above
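
The 60/40 blend can be sketched as follows. The candidate agents, their Q-values, and the 0–1 normalization of expertise are all illustrative assumptions, not Kaneru internals:

```python
def blended_score(q_value: float, expertise: float) -> float:
    """Blend the Q-learning value (60%) with the EMA expertise score (40%)."""
    return 0.6 * q_value + 0.4 * expertise

# Hypothetical candidates for a typescript:security-scan mission,
# with both signals normalized to 0-1 for illustration.
candidates = {
    "scanner":  {"q": 0.90, "expertise": 0.92},
    "reviewer": {"q": 0.70, "expertise": 0.78},
}

best = max(
    candidates,
    key=lambda a: blended_score(candidates[a]["q"], candidates[a]["expertise"]),
)
print(best)  # scanner
```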

The blended score determines which agent gets the mission:

```bash
# Route a mission to the best agent
mayros kaneru route --mission "Audit the auth module for token leakage"

# Output:
# Agent: scanner
# Confidence: 87.3%
# Task type: security-scan
# Complexity: medium
# Domain: typescript
```

You can also restrict routing to a subset of available agents:

```bash
mayros kaneru route --mission "Review API rate limits" --agents scanner,builder,reviewer
```

The router considers file path context when provided:

```bash
mayros kaneru route --mission "Fix memory leak" --path src/auth/token-manager.ts
```

Knowledge Transfer

When a mission completes with learning (via `complete-with-learning`), two things happen:

  1. Profile update — the agent's expertise score is EMA-smoothed with the new result
  2. Knowledge transfer — relevant findings from the agent's namespace are merged into the venture's shared namespace

This means knowledge gained by one agent becomes available to all agents in the venture. The fusion uses an additive merge strategy by default.
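
An additive merge of this kind could look like the following sketch. The dict-of-lists namespace shape and the finding strings are hypothetical, chosen only to show that existing shared entries are kept and new findings are appended:

```python
def additive_merge(shared: dict, agent_findings: dict) -> dict:
    """Merge agent findings into a copy of the shared namespace,
    appending new findings without overwriting existing ones."""
    merged = {key: list(values) for key, values in shared.items()}
    for key, findings in agent_findings.items():
        bucket = merged.setdefault(key, [])
        for finding in findings:
            if finding not in bucket:  # skip duplicates
                bucket.append(finding)
    return merged

shared = {"auth": ["uses JWT"]}
agent = {"auth": ["token leak in refresh path"], "deps": ["axios outdated"]}
print(additive_merge(shared, agent))
# {'auth': ['uses JWT', 'token leak in refresh path'], 'deps': ['axios outdated']}
```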

```bash
mayros kaneru mission complete-with-learning \
  --mission <id> \
  --run run-001 \
  --domain typescript \
  --task-type security-scan \
  --score 0.95
```

Top Agents Ranking

Query the top-performing agents for any domain and task type:

```bash
mayros kaneru learn top --domain typescript --task-type security-scan

# Output:
# 1. scanner    — expertise: 92%, missions: 47
# 2. reviewer   — expertise: 78%, missions: 23
# 3. builder    — expertise: 61%, missions: 8
```

This is useful for manual assignment or verifying that the Q-learning router is making good choices.
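
Conceptually, the ranking is a filter-and-sort over profile entries. A minimal sketch with made-up data:

```python
# Hypothetical flat list of profile entries across agents
profiles = [
    {"agent": "scanner",  "domain": "typescript", "task_type": "security-scan", "expertise": 92},
    {"agent": "builder",  "domain": "typescript", "task_type": "security-scan", "expertise": 61},
    {"agent": "reviewer", "domain": "typescript", "task_type": "security-scan", "expertise": 78},
    {"agent": "scanner",  "domain": "rust",       "task_type": "security-scan", "expertise": 65},
]

def top_agents(domain: str, task_type: str) -> list:
    """Keep entries matching the domain/task-type pair, highest expertise first."""
    matches = [p for p in profiles
               if p["domain"] == domain and p["task_type"] == task_type]
    return sorted(matches, key=lambda p: p["expertise"], reverse=True)

print([p["agent"] for p in top_agents("typescript", "security-scan")])
# ['scanner', 'reviewer', 'builder']
```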

MCP Tools

Two MCP tools expose learning functionality:

| Tool | Description |
| --- | --- |
| `kaneru_learn_profile` | View an agent's expertise profile |
| `kaneru_learn_top` | Rank agents by domain and task type |

Next Steps