# Learning & Routing
Kaneru agents improve over time. Every completed mission updates the agent's learning profile, and the routing system uses these profiles to assign future missions to the best-suited agent.
## Learning Profiles
A learning profile tracks an agent's expertise across domain and task type pairs. For example, an agent might have high expertise in `typescript:security-scan` but lower expertise in `python:refactoring`.
Each profile entry stores:
| Field | Description |
|---|---|
| Domain | Language or area (e.g. typescript, rust, devops) |
| Task type | Kind of work (e.g. security-scan, code-review, implementation) |
| Expertise score | 0–100%, smoothed over time |
| Success rate | Fraction of missions completed successfully |
| Mission count | Total missions completed for this pair |
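A profile entry can be modeled as a simple record. The sketch below is illustrative only — the field names are assumptions, not Kaneru's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ProfileEntry:
    """One (domain, task type) entry in an agent's learning profile.

    Field names are illustrative; Kaneru's actual schema may differ.
    """
    domain: str          # e.g. "typescript"
    task_type: str       # e.g. "security-scan"
    expertise: float     # 0.0-1.0, EMA-smoothed over time
    success_rate: float  # fraction of missions completed successfully
    mission_count: int   # total missions for this pair

entry = ProfileEntry("typescript", "security-scan", 0.92, 0.95, 47)
```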
```bash
# View an agent's learning profile
mayros kaneru learn profile --agent scanner

# Output:
# typescript:security-scan — expertise: 92%, success: 95%, missions: 47
# typescript:code-review — expertise: 78%, success: 88%, missions: 23
# rust:security-scan — expertise: 65%, success: 80%, missions: 12
```
## EMA Smoothing
Expertise scores use Exponential Moving Average (EMA) smoothing rather than simple averages. This means recent performance matters more than early results, but the score remains stable against one-off failures.
The formula:

```
new_expertise = alpha * current_score + (1 - alpha) * previous_expertise
```
With alpha = 0.3, an agent's expertise adjusts gradually. A single poor result drops the score slightly, while consistent performance over many missions raises it steadily. This prevents both overreaction and stagnation.
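The update above can be sketched in a few lines of Python. This assumes scores in the 0–1 range and the alpha = 0.3 stated above; the function name is illustrative:

```python
ALPHA = 0.3  # weight on the most recent mission score

def update_expertise(previous: float, current_score: float, alpha: float = ALPHA) -> float:
    """EMA update: recent results matter more, but a one-off failure
    only moves the score slightly."""
    return alpha * current_score + (1 - alpha) * previous

# A single poor result (0.2) only dents a strong profile (0.92):
e = update_expertise(0.92, 0.2)  # 0.3*0.2 + 0.7*0.92 = 0.704

# Consistent high scores raise it back steadily:
for _ in range(5):
    e = update_expertise(e, 0.95)
```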
## Q-Learning Router
When a mission needs assignment, the routing system blends two signals:
- Q-learning values (60%) — reinforcement learning scores updated after each mission based on outcome quality
- Expertise scores (40%) — the EMA-smoothed profiles described above
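The 60/40 blend can be sketched as follows. The weights come from the description above; the agent names and per-agent scores are illustrative assumptions, not real router state:

```python
Q_WEIGHT = 0.6          # weight on the Q-learning value
EXPERTISE_WEIGHT = 0.4  # weight on the EMA-smoothed expertise score

def blended_score(q_value: float, expertise: float) -> float:
    return Q_WEIGHT * q_value + EXPERTISE_WEIGHT * expertise

def route(candidates: dict[str, tuple[float, float]]) -> str:
    """Pick the agent with the highest blended score.

    candidates maps agent name -> (q_value, expertise), both in 0-1.
    """
    return max(candidates, key=lambda name: blended_score(*candidates[name]))

# Hypothetical candidates for a typescript:security-scan mission:
agents = {
    "scanner":  (0.85, 0.92),
    "builder":  (0.40, 0.61),
    "reviewer": (0.70, 0.78),
}
best = route(agents)  # scanner: 0.6*0.85 + 0.4*0.92 = 0.878, the highest
```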
The blended score determines which agent gets the mission:
```bash
# Route a mission to the best agent
mayros kaneru route --mission "Audit the auth module for token leakage"

# Output:
# Agent: scanner
# Confidence: 87.3%
# Task type: security-scan
# Complexity: medium
# Domain: typescript
```
You can also restrict routing to a subset of available agents:
```bash
mayros kaneru route --mission "Review API rate limits" --agents scanner,builder,reviewer
```
The router considers file path context when provided:
```bash
mayros kaneru route --mission "Fix memory leak" --path src/auth/token-manager.ts
```
## Knowledge Transfer
When a mission completes with learning (`complete-with-learning`), two things happen:
- Profile update — the agent's expertise score is EMA-smoothed with the new result
- Knowledge transfer — relevant findings from the agent's namespace are merged into the venture's shared namespace
This means knowledge gained by one agent becomes available to all agents in the venture. The fusion uses an additive merge strategy by default.
```bash
mayros kaneru mission complete-with-learning \
  --mission <id> \
  --run run-001 \
  --domain typescript \
  --task-type security-scan \
  --score 0.95
```
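An additive merge keeps everything from both namespaces rather than overwriting. Here is a minimal sketch; the dict-of-lists shape is an assumption about how findings are keyed, not Kaneru's actual storage format:

```python
def additive_merge(shared: dict[str, list], agent: dict[str, list]) -> dict[str, list]:
    """Merge an agent's findings into the shared namespace without
    dropping anything: entries under the same key are concatenated,
    and new keys are added."""
    merged = {key: list(values) for key, values in shared.items()}
    for key, findings in agent.items():
        merged.setdefault(key, []).extend(findings)
    return merged

# Hypothetical findings keyed by topic:
shared = {"auth": ["token expiry unchecked"]}
agent = {"auth": ["refresh token logged"], "api": ["missing rate limit"]}
merged = additive_merge(shared, agent)
# merged["auth"] now holds both findings; "api" is newly added
```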
## Top Agents Ranking
Query the top-performing agents for any domain and task type:
```bash
mayros kaneru learn top --domain typescript --task-type security-scan

# Output:
# 1. scanner — expertise: 92%, missions: 47
# 2. reviewer — expertise: 78%, missions: 23
# 3. builder — expertise: 61%, missions: 8
```
This is useful for manual assignment or verifying that the Q-learning router is making good choices.
## MCP Tools
Two MCP tools expose learning functionality:
| Tool | Description |
|---|---|
| `kaneru_learn_profile` | View an agent's expertise profile |
| `kaneru_learn_top` | Rank agents by domain and task type |
## Next Steps
- Ventures & Missions — mission lifecycle and projects
- DAG Audit Trail — every learning update is DAG-committed
- Kaneru CLI Reference — full command reference