Code Indexer

The Code Indexer scans project files and extracts structural knowledge — functions, classes, imports, and exports — into the Cortex knowledge graph as RDF triples. This gives agents an understanding of the codebase structure.

Supported languages

| Language | Extensions | Extracted patterns |
| --- | --- | --- |
| TypeScript | .ts, .tsx | functions, classes, interfaces, imports, exports |
| JavaScript | .js, .jsx, .mjs | functions, classes, imports, exports |
| Python | .py | functions, classes, imports |
| Rust | .rs | functions, structs, impls, use statements |
| Go | .go | functions, structs, interfaces, imports |
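The extension mapping above can be sketched as a simple lookup. This is an illustrative sketch only: `EXTENSIONS` and `detect_language` are hypothetical names, and the language identifiers are borrowed from the `languages` setting shown under Configuration. How the indexer resolves languages internally is not described here.

```python
from pathlib import PurePosixPath

# Extension table copied from the supported-languages list.
# The lowercase keys mirror the `languages` config values (an assumption).
EXTENSIONS = {
    "typescript": (".ts", ".tsx"),
    "javascript": (".js", ".jsx", ".mjs"),
    "python": (".py",),
    "rust": (".rs",),
    "go": (".go",),
}


def detect_language(path, enabled=tuple(EXTENSIONS)):
    """Return the language for `path`, or None if it is unsupported or disabled."""
    suffix = PurePosixPath(path).suffix
    for language in enabled:
        if suffix in EXTENSIONS.get(language, ()):
            return language
    return None
```

With this sketch, `detect_language("src/auth.ts")` returns `"typescript"`, while a `.go` file is ignored when only `("python",)` is enabled.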

The code indexer uses regex-based extraction, not AST parsing. This keeps indexing fast, but it may miss complex patterns such as deeply nested or dynamically generated definitions.
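To make the trade-off concrete, here is a minimal sketch of regex-based extraction for TypeScript source. The patterns and the `extract_symbols` helper are hypothetical and written for illustration; the indexer's actual regexes are not shown in this document.

```python
import re

# Hypothetical line-anchored patterns; not the indexer's real ones.
TS_PATTERNS = {
    "function": re.compile(
        r"^\s*(?:export\s+)?(?:async\s+)?function\s+(\w+)", re.MULTILINE
    ),
    "class": re.compile(
        r"^\s*(?:export\s+)?(?:abstract\s+)?class\s+(\w+)", re.MULTILINE
    ),
    "interface": re.compile(r"^\s*(?:export\s+)?interface\s+(\w+)", re.MULTILINE),
    "imports": re.compile(
        r"""^\s*import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]""", re.MULTILINE
    ),
}


def extract_symbols(source: str) -> dict[str, list[str]]:
    """Collect the captured names for each pattern kind."""
    return {kind: pattern.findall(source) for kind, pattern in TS_PATTERNS.items()}


SAMPLE = '''\
import jwt from "jsonwebtoken";

export interface AuthConfig { secret: string }

export function validateToken(token: string): boolean {
  return token.length > 0;
}

export class AuthService {
  // Arrow-function property: missed by the function regex.
  refresh = async () => {};
}
'''

symbols = extract_symbols(SAMPLE)
```

In this sample the declared function, class, interface, and import are found, but the `refresh` arrow-function property is not, which is the kind of miss a regex approach accepts in exchange for speed.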

Incremental indexing

The indexer tracks file hashes to avoid re-indexing unchanged files:

  1. On each run, it computes a hash of every matching file
  2. Files with unchanged hashes are skipped
  3. Changed files have their old triples deleted and new triples created
  4. Deleted files have their triples removed

This makes re-indexing fast even on large codebases.
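The four steps above can be sketched as a comparison between the stored hashes and the current file contents. This is a sketch under stated assumptions: SHA-256 is chosen here for illustration (the document does not name the hash algorithm), and `content_hash` and `plan_reindex` are hypothetical names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    # SHA-256 is an assumption; any stable content hash would serve.
    return hashlib.sha256(data).hexdigest()


def plan_reindex(previous: dict[str, str], current_files: dict[str, bytes]):
    """Compare stored hashes with current contents.

    Returns (new_hashes, unchanged, changed, deleted). New files have no
    stored hash, so they are reported as changed.
    """
    current = {path: content_hash(data) for path, data in current_files.items()}
    unchanged = sorted(p for p in current if previous.get(p) == current[p])
    changed = sorted(p for p in current if previous.get(p) != current[p])
    deleted = sorted(p for p in previous if p not in current)
    return current, unchanged, changed, deleted


# First run: nothing stored yet, so every file is indexed.
first_run = {"src/auth.ts": b"v1", "src/db.ts": b"v1", "src/old.ts": b"v1"}
stored, _, _, _ = plan_reindex({}, first_run)

# Second run: one file edited, one added, one removed.
second_run = {"src/auth.ts": b"v2", "src/db.ts": b"v1", "src/new.ts": b"v1"}
stored, unchanged, changed, deleted = plan_reindex(stored, second_run)
```

On the second run only `src/auth.ts` and `src/new.ts` need new triples, `src/db.ts` is skipped, and the triples for `src/old.ts` would be removed.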

Triple patterns

The indexer generates triples following these patterns:

file:src/auth.ts → defines:function → "validateToken"
file:src/auth.ts → defines:class → "AuthService"
file:src/auth.ts → imports → "jsonwebtoken"
file:src/auth.ts → exports → "AuthService"
file:src/auth.ts → defines:interface → "AuthConfig"

These triples can be queried via mayros kg explore "file:src/auth.ts" or through the knowledge graph tools.
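As an illustration of the patterns above, the following sketch turns extracted symbol names into triples and renders them in the same arrow notation. The `Triple` type and `triples_for_file` helper are hypothetical; the real RDF serialization used by the knowledge graph is not described here.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

    def render(self) -> str:
        # Mirrors the arrow notation used in this section.
        return f'{self.subject} → {self.predicate} → "{self.obj}"'


def triples_for_file(path: str, symbols: dict[str, list[str]]) -> list[Triple]:
    """Build triples for one file from {kind: [names]} (hypothetical input shape)."""
    subject = f"file:{path}"
    triples = []
    for kind, names in symbols.items():
        # imports/exports are plain predicates; definitions get a defines: prefix.
        predicate = kind if kind in ("imports", "exports") else f"defines:{kind}"
        triples.extend(Triple(subject, predicate, name) for name in names)
    return triples


rendered = [
    t.render()
    for t in triples_for_file(
        "src/auth.ts",
        {
            "function": ["validateToken"],
            "class": ["AuthService"],
            "imports": ["jsonwebtoken"],
        },
    )
]
```

Each entry in `rendered` matches one of the example lines listed above, for instance `file:src/auth.ts → defines:function → "validateToken"`.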

CLI

```bash
# Trigger a manual re-index
mayros kg code --reindex

# Show triple counts by type
mayros kg stats
```

Configuration

```json5
{
  codeIndexer: {
    enabled: true,          // Enable/disable indexing
    include: ["src/**/*"],  // Glob patterns to include
    exclude: [              // Glob patterns to exclude
      "node_modules/**",
      "dist/**",
      "*.test.*",
    ],
    languages: ["typescript", "javascript", "python", "rust", "go"],
  },
}
```
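The sketch below shows one plausible way the `include` and `exclude` globs could combine to select files. The matching semantics are assumptions, not documented behavior: `**/` is taken to match zero or more directories, `*` to stay within one path segment, and a pattern without `/` (such as `*.test.*`) to match against the file name only. `glob_to_regex`, `matches`, and `should_index` are hypothetical helpers.

```python
import re
from pathlib import PurePosixPath


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex under the assumed semantics."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, across directories
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    # Assumption: slash-free patterns apply to the file name only.
    target = path if "/" in pattern else PurePosixPath(path).name
    return glob_to_regex(pattern).fullmatch(target) is not None


def should_index(path: str, include: list[str], exclude: list[str]) -> bool:
    """A file is indexed if some include matches and no exclude matches."""
    included = any(matches(p, path) for p in include)
    excluded = any(matches(p, path) for p in exclude)
    return included and not excluded


INCLUDE = ["src/**/*"]
EXCLUDE = ["node_modules/**", "dist/**", "*.test.*"]
```

Under these assumptions, `should_index("src/auth.ts", INCLUDE, EXCLUDE)` is true, while `src/utils/auth.test.ts` and `dist/index.js` are skipped.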