Ollama

Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. Mayros integrates with Ollama's native API (/api/chat), supporting streaming and tool calling, and can auto-discover tool-capable models when you opt in with OLLAMA_API_KEY (or an auth profile) and do not define an explicit models.providers.ollama entry.

Quick start

  1. Install Ollama: https://ollama.ai

  2. Pull a model:

```bash
ollama pull gpt-oss:20b
# or
ollama pull llama3.3
# or
ollama pull qwen2.5-coder:32b
# or
ollama pull deepseek-r1:32b
```

  3. Enable Ollama for Mayros (any value works; Ollama doesn't require a real key):

```bash
# Set environment variable
export OLLAMA_API_KEY="ollama-local"

# Or configure in your config file
mayros config set models.providers.ollama.apiKey "ollama-local"
```

  4. Use Ollama models:

```json5
{
  agents: {
    defaults: {
      model: { primary: "ollama/gpt-oss:20b" },
    },
  },
}
```
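
As a quick sanity check after these steps, you can confirm that the local API answers and that Mayros picked the model up. Both commands assume the default local instance and also appear later on this page:

```bash
# Confirm Ollama is serving on the default local port
curl http://localhost:11434/api/tags

# Confirm Mayros lists the discovered Ollama models
mayros models list
```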

Model discovery (implicit provider)

When you set OLLAMA_API_KEY (or an auth profile) and do not define models.providers.ollama, Mayros discovers models from the local Ollama instance at http://127.0.0.1:11434:

  • Queries /api/tags and /api/show
  • Keeps only models that report tools capability
  • Marks reasoning when the model reports thinking
  • Reads contextWindow from model_info["<arch>.context_length"] when available
  • Sets maxTokens to 10× the context window
  • Sets all costs to 0

This avoids manual model entries while keeping the catalog aligned with Ollama's capabilities.
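
If you want to see the raw data this discovery is based on, you can query /api/show directly. The exact response fields vary between Ollama versions, but for a tool-capable model they typically look like the sketch below (the model tag is just an example):

```bash
# Inspect a pulled model's metadata via the native API
curl http://localhost:11434/api/show -d '{"model": "gpt-oss:20b"}'

# Fields Mayros reads from the JSON response:
#   capabilities   -> e.g. ["completion", "tools", "thinking"]
#   model_info     -> contains "<arch>.context_length"
```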

To see what models are available:

```bash
ollama list
mayros models list
```

To add a new model, simply pull it with Ollama:

```bash
ollama pull mistral
```

The new model will be automatically discovered and available to use.

If you set models.providers.ollama explicitly, auto-discovery is skipped and you must define models manually (see below).

Configuration

Basic setup (implicit discovery)

The simplest way to enable Ollama is via environment variable:

```bash
export OLLAMA_API_KEY="ollama-local"
```
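
The variable only needs to exist in the environment Mayros runs in. To make it persistent, add it to your shell profile; ~/.bashrc is assumed here, adjust for your shell:

```bash
# Persist the variable for future shells
echo 'export OLLAMA_API_KEY="ollama-local"' >> ~/.bashrc
source ~/.bashrc
```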

Explicit setup (manual models)

Use explicit config when:

  • Ollama runs on another host/port.
  • You want to force specific context windows or model lists.
  • You want to include models that do not report tool support.
```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434",
        apiKey: "ollama-local",
        api: "ollama",
        models: [
          {
            id: "gpt-oss:20b",
            name: "GPT-OSS 20B",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 8192,
            maxTokens: 81920  // 10× the context window
          }
        ]
      }
    }
  }
}
```

If OLLAMA_API_KEY is set, you can omit apiKey in the provider entry and Mayros will fill it for availability checks.

Custom base URL (explicit config)

If Ollama is running on a different host or port, point baseUrl at it. Remember that explicit config disables auto-discovery, so define models manually:

```json5
{
  models: {
    providers: {
      ollama: {
        apiKey: "ollama-local",
        baseUrl: "http://ollama-host:11434",
        models: [...]  // define models as shown above
      },
    },
  },
}
```
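
For this to work, the remote instance must listen on an address reachable from the machine running Mayros; by default Ollama binds to 127.0.0.1 only. A common approach is to set OLLAMA_HOST on the Ollama host and then verify connectivity from the Mayros side (ollama-host is the placeholder from the config above):

```bash
# On the Ollama host: listen on all interfaces instead of loopback only
OLLAMA_HOST=0.0.0.0:11434 ollama serve

# From the machine running Mayros: verify the remote instance is reachable
curl http://ollama-host:11434/api/tags
```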

Model selection

Once configured, all your Ollama models are available:

```json5
{
  agents: {
    defaults: {
      model: {
        primary: "ollama/gpt-oss:20b",
        fallbacks: ["ollama/llama3.3", "ollama/qwen2.5-coder:32b"],
      },
    },
  },
}
```

Advanced

Reasoning models

Mayros marks models as reasoning-capable when Ollama reports thinking in /api/show:

```bash
ollama pull deepseek-r1:32b
```
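
To check whether a pulled model actually reports this capability, inspect it locally; the exact output format depends on your Ollama version:

```bash
# Reasoning models list "thinking" among their capabilities
ollama show deepseek-r1:32b
```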

Model Costs

Ollama is free and runs locally, so all model costs are set to $0.

Streaming Configuration

Mayros's Ollama integration uses the native Ollama API (/api/chat) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.
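
If you want to see what the native endpoint looks like on the wire, you can call it directly. The request below is independent of Mayros and only assumes the gpt-oss:20b model from the quick start is pulled:

```bash
# Streamed chat completion against Ollama's native API;
# the response arrives as newline-delimited JSON chunks
curl http://localhost:11434/api/chat -d '{
  "model": "gpt-oss:20b",
  "messages": [{"role": "user", "content": "Say hello in one sentence."}],
  "stream": true
}'
```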

Legacy OpenAI-Compatible Mode

If you need to use the OpenAI-compatible endpoint instead (e.g., behind a proxy that only supports OpenAI format), set api: "openai-completions" explicitly:

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        apiKey: "ollama-local",
        models: [...]
      }
    }
  }
}
```

Note: The OpenAI-compatible endpoint may not support streaming + tool calling simultaneously. You may need to disable streaming with params: { streaming: false } in model config.
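
For reference, the OpenAI-compatible endpoint lives under /v1 and expects OpenAI-style request bodies. The call below only illustrates the request shape against a local instance:

```bash
# OpenAI-compatible chat completion (note the /v1 prefix and OpenAI request format)
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:20b",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```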

Context windows

For auto-discovered models, Mayros uses the context window reported by Ollama when available, otherwise it defaults to 8192. You can override contextWindow and maxTokens in explicit provider config.

Troubleshooting

Ollama not detected

Make sure you set OLLAMA_API_KEY (or an auth profile), did not define an explicit models.providers.ollama entry, and that Ollama is running:

```bash
ollama serve
```

Then check that the API is accessible:

```bash
curl http://localhost:11434/api/tags
```

No models available

Mayros only auto-discovers models that report tool support. If your model isn't listed, either:

  • Pull a tool-capable model, or
  • Define the model explicitly in models.providers.ollama.

To add models:

```bash
ollama list  # See what's installed
ollama pull gpt-oss:20b  # Pull a tool-capable model
ollama pull llama3.3     # Or another model
```

Connection refused

Check that Ollama is running on the correct port:

```bash
# Check if Ollama is running
ps aux | grep ollama

# Or restart Ollama
ollama serve
```
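
If the process is running but connections are still refused, Ollama may be bound to a non-default address or port. Checking the listening socket usually narrows this down; ss is assumed here (Linux), lsof works similarly on macOS:

```bash
# Check which address/port Ollama is listening on
ss -tlnp | grep 11434

# The root endpoint answers with a short status message when the server is up
curl http://localhost:11434/
```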

See Also