Drive it with a local or custom model

You’ve spent this chapter learning to put the best model behind each job. This lesson is about the one job where “best” isn’t the constraint. feedmill pulls from an internal feed your employer runs: incident postmortems, customer-support digests, the kind of content that has a clear rule attached. It does not leave the corporate network. The Atom parser for that source is panicking on a malformed <updated> element, and your usual move, handing the failing feed body to a strong hosted model and asking it to find the bug, would mean shipping that content to Anthropic or OpenAI. For this task, that’s not an option. The frontier model isn’t wrong; it’s just off the table.

This is exactly the wall the single-vendor tools have no answer for. Claude Code is built around Anthropic’s models and Codex around OpenAI’s, so every prompt, including the file contents the agent reads, goes to a hosted API. There’s no “but not for this one task” knob. OpenCode’s provider-agnostic design is the knob: point it at a model running on your own machine, and the only place the loop sends model requests is localhost. Same TUI, same read-propose-approve-apply loop you already know, just with a model that keeps the feed on your machine.

Point OpenCode at a local model

If you’ve connected providers from the catalog before, this looks almost identical, because local runtimes are in the catalog. Ollama and LM Studio both ship preloaded among OpenCode’s 75+ providers — they’re just providers whose endpoint happens to be localhost. The cleanest way to wire one is a provider block in feedmill’s opencode.json. For Ollama running its default server:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder": {
          "name": "Qwen2.5 Coder (local)"
        }
      }
    }
  }
}

Three things to notice. The npm field points at @ai-sdk/openai-compatible — OpenCode talks to Ollama through its OpenAI-compatible API surface rather than a bespoke integration, which is why the same recipe works for nearly any local runtime. The baseURL under options is the local server, not a cloud host; nothing in this block reaches the public internet. And models is a map you fill in yourself — OpenCode can’t enumerate what you’ve pulled down locally the way it knows Anthropic’s catalog, so you list the model ids you actually have. LM Studio is the same shape with a different port:

{
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "google/gemma-3n-e4b": {
          "name": "Gemma 3n (local)"
        }
      }
    }
  }
}
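Because the whole argument rests on those baseURL values staying local, you may want to check them mechanically rather than by eye. The following is a minimal sketch, not part of OpenCode: it assumes the config has only the provider / options / baseURL shape shown above, and the helper names isLoopbackURL and nonLocalProviders, plus the "remote" provider in the sample, are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"net/url"
	"sort"
)

// config mirrors only the fields used in the provider blocks above.
type config struct {
	Provider map[string]struct {
		Options struct {
			BaseURL string `json:"baseURL"`
		} `json:"options"`
	} `json:"provider"`
}

// isLoopbackURL reports whether rawURL's host is "localhost" or a loopback IP
// such as 127.0.0.1 or ::1. An empty or unparseable URL counts as not local.
func isLoopbackURL(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := u.Hostname()
	if host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

// nonLocalProviders returns, in sorted order, the provider keys whose
// baseURL does not point at the local machine.
func nonLocalProviders(raw []byte) ([]string, error) {
	var c config
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	var out []string
	for name, p := range c.Provider {
		if !isLoopbackURL(p.Options.BaseURL) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	// Hypothetical config: one local provider, one that is clearly remote.
	raw := []byte(`{"provider":{
		"ollama":{"options":{"baseURL":"http://localhost:11434/v1"}},
		"remote":{"options":{"baseURL":"https://api.example.com/v1"}}}}`)
	bad, err := nonLocalProviders(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("non-local providers:", bad)
}
```

A check like this only inspects the URL in the file. It says nothing about what the local runtime itself does with the request, so treat it as a guard against config mistakes, not as proof of isolation.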

With the block saved and Ollama serving the model, select it for the session the way you select any other — provider/model, where the provider is the key you named in the block:

> /models

  anthropic/claude-sonnet-4-5
  openai/gpt-5.2
  ollama/qwen2.5-coder          ← your local model, no slash-syntax exception
  ❯ select a model for this session
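One detail of the provider/model form is worth spelling out: the LM Studio model id above, google/gemma-3n-e4b, contains a slash of its own. The sketch below shows the reading that keeps such ids intact, splitting at the first slash only. That OpenCode resolves references exactly this way is an assumption here, and splitModelRef is a hypothetical helper, not an OpenCode function.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelRef splits a "provider/model" reference at the FIRST slash, so a
// model id that itself contains slashes stays in one piece. It reports
// ok=false when either side is missing.
func splitModelRef(ref string) (provider, model string, ok bool) {
	provider, model, ok = strings.Cut(ref, "/")
	if !ok || provider == "" || model == "" {
		return "", "", false
	}
	return provider, model, true
}

func main() {
	refs := []string{
		"ollama/qwen2.5-coder",
		"lmstudio/google/gemma-3n-e4b",
		"qwen2.5-coder", // no provider key: rejected
	}
	for _, ref := range refs {
		p, m, ok := splitModelRef(ref)
		fmt.Printf("%q -> provider=%q model=%q ok=%v\n", ref, p, m, ok)
	}
}
```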

Now hand it the private feed. The loop is unchanged — it reads the parser, proposes a fix, waits for your approval — except every token stays on your machine:

> ollama/qwen2.5-coder is driving now. the Atom parser in
  internal/parse/atom.go panics on the internal-incidents feed —
  it's the <updated> element. read the failing sample in
  testdata/incidents.xml and find the bug. do not summarize the feed
  body back to me, just the fix.

  ⏵ Read internal/parse/atom.go, testdata/incidents.xml          (2 files)

  The feed emits <updated></updated> with no text node. parseTime is
  called on the empty string and returns a nil time plus an error the
  caller doesn't check, so a later .Format panics. I'll treat an empty
  <updated> as "use <published>" and guard the nil path.

  ⏵ Edit internal/parse/atom.go

  Apply this change? › Yes / No

A local model won’t reason like a frontier one — it may need a tighter prompt and a closer read of its diff, and for a subtle bug you might still escalate. But for this bug, on this feed, “runs on a model that can’t see the network” is the requirement that outranks raw capability. OpenCode let you satisfy it without leaving the session.
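When you do that closer read of the diff, it helps to know roughly what a correct fix looks like. Here is a minimal sketch of the fallback the model describes, under stated assumptions: the entry type and entryTime function are hypothetical stand-ins for whatever internal/parse/atom.go really contains, and timestamps are parsed as RFC 3339, the format Atom specifies. It sidesteps the nil path by returning a time value and an error instead of a pointer.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"strings"
	"time"
)

// entry holds just the two timestamp fields discussed above.
type entry struct {
	Updated   string `xml:"updated"`
	Published string `xml:"published"`
}

var errNoTimestamp = errors.New("entry has neither <updated> nor <published>")

// entryTime prefers <updated> and falls back to <published> when <updated>
// is empty or whitespace. A non-empty but malformed value is returned as a
// parse error rather than silently skipped, so the caller has to handle it.
func entryTime(e entry) (time.Time, error) {
	for _, raw := range []string{e.Updated, e.Published} {
		raw = strings.TrimSpace(raw)
		if raw == "" {
			continue
		}
		return time.Parse(time.RFC3339, raw)
	}
	return time.Time{}, errNoTimestamp
}

func main() {
	// Synthetic sample reproducing the empty <updated> case.
	const sample = `<entry><updated></updated><published>2024-05-01T09:30:00Z</published></entry>`
	var e entry
	if err := xml.Unmarshal([]byte(sample), &e); err != nil {
		panic(err)
	}
	ts, err := entryTime(e)
	if err != nil {
		fmt.Println("skip entry:", err)
		return
	}
	fmt.Println(ts.Format(time.RFC3339)) // prints 2024-05-01T09:30:00Z
}
```

Whether a malformed <updated> should also fall back to <published> is a product decision; this sketch surfaces it as an error so the choice stays visible.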

Reach an arbitrary endpoint with a custom provider

Local runtimes are one case of a more general escape hatch. Maybe the private model isn’t on your laptop — it’s an internal inference gateway your platform team runs, exposing an OpenAI-compatible API behind your VPN. The mechanism is identical; only the URL and credentials change:

{
  "provider": {
    "internal": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Internal Gateway",
      "options": {
        "baseURL": "https://llm.corp.internal/v1",
        "apiKey": "{env:INTERNAL_LLM_KEY}"
      },
      "models": {
        "llama-3.3-70b": {
          "name": "Llama 3.3 70B (internal)"
        }
      }
    }
  }
}

Same @ai-sdk/openai-compatible package, same options shape — you’ve just added an apiKey, and the {env:INTERNAL_LLM_KEY} syntax pulls it from the environment so the secret never lands in a file you might commit. Reference it as internal/llama-3.3-70b, exactly the provider/model pattern you’ve used all chapter. The provider key (internal) is whatever you named the block; the slash separates it from the model id. There is no special case here — a local Ollama, an internal gateway, and Anthropic are the same shape of thing to OpenCode, which is the entire point of the design.
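To make “OpenAI-compatible” concrete, the sketch below builds the kind of request such an endpoint expects: a POST to the chat/completions path under the base URL, with the key sent as a bearer token. This is an illustration of the wire format, not OpenCode’s own code or the internals of @ai-sdk/openai-compatible. The newChatRequest helper is hypothetical, and the program prints the request instead of sending it because llm.corp.internal is the placeholder host from the config above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"
)

// message and chatRequest cover the minimal chat-completions body.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// newChatRequest builds POST {baseURL}/chat/completions with a JSON body.
// The Authorization header is only set when apiKey is non-empty, which
// matches local runtimes that are configured without a key.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(baseURL, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// The key comes from the environment, mirroring {env:INTERNAL_LLM_KEY}.
	req, err := newChatRequest(
		"https://llm.corp.internal/v1",
		os.Getenv("INTERNAL_LLM_KEY"),
		"llama-3.3-70b",
		"ping",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("auth header set:", req.Header.Get("Authorization") != "")
}
```

Swap the base URL for http://localhost:11434/v1 and drop the key, and the same request shape applies to the Ollama block earlier in the lesson.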

Let Zen choose when 75 is too many

The opposite problem from “I must use this one private model” is “I don’t want to choose at all.” The catalog is enormous, and most of those models are mediocre at driving an agent. For the non-private feeds — the public RSS and JSON sources where data residency isn’t a concern — you don’t want to benchmark seventy-five options to find the few that hold up under a real coding loop.

That’s what Zen is. OpenCode Zen is a curated, vetted shortlist — models the OpenCode team tested and tuned specifically for the agent loop, served through a single gateway so you get one key instead of ten accounts. You reference a Zen model as opencode/<model>:

> /models

  opencode/claude-sonnet-4-6
  opencode/gpt-5.5
  opencode/qwen3.7-max
  ❯ select a model for this session

{
  "model": "opencode/qwen3.7-max"
}

The opencode/ prefix is the tell: it’s not a provider you wired up, it’s the curated set, reached through Zen’s gateway. Think of it as the inverse of the custom-provider work above. The custom block is for when you know exactly which endpoint you must hit and OpenCode should get out of the way. Zen is for when you’d rather trust a vetted default than make the call yourself. feedmill wants both, on different feeds: the locked-down internal source pinned to a local model that can’t leak it, and the public-feed work running on whatever Zen says is currently good — with the freedom to switch between them mid-session, which is the muscle this whole chapter built.

You can now reach a model wherever it lives: on your laptop, behind your VPN, or in a curated shortlist you didn’t have to research. The private feed stays on a model that never sends it off your machine. That closes the providers-and-models chapter: you’ve gone from one vendor’s assumption to a tool where the model is just another thing you configure per job. Next, you’ll stop the agent reintroducing the same feedmill parser bugs by writing the rules down in “Teach feedmill’s house rules with AGENTS.md”.