Model selection & multi-provider
What model selection is
Section titled “What model selection is”Every CLI in scope can be steered to a specific underlying model — and most can swap models mid-session. The model is the engine: a “smarter” model reasons more deeply, plans longer, and costs more per token; a “lighter” model is faster, cheaper, and good enough for most mechanical work.
The single-vendor / multi-vendor split shapes everything downstream. Claude Code runs Anthropic models only. Codex runs OpenAI models only. opencode is fully multi-provider — any model that fits its provider abstraction. Cursor ships a wide built-in roster (Anthropic, OpenAI, Google, xAI, Moonshot, plus its own Composer family). Copilot offers a tiered roster (Claude, GPT, Gemini) gated by plan and org policy. That distinction shapes model swaps, cost optimisation, side-by-side comparisons, and vendor-risk posture.
Why you’d want to think about this
Section titled “Why you’d want to think about this”You glance at the month’s API bill and realise you’ve been paying flagship-model rates to do… flagship-model things: renaming variables, regenerating docstrings, running greps and summarising the results, reformatting a JSON file. Tasks the cheapest model in the lineup would handle in half a second. You don’t need Opus to spell connectionTimeout correctly.
The reverse trap exists too. You’ve been letting the cheap fast model lead a complex refactor because you didn’t want to pay for the big one, and now you’re three rounds into untangling a design it tied itself in knots over. The cheap model’s hourly cost was negligible. The total cost — yours, in time spent unscrewing what it did — wasn’t.
That’s the gap model selection closes. Match the model to the work: heavyweight for hard reasoning (planning, code review of long diffs, debugging, architecture), default for everyday coding, lightweight for mechanical or exploratory tasks. The same CLI runs all three; you choose per task or per subagent.
Real-world examples of how teams allocate models:
- The main loop uses the default — Sonnet (Claude Code) or GPT-5.x default (Codex) — calibrated for general coding.
- Heavy planning / review uses the flagship — switch to Opus / strongest GPT-5 for “design this feature,” “review this 800-line diff,” “debug this race condition.” Switch back after.
- Exploration subagents use the cheap fast model — Haiku /
gpt-5.4-mini/ a small Gemini for “find every place we call X” or “summarise these 40 files.” The work is high-volume and low-stakes per call. - Side-by-side comparisons (opencode) — same prompt to Claude, GPT-5, and a Gemini; diff the answers. Useful for hard architecture questions where you don’t trust any single answer.
- Local / private models for sensitive code — opencode pointed at a local Ollama or LM Studio for code that mustn’t leave the machine.
- Provider failover — when one provider has an outage or rate-limits, opencode lets you swap providers without changing tool.
The test: if you’ve been using the same model for everything, you’re either overpaying or under-thinking. The right answer is usually “two or three models, picked per task.”
Why this and not…
Section titled “Why this and not…”| You want to… | Reach for | Not |
|---|---|---|
| Cheaper, faster runs on simple tasks | Switch to a lighter model | A heavier prompt on the flagship |
| Deeper reasoning on a hard problem | Switch to the flagship | More skills/context on a lighter one |
| Different model per worker | Per-subagent model field | Restarting the whole session |
| Compare two models on the same task | opencode side-by-side agents | Running the same prompt twice manually |
| Run a model offline / on-prem | opencode with local provider | Claude Code or Codex |
| Lock the team to one model for consistency | Pin model in project config | Verbal agreement |
Switching models doesn’t replace your prompt, skills, or memory — it changes who’s reading them. If your output quality problem isn’t about reasoning depth, a model swap won’t fix it.
How it works in each tool
Section titled “How it works in each tool”Models (current Claude 4.x tier):
claude-opus-4-7— flagship; 1M-token contextclaude-sonnet-4-6— balanced; 1M-token contextclaude-haiku-4-5— fast; 200K-token context
Switching: /model mid-session lists available models; pick one to switch the rest of the conversation. Pin a default via the model setting or the ANTHROPIC_MODEL env var.
Tier strategy:
- Opus — heaviest model; use for hard reasoning, planning, code review of long diffs.
- Sonnet — default for most coding work.
- Haiku — fast and cheap; use for trivial work or as a subagent’s model.
Claude Code does not support non-Anthropic models.
Models: OpenAI GPT-5.x family (and successors).
Switching: /model mid-session or --model <id> at launch. A model field in config.toml (or in a profile — see Configuration) sets the default.
Codex does not support non-OpenAI models.
Default model: gpt-5.5 for ChatGPT-authenticated sessions (with gpt-5.4 as fallback); gpt-5.2-codex for API-key auth. Known model IDs referenced in current docs include gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex, gpt-5.3-codex-spark (research preview), gpt-5.2-codex, gpt-5.1-codex-max. Codex doesn’t publish a static catalog — run codex debug models for the live list available to your account.
Multi-provider is opencode’s defining feature. Each agent can have its own model field pointing at any supported provider. Configure providers in opencode.json; switch in-session via TUI or per-agent.
“Zen” is opencode’s curated model selection — a vetted list of model+provider combinations tested for opencode’s workflow.
Common opencode setups:
- Default agent uses one provider (e.g. Sonnet for the loop), with
exploresubagent on Haiku for cheap context scans. - Side-by-side compare: spawn the same agent against two providers and diff the outputs.
Provider list: ~75+ providers preloaded via the Models.dev catalog — Anthropic, OpenAI, OpenRouter, Groq, Google (Gemini / Vertex AI), Amazon Bedrock, Azure OpenAI, Ollama, LM Studio, GitHub Copilot, GitLab Duo, DeepSeek, Together AI, Hugging Face, and many more. Credentials are added via /connect (or opencode auth login).
Custom providers: add a provider block to opencode.json. For OpenAI-compatible endpoints, use the @ai-sdk/openai-compatible npm package:
{ "provider": { "my-provider": { "npm": "@ai-sdk/openai-compatible", "name": "My Provider", "options": { "baseURL": "https://api.example.com/v1", "apiKey": "{env:MY_API_KEY}" }, "models": { "my-model": { "name": "My Model" } } } }}Reference the model as my-provider/my-model.
Model picker per chat. Cursor exposes a wide roster spanning Anthropic (Claude 4.5 / 4.6 / 4.7 across Sonnet, Opus, Haiku), OpenAI (GPT-5 family including Codex variants), Google (Gemini 2.5 / 3 Flash and Pro), xAI (Grok 4.x), Moonshot (Kimi K2.5), and Cursor’s own Composer family. Of every CLI in scope, only Cursor and opencode break the single-vendor mould.
Auto Mode routes between models on Cursor’s terms — “Auto allows Cursor to select models that balance intelligence, cost efficiency, and reliability.” Auto requests draw from a discounted Auto + Composer pool.
MAX Mode expands the context window to the model’s maximum and switches that request to token-based API pricing. As of late 2025, several frontier thinking models (GPT-5 Codex, Opus 4.5/4.6, Sonnet 4.5/4.6) auto-enable MAX Mode when selected, and users have flagged this as friction — it can’t always be turned off per-request. Unverified as of 2026-05-21 whether the per-request override has landed.
Plan Mode (Cursor 2.0) supports planning with one model and building with another — the only tool in scope that splits the loop across two models natively.
Per-subagent model — yes, Cursor’s subagents support per-agent model selection in their frontmatter.
Model picker in VS Code Chat / Edit / Agent / Plan modes. Available models depend on plan tier:
- Free — GPT-5 mini, Claude Haiku 4.5
- Pro — adds Claude Sonnet 4.5 / 4.6, GPT-5, GPT-4.1
- Pro+ / Enterprise — adds Claude Opus 4.7 and Gemini variants
Auto model selection routes a request to a Copilot-picked model with a 10% premium-request discount (e.g., Claude Sonnet 4 billed at 0.9× instead of 1×).
Premium request economics. Each plan ships a monthly premium-request quota (Pro: 300, Pro+: 1500, Business: included, Enterprise: 5× Pro). Non-premium models (GPT-5 mini, GPT-4.1, GPT-4o) are unlimited on paid plans. Each model has a multiplier (e.g., Claude Opus typically bills above 1×).
Org policies (Business / Enterprise). Admins can allow or deny specific models via Copilot policies. End users only see models their org permits — the model picker is gated, not just billed.
Other surfaces share the picker: github.com Chat and code review use the IDE model list. Coding Agent has a per-task model picker — GA for Pro/Pro+ in December 2025, expanded to Business/Enterprise in February 2026. The copilot CLI uses /model at runtime (default Claude Sonnet 4.5) and supports custom model providers via env vars (OpenAI-compatible, Azure, Anthropic, Ollama).
Comparison
Section titled “Comparison”| Aspect | Claude Code | Codex | opencode | Cursor | Copilot |
|---|---|---|---|---|---|
| Vendor lock-in | Anthropic only | OpenAI only | Multi-provider | Multi-provider (built-in roster) | Multi-provider (curated, plan-gated) |
| In-session switch | /model | /model, --model | Per-agent / TUI | Per-chat picker | Per-chat picker (/model in CLI) |
| Curated model list | Implicit (Opus/Sonnet/Haiku) | Implicit (GPT-5.x) | Zen (curated) | Composer + roster | Plan-tier roster |
| Auto-routing mode | — | — | — | Auto Mode | Auto model selection (10% discount) |
| Different model per subagent | Yes (frontmatter model:) | Yes (TOML model field) | Yes | Yes | — |
| Side-by-side comparison | N/A | N/A | Native (different agents) | — | — |
| Org / policy gating | Managed settings | Managed settings | Config | Team / Enterprise | Business / Enterprise allow-deny |
When this matters
Section titled “When this matters”- Pinning subagent costs. Giving a cheap exploration subagent a cheap model while the main loop uses a heavyweight one saves real money. Claude Code via
model:in subagent frontmatter, Codex viamodelin the agent TOML file, opencode via per-agentmodelin the agent’s markdown frontmatter, Cursor via subagent frontmatter. - Comparison reviews. Want to compare how three models answer the same code-review question? opencode’s per-agent provider is the only built-in path.
- Vendor risk. If you’ve standardised on Claude Code or Codex and need to switch providers, you switch tools, not just config. Cursor, opencode, and Copilot soften that by carrying multiple vendors inside one tool.
- Premium-request burn (Copilot). Coding Agent runs consume premium requests; long autonomous sessions can burn quotas fast — a 2026 sticker-shock issue flagged in the press.
Name collisions
Section titled “Name collisions”- “Model” sometimes means the family (Opus, Sonnet, GPT-5) and sometimes the snapshot ID (
claude-opus-4-7-20...). Be explicit when documenting model selection — snapshot IDs change over time. - “Auto” means different things across tools. Cursor’s Auto Mode picks a model on Cursor’s terms from a discounted pool. Copilot’s Auto model selection picks for you with a 10% premium-request discount. They are not the same primitive.