Permissions, auto-run & the sandbox

The work so far has been supervised by default: you read the agent’s proposed edits, you clicked through its terminal commands one at a time, and nothing ran that you didn’t watch run. That’s fine while you’re feeling a project out. It stops being fine the moment the work gets repetitive — a budgetcli migration that runs the same go test ./... thirty times, a refactor that touches forty files and wants to re-run the build after each batch. Clicking Run on every command is friction you’ll start routing around, and the way people route around it is by flipping the agent to run everything unattended. Do that without understanding what’s actually containing the agent and you’ve quietly handed a model write access to your whole machine.

This chapter is about setting the blast radius on purpose instead of by accident. There are two things to get straight, and they’re easy to conflate:

How much the agent runs without asking you — the auto-run mode. Does the agent route each command through review tiers, run them unattended inside a sandbox, or just run everything?
What a running command can actually reach — the sandbox, plus the allowlist and denylist layered on top. When a command does run, is it boxed into your project, or is your filesystem and network in scope?

The single most important idea in the chapter is which of these is real. In Cursor, the sandbox is the load-bearing control and the lists are superseded by it. The OS-level sandbox is enforced by the operating system; the allowlist is a pattern-matcher the sandbox is explicitly designed to replace, and the denylist was deprecated outright. Get that ordering backwards — trust the denylist to keep the agent away from something dangerous — and the protection you think you have isn’t there.

Before Cursor’s names for any of this, the judgment itself: how much leash a task earns comes down to two properties of the task, not its difficulty. Set them for the budgetcli migration grind, then for anything that touches a deploy, and watch the recommended rung move — the rest of this chapter is Cursor’s hardware for each rung, with the sandbox as the fence that makes the loose end safe:

Two questions decide how much leash a task earns — neither of them is “how hard is it.” Set both for the work in front of you and read the rung it lands on.

If the agent’s worst single action went wrong, undoing it would take…

a git checkout: the change lives in tracked files; the floor holds
real work: a migrated database, regenerated files, a rebased branch
a miracle: a send, a deploy, a delete with no backup

…and its consequences would reach…

your working copy: one repo, one machine, nobody downstream
what the team shares: a library others consume, common infra, shared data
beyond the machine: production, customers, money, the public

Run free, no fence (disposable environments only): Nothing pauses it and nothing contains it. No combination of answers lands here; it belongs only where the whole environment is disposable, such as a throwaway container or an already-isolated CI runner.
Run free inside a fence (the rung this task lands on): No prompts; the boundary does the protecting. The agent grinds end to end inside a sandbox, container, or scratch worktree, and you review the whole batch once at the end.
Auto-apply edits, gate the rest
Ask before acting
Read & propose only

A mechanical rename across your own repo is the canonical case: the worst outcome is a git diff you throw away. Prompting on every one of twenty-four identical edits doesn’t add safety — it teaches you to stop reading prompts, which is where real risk starts. Let it run inside the fence and review the batch once.

The rung names are generic on purpose — every tool spells its own versions of them, and most let you set different rungs for different categories of action. The judgment underneath is the same two questions, asked per task, never answered once for all time.
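The two-question judgment above can be sketched as a small lookup. This is a minimal illustration, not the chapter's exact dial: the names `Undo`, `Reach`, and `pick_rung` are hypothetical, and the mapping for the in-between combinations (the worse of the two answers picks the rung) is an assumption. Only two points come from the text: an easily undone change confined to your working copy lands on "Run free inside a fence", and no combination lands on "Run free, no fence".

```python
from enum import IntEnum


class Undo(IntEnum):
    """How hard it is to undo the agent's worst single action."""
    GIT_CHECKOUT = 0  # the change lives in tracked files
    REAL_WORK = 1     # a migrated database, regenerated files, a rebased branch
    MIRACLE = 2       # a send, a deploy, a delete with no backup


class Reach(IntEnum):
    """How far the consequences of that action would travel."""
    WORKING_COPY = 0    # one repo, one machine, nobody downstream
    TEAM_SHARED = 1     # a library others consume, common infra, shared data
    BEYOND_MACHINE = 2  # production, customers, money, the public


def pick_rung(undo: Undo, reach: Reach) -> str:
    """Map the two answers to a rung (assumed mapping, for illustration only)."""
    # Assumption: the tightest rung is reserved for the worst case on both axes.
    if undo is Undo.MIRACLE and reach is Reach.BEYOND_MACHINE:
        return "Read & propose only"
    # Assumption: otherwise the worse of the two answers decides.
    rungs = (
        "Run free inside a fence",
        "Auto-apply edits, gate the rest",
        "Ask before acting",
    )
    return rungs[max(int(undo), int(reach))]


if __name__ == "__main__":
    # A mechanical rename across your own repo.
    print(pick_rung(Undo.GIT_CHECKOUT, Reach.WORKING_COPY))
    # Anything that touches a deploy.
    print(pick_rung(Undo.MIRACLE, Reach.BEYOND_MACHINE))
```

Note that "Run free, no fence" is deliberately not a possible return value: under this sketch, as in the text, it is something you choose for a disposable environment, never something a task earns.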

The auto-run modes

Cursor exposes auto-run as a setting with four modes. You set it in Cursor Settings. From the CLI, /auto-run [on|off|status] toggles auto-run on or off (or reports its state) — it’s a toggle, not a selector among the named GUI modes.

The four modes, per the current terminal docs:

Auto-review

The default as of Cursor 3.6, and the most important mode to understand because it isn’t a single gate — it’s a three-tier router. When the agent proposes a Terminal, MCP, or Fetch call, Auto-review sorts it through three stages in order:

Allowlisted commands run immediately — trusted calls execute with no model and no prompt in the loop. This is your known-safe set (go test, git status on budgetcli) going straight through.
Sandboxable commands are auto-sandboxed — anything Cursor can safely isolate is run inside the sandbox automatically, without asking you. The sandbox does the containing; you don’t get a prompt.
Everything else is routed to a classifier subagent — a small reasoning agent embedded in the main agent loop. It reads the proposed action, applies any custom instructions you’ve given it, and decides whether to allow the call, try a different approach, or ask for your approval.

The classifier is the new piece in 3.6, and the one to be precise about: it is not a security boundary. It’s a non-deterministic LLM making a best-effort convenience call to cut approval prompts (Cursor reduced “y-spamming” by roughly 84% with it), but because it can be wrong, you should not lean on it to keep the agent away from something irreversible. The three tiers are real and useful for staying in motion; the wall is still the sandbox (next section), not the classifier’s judgment.
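The three-tier ordering described above can be sketched as a plain function. This is a conceptual model, not Cursor's implementation: `Call`, `route`, and the prefix-style allowlist match are all hypothetical, and the sandboxability check and the classifier are passed in as stand-in callables because the text does not say how either works internally.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Call:
    kind: str     # "terminal", "mcp", or "fetch"
    command: str  # the proposed action, as a string


# The three outcomes the text attributes to the classifier subagent.
VERDICTS = {"allow", "try_different_approach", "ask_user"}


def matches_allowlist(command: str, allowlist: Iterable[str]) -> bool:
    """Assumed matching rule: exact match, or the entry followed by arguments."""
    return any(command == entry or command.startswith(entry + " ")
               for entry in allowlist)


def route(call: Call,
          allowlist: Iterable[str],
          can_sandbox: Callable[[Call], bool],
          classify: Callable[[Call], str]) -> str:
    # Tier 1: allowlisted calls run with no model and no prompt in the loop.
    if matches_allowlist(call.command, allowlist):
        return "run_immediately"
    # Tier 2: anything that can be isolated runs inside the sandbox, unprompted.
    if can_sandbox(call):
        return "run_in_sandbox"
    # Tier 3: everything else goes to the classifier, which can be wrong.
    verdict = classify(call)
    # Assumption of this sketch: an unexpected verdict falls back to asking.
    return verdict if verdict in VERDICTS else "ask_user"


if __name__ == "__main__":
    allow = ["go test", "git status"]
    print(route(Call("terminal", "go test ./..."), allow,
                lambda c: True, lambda c: "ask_user"))
```

The ordering is the point: the classifier only ever sees what neither the allowlist nor the sandbox has already handled, which is why its judgment is a convenience layer and not the boundary.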

Allowlist

Commands matching your allowlist run without prompting; anything not on the list is gated. This is the classic “let it run go test ./... and git status freely, but pause on everything else” posture — useful for cutting friction on a known-safe set of commands while keeping a checkpoint in front of the rest.

Allowlist (with Sandbox)

Commands run without prompting you, but inside an OS-level sandbox. The agent gets to grind through its loop — run tests, re-run the build, retry after a failure — without stopping at every step, and the sandbox is what makes that safe: a command that tries to write outside the allowed area, or reach the network, doesn’t get to. This is the mode the rest of the chapter is really about, because it’s the one where the sandbox is doing the protecting and you need to understand exactly what the sandbox does and doesn’t cover.

One thing to be precise about: in sandbox mode the allowlist is not applied. The sandbox is meant to replace the allowlist — filesystem and network restrictions instead of per-command approval. So when a sandboxed command genuinely needs something the sandbox blocks — outbound network, or filesystem access outside the sandboxed area — it doesn’t silently fail into the void, and it doesn’t fall through to allowlist evaluation. Cursor prompts you with three options: skip the command, run it without restrictions, or run it and add it to the allowlist. The fall-through is to a prompt, not to a list.
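The fall-through described above, sketched as code. The names `PromptChoice`, `Outcome`, and `sandboxed_outcome` are hypothetical, and the two boolean inputs are a simplification of whatever the sandbox actually detects. The sketch only encodes what the text states: a blocked command leads to a three-option prompt, and the allowlist is never consulted on that path.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class PromptChoice(Enum):
    SKIP = "skip the command"
    RUN_UNRESTRICTED = "run it without restrictions"
    RUN_AND_ALLOWLIST = "run it and add it to the allowlist"


@dataclass(frozen=True)
class Outcome:
    action: str                            # "run_in_sandbox" or "prompt_user"
    choices: Tuple[PromptChoice, ...] = ()  # populated only when prompting


def sandboxed_outcome(needs_network: bool, writes_outside: bool) -> Outcome:
    """What happens to a command in sandbox mode (conceptual model only)."""
    if needs_network or writes_outside:
        # The sandbox blocks it; the user is asked. No allowlist lookup here.
        return Outcome("prompt_user", tuple(PromptChoice))
    return Outcome("run_in_sandbox")


if __name__ == "__main__":
    for choice in sandboxed_outcome(needs_network=True, writes_outside=False).choices:
        print(choice.value)
```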

Run Everything

No gating. Every command runs, no sandbox, no prompt — this is the historical “YOLO mode” by another name. There are legitimate places for it: a throwaway container, a CI runner that’s already isolated, a VM you’ve deliberately made disposable. On your actual development machine, with your real repos and credentials in reach, it’s almost always the wrong answer. The reason you’d reach for an AI agent on serious work is that the stakes are real, and this is precisely the setting that throws away every protection those stakes call for. When you feel the pull toward it, the right move is usually to switch to Allowlist (with Sandbox) instead — you get the unattended loop without removing the wall.

The Agent Sandbox is the wall

Cursor shipped a real Agent Sandbox: sandboxed terminals launched in beta in 1.7 and went GA in 2.0, where agent commands run inside the secure sandbox by default. This is the control that actually holds. It’s enforced by the operating system, not by the model’s judgment or by a pattern match against a command string — so a write outside the sandboxed area, or an outbound network call, fails, regardless of how the command was spelled or whether the agent “meant” to escape.

The sandbox is no longer macOS-only. The current docs describe sandboxed execution on macOS, Linux, and Windows:

macOS — supported directly.
Linux — requires kernel 6.2+ with Landlock v3 and unprivileged user namespaces. If those prerequisites aren’t met, the sandbox can’t enforce its restrictions and Cursor falls back to asking for approval instead of running unattended.
Windows — runs the sandbox inside WSL2, applying the same restrictions as Linux.

So the “macOS-only, parity unconfirmed” caveat from earlier builds is now stale. The thing to carry instead: cross-platform isolation exists, but on Linux it’s contingent on the kernel prerequisites above, and when they’re absent you get prompted rather than silently unprotected.
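If you want a rough idea of whether a Linux machine is likely to meet those prerequisites, a preflight along these lines can help. It is a sketch under assumptions, not how Cursor itself detects support: the function names are hypothetical, it only compares the kernel release against 6.2, looks for `landlock` in the kernel's active LSM list, and checks that the user-namespace limit is nonzero. A nonzero limit is necessary but not sufficient, since some distributions restrict unprivileged user namespaces by other means.

```python
import platform
import re
from pathlib import Path


def kernel_at_least(release: str, minimum=(6, 2)) -> bool:
    """Compare the major.minor prefix of a kernel release string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def landlock_listed(lsm_file: str = "/sys/kernel/security/lsm") -> bool:
    """True if 'landlock' appears in the comma-separated list of active LSMs."""
    try:
        return "landlock" in Path(lsm_file).read_text().strip().split(",")
    except OSError:
        return False  # file missing or unreadable: treat as not available


def userns_limit_nonzero(path: str = "/proc/sys/user/max_user_namespaces") -> bool:
    """True if the per-user namespace limit is above zero (a rough signal only)."""
    try:
        return int(Path(path).read_text().strip()) > 0
    except (OSError, ValueError):
        return False


if __name__ == "__main__":
    if platform.system() == "Linux":
        print("kernel >= 6.2:      ", kernel_at_least(platform.release()))
        print("landlock active:    ", landlock_listed())
        print("userns limit > 0:   ", userns_limit_nonzero())
    else:
        print("Not Linux; these checks do not apply.")
```

A `False` on any line suggests you may land in the fallback the text describes, where Cursor asks for approval instead of running unattended.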

Hold the distinction the same way you would a wall versus a checkpoint. The sandbox is a property of the environment: set it and a forbidden write is impossible, not disallowed-but-overridable. The auto-run mode is a property of the workflow: it decides which possible actions pause for you. You combine them — the sandbox says where a command can reach at all, the mode says which commands you watch run.

Allowlist and denylist: superseded, not guarantees

Cursor historically gave you two fine-grained lists:

a command allowlist — patterns that are permitted to run, and
a command denylist — patterns that should be blocked.

These looked like the way to express “let it run go test ./... and git status freely on budgetcli, but never git push or rm -rf.” That is useful as intent, but you have to know exactly how much weight the lists can bear, and the honest answer is less than they appear to. Two facts make the point, and Cursor has since addressed both by removing the lists as the load-bearing layer rather than hardening them.

The allowlist is not applied under the sandbox — by design. This isn’t a build-specific bug or a silent leak: it’s intended behavior, confirmed by Cursor staff. In sandbox mode the allowlist really isn’t applied; the sandbox is meant to replace it, swapping per-command approval for filesystem and network restrictions. So you can’t reason “this command ran, therefore it was on my allowlist.” Under the sandbox, the sandbox is what’s gating reach; the allowlist is not in the loop at all. (If you specifically need the old per-command allowlist behavior, enabling the Legacy Terminal Tool is the workaround.)

The denylist was bypassable, so Cursor deprecated it. A denylist looks like it blocks dangerous commands, but security research (Backslash Security, reported by The Register) demonstrated reproducible bypasses that don’t go near the obvious cases: obfuscation via Base64, subshell execution with parentheses, writing the dangerous action into a shell script and running that, and quote-escaping such as "e"cho to dodge a literal pattern match. Backslash’s core finding was blunt: for every command in a Cursor denylist, there are infinitely many commands not in the denylist that have the same behavior. A pattern-matcher over command strings can’t win that game. Cursor’s response was to officially deprecate the denylist feature in release 1.3, so the denylist is not a current advisory control you should be leaning on; it’s a deprecated one.
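To see why a pattern match over command strings loses, here is a toy model of the four bypass families named above. The matcher is a deliberately naive stand-in (it compares the first whitespace-separated token against a denied name) and is an assumption of this sketch, not Cursor's actual denylist logic. The script name `cleanup.sh` is hypothetical. Nothing here is executed; the code only inspects strings.

```python
import base64

DENIED = {"rm"}  # hypothetical denylist entry


def naive_is_denied(command: str) -> bool:
    """Literal match on the first token, standing in for a string-based denylist."""
    tokens = command.split()
    return bool(tokens) and tokens[0] in DENIED


# Encode the dangerous command so the literal text never appears up front.
payload = base64.b64encode(b"rm -rf build").decode("ascii")

# Each string would have the same effect as `rm -rf build` if a shell ran it.
bypasses = {
    "base64": f"echo {payload} | base64 -d | sh",
    "subshell": "(rm -rf build)",
    "script": "sh ./cleanup.sh",  # hypothetical script containing the rm
    "quote-escaping": '"r"m -rf build',
}

if __name__ == "__main__":
    print("direct:", naive_is_denied("rm -rf build"))
    for name, command in bypasses.items():
        print(f"{name}:", naive_is_denied(command))
```

A smarter matcher can close any one of these, but each fix invites another spelling. The sandbox sidesteps the contest because it restricts what the process can touch, not how the command was written.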

The conclusion is not “lists are useless.” It’s that they’re the wrong layer to put your trust in, and Cursor has effectively said so by replacing them. Use the allowlist (outside the sandbox) to cut friction and express intent — to let known-safe commands run without a prompt. Do not reach for a denylist entry as the thing standing between the agent and an irreversible action: the feature is deprecated, and even when present it never held. The thing that actually contains a running command is the sandbox. The lists are superseded; the sandbox is load-bearing. When in doubt, believe the sandbox.

When you need real isolation: Cloud Agents

Sometimes the right answer isn’t tuning the local controls at all — it’s not running on your machine in the first place. Cursor’s Cloud Agents (formerly Background Agents) operate in isolated VMs in the cloud with full development environments instead of on your local machine: the agent clones budgetcli onto a fresh branch, does its work, and pushes results back. Because nothing executes on your laptop, the local-machine threat model — your credentials, your other repos, your filesystem — simply doesn’t apply. The environment is configured from .cursor/environment.json (a Dockerfile is supported), so the box is reproducible the same way every run.

This is the cleanest way to give an agent a long leash safely: the isolation is the boundary of a VM, not a pattern in a deprecated list. We pick Cloud Agents back up in the CLI chapter, where running headless and unattended makes that built-in isolation matter most — when nobody’s at the keyboard to answer a prompt, you want the agent in a box by construction.

(Reference: the canonical docs live at cursor.com/docs/background-agent, which serves the current Cloud Agents content.)

Set it by reflex

The shape to carry out of this chapter:

Pick the mode for the work. Unfamiliar code or real stakes → Auto-review (the default) or plain Allowlist for a known-safe command set. Repetitive, supervised grind where you want the loop to run unattended → Allowlist (with Sandbox). Run Everything stays in the disposable-environment box, not on your real machine.
Lean on the sandbox, not the lists. The Agent Sandbox is the wall that actually holds — on macOS directly, on Linux given the kernel prerequisites, on Windows via WSL2. The allowlist is superseded by the sandbox when sandboxing is on, and the denylist is deprecated. Treat lists as friction-reduction and intent, never as your last line of defense.
When you need a real guarantee, change where the agent runs — a Cloud Agent in an isolated VM — rather than trying to harden the local lists into something they aren’t.

Get the ordering right — sandbox load-bearing, lists superseded — and the rest is muscle memory: loose on the mode where the work is tedious and watched, tight on reach because the wall, not the model, is what’s keeping the agent in its lane.