The two-axis model — approval policy is orthogonal to the sandbox
You’re about to start a refactor on budgetcli’s money handling — the original author stored some amounts as floats, and you want them converted to integer cents across half a dozen files. It’s the kind of job where stopping to approve every single edit is friction you’ll resent by the third file. So your instinct is to reach for some “let it run” setting. Before you do, you need to know what you’re actually turning up, because in Codex that instinct splits into two separate questions — and conflating them is how people end up either babysitting work that’s safe or walking away from work that isn’t.
One slider is the wrong shape
Think about what could go wrong when an agent works on your financial data, and you’ll notice the risks aren’t all the same kind:
- It could do something you didn’t expect: delete a file, run a destructive command, rewrite something it shouldn’t have. The defence against that is making it ask you first.
- It could reach somewhere it shouldn’t: read your .env, write outside the project, phone home to some endpoint with your data. The defence against that is fencing where it can go, independent of whether it asks.
A single trust slider can’t separate these. “More autonomous” on a one-dial tool means both more-without-asking and more-reach, bolted together, so to stop being interrupted on a safe refactor, you’d also have to widen the blast radius. That’s a bad trade, and it’s one Codex doesn’t force on you. It gives you two dials.
Axis one — how much it can do without asking
The approval policy controls when Codex pauses to get your y/n before it acts. You set it with --ask-for-approval (short form -a), and it takes three values:
    -a untrusted     pause for anything not on a known-safe list
    -a on-request    let the agent decide when to ask; it pauses for the riskier moves
    -a never         never pause; run end to end without interrupting you

That’s the whole first axis: from “check with me constantly” to “don’t interrupt me at all.” Notice what it does not say anything about: where the agent is allowed to read, write, or connect. Approval is purely about the interruptions.
Axis two — how far it can reach
The sandbox controls the agent’s actual reach: what the operating system will let it touch, regardless of whether it asked. You set it with --sandbox (short form -s), and it also takes three values:
    -s read-only             can read files, cannot write anything or run commands that change state
    -s workspace-write       can read, and can write inside the project, but not outside it, and no network
    -s danger-full-access    no fence at all: whole filesystem, full network

This axis is enforced by the sandbox itself, not by the agent’s good behaviour. In read-only, a write can’t happen; there’s nothing to approve, because the OS won’t permit it. That’s a stronger guarantee than “the agent agreed not to,” and it’s why the two axes aren’t redundant: approval is a checkpoint the agent passes through; the sandbox is a wall it can’t.
They combine
Because the two are orthogonal, you pick one value from each, and the pair defines the session:
                  read-only           workspace-write           danger-full-access
    untrusted     look, ask to act    refactor, ask on risk     (rarely sensible)
    on-request    read & propose      the daily driver          power use, asks on risk
    never         silent read-only    hands-off in the fence    no guardrails at all

Read that grid as two questions answered separately. The cell that fits today’s money refactor is on-request × workspace-write: let the agent edit freely inside the project and only stop you for the genuinely risky moves, while the sandbox guarantees it can’t escape the directory or touch the network no matter how the approval dial is set. You get an uninterrupted refactor and a hard fence, which a single slider could never give you at once.
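To make the pairing concrete, here is a minimal shell sketch that takes one value per axis and prints the matching command line. The function name codex_cmd is hypothetical (it is not part of Codex), and the accepted values are only the three per axis listed in this lesson; the flags themselves are the --ask-for-approval and --sandbox flags described above. It prints the command rather than launching anything, so you can check the pairing before you run it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Codex): check one value per axis,
# then print the codex command line for that pairing.
codex_cmd() {
  # Axis one: approval policy (values as listed in this lesson).
  case "$1" in
    untrusted|on-request|never) ;;
    *) echo "unknown approval policy: $1" >&2; return 1 ;;
  esac
  # Axis two: sandbox (values as listed in this lesson).
  case "$2" in
    read-only|workspace-write|danger-full-access) ;;
    *) echo "unknown sandbox: $2" >&2; return 1 ;;
  esac
  printf 'codex --ask-for-approval %s --sandbox %s\n' "$1" "$2"
}

# The pairing for the money refactor: ask only on risk, fenced to the project.
codex_cmd on-request workspace-write
```

Run from the budgetcli project directory, the printed command is the one you would type to start the refactor session; swapping either argument moves you to a different cell of the grid without touching the other axis.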
A couple of things worth nailing down now, because they trip people up:
- There are no named “modes.” You may have seen references to suggest, auto-edit, or full-auto presets; those aren’t a thing in Codex. There’s only the two-axis pairing. (There’s a deprecated --full-auto flag floating around; it just sets workspace-write and prints a warning telling you to use -s workspace-write directly. Ignore it.)
- The default is deliberately cautious. Launch Codex with neither flag and it picks a safe pairing for you, closer to the top-left of that grid than the bottom-right. You loosen on purpose, per task, not by accident.
You’ll often set both dials persistently rather than typing flags every time — the config keys are approval_policy and sandbox_mode in config.toml, and they take the same values. We’ll bundle them into a named profile in the last lesson. The authoritative list of values and their precise behaviour lives in the Codex config reference; pin to that rather than trusting a fixed memory, since the granular options grow over time.
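As a rough sketch of the persistent form, the two keys named above might look like this in config.toml. This assumes they sit at the top level of the file and uses the daily-driver pairing from the grid; check the Codex config reference for exact placement and the current set of values.

```toml
# config.toml: persistent defaults for the two axes (sketch; verify against the config reference)
approval_policy = "on-request"     # axis one: when Codex pauses to ask
sandbox_mode = "workspace-write"   # axis two: how far the agent can reach
```

Flags given on the command line are still the natural way to loosen or tighten a single session, so the file only needs to hold the pairing you want most of the time.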
Now take each axis one at a time. Start with the wall, because it’s the one protecting your money — the three sandbox levels in depth.