Extending Codex

So far, everything Codex has done to budgetcli has lived inside the repo: reading files, running the test suite, editing Python, importing a CSV with a skill. But the budget has a hole in it. A handful of your transactions are in euros and pounds: a hotel booking, a couple of foreign subscriptions. Right now they’re stored at whatever rate you guessed when you typed them in, and the monthly report quietly lies about your real spend.

Codex can’t fix that on its own, because the truth lives somewhere it can’t reach — an external exchange-rate API, out on the network, behind a wall the agent has no door through. Reading a file is one thing; calling a live HTTP service the model has never heard of is another. That’s a reach problem.

And there’s a second, sharper problem waiting. Once Codex can touch external systems and edit your money-handling code freely, you want two hard guarantees: it must never write to your real ledger database, and it must run the tests after touching the money math. Not “ask it nicely in AGENTS.md”, but a rule that is enforced every single time, no matter what the model decided in the moment. That’s a gate problem.

Two extension points, two distinct jobs, and it’s worth fixing the distinction before we start because they’re easy to blur:

Reach — an MCP server is a bridge to a system Codex otherwise can’t touch: an API, a database, a browser. It gives the model new tools.
Gate — a hook is a deterministic shell command wired to a moment in the agent’s lifecycle. It runs regardless of what the model chooses, so it’s how you enforce a rule that can’t depend on the agent remembering it.
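To make "deterministic" concrete, here is a minimal Python sketch of the kind of check a gate could run. It is not Codex's hook interface, which the hook lesson covers; the ledger path `data/ledger.db` and the function name `gate` are hypothetical, chosen only for illustration. The point is that the answer depends only on the target path, never on what the model intended.

```python
from pathlib import Path

# Hypothetical location of the real ledger; budgetcli's actual path may differ.
REAL_LEDGER = Path("data/ledger.db")


def gate(target: str, repo_root: str = ".") -> int:
    """Return a shell-style exit code: 0 allows the write, 1 blocks it."""
    root = Path(repo_root).resolve()
    # Resolve both paths so that "data/../data/ledger.db" cannot slip past.
    resolved = (root / target).resolve()
    protected = (root / REAL_LEDGER).resolve()
    return 1 if resolved == protected else 0
```

Called as `gate("data/ledger.db")`, the sketch returns 1 (block); called with any other path, it returns 0 (allow). The same input always gives the same verdict, which is the property that separates a gate from an instruction.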

What you’ll do this chapter

Connect the exchange-rate API as an MCP server: an [mcp_servers.<name>] block, codex mcp to manage it, and the network-off gotcha that bites everyone the first time.
Let Codex itself act as an MCP server: the other direction, and when handing Codex to another agent is the right move.
Gate a risky move with a hook: block writes to the real ledger database, and run the tests after the money math changes.
When a hook beats the agent’s judgment or a permission: the three kinds of guarantee, and how to pick.

The lessons build in that order, because that’s the order the need shows up in real work: first you widen what the agent can reach, then you put a wall around the part that must never go wrong. Each lesson ends where the next begins.

Start by connecting the rate API.