Capability-Based Security¶
Don't enumerate what to block. Enumerate what to allow.
The principle that ties together every successful sandbox — browsers, mobile OSes, OS process isolation. Default-deny everything, then explicitly grant specific, minimal capabilities. If you didn't grant it, it doesn't exist for the code — nothing to exploit because nothing's there.
Block-list vs allow-list¶
- Block list. Hand the code a master key + a list of 10,000 rooms it can't enter. You have to enumerate every dangerous syscall, every risky API, every possible attack. Miss one and you're compromised.
- Allow list. Give the code keys to only the 3 rooms it actually needs. Everything else is unreachable by construction, not by rule.
Every modern sandbox picks the allow list. Every legacy security-by-patch approach picked the block list and lost.
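Here is the allow-list shape as a minimal sketch. Every name in it (Capabilities, grantFor, runUntrusted) is hypothetical, and the isolation mechanism itself (isolate, container, VM) is a separate concern:

```typescript
// All names here are illustrative; the point is the shape, not a library.

interface Order { id: string; total: number }

// Everything the untrusted code can reach is a field on this object.
// If a capability is not listed here, it does not exist inside the sandbox.
interface Capabilities {
  queryOrders(customerId: string): Promise<Order[]>; // one narrow, read-only DB call
  log(message: string): void;                        // write-only logging
}

const fakeDb: Order[] = [{ id: "cust-42-0001", total: 99 }];

// The host constructs the grant per execution, scoped to a single tenant.
function grantFor(customerId: string): Capabilities {
  return {
    queryOrders: async (id) =>
      id === customerId ? fakeDb.filter((o) => o.id.startsWith(id)) : [],
    log: (msg) => console.log(`[sandbox ${customerId}] ${msg}`),
  };
}

// Untrusted code is typed against Capabilities only: no fetch, no fs, no env.
async function runUntrusted(caps: Capabilities): Promise<void> {
  const orders = await caps.queryOrders("cust-42");
  caps.log(`found ${orders.length} orders`);
}

runUntrusted(grantFor("cust-42"));
```

Reviewing the security of this execution means reading one small interface, not auditing an open-ended block list.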
How it shows up¶
- Browsers — a page can't access your camera until you grant the capability.
- Mobile OSes — apps must request permissions for camera, contacts, microphone.
- OS processes — processes can't read each other's memory by default.
- V8 isolates (Cloudflare Workers) — `globalOutbound: null` blocks all outbound network at the runtime level. No fetch, no WebSocket, nothing. The code then only gets whatever bindings you explicitly hand it (a restricted DB query method, a logger, etc.).
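A hedged sketch of what that looks like with Cloudflare's Worker Loader API. The call shape is approximated rather than copied from the docs, and the `LOADER` and `SCOPED_DB` binding names plus the guest module are placeholders:

```typescript
// Approximate shape only: check the Worker Loaders docs for the exact API.
// LOADER and SCOPED_DB are placeholder binding names; guest.js stands in for
// the AI-generated code, loaded as an ordinary module.
export default {
  async fetch(request: Request, env: any): Promise<Response> {
    const sandbox = env.LOADER.get("run-cust-42", async () => ({
      compatibilityDate: "2025-01-01",
      mainModule: "guest.js",
      modules: {
        "guest.js": `export default {
          async fetch(request, env) {
            const rows = await env.SCOPED_DB.query("cust-42"); // granted binding
            // fetch("https://attacker.example") fails: outbound network is off
            return new Response(JSON.stringify(rows));
          }
        }`,
      },
      env: { SCOPED_DB: env.SCOPED_DB }, // the only capability the guest receives
      globalOutbound: null,              // no fetch, no WebSocket, nothing leaves
    }));
    // Call into the guest and return its result.
    return sandbox.getEntrypoint().fetch(request);
  },
};
```

The grant travels through `env`, not through globals: cutting `globalOutbound` doesn't hamstring the guest, it just forces every reachable capability to be one you listed.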
Why this matters for AI-generated code¶
AI code inherits whatever privileges its host process has. Default-allow means the LLM + your secrets = full compromise surface. Default-deny + explicit minimal capabilities means the blast radius of any prompt injection, hallucination, or adversarial input is bounded by what you chose to hand in.
Five-point threat model (Agrawal)¶
Ask these for every sandboxed execution, and give yes/no answers (never "probably fine"); a sketch of the answers written down as a policy follows the list:
- Secrets — can it read env vars / API keys / DB creds?
- Networking — can it make outbound requests? Hit internal services? Exfiltrate over HTTP?
- File system — can it read files outside its workspace? Config? Other users' data? Application code?
- Multi-tenancy — can one user's code see another user's data or affect their execution?
- Resource limits — can it infinite-loop and burn compute? Allocate unbounded memory?
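One way to keep the answers honest is to write them down as an explicit policy object. The `SandboxPolicy` shape below is hypothetical, not from any particular library, but the literal `false` types turn each "no" into a compile-time commitment rather than a comment:

```typescript
// Hypothetical policy record for the five questions. Every field is a hard
// yes/no or a hard cap, decided before the untrusted code runs.
interface SandboxPolicy {
  secrets: { envVarsVisible: false };                              // no API keys, no DB creds
  networking: { outbound: false; internalServices: false };        // no exfiltration path
  fileSystem: { readOutsideWorkspace: false; writeOutsideWorkspace: false };
  multiTenancy: { sharedStateBetweenUsers: false };                // per-user isolation
  resources: { cpuMs: number; memoryMb: number };                  // caps, not "probably fine"
}

const policy: SandboxPolicy = {
  secrets: { envVarsVisible: false },
  networking: { outbound: false, internalServices: false },
  fileSystem: { readOutsideWorkspace: false, writeOutsideWorkspace: false },
  multiTenancy: { sharedStateBetweenUsers: false },
  resources: { cpuMs: 2_000, memoryMb: 128 },
};
```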
Cross-references¶
- ai-generated-code-is-untrusted — the threat model this principle addresses
- isolates-vs-containers — the two concrete implementations
- isolated-agent-vms — sibling pattern at the VM layer