Restate the trust model from first principles: the OS is the only
load-bearing boundary against an adversarial LLM. Distinguish
terminal-backend isolation (sandboxes the shell tool) from
whole-process wrapping (sandboxes the agent itself, reference
deployment NVIDIA OpenShell). Name in-process components (approval
gate, output redaction, Skills Guard) as heuristics, and the class
of reports that defeat them as out of scope under this policy —
while explicitly welcoming them as regular issues or PRs.
Introduce 'agent-loaded content' as the narrow, honest commitment:
attacker-influenced input must not chain into a write the agent
later loads on its own initiative.
Strip implementation-detail enumerations (backend names, adapter
names, config keys, env vars, internal symbols) so the doc stays
evergreen as code evolves.
Clarifies that tool-level access restrictions are not security boundaries
when the agent has unrestricted terminal access. Deny lists only matter
when paired with equivalent terminal-side restrictions (like WRITE_DENIED_PATHS
pairs with the dangerous command approval system).