DPoP-bound capability keys. Streaming PII redaction nobody else has solved. Per-agent budgets. Per-employee attribution. Prompt firewall. HIPAA / PCI / GDPR compliance packs. All composed from primitives KnoxCall already audits.
What's in the box
The AI Gateway market is loud and shallow. Every vendor sells one of these. KnoxCall ships all five, audited, on day one — because they sit on top of primitives we already had (routes, secrets, vaults, alerts, packs, fleet graph, mTLS, VPN egress, leases, KMS-unseal).
kc_live_a_… keys with embedded capability scopes. RFC 9449 sender-constraint binds the
key to a private key in the OS keychain. A stolen key without the matching private half is inert.
Refresh rotation with theft detection invalidates the entire credential family on reuse.
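As a rough illustration of refresh rotation with reuse detection, the sketch below keeps one valid refresh token per credential family and revokes the whole family when an already-rotated token is presented again. The `RefreshStore` class, its fields, and the in-memory storage are assumptions made for the example, not KnoxCall's actual schema or API.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class RefreshStore:
    """In-memory stand-in for a token store (hypothetical, for illustration)."""
    family_of: dict = field(default_factory=dict)  # token -> family id
    current: dict = field(default_factory=dict)    # family id -> the one valid token
    revoked: set = field(default_factory=set)      # revoked family ids

    def issue(self) -> str:
        """Start a new credential family and return its first refresh token."""
        family = secrets.token_hex(8)
        token = secrets.token_urlsafe(24)
        self.family_of[token] = family
        self.current[family] = token
        return token

    def rotate(self, presented: str) -> str:
        """Exchange a refresh token for a new one, or revoke the family on reuse."""
        family = self.family_of.get(presented)
        if family is None or family in self.revoked:
            raise PermissionError("unknown or revoked token")
        if self.current[family] != presented:
            # An already-rotated token came back: treat it as theft.
            self.revoked.add(family)
            raise PermissionError("refresh token reuse detected; family revoked")
        new_token = secrets.token_urlsafe(24)
        self.family_of[new_token] = family
        self.current[family] = new_token
        return new_token
```

After a reuse is detected, even the newest token in that family stops working, so both the legitimate client and the attacker have to re-authenticate.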
Hold-back FSM. PII split across SSE chunks (think SSNs straddling
data: events) is detected at sentence boundaries before any token leaks downstream.
Bedrock punts. Cloudflare buffers. We solve it.
Per-agent daily / monthly USD caps. Pre-flight Redis check. Pricebook lookup post-flight for exact
cost. X-KC-User header pins spend to an employee + team. Block / warn / fall-back to a
cheaper agent on overage.
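A minimal sketch of the pre-flight / post-flight budget flow described above. It assumes an in-process object where the real check is described as living in Redis, and the pricebook rates, model names, and function names are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical pricebook: USD per 1M tokens as (input, output). Values are invented.
PRICEBOOK = {"big-model": (10.00, 30.00), "small-model": (0.50, 1.50)}


@dataclass
class AgentBudget:
    daily_cap_usd: float
    spent_usd: float = 0.0
    on_overage: str = "block"          # "block" | "warn" | "fallback"
    fallback_model: str = "small-model"


def preflight(budget: AgentBudget, model: str):
    """Decide before the request goes upstream: (decision, model_to_use)."""
    if budget.spent_usd < budget.daily_cap_usd:
        return "allow", model
    if budget.on_overage == "fallback":
        return "fallback", budget.fallback_model
    return budget.on_overage, model


def postflight(budget: AgentBudget, ledger: dict, user: str,
               model: str, tokens_in: int, tokens_out: int) -> float:
    """Price the finished call and attribute the spend to the agent and the user."""
    in_rate, out_rate = PRICEBOOK[model]
    cost = (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000
    budget.spent_usd += cost
    ledger[user] = ledger.get(user, 0.0) + cost  # user would come from X-KC-User
    return cost
```

The pre-flight check only compares spend so far against the cap, so the exact cost of a call is known and charged only after the response completes.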
Heuristics catch obvious "ignore previous instructions" patterns in microseconds. Per-tenant canary tokens injected into system prompts trip a critical alert on extraction. Vector classifier in v1.2.
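To make the two firewall mechanisms concrete, here is a small sketch: a regex heuristic for obvious injection phrasing, and a per-tenant canary string that is planted in the system prompt and then searched for in model output. The patterns, the `kc-canary-` format, and the function names are illustrative assumptions, not the shipped rule set.

```python
import re
import secrets

# Illustrative patterns only; a real rule set would be much broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the |your )?system prompt", re.I),
]


def looks_like_injection(prompt: str) -> bool:
    """Cheap first-pass check for well-known injection phrasing."""
    return any(p.search(prompt) for p in INJECTION_PATTERNS)


def make_canary(tenant_id: str) -> str:
    """Random marker unique to a tenant (format is hypothetical)."""
    return f"kc-canary-{tenant_id}-{secrets.token_hex(8)}"


def with_canary(system_prompt: str, canary: str) -> str:
    """Plant the canary in the system prompt before it is sent to the model."""
    return f"{system_prompt}\n[internal marker: {canary}]"


def canary_leaked(model_output: str, canary: str) -> bool:
    """If the marker shows up in output, the system prompt was extracted."""
    return canary in model_output
```

A leaked canary is a strong signal because the marker is random and appears nowhere except the system prompt.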
HIPAA Safe Harbor (18 identifiers + MRN formats), PCI (PAN + CVV + ABA + tokenization), GDPR (EU national IDs + RTBF), SOC 2 — one-click recognizer sets, retention, audit alerts, route templates. Sells itself in procurement.
Cursor / Claude Code / Cline / Continue / OpenAI SDK (Py + Node) / Anthropic SDK (Py + Node) / generic OpenAI- + Anthropic-compatible. Drop a JSON; new tools take a PR, not a release.
The killer differentiator
Every other AI gateway either buffers the response (you lose streaming) or lets PII leak through chunk boundaries (you lose compliance). KnoxCall does neither. It pairs a per-stream finite-state machine with a 96-char sliding hold-back buffer and a three-stage detector stack: Aho-Corasick, then regex + checksum, then a Presidio sidecar.
"Patient John Smith, SSN 123-45-6789" → LLM sees "Patient Mary Jones, SSN 847-29-1058".
Token map keyed by conversation_id, 24h TTL. PHI never leaves your tenant boundary.
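Each SSE event is JSON-parsed and its text appended to a 96-char hold-back buffer. The detector stack runs over the buffer, matches are tokenized or redacted in place, and everything except the trailing 96 characters is emitted: prefix(buffer, len-96). Covers SSN, PAN (Luhn), Amex, IBAN, passport, BTC/ETH addresses, MRN.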
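The hold-back idea can be sketched in a few lines. This simplified version takes already-extracted text deltas (the SSE/JSON parsing is omitted), uses only a regex + checksum stage for SSNs and Luhn-validated card numbers, and leaves out the Aho-Corasick and Presidio stages entirely. Class and function names are assumptions for the example.

```python
import re

HOLD_BACK = 96  # characters withheld until more context arrives

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PAN_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13-19 digits, optional separators


def luhn_ok(candidate: str) -> bool:
    """Luhn checksum over the digits of a candidate card number."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> str:
    """Replace SSNs first, then card numbers that pass the Luhn check."""
    text = SSN_RE.sub("[SSN]", text)
    return PAN_RE.sub(lambda m: "[PAN]" if luhn_ok(m.group()) else m.group(), text)


class HoldBackRedactor:
    def __init__(self, hold_back: int = HOLD_BACK):
        self.hold_back = hold_back
        self.buffer = ""

    def feed(self, delta: str) -> str:
        """Append a chunk, redact the buffer, emit all but the trailing window."""
        self.buffer = redact(self.buffer + delta)
        cut = max(0, len(self.buffer) - self.hold_back)
        out, self.buffer = self.buffer[:cut], self.buffer[cut:]
        return out

    def flush(self) -> str:
        """End of stream: redact and release whatever is still held back."""
        out, self.buffer = redact(self.buffer), ""
        return out
```

Because any identifier shorter than the window is still entirely inside the held-back tail when its last character arrives, a value split across chunks is redacted before any part of it is emitted.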
JSONPath rewriter on the way back. Customer sees the original PHI verbatim. The LLM only ever saw tokens. Audit log records spans + recognizer + confidence — never the raw value.
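The round trip can be illustrated with a small token vault keyed by conversation ID. Note the simplifications: this sketch swaps in opaque placeholders such as `<SSN_1>` rather than realistic surrogate values, handles only SSNs, uses plain string replacement in place of a JSONPath rewriter, and keeps the map in memory. The class name, token format, and confidence value are assumptions.

```python
import re
import time

TTL_SECONDS = 24 * 60 * 60  # 24h, as stated in the text
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class TokenVault:
    def __init__(self, now=time.time):
        self._now = now      # injectable clock, handy for testing expiry
        self._maps = {}      # conversation_id -> (expires_at, {token: original})
        self.audit = []      # spans + recognizer + confidence, never raw values

    def _live_map(self, conversation_id: str) -> dict:
        """Return the conversation's map, starting a fresh one if expired."""
        entry = self._maps.get(conversation_id)
        if entry is None or entry[0] <= self._now():
            entry = (self._now() + TTL_SECONDS, {})
            self._maps[conversation_id] = entry
        return entry[1]

    def tokenize(self, conversation_id: str, text: str) -> str:
        """Outbound: replace each SSN with a placeholder and remember the pair."""
        mapping = self._live_map(conversation_id)

        def swap(match):
            token = f"<SSN_{len(mapping) + 1}>"
            mapping[token] = match.group()
            self.audit.append(
                {"recognizer": "ssn", "span": match.span(), "confidence": 1.0}
            )
            return token

        return SSN_RE.sub(swap, text)

    def detokenize(self, conversation_id: str, text: str) -> str:
        """Inbound: restore originals for placeholders the model echoed back."""
        for token, original in self._live_map(conversation_id).items():
            text = text.replace(token, original)
        return text
```

Once the TTL lapses the mapping is gone, so a placeholder in a late response is left as-is rather than resolved.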
Connect tab
Pick your tool. We generate the exact files, env vars, and copy-paste shell snippets. DPoP-capable tools also receive a fresh ECDSA P-256 keypair installed straight into the OS keychain.
Honest comparison
The alternatives: plain bearer tokens, post-hoc PII scanning, no streaming proof, no compliance packs, and pricing that jumps from $10 to "Contact Sales". The AI gateway market is asleep. We're not.
| Feature | API Stronghold | Cloudflare AI | Bedrock | KnoxCall AI Gateway |
|---|---|---|---|---|
| DPoP-bound capability keys | plain bearer | plain bearer | IAM | ✓ |
| Streaming SSE pass-through | unproven | buffered | ✓ | proven, hold-back FSM |
| Streaming PII redaction | refused | refused | post-hoc only | unique |
| HIPAA / PCI / GDPR packs | × | × | via Macie | ✓ |
| Per-agent budgets + per-employee | × | × | × | ✓ |
| Model allowlist + rewrite | × | × | × | allow + deny + rewrite |
| Prompt firewall + canary leak | × | × | × | ✓ |
| OIDC workload federation | × | × | via STS | ✓ |
| Audit logs (free tier) | stripped | limited | via CloudTrail | full |
| OpenTelemetry GenAI | × | × | × | ✓ |
| Pricing $49 tier | gap | free / $20 | consumption | ✓ |
Capability keys, streaming PII redaction, budgets, attribution, and compliance packs — all in one audited gateway.