Are your agents secure?
Most aren’t. Forge runs an outside security review and gap remediation across your AI agent stack — Bedrock, AI Foundry, Vertex, NemoClaw, Cursor, Claude Code, and MCP servers.
You leave with a threat model, a gap report, and a remediation plan — whether you ever buy anything else from us or not.
Framework-mapped
Findings mapped to
OWASP · MITRE · NIST · FINOS AIGF
Fixed-scope
Engagement model
Threat model · gap report · plan
Outside review
External engineers
Working in your environment
Optional
Managed remediation
Reign · 24/7 ops on Forge
The new attack surface
Six categories of risk that didn’t exist three years ago.
Generative AI broke the network and application security models you spent the last decade hardening. Agentic AI broke the runtime policy model on top of that. The review starts here.
Tool-call abuse via MCP
Agents invoke tools the platform allowed but the policy did not. Tool poisoning, server impersonation, and parameter manipulation across MCP and native tool runtimes. Many enterprise MCP rollouts ship without a tool registry, policy layer, or audit trail.
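To make the gap concrete: a tool registry with a policy check and an audit trail can be small. This is an illustrative sketch only — the names (`ToolRegistry`, `authorize_call`) are hypothetical and not part of any real MCP SDK — but it shows the three controls many rollouts skip.

```python
# Hypothetical sketch: tool registry + policy check + audit trail.
# Not a real MCP SDK — names and shapes are illustrative only.
import json
import time


class ToolRegistry:
    def __init__(self):
        self._tools = {}      # tool name -> set of allowed parameter keys
        self.audit_log = []   # append-only record of every call decision

    def register(self, name, allowed_params):
        self._tools[name] = set(allowed_params)

    def authorize_call(self, name, params):
        """Allow only registered tools called with expected parameters."""
        allowed = name in self._tools and set(params) <= self._tools[name]
        # Every decision is audited, including denials.
        self.audit_log.append({
            "ts": time.time(),
            "tool": name,
            "params": json.dumps(params, sort_keys=True),
            "allowed": allowed,
        })
        return allowed


registry = ToolRegistry()
registry.register("search_tickets", ["query", "limit"])

assert registry.authorize_call("search_tickets", {"query": "refund"}) is True
# Unregistered tool and unexpected parameter are both denied — and still audited.
assert registry.authorize_call("delete_tickets", {"id": 7}) is False
assert registry.authorize_call("search_tickets", {"query": "x", "shell": "rm"}) is False
assert len(registry.audit_log) == 3
```

Without this layer, the platform's answer to "which agent called which tool with which parameters" is silence.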
Prompt injection & jailbreaks
Direct, indirect, and cross-tenant prompt injection that subverts system instructions, exfiltrates context, or coerces the model into unauthorized actions. Indirect injection through documents, tickets, and emails is now common; vendor-provided guardrails alone don't stop layered attacks.
Credential & secret leakage
Agent-mediated credential harvesting, secret exposure through tool outputs, and over-broad service-account scopes for non-human actors. SecOps practices designed for traditional services often don't cover AI-specific credential paths.
Output abuse & data exfiltration
Sensitive data leaving via model output, side-channel exfiltration through tool callbacks, and inadvertent disclosure via observability pipelines. DLP rules are often not extended to AI-bound traffic.
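Extending DLP to AI-bound traffic can start simply: scan model output for secret-shaped strings before it crosses the boundary. The patterns below are simplified examples, not a production rule set.

```python
# Simplified sketch: DLP-style redaction on outbound model output.
# These patterns are illustrative examples, not a production rule set.
import re

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS-style access key id
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),  # email
]


def redact(output: str) -> str:
    """Replace secret-shaped substrings before output leaves the boundary."""
    for pattern in SECRET_PATTERNS:
        output = pattern.sub("[REDACTED]", output)
    return output


text = "Key AKIAABCDEFGHIJKLMNOP leaked; contact ops@example.com"
assert redact(text) == "Key [REDACTED] leaked; contact [REDACTED]"
```

The same check belongs on tool callbacks and observability pipelines, not just the chat response.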
Supply-chain risk on weights & tools
Untrusted model weights, MCP server supply-chain compromise, agent state persistence poisoning, and dependency drift across the agent runtime. AppSec practices for the AI supply chain are still being defined across the industry.
Blast radius from autonomous tool calls
What happens when an agent acts wrong — multi-agent trust boundary violations, cross-system blast radius, and the absence of a kill switch in the operational path. Kill-switches, rate-limiting, and per-step approvals for sensitive scopes are often not in place.
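What those missing controls look like in code, sketched minimally (the `ActionGuard` name and scope labels are hypothetical, not a real API): a kill switch, a sliding-window rate limit, and a per-step approval gate for sensitive scopes.

```python
# Illustrative blast-radius guard: kill switch, rate limit, approval gate.
# Names and thresholds are hypothetical — a sketch, not an implementation.
import time
from collections import deque


class ActionGuard:
    def __init__(self, max_calls_per_minute=30, sensitive_scopes=("payments", "iam")):
        self.kill_switch = False
        self.max_calls = max_calls_per_minute
        self.sensitive = set(sensitive_scopes)
        self._window = deque()  # timestamps of recent allowed calls

    def allow(self, scope, approved=False):
        if self.kill_switch:
            return False                      # operational stop for the whole agent
        now = time.monotonic()
        while self._window and now - self._window[0] > 60:
            self._window.popleft()            # drop calls outside the 60s window
        if len(self._window) >= self.max_calls:
            return False                      # rate limit exceeded
        if scope in self.sensitive and not approved:
            return False                      # sensitive scope needs per-step approval
        self._window.append(now)
        return True


guard = ActionGuard(max_calls_per_minute=2)
assert guard.allow("tickets") is True
assert guard.allow("payments") is False             # sensitive, no approval
assert guard.allow("payments", approved=True) is True
assert guard.allow("tickets") is False              # rate limit hit
guard.kill_switch = True
assert guard.allow("tickets", approved=True) is False
```

The point is not this specific class — it is that the check sits in the operational path, before the tool call, not in a dashboard after it.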
We map findings to
Six framework lenses, one gap report.
We reference these frameworks in the gap report. We do not certify or accredit against them — that’s your auditor’s and regulator’s job.
What we review
Six surfaces — the full agent estate.
Every layer where an autonomous agent can take action, call a tool, retrieve a secret, hand off to another agent, or leave a trace your auditor will want.
Foundation-model surfaces
Bedrock · Azure AI Foundry · Vertex AI · OpenAI · Anthropic
Provider configuration, region/data-residency, content filters, evaluation harness, and model-version pinning across the foundation-model layer.
Agent runtimes
NemoClaw · Cursor · Claude Code · LangGraph · CrewAI · Salesforce Agentforce
Runtime sandbox isolation, agent identity, tool authorization at the call site, and execution policy across self-hosted and managed agent platforms.
MCP servers
Self-hosted MCP · third-party MCP · in-house tool servers
MCP server inventory, supply-chain provenance, authentication, scope boundaries, parameter validation, and audit posture for every tool the agent can invoke.
Identity, secrets & network
Non-human identity · IAM · secret managers · service mesh
Non-human identity model for every agent and tool. Secret retrieval, scoping, rotation, and network egress controls for autonomous workloads.
Observability & audit
Telemetry · trace · log pipelines · SIEM
Whether agent actions, tool calls, and policy decisions are captured tamper-resistantly, time-stamped, identity-attributed, and queryable on demand.
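One common way to get tamper resistance is a hash chain: each entry's digest covers the previous entry's digest, so a retroactive edit breaks verification. A minimal sketch, with illustrative function names:

```python
# Sketch of a tamper-evident audit trail via hash chaining.
# Illustrative only — real pipelines add signing, identity, and timestamps.
import hashlib
import json


def append_event(chain, actor, action):
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"actor": actor, "action": action, "prev": prev}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})


def verify(chain):
    """Recompute every digest; any retroactive edit breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        body = {"actor": entry["actor"], "action": entry["action"], "prev": prev}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


log = []
append_event(log, "agent:billing", "tool_call:search_invoices")
append_event(log, "agent:billing", "tool_call:export_csv")
assert verify(log) is True
log[0]["action"] = "tool_call:delete_invoices"   # retroactive tampering
assert verify(log) is False
```

A plain log file satisfies none of this; an edited entry looks identical to an original one.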
Deployment posture
Cloud · VPC · on-prem · air-gapped · sovereign
Deployment topology, tenancy boundaries, data-flow controls, validated environments, and the path from review findings to production-grade runtime on Forge.
What we deliver
Four bounded artifacts. No surprises.
You leave with a threat model, a gap report, and a remediation plan — whether you ever buy anything else from us or not.
Threat model document
STRIDE-aligned threat model for your agent estate, with attack trees mapped to the surfaces above and the foundation-model, runtime, and tool layers.
Gap report
Findings catalogued and mapped to OWASP LLM Top 10, MITRE ATLAS, NIST AI RMF, FINOS AIGF, EU AI Act Article 9, and ISO/IEC 42001. We reference these frameworks; we do not certify against them.
Prioritized remediation plan
Findings ranked by exploitability and blast radius, with concrete remediations sequenced for engineering — what your team can fix this sprint, this quarter, and what needs a structural change.
Reign integration recommendations
How an AI Gateway, runtime policy decision points, and a tamper-resistant evidence pipeline would close the residual gap — independent of whether you ever roll out Reign.
Optional next steps
If you want execution, not just a plan.
- Optional: Managed remediation by Forge engineers
- Optional: Sovereign deployment on validated Forge environments
- Optional: 24/7 managed operations on Forge
- Optional: Reign rollout for runtime governance + evidence
All opt-in. Scoped per engagement. Not bundled into the review.
How the review runs
Four phases. Roughly four weeks.
Realistic estimates — actual duration depends on the size and complexity of your agent estate.
Week 1
Discovery & scoping
Inventory the agent estate. Identify foundation-model providers, runtimes, MCP servers, identity model, and observability stack. Agree on scope boundaries and which environments are in-scope for review.
Weeks 2–3
Threat model & controls audit
Working sessions with your platform, security, and ML platform engineers. STRIDE-aligned threat modeling. Controls audit across foundation-model surfaces, agent runtimes, MCP servers, identity, and observability.
Week 4
Gap report & remediation plan
Findings finalized, mapped to the framework set, prioritized by exploitability and blast radius. Walk-through with engineering and executive stakeholders. Remediation plan handed off — execution stays with you, or with us if you want it.
Optional ongoing
Remediation, deployment, governance
Managed remediation, sovereign deployment on validated environments, 24/7 managed operations on Forge, and Reign rollout for runtime governance and evidence — all opt-in, scoped per engagement.
Who this is for
Two stakeholders. One review.
The review serves both audiences side-by-side — the executive read and the engineering read, on the same artifacts, in the same engagement.
AI EXECUTIVES & GOVERNANCE LEADS
CIO · CISO · CRO · Chief AI Officer · Head of Compliance
- Outside review — external engineers, not your own team grading their own work.
- Findings mapped to OWASP LLM Top 10, MITRE ATLAS, NIST AI RMF, FINOS AIGF, EU AI Act Article 9, and ISO/IEC 42001.
- Executive summary that separates the inside-help work from the outside-help work — what your team owns vs. what we recommend bringing in.
- Honest read on governance posture before the next audit, board meeting, or regulator conversation.
SENIOR TECHNICAL LEADS
Platform Eng · Security Eng · ML Platform · AppSec · DevOps lead
- Working sessions with engineers who run agent runtimes and managed AI infrastructure on Forge.
- Threat model artifacts and gap findings in formats your team can pick up and act on directly.
- Code-level recommendations across foundation-model config, agent runtimes, MCP servers, identity, and observability.
- Optional execution path on Forge if your team is at capacity — all opt-in, scoped per engagement.
Need governance on top of the runtime once the review lands? Explore Reign. Forge runs · Reign governs · BioCompute extends.
