Forge · AI Security Review

    Are your agents secure?

    Most aren’t.Forge runs an outside security review and gap remediation across your AI agent stack — Bedrock, AI Foundry, Vertex, NemoClaw, Cursor, Claude Code, and MCP servers.

    You leave with a threat model, a gap report, and a remediation plan — whether you ever buy anything else from us or not.

    Framework-mapped

    Findings mapped to

    OWASP · MITRE · NIST · FINOS AIGF

    Fixed-scope

    Engagement model

    Threat model · gap report · plan

    Outside review

    External engineers

    Working in your environment

    Optional

    Managed remediation

    Reign · 24/7 ops on Forge

    The new attack surface

    Six categories of risk that didn’t exist three years ago.

    Generative AI broke the network and application security models you spent the last decade hardening. Agentic AI broke the runtime policy model on top of that. The review starts here.

    Tool-call abuse via MCP

    Agents invoke tools the platform allowed but the policy did not. Tool poisoning, server impersonation, and parameter manipulation across MCP and native tool runtimes. Many enterprise MCP rollouts ship without a tool registry, policy layer, or audit trail.

    Prompt injection & jailbreaks

    Direct, indirect, and cross-tenant prompt injection that subverts system instructions, exfiltrates context, or coerces the model into unauthorized actions. Indirect injection through documents, tickets, and emails is now common; vendor-provided guardrails alone don't stop layered attacks.

    Credential & secret leakage

    Agent-mediated credential harvesting, secret exposure through tool outputs, and over-broad service-account scopes for non-human actors. SecOps practices designed for traditional services often don't cover AI-specific credential paths.

    Output abuse & data exfiltration

    Sensitive data leaving via model output, side-channel exfiltration through tool callbacks, and inadvertent disclosure via observability pipelines. DLP rules are often not extended to AI-bound traffic.

    Supply-chain risk on weights & tools

    Untrusted model weights, MCP server supply-chain compromise, agent state persistence poisoning, and dependency drift across the agent runtime. AppSec practices for the AI supply chain are still being defined across the industry.

    Blast radius from autonomous tool calls

    What happens when an agent acts wrong — multi-agent trust boundary violations, cross-system blast radius, and the absence of a kill switch in the operational path. Kill-switches, rate-limiting, and per-step approvals for sensitive scopes are often not in place.

    We map findings to

    Six framework lenses, one gap report.

    OWASP LLM Top 10
    MITRE ATLAS
    NIST AI RMF (AI 100-1)
    FINOS AIGF · v2.0 agentic-AI controls
    EU AI Act Article 9
    ISO/IEC 42001

    We reference these frameworks in the gap report. We do not certify or accredit against them — that’s your auditor’s and regulator’s job.

    What we review

    Six surfaces — the full agent estate.

    Every layer where an autonomous agent can take action, call a tool, retrieve a secret, hand off to another agent, or leave a trace your auditor will want.

    Foundation-model surfaces

    Bedrock · Azure AI Foundry · Vertex AI · OpenAI · Anthropic

    Provider configuration, region/data-residency, content filters, evaluation harness, and model-version pinning across the foundation-model layer.

    Agent runtimes

    NemoClaw · Cursor · Claude Code · LangGraph · CrewAI · Salesforce Agentforce

    Runtime sandbox isolation, agent identity, tool authorization at the call site, and execution policy across self-hosted and managed agent platforms.

    MCP servers

    Self-hosted MCP · third-party MCP · in-house tool servers

    MCP server inventory, supply-chain provenance, authentication, scope boundaries, parameter validation, and audit posture for every tool the agent can invoke.

    Identity, secrets & network

    Non-human identity · IAM · secret managers · service mesh

    Non-human identity model for every agent and tool. Secret retrieval, scoping, rotation, and network egress controls for autonomous workloads.

    Observability & audit

    Telemetry · trace · log pipelines · SIEM

    Whether agent actions, tool calls, and policy decisions are captured tamper-resistantly, time-stamped, identity-attributed, and queryable on demand.

    Deployment posture

    Cloud · VPC · on-prem · air-gapped · sovereign

    Deployment topology, tenancy boundaries, data-flow controls, validated environments, and the path from review findings to production-grade runtime on Forge.

    What we deliver

    Four bounded artifacts. No surprises.

    You leave with a threat model, a gap report, and a remediation plan — whether you ever buy anything else from us or not.

    Threat model document

    STRIDE-aligned threat model for your agent estate, with attack trees mapped to the surfaces above and the foundation-model, runtime, and tool layers.

    Gap report

    Findings catalogued and mapped to OWASP LLM Top 10, MITRE ATLAS, NIST AI RMF, FINOS AIGF, EU AI Act Article 9, and ISO/IEC 42001 — for the gap report. We reference these frameworks. We do not certify against them.

    Prioritized remediation plan

    Findings ranked by exploitability and blast radius, with concrete remediations sequenced for engineering — what your team can fix this sprint, this quarter, and what needs a structural change.

    Reign integration recommendations

    How an AI Gateway, runtime policy decision points, and a tamper-resistant evidence pipeline would close the residual gap — independent of whether you ever roll out Reign.

    Optional next steps

    If you want execution, not just a plan.

    • Optional: Managed remediation by Forge engineers
    • Optional: Sovereign deployment on validated Forge environments
    • Optional: Optional 24/7 managed operations on Forge
    • Optional: Reign rollout for runtime governance + evidence

    All opt-in. Scoped per engagement. Not bundled into the review.

    How the review runs

    Four phases. Roughly four weeks.

    Realistic estimates — actual duration depends on the size and complexity of your agent estate.

    Week 1

    Discovery & scoping

    Inventory the agent estate. Identify foundation-model providers, runtimes, MCP servers, identity model, and observability stack. Agree on scope boundaries and which environments are in-scope for review.

    Weeks 2–3

    Threat model & controls audit

    Working sessions with your platform, security, and ML platform engineers. STRIDE-aligned threat modeling. Controls audit across foundation-model surfaces, agent runtimes, MCP servers, identity, and observability.

    Week 4

    Gap report & remediation plan

    Findings finalized, mapped to the framework set, prioritized by exploitability and blast radius. Walk-through with engineering and executive stakeholders. Remediation plan handed off — execution stays with you, or with us if you want it.

    Optional ongoing

    Remediation, deployment, governance

    Managed remediation, sovereign deployment on validated environments, 24/7 managed operations on Forge, and Reign rollout for runtime governance and evidence — all opt-in, scoped per engagement.

    Who this is for

    Two stakeholders. One review.

    The review serves both audiences side-by-side — the executive read and the engineering read, on the same artifacts, in the same engagement.

    AI EXECUTIVES & GOVERNANCE LEADS

    CIO · CISO · CRO · Chief AI Officer · Head of Compliance

    • Outside review — external engineers, not your own team grading their own work.
    • Findings mapped to OWASP LLM Top 10, MITRE ATLAS, NIST AI RMF, FINOS AIGF, EU AI Act Article 9, and ISO/IEC 42001.
    • Executive summary that separates the inside-help work from the outside-help work — what your team owns vs. what we recommend bringing in.
    • Honest read on governance posture before the next audit, board meeting, or regulator conversation.

    SENIOR TECHNICAL LEADS

    Platform Eng · Security Eng · ML Platform · AppSec · DevOps lead

    • Working sessions with engineers who run agent runtimes and managed AI infrastructure on Forge.
    • Threat model artifacts and gap findings in formats your team can pick up and act on directly.
    • Code-level recommendations across foundation-model config, agent runtimes, MCP servers, identity, and observability.
    • Optional execution path on Forge if your team is at capacity — all opt-in, scoped per engagement.

    Need governance on top of the runtime once the review lands? Explore Reign·Forge runs · Reign governs · BioCompute extends.

    Get an honest read on your agentic AI security posture.

    Independent review. Bounded scope. Concrete artifacts your team can act on Monday morning.

    You leave with a threat model, a gap report, and a remediation plan — whether you ever buy anything else from us or not.