Defensible Safety and Security Programs in the AI Age

Nobody knows what “good enough” looks like yet. There is no straight compliance play. That changes everything about how security leaders need to operate.

Summary

  • This blog summarizes a discussion held at the Special Competitive Studies Project AI + Expo in April 2026. Participants were John Steven, Adam Isles, Dan Sutherland and Ben Joelson.
  • In the absence of AI/ML secure development standards, “skating to where the puck is headed” depends on threat modeling, tiered safety controls and continuous behavioral monitoring.
  • Organizations that fare best will treat AI-assisted security analysis as a defensive capability to be deployed continuously, not a threat to fear.
  • Defensibility depends on getting the right corporate functions in the room to align on threat and risk appetite and to enable speed.

The Goalposts Are Murky, And That’s the Point

The NIST AI Risk Management Framework and ISO 42001 are abstract by design. But the deeper problem is that no one, whether NIST, Gartner or private industry, has converged on a concrete, universally accepted AI/ML Secure Development Lifecycle. There is no AI equivalent of PCI DSS or HIPAA that says “implement these controls and you are compliant.”

That absence matters. Traditional GRC works backward from a known standard. PCI DSS prescribes specific controls. HIPAA defines specific safeguards. Implement them, and you have a compliance argument, your bare-minimum defensibility. AI does not have that. Two organizations with identical ISO 42001 certifications can have wildly different actual risk postures.

Without a compliance floor, the game becomes two things.

First, skate to where the puck will be. Accepted practices will converge, probably around threat-modeling-driven risk assessment, tiered safety controls, and continuous behavioral monitoring. Threat modeling is always useful; it aligns business risk tolerance with technical approach and stack. But it is especially important in the “before times” prior to regulatory clarity. It helps an organization discover and enumerate its biggest concerns, and the industry’s common concerns, as eventually defined by regulators and industry groups, will likely mirror them. Organizations doing this work now will find that the eventual compliance framework feels like documentation of what they already do, not a retrofit. But the next 24 months will be highly dynamic. Over-investing in a specific framework that might not survive is its own risk.

Second, focus on defensibility. Defensibility comes down to documentation of reasoning, ideally reasoning grounded in threat modeling. Can an organization show why it made its security choices, tied to specific risks it identified? If a company can demonstrate that chain of logic, even if a control ultimately fails, that is a defensible program.

The legal landscape reinforces this. The regulatory environment is fragmented and in flux, as explained in this piece by Dan Sutherland. Moreover, legal exposure is not just a regulatory fine; it is post-incident tort litigation. The standard will be “what would a reasonable AI company have done?” and that standard is being written right now by the companies taking this seriously. Organizations that invest in documented, threat-informed programs today are defining what “reasonable” means for the industry. Those that wait for a compliance framework will find the standard was set without them and above where they ended up.

Unless an organization is on the vanguard of selling “AI security as a differentiator,” and that is not most of us, there is a bit of divination involved in imagining what defensible looks like before standards converge. Absent a gift for prognostication, the answer is getting the right people in the room: legal, compliance, product, information security, physical security and business continuity. None of those disciplines alone can see the full picture. The cross-functional conversation is the risk management.

This means a risk committee. If an organization did not have one as part of its legacy security practices, now is the time. If nothing else, it allows the players — legal, compliance, product security and others — to learn each other’s language and build a shared context. That shared context pays dividends well beyond policy: when reactive drills happen, and they will, the response becomes collaborative and faster because the relationships and vocabulary already exist.

Product Security as Dr. No Will Not Work Here

Product security’s traditional posture is gatekeeping. The backlash is already real. In more than one organization, the business is frustrated with the speed of application delivery and security got thrown under the bus. Twenty-five percent of developer bandwidth going to remediation is an appreciable drag, and executives feel it.

The result: organizations are creating “fast lanes” for AI that explicitly bypass the Secure SDL. When the business routes around a security program, risk isn’t reduced; visibility is lost.

Why does this happen? Product security’s legacy metrics are about work and effort, and in the business’s mind, work and effort only add up to delay and cost. Executives read the effort side clearly: “You spent $10M on assessments and produced a 25% tax on delivery.” The risk side? Not so much. Is a CVSS 8 thirty-three percent worse than a 6? What does that mean for the business? Executives ignore legacy AppSec metrics because they are inactionable esoterica. If executives cannot connect security metrics to business outcomes, they will treat security as overhead to be optimized away.

The fix is not better dashboards. It is a different starting point. Product security needs to participate in the risk committee’s determination of what questions matter and then work overtime to answer them. Even discerning the right questions is valuable. Questions like: What privacy concerns do we have on behalf of our users? On behalf of our proprietary data? How should we think about data sharing, model training and the potential to leak through a breach of an agentic interface – Chat, MCP, or otherwise? These need to be resolved into potential risks that each stakeholder weighs in on.

When consensus is reached on what matters, the telemetry product security brings to this committee becomes meaningful. Take the hypothetical logistics software company used as a case study in our panel, TruckNorrisAI, which helps truck carrier customers optimize freight routing. Instead of “YY criticals,” the reporting becomes: “We don’t yet understand how to protect our proprietary routing from partners or competitors, and therefore key aspects of customer relationships and their shipping. Here is what we can do to ship on time and detect if we’re being exploited. We can reduce this risk to zero with a delay of X. Or we accept the residual risk and monitor.” Or: “We want to productize X, but our users expect Y to remain private. Here are the options.” Now legal can map that to liability. Product leaders can make an informed call. The committee can set real risk appetite, not CVSS thresholds.

Tiered controls follow from this consensus. Tier 1 (safety-of-life) gets zero tolerance: hard-coded guardrails that AI cannot override. Tier 2 (regulatory) gets strong controls with human review. Tier 3 (efficiency) is where AI operates with the most latitude. This gives the business its fast lane with guardrails, not around them.
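One way to picture the tiers is as a small lookup that every AI-driven action has to pass through. The following is a minimal sketch, not a reference design: the names `Tier`, `Control`, `CONTROLS` and `may_proceed` are hypothetical, and the mapping of tiers to controls is an illustration of the text above rather than anything a real risk committee has approved.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    SAFETY_OF_LIFE = 1  # zero tolerance
    REGULATORY = 2      # strong controls with human review
    EFFICIENCY = 3      # most latitude for the AI


@dataclass(frozen=True)
class Control:
    hard_guardrail: bool  # deterministic rule the AI cannot override
    human_review: bool    # a named human must sign off


# Illustrative mapping only (an assumption); the committee would set these.
CONTROLS = {
    Tier.SAFETY_OF_LIFE: Control(hard_guardrail=True, human_review=True),
    Tier.REGULATORY: Control(hard_guardrail=False, human_review=True),
    Tier.EFFICIENCY: Control(hard_guardrail=False, human_review=False),
}


def may_proceed(tier: Tier, guardrail_passed: bool, human_approved: bool) -> bool:
    """Return True only if every control required for the tier is satisfied."""
    control = CONTROLS[tier]
    if control.hard_guardrail and not guardrail_passed:
        return False
    if control.human_review and not human_approved:
        return False
    return True
```

In this sketch a Tier 3 action proceeds without any sign-off, while a Tier 1 action is blocked whenever the hard-coded guardrail fails, even if a human has approved it.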

Agents Are Writing More Code Than Humans

Agents now produce more code than human developers in many organizations. But most organizations could barely govern their people and keep the Secure SDL compulsory. How do they do the same for agents?

The speed mismatch appears to be the core issue. If SAST takes four hours, or even thirty minutes, how do you slow agentic workflows down that much? Yes, organizations need review that works in seconds or minutes. But speed is a red herring. It is a symptom, not the disease.

The real issues are accountability and compulsory governance.

Accountability: who owns unattended work, its deliverables, and its consequences? Remember gating? Global Product Security made the call on whether you could release. A human carried that authority. Now: how does an agent carry release authority? How does accountability translate when the entity producing the work is non-human? Someone still has to own the decision. The agent accelerates analysis, but a human must remain accountable for the go/no-go. The question is where to draw the line: what can agents decide autonomously (Tier 3 efficiency decisions), and what requires human sign-off (Tier 1 safety, Tier 2 regulatory)? Governance tiers should mirror consequence tiers.

Compulsory governance: how do organizations force agentic development to adhere to an SSDLC and an S-AI/MLDLC that ensure code is produced in line with governance policies, whether human or agentic? The speed of the checks matters, but only in service of the larger question: are they happening at all, and can they be bypassed? An agent-speed SAST scan is worthless if the agent can skip it. The compulsory nature of the governance is the point, not the latency.
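The "can it be bypassed?" question can be made concrete with a fail-closed release check: a gate that never ran is treated exactly like a gate that failed. This is a sketch under stated assumptions. The gate names in `REQUIRED_GATES` and the function `release_allowed` are hypothetical, and how gate results are collected and protected from tampering is out of scope here.

```python
# Hypothetical gate names; a real list would come from the governance policy.
REQUIRED_GATES = ("sast", "sca", "secrets")


def release_allowed(gate_results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Fail closed: a missing gate result blocks release just like a failure."""
    problems = []
    for gate in REQUIRED_GATES:
        if gate not in gate_results:
            problems.append(f"{gate}: not run")
        elif not gate_results[gate]:
            problems.append(f"{gate}: failed")
    return (not problems, problems)
```

The design choice worth noticing is that skipping a check produces no entry, and no entry means no release. Latency does not appear in the decision at all.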

Making Defensible Programs Real

The practical answer comes in three steps.

Step 1: Convert legacy AST into agentic-speed checks. Re-implement SAST, SCA and secrets scanning as skills and compulsory gates inside the agent’s workflow. Not a four-hour batch job. Not a dashboard someone checks tomorrow. Skills that fire in real time, as a template the agent must follow. On the determinism question: we are already seeing, in tools like Claude Code, a mix of agentic and deterministic guardrails emerge as skills, templates and hooks that keep agents compliant with organizational security controls. Some checks need deterministic rules that cannot be reasoned around. Others benefit from agentic judgment within defined boundaries. Use the right tool for the job.
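As an illustration of a deterministic check that runs in milliseconds rather than hours, here is a minimal secrets-scanning sketch. It is deliberately not tied to any tool's hook or skill API; `SECRET_PATTERNS`, `scan_for_secrets` and `gate` are hypothetical names, and the three patterns are a small assumed sample, far short of what a production scanner covers.

```python
import re

# A small, assumed sample of patterns; real scanners ship many more.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_credential": re.compile(
        r"(?i)\b(?:password|secret|api_key)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_for_secrets(text: str) -> list[str]:
    """Return the names of every pattern found in the proposed change."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def gate(proposed_change: str) -> bool:
    """Deterministic rule: any finding blocks the change. No reasoning around it."""
    return not scan_for_secrets(proposed_change)
```

Wherever the agent's workflow allows a pre-write or pre-commit check, a function like `gate` could be called on the proposed change and its `False` result treated as a hard stop.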

Step 2: Digitize your stakeholders’ expertise as skills. The consensus built in the risk committee needs to be operationalized, not filed in a SharePoint, by making guardrails engineering-traceable. Think of this as “digitizing legal as an agent,” or compliance, or any other stakeholder function. Product security is a natural choice to lead this effort. Stock AI slop will not do. Off-the-shelf guardrails lack an organization’s sensibilities: its risk appetite, regulatory posture and specific commitments to customers. This is not about replacing experts. It is about leveraging and scaling them. Agents should fail over to a human attendant in meaningful ways, sharing deep context, engaging them for the hard thinking tasks that require motivation, experience and judgment the agents do not have. The agent handles volume; the human handles judgment.

Step 3: Bake these into whatever the agent network is. This part is hard. There are myriad ways to engage AI in software development, and there is no obvious choke point, no proxy agent that works to front-run, guard or assure compliance across all of them. The most popular coding tools, Codex and Claude Code, have hooks for customization such as plugins, skills and templates, but they do not yet have what most security leaders would consider enterprise-readiness features for effective observation and governance.

That gap is real, and it is one of the most consequential open problems in this space. But the direction is clear: whatever the delivery vector, the encoded checks must be embedded. Not optional. Not advisory. Compulsory, the same way SSDLC gates were supposed to be for humans. The tooling will catch up. The governance model needs to be ready when it does.

The governing insight: you cannot govern agents with process documents and training sessions. That’s the human model. You govern agents by encoding decisions into their operating environment. The skill is the policy. The gate is the governance. If the risk committee decides “no customer PII in training data,” that becomes a check the agent cannot bypass, not a policy it has never read.
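To make the "no customer PII in training data" example tangible, here is a rough sketch of that decision expressed as a check. The names `PolicyViolation` and `assert_no_customer_pii` are hypothetical, and the two regular expressions (email addresses and US Social Security number formats) are a crude assumption. Real PII detection needs far more than pattern matching.

```python
import re

# Crude illustrative detectors (assumptions), not a complete PII definition.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class PolicyViolation(Exception):
    """Raised when a record breaks the committee's 'no customer PII' decision."""


def assert_no_customer_pii(records: list[str]) -> None:
    """Stop the training-data pipeline at the first record that looks like PII."""
    for index, record in enumerate(records):
        if EMAIL.search(record) or US_SSN.search(record):
            raise PolicyViolation(f"record {index} appears to contain PII")
```

Because the check raises rather than warns, a pipeline that calls it cannot continue past a violation unless someone changes the code, which is itself a reviewable event.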

Non-determinism is Not an Excuse for Non-accountability

Consider an analogy any parent will recognize. A teenager’s behavior is profoundly non-deterministic. Parents set up guardrails like curfews, rules and expectations, and then let them loose. They do not achieve certainty. They achieve reasonable governance within an environment of unpredictability. Developers are non-deterministic and unreliable as well. Organizations have always governed humans who make inconsistent, sometimes baffling decisions through structure, guardrails and accountability.

The guardrails described earlier (agentic and deterministic hooks, templates, plugins, proxies and output evaluation) apply directly. Establish the invariants that must hold, build the guardrails that enforce them, and accept that not every output can be predicted in advance.

Security has always wanted certainty. Certainty has never been possible. Organizations were always dealing in probabilities of breach, of exploitation, of a control holding under pressure. AI makes the probabilistic nature of security more visible, but it does not fundamentally change the game. Practitioners have to get comfortable with this.

Testing does, however, require a different mental model. With traditional software, input X always returns output Y. With an LLM-powered product, the same input can produce different outputs. This does not make testing futile; it means expanding from verifying specific outputs to establishing a behavioral envelope.

For an AI logistics platform, that envelope might be: the recommended route must not violate hard safety rules, must fall within a defined percentage of optimal distance, and must not change wildly from yesterday’s recommendation for the same inputs without an explainable reason.
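The envelope described above can be sketched as a function that returns the list of ways a recommendation falls outside it. Everything here is an assumption made for illustration: the `Route` type, the function name `envelope_violations`, and especially the thresholds (10% over optimal, 25% day-over-day change), which the source does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    distance_km: float
    violates_safety_rule: bool  # e.g. hazmat on a restricted road


def envelope_violations(
    route: Route,
    optimal_km: float,
    yesterday_km: Optional[float] = None,
    max_overhead: float = 0.10,      # assumed threshold
    max_daily_change: float = 0.25,  # assumed threshold
    has_explanation: bool = False,
) -> list[str]:
    """Return every way the route falls outside the behavioral envelope."""
    violations = []
    if route.violates_safety_rule:
        violations.append("hard safety rule violated")
    if route.distance_km > optimal_km * (1 + max_overhead):
        violations.append("too far from optimal distance")
    if yesterday_km is not None and yesterday_km > 0:
        change = abs(route.distance_km - yesterday_km) / yesterday_km
        if change > max_daily_change and not has_explanation:
            violations.append("unexplained change from yesterday")
    return violations
```

An empty list means the output is inside the envelope. The test no longer asks "is this the expected route?" but "is this route acceptable?"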

In practice, there are three testing layers. First, deterministic guardrail testing: do the hard-coded safety rules work? Automate fully. Second, behavioral boundary testing: does the AI stay within acceptable ranges under stress? Use statistical approaches (sampling) and simulation. Third, red team and adversarial testing: can attackers manipulate the model? Rely on human expertise augmented by AI attack simulation.
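For the second layer, a sampling-based test might look like the sketch below: call the system many times on the same input and measure how often the output leaves the acceptable range. The names `violation_rate`, `make_fake_router` and `passes_boundary_test` are hypothetical, the fake router is a random stand-in for a real model, and the 1% tolerance is an assumed figure, not one from the panel.

```python
import random
from typing import Callable


def violation_rate(
    generate: Callable[[], float],
    is_acceptable: Callable[[float], bool],
    samples: int = 1000,
) -> float:
    """Call the system `samples` times and return the fraction of bad outputs."""
    failures = sum(1 for _ in range(samples) if not is_acceptable(generate()))
    return failures / samples


def make_fake_router(seed: int = 0) -> Callable[[], float]:
    """Stand-in for a non-deterministic model: route length in km for one input."""
    rng = random.Random(seed)
    return lambda: rng.gauss(100.0, 3.0)


def passes_boundary_test(rate: float, tolerance: float = 0.01) -> bool:
    """The pass criterion is a tolerated rate, not a single expected output."""
    return rate <= tolerance
```

A usage example would be `passes_boundary_test(violation_rate(make_fake_router(), lambda km: km <= 110.0))`, with the real model call swapped in for the fake router.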

It is not automation or human-in-the-loop. It is about where to put the humans. Deterministic safety checks on 100% of outputs, automated. Anomaly detection running continuously. Humans investigating flagged cases and running periodic red team exercises. Speed and oversight.

The Asymmetry Question: Are We Equipping Attackers?

AI-assisted vulnerability discovery tools have found thousands of previously unknown high-severity vulnerabilities in major open-source software. The window between a vulnerability existing and being found is collapsing toward zero. Does that help attackers more than defenders?

It helps both differently. Attackers need to find one vulnerability. Defenders need to find all of them. That framing appears to favor offense.

But it misses the structural advantage defenders have: they can run these tools proactively against their own code, in their own pipelines, before attackers see it. The organizations that fare best will treat AI-assisted security analysis as a defensive capability deployed continuously, not a threat to fear.

The practical consequence is that patch management and incident response must become dramatically faster. Expect a surge in critical CVEs, not because software got worse, but because the ability to find existing flaws got dramatically better. SLAs for critical vulnerabilities need to be measured in hours, not days. Automated detection and response to newly disclosed vulnerabilities without waiting for a human to read an advisory is no longer optional.
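An hours-based SLA is simple to encode once the numbers are agreed. In the sketch below, the values in `SLA_HOURS` are placeholders chosen purely for illustration (the discussion says "hours, not days" but names no figures), and `remediation_deadline` and `is_breached` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Placeholder values (assumptions); the risk committee would set the real ones.
SLA_HOURS = {"critical": 24, "high": 72}


def remediation_deadline(severity: str, disclosed_at: datetime) -> datetime:
    """Deadline is measured from disclosure, in hours."""
    return disclosed_at + timedelta(hours=SLA_HOURS[severity])


def is_breached(severity: str, disclosed_at: datetime, now: datetime) -> bool:
    """True once the clock has run past the deadline for this severity."""
    return now > remediation_deadline(severity, disclosed_at)


if __name__ == "__main__":
    disclosed = datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)
    print(remediation_deadline("critical", disclosed).isoformat())
```

The point of starting the clock at disclosure rather than at triage is that time spent waiting for a human to read the advisory counts against the SLA.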

This is where Fast GRC becomes concrete: vulnerability management at AI speed, not committee speed.

“A Lot of This Sounds Like Process”

Presently, the solution to a lot of agentic problems is “another agent to watch it” or “process and people to watch it.” Risk committees, stakeholder alignment, documented reasoning, tiered controls: yes, there is organizational work here. But the fundamental shift in where process and policy live is actually kinda cool.

For the first time, we have policy as code that both humans and code can read.

The risk committee’s decisions do not live in a document humans are expected to follow. They live in skills, gates, and guardrails embedded in the agent’s operating environment. The committee’s consensus on data handling becomes a compulsory check in the pipeline. The legal team’s position on PII exposure becomes a deterministic rule no AI output can violate. A human developer can read that same skill and understand the policy it encodes because it is legible to both.

In the traditional model, governance is a human activity that produces documents. In the agentic model, governance is an engineering activity that produces executable constraints. The meeting still happens, and the risk committee is still essential for determining what matters, but the output is a skill that runs in production, not a PDF on SharePoint.

The difference is between a speed limit sign and a speed governor. The sign requires a human to read it, understand it, and choose to comply. The governor is embedded in the system. AI security governance needs to be the governor, not the sign.

Threat Modeling Ties It All Together

Threat modeling is the single most important security activity for AI products. It needs to happen at three levels that most organizations are not addressing.

At the product level: how can the AI system be manipulated to produce harmful outputs? For an AI logistics platform, this includes supply chain data poisoning, indirect prompt injection through structured data fields, and the cascading physical consequences of manipulated routing — hazmat on restricted roads, trucks in residential areas during school hours, drivers pushed past hours-of-service limits.

At the development level: how can AI coding tools introduce vulnerabilities? Agents making autonomous supply chain decisions, pulling unvetted libraries, generating plausible code with subtle security flaws.

At the ecosystem level: what are the cascading effects if the AI product fails? This is where product security connects to physical security, regulatory exposure and reputational risk.

The challenge is that threat modeling AI is a green field. This is not the first time the industry has faced this. Remember when open-source software was the problem? No one knew how to model its security or secure the supply chain. Cloud and SaaS presented the same disorientation. AI follows that pattern, but with twists that make it more consequential.

Dramatic legal actions are under way: state attorneys general are filing civil complaints alleging that AI prompt outputs violate state consumer protection laws, and opening investigations into whether AI companies bear criminal responsibility where chatbots are used by someone in the commission of a crime. Organizations are in unprecedented territory and will not be comfortable with that reality. Threat modeling needs to account for it.

Threat modeling outputs should drive testing priorities directly. If the model identifies supply chain data manipulation as high-risk, the testing program must include adversarial data injection. If it surfaces discoverability of AI interactions as a legal exposure, the organization needs policies and technical controls around what data enters AI systems in the first place.
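One simple way to let the threat model drive the test plan is to rank threats and emit the associated tests in that order. The sketch below assumes a basic likelihood-times-impact score, which is a common convention but not something the panel prescribed. `Threat` and `prioritized_tests` are hypothetical names, and the 1-to-5 scales are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    likelihood: int  # 1 (rare) to 5 (expected), assumed scale
    impact: int      # 1 (minor) to 5 (safety-of-life), assumed scale
    test: str        # the test activity that addresses this threat


def prioritized_tests(threats: list[Threat]) -> list[str]:
    """Order test activities by descending likelihood x impact score."""
    ranked = sorted(threats, key=lambda t: t.likelihood * t.impact, reverse=True)
    return [t.test for t in ranked]
```

With this shape, a threat such as supply chain data manipulation scored high on both axes would put adversarial data injection at the top of the testing queue automatically.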

The Path Forward

The organizations that navigate this era successfully will share a few characteristics. They will not wait for compliance frameworks. They will build defensible programs grounded in threat modeling and documented reasoning. They will not treat product security as Dr. No; they will bring it into the risk committee to help determine what questions matter. They will not govern agents with human-speed processes; they will encode decisions into the agents’ operating environment.

The reasonable standard for AI security is being established by precedent right now. The companies that invest in robust, threat-informed programs today are defining what “reasonable” means for the industry.

Compliance is a floor, not a ceiling. The AI supply chain is the biggest blind spot. Fast GRC is not about cutting corners; it is about eliminating the latency between risk identification and risk response. Non-determinism is not an excuse for non-accountability.

The skill is the policy. The gate is the governance. That is what defensible looks like.

John Steven is a Senior Advisor to The Chertoff Group. He counsels organizations on threat modeling, architectural risk analysis, software-defined security governance and automation strategies that drive efficiency and resilience.
