OWASP just published an AI agent security cheat sheet. Good.
They didn’t find anything new. They gave the industry a shared baseline — and that’s what’s actually been missing.
I’ve been building for the web for over 20 years. OWASP has always been the reference I actually use — not aspirational reading, but the thing you check when you’re building something that matters. The Top 10. The cheat sheet series. When they formalize something, it means the industry has agreed it’s real enough to put in writing.
This one is about AI agents. And it’s right on time.
Old principles, new context
Most of what’s in the cheat sheet is familiar if you’ve spent any time in web security.
Least privilege. Treat external input as untrusted. Validate before you execute. Require approval for dangerous operations. These aren’t new ideas — they’re the same principles that defined secure web development for two decades. The difference is scale. An agent with tool access, write permissions, and network access has a much larger blast radius than a form input or a misconfigured endpoint.
The cheat sheet is explicit about this: user input is untrusted, retrieved documents are untrusted, API responses are untrusted. That’s the correct default. Most agent deployments skip it entirely, which is exactly where web apps were in 2004.
You don’t secure an agent by telling it to be careful. You secure it by making dangerous operations impossible, narrow, or reviewable. Runtime controls, not prompt wording.
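A runtime control can be as small as a guard in front of tool dispatch. Here's a minimal sketch, not a real framework; the tool names, the `DANGEROUS_TOOLS` set, and the `approve` callback are all hypothetical:

```python
# Tools the agent may call. Stubs standing in for real implementations.
TOOLS = {
    "search_docs": lambda query: f"results for {query}",   # read-only
    "delete_file": lambda path: f"deleted {path}",         # write-capable
}

# Operations that must never run on the model's say-so alone.
DANGEROUS_TOOLS = {"delete_file", "run_shell", "send_email", "transfer_funds"}

def execute_tool(name, args, approve=lambda name, args: False):
    # The guard lives in code, not in the prompt: a dangerous tool call
    # fails closed unless the approval callback (a human, or a policy
    # engine) explicitly says yes.
    if name in DANGEROUS_TOOLS and not approve(name, args):
        raise PermissionError(f"tool {name!r} requires approval")
    return TOOLS[name](**args)
```

With the default callback, `execute_tool("search_docs", {"query": "owasp"})` succeeds while `execute_tool("delete_file", {"path": "/tmp/x"})` raises. No prompt injection can flip that outcome, because the prompt never reaches the guard.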
What’s genuinely new
Two things stand out as different from anything in classic web security.
The first is memory as a persistent attack surface. Prompt injection within a session is bad but bounded. Memory poisoning is different — the bad instruction survives the session and fires later, for a different user, in a completely unrelated workflow. Web apps don’t have this problem. Agents do.
The second is multi-agent trust boundaries. When agents call other agents, the receiving agent has no reliable way to verify the sender’s intent or integrity. You need the same verification you’d apply to any external service — signed messages, validation, scope limits. Most teams haven’t started thinking about this yet.
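The signed-message half of that verification can be sketched with a shared-key HMAC using only the standard library. This is illustrative, not a protocol: key distribution and rotation are out of scope, and `SHARED_KEY` is a placeholder:

```python
import hmac
import hashlib
import json

SHARED_KEY = b"rotate-me"  # placeholder; in practice, per agent pair, distributed out of band

def sign_message(payload: dict, key: bytes = SHARED_KEY) -> dict:
    # Canonical serialization so both sides hash identical bytes.
    body = json.dumps(payload, sort_keys=True).encode()
    sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def verify_message(message: dict, key: bytes = SHARED_KEY) -> dict:
    # The receiving agent recomputes the MAC; a tampered payload
    # is rejected before any of its instructions are acted on.
    body = json.dumps(message["payload"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["sig"]):
        raise ValueError("message failed integrity check")
    return message["payload"]
```

This gives integrity, not intent: the receiver still needs scope limits on what a verified sender is allowed to ask for.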
These aren’t extensions of the old model. They’re structurally different attack surfaces that require their own mitigations.
Where to start
If you’re already using agents, the first pass is not complicated:
Split read-only tools from write-capable tools
Put approval in front of deletion, code execution, outbound communication, money movement
Isolate memory per user and per workspace
Add spend caps, logs, and basic anomaly detection
Not a complete security program. Enough to stop most of the obvious stuff.
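The memory-isolation and spend-cap items, concretely, are a few dozen lines each. A minimal sketch; the class and method names are hypothetical:

```python
from collections import defaultdict

class AgentMemory:
    """Memory keyed by (user, workspace): a poisoned entry written in
    one user's session can never fire in another user's workflow."""
    def __init__(self):
        self._store = defaultdict(list)

    def remember(self, user, workspace, note):
        self._store[(user, workspace)].append(note)

    def recall(self, user, workspace):
        return list(self._store[(user, workspace)])

class SpendCap:
    """Hard ceiling on agent spend; fails closed when exceeded."""
    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = 0.0

    def charge(self, amount):
        if self.spent + amount > self.limit:
            raise RuntimeError("spend cap exceeded")
        self.spent += amount
```

The point of both is the same as the approval gate: the boundary is enforced by the runtime, so no prompt, retrieved document, or poisoned memory can talk its way past it.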
The part the cheat sheet doesn’t cover
The hard part isn’t knowing the controls. The hard part is deciding how much autonomy you can actually afford.
More autonomy means more output without you watching. Fine. But it also gets harder to verify what happened: what the agent touched, what it changed, whether the action chain stayed inside policy. Security and verification are the same problem from different angles.
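One way to keep that verification tractable is an append-only record of every action the agent took, which policy checks can run over after the fact. A minimal sketch with hypothetical names:

```python
import time

class ActionLog:
    """Append-only record of what the agent did, so more autonomy
    doesn't mean less ability to audit it afterward."""
    def __init__(self):
        self.entries = []

    def record(self, tool, args, result_summary):
        self.entries.append({
            "ts": time.time(),
            "tool": tool,
            "args": args,
            "result": result_summary,
        })

    def violations(self, policy):
        # policy is a predicate over a single entry; anything it
        # rejects is an action chain that left the allowed envelope.
        return [e for e in self.entries if not policy(e)]
```

A nightly job that runs `violations` against your policy won't prevent an incident, but it turns "we have no idea what the agent did" into a query.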
Twenty years ago, the OWASP Top 10 gave web development a common floor. Teams that adopted it early avoided a category of incidents that everyone else had to learn the expensive way.
This feels like that moment for agents.


