We Need a New Security Paradigm for the Agentic World

What I worked on

I went down a rabbit hole on the ways LLMs expand the attack surface of normal products and built OpenTrap as a way to test for it.

The goal is to turn AI attack research into runnable product tests so teams can check how their product behaves when adversarial content enters the normal workflow.

What I noticed

A lot of these attacks feel novel, but they are surprisingly simple to implement and scale.

That concerning now that more non-technical people are building agentic apps and workflows. The tooling makes it easier to build, but the security model has not caught up.

In the past, the main concern was whether untrusted input could execute code, persist somewhere, or deceive a human operator.

In agentic systems, the bar changes. Any input should be treated as untrusted if it can steer, persist, or deceive an agentic operator. That means any content sent to an LLM for a decision should be treated as untrusted input.

”Aha” Moment

If the model is connected to tools, payment flows, CRMs, inboxes, ticketing systems, MCP servers, or internal workflows the impact could be serious.

While working on the next OpenTrap test, I found and submitted exploits against Adobe’s and Notion’s AI assistants. Details here.

If large companies with dedicated security teams can miss these kinds of attack vectors, it is hard to imagine smaller teams catching them early.

Adobe exploit

Notion exploit

What still feels messy

N/A

Next step

OpenTrap is a work in progress. It currently includes one end to end trap around prompt injection. The goal is to validate the approach and collect feedback.

What I worked on

What I noticed

”Aha” Moment

What still feels messy

Next step

Command Palette

Choose Theme