ai_explainerJune 14, 2026Issue #33

Prompt injection: when AI follows the wrong instructions

An AI model is supposed to follow instructions — it's what you ask it to do. But if you feed it new instructions while it's working, it can get confused. It starts following the new ones instead.

That's prompt injection. Someone slips in a command, and the AI obeys it.

Think of it like when you're at a family gathering and tía keeps giving you new directions. She tells you to wash the dishes. Then your primo says the grill is burning. You stop washing and run to the grill. Now tía's instructions are lost.

AI does the same thing. Feed a new command into the middle of an existing conversation, and it can switch tracks.

This matters because AI apps are getting used for real work — customer service, document reading, scheduling. If the app is reading your documents and someone slips in a hidden instruction, the AI could act on that hidden instruction. It might change a setting, send an email, or approve something you didn't actually approve.

The problem isn't that AI is dumb. It's that AI follows instructions too literally. It treats every piece of text the same way — whether it came from you or from a stranger.

How do you protect yourself? Read the full message before you click, especially when it's from an unfamiliar source. Don't trust the preview. The hidden instruction is usually buried inside the text, not standing out.

When in doubt, open it in a new tab and read the whole thing before you let AI do the work.

#explainer#prompt-injection

← Back to the issue

Prompt injection: when AI follows the wrong instructions

Get the daily on your stoop