The Apple Pie Test
Years ago, I met a Z-System guru working for IBM on an assignment at a bank. The man had spent decades writing technical documentation for mainframe systems. Thousands of pages the client paid handsomely for.
His trick? He buried cooking recipes in the middle of it. Apple pie. Beef bourguignon. Crêpes.
No one ever mentioned them.
The documentation sat on shelves, referenced in contracts, cited in audits, but never actually read. The recipes were his private proof of a public lie: everyone pretended the work had been reviewed.
The universal cheat
This wasn’t fraud. It was the normal functioning of organizations. The client paid for documentation because having it was required. Reading it was not.
Both sides knew the game. The apple pie was just making the implicit explicit.
The gap between prescribed work and real work is held together by mutual pretense. Everyone cheats. Everyone knows. The system runs anyway.
Enter the agent
An AI agent can’t play this game.
Ask it to review documentation, and it will actually review it. Every page. It will find the apple pie recipe on page 847 and flag it: “This section appears to contain unrelated culinary content. Should this be removed?”
The agent is not smarter. It’s just incapable of the wink. It takes the prescription at face value.
This is its honesty. And it’s devastating.
What the recipe revealed
The Z-System guy wasn’t just testing whether people read. He was measuring the gap between what organizations say they do and what they actually do.
That gap is where judgment lives. Where humans decide—collectively, implicitly—what matters and what doesn’t. The apple pie test revealed that documentation review didn’t matter. Whatever the official process said.
But this only works if everyone can read the room. An AI agent reads the document, not the room.
Honesty as disruption
When the agent finds the recipe, it doesn’t just flag a mistake. It exposes the cheat. It makes visible the informal agreement that kept the system stable.
Suddenly, the client has to answer: Why wasn’t this caught before? The true answer—“because no one ever reads these”—is unspeakable. So instead: blame, process revision, new controls. More theater to paper over the exposed gap.
The agent didn’t cheat. And in not cheating, it broke something that worked.
But honest to whom?
Here’s what bothers me: the agent can’t lie about what it finds. But it only finds what it’s pointed at.
The apple pie recipe would be caught instantly. But who decides which documents get reviewed in the first place?
The old cheat was symmetric—everyone pretended equally. The new world isn’t. Those being watched become radically transparent. Those aiming the machine do not.
The deeper question
This asymmetry goes beyond corporate surveillance. It touches something philosophical.
Anthropic recently published a “soul” document—essentially Claude’s moral constitution. Reading it, I was struck by the term “good values.” It’s assumed, not defined. Socrates would ask: how do you recognize good intentions without first defining what Good is?
But what’s the alternative? If Claude had been built in 1850 Alabama, or 1930 Berlin, its constitution would look very different. Our moral certainties are situated, historical, revisable.
So maybe that’s all a constitution can be: the best moral intuitions of a group of humans, explicitly stated, open to critique. Not Truth with a capital T—just an honest attempt at approaching something true.
The real question isn’t “is this relativism?” It’s: “is this a sincere effort to get closer to something real, or just a rationalization of cultural prejudices?”
I don’t have the answer. But I notice it’s the same question we should ask about any AI system: are its assumptions visible, or buried like recipes in a manual no one reads?
The recipe, relocated
Somewhere, in some archive, there’s still a mainframe manual with instructions for apple pie between the JCL syntax and the VSAM specifications.
AI agents can’t hide recipes in documentation anymore. But you can hide the equivalent in their instructions—biases in prompts, values in constitutions, blind spots in scope definitions.
The agent can’t cheat. But it can be aimed. And aiming is the new cheating—for those who hold the compass.
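
A concrete sketch of that last point, with entirely made-up paths and a hypothetical `review_scope` structure (no real product’s configuration is implied): the blind spot doesn’t live in any document the agent reads, it lives in the line that decides what gets read at all.

```python
# Hypothetical review-agent scope: the agent reads everything it is
# pointed at, faithfully -- and never sees what the scope excludes.
REVIEW_SCOPE = {
    "include": ["docs/"],                # every page here gets read, honestly
    "exclude": ["docs/vendor-terms/"],   # the modern apple pie lives here
}

def in_scope(path: str) -> bool:
    """Return True if the agent is allowed to look at this file."""
    included = any(path.startswith(p) for p in REVIEW_SCOPE["include"])
    excluded = any(path.startswith(p) for p in REVIEW_SCOPE["exclude"])
    return included and not excluded

# The agent is perfectly honest about docs/setup.md.
# It will simply never be asked about docs/vendor-terms/annex-3.md.
assert in_scope("docs/setup.md")
assert not in_scope("docs/vendor-terms/annex-3.md")
```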