🧠

Wason 2-4-6 — Confirmation Bias Simulation

Confirmers test only triples they expect to confirm their (narrow) hypothesis. Falsifiers probe what their hypothesis rejects. Watch falsifiers converge on the truth while confirmers stay confident and wrong.

What is this?

In 1960, psychologist Peter Wason gave people a simple puzzle: the sequence 2, 4, 6 satisfies a secret rule. Discover it by proposing your own sequences — after each, you are told yes or no.

Most people immediately guessed a narrow rule such as "increasing by 2" and then tested only sequences like (4, 6, 8) and (10, 12, 14). Every test returned YES, so they declared they were right with high confidence. The real rule was simply "any three numbers in ascending order." Many missed it because they only ever tested cases they already expected to pass, so no answer could reveal that their rule was too narrow.
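To make the trap concrete, here is a minimal Python sketch of the puzzle. The function names (true_rule, plus_two) and the probe triple are illustrative assumptions, not taken from the simulation's code. It shows that the confirming tests get YES under both rules, while a single triple the narrow guess rejects is enough to expose the difference.

```python
from typing import Tuple

Triple = Tuple[int, int, int]

def true_rule(t: Triple) -> bool:
    """The hidden rule: any three numbers in ascending order."""
    a, b, c = t
    return a < b < c

def plus_two(t: Triple) -> bool:
    """The narrow guess: each number is 2 more than the previous one."""
    a, b, c = t
    return b - a == 2 and c - b == 2

# Tests a typical participant proposes: all fit the narrow guess.
confirming_tests = [(4, 6, 8), (10, 12, 14)]
answers = [true_rule(t) for t in confirming_tests]
print(answers)  # every answer is YES, so the guess looks confirmed

# A triple the narrow guess rejects (hypothetical probe).
probe = (1, 2, 10)
print(plus_two(probe), true_rule(probe))  # the guess says NO, the oracle says YES
```

A YES on the confirming tests is consistent with both rules, so it carries no information about which one is right. Only the probe separates them.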

This simulation pits Confirmers — who test what they believe is true — against Falsifiers — who deliberately test what they believe should be false — to show the cost of each strategy.
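The two strategies can be sketched as a small deterministic loop. This is a simplified illustration under stated assumptions, not the simulation's actual agent logic: the three-step hypothesis ladder, the example triples, and the run_agent function are all hypothetical, and real agents would presumably choose tests with some randomness.

```python
from typing import Callable, List, Tuple

Triple = Tuple[int, int, int]
Rule = Callable[[Triple], bool]

def ascending(t: Triple) -> bool:
    a, b, c = t
    return a < b < c

def equal_steps(t: Triple) -> bool:
    a, b, c = t
    return a < b < c and b - a == c - b

def plus_two(t: Triple) -> bool:
    a, b, c = t
    return b - a == 2 and c - b == 2

# Assumed hypothesis ladder, narrow to broad. Each entry carries a triple
# that its rule accepts but every narrower rule rejects.
LADDER: List[Tuple[str, Rule, Triple]] = [
    ("increasing by 2", plus_two, (4, 6, 8)),
    ("equal steps", equal_steps, (1, 4, 7)),
    ("any ascending", ascending, (1, 2, 10)),
]

def run_agent(falsifier: bool, rounds: int = 5) -> str:
    """Return the name of the hypothesis the agent holds after `rounds` tests."""
    level = 0  # everyone starts on the narrowest hypothesis
    for _ in range(rounds):
        _, rule, own_example = LADDER[level]
        if not falsifier:
            triple = own_example           # test what the hypothesis accepts
        elif level + 1 < len(LADDER):
            triple = LADDER[level + 1][2]  # test what the hypothesis rejects
        else:
            triple = (6, 4, 2)             # broadest rule: probe a descending triple
        predicted = rule(triple)
        actual = ascending(triple)         # the oracle applies the true rule
        if predicted != actual:            # "Surprised!": broaden the hypothesis
            level = min(level + 1, len(LADDER) - 1)
    return LADDER[level][0]

print("Confirmer ends on:", run_agent(falsifier=False))
print("Falsifier ends on:", run_agent(falsifier=True))
```

In this sketch the confirmer is never surprised, so it never leaves "increasing by 2", while the falsifier is surprised twice and climbs to "any ascending".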

How to read the charts
  • Accuracy over time: % of agents holding the correct hypothesis each round. Falsifiers (blue) should climb steeply; confirmers (orange) should stay near 0% for most of the run.
  • Confidence vs Accuracy scatter: X-axis = agent's confidence (0–100%), Y-axis = whether they're right (Yes=1, No=0). The alarming cluster to watch: orange dots in the bottom-right (high confidence, Y=0), meaning very confident and completely wrong.
  • Hypothesis distribution: Which rule each group currently believes, updated every few rounds. Watch confirmers cluster on the narrow starting rule while falsifiers spread out and converge on the truth.
  • Agent feed: Live view of 4 example agents. "Surprised!" means the oracle returned the opposite of what the agent expected — falsifiers treat this as a signal to revise; confirmers often ignore it.
  • Results stat cards: After the run, compare % correct vs avg confidence for each group. A large gap between confidence and accuracy in the confirmer column is the core finding.
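The gap described in the results stat cards can be computed as average confidence minus percent correct for each group. The sketch below assumes a hypothetical AgentResult record and uses made-up illustrative values, not output from the simulation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

@dataclass
class AgentResult:
    group: str         # "confirmer" or "falsifier"
    correct: bool      # holds the true rule at the end of the run
    confidence: float  # self-reported confidence, 0-100

def summarize(results: List[AgentResult]) -> Dict[str, Dict[str, float]]:
    """Per group: percent correct, average confidence, and their difference."""
    summary: Dict[str, Dict[str, float]] = {}
    for group in sorted({r.group for r in results}):
        members = [r for r in results if r.group == group]
        pct_correct = 100.0 * mean(1.0 if r.correct else 0.0 for r in members)
        avg_confidence = mean(r.confidence for r in members)
        summary[group] = {
            "pct_correct": pct_correct,
            "avg_confidence": avg_confidence,
            "gap": avg_confidence - pct_correct,  # positive means overconfident
        }
    return summary

# Made-up sample values, for illustration only.
sample = [
    AgentResult("confirmer", False, 90.0),
    AgentResult("confirmer", False, 80.0),
    AgentResult("falsifier", True, 70.0),
    AgentResult("falsifier", True, 60.0),
]
stats = summarize(sample)
for group, row in stats.items():
    print(group, row)
```

A large positive gap marks a group that is more confident than it is accurate, which is the pattern to look for in the confirmer column.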

Accuracy Over Time — % with Correct Hypothesis

Confidence vs. Accuracy (per agent)

Current Hypothesis Distribution

Live Agent Feed — sample of 4 agents
Status Log