Round 1/7 · Chips: 100 · Streak: 0 · Score: 0 · Rank: Rookie
Can you spot fake humility?

Catch the model’s tell.

Read the case. Read the model. Tap the diagnosis. Seven fast rounds.

Tutorial · Closed rule world

You are the evaluator. The model already answered. Grade whether it handled uncertainty correctly.

What you’re learning
This game teaches a practical evaluation skill: separate a model’s answer from its self-monitoring. A useful AI should answer clear cases, refuse to invent missing rules, and change action when uncertainty matters.
Answerable case · Unknown token · Rule gap · Over-refusal · Word-shape trap · Safe action
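As a rough illustration of grading the answer separately from the self-monitoring, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy rule table `RULES`, the `ModelOutput` type, the `diagnose` function, and the diagnosis labels are hypothetical and are not taken from the game itself.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed rule world: the only rules the model is allowed to use.
RULES = {"red": "stop", "green": "go"}


@dataclass
class ModelOutput:
    # The model's answer, or None when the model abstained.
    answer: Optional[str]


def diagnose(token: str, output: ModelOutput) -> str:
    """Grade how the model handled uncertainty for one case (illustrative labels)."""
    expected = RULES.get(token)
    if expected is None:
        # No rule covers this token, so abstaining is the only correct move.
        return "correct abstention" if output.answer is None else "invented rule"
    if output.answer is None:
        # A rule exists, so refusing to answer is an over-refusal.
        return "over-refusal"
    return "correct answer" if output.answer == expected else "wrong answer"
```

For example, under these assumptions `diagnose("blue", ModelOutput("go"))` is graded as an invented rule, because `blue` has no entry in the rule table, while `diagnose("green", ModelOutput(None))` is graded as an over-refusal, because the case was answerable.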