Add math-olympiad skill — adversarial verification for competition math

This skill addresses the Proof-or-Bluff gap: a self-verified 85.7% on IMO
problems becomes <5% under human grading. It uses fresh-context verifiers
armed with specific failure patterns (not a generic 'check logic').

Validated: 17/18 IMO+Putnam 2025 solved, 0 false positives, 2 novel proofs.
See eval data in anthropic monorepo sandbox/sandbox/ralph/math_skills/.
Ralph Furman
2026-03-19 23:17:36 +00:00
parent 7994c270e5
commit 3c9bf4ff5d
13 changed files with 1033 additions and 0 deletions

@@ -0,0 +1,8 @@
{
  "name": "math-olympiad",
  "description": "Solve competition math (IMO, Putnam, USAMO) with adversarial verification that catches what self-verification misses. Fresh-context verifiers attack proofs with specific failure patterns. Calibrated abstention over bluffing.",
  "author": {
    "name": "Anthropic",
    "email": "support@anthropic.com"
  }
}