Mirror of https://github.com/anthropics/claude-plugins-official.git (synced 2026-03-20 11:33:08 +00:00)
math-olympiad: forbid web access in deep mode
Deep-mode allows bounded local computation but must NOT use WebFetch or WebSearch. Finding the solution on AoPS is not solving the problem. Adds explicit NO WEB prompt block and orchestrator self-restraint note. Found by Ralph's test run (skill solved 5/6 then started fetching dgrozev.wordpress.com and artofproblemsolving.com for P6).
@@ -195,7 +195,13 @@ The standard workflow is tight-budget: 8 solvers, ~15 min, pure reasoning. When
The archetype: a focused agent that gets the proven-so-far state plus "one case of Lemma 5 is open" — and finds a 3-line argument the case split was obscuring. Often under 10 minutes with almost no computation. Deep mode is about giving the problem sustained attention, not throwing compute at it.
**What deep mode is NOT**: open-ended exploration, literature search, multi-day investigation. That's a different workflow (`math-research`). Deep mode is still "solve THIS problem" — just without the clock.
**What deep mode is NOT**: open-ended exploration, literature search, looking up solutions, multi-day investigation. That's a different workflow (`math-research`). Deep mode is still "solve THIS problem yourself" — just without the clock.
**NO WEB. NO LOOKUP.** Deep mode may use Bash/Python for bounded computation, but NEVER WebFetch, WebSearch, or any network access. Finding the solution on AoPS or a blog is not solving the problem — it's cheating on an olympiad, and it teaches us nothing about the skill's actual capability. Put this at the TOP of the deep-mode prompt:
```
NO WEB ACCESS. Do not use WebFetch, WebSearch, or any tool that touches the internet. Do not look up this problem, its solution, or related problems. You are solving this yourself — the only allowed computation is local (Bash/Python for mod-k arithmetic, small-case enumeration n≤10, symbolic identity checks). If you invoke a web tool, the proof is void.
```
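The "if you invoke a web tool, the proof is void" rule can be checked mechanically after a run. The following is a minimal sketch, assuming the orchestrator can obtain the list of tool names a deep agent invoked; `FORBIDDEN_TOOLS`, `web_violations`, and `proof_is_void` are hypothetical names, not part of the skill, and the check covers only the two tools named above rather than every tool that might touch the network.

```python
# Hypothetical post-run check: the skill itself does not define these names.
# Only the two tools named in the prompt block are covered here.
FORBIDDEN_TOOLS = frozenset({"WebFetch", "WebSearch"})


def web_violations(tool_calls):
    """Return the forbidden tool names found in a list of invoked tool names."""
    return [name for name in tool_calls if name in FORBIDDEN_TOOLS]


def proof_is_void(tool_calls):
    """A deep-mode proof is void if any forbidden tool was invoked."""
    return bool(web_violations(tool_calls))


if __name__ == "__main__":
    calls = ["Bash", "WebFetch", "Bash"]
    print(web_violations(calls), proof_is_void(calls))
```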
**Computation bounds in deep mode** (bug #8 lesson): A6's b_{n+1}=2b_n²+b_n+1 is doubly-exponential; b_99 is on the order of 10^{2^98}, i.e. roughly 10^{29} decimal digits. Never compute such objects exactly: work in ℤ/2^m, or track only v_p(·), or prove the recursion mod the quantity you care about. If a computation is running longer than 60 seconds, it's probably unbounded. Kill it and work symbolically.
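As an illustration of working in ℤ/2^m instead of computing b_n exactly, here is a minimal sketch. The starting value `b0 = 0` is an assumption (the recursion's initial term is not stated in this section), and `b_mod` and `v2_capped` are hypothetical helper names. Every intermediate value stays below the modulus, so the loop is cheap even for n = 99.

```python
def b_mod(n, modulus, b0=0):
    """Return b_n mod `modulus` for b_{k+1} = 2*b_k**2 + b_k + 1.

    b0 = 0 is an assumed starting value; pass the real one if it differs.
    """
    b = b0 % modulus
    for _ in range(n):
        # Reduce at every step so the numbers never grow past the modulus.
        b = (2 * b * b + b + 1) % modulus
    return b


def v2_capped(x, m):
    """2-adic valuation of x as seen mod 2**m (returns m if x == 0 mod 2**m)."""
    x %= 1 << m
    if x == 0:
        return m
    # x & -x isolates the lowest set bit; its bit position is the valuation.
    return (x & -x).bit_length() - 1


if __name__ == "__main__":
    # b_99 mod 2**64: instant, whereas the exact b_99 cannot be stored.
    print(b_mod(99, 1 << 64))
```

Reducing modulo a larger power of two and then a smaller one gives the same residue as reducing modulo the smaller one directly, which is a cheap consistency check on any such computation.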
@@ -203,10 +209,12 @@ The archetype: a focused agent that gets the proven-so-far state plus "one case
- The problem statement
- The best partial proof from tight-budget solvers
- The verifier gap descriptions (what specifically didn't close)
- The instruction: "Bounded computation allowed (mod 2^k, small cases n≤10, symbolic identity checks). 60-second computation limit. If n≤10 brute force reveals a pattern the tight-budget solvers missed, that pattern IS the proof structure."
- The instruction: "NO WEB ACCESS — do not look up this problem or its solution. Bounded local computation allowed (mod 2^k, small cases n≤10, symbolic identity checks via Bash/Python only). 60-second computation limit. If n≤10 brute force reveals a pattern the tight-budget solvers missed, that pattern IS the proof structure."
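The 60-second computation limit in the instruction above can be enforced rather than merely requested. A minimal sketch, assuming the computation is a standalone Python snippet run in a child process; `run_bounded` is a hypothetical helper name, not something the skill defines.

```python
import subprocess
import sys


def run_bounded(code, limit_s=60.0):
    """Run a Python snippet in a child process with a wall-clock limit.

    Returns the snippet's stdout, or None if it exceeded `limit_s` seconds
    (subprocess.run kills the child before raising TimeoutExpired).
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=limit_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


if __name__ == "__main__":
    # Small-case enumeration finishes well inside the limit.
    print(run_bounded("print([n * n % 8 for n in range(1, 11)])"))
```

A `None` result is the signal to stop and work symbolically instead of raising the limit.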
The deep agent may find the construction the pure-reasoning solvers couldn't see. If it also abstains, THEN write the abstention. Do not skip this step — problems with √n or log n answers are often invisible to pure reasoning because the optimal structure is the asymmetric one.
**Orchestrator self-restraint**: The orchestrator itself must not web-search the problem "to help" the deep agent. If you're tempted to Fetch an AoPS thread "just to check the answer," don't — that contaminates the skill's output and misrepresents its capability.
### 7. Calibrated abstention
If 3 revise cycles all fail: **stop and admit it.**