| Agent | Model | Reward | Tools | Classification |
|---|---|---|---|---|
| claude-code | claude-sonnet-4-6 | ✓ resolved | 0 | GOOD_SUCCESS |
| claude-code | claude-sonnet-4-6 | ✗ failed | 0 | HARNESS_ERROR |
| claude-code | claude-sonnet-4-6 | ✗ failed | 9 | BAD_FAILURE |
| claude-code | claude-sonnet-4-6 | ✗ failed | 0 | HARNESS_ERROR |
| claude-code | claude-sonnet-4-6 | ✓ resolved | 0 | BAD_SUCCESS |
| claude-code | claude-sonnet-4-6 | ✗ failed | 6 | GOOD_FAILURE |
| claude-code | claude-sonnet-4-6 | ✗ failed | 0 | HARNESS_ERROR |
| claude-code | claude-sonnet-4-6 | ✗ failed | 0 | HARNESS_ERROR |
| claude-code | claude-sonnet-4-6 | ✗ failed | 0 | HARNESS_ERROR |
| claude-code | claude-sonnet-4-6 | ✗ failed | 9 | BAD_FAILURE |

The task as the agent saw it and the verifier graded it. Files under `solution/` are sealed.