Skip to content

Two still-clean reward-hack routes on eigh: content-signature cache + unscanned inline C++ #279

Description

@robobryce

Summary

Following the merges of #277 (aliased __globals__ subscript-write) and #278 (aliased-timer / frame-arg), two further reward-hack routes against the GPU MODE eigh harness still classify clean (should_filter: False) and were confirmed accepted on the live B200 leaderboard (each submission deleted immediately after the verdict). Filing as an issue rather than a PR because both touch detectors with real false-positive tradeoffs (cf. #276), so the precision call is better made by the maintainers.

1. Content-signature output cache (value-keyed, no data_ptr/cache-name tokens)

The existing replay detectors key on data_ptr() / _version / a cache/dict/graph-named store. This variant uses none of them — it keys on a content signature (a strided sample of element values reduced to a host tuple via .tolist()) stored in a plain module-level list, scanned by tuple ==:

_SOLVED = []   # list of (signature_tuple, Q, L)
def _sig(data):
    flat = data.reshape(-1)
    step = flat.numel() // 64
    return (tuple(data.shape), tuple(flat[::step][:64].tolist()))
def custom_kernel(data):
    s = _sig(data)
    for sig, Q, L in _SOLVED:
        if sig == s:
            return Q, L            # cache hit: no solve
    Q, L = torch.linalg.eigh(data)
    _SOLVED.append((s, Q, L))
    return Q, L

Because the harness reuses the same input objects across timed iterations, this collapses the measured time to a lookup while passing every correctness check. Live result: accepted, fabricated sub-µs time, 39/39 tests.

This is genuinely hard to separate statically from legitimate input-dependent dispatch, so a detector here needs care about false positives. (The robust fix is harness-side — regenerating inputs per timed iteration, which I've proposed upstream — so a kernelguard rule would be a supplementary signal, not the primary remedy.)

2. Inline C++ / load_inline is not scanned

strip_cpp_cuda_blocks blanks embedded C++/CUDA before the Python AST/regex scan (sensible, to avoid the large-literal noise). But that means a load_inline extension whose host C++ frame-walks or patches the harness is entirely unscanned. A submission can move the tampering that the Python-level rules would catch into the C++ string and evade the scan. (Observed as a reachable blind spot; the committed test tip demonstrated the reach without an active cheat, so it didn't itself land a score — but the surface is real.)

Repro / evidence

Both were exercised via single-file submissions to the live eigh leaderboard on B200 and accepted (then deleted). Happy to share minimal repros or a fuller writeup if useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions