Notes

Blog

Experiments, model evaluations, and things worth writing down from building at Input Systems.

June 14, 2026 · Harness internals

Two ways to make an LLM edit code: aider vs pi

The model isn't where edits get reliable; the harness is. I read both repos to find where each one absorbs a model's mistakes. aider invests robustness before the model, in a forgiving parser; pi invests it after, coercing the arguments and normalizing the match. The post leads with pi's edit path, which the published scaffold taxonomies leave out, and corrects the claim that pi does exact-only matching.

Read →

June 13, 2026 · Debugging

An unbounded recursion in torchsde: why Stable Audio Open's default scheduler crashes off CUDA

Stable Audio Open's diffusers pipeline dies with a RecursionError deep inside torchsde on Apple Silicon, AMD ROCm, and plain CPU. It isn't a CUDA dependency; it's a scheduler that builds its Brownian noise tree over the config bounds, then queries a float32 zero just outside them. A walk through the mechanism, the float32 detail that makes the recursion unbounded, and the one-line call-site fix.

Read →

June 13, 2026 · Model evaluation

Do open-weight coding models live up to their benchmarks?

Seven open-weight models, fresh and uncontaminated coding tasks, one level playing field. Leaderboard rank barely transferred, and the benchmark headline turned out to be the least useful number on the page. The real surprise was that the harness around the model mattered more than the model itself.

Read →

More to come.