Runyard now has a Compare page. Pick Device A and Device B from a catalogue of 160+ GPUs — Apple Silicon, NVIDIA RTX 50/40/30, AMD RX 7000/9000, Intel Arc, cloud rigs — and see a live score for every LLM across quality, speed, fit, and context. Toggle TurboQuant on Device B with a single switch and watch the scores update in real time. No spreadsheet. No guesswork.
The Compare page scores every model in the Runyard catalogue (100+ LLMs) simultaneously across the two devices you configure. For each model, each device gets a composite score (0–100) built from four dimensions: model quality, inference speed, memory fit, and context window headroom. The winning device in each model row is highlighted. Sort by score, speed, model size, or name. Filter by use case. Click any row to see a detailed side-by-side breakdown with dimension bars, donut score, max context, quantization, and cloud run options.
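To make the composite concrete, here is a minimal sketch of how four 0–100 dimension scores could roll up into one number per device and produce a row winner. The equal weights, the `DimensionScores` type, and the `composite` and `row_winner` functions are all assumptions for illustration; the post does not say how Runyard actually weights the dimensions.

```python
from dataclasses import dataclass

# Assumption: equal weights. The real Runyard weighting is not published here.
WEIGHTS = {"quality": 0.25, "speed": 0.25, "fit": 0.25, "context": 0.25}


@dataclass(frozen=True)
class DimensionScores:
    """Scores for one model on one device, each on a 0-100 scale."""
    quality: float
    speed: float
    fit: float
    context: float


def composite(scores: DimensionScores) -> float:
    """Weighted average of the four dimensions, clamped to 0-100."""
    total = sum(weight * getattr(scores, name) for name, weight in WEIGHTS.items())
    return max(0.0, min(100.0, total))


def row_winner(a: DimensionScores, b: DimensionScores) -> str:
    """Return "A", "B", or "tie" for a single model row."""
    score_a, score_b = composite(a), composite(b)
    if score_a == score_b:
        return "tie"
    return "A" if score_a > score_b else "B"
```

Under these assumed weights, two devices that match on quality, speed, and fit are separated only by their context scores, so a gain in that one dimension is enough to turn a tie into a win.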
Device B has a TurboQuant toggle. Flip it on and every score recalculates immediately. TurboQuant (Zandieh et al., ICLR 2026) applies 4× KV cache compression, so Device B can hold up to 4× more context in the same VRAM. That context boost feeds directly into the composite score: more context headroom raises the Context dimension score, which in turn raises the overall composite. Models that were ties become Device B wins. Models that were marginal become viable.
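The arithmetic behind that claim can be sketched with the standard KV cache sizing formula for a transformer: each token stores one key and one value vector per layer. The model dimensions below (32 layers, 8 KV heads, head size 128, fp16) and the 4 GiB KV budget are hypothetical, and the function names are illustrative; only the 4× compression factor comes from the post.

```python
def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Uncompressed KV cache bytes needed per token.

    The factor 2 accounts for one key and one value vector per layer.
    bytes_per_value=2 corresponds to fp16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def max_context_tokens(kv_budget_bytes: int, bytes_per_token: int,
                       compression: int = 1) -> int:
    """Tokens that fit in the KV budget when the cache is compressed by `compression`."""
    return (kv_budget_bytes * compression) // bytes_per_token


# Hypothetical model shape and VRAM left over for the KV cache after weights.
per_token = kv_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
budget = 4 * 1024**3  # 4 GiB

baseline = max_context_tokens(budget, per_token)                  # 32768 tokens
with_tq = max_context_tokens(budget, per_token, compression=4)    # 131072 tokens
```

With these assumed numbers the same 4 GiB budget goes from 32,768 tokens to 131,072 tokens, which is the 4× context gain the toggle models. Real headroom also depends on how much VRAM the weights and activations consume.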
When TurboQuant is active, Device B's context score recalculates using the TQ-expanded window (up to 4× the hardware max, capped at the model's spec limit). The composite score reflects that gain. The green glow on Device B's column and the ✦ marker on winning rows make it immediately obvious which models flip from "tie" to "B wins" once TQ is on.
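The capping rule described above can be written in a few lines. This is a sketch under stated assumptions: `effective_context` follows the rule in the text (4× the hardware max, capped at the model's spec limit), while `context_score` is a hypothetical linear mapping from context to a 0–100 score, since the post does not give Runyard's actual formula.

```python
TQ_FACTOR = 4  # KV cache compression factor quoted in the post


def effective_context(hardware_max: int, model_limit: int, turboquant: bool) -> int:
    """Usable context in tokens: the hardware max, expanded 4x under TQ,
    but never beyond what the model itself supports."""
    expanded = hardware_max * TQ_FACTOR if turboquant else hardware_max
    return min(expanded, model_limit)


def context_score(hardware_max: int, model_limit: int, turboquant: bool = False) -> float:
    """Hypothetical Context dimension score: share of the model's window
    the device can actually serve, scaled to 0-100."""
    return 100.0 * effective_context(hardware_max, model_limit, turboquant) / model_limit


# A device that fits 16K tokens of a 128K-context model:
off = context_score(16_384, 131_072)                    # 12.5
on = context_score(16_384, 131_072, turboquant=True)    # 50.0

# A device that already fits 40K tokens hits the model's limit under TQ:
capped = effective_context(40_000, 131_072, turboquant=True)  # 131072
```

The cap matters for the comparison: a device that is already close to the model's spec limit gains little from TQ, while a memory-constrained device gains the full 4×.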
Device A shows a red "No TurboQuant" callout when TQ is active on Device B — so you can see exactly how much context headroom Device A is leaving on the table.
Compare your GPU against any other device — with TurboQuant on and off.
Open Runyard Compare →