---
title: "ICLR2025 Snell: Optimality of Scaling LLM Test-Time Compute"
source: https://www.jemoka.com/posts/kbhiclr2025_snell_scaling_llm_test_time_compute/
---

## Compute-Optimal Scaling

Compute-Optimal Scaling is the notion of selecting the optimal test-time configuration (beam width, search budget, etc.) dynamically, per bin of questions (e.g. questions grouped by difficulty).

## Approaches to “Scaling Test-Time Compute”

Three primary approaches:

- best-of-n: roll out a bunch of samples, reject all but the best
- beam search: check candidates against intermediate scores as generation proceeds
- lookahead search: MCTS-ish (do lookahead rollouts)

### Key insight

- on easy questions, beam search shows over-optimization and best-of-n is good
- on medium/hard questions, beam search is better
- lookahead seems bad?

### Method

Learning a Value Function [Wang 2312.08935]

## Sequential vs. Parallel Sampling in Scaling Test Time Compute

What’s the trade-off between sampling a bunch in parallel vs. thinking sequentially?

### Key insight

- on easy questions, sequential is best
- on harder questions, there’s a good ratio that balances sequential and parallel sampling

## Test Time vs. Pretraining compute

### Key insight

- on easy questions, test time scaling is good
- on medium/hard questions, pretraining scaling is better

## Questions

- why do you think lookahead search works worse/not better than the others?
- are there some problem-specific heuristics that trigger this?
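
As an illustration of the best-of-n approach and of the compute-optimal selection idea in these notes, here is a minimal Python sketch. It is not the paper's implementation: `toy_sample`, `toy_score`, and the `toy_accuracy` table are hypothetical stand-ins (the accuracy numbers are placeholders, not results from the paper), and the selection rule simply assumes that accuracy per (difficulty bin, strategy) has already been measured at a fixed compute budget.

```python
import random
from typing import Callable, Dict, Tuple


def best_of_n(
    sample: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    rng: random.Random,
) -> str:
    """Draw n independent candidates and keep the highest-scoring one."""
    candidates = [sample(rng) for _ in range(n)]
    return max(candidates, key=score)


def compute_optimal_config(
    accuracy: Dict[Tuple[str, str], float],
) -> Dict[str, str]:
    """For each difficulty bin, pick the strategy with the highest accuracy.

    `accuracy` maps (difficulty_bin, strategy) -> accuracy measured at a
    fixed test-time compute budget (assumed to be available up front).
    """
    best: Dict[str, str] = {}
    for (bin_name, strategy), acc in accuracy.items():
        if bin_name not in best or acc > accuracy[(bin_name, best[bin_name])]:
            best[bin_name] = strategy
    return best


# Hypothetical stand-in for a language model: "answers" are random integers.
def toy_sample(rng: random.Random) -> str:
    return str(rng.randint(0, 99))


# Hypothetical stand-in for a verifier: prefers answers close to 42.
def toy_score(answer: str) -> float:
    return -abs(int(answer) - 42)


# Placeholder accuracies for illustration only (NOT results from the paper).
toy_accuracy = {
    ("easy", "best_of_n"): 0.9,
    ("easy", "beam_search"): 0.8,
    ("hard", "best_of_n"): 0.2,
    ("hard", "beam_search"): 0.3,
}
```

Calling `best_of_n(toy_sample, toy_score, 16, random.Random(0))` returns the sampled answer that the toy verifier ranks highest, and `compute_optimal_config(toy_accuracy)` returns one strategy name per difficulty bin. In a real setup the sampler would be the base LLM and the scorer a learned verifier or value function.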