
DISC

Dynamic Decomposition Improves LLM Inference Scaling

DISC adaptively partitions reasoning traces during inference so language models spend more compute on the most difficult steps. By subdividing challenging segments and prioritising their sampling, DISC delivers better solutions with fewer tokens across coding and math benchmarks.

+33% accuracy gain vs. DeepSeek-R1 at an equal token budget
-6.7% pass@10 error on MATH500
-10.5% pass@10 error on LiveCodeBench

Adaptive decomposition

Automatically adjusts step sizes at inference time, zooming in on pivotal reasoning steps without handcrafted heuristics.

Plug-and-play

Integrates with greedy, beam, or Monte Carlo tree search, controlling which prefixes receive more sampling budget.

Compute efficient

Allocates tokens to hard steps, improving sample and token efficiency across both proprietary and open-source LLMs.

Jonathan Light1,2,4*, Wei Cheng2, Benjamin Riviere4, Yue Wu3, Masafumi Oyamada5

Mengdi Wang3, Yisong Yue4, Santiago Paternain1, Haifeng Chen2

* Corresponding author

1 Rensselaer Polytechnic Institute    2 NEC Laboratories America    3 Princeton University   
4 California Institute of Technology    5 NEC Corporation

DISC teaser diagram showing dynamic decomposition workflow

How Dynamic Decomposition Works

DISC is a recursive inference procedure. It repeatedly proposes candidate prefixes, compares their rewards, and dynamically decides whether to advance or contract the step size. The result is a search process that focuses on the most uncertain parts of the trajectory while skipping past the easy ones.
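The loop below is a minimal Python sketch of that procedure under our own assumptions: sample(prefix, n) draws n continuations from the base LLM, reward(text) is an outcome reward model returning a scalar, and the halve-on-failure schedule stands in for the paper's actual contraction rule. Names and defaults are illustrative, not the authors' API.

def disc(prompt, sample, reward, budget=64, step_frac=0.5, shrink=0.5):
    """Recursively advance or contract the step size while budget remains."""
    prefix, best, best_r = prompt, None, float("-inf")
    frac = step_frac                       # fraction of the best rollout to commit
    while budget > 0:
        rollouts = sample(prefix, n=4)     # candidate continuations of the prefix
        budget -= len(rollouts)
        scored = [(reward(prefix + r), r) for r in rollouts]
        r_new, text = max(scored)
        if r_new > best_r:                 # progress: advance by a large chunk
            best, best_r = prefix + text, r_new
            prefix += text[: max(1, int(len(text) * frac))]
            frac = step_frac               # reset granularity after progress
        else:                              # no progress: contract to a finer step
            frac *= shrink
            if frac * len(text) < 1:       # step size bottomed out; stop here
                break
    return best, best_r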

Example Decomposition

Problem

A rectangle has a perimeter of 24 inches. What is the maximum possible area of the rectangle?

Dynamic decomposition

The algorithm expands, contracts, and replays prefixes until it stabilizes on a high-reward answer.

Interactive demo: sampling an initial prefix, then streaming the solution step by step.
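(For reference, the target answer: the perimeter constraint 2(l + w) = 24 gives l + w = 12, so the area l(12 - l) is maximized at l = 6, a 6 x 6 square with area 36 square inches.)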

Identify pivotal prefixes

DISC samples continuations from the current prefix and scores them with an outcome reward model to locate promising solution regions.

Adapt step granularity

Difficult prefixes trigger finer-grained decomposition, while easier prefixes advance in larger chunks to conserve budget.

Allocate compute where it matters

Sampling concentrates on high-impact tokens, and additional rollouts are triggered only when a candidate's z-score against the rewards already observed at that prefix clears a threshold (sketched below).
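As a concrete illustration of that gate, here is one way to compute such a z-score in Python; the formula and threshold are our assumptions, not the paper's exact criterion.

import statistics

def z_score(candidate_reward, prefix_rewards):
    """Standardized improvement of a new candidate over the rewards
    already observed at this prefix (illustrative gating signal)."""
    mu = statistics.mean(prefix_rewards)
    sigma = statistics.pstdev(prefix_rewards) or 1e-8   # guard against zero spread
    return (candidate_reward - mu) / sigma

# Spend extra rollouts only where a candidate clearly beats the field, e.g.:
# if z_score(r_new, seen_rewards) > 1.0: keep sampling this prefix.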

Plug into search

The same decomposition policy controls node expansion for greedy, beam, or MCTS search, making DISC a drop-in upgrade.
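A hypothetical wiring into beam search, reusing the loop above: disc_step(prefix, width) is assumed to return scored (reward, child_prefix) pairs produced by the advance/contract rule. This is a sketch of the integration pattern, not the paper's implementation.

def beam_search_with_disc(prompt, disc_step, beam_width=4, depth=8):
    """Beam search whose node expansion is delegated to the DISC policy."""
    beam = [(0.0, prompt)]
    for _ in range(depth):
        children = []
        for _, prefix in beam:
            children.extend(disc_step(prefix, width=beam_width))
        if not children:
            break
        beam = sorted(children, reverse=True)[:beam_width]  # keep top prefixes
    return beam[0]  # (reward, best_prefix)

Greedy search is the beam_width=1 special case; MCTS would call the same disc_step inside its expansion phase.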

Simplified overview of DISC adaptive decomposition
Square format highlight emphasising the contract-versus-advance control loop.
Detailed DISC rollout schedule with reward tracking
Landscape layout digs into prefix scoring, branching probabilities, and z-score tracking through the search.

Adaptive recursion

Advance or contract step sizes on the fly until the reward landscape stops improving.

Minimal assumptions

Requires only an outcome reward model — unit tests, verifiers, or self-critique signals all qualify.
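To make the interface concrete, here is a minimal sketch of what "only an outcome reward model" means in code. The unit-test scorer is our illustrative example (run untrusted code in a real sandbox in practice); any verifier or self-critique score fits the same slot.

import subprocess
from typing import Protocol

class OutcomeReward(Protocol):
    """Any callable mapping a candidate solution to a scalar reward qualifies."""
    def __call__(self, solution: str) -> float: ...

def unit_test_reward(solution: str, tests: list[tuple[str, str]]) -> float:
    """Fraction of (stdin, expected stdout) pairs a candidate program passes."""
    passed = 0
    for stdin, expected in tests:
        try:
            out = subprocess.run(
                ["python", "-c", solution], input=stdin,
                capture_output=True, text=True, timeout=5,
            ).stdout
            passed += out.strip() == expected.strip()
        except subprocess.TimeoutExpired:
            pass
    return passed / max(1, len(tests))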

Provable convergence

Under mild support assumptions on the base policy, DISC monotonically improves the best solution prefix.
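A schematic way to state that property (our paraphrase; see the paper for the precise theorem and assumptions): writing r*_t for the best reward after t rollouts,

% schematic restatement, not the paper's exact theorem
r^*_t = \max_{s \le t} R(y_s), \qquad r^*_{t+1} \ge r^*_t, \qquad
r^*_t \xrightarrow[t \to \infty]{\text{a.s.}} \sup_{y \in \operatorname{supp}(\pi)} R(y).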

Key Findings

Finding 1: Dynamic decomposition cuts inference error

Across APPS, MATH500, and LiveCodeBench, DISC lowers pass@10 error by 5.0%, 6.7%, and 10.5% relative to the best static baseline. The gains compound on the hardest competition problems where sampling budget is scarce.

Finding 2: Compute concentrates on pivotal tokens

DISC repeatedly splits critical prefixes such as connective words and control tokens. These pivots steer the downstream reasoning, so the algorithm allocates more sampling budget there instead of overspending on settled regions.

Finding 3: Works across model families

From lightweight LLaMA and Mistral checkpoints to proprietary reasoning models such as R1, DISC consistently boosts accuracy. For LLaMA the pass@10 rate jumps from 0.01 to 0.04, a fourfold improvement.

Finding 4: Plug-and-play with search algorithms

The same decomposition operator drives greedy search, beam search, or MCTS. DISC expands nodes until rewards plateau, then contracts, providing precise control over inference-time compute without new hyperparameters.

Finding 5: Token efficiency improves with adaptive budgets

Under the same token budget, DISC achieves higher accuracy than sentence-level or token-level decomposition. When the sample budget is fixed, DISC returns better solutions with fewer tokens.

Finding 6: Minimal assumptions enable fast deployment

DISC only needs scalar rewards to drive decomposition. Unit tests, verifiers, or self-critique models can play that role, avoiding handcrafted heuristics or process reward models.

Benchmark Results

DISC reallocates inference budget toward the most uncertain prefixes, consistently outperforming static token, sentence, and single-step decomposition strategies across coding and mathematics benchmarks.

APPS (competition)

pass@10 error ↓

Down from 0.50 to 0.475 with a fixed 1.5k token budget.

5.0% relative reduction versus the strongest static decomposition.

MATH500

pass@10 error ↓

Drops from 0.15 to 0.14 using a verifier-trained outcome reward model.

6.7% relative reduction with fewer sampled continuations.

LiveCodeBench

pass@10 error ↓

Falls from 0.57 to 0.51 despite strict sandboxed unit tests.

10.5% relative reduction on freshly collected problems.

DeepSeek-R1

pass@10 accuracy ↑

Accuracy improves by over 85% with only ten sampled reasoning traces.

Maintains a 33% relative improvement when matched to the base model token budget.

LLaMA-3.1-8B

pass@10 accuracy ↑

pass@10 rate jumps from 0.01 to 0.04 on APPS coding problems.

4x gain (300% relative increase) within the same low sampling budget.

Qwen-2.5-7B

pass@10 accuracy ↑

Scores climb from 0.095 to 0.17 across budget-constrained runs.

79% relative increase while preserving resource efficiency.

Citation

If you find DISC helpful in your research, please cite the paper below:

@inproceedings{light2025disc,
  title        = {{DISC}: Dynamic decomposition improves {LLM} inference scaling},
  author       = {Light, Jonathan and Cheng, Wei and Riviere, Benjamin and Wu, Yue and Oyamada, Masafumi and Wang, Mengdi and Yue, Yisong and Paternain, Santiago and Chen, Haifeng},
  booktitle    = {Advances in Neural Information Processing Systems (NeurIPS 2025)},
  year         = {2025},
  eprint       = {2502.16706},
  archivePrefix= {arXiv},
  primaryClass = {cs.LG},
  url          = {https://arxiv.org/abs/2502.16706}
}