p-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: LLM, decoding, sampling, truncation, inference, information-theoretic, information theory, hyperparameterless, hyperparameter-free, entropy, entropy-aware, distribution-aware, adaptive, efficient, generation
Abstract:

Obtaining high-quality outputs from Large Language Models (LLMs) often depends upon the choice of a sampling-based decoding strategy to probabilistically choose the next token at each generation step. While a variety of such sampling methods have been proposed, their performance can be sensitive to the selection of hyperparameters which may require different settings depending upon the generation task and temperature configuration. In this work, we introduce p-less sampling: an information-theoretic approach to sampling which dynamically sets a truncation threshold at each decoding step based on the entire token probability distribution. Unlike existing methods, p-less sampling has no hyperparameters and consistently produces high-quality outputs as temperature increases. We provide theoretical perspectives on p-less sampling to ground our proposed method and conduct experiments to empirically validate its effectiveness across a range of math, logical reasoning, and creative writing tasks. Our results demonstrate how p-less sampling consistently outperforms existing sampling approaches while exhibiting much less degradation in text quality at higher temperature values. We further show how p-less achieves greater inference-time efficiency than alternative methods through lower average token sampling times and shorter generation lengths, without sacrificing accuracy. Finally, we provide analyses to highlight the benefits of p-less through qualitative examples, case studies, and diversity assessments.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes p-less sampling, a hyperparameter-free decoding method that dynamically sets truncation thresholds using information-theoretic principles. It resides in the Novel Sampling Algorithms and Hyperparameter-Free Approaches leaf, which contains only one sibling paper (Arithmetic sampling). This represents a relatively sparse research direction within the broader taxonomy of 50 papers across 36 topics, suggesting the hyperparameter-free sampling space remains underexplored compared to more crowded areas like controlled generation or inference acceleration.

The taxonomy reveals substantial activity in neighboring branches. Domain-Specific and Task-Adapted Decoding addresses application-tailored strategies, while Controlled and Constrained Generation focuses on steering outputs toward desired attributes. The Theoretical Frameworks and Mathematical Formalism leaf develops formal analyses of decoding properties, providing potential grounding for methods like p-less sampling. The paper's information-theoretic approach bridges core sampling innovation with theoretical rigor, positioning it at the intersection of algorithmic novelty and mathematical foundations within the field's structure.

Among 25 candidates examined, none clearly refute the three main contributions: the p-less sampling method (10 candidates examined, 0 refutable), theoretical grounding in Rényi entropies (5 candidates examined, 0 refutable), and p-lessnorm variant (10 candidates examined, 0 refutable). The limited search scope means this analysis captures top semantic matches rather than exhaustive prior work. The p-less sampling method appears most distinct, while the theoretical grounding and variant show no overlapping claims within the examined candidate set.

Based on the limited literature search of 25 candidates, the work appears to occupy a relatively novel position within hyperparameter-free sampling research. The sparse population of its taxonomy leaf and absence of refuting candidates suggest distinctiveness, though the restricted search scope prevents definitive claims about comprehensive novelty across the entire field.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 0

Research Landscape Overview

Core task: Sampling-based decoding strategies for large language models. The field encompasses a broad range of techniques for generating text from probabilistic language models, organized into several major branches. Core Sampling Methods and Theoretical Foundations address fundamental algorithms and hyperparameter-free approaches, such as p-less Sampling[0] and Arithmetic sampling[5], which seek to improve upon classical methods. Domain-Specific and Task-Adapted Decoding tailors generation to particular applications like medical predictions[3] or keyphrase extraction[32], while Controlled and Constrained Generation focuses on steering outputs toward desired properties through methods like Controlled decoding from language[1] and watermarking schemes[2][13]. Probabilistic Modeling and Uncertainty Quantification explores how to better capture and represent model confidence, and Inference Acceleration and Efficiency targets faster decoding without sacrificing quality. Quality Assurance and Detection, Training and Optimization Methods, and Specialized Applications round out the taxonomy, reflecting the field's attention to reliability, learning dynamics, and emerging use cases.

Recent work reveals active exploration of novel sampling algorithms that reduce or eliminate manual hyperparameter tuning, contrasting with earlier heuristics that required careful temperature or top-k selection. Within this landscape, p-less Sampling[0] sits alongside Arithmetic sampling[5] in the Novel Sampling Algorithms and Hyperparameter-Free Approaches cluster, both aiming to automate or refine the sampling process. While Arithmetic sampling[5] introduces a mathematically grounded alternative to standard multinomial sampling, p-less Sampling[0] emphasizes a different mechanism for reducing hyperparameter sensitivity. Other neighboring efforts, such as adaptive temperature methods[7] and meta-generation frameworks[4], tackle similar goals but from complementary angles: adjusting decoding on the fly or orchestrating multiple sampling strategies. These contrasting lines of work highlight ongoing questions about the right balance between theoretical elegance, empirical performance, and practical ease of use, with p-less Sampling[0] contributing a fresh perspective on hyperparameter-free decoding within this evolving research frontier.

Claimed Contributions

p-less sampling method

The authors propose p-less sampling, a hyperparameter-free truncation-based sampling strategy for LLM decoding that computes a dynamic threshold using the entire token probability distribution at each step. The method is grounded in information theory and corresponds to the exponential of the negative Rényi entropy of order 2.

10 retrieved papers
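
The claimed mechanism can be sketched in a few lines. Since the Rényi entropy of order 2 is H_2(p) = -log sum_i p_i^2, the stated threshold exp(-H_2(p)) is simply the collision probability sum_i p_i^2. Everything beyond that identity is an assumption for illustration: the function name `p_less_sample`, the rule of keeping tokens whose probability is at or above the threshold, and the argmax fallback are hypothetical, not the paper's verbatim procedure.

```python
import numpy as np

def p_less_sample(probs, rng):
    """One decoding step of p-less-style sampling (illustrative sketch).

    The truncation threshold is the collision probability sum(p_i^2),
    i.e. exp(-H_2(p)) for the order-2 Renyi entropy H_2. Tokens below
    the threshold are discarded and the survivors are renormalized
    before sampling. ASSUMPTION: the keep-rule and fallback below are
    guesses; the paper's exact procedure may differ.
    """
    probs = np.asarray(probs, dtype=float)
    threshold = np.sum(probs ** 2)     # exp(-H_2(p)), no hyperparameters
    keep = probs >= threshold          # assumed truncation rule
    if not keep.any():                 # degenerate fallback: keep the argmax
        keep[np.argmax(probs)] = True
    truncated = np.where(keep, probs, 0.0)
    truncated /= truncated.sum()
    return rng.choice(len(probs), p=truncated)

rng = np.random.default_rng(0)
# Peaked distribution: threshold 0.535, so only token 0 (p=0.7) survives.
token = p_less_sample([0.7, 0.2, 0.05, 0.05], rng)
```

Note the adaptive behavior: a peaked distribution yields a high threshold (aggressive truncation), while a uniform distribution yields a threshold equal to each token's probability, so no token is discarded.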
Theoretical grounding in Rényi entropies

The authors establish a theoretical connection between their p-less threshold and the family of Rényi entropies, showing that the threshold corresponds to the exponential of the negative collision entropy and is negatively correlated with Shannon entropy.

5 retrieved papers
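
The stated identity and correlation are easy to check numerically. In the sketch below, `shannon_entropy` and `renyi_2_entropy` are illustrative helper names (not from the paper); the assertions verify that exp(-H_2(p)) equals the collision probability and that, as a distribution flattens, Shannon entropy rises while the threshold falls, consistent with the claimed negative correlation.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log p_i, in nats."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def renyi_2_entropy(p):
    """Collision (order-2 Renyi) entropy: H_2(p) = -log sum_i p_i^2."""
    p = np.asarray(p, dtype=float)
    return -np.log(np.sum(p ** 2))

# The p-less threshold exp(-H_2(p)) is exactly the collision probability.
p = np.array([0.5, 0.3, 0.15, 0.05])
assert np.isclose(np.exp(-renyi_2_entropy(p)), np.sum(p ** 2))

# As the distribution flattens, Shannon entropy rises and the
# threshold falls, illustrating the claimed negative correlation.
peaked = np.array([0.9, 0.05, 0.03, 0.02])
flat = np.array([0.25, 0.25, 0.25, 0.25])
assert shannon_entropy(peaked) < shannon_entropy(flat)
assert np.sum(peaked ** 2) > np.sum(flat ** 2)
```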
p-lessnorm variant

The authors introduce p-lessnorm, a variant of p-less sampling that relaxes the truncation threshold by incorporating the probability of incorrect random guesses, making it preferable for use cases where diversity is favored over coherence.

10 retrieved papers
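
The report does not give the p-lessnorm formula, so the following is only one plausible reading, and the relaxation rule is HYPOTHETICAL: if the base threshold tau = sum_i p_i^2 is the collision probability, then 1 - tau is the probability that a random guess drawn from p is "incorrect" (disagrees with an independent draw), and scaling tau by this quantity strictly relaxes the threshold. The function name `p_less_thresholds` is likewise invented for illustration.

```python
import numpy as np

def p_less_thresholds(probs):
    """Base p-less threshold and a sketched p-lessnorm relaxation.

    tau = sum(p_i^2) is the collision probability, exp(-H_2(p)).
    1 - tau is the probability that a random guess drawn from p is
    incorrect. HYPOTHETICAL: we assume p-lessnorm multiplies tau by
    this quantity; the paper's actual formula may differ.
    """
    probs = np.asarray(probs, dtype=float)
    tau = np.sum(probs ** 2)
    tau_norm = tau * (1.0 - tau)  # relaxed threshold (assumed form)
    return tau, tau_norm

tau, tau_norm = p_less_thresholds([0.6, 0.2, 0.1, 0.1])
# The relaxed threshold is never larger than the base one, so more
# tokens survive truncation, trading coherence for diversity.
assert tau_norm <= tau
```

Under any reading of this form, the relaxed threshold admits a superset of the tokens admitted by the base threshold, which matches the stated use case of favoring diversity over coherence.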

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: p-less sampling method

Contribution: Theoretical grounding in Rényi entropies

Contribution: p-lessnorm variant