Feed the accepted token back into the chain so that stateful samplers (repetition penalties and similar) can update their history.
Adds stochastic (dist) sampling with an optional seed.
Adds DRY (Don't Repeat Yourself) sampling. seqBreakers lists token strings that reset the repetition check (e.g. ["\n"]). Pass seqBreakers = [] to use no breakers.
Adds grammar-constrained sampling. grammarStr is a GBNF grammar; grammarRoot is the root rule name (usually "root").
Adds greedy (argmax) sampling.
Adds per-token logit bias adjustments. Each entry in biases is a {token, bias} pair; bias > 0 increases probability, bias < 0 decreases it.
Adds min-P sampling.
Adds Mirostat v2 sampling (adaptive entropy targeting).
Adds repetition / frequency / presence penalties. penaltyLastN = -1 uses the full context; penaltyLastN = 0 disables the penalty.
Sample the next token. batchIdx = -1 uses the last output position.
Sample the next token from a LlamaContext.
Adds temperature scaling.
Adds dynamic temperature scaling: the temperature varies around the base value by up to delta according to the entropy of the distribution, and exponent shapes that mapping.
Adds top-K filtering.
Adds top-N-sigma sampling (keeps tokens whose logits lie within n standard deviations of the top logit).
Adds top-P (nucleus) sampling.
Adds typical-P sampling.
Adds XTC (exclude top choices) sampling.
Create a new sampler chain.
A sampler chain that you configure and then use to pick the next token.
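
To make the chain idea concrete, here is a minimal Python sketch of how filters such as top-K, top-P, min-P and temperature can be applied in order before a final greedy or seeded pick. It is not this library's implementation or API: every function name (softmax, top_k, top_p, min_p, temperature, greedy, dist, run_chain) is hypothetical, the logits are made-up values, and the filters follow only their commonly used definitions.

```python
import math
import random

def softmax(logits):
    """Convert a {token: logit} map into {token: probability}."""
    m = max(logits.values())
    exps = {t: math.exp(l - m) for t, l in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def temperature(logits, temp):
    """Divide every logit by temp (temp > 0); higher values flatten the distribution."""
    return {t: l / temp for t, l in logits.items()}

def top_k(logits, k):
    """Keep only the k tokens with the highest logits."""
    kept = sorted(logits, key=logits.get, reverse=True)[:k]
    return {t: logits[t] for t in kept}

def top_p(logits, p):
    """Keep the smallest high-probability set whose cumulative mass reaches p."""
    probs = softmax(logits)
    kept, cum = {}, 0.0
    for t in sorted(probs, key=probs.get, reverse=True):
        kept[t] = logits[t]
        cum += probs[t]
        if cum >= p:
            break
    return kept

def min_p(logits, p):
    """Keep tokens whose probability is at least p times the top probability."""
    probs = softmax(logits)
    threshold = p * max(probs.values())
    return {t: l for t, l in logits.items() if probs[t] >= threshold}

def greedy(logits):
    """Pick the token with the highest logit (argmax)."""
    return max(logits, key=logits.get)

def dist(logits, seed=None):
    """Draw one token from the softmax distribution.

    Assumption for this sketch: a fresh RNG is built per call, so a fixed
    seed always yields the same draw. A real chain would keep RNG state.
    """
    probs = softmax(logits)
    rng = random.Random(seed)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

def run_chain(logits, filters, pick):
    """Apply each filter in order, then let the final sampler pick a token."""
    for f in filters:
        logits = f(logits)
    return pick(logits)

# Made-up logits for four token ids.
example_logits = {0: 2.0, 1: 1.0, 2: 0.0, 3: -3.0}

chosen = run_chain(
    example_logits,
    [lambda l: temperature(l, 0.8), lambda l: top_k(l, 3), lambda l: top_p(l, 0.9)],
    lambda l: dist(l, seed=42),
)
```

Order matters in a sketch like this: each filter sees only the tokens that earlier filters kept, so placing top-K before top-P narrows the candidate set before the cumulative probability is computed.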