Bases: LogitsProcessor
Source code in vllm/v1/sample/logits_processor/builtin.py
 is_argmax_invariant() -> bool
Logit bias can rebalance token probabilities and change the outcome of argmax in greedy sampling.
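To illustrate why a logit bias is not argmax-invariant, the sketch below adds a per-token bias to a toy logits list in plain Python. The helper names (`argmax`, `apply_logit_bias`) and all values are hypothetical and are not part of vLLM.

```python
def argmax(values: list[float]) -> int:
    """Return the index of the largest value (the first index wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def apply_logit_bias(logits: list[float], bias: dict[int, float]) -> list[float]:
    """Return a copy of `logits` with `bias[token_id]` added to each listed token."""
    biased = list(logits)
    for token_id, delta in bias.items():
        biased[token_id] += delta
    return biased


logits = [2.0, 1.5, 0.1]                      # toy logits for a 3-token vocabulary
biased = apply_logit_bias(logits, {1: 1.0})   # hypothetical bias favouring token 1

greedy_before = argmax(logits)   # token 0 has the largest raw logit
greedy_after = argmax(biased)    # token 1 wins once the bias is applied
```

Because the greedy choice moves from token 0 to token 1, the processor cannot be skipped when sampling greedily.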
 
 update_state(batch_update: BatchUpdate | None)
  Bases: LogitsProcessor
 min_p_cpu_tensor  instance-attribute
 min_p_cpu_tensor = zeros(
    (max_num_reqs,),
    dtype=float32,
    device="cpu",
    pin_memory=is_pin_memory,
)
 __init__(
    vllm_config: VllmConfig,
    device: device,
    is_pin_memory: bool,
)
 update_state(batch_update: BatchUpdate | None)
  Bases: LogitsProcessor
 __init__(
    vllm_config: VllmConfig,
    device: device,
    is_pin_memory: bool,
)
 add_request  staticmethod
 add_request(
    params: SamplingParams,
    _: list[int] | None,
    output_tok_ids: list[int],
) -> tuple[int, Sequence[int], set[int]] | None
 is_argmax_invariant() -> bool
By censoring stop tokens, min-tokens can change the outcome of the argmax operation in greedy sampling.
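A minimal sketch of this effect, assuming that "censoring" means setting stop-token logits to negative infinity until the request has produced its minimum number of tokens. The function `mask_stop_tokens` and its parameters are hypothetical names chosen for illustration, not vLLM's implementation.

```python
import math


def mask_stop_tokens(
    logits: list[float],
    stop_token_ids: set[int],
    num_output_tokens: int,
    min_tokens: int,
) -> list[float]:
    """Return a copy of `logits` with stop tokens set to -inf while the
    request has generated fewer than `min_tokens` tokens."""
    if num_output_tokens >= min_tokens:
        return list(logits)  # minimum reached: leave logits untouched
    return [
        -math.inf if token_id in stop_token_ids else value
        for token_id, value in enumerate(logits)
    ]


def argmax(values: list[float]) -> int:
    """Return the index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


logits = [0.2, 3.0, 1.0]   # toy logits; token 1 plays the role of a stop token
stop_ids = {1}

early = mask_stop_tokens(logits, stop_ids, num_output_tokens=0, min_tokens=2)
late = mask_stop_tokens(logits, stop_ids, num_output_tokens=2, min_tokens=2)

greedy_early = argmax(early)   # stop token is masked, so token 2 is chosen
greedy_late = argmax(late)     # minimum reached, so the stop token wins
```

The greedy choice differs between the two calls, which is why this processor reports that it is not argmax-invariant.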
 
 update_state(batch_update: BatchUpdate | None)
 process_dict_updates(
    req_entries: dict[int, T],
    batch_update: BatchUpdate | None,
    new_state: Callable[
        [SamplingParams, list[int] | None, list[int]],
        T | None,
    ],
) -> bool
Utility function to update dict state for sparse LogitsProcessors.
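The following is a simplified sketch of how such a utility might behave, not the vLLM source. It assumes "sparse" means that only batch rows whose request uses the feature have an entry in the dict. The `ToyBatchUpdate` dataclass, its `added`, `removed`, and `moved` fields, and the reduced one-argument `new_state` callback are all assumptions made for illustration. The real `BatchUpdate` and `SamplingParams` types are not described in this section.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


@dataclass
class ToyBatchUpdate:
    """Hypothetical stand-in for BatchUpdate; field names are assumptions."""
    added: list[tuple[int, dict]] = field(default_factory=list)  # (row, params)
    removed: list[int] = field(default_factory=list)             # rows freed
    moved: list[tuple[int, int]] = field(default_factory=list)   # (from, to)


def toy_process_dict_updates(
    req_entries: dict[int, T],
    batch_update: Optional[ToyBatchUpdate],
    new_state: Callable[[dict], Optional[T]],
) -> bool:
    """Apply a batch update to per-row state; return True if the dict changed."""
    if batch_update is None:
        return False
    updated = False
    for index, params in batch_update.added:
        state = new_state(params)
        if state is not None:
            req_entries[index] = state
            updated = True
        elif req_entries.pop(index, None) is not None:
            # The new request in this row has no state; drop the stale entry.
            updated = True
    for index in batch_update.removed:
        if req_entries.pop(index, None) is not None:
            updated = True
    for src, dst in batch_update.moved:
        entry = req_entries.pop(src, None)
        if req_entries.pop(dst, None) is not None:
            updated = True  # whatever lived at the destination is overwritten
        if entry is not None:
            req_entries[dst] = entry
            updated = True
    return updated


entries: dict[int, float] = {}
changed_on_add = toy_process_dict_updates(
    entries,
    ToyBatchUpdate(added=[(0, {"min_p": 0.1}), (1, {})]),
    lambda params: params.get("min_p"),  # None means "no state for this row"
)
entries_after_add = dict(entries)  # only row 0 has an entry

changed_on_move = toy_process_dict_updates(
    entries,
    ToyBatchUpdate(moved=[(0, 2)]),
    lambda params: params.get("min_p"),
)
```

Returning a boolean lets the caller rebuild any derived tensors only when the per-row state actually changed.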