FeaturesΒΆ
Compatibility MatrixΒΆ
The tables below show mutually exclusive features and the support on some hardware.
The symbols used have the following meanings:
- β = Full compatibility
- π = Partial compatibility
- β = No compatibility
- β = Unknown or TBD
Note
Check the β or π with links to see tracking issue for unsupported feature/hardware combination.
Feature x FeatureΒΆ
| Feature | CP | APC | LoRA | SD | CUDA graph | pooling | enc-dec | logP | prmpt logP | async output | multi-step | mm | best-of | beam-search | prompt-embeds | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CP | β | ||||||||||||||
| APC | β | β | |||||||||||||
| LoRA | β | β | β | ||||||||||||
| SD | β | β | β | β | |||||||||||
| CUDA graph | β | β | β | β | β | ||||||||||
| pooling | π * | π * | β | β | β | β | |||||||||
| enc-dec | β | β | β | β | β | β | β | ||||||||
| logP | β | β | β | β | β | β | β | β | |||||||
| prmpt logP | β | β | β | β | β | β | β | β | β | ||||||
| async output | β | β | β | β | β | β | β | β | β | β | |||||
| multi-step | β | β | β | β | β | β | β | β | β | β | β | ||||
| mm | β | β | π ^ | β | β | β | β | β | β | β | β | β | |||
| best-of | β | β | β | β | β | β | β | β | β | β | β | β | β | ||
| beam-search | β | β | β | β | β | β | β | β | β | β | β | β | β | β | |
| prompt-embeds | β | β | β | β | β | β | β | β | β | β | β | β | β | β | β | 
* Chunked prefill and prefix caching are only applicable to last-token pooling.
 ^ LoRA is only applicable to the language backbone of multimodal models.
Feature x HardwareΒΆ
| Feature | Volta | Turing | Ampere | Ada | Hopper | CPU | AMD | TPU | Intel GPU | 
|---|---|---|---|---|---|---|---|---|---|
| CP | β | β | β | β | β | β | β | β | β | 
| APC | β | β | β | β | β | β | β | β | β | 
| LoRA | β | β | β | β | β | β | β | β | β | 
| SD | β | β | β | β | β | β | β | β | π | 
| CUDA graph | β | β | β | β | β | β | β | β | β | 
| pooling | β | β | β | β | β | β | β | β | β | 
| enc-dec | β | β | β | β | β | β | β | β | β | 
| mm | β | β | β | β | β | β | β | β | π | 
| logP | β | β | β | β | β | β | β | β | β | 
| prmpt logP | β | β | β | β | β | β | β | β | β | 
| async output | β | β | β | β | β | β | β | β | β | 
| multi-step | β | β | β | β | β | β | β | β | β | 
| best-of | β | β | β | β | β | β | β | β | β | 
| beam-search | β | β | β | β | β | β | β | β | β | 
| prompt-embeds | β | β | β | β | β | β | ? | β | β |