NVIDIA TensorRT
E256947
NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime library designed to accelerate AI models on NVIDIA GPUs in production environments.
All labels observed (4)
| Label | Occurrences |
|---|---|
| TensorRT | 3 |
| NVIDIA TensorRT (canonical) | 1 |
| NVIDIA TensorRT (indirectly via shared primitives) | 1 |
| TensorRT engine | 1 |
Statements (49)
| Predicate | Object |
|---|---|
| instanceOf | deep learning inference optimizer; inference runtime library |
| componentOf | NVIDIA AI Enterprise software suite (surface form: NVIDIA AI software stack); NVIDIA inference platform |
| designedFor | production AI deployment |
| developer | NVIDIA Corporation (surface form: NVIDIA) |
| distribution | NVIDIA Developer website; NVIDIA GPU Cloud containers |
| goal | maximize inference throughput; minimize inference latency |
| integratesWith | NVIDIA CUDA (surface form: CUDA); DeepStream SDK (surface form: NVIDIA DeepStream); NVIDIA Triton Inference Server; ONNX Runtime; PyTorch via ONNX export; TensorFlow via TensorRT integration; cuDNN |
| license | proprietary |
| optimizedFor | Nvidia Maxwell GPU (surface form: NVIDIA GPUs) |
| primaryUse | deep learning inference acceleration |
| providesFeature | CUDA integration; calibration for INT8 quantization; dynamic shapes support; dynamic tensor memory management; graph optimizations; kernel auto-tuning; layer fusion; multi-stream execution; plugin layer mechanism |
| supportsDeploymentEnvironment | cloud environments; edge devices; on-premises data centers |
| supportsHardware | NVIDIA Tesla data center GPUs (surface form: NVIDIA data center GPUs); NVIDIA embedded GPUs; NVIDIA GeForce GPU line (surface form: NVIDIA gaming GPUs) |
| supportsLanguageBinding | C++; Python |
| supportsModelFormat | NVIDIA framework-specific formats; ONNX |
| supportsPrecision | FP16; FP32; INT8; TF32 |
| targetDomain | computer vision inference; natural language processing inference; recommendation systems inference; speech and audio inference |
| usedFor | batch inference; real-time inference |
Referenced by (6)
Full triples — surface form annotated when it differs from this entity's canonical label.
subject surface form: NVIDIA AI Enterprise
this entity surface form: TensorRT
this entity surface form: NVIDIA TensorRT (indirectly via shared primitives)
this entity surface form: TensorRT
this entity surface form: TensorRT
this entity surface form: TensorRT engine