NVIDIA TensorRT

E256947

deep learning inference optimizer; inference runtime library

NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime library designed to accelerate AI models on NVIDIA GPUs in production environments.

All labels observed (4)

Label	Occurrences
TensorRT	3
NVIDIA TensorRT (canonical)	1
NVIDIA TensorRT (indirectly via shared primitives)	1
TensorRT engine	1

Statements (49)

Predicate	Object
instanceOf	deep learning inference optimizer; inference runtime library
componentOf	NVIDIA AI Enterprise software suite (surface form: NVIDIA AI software stack); NVIDIA inference platform
designedFor	production AI deployment
developer	NVIDIA Corporation (surface form: NVIDIA)
distribution	NVIDIA Developer website; NVIDIA GPU Cloud containers
goal	maximize inference throughput; minimize inference latency
integratesWith	NVIDIA CUDA (surface form: CUDA); DeepStream SDK (surface form: NVIDIA DeepStream); NVIDIA Triton Inference Server; ONNX Runtime; PyTorch via ONNX export; TensorFlow via TensorRT integration; cuDNN
license	proprietary
optimizedFor	NVIDIA GPUs
primaryUse	deep learning inference acceleration
providesFeature	CUDA integration; calibration for INT8 quantization; dynamic shapes support; dynamic tensor memory management; graph optimizations; kernel auto-tuning; layer fusion; multi-stream execution; plugin layer mechanism
supportsDeploymentEnvironment	cloud environments; edge devices; on-premises data centers
supportsHardware	NVIDIA Tesla data center GPUs (surface form: NVIDIA data center GPUs); NVIDIA embedded GPUs; NVIDIA GeForce GPU line (surface form: NVIDIA gaming GPUs)
supportsLanguageBinding	C++; Python
supportsModelFormat	NVIDIA framework-specific formats; ONNX
supportsPrecision	FP16; FP32; INT8; TF32
targetDomain	computer vision inference; natural language processing inference; recommendation systems inference; speech and audio inference
usedFor	batch inference; real-time inference
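
The "calibration for INT8 quantization" feature listed above can be illustrated with a minimal sketch. It is a conceptual example in plain Python, not the TensorRT API: it assumes the simplest scheme (symmetric quantization with a scale taken from the largest absolute value in the calibration data), whereas TensorRT's own calibrators are more elaborate. The function names `calibrate_scale`, `quantize`, and `dequantize` and the sample values are hypothetical.

```python
from typing import List, Sequence


def calibrate_scale(batches: Sequence[Sequence[float]]) -> float:
    """Derive a symmetric INT8 scale from calibration data (max-abs rule, an assumption)."""
    max_abs = max(abs(x) for batch in batches for x in batch)
    # Map [-max_abs, max_abs] onto [-127, 127]; use 1.0 if the data is all zeros.
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values: Sequence[float], scale: float) -> List[int]:
    """Round each value to the nearest step and clamp to the symmetric INT8 range."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values: Sequence[int], scale: float) -> List[float]:
    """Recover approximate floating-point values from INT8 codes."""
    return [q * scale for q in q_values]


# Hypothetical activations observed while running representative inputs.
calibration_batches = [[0.5, -1.27, 0.25], [1.0, -0.75, 0.0]]
scale = calibrate_scale(calibration_batches)

# The last value lies outside the calibrated range and saturates at 127.
q = quantize([1.27, 0.5, 5.0], scale)
restored = dequantize(q, scale)
print(scale, q, restored)
```

Values inside the calibrated range are recovered to within half a quantization step, while values outside it saturate. That is why the calibration data needs to be representative of the inputs seen in production.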

Referenced by (6)

Full triples — surface form annotated when it differs from this entity's canonical label.

NVIDIA AI Enterprise software suite → includes → NVIDIA TensorRT

subject surface form: NVIDIA AI Enterprise

NVIDIA Jetson embedded modules → supports → NVIDIA TensorRT

this entity surface form: TensorRT

cuDNN → integratesWith → NVIDIA TensorRT

this entity surface form: NVIDIA TensorRT (indirectly via shared primitives)

Tensor Cores → exposedThrough → NVIDIA TensorRT

this entity surface form: TensorRT

NVIDIA Triton Inference Server → supportsFramework → NVIDIA TensorRT

this entity surface form: TensorRT

NVIDIA Triton Inference Server → supportsFormat → NVIDIA TensorRT

this entity surface form: TensorRT engine