NVIDIA inference platform

E892043

The NVIDIA inference platform is a comprehensive suite of hardware and software tools designed to accelerate and optimize AI model deployment and real-time inference across data center, edge, and embedded environments.


Statements (69)

Predicate            Object
instanceOf           AI inference platform
                     software and hardware platform
developer            NVIDIA
includesComponent    NVIDIA AI Enterprise
                     NVIDIA AI Workbench integration
                     NVIDIA Base Command Manager
                     NVIDIA BlueField DPUs
                     NVIDIA CUDA
                     NVIDIA DGX systems
                     NVIDIA EGX platform
                     NVIDIA GPU operator
                     NVIDIA GPUs
                     NVIDIA Jetson platform
                     NVIDIA NIM microservices
                     NVIDIA NeMo microservices
                     NVIDIA TensorRT
                     NVIDIA TensorRT-LLM
                     NVIDIA Triton Inference Server
                     NVIDIA cuDNN
                     NVIDIA networking
optimizationFeature  FP16 mixed precision
                     INT8 quantization
                     dynamic batching
                     layer fusion
                     model ensemble execution
                     precision calibration
providesCapability   autoscaling of inference workloads
                     model optimization
                     model serving
                     multi-GPU inference
                     multi-node inference
                     observability and metrics for inference
purpose              accelerate AI inference
                     optimize AI model deployment
relatedTo            NVIDIA AI platform
                     NVIDIA training platform
softwareStack        CUDA
                     NVIDIA AI Enterprise
                     TensorRT
                     Triton Inference Server
                     cuDNN
supportsDeployment   Kubernetes
                     bare-metal servers
                     cloud environments
                     edge devices
                     embedded modules
                     on-premises data centers
                     virtual machines
supportsEnvironment  data center
                     edge
                     embedded systems
supportsFramework    ONNX Runtime
                     PyTorch
                     TensorFlow
                     XGBoost
supportsModelFormat  ONNX
                     TensorFlow SavedModel
                     TensorRT engine
                     TorchScript
supportsUseCase      batch inference
                     computer vision inference
                     large language model inference
                     online prediction services
                     real-time inference
                     recommender systems
                     speech AI inference
targetUser           AI developers
                     IT operations teams
                     MLOps engineers
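The predicate–object rows above are triples with this entity as the implicit subject. A minimal Python sketch of that structure, showing a subset of the predicates; the dict layout and the `objects` helper are illustrative assumptions, not an official API of this knowledge base:

```python
# Statements for "NVIDIA inference platform", keyed by predicate.
# Predicate and object strings are copied from the table above;
# only a subset is shown to keep the sketch short.
STATEMENTS = {
    "instanceOf": [
        "AI inference platform",
        "software and hardware platform",
    ],
    "developer": ["NVIDIA"],
    "supportsFramework": [
        "ONNX Runtime", "PyTorch", "TensorFlow", "XGBoost",
    ],
    "supportsModelFormat": [
        "ONNX", "TensorFlow SavedModel", "TensorRT engine", "TorchScript",
    ],
}

def objects(predicate):
    """Return all objects asserted for a predicate (empty list if none)."""
    return STATEMENTS.get(predicate, [])

print(objects("developer"))                       # ['NVIDIA']
print("PyTorch" in objects("supportsFramework"))  # True
```

A multi-valued mapping like this mirrors how the page renders one predicate cell followed by several object lines.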

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

NVIDIA TensorRT componentOf NVIDIA inference platform