NVIDIA inference platform
E892043
The NVIDIA inference platform is a comprehensive suite of hardware and software tools designed to accelerate and optimize AI model deployment and real-time inference across data center, edge, and embedded environments.
Statements (69)
| Predicate | Object |
|---|---|
| instanceOf | AI inference platform; software and hardware platform |
| developer | NVIDIA |
| includesComponent | NVIDIA AI Enterprise; NVIDIA AI Workbench integration; NVIDIA Base Command Manager; NVIDIA BlueField DPUs; NVIDIA CUDA; NVIDIA DGX systems; NVIDIA EGX platform; NVIDIA GPU operator; NVIDIA GPUs; NVIDIA Jetson platform; NVIDIA NIM microservices; NVIDIA NeMo microservices; NVIDIA TensorRT; NVIDIA TensorRT-LLM; NVIDIA Triton Inference Server; NVIDIA cuDNN; NVIDIA networking |
| optimizationFeature | FP16 mixed precision; INT8 quantization; dynamic batching; layer fusion; model ensemble execution; precision calibration |
| providesCapability | autoscaling of inference workloads; model optimization; model serving; multi-GPU inference; multi-node inference; observability and metrics for inference |
| purpose | accelerate AI inference; optimize AI model deployment |
| relatedTo | NVIDIA AI platform; NVIDIA training platform |
| softwareStack | CUDA; NVIDIA AI Enterprise; TensorRT; Triton Inference Server; cuDNN |
| supportsDeployment | Kubernetes; bare-metal servers; cloud environments; edge devices; embedded modules; on-premises data centers; virtual machines |
| supportsEnvironment | data center; edge; embedded systems |
| supportsFramework | ONNX Runtime; PyTorch; TensorFlow; XGBoost |
| supportsModelFormat | ONNX; TensorFlow SavedModel; TensorRT engine; TorchScript |
| supportsUseCase | batch inference; computer vision inference; large language model inference; online prediction services; real-time inference; recommender systems; speech AI inference |
| targetUser | AI developers; IT operations teams; MLOps engineers |
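The model-serving rows above (Triton Inference Server, ONNX model format, dynamic batching) can be illustrated with a minimal Triton model configuration (`config.pbtxt`). This is a sketch only; the model name, tensor names, and shapes are hypothetical placeholders, not part of this entity's statements.

```protobuf
name: "image_classifier"        # hypothetical model name
platform: "onnxruntime_onnx"    # serve an ONNX model via the ONNX Runtime backend
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

Placed in a model repository, a file like this tells Triton which backend to load the model with and lets the server combine individual requests into larger batches transparently.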
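The INT8 quantization and precision calibration features listed above rest on a simple idea: map floats to 8-bit integers via a scale factor chosen from the observed value range. The sketch below shows the generic symmetric scheme in plain Python; it is a textbook illustration, not NVIDIA's actual calibrator.

```python
def quantize_int8(values):
    """Symmetric INT8 quantization: scale chosen so max |x| maps to 127."""
    scale = (max(abs(v) for v in values) / 127.0) or 1.0  # avoid scale == 0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the INT8 codes."""
    return [qi * scale for qi in q]

vals = [0.5, -1.25, 3.0, -3.0]
codes, scale = quantize_int8(vals)   # 3.0 maps exactly to 127
approx = dequantize(codes, scale)    # each value is within scale/2 of the original
```

The rounding error per element is bounded by half the scale, which is why calibration (choosing a good range from representative data) matters so much for accuracy.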
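The dynamic batching feature can also be sketched as a toy scheduler: queued requests are released as one batch when either a preferred batch size is reached or the oldest request has waited past a delay budget. This is a simplified illustration of the concept, not Triton's real scheduler.

```python
from collections import deque

def batch_requests(arrivals, preferred_size, max_delay):
    """arrivals: list of (timestamp, request_id) in time order.
    Returns the batches in the order they would be dispatched."""
    batches, queue = [], deque()
    for t, rid in arrivals:
        # Dispatch the pending batch if its oldest request has waited too long.
        if queue and t - queue[0][0] > max_delay:
            batches.append([r for _, r in queue])
            queue.clear()
        queue.append((t, rid))
        # Dispatch as soon as the preferred batch size is reached.
        if len(queue) == preferred_size:
            batches.append([r for _, r in queue])
            queue.clear()
    if queue:  # flush whatever remains at the end
        batches.append([r for _, r in queue])
    return batches
```

For example, with `preferred_size=4` and `max_delay=5`, arrivals at times 0, 1, 2, and 10 dispatch as one batch of three (the delay budget expires) followed by a batch of one.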