GPU Plans by Microsoft Azure

GPU plans sold by Microsoft Azure, with their specifications and prices. Each plan includes one or more GPUs plus CPU, memory, disk, and data transfer.

GPU Types

Cloud providers offer a variety of GPUs from vendors like NVIDIA and AMD, each optimized for different workloads. From AI training to graphics rendering, these powerful processors enable diverse applications. Here's a look at some common GPU types:

  • A10: A versatile data center GPU that balances AI inference and graphics rendering. Offers strong performance for diverse workloads, including virtual workstations and AI-powered video processing. See also the A10G.
  • A40: A professional workstation GPU built for demanding graphics and AI tasks. It delivers exceptional performance for visual effects, 3D rendering, and AI-accelerated workflows in professional environments.
  • A100: A high-performance data center GPU designed for accelerating AI training and high-performance computing. Delivers exceptional computational power for complex simulations and large-scale deep learning models.
  • H100: NVIDIA's next-generation AI GPU, providing a significant leap in performance over the A100. Engineered for massive AI workloads, with an improved Transformer Engine and increased memory bandwidth.
  • H200: An enhanced version of the H100, designed to tackle the most demanding AI workloads. It offers increased memory capacity and bandwidth, enabling faster processing of massive datasets for large language models and generative AI.
  • L4: An energy-efficient GPU optimized for AI video and inference workloads in the data center. It excels at tasks like video transcoding, AI-powered video analysis, and real-time inference, while maintaining a low power footprint.
  • T4: An entry-level inference GPU, widely used in cloud environments for AI inference and graphics virtualization. Provides a cost-effective solution for deploying AI models and delivering virtual desktops.
  • L40S: A powerful data center GPU designed for professional visualization and demanding AI workloads. Ideal for rendering complex 3D models, running simulations, and accelerating AI-driven design and content creation.
  • NVIDIA V100: A previous-generation high-performance GPU, still widely used for AI training and scientific computing. It offers substantial computational power and memory bandwidth for demanding workloads. See also NVIDIA V100S.
  • AMD Radeon Pro V520: A professional workstation GPU designed for visualization and graphics-intensive applications. It delivers reliable performance for tasks like 3D modeling, rendering, and video editing.
  • Nvidia RTX 4000: The NVIDIA RTX 4000 Ada Generation is a powerful professional GPU with 20GB GDDR6 ECC memory. Featuring 6144 CUDA cores, 192 Tensor cores, and 48 RT cores, it excels in demanding creative, design, and AI workflows. Its single-slot, power-efficient design delivers high performance for complex tasks.
  • Nvidia Quadro RTX 6000: The NVIDIA RTX 6000 Ada Generation is a high-end professional workstation graphics card. It boasts 48GB of GDDR6 ECC memory, 18,176 CUDA cores, 568 Tensor cores, and 142 RT cores, delivering exceptional performance for demanding tasks like 3D rendering, AI, and data science. Its 960 GB/s memory bandwidth and advanced features like DLSS 3 and SER accelerate workflows, making it a top choice for professionals.

This is not a comprehensive list, and prices may change before VPSBenchmarks can update them.

Microsoft Azure
GPU compute - NCasT4_v3

The NCasT4_v3-series virtual machines are powered by NVIDIA Tesla T4 GPUs and AMD EPYC 7V12 (Rome) CPUs. The VMs feature up to 4 NVIDIA T4 GPUs with 16 GB of memory each, up to 64 non-multithreaded AMD EPYC 7V12 (Rome) processor cores (base frequency of 2.45 GHz, all-core peak frequency of 3.1 GHz, and single-core peak frequency of 3.3 GHz), and 440 GiB of system memory.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| NC4as T4 v3 | 1 x T4 | 16 GB | 4 | 28 GB | 180 GB | $0.53/hr |
| NC8as T4 v3 | 1 x T4 | 16 GB | 8 | 56 GB | 360 GB | $0.76/hr |
| NC16as T4 v3 | 1 x T4 | 16 GB | 16 | 110 GB | 360 GB | $1.22/hr |
| NC64as T4 v3 | 4 x T4 | 64 GB | 64 | 440 GB | 2880 GB | $4.35/hr |
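For budgeting, the hourly rates above can be converted into an approximate monthly cost. A minimal sketch, assuming roughly 730 billable hours per month and the on-demand prices listed here; actual invoices depend on region, reservations, spot discounts, and partial-hour billing:

```python
# Approximate monthly cost of the NCasT4_v3 plans from their hourly rates.
# Assumes ~730 hours per month (an assumption, not an Azure guarantee);
# real bills vary by region and discount model.
HOURS_PER_MONTH = 730

nc_t4_hourly = {
    "NC4as T4 v3": 0.53,
    "NC8as T4 v3": 0.76,
    "NC16as T4 v3": 1.22,
    "NC64as T4 v3": 4.35,
}

def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Estimated cost of running the VM continuously for a month."""
    return round(hourly_rate * hours, 2)

for plan, rate in nc_t4_hourly.items():
    print(f"{plan}: ${monthly_cost(rate):,.2f}/month")
```

At these rates, the single-T4 NC4as plan comes out to roughly $387/month of continuous use, while the four-GPU NC64as plan is about $3,176/month.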
GPU Compute - NCsv3

NCv3-series VMs are powered by NVIDIA Tesla V100 GPUs and offer 6 to 24 Intel Xeon E5-2690 v4 vCPUs.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| NC24rs v3 | 4 x V100 | 64 GB | 24 | 448 GB | 2944 GB | $13.46/hr |
| NC24s v3 | 4 x V100 | 64 GB | 24 | 448 GB | 2944 GB | $12.41/hr |
| NC12s v3 | 2 x V100 | 32 GB | 12 | 224 GB | 1472 GB | $6.20/hr |
| NC6s v3 | 1 x V100 | 16 GB | 6 | 112 GB | 736 GB | $3.10/hr |
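NCsv3 pricing scales almost linearly with GPU count, which makes per-GPU cost a useful comparison metric. A quick sketch using the listed prices (the premium on NC24rs v3 reflects its RDMA-capable networking):

```python
# Per-GPU hourly cost for the NCsv3 (V100) plans listed above.
ncsv3 = {
    # plan: (gpu_count, hourly_price_usd)
    "NC6s v3": (1, 3.10),
    "NC12s v3": (2, 6.20),
    "NC24s v3": (4, 12.41),
    "NC24rs v3": (4, 13.46),  # "r" denotes the RDMA-capable variant
}

def per_gpu_rate(gpus: int, price: float) -> float:
    """Hourly price divided evenly across the plan's GPUs."""
    return price / gpus

for plan, (gpus, price) in ncsv3.items():
    print(f"{plan}: ${per_gpu_rate(gpus, price):.4f} per V100-hour")
```

The non-RDMA plans all land at about $3.10 per V100-hour, so the main reason to pick a larger size is needing the GPUs in one VM rather than any per-unit discount.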
GPU Compute - NVv3

The NVv3-series virtual machines are powered by NVIDIA Tesla M60 GPUs and NVIDIA GRID technology, with Intel E5-2690 v4 (Broadwell) CPUs and Intel Hyper-Threading Technology. These virtual machines are targeted at GPU-accelerated graphics applications and virtual desktops, where customers want to visualize their data, simulate results, work on CAD, or render and stream content.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| NV12s v3 | 1 x M60 | 8 GB | 12 | 112 GB | 320 GB | $1.16/hr |
| NV24s v3 | 2 x M60 | 16 GB | 24 | 224 GB | 640 GB | $2.31/hr |
| NV48s v3 | 4 x M60 | 32 GB | 48 | 448 GB | 1280 GB | $4.62/hr |
GPU Compute - NCads_H100_v5

The NCads H100 v5-series virtual machines are part of the Azure GPU family designed for applied AI training and batch inference workloads. Powered by NVIDIA H100 NVL GPUs and 4th-generation AMD EPYC Genoa processors, these instances provide high-performance compute capabilities. They feature up to 2 GPUs with 94 GB of memory each and up to 96 non-multithreaded processor cores. This series is ideal for GPU-accelerated analytics, video processing, and autonomy model training, supporting high-throughput local NVMe storage and accelerated networking.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| NC80adis H100 v5 | 2 x NVIDIA H100 | 188 GB | 80 | 640 GB | 7152 GB | $13.96/hr |
| NC40ads H100 v5 | 1 x NVIDIA H100 | 94 GB | 40 | 320 GB | 3576 GB | $6.98/hr |
GPU Compute - NDsr H100 v5

The ND H100 v5-series virtual machines are Azure's flagship generative AI infrastructure, designed for massive-scale AI training and inference. These instances feature eight NVIDIA H100 Tensor Core GPUs interconnected via 400 Gb/s NVIDIA Quantum-2 InfiniBand. They are powered by 4th Gen Intel Xeon Scalable processors and provide significant performance leaps for large language models. With high-speed DDR5 memory and local NVMe storage, they deliver the throughput necessary for the most demanding deep learning workloads and high-performance computing applications.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| ND96isr H100 v5 | 8 x NVIDIA H100 | 640 GB | 96 | 1900 GB | 28000 GB | $98.32/hr |
GPU Compute - ND-H200-v5

The ND H200 v5-series is a flagship Azure virtual machine designed for high-end AI training and generative AI inference. It features eight NVIDIA H200 Tensor Core GPUs with 141 GB of HBM3e memory each, providing 4.8 TB/s of bandwidth per GPU. Powered by Intel Xeon (Sapphire Rapids) processors, it offers 96 vCPUs and 1850 GiB of RAM. With 3.2 Tb/s of InfiniBand interconnect, it is optimized for large-scale, low-latency AI clusters and complex scientific computing workloads.

| Plan | GPU Type | GPU RAM | vCPUs | RAM | Storage | Price |
| --- | --- | --- | --- | --- | --- | --- |
| ND96is_H200_v5 | 8 x NVIDIA H200 | 1128 GB | 96 | 1800 GB | 28000 GB | $84.80/hr |
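Comparing the two flagship 8-GPU ND-series plans on a per-GPU and per-GB-of-GPU-memory basis shows how the H200's larger HBM3e capacity changes the economics. A rough sketch from the prices listed above (assumed to be on-demand rates; spot and reserved pricing differ):

```python
# Cost per GPU-hour and per GB of aggregate GPU memory per hour
# for the 8-GPU ND-series plans listed above.
nd_plans = {
    # plan: (gpu_count, total_gpu_memory_gb, hourly_price_usd)
    "ND96isr H100 v5": (8, 640, 98.32),
    "ND96is_H200_v5": (8, 1128, 84.80),
}

for plan, (gpus, mem_gb, price) in nd_plans.items():
    per_gpu = price / gpus
    per_gb = price / mem_gb
    print(f"{plan}: ${per_gpu:.2f}/GPU-hr, ${per_gb:.4f}/GB-hr")
```

At these listed rates the H200 plan is cheaper per GPU-hour ($10.60 vs $12.29) and roughly half the price per GB of GPU memory, which matters for memory-bound workloads like serving large language models.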
