XcellHost | Blog

Tesla V100 vs Tesla P100 – Key Differences

Written by Abhishek Nimbalkar | Sep 5, 2020 6:33:07 PM

Accelerate your most demanding HPC and hyperscale data center workloads with NVIDIA® Tesla® GPUs. Data scientists and researchers can now parse petabytes of data orders of magnitude faster than they could with traditional CPUs, in applications ranging from energy exploration to deep learning. Tesla accelerators also deliver the horsepower needed to run bigger simulations faster than ever before. Plus, Tesla delivers the highest performance and user density for virtual desktops, applications, and workstations.

What is Tesla V100?

The NVIDIA® V100 Tensor Core GPU is the most advanced data center GPU ever built to accelerate AI, high-performance computing (HPC), data science, and graphics. Powered by the NVIDIA Volta™ architecture, it comes in 16 GB and 32 GB configurations and offers the performance of up to 32 CPUs in a single GPU. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough.

What is Tesla P100?

Today’s data centers rely on many interconnected commodity compute nodes, which limits high-performance computing (HPC) and hyperscale workloads. NVIDIA® Tesla® P100 taps into NVIDIA Pascal™ GPU architecture to deliver a unified platform for accelerating both HPC and AI, dramatically increasing throughput while also reducing cost.

1. Hardware Comparison

| Processor | SMs | CUDA Cores | Tensor Cores | Frequency | L2 Cache | Max. Memory | Memory B/W |
|---|---|---|---|---|---|---|---|
| NVIDIA P100 | 56 | 3,584 | N/A | 1,126 MHz | 4 MB | 16 GB | 720 GB/s |
| NVIDIA V100 | 80 | 5,120 | 640 | 1,530 MHz | 6 MB | 16 GB | 900 GB/s |
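The table translates into simple generation-over-generation ratios. The sketch below uses only the published numbers above, not measurements, so it is a back-of-envelope comparison rather than a benchmark:

```python
# Rough V100-vs-P100 improvement factors from the hardware table above.
# All figures are the table's published specs, not measured results.

P100 = {"sms": 56, "cuda_cores": 3584, "mem_bw_gbs": 720}
V100 = {"sms": 80, "cuda_cores": 5120, "mem_bw_gbs": 900}

def ratio(key):
    """V100/P100 improvement factor for one spec."""
    return V100[key] / P100[key]

for key in P100:
    print(f"{key}: {ratio(key):.2f}x")  # SMs and cores: ~1.43x; bandwidth: 1.25x
```

Note that compute resources grew faster (about 1.43x) than memory bandwidth (1.25x), which is one reason the Volta generation leans on Tensor Cores and better caching to feed its extra cores.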

2. Benchmark Setup

| | P100 system | V100 system |
|---|---|---|
| CPU | 2 × Intel Xeon E5-2680 v3 | 2 × Intel Xeon E5-2686 v4 |
| GPU | NVIDIA Tesla P100 PCIe | NVIDIA Tesla V100 PCIe |
| OS | Red Hat Enterprise Linux 7.4 | Red Hat Enterprise Linux 7.4 |
| RAM | 64 GB | 128 GB |
| NGC TensorFlow | 17.11 | 17.11 |
| Clock boost | GPU: 1,328 MHz; memory: 715 MHz | GPU: 1,370 MHz; memory: 1,750 MHz |
| ECC | on | on |

3. Performance

Tesla P100

Modern high-performance computing (HPC) data centers are key to solving some of the world's most important scientific and engineering challenges. The NVIDIA® Tesla® accelerated computing platform powers these data centers with industry-leading applications that accelerate HPC and AI workloads. The Tesla P100 GPU is the engine of the modern data center, delivering breakthrough performance with fewer servers, resulting in faster insights and dramatically lower costs. Every HPC data center can benefit from the Tesla platform: over 450 HPC applications across a broad range of domains are optimized for GPUs, including all of the top 10 HPC applications and every major deep learning framework.

Tesla V100

The NVIDIA® V100 Tensor Core GPU is the world's most powerful accelerator for deep learning, machine learning, high-performance computing (HPC), and graphics. Powered by NVIDIA Volta™, a single V100 Tensor Core GPU offers the performance of nearly 32 CPUs, enabling researchers to tackle challenges that were once considered unsolvable. V100-based systems also topped MLPerf, the first industry-wide AI benchmark, validating the V100 as the world's most powerful, scalable, and versatile computing platform.

4. Fundamental & Architectural Differences

| Tesla product | Tesla V100 | Tesla P100 |
|---|---|---|
| Architecture | Volta | Pascal |
| Code name | GV100 | GP100 |
| Release year | 2017 | 2016 |
| CUDA cores / GPU | 5,120 | 3,584 |
| GPU boost clock | 1,530 MHz | 1,480 MHz |
| Tensor cores / GPU | 640 | N/A |
| Memory type | HBM2 | HBM2 |
| Maximum memory | 32 GB | 16 GB |
| Memory clock speed | 1,758 MHz | 1,430 MHz |
| Memory bandwidth | 900.1 GB/s | 720.9 GB/s |
| CUDA compute capability | 7.0 | 6.0 |
| Floating-point performance (FP32) | 14,029 GFLOPS | 10,609 GFLOPS |
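The single-precision figures in the table are consistent with a simple model: each CUDA core retires one fused multiply-add (2 FLOPs) per clock cycle. The sketch below reproduces both numbers; note that the V100 figure corresponds to the PCIe boost clock of roughly 1,370 MHz rather than the 1,530 MHz SXM2 clock listed above:

```python
def peak_fp32_gflops(cuda_cores, boost_mhz):
    """Theoretical peak FP32 throughput: one FMA (2 FLOPs) per core per cycle."""
    return cuda_cores * boost_mhz * 2 / 1000  # cores * MHz * 2 FLOPs -> GFLOPS

print(peak_fp32_gflops(3584, 1480))  # P100: ~10,609 GFLOPS
print(peak_fp32_gflops(5120, 1370))  # V100 at PCIe boost: ~14,029 GFLOPS
```

This is a theoretical ceiling; sustained throughput in real workloads depends on memory bandwidth and occupancy, not just clock and core counts.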

5. Key Features

Tesla P100

• Extreme performance: powering HPC, deep learning, and many more GPU computing areas

• NVLink™: NVIDIA's high-speed, high-bandwidth interconnect for maximum application scalability

• HBM2: fast, high-capacity, highly efficient CoWoS (Chip-on-Wafer-on-Substrate) stacked memory architecture

• Unified Memory, Compute Preemption, and new AI algorithms: a significantly improved programming model and advanced AI software optimized for the Pascal architecture

• 16nm FinFET: enables more features, higher performance, and improved power efficiency

Tesla V100

• New Streaming Multiprocessor (SM) Architecture Optimized for Deep Learning

• Second-Generation NVIDIA NVLink™

• HBM2 Memory: Faster, Higher Efficiency

• Volta Multi-Process Service

• Enhanced Unified Memory and Address Translation Services

• Maximum Performance and Maximum Efficiency Modes

• Cooperative Groups and New Cooperative Launch APIs

• Volta Optimized Software

Conclusion

A critical question our customers ask is: which GPU should I choose, and which card will deliver results faster?

If you want maximum deep learning performance, the Tesla V100 is the clear choice: its dedicated Tensor Cores offer huge performance potential for deep learning applications. NVIDIA has even coined a new unit, "Tensor TFLOPS," to measure this gain.
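As a rough sketch of where that headline Tensor Core number comes from: each of Volta's 640 Tensor Cores performs a 4×4×4 matrix fused multiply-add, i.e. 64 FMAs (128 FLOPs), per clock cycle. The clock used below is the SXM2 boost clock from the spec table; this is a back-of-envelope estimate, not a measured result:

```python
def peak_tensor_tflops(tensor_cores, boost_mhz, fmas_per_core=64):
    """Peak mixed-precision throughput: each Volta Tensor Core
    does 64 FMAs (128 FLOPs) per cycle on FP16 inputs."""
    return tensor_cores * fmas_per_core * 2 * boost_mhz / 1e6

print(peak_tensor_tflops(640, 1530))  # ~125 Tensor TFLOPS at the SXM2 boost clock
```

For comparison, the same V100 peaks at roughly 14-16 FP32 TFLOPS on its CUDA cores, which is why Tensor Core-aware frameworks see such large training speedups.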

At the time of writing, the Tesla V100 is the fastest NVIDIA GPU on the market: for deep learning workloads that can exploit its Tensor Cores, it can be up to 3x faster than the P100. If you primarily need a large amount of fast HBM2 memory for machine learning, either card will serve; both offer 16 GB configurations, and the V100 is also available with 32 GB.