Yesterday, AMD showed off the first real-time deep learning benchmarks of the Radeon Vega graphics card against the NVIDIA Pascal based Tesla P100. In its first attempt, the RTG-developed GPU gave NVIDIA's best compute card from last year a good beating, but there's more to the benchmarks.
AMD Radeon Vega Vs NVIDIA Pascal Tesla P100 Deep Learning Performance Detailed
NVIDIA launched the Tesla P100, based on the Pascal GP100, back in early 2016. Since then, it has remained the fastest compute solution available. NVIDIA kicked off 2017 by announcing the next chapter in GPU deep learning, the Tesla V100 based on the Volta GV100, at GTC 2017. We already know the specs of these high-performance compute cards.
The Tesla P100 is a cut-down configuration and features 3584 cores for 10.6 TFLOPs (FP32) and 21.2 TFLOPs (FP16). Moving on, the Radeon Vega Frontier Edition will have 4096 cores for 13.0 TFLOPs (FP32) and 25 TFLOPs (FP16). NVIDIA's Tesla V100 is also a cut-down configuration like the Tesla P100 but has a far higher core count: the chip ships with 5120 cores enabled, while the full GPU in fact houses 5376 cores.
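
For a sense of where these headline numbers come from, here is a minimal sketch of the usual peak-throughput arithmetic (cores × clock × 2 FMA ops per cycle). The boost clocks used below (roughly 1.48 GHz for the Tesla P100 and 1.6 GHz for the Vega Frontier Edition) are assumptions for illustration, not figures from AMD's demo:

```python
# Rough sanity check of the quoted peak-compute numbers.
def peak_tflops(cores: int, clock_ghz: float, ops_per_clock: int = 2) -> float:
    """Peak throughput in TFLOPs: cores x clock x ops per clock (FMA = 2)."""
    return cores * clock_ghz * ops_per_clock / 1000.0

# Tesla P100 (3584 cores, ~1.48 GHz boost -- assumed clock)
print(f"P100 FP32: {peak_tflops(3584, 1.48):.1f} TFLOPs")      # ~10.6
print(f"P100 FP16: {peak_tflops(3584, 1.48) * 2:.1f} TFLOPs")  # ~21.2 (FP16 at 2x rate)

# Radeon Vega Frontier Edition (4096 SPs, ~1.6 GHz -- assumed clock)
print(f"Vega FP32: {peak_tflops(4096, 1.60):.1f} TFLOPs")      # ~13.1
print(f"Vega FP16: {peak_tflops(4096, 1.60) * 2:.1f} TFLOPs")  # ~26 (packed math; AMD quotes ~25)
```

At those clocks the formula lands on the 10.6 and 13 TFLOPs FP32 figures quoted above, with FP16 running at double rate on both chips.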

The chip delivers an astonishing amount of compute, rated at 15 TFLOPs (FP32) and 120 Tensor TFLOPs (FP16) with the new Tensor Cores. The Tensor Cores are dedicated units inside the Volta chip used for deep learning training, and they deliver up to 6 times higher FP16 output than GP100 or any other GPU of its caliber.
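
That "up to 6 times" figure follows directly from the vendor peak numbers quoted above; a quick back-of-the-envelope check:

```python
# Volta's quoted 120 Tensor TFLOPs against GP100's 21.2 FP16 TFLOPs
# (both are vendor peak figures from the specs above).
volta_tensor_tflops = 120.0
p100_fp16_tflops = 21.2
print(f"Speedup: {volta_tensor_tflops / p100_fp16_tflops:.1f}x")  # ~5.7x, i.e. "up to 6x"
```
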
| GPU Family | AMD Vega | AMD Navi | NVIDIA Pascal | NVIDIA Volta |
|---|---|---|---|---|
| Flagship GPU | Vega 10 | Navi 10 | NVIDIA GP100 | NVIDIA GV100 |
| GPU Process | 14nm FinFET | 7nm FinFET | TSMC 16nm FinFET | TSMC 12nm FinFET |
| GPU Transistors | 15-18 Billion | TBC | 15.3 Billion | 21.1 Billion |
| GPU Cores (Max) | 4096 SPs | TBC | 3840 CUDA Cores | 5376 CUDA Cores |
| Peak FP32 Compute | 13.0 TFLOPs | TBC | 12.0 TFLOPs | >15.0 TFLOPs (Full Die) |
| Peak FP16 Compute | 25.0 TFLOPs | TBC | 24.0 TFLOPs | 120 Tensor TFLOPs |
| VRAM | 16 GB HBM2 | TBC | 16 GB HBM2 | 16 GB HBM2 |
| Memory (Consumer Cards) | HBM2 | HBM3 | GDDR5X | GDDR6 |
| Memory (Dual-Chip Professional/HPC) | HBM2 | HBM3 | HBM2 | HBM2 |
| HBM2 Bandwidth | 484 GB/s (Frontier Edition) | >1 TB/s? | 732 GB/s (Peak) | 900 GB/s |
| Graphics Architecture | Next Compute Unit (Vega) | Next Compute Unit (Navi) | 5th Gen Pascal CUDA | 6th Gen Volta CUDA |
| Successor of (GPU) | Radeon RX 500 Series | Radeon RX 600 Series | GM200 (Maxwell) | GP100 (Pascal) |
| Launch | 2017 | 2019 | 2016 | 2017 |