Yesterday, AMD showed off the first real-time deep learning benchmarks of the Radeon Vega graphics card against the NVIDIA Pascal based Tesla P100. In its first attempt, the RTG-developed GPU gave NVIDIA's best compute card from last year a good beating, but there's more to the benchmarks.
AMD Radeon Vega Vs NVIDIA Pascal Tesla P100 Deep Learning Performance Detailed
NVIDIA launched the Tesla P100, based on the Pascal GP100, back in early 2016. Since then, it has remained the fastest compute solution available. NVIDIA kicked off 2017 by announcing the next chapter in GPU deep learning, the Tesla V100 based on the Volta GV100, at GTC 2017. We already know the specs of these high-performance compute cards.
The Tesla P100 is a cut-down configuration and features 3584 cores for 10.6 TFLOPs (FP32) and 21.2 TFLOPs (FP16). Moving on, the Radeon Vega Frontier Edition will have 4096 cores for 13.0 TFLOPs (FP32) and 25 TFLOPs (FP16). NVIDIA's Tesla V100 is also a cut-down configuration like the Tesla P100 but has a far higher core count: the chip ships with 5120 cores enabled, while the full GPU in fact houses 5376 cores.
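
For a sense of where these headline numbers come from, here is a minimal sketch of the usual peak-throughput arithmetic (cores × clock × 2 FMA ops per cycle). The boost clocks used below (roughly 1.48 GHz for the Tesla P100 and 1.6 GHz for the Vega Frontier Edition) are assumptions for illustration, not figures from AMD's demo:

```python
# Rough sanity check of the quoted peak-compute numbers.
def peak_tflops(cores: int, clock_ghz: float, ops_per_clock: int = 2) -> float:
    """Peak throughput in TFLOPs: cores x clock x ops per clock (FMA = 2)."""
    return cores * clock_ghz * ops_per_clock / 1000.0

# Tesla P100 (3584 cores, ~1.48 GHz boost -- assumed clock)
print(f"P100 FP32: {peak_tflops(3584, 1.48):.1f} TFLOPs")      # ~10.6
print(f"P100 FP16: {peak_tflops(3584, 1.48) * 2:.1f} TFLOPs")  # ~21.2 (FP16 at 2x rate)

# Radeon Vega Frontier Edition (4096 SPs, ~1.6 GHz -- assumed clock)
print(f"Vega FP32: {peak_tflops(4096, 1.60):.1f} TFLOPs")      # ~13.1
print(f"Vega FP16: {peak_tflops(4096, 1.60) * 2:.1f} TFLOPs")  # ~26 (packed math; AMD quotes ~25)
```

At those clocks the formula lands on the 10.6 and 13 TFLOPs FP32 figures quoted above, with FP16 running at double rate on both chips.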

The chip delivers an astonishing amount of compute, rated at 15 TFLOPs (FP32) and 120 Tensor TFLOPs (FP16) with the new Tensor Cores. The Tensor Cores are dedicated units inside the Volta chip used for deep learning training, and they deliver up to 6 times higher FP16 output than GP100 or any other GPU of its caliber.
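
That "up to 6 times" figure follows directly from the vendor peak numbers quoted above; a quick back-of-the-envelope check:

```python
# Volta's quoted 120 Tensor TFLOPs against GP100's 21.2 FP16 TFLOPs
# (both are vendor peak figures from the specs above).
volta_tensor_tflops = 120.0
p100_fp16_tflops = 21.2
print(f"Speedup: {volta_tensor_tflops / p100_fp16_tflops:.1f}x")  # ~5.7x, i.e. "up to 6x"
```
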
| GPU Family | AMD Vega | AMD Navi | NVIDIA Pascal | NVIDIA Volta |
|---|---|---|---|---|
| Flagship GPU | Vega 10 | Navi 10 | NVIDIA GP100 | NVIDIA GV100 |
| GPU Process | 14nm FinFET | 7nm FinFET | TSMC 16nm FinFET | TSMC 12nm FinFET |
| GPU Transistors | 15-18 Billion | TBC | 15.3 Billion | 21.1 Billion |
| GPU Cores (Max) | 4096 SPs | TBC | 3840 CUDA Cores | 5376 CUDA Cores |
| Peak FP32 Compute | 13.0 TFLOPs | TBC | 12.0 TFLOPs | >15.0 TFLOPs (Full Die) |
| Peak FP16 Compute | 25.0 TFLOPs | TBC | 24.0 TFLOPs | 120 Tensor TFLOPs |
| VRAM | 16 GB HBM2 | TBC | 16 GB HBM2 | 16 GB HBM2 |
| Memory (Consumer Cards) | HBM2 | HBM3 | GDDR5X | GDDR6 |
| Memory (Dual-Chip Professional/HPC) | HBM2 | HBM3 | HBM2 | HBM2 |
| HBM2 Bandwidth | 484 GB/s (Frontier Edition) | >1 TB/s? | 732 GB/s (Peak) | 900 GB/s |
| Graphics Architecture | Next Compute Unit (Vega) | Next Compute Unit (Navi) | 5th Gen Pascal CUDA | 6th Gen Volta CUDA |
| Successor of (GPU) | Radeon RX 500 Series | Radeon RX 600 Series | GM200 (Maxwell) | GP100 (Pascal) |
| Launch | 2017 | 2019 | 2016 | 2017 |