Details regarding the NVIDIA Ada Lovelace Gaming GPU which will power the GeForce RTX 40 series graphics cards have been revealed. The new information comes from Kopite7kimi & talks about the block diagram of the next-gen architecture.
NVIDIA GeForce Ada Lovelace GPU SM Block Diagram Detailed: Bigger & Better Than Ever For Gamers!
The NVIDIA Ada Lovelace GPU architecture is no mystery anymore. We have learned the specific configurations that will power the next-gen AD10* series SKUs for GeForce RTX 40 series graphics cards and we have also seen leaked specifications of the lineup. Now, it's time to talk purely about the next-generation graphics chip itself.
NVIDIA AD102 'Ada Lovelace' Gaming GPU 'SM' Block Diagram (Image Credits: Kopite7kimi):
NVIDIA GA102 'Ampere' Gaming GPU 'SM' Block Diagram:
Starting with the GPU configuration, Kopite7kimi compares the top AD102 GPU to various other GPUs from the green team. These include the gaming-focused Ampere GA102 and Turing TU102, while the HPC-focused Hopper GH100 and Ampere GA100 are also added to the list. I'll only compare the AD102 to its gaming predecessors since the HPC-focused designs are vastly different from consumer-centric offerings.
The NVIDIA Ada Lovelace AD102 GPU will feature up to 12 GPCs (Graphics Processing Clusters). This is an increase of roughly 70% versus GA102, which features only 7 GPCs. Each GPC will consist of 6 TPCs with 2 SMs apiece, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What's changed is the FP32 & the INT32 core configuration. Each SM will include 128 FP32 units, but combined FP32+INT32 units will go up to 192. This is because the FP32 units aren't shared with the INT32 units. The 128 FP32 cores are separate from the 64 INT32 cores.
So in total, each sub-core will consist of 32 FP32 plus 16 INT32 units for a total of 48 units. Each SM will have a total of 128 FP32 units plus 64 INT32 units for a total of 192 units. And since there are a total of 144 SMs (12 per GPC), we are looking at 18,432 FP32 units and 9,216 INT32 units for a total of 27,648 cores. Each SM will also include two Warp Schedulers (32 thread/CLK) for 64 warps per SM. This is a 50% increase on the cores (FP32+INT32) per SM and a 33% increase in warps/threads vs the GA102 GPU.
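For readers who want to check the math, here is a rough Python sketch that multiplies out the hierarchy. It assumes the per-level counts from the leak (12 GPCs for AD102, 6 TPCs per GPC, 2 SMs per TPC, 4 sub-cores per SM, 128 FP32 and 192 combined FP32+INT32 units per SM) and that the levels simply multiply together. The function and variable names are made up for illustration and are not part of any NVIDIA tooling.

```python
# Hypothetical helper: tally unit counts for a chip from its per-level figures.
def chip_totals(gpcs, tpcs_per_gpc=6, sms_per_tpc=2, subcores_per_sm=4,
                fp32_per_sm=128, combined_per_sm=192):
    int32_per_sm = combined_per_sm - fp32_per_sm      # 192 - 128 = 64
    sms = gpcs * tpcs_per_gpc * sms_per_tpc           # SMs on the full chip
    return {
        "sms_per_gpc": tpcs_per_gpc * sms_per_tpc,
        "sms": sms,
        "fp32_per_subcore": fp32_per_sm // subcores_per_sm,
        "int32_per_subcore": int32_per_sm // subcores_per_sm,
        "fp32": sms * fp32_per_sm,
        "int32": sms * int32_per_sm,
        "combined": sms * combined_per_sm,
    }

ad102 = chip_totals(gpcs=12)   # 12 GPCs is the rumored AD102 figure
gpc_uplift = 12 / 7 - 1        # AD102 vs the 7 GPCs of GA102

print(ad102)
print(f"GPC uplift vs GA102: {gpc_uplift:.0%}")
```

Under these assumptions, the full chip works out to 144 SMs, 18,432 FP32 units, and 9,216 INT32 units, with a GPC uplift of about 71% over GA102. Shipping SKUs may well use cut-down configurations, so treat these as upper bounds.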
NVIDIA Ada Lovelace 'AD103' GPU Specs 'Preliminary':
GPU Name | AD103 | vs GA102 | vs GA103 | vs TU102 |
---|---|---|---|---|
GPC | 7 (Per GPU) | Same | 1.16x | 1.16x |
TPC | 6 (Per GPC) | Same | 1.20x | Same |
SM | 2 (Per TPC) | Same | Same | Same |
Sub-Core | 4 (Per SM) | Same | Same | Same |
FP32 | 128 (Per SM) | Same | Same | 2x |
FP32+INT32 | 192 (Per SM) | 1.5x | 1.5x | 1.5x |
Warps | 64 (Per SM) | 1.33x | 1.33x | 2x |
Threads | 2048 (Per SM) | 1.33x | 1.33x | 2x |
L1 Cache | 192 KB (Per SM) | 1.5x | 1.5x | 2x |
L2 Cache | 64 MB (Per GPU) | 10.6x | 16x | 10.6x |
ROPs | 32 (Per GPC) | 2x | 2x | 2x |
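The table lists per-level figures plus multipliers, so the full-chip AD103 totals and the predecessor values behind each multiplier have to be worked out. The sketch below does that in Python, using only the numbers in the table. The dictionary keys and function names are invented for illustration, and the backed-out baselines are approximate because the multipliers are rounded.

```python
# AD103 per-level figures copied from the table (key names are illustrative).
AD103 = {
    "gpc_per_gpu": 7, "tpc_per_gpc": 6, "sm_per_tpc": 2,
    "fp32_per_sm": 128, "fp32_int32_per_sm": 192,
    "warps_per_sm": 64, "threads_per_sm": 2048,
    "l1_kb_per_sm": 192, "l2_mb_per_gpu": 64, "rops_per_gpc": 32,
}

def full_chip(spec):
    """Multiply per-level figures out to whole-GPU totals."""
    sms = spec["gpc_per_gpu"] * spec["tpc_per_gpc"] * spec["sm_per_tpc"]
    return {
        "sms": sms,
        "fp32": sms * spec["fp32_per_sm"],
        "fp32_int32": sms * spec["fp32_int32_per_sm"],
        "resident_warps": sms * spec["warps_per_sm"],
        "resident_threads": sms * spec["threads_per_sm"],
        "l1_kb": sms * spec["l1_kb_per_sm"],
        "rops": spec["gpc_per_gpu"] * spec["rops_per_gpc"],
    }

def implied_baseline(new_value, ratio):
    """Back out the older chip's figure from a rounded 'N.NNx' multiplier."""
    return new_value / ratio

totals = full_chip(AD103)
ga102_warps = round(implied_baseline(AD103["warps_per_sm"], 1.33))   # warps per SM
ga102_l2_mb = round(implied_baseline(AD103["l2_mb_per_gpu"], 10.6))  # MB per GPU
ga103_l2_mb = round(implied_baseline(AD103["l2_mb_per_gpu"], 16))    # MB per GPU

print(totals)
print(ga102_warps, ga102_l2_mb, ga103_l2_mb)
```

Read this way, a full AD103 would carry 84 SMs, 10,752 FP32 units, and 224 ROPs, while the multipliers imply roughly 48 warps per SM and 6 MB of L2 on GA102, and 4 MB of L2 on GA103.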