AMD Provides First Look At Aldebaran “CDNA 2” Instinct MI200 Series MCM GPU Block Diagram
Feb 12, 2026 8:58 PM

AMD has offered further insight into its CDNA 2 "Aldebaran" GPU-powered Instinct MI200 series, the first to feature an MCM design. The Instinct MI200 GPUs were detailed by AMD architects Alan Smith and Norman James during Hot Chips 34.

AMD Provides First Look At Aldebaran "CDNA 2" Instinct MI200 Series GPU Block Diagram, First In HPC To Feature MCM Design

AMD is officially the first to bring an MCM GPU design to HPC, and it is doing so with a grand product: the Instinct MI200, codenamed Aldebaran. The AMD Aldebaran GPU will come in various forms and sizes, but all of them are based on the brand new CDNA 2 architecture, which is the most refined variation of Vega. Some of the main features, before we go into detail, are listed below:

AMD CDNA 2 architecture: 2nd Gen Matrix Cores accelerating FP64 and FP32 matrix operations, delivering up to 4X the peak theoretical FP64 performance vs. AMD previous-gen GPUs.

Leadership Packaging Technology: Industry-first multi-die GPU design with 2.5D Elevated Fanout Bridge (EFB) technology delivers 1.8X more cores and 2.7X higher memory bandwidth vs. AMD previous-gen GPUs, offering the industry's best aggregate peak theoretical memory bandwidth at 3.2 terabytes per second.

3rd Gen AMD Infinity Fabric technology: Up to 8 Infinity Fabric links connect the AMD Instinct MI200 with 3rd Gen EPYC CPUs and other GPUs in the node to enable unified CPU/GPU memory coherency and maximize system throughput, allowing for an easier on-ramp for CPU codes to tap the power of accelerators.

AMD Instinct MI200 GPU Die Shot:

Inside the AMD Instinct MI200 is an Aldebaran GPU featuring two dies, a primary and a secondary. Each die consists of 8 shader engines, for a total of 16 SEs. Each shader engine packs 14 CUs with full-rate FP64, packed FP32, and a 2nd Generation Matrix Engine for FP16 and BF16 operations. The whole GPU is fabricated on TSMC's 6nm process node and packs a total of 58 billion transistors.

AMD Instinct MI200 GPU Block Diagram:

Each die is therefore composed of 112 compute units or 7,168 stream processors. This adds up to a total of 224 compute units or 14,336 stream processors for the entire chip. The Aldebaran GPU is also powered by a new XGMI interconnect. Each chiplet features a VCN 2.6 engine and the main IO controller, along with four 1024-bit memory controllers for the HBM2e memory.
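The compute-unit totals follow from simple multiplication. Here is a minimal Python sketch of that arithmetic, assuming 64 stream processors per compute unit (implied by 7,168 / 112 rather than stated outright); the function and constant names are illustrative only:

```python
# Counts quoted in the article for one Aldebaran die.
SHADER_ENGINES_PER_DIE = 8
CUS_PER_SHADER_ENGINE = 14
# Assumption: 64 stream processors per CU, inferred from 7,168 / 112.
STREAM_PROCESSORS_PER_CU = 64
DIES_PER_PACKAGE = 2


def aldebaran_totals(dies=DIES_PER_PACKAGE):
    """Return (compute_units, stream_processors) for the given number of dies."""
    cus = dies * SHADER_ENGINES_PER_DIE * CUS_PER_SHADER_ENGINE
    return cus, cus * STREAM_PROCESSORS_PER_CU


print(aldebaran_totals(1))  # (112, 7168)
print(aldebaran_totals(2))  # (224, 14336)
```

These are the full-die counts from the block diagram; the core counts listed for shipping products in the specification table are somewhat lower.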

As for the cache, each GPU chiplet features a total of 8 MB of L2 capacity, which is physically partitioned into 32 slices. Each slice delivers 128B/CLK with enhanced queuing and arbitration plus enhanced atomic operations. The per-GCD memory subsystem includes 64 GB of HBM2e memory with an aggregate 1.6 TB/s of bandwidth, partitioned into 32 channels at 64B/CLK to allow an efficient operating voltage. The in-package interconnect provides 400 GB/s of bisection bandwidth across the two GCDs.
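To put the per-slice figure in perspective, the aggregate L2 bandwidth per GCD can be estimated from the slice count and a clock. This is only a rough sketch: the 1700 MHz clock is borrowed from the MI250X row of the specification table and is an assumption about the L2 clock domain, not a figure AMD quoted for the cache.

```python
L2_SLICES_PER_GCD = 32
BYTES_PER_CLK_PER_SLICE = 128
# Assumption: the L2 runs at the 1700 MHz engine clock listed for the MI250X.
ASSUMED_CLOCK_MHZ = 1700


def l2_bandwidth_tb_s(slices=L2_SLICES_PER_GCD,
                      bytes_per_clk=BYTES_PER_CLK_PER_SLICE,
                      clock_mhz=ASSUMED_CLOCK_MHZ):
    """Peak L2 bandwidth in TB/s: slices * bytes per clock * clocks per second."""
    bytes_per_second = slices * bytes_per_clk * clock_mhz * 1_000_000
    return bytes_per_second / 1e12


print(f"{l2_bandwidth_tb_s():.2f} TB/s per GCD at the assumed clock")
```

At a different clock the result scales linearly, so the function takes the clock as a parameter rather than hard-coding it.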

There are a total of 8 Infinity Fabric links, of which one on each GPU can be used as a PCI-Express interconnect. The interconnect is rated at a coherent CPU-GPU transfer rate of 144 GB/s. You can scale up to 500 GB/s using the external Infinity Fabric links with a total of four MI200 series GPUs, or scale out using a PCIe Gen 4 ESM AIC for 100 GB/s of bandwidth.

AMD Instinct MI200 "Aldebaran GPU" Performance Metrics:

In terms of performance, AMD is touting various record wins in the HPC segment over NVIDIA's A100 solution with up to 3x performance improvements in AMG.

As for DRAM, AMD has gone with eight 1024-bit interfaces for an 8192-bit wide bus. Each interface can support 2 GB HBM2e DRAM modules, which should give up to 16 GB of HBM2e capacity per stack, and since there are eight stacks in total, the total capacity is a whopping 128 GB. That's 48 GB more than the A100, which houses 80 GB of HBM2e memory. The memory clocks in at 3.2 Gbps for a total bandwidth of 3.2 TB/s, a whole 1.2 TB/s more than the A100 80 GB, which has 2 TB/s.
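The capacity and bandwidth figures can be checked with a few lines of arithmetic. The Python sketch below uses only numbers quoted in the paragraph above; the function names are illustrative, and bandwidth is computed as bus width times per-pin data rate divided by 8 bits per byte.

```python
STACKS = 8
BITS_PER_STACK = 1024
GB_PER_STACK = 16
PIN_RATE_GBPS = 3.2  # data rate per pin, in gigabits per second


def bus_width_bits(stacks=STACKS, bits_per_stack=BITS_PER_STACK):
    """Total memory bus width in bits."""
    return stacks * bits_per_stack


def capacity_gb(stacks=STACKS, gb_per_stack=GB_PER_STACK):
    """Total HBM2e capacity in GB."""
    return stacks * gb_per_stack


def bandwidth_gb_s(width_bits, pin_rate_gbps=PIN_RATE_GBPS):
    """Peak bandwidth in GB/s: each pin moves pin_rate_gbps gigabits per second."""
    return width_bits * pin_rate_gbps / 8


width = bus_width_bits()
print(width)                  # 8192
print(capacity_gb())          # 128
print(bandwidth_gb_s(width))  # about 3276.8 GB/s, quoted as 3.2 TB/s
```

Halving the stack count gives the per-die figures: a 4096-bit bus, 64 GB, and roughly 1.6 TB/s.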

The AMD Instinct MI200 CDNA 2 "Aldebaran" GPUs are already powering the world's fastest supercomputer, Frontier, which is also the world's first Exascale machine, offering 1.1 ExaFLOPs of compute horsepower and currently listed at the top of the TOP500 and Green500 lists. AMD has also unveiled its future plans for the Instinct MI300 APU lineup, which will further leverage the chiplet architecture and take things to the next level.

AMD Radeon Instinct Accelerators

| Accelerator Name | AMD Instinct MI400 | AMD Instinct MI300X | AMD Instinct MI300A | AMD Instinct MI250X | AMD Instinct MI250 | AMD Instinct MI210 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPU Architecture | Zen 5 (Exascale APU) | N/A | Zen 4 (Exascale APU) | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| GPU Architecture | CDNA 4 | Aqua Vanjaram (CDNA 3) | Aqua Vanjaram (CDNA 3) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | 4nm | 5nm+6nm | 5nm+6nm | 6nm | 6nm | 6nm | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Chiplets | TBD | 8 (MCM) | 8 (MCM) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 2 (MCM), 1 (Per Die) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | TBD | 19,456 | 14,592 | 14,080 | 13,312 | 6,656 | 7,680 | 4,096 | 3,840 | 4,096 | 4,096 | 2,304 |
| GPU Clock Speed | TBD | 2100 MHz | 2100 MHz | 1700 MHz | 1700 MHz | 1700 MHz | 1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| INT8 Compute | TBD | 2614 TOPS | 1961 TOPS | 383 TOPS | 362 TOPS | 181 TOPS | 92.3 TOPS | N/A | N/A | N/A | N/A | N/A |
| FP16 Compute | TBD | 1.3 PFLOPs | 980.6 TFLOPs | 383 TFLOPs | 362 TFLOPs | 181 TFLOPs | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBD | 163.4 TFLOPs | 122.6 TFLOPs | 95.7 TFLOPs | 90.5 TFLOPs | 45.3 TFLOPs | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBD | 81.7 TFLOPs | 61.3 TFLOPs | 47.9 TFLOPs | 45.3 TFLOPs | 22.6 TFLOPs | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBD | 192 GB HBM3 | 128 GB HBM3 | 128 GB HBM2e | 128 GB HBM2e | 64 GB HBM2e | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Infinity Cache | TBD | 256 MB | 256 MB | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Memory Clock | TBD | 5.2 Gbps | 5.2 Gbps | 3.2 Gbps | 3.2 Gbps | 3.2 Gbps | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBD | 8192-bit | 8192-bit | 8192-bit | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBD | 5.3 TB/s | 5.3 TB/s | 3.2 TB/s | 3.2 TB/s | 1.6 TB/s | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBD | OAM | APU SH5 Socket | OAM | OAM | Dual Slot Card | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBD | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP (Max) | TBD | 750W | 760W | 560W | 500W | 300W | 300W | 300W | 300W | 300W | 175W | 150W |
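The FP64 figures listed for the three CDNA 2 parts can be reproduced from the core counts and clock speeds in the same table. The sketch below assumes one fused multiply-add (counted as 2 FLOPs) per stream processor per clock at full-rate FP64. That per-clock factor is an assumption chosen because it matches the listed numbers, not something stated in the table.

```python
# (GPU cores, clock in MHz) copied from the table for the CDNA 2 parts.
CDNA2_PARTS = {
    "MI250X": (14080, 1700),
    "MI250": (13312, 1700),
    "MI210": (6656, 1700),
}
# Assumption: one FMA per core per clock, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CLK = 2


def peak_fp64_tflops(cores, clock_mhz):
    """Peak FP64 throughput in TFLOPs: cores * clock * FLOPs per clock."""
    return cores * clock_mhz * FLOPS_PER_CORE_PER_CLK / 1e6


for name, (cores, clock_mhz) in CDNA2_PARTS.items():
    print(name, round(peak_fp64_tflops(cores, clock_mhz), 1))
# MI250X 47.9
# MI250 45.3
# MI210 22.6
```

The FP32 column for these parts is twice the FP64 value, which is consistent with the packed FP32 support described earlier.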
