AMD Launching Full Range Of New Vega Radeon GPUs “Soon” – To Feature Both HBM2 & GDDR5/X Memory

AMD is reportedly preparing an entirely new top-to-bottom lineup of Radeon graphics cards based on its next generation Vega architecture. The new family, which I'll refer to as the Radeon 500 series from here on out for the sake of simplicity, will feature second generation high bandwidth memory as well as GDDR5/X memory.

A New Top-To-Bottom Range Of Radeon Graphics Cards Based On The Vega Architecture

According to Fudzilla, AMD will be rolling out its next generation Vega architecture across the entire range of its 2017 Radeon graphics cards, and it'll do it "soon". The new lineup will span from a top-end 4K 60FPS triple-A gaming Radeon graphics card, the very same one that was demoed last week, to mid-range and entry-level offerings for 1440p and 1080p gaming. The highest-end models will feature HBM2, whilst the mid-range and more budget-oriented cards will feature GDDR5/X memory.

HBM2 At The High-End, GDDR5/X In The Mid-Range And Entry-Level

AMD's Robert Hallock confirmed to Wccftech.com earlier this year that the GCN graphics architecture is compatible with both HBM and GDDR5 memory standards, so this doesn't come as a particular surprise to us. That's especially true considering the complexity of stacking the high bandwidth memory dies, as well as the additional cost of the interposer required to connect the memory to the GPU die.

Robert Hallock, Technical Marketing lead at AMD

“AMD helped lead the development of HBM, was the first to bring HBM to market in GPUs, and plans to implement HBM/HBM2 in future graphics solutions.

At this time we have only publicly demonstrated a GDDR5 configuration of the Polaris architecture. It’s important to understand that HBM isn’t (currently) suitable for all GPU segments due to the current HBM cost structure. In the mainstream GPU segment, GDDR5 remains an extremely cost-effective, efficient and viable memory technology.

We have the flexibility to use HBM or GDDR5 as costs require. Certain market segments are cost sensitive, GDDR5 can be used there. Higher-end market segments where more cost can be afforded, HBM is viable as well.”

Fudzilla further reports that there's no confirmation regarding whether AMD will be using standard GDDR5 memory or the faster GDDR5X for its mid-range and entry-level products.

We've already seen one upcoming Radeon graphics card based on Vega in action. The as-yet unreleased graphics card was demoed in a head-to-head comparison with Nvidia's GTX 1080. The demo Vega graphics card had 8GB of HBM2, and it outperformed the 1080 by 10% whilst running Doom in Vulkan at 4K.

The Vega Architecture - AMD's Clever Next Generation Compute Unit

One big announcement that AMD made in its recent press event where Vega was demoed is that the new architecture features what the company calls its NCU, short for Next Compute Unit. We had already detailed key parts of this new design in our exclusive piece about Vega 10 and Vega 11 a couple of months ago.

This new architecture holds several key advantages over its predecessor. Chief among them is that each SIMD inside a given Vega NCU is now capable of simultaneously processing variable length wavefronts. To the average person that sounds like a bunch of meaningless technical jargon; I know it did to me when I first learned about it. However, once you scratch the surface and truly understand what this means, you quickly begin to realize how much of a big deal this really is.

In AMD's current GCN implementation, each compute unit has four 16-wide vector SIMD units, capable of executing four 16-wide wavefronts (groups of threads) over four cycles, in addition to one scalar unit capable of executing one instruction per cycle. The scalar unit is delegated time-critical tasks, where the four-cycle turnaround of the SIMD units is simply not good enough.

Unfortunately, these 16-wide SIMD units work exactly the same no matter how small a wavefront they're fed. The SIMD unit has to spend four cycles executing whatever threads are presented to it, no matter what. This means that executing a 4-wide wavefront, for example, takes just as long as executing a 16-wide wavefront, leaving the other 12 ALUs inside the SIMD completely idle. Graphics workloads are inherently non-uniform, which means that it's effectively impossible to find any scenario where all 16-wide SIMD units would be fully occupied at any given time.
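To put numbers on the idle-ALU problem described above, here is a minimal Python sketch. It only restates the article's own figures (a 16-wide SIMD fed wavefronts of different widths); the function names are made up for illustration and it is not a model of real GCN hardware.

```python
SIMD_WIDTH = 16  # lanes (ALUs) per SIMD unit, as described in the article


def lane_utilization(wavefront_width, simd_width=SIMD_WIDTH):
    """Fraction of SIMD lanes doing useful work for a single wavefront."""
    if not 0 < wavefront_width <= simd_width:
        raise ValueError("wavefront must be between 1 and simd_width lanes wide")
    return wavefront_width / simd_width


def idle_lanes(wavefront_width, simd_width=SIMD_WIDTH):
    """Number of lanes left with nothing useful to do."""
    if not 0 < wavefront_width <= simd_width:
        raise ValueError("wavefront must be between 1 and simd_width lanes wide")
    return simd_width - wavefront_width


if __name__ == "__main__":
    for width in (16, 8, 4):
        print(f"{width:>2}-wide wavefront: "
              f"{lane_utilization(width):.0%} of lanes busy, "
              f"{idle_lanes(width)} idle")
```

The 4-wide case is the one from the text: 12 of the 16 ALUs sit idle while the SIMD still spends its full four cycles.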

Variable Width Wavefront SIMDs, Getting More Performance Out Of Fewer Cycles

This is no longer the case in AMD's new GCN implementation inside Vega. The V9 architecture includes clever new schedulers and coherency subsystems that allow several wavefronts, of different widths, to be executed simultaneously inside any compute unit that's able to accommodate the workload. The result is that more ALUs are doing useful work at any given time instead of idling or executing predicated-off threads that produce no results.

AMD Vega architecture

This in effect allows each NCU to finish considerably more work in the same amount of time compared to a traditional CU, while also freeing up valuable cache and memory resources for other compute units. Vega's Next Compute Units should therefore be not only faster but also more power efficient. By how much exactly remains to be seen, because it's very hard to predict what an improvement of this size in resource utilization and CU occupancy will yield, given how unpredictable and fluctuating graphics workloads inherently are.
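The difference between fixed-width and variable-width execution can be illustrated with a toy Python model. Everything here is an assumption made for illustration: AMD has not published how Vega's scheduler actually packs wavefronts, so the first-fit packing, the function names and the sample workload are all hypothetical. The only figures taken from the article are the 16-lane SIMD and the four-cycle pass.

```python
SIMD_WIDTH = 16      # lanes per SIMD, per the article
CYCLES_PER_PASS = 4  # the four-cycle turnaround the article mentions


def fixed_width_cycles(wavefront_widths):
    """Each wavefront occupies a whole SIMD pass, however narrow it is."""
    return len(wavefront_widths) * CYCLES_PER_PASS


def variable_width_cycles(wavefront_widths):
    """Hypothetical first-fit packing: wavefronts that fit together share a pass.

    This is NOT AMD's scheduler, just a simple stand-in to show the idea.
    """
    free_lanes = []  # lanes still unused in each pass opened so far
    for width in sorted(wavefront_widths, reverse=True):
        if not 0 < width <= SIMD_WIDTH:
            raise ValueError("wavefront must fit within one SIMD")
        for i, free in enumerate(free_lanes):
            if width <= free:
                free_lanes[i] -= width  # reuse idle lanes in an existing pass
                break
        else:
            free_lanes.append(SIMD_WIDTH - width)  # open a new pass
    return len(free_lanes) * CYCLES_PER_PASS


if __name__ == "__main__":
    workload = [16, 4, 8, 4, 12, 4]  # made-up wavefront widths
    print("fixed width:   ", fixed_width_cycles(workload), "cycles")
    print("variable width:", variable_width_cycles(workload), "cycles")
```

For this made-up workload the toy model needs 12 cycles instead of 24, because the six wavefronts add up to exactly three full 16-lane passes. Real gains would depend on the actual mix of wavefront widths and on scheduling constraints that this sketch ignores.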

AMD Vega Lineup

Graphics Card	Radeon R9 Fury X	Radeon RX 480	Radeon RX Vega Frontier Edition	Radeon Vega Pro	Radeon RX Vega (Gaming)	Radeon RX Vega Pro Duo
GPU	Fiji XT	Polaris 10	Vega 10	Vega 10	Vega 10	2x Vega 10
Process Node	28nm	14nm FinFET	FinFET	FinFET	FinFET	FinFET
Stream Processors	4096	2304	4096	3584	4096 (?)	Up to 8192
Performance	8.6 TFLOPS / 8.6 TFLOPS (FP16)	5.8 TFLOPS / 5.8 TFLOPS (FP16)	~13 TFLOPS / ~25 TFLOPS (FP16)	11 TFLOPS / 22 TFLOPS (FP16)	>13 TFLOPS / >25 TFLOPS (FP16)	TBA / TBA
Memory	4GB HBM	8GB GDDR5	16GB HBM2	TBA	TBA	TBA
Memory Bus	4096-bit	256-bit	2048-bit	2048-bit	2048-bit	4096-bit
Bandwidth	512GB/s	256GB/s	480GB/s	400GB/s	TBA	TBA
TDP	275W	150W	TBA	TBA	TBA	TBA
Launch	2015	2016	June 2017	June 2017	July 2017	TBA
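
As a sanity check on the table, the short Python sketch below derives two figures the table does not list: the per-pin memory data rate implied by each card's bus width and bandwidth, and the GPU clock implied by its stream processor count and single-precision TFLOPS. The clock calculation assumes two floating point operations per stream processor per clock (one fused multiply-add), and the Frontier Edition entry uses the table's approximate "~13" figure, so treat the outputs as rough estimates rather than official specifications.

```python
# (bus width in bits, bandwidth in GB/s, stream processors, FP32 TFLOPS),
# copied from the table above; cards with TBA entries are left out.
CARDS = {
    "Radeon R9 Fury X": (4096, 512, 4096, 8.6),
    "Radeon RX 480": (256, 256, 2304, 5.8),
    "Radeon RX Vega Frontier Edition": (2048, 480, 4096, 13.0),  # table says ~13
    "Radeon Vega Pro": (2048, 400, 3584, 11.0),
}


def per_pin_gbps(bus_bits, bandwidth_gb_s):
    """Data rate per memory pin in Gbit/s: bandwidth (GB/s) * 8 bits / bus width."""
    return bandwidth_gb_s * 8 / bus_bits


def implied_clock_mhz(stream_processors, tflops):
    """Clock implied by the TFLOPS figure.

    Assumes 2 FLOPs per stream processor per clock (one fused multiply-add).
    """
    return tflops * 1e12 / (2 * stream_processors) / 1e6


if __name__ == "__main__":
    for name, (bus, bandwidth, sps, tflops) in CARDS.items():
        print(f"{name}: {per_pin_gbps(bus, bandwidth):.3f} Gbps per pin, "
              f"~{implied_clock_mhz(sps, tflops):.0f} MHz implied clock")
```

The calculation shows why a narrow 256-bit GDDR5 bus needs a far higher per-pin rate than a wide HBM bus to reach comparable bandwidth, which is the trade-off behind the HBM2 versus GDDR5/X split discussed earlier.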