At CES 2016, NVIDIA's CEO, Jen-Hsun Huang, presented the latest Drive PX 2 board, which will be powered by the next-generation Pascal GPU architecture. Pascal will also power the next iteration of professional and consumer graphics cards, succeeding Maxwell and, as enthusiasts and PC builders anticipate, besting it in every possible way.

NVIDIA's Pascal GPU Analysis - What To Expect From NVIDIA's Next-Gen GPU Powerhouse
NVIDIA's Pascal GPUs are not launching any time soon, but we know quite a lot about them from previous reports. NVIDIA provided a bit more detail at their conference, so let's take a look at what Pascal is all about. In 2014, NVIDIA introduced Maxwell, their last architecture to use the 28nm process node. We had seen 28nm on the GPU market since 2012, when NVIDIA and AMD launched their first products based on the (then latest) process tech, codenamed Kepler and GCN (1.0) respectively.
The Race To FinFET - What It Means For The GPU Industry
Over the years, this process was refined and we got to see some beefy designs such as the GK110 and GM200 from NVIDIA and Hawaii and Fiji from AMD. With dies measuring up to 601 mm² (GM200) and integrating an insane number of transistors (8.9 billion on Fiji), the 28nm process proved to be the real deal for the graphics market, serving it for a good four years. But hardware and technology move at a fast pace, and GPU makers have long demanded a new node on which to build their next graphics chips.
With every generation of graphics cards that passes, we expect the successor to offer a large performance increase. When the industry shifted from 40nm to 28nm, we saw GPUs that were supposed to be aimed at mid-range offerings beating the big cores from the previous generation. The GTX 680, NVIDIA's first 28nm graphics card, obliterated the flagship GF110 core, featuring better performance and better power efficiency. The performance improvement was around 25% on a process that had just seen the light of day.
More than a year later, NVIDIA showed off just what kind of performance they had on their hands with the 28nm Kepler GPU. When the GTX 780 Ti launched, it held a performance lead of more than 50% over the GTX 580. This was the moment when the flagship Kepler core could be compared to the flagship Fermi core. It was known that NVIDIA had given priority to HPC for their compute-oriented Kepler cores, which was the sole reason why we got to see GK104 as a flagship offering in 2012 in the first place. By this time, however, GPU companies had fully mastered the 28nm node.
When Maxwell and Fiji graphics cards came to the market, we saw a shift to gaming-only products rather than professional/HPC-focused parts. The main reason for this shift was that both NVIDIA and AMD knew they had hit a wall with the 28nm process: they could either go for better performance in a single department (gaming) or split the design across two departments (gaming and compute). The latter would have resulted in worse efficiency and outrageously huge dies, which they would have had to sell at a fraction of their real cost to keep them competitive against their own offerings. The result was GM200 and Fiji.

Both GPUs are great, but they have something in common: they aren't armed with the strong compute hardware that their older-generation predecessors (Hawaii/GK110) had. While they were efficient, their performance increases weren't as big as the hardware updates they had received might suggest. The Titan X was 30% faster than the GTX 780 Ti, and the same could be said for the Fury X over the R9 290X. While we once saw the mid-range GTX 680 delivering a nice 25% lead over the GTX 580, the GTX 980 could only manage a 5-10% lead over the GTX 780 Ti. By that time, it was clear that the 28nm process had become a bottleneck and that GPU manufacturers needed a new node to experiment with and build next-generation graphics processors.
| GPU Architecture | NVIDIA Fermi | NVIDIA Kepler | NVIDIA Maxwell | NVIDIA Pascal |
|---|---|---|---|---|
| GPU Process | 40nm | 28nm | 28nm | 16nm (TSMC FinFET) |
| Flagship Chip | GF110 | GK210 | GM200 | GP100 |
| GPU Design | SM (Streaming Multiprocessor) | SMX (Streaming Multiprocessor) | SMM (Streaming Multiprocessor Maxwell) | SMP (Streaming Multiprocessor Pascal) |
| Maximum Transistors | 3.00 Billion | 7.08 Billion | 8.00 Billion | 15.3 Billion |
| Maximum Die Size | 520 mm² | 561 mm² | 601 mm² | 610 mm² |
| Stream Processors Per Compute Unit | 32 SPs | 192 SPs | 128 SPs | 64 SPs |
| Maximum CUDA Cores | 512 CCs (16 CUs) | 2880 CCs (15 CUs) | 3072 CCs (24 CUs) | 3840 CCs (60 CUs) |
| FP32 Compute | 1.33 TFLOPs (Tesla) | 5.10 TFLOPs (Tesla) | 6.10 TFLOPs (Tesla) | ~12 TFLOPs (Tesla) |
| FP64 Compute | 0.66 TFLOPs (Tesla) | 1.43 TFLOPs (Tesla) | 0.20 TFLOPs (Tesla) | ~6 TFLOPs (Tesla) |
| Maximum VRAM | 1.5 GB GDDR5 | 6 GB GDDR5 | 12 GB GDDR5 | 16 / 32 GB HBM2 |
| Maximum Bandwidth | 192 GB/s | 336 GB/s | 336 GB/s | 720 GB/s - 1 TB/s |
| Maximum TDP | 244W | 250W | 250W | 300W |
| Launch Year | 2010 (GTX 580) | 2014 (GTX Titan Black) | 2015 (GTX Titan X) | 2016 |
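
To put the table's numbers in perspective, the short Python sketch below derives two rough ratios per architecture: transistor density (millions of transistors per mm²) and FP32 throughput per watt. The figures are copied straight from the table; the Pascal FP32 value uses the table's approximate "~12 TFLOPs", and the dictionary and function names are purely illustrative. Since each row lists a generation's maximum, the values do not necessarily all belong to one product, so treat the output as a ballpark comparison rather than a measurement.

```python
# Figures copied from the table above (maximum values per generation).
# Pascal's FP32 number is the table's approximate "~12 TFLOPs".
SPECS = {
    "Fermi":   {"transistors_bn": 3.00, "die_mm2": 520, "fp32_tflops": 1.33, "tdp_w": 244},
    "Kepler":  {"transistors_bn": 7.08, "die_mm2": 561, "fp32_tflops": 5.10, "tdp_w": 250},
    "Maxwell": {"transistors_bn": 8.00, "die_mm2": 601, "fp32_tflops": 6.10, "tdp_w": 250},
    "Pascal":  {"transistors_bn": 15.3, "die_mm2": 610, "fp32_tflops": 12.0, "tdp_w": 300},
}


def density_mtr_per_mm2(spec):
    """Millions of transistors per square millimetre of die area."""
    return spec["transistors_bn"] * 1000 / spec["die_mm2"]


def gflops_per_watt(spec):
    """FP32 GFLOPs delivered per watt of TDP."""
    return spec["fp32_tflops"] * 1000 / spec["tdp_w"]


def summarize(specs=SPECS):
    """Return {architecture: (density, efficiency)}, rounded to one decimal."""
    return {
        name: (round(density_mtr_per_mm2(s), 1), round(gflops_per_watt(s), 1))
        for name, s in specs.items()
    }


if __name__ == "__main__":
    for name, (density, efficiency) in summarize().items():
        print(f"{name:8s} {density:5.1f} MTr/mm2  {efficiency:5.1f} GFLOPs/W")
```

By these figures, Kepler and Maxwell sit close together on density because both are built on 28nm, while the move to 16nm FinFET roughly doubles density at nearly the same die size.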









