The inevitable release of Nvidia’s next generation 16nm Pascal graphics cards with stacked high bandwidth memory is inching closer. Only three weeks ago we discovered four different Nvidia Pascal graphics cards being shipped across Nvidia's testing facilities. Today we're reporting on two more graphics boards: one that's entirely new and another that's an updated design that may be progressing towards full functionality soon.
[UPDATED 03/09/2016 04:17 PM EST] Two additional Nvidia Pascal graphics cards were spotted, with per-unit values of $1100 and $700. Additional details have been added after the sixth paragraph.

We reported two weeks ago that Nvidia is rumored to demo its next generation Pascal graphics cards at GTC in April, with a product launch at Computex in June. Whispers have reached us of Nvidia planning to showcase a Pascal graphics card at the show for the very first time.
A source claimed that this will take place on April 5th during Nvidia CEO Jen-Hsun Huang’s keynote. A Pascal graphics board will allegedly be showcased on stage. We were told that it’s not just going to be a prototype to visually demonstrate the form factor like last year, but an actual working Pascal graphics card.
Two New Nvidia Pascal Graphics Cards Spotted, Valued At $900 & $600 - Potential GTC Demo Units
Our two new Nvidia graphics boards are listed as "COMPUTER GRAPHICS CARDS" in Nvidia's shipping description. Both carry hefty per-unit values: the first is valued at $600 and the second at $900. So we're potentially looking at high-end graphics cards here. However, it's still important to note that these values don't always reflect actual product cost or selling price and can differ vastly from both.

Both boards' serial numbers start with the same 699 prefix. We've pointed out in a previous article that the earliest record of a board carrying that prefix appears in December. So we know that we're looking at Nvidia graphics boards that are new and did not exist at any point before December. This could potentially explain Pascal's absence from CES in January: if no Pascal graphics cards were ready at the time, that would account for Nvidia's decision to showcase the Pascal Drive PX2 module with Maxwell GPUs instead.
The two new boards are as follows:
| Serial Number | Value Per Unit |
|---|---|
| 699-1G610-0000-000 | $600 |
| 699-12914-0076-100 | $900 |
The second entry is one that we've seen before, albeit with a slightly different serial number. The card we had seen earlier was shipped in February with the serial number 699-12914-0071-100 and carried a significantly lower value of $500, versus $900 for the new iteration. This suggests that, in all likelihood, more components have been added to the board and it's inching closer to full operational capacity, hopefully in time for the upcoming GPU Technology Conference in April.
________________________________
[UPDATED 03/09/2016 04:17 PM EST]

After digging a little bit deeper, we've spotted two additional boards. These graphics cards were shipped late last month and carry the following serial numbers:
699-1G411-0000-000
699-12914-0000-100
The first graphics card has actually been spotted once before and carried a lower per-unit value of $600. The second graphics card is the third that we've seen with the 12914 serial number. However, all three still had unique four-digit strings and are likely variations of the same unit. These boards have also appeared under three completely different per-unit values, from the initial listing at $600 to its second appearance at $1100 and last quote at $700. Again, we would like to remind everyone that these value figures have little relevance to actual product pricing and are quoted for insurance purposes. Their wide variability makes it incredibly difficult to draw any solid conclusions about actual cost.
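For readers who want to follow the serial number comparison themselves, here is a minimal Python sketch that splits the part numbers quoted in this article into their four dash-separated groups and collects the four-digit strings seen for each board. The field names (`prefix`, `board`, `variant`, `suffix`) are our own hypothetical labels for illustration; Nvidia has not documented what each group means.

```python
from collections import defaultdict

# Part numbers quoted in this article, in the order they are discussed.
PART_NUMBERS = [
    "699-1G610-0000-000",
    "699-12914-0076-100",
    "699-12914-0071-100",
    "699-1G411-0000-000",
    "699-12914-0000-100",
]


def split_part_number(part_number):
    """Split a part number into four groups (labels are hypothetical)."""
    prefix, board, variant, suffix = part_number.split("-")
    return {"prefix": prefix, "board": board, "variant": variant, "suffix": suffix}


def variants_by_board(part_numbers):
    """Map each board group to the list of four-digit strings seen with it."""
    groups = defaultdict(list)
    for part_number in part_numbers:
        fields = split_part_number(part_number)
        groups[fields["board"]].append(fields["variant"])
    return dict(groups)


boards = variants_by_board(PART_NUMBERS)
prefixes = {split_part_number(p)["prefix"] for p in PART_NUMBERS}

for board, variants in boards.items():
    print(board, variants)
print("prefixes:", prefixes)
```

Grouping this way makes the pattern easy to see: every listing shares the 699 prefix, and the 12914 board shows up with three distinct four-digit strings.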
________________________________
Could These Nvidia Pascal Graphics Cards Be GTC Demo Units?
Interestingly, all of Nvidia’s scheduled talks at GTC are numbered with one or two letters followed by the digit six. That is, they all follow the pattern X6###, where X is one or two letters, six is constant and each # is a variable digit. Keeping this in mind, the opening keynote of Nvidia’s CEO is given the number 699.

As it happens, all six Nvidia graphics cards that have appeared in shipping records carried this very same serial number prefix, matching that of Jen-Hsun’s keynote. These digits could be a code name for Pascal inside Nvidia, which is why we're seeing them on these graphics cards and Jen-Hsun's keynote. Whatever they actually stand for, we've seen them enough times to know that it's not a coincidence.
There's no way of knowing for certain whether these are GP100 or GP104 boards as of yet. Interestingly, GP100, or “Big Pascal” as we like to call it, was spotted a few months back. Back then Nvidia only had GPUs and there was no evidence of any actual boards. So it looks like Pascal has come a long way since then.
What we know so far about Nvidia's flagship Pascal GP100 GPU:
- Pascal graphics architecture.
- 2x performance per watt estimated improvement over Maxwell.
- To launch in 2016, purportedly in the second half of the year.
- DirectX 12 feature level 12_1 or higher.
- Successor to the GM200 GPU found in the GTX Titan X and GTX 980 Ti.
- Built on the 16nm FinFET manufacturing process from TSMC.
- Allegedly has a total of 17 billion transistors, more than twice that of GM200.
- Will feature four 4-Hi HBM2 stacks for a total of 16GB of VRAM, and 8-Hi stacks for up to 32GB for the professional compute SKUs.
- Features a 4096-bit memory bus interface, same as AMD's Fiji GPU powering the Fury series.
- Features NVLink (only compatible with next generation IBM PowerPC server processors).
- Supports half precision FP16 compute at twice the rate of full precision FP32.
| GPU Architecture | NVIDIA Fermi | NVIDIA Kepler | NVIDIA Maxwell | NVIDIA Pascal |
|---|---|---|---|---|
| GPU Process | 40nm | 28nm | 28nm | 16nm (TSMC FinFET) |
| Flagship Chip | GF110 | GK210 | GM200 | GP100 |
| GPU Design | SM (Streaming Multiprocessor) | SMX (Streaming Multiprocessor) | SMM (Streaming Multiprocessor Maxwell) | SMP (Streaming Multiprocessor Pascal) |
| Maximum Transistors | 3.00 Billion | 7.08 Billion | 8.00 Billion | 15.3 Billion |
| Maximum Die Size | 520mm2 | 561mm2 | 601mm2 | 610mm2 |
| Stream Processors Per Compute Unit | 32 SPs | 192 SPs | 128 SPs | 64 SPs |
| Maximum CUDA Cores | 512 CCs (16 CUs) | 2880 CCs (15 CUs) | 3072 CCs (24 CUs) | 3840 CCs (60 CUs) |
| FP32 Compute | 1.33 TFLOPs (Tesla) | 5.10 TFLOPs (Tesla) | 6.10 TFLOPs (Tesla) | ~12 TFLOPs (Tesla) |
| FP64 Compute | 0.66 TFLOPs (Tesla) | 1.43 TFLOPs (Tesla) | 0.20 TFLOPs (Tesla) | ~6 TFLOPs (Tesla) |
| Maximum VRAM | 1.5 GB GDDR5 | 6 GB GDDR5 | 12 GB GDDR5 | 16 / 32 GB HBM2 |
| Maximum Bandwidth | 192 GB/s | 336 GB/s | 336 GB/s | 720 GB/s - 1 TB/s |
| Maximum TDP | 244W | 250W | 250W | 300W |
| Launch Year | 2010 (GTX 580) | 2014 (GTX Titan Black) | 2015 (GTX Titan X) | 2016 |
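The memory figures above hang together arithmetically, and a quick back-of-the-envelope sketch shows how. Each HBM stack exposes a 1024-bit interface, so four stacks give the 4096-bit bus. The per-pin data rates used below (1.4 and 2.0 Gbit/s) and the 8 Gbit die density are our assumptions, picked because they roughly bracket the 720 GB/s to 1 TB/s range and the 16 / 32 GB capacities in the table; they are not confirmed GP100 specifications.

```python
# Back-of-the-envelope HBM2 arithmetic for the rumored GP100 configuration.
# Pin rates and die density are assumptions for illustration only.

BITS_PER_STACK = 1024  # interface width of a single HBM stack
GBIT_PER_DIE = 8       # assumed HBM2 die density (8 Gbit = 1 GB per die)


def bus_width_bits(stacks):
    """Total memory bus width across all stacks."""
    return stacks * BITS_PER_STACK


def bandwidth_gb_per_s(stacks, gbit_per_s_per_pin):
    """Peak bandwidth in GB/s: bus width times per-pin rate, bits to bytes."""
    return bus_width_bits(stacks) * gbit_per_s_per_pin / 8


def capacity_gb(stacks, dies_per_stack):
    """Total capacity in GB for a given stack height (4-Hi, 8-Hi)."""
    return stacks * dies_per_stack * GBIT_PER_DIE // 8


width = bus_width_bits(4)
low = bandwidth_gb_per_s(4, 1.4)   # assumed lower pin rate
high = bandwidth_gb_per_s(4, 2.0)  # assumed upper pin rate
consumer = capacity_gb(4, 4)       # four 4-Hi stacks
compute = capacity_gb(4, 8)        # four 8-Hi stacks

print(f"bus width: {width} bits")
print(f"bandwidth: {low:.1f} to {high:.1f} GB/s")
print(f"capacity: {consumer} GB (4-Hi), {compute} GB (8-Hi)")
```

Under these assumptions the bus width, capacities and bandwidth range all line up with the rumored figures, which is at least consistent with a four-stack HBM2 design.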
We learned last year that Nvidia’s flagship Pascal GPU, code named GP100, may have taped out on TSMC’s 16nm FinFET manufacturing process in June. Interestingly, shortly afterwards AMD announced that it had taped out two FinFET chips. It’s absolutely not a coincidence that both companies completed their FinFET designs at the same time. Both are pushing a very aggressive time-to-market schedule to debut their next generation FinFET based GPUs this year.
Word On The Street Is That We Might See The First Pascal Graphics Cards Launch In June - Mobility Versions To Come First
This one comes directly from sweclockers.com, where the site has claimed on two occasions over the past few weeks that Nvidia is planning to launch its very first lineup of Pascal graphics cards around Computex in June. This launch will specifically be for the mobility lineup going into gaming notebooks. Sweclockers makes no mention of when we should expect desktop Pascal graphics cards, but the site goes on to claim that Nvidia is facing challenges bringing Pascal up to speed on TSMC's 16nm FinFET process, which they say may throw a wrench in the plans and result in postponement.
The plan to introduce the mobility lineup in mid June has reportedly been set in motion but could face delays owing to the ambiguity of Pascal's readiness. As such, the probability of a paper launch at Computex, or of the launch being postponed entirely to a later date, is described as being "great", the site reports.
Our take is that the reports of Nvidia wanting to launch its chips on the mobile side first are likely grounded in reality. The company will want to deliver mobile Pascal products in time for the OEMs' product refresh cycle, before they roll out new products for the back to school season which spans July to September.
To a great extent, a similar limitation does not exist for desktop PCs, for a variety of reasons. For one, the AIB market commands the lion's share of the desktop graphics market. Additionally, OEMs have much greater flexibility switching out graphics cards in their desktop products. This means that we might be looking at market availability of desktop Pascal graphics cards around Q3 to Q4 of this year.
Nvidia's Pascal: Everything We Know Right Now
Nvidia Pascal - 2X Perf/Watt, Stacked Memory, NV-Link And Mixed Precision Compute
TSMC’s new 16nm FinFET process promises to be significantly more power efficient than planar 28nm. It also promises to bring about a considerable improvement in transistor density, which would enable Nvidia to build faster, significantly more complex and more power efficient GPUs.
TSMC’s 16FF+ (FinFET Plus) technology can provide above 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. Comparing with 20SoC technology, 16FF+ provides extra 40% higher speed and 60% power saving. By leveraging the experience of 20SoC technology, TSMC 16FF+ shares the same metal backend process in order to quickly improve yield and demonstrate process maturity for time-to-market value.
Apart from HBM2 and 16nm, there is one big compute-centric feature that Nvidia will debut with Pascal: NVLink. Pascal will be the first GPU from the company to support this new proprietary server interconnect.
NVIDIA Volta GPUs and IBM Power9 CPUs Enabled Supercomputers in 2017:
The technology targets GPU accelerated servers where cross-chip communication is extremely bandwidth limited and a major system bottleneck. Nvidia states that NVLink will be 5 to 12 times faster than traditional PCIe 3.0, making it a major step forward in platform atomics. Earlier this year Nvidia announced that IBM will be integrating this new interconnect into its upcoming PowerPC server CPUs. NVLink will debut with Nvidia’s Pascal in 2016 before it makes its way to Volta in 2018.

NVLink is an energy-efficient, high-bandwidth communications channel that uses up to three times less energy to move data on the node at speeds 5-12 times conventional PCIe Gen3 x16. First available in the NVIDIA Pascal GPU architecture, NVLink enables fast communication between the CPU and the GPU, or between multiple GPUs.
Figure 3: NVLink is a key building block in the compute node of Summit and Sierra supercomputers.
Slide: Volta GPU featuring NVLink and stacked memory. NVLink GPU high-speed interconnect: 80-200 GB/s. 3D stacked memory: 4x higher bandwidth (~1 TB/s), 3x larger capacity, 4x more energy efficient per bit.
NVLink is a key technology in Summit’s and Sierra’s server node architecture, enabling IBM POWER CPUs and NVIDIA GPUs to access each other’s memory fast and seamlessly. From a programmer’s perspective, NVLink erases the visible distinctions of data separately attached to the CPU and the GPU by “merging” the memory systems of the CPU and the GPU with a high-speed interconnect. Because both CPU and GPU have their own memory controllers, the underlying memory systems can be optimized differently (the GPU’s for bandwidth, the CPU’s for latency) while still presenting as a unified memory system to both processors. NVLink offers two distinct benefits for HPC customers. First, it delivers improved application performance, simply by virtue of greatly increased bandwidth between elements of the node. Second, NVLink with Unified Memory technology allows developers to write code much more seamlessly and still achieve high performance. via NVIDIA News
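The "5 to 12 times PCIe" claim and the 80-200 GB/s figure quoted above can be sanity-checked against each other with a few lines of arithmetic. The sketch below assumes the comparison is against one direction of a PCIe 3.0 x16 link (8 GT/s per lane with 128b/130b encoding); Nvidia's material does not state which basis it used, so treat that as our assumption.

```python
# Sanity check: does 80-200 GB/s match "5 to 12 times PCIe Gen3 x16"?
# Assumption: the baseline is one direction of a PCIe 3.0 x16 link.

PCIE3_TRANSFERS_PER_S = 8e9      # 8 GT/s per lane
PCIE3_ENCODING = 128 / 130       # 128b/130b line encoding efficiency
PCIE3_LANES = 16


def pcie3_x16_gb_per_s():
    """Usable one-direction bandwidth of a PCIe 3.0 x16 link in GB/s."""
    bits_per_s = PCIE3_TRANSFERS_PER_S * PCIE3_LANES * PCIE3_ENCODING
    return bits_per_s / 8 / 1e9


def speedup_over_pcie(nvlink_gb_per_s):
    """Ratio of a quoted NVLink bandwidth to the PCIe baseline."""
    return nvlink_gb_per_s / pcie3_x16_gb_per_s()


baseline = pcie3_x16_gb_per_s()
low_ratio = speedup_over_pcie(80)    # lower bound quoted on the slide
high_ratio = speedup_over_pcie(200)  # upper bound quoted on the slide

print(f"PCIe 3.0 x16: {baseline:.2f} GB/s per direction")
print(f"NVLink speedup: {low_ratio:.1f}x to {high_ratio:.1f}x")
```

With that baseline the quoted bandwidth range works out to roughly five to a little under thirteen times PCIe, which is in line with the multiplier Nvidia advertises.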
Unlike with Maxwell, Nvidia has placed a major focus on compute and GPGPU acceleration with Pascal. The slew of new features and technologies that Nvidia will debut with Pascal emphasizes this focus, including the use of next generation stacked High Bandwidth Memory, the high-speed NVLink GPU interconnect and the addition of mixed precision compute at double the rate of full precision compute to push perf/watt. We can’t wait to see Pascal in action later this year, but until then stay tuned for the latest.
| GPU Family | Vega | NVIDIA Pascal |
|---|---|---|
| Flagship GPU | Vega 10 | GP102 |
| GPU Process | 14nm FinFET | 16nm FinFET |
| GPU Transistors | Up To 18 Billion | 12 Billion |
| Memory | Up to 16 GB HBM2 | 12GB GDDR5X |
| Bandwidth | 512 GB/s | 480 GB/s |
| Graphics Architecture | Vega (NCU) | Pascal |
| Predecessor | Fiji (Fury Series) | GM200 (900 Series) |