Nvidia Pascal Announcement

IMO, another lackluster presentation by NVIDIA, with extremely vague announcements and, worse, no examples of performance with and without asynchronous compute, which probably means that NVIDIA indeed does not have a proper implementation of it. Performance numbers were thrown around in a very random fashion, and most details about the chips were not presented at all. Then the "Founders Edition" novelty silliness sounds like a PR way of hiding what may be low yields. It all just seemed sad and artificial, like something is seriously wrong and they know it's a matter of time until it is discovered. At least that is my impression of it (and of the GP100 launch).
There were previous rumors about Pascal being severely lacking in async compute:
http://www.bitsandchips.it/52-engli...scal-in-trouble-with-asyncronous-compute-code
 
In the press release, they're touting "New asynchronous compute advances improve efficiency and gaming performance."
Rather curiously, it's placed under the "Super Craftsmanship" paragraph.

Exactly. The whole thing was not consistent at all. That sounds more like a checkbox sentence because they "must" have it, just as they said Maxwell has it... with limitations. If they could really do it well, they would have shown it; being the marketing monster that they are, they would not have let the opportunity go to waste, just as they didn't with Fermi and tessellation.
 
SLI bridges again...
What about them? You put it in when you insert the card into your system, then it sits there until you take the card out. There's no problem here.

Besides, bespoke bridges between GPUs avoid a round trip through the PCIe switching hub (which sits on the CPU these days), so framebuffer traffic doesn't bog down the I/O subsystem. This matters even more on Intel socket 11xx systems, where SLI effectively drops each card to PCIe x8, halving the available bandwidth.
 
That was for creating multiple viewports though, not single-pass rendering.
No difference, it's the same thing.

The only new adjustment to what they now call "single pass rendering" is subdividing a single viewport as well. That mostly yields additional savings by letting you choose the transformation matrices according to the required lens correction right from the start, so you don't waste time rendering the borders at a resolution that is discarded by the lens correction anyway. All post-processing is then applied only to a set of g-buffers which, in terms of distortion, is already close to the final output.

That's probably the actual reason for the performance improvement: not doing it "in a single pass", but essentially having better-functioning LOD.

As long as you aren't limited by GS/VS throughput (OK, that actually WAS the case in the demo shown!), you should be able to achieve the same (or at least a similar) speedup just by scissoring manually across multiple draw calls (sketched below).

I don't think Nvidia does it in software, but you certainly can, with only minor disadvantages.
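To make that manual alternative concrete, here is a minimal OpenGL sketch of the scissored multi-draw-call approach. It assumes an existing GL context plus GLEW and GLM; the uniform name `uProj`, the `ViewRegion` struct, and the `drawScene()` callback are all hypothetical, just there to show the shape of the loop:

```cpp
// Sketch: emulate "single pass" multi-projection by scissoring manually,
// one draw call per screen region, each with its own projection matrix.
#include <GL/glew.h>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>
#include <vector>

struct ViewRegion {
    GLint x, y;
    GLsizei w, h;
    glm::mat4 proj;   // per-region projection, e.g. matched to lens distortion
};

void renderRegions(GLuint program, const std::vector<ViewRegion>& regions,
                   void (*drawScene)()) {
    glUseProgram(program);
    GLint projLoc = glGetUniformLocation(program, "uProj");
    glEnable(GL_SCISSOR_TEST);
    for (const ViewRegion& r : regions) {
        // Restrict rasterization to this region only...
        glViewport(r.x, r.y, r.w, r.h);
        glScissor(r.x, r.y, r.w, r.h);
        // ...and use a projection tuned for it, so the heavily warped
        // borders are never shaded at full resolution.
        glUniformMatrix4fv(projLoc, 1, GL_FALSE, glm::value_ptr(r.proj));
        drawScene();  // geometry is re-submitted once per region
    }
    glDisable(GL_SCISSOR_TEST);
}
```

The catch is right there in the loop: the geometry is processed once per region, which is exactly the GS/VS throughput limit mentioned above. Hardware multi-projection avoids that resubmission.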
 
What's the delivered bandwidth? Last time I checked it was 1-1.1 Gbps. In AFR/SFR (and any other linked-adapter mode) you need to share resources, not only the back buffer. It looks more like a tech loophole to avoid exposing SLI tech to motherboard chipset manufacturers (i.e., AMD and Intel) and asking for royalties. On multi-GPU systems bandwidth is the major issue, even with the explicit adapter control allowed by low-overhead/close-to-metal APIs.
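As a rough sanity check of how tight those figures are, here is a back-of-the-envelope calculation of what shipping just the finished frames would demand in 2-way AFR. The only assumptions are a 32-bit back buffer and that the link carries every other frame; shared resources would come on top of this:

```cpp
// Back-of-the-envelope: bandwidth needed just to move finished frames
// from the secondary GPU to the display GPU in 2-way AFR.
#include <cstdio>

int main() {
    struct Mode { const char* name; double w, h, fps; };
    const Mode modes[] = {
        {"1080p60", 1920, 1080, 60},
        {"1440p60", 2560, 1440, 60},
        {"4K60",    3840, 2160, 60},
    };
    const double bytesPerPixel = 4.0;  // 8-bit RGBA back buffer
    for (const Mode& m : modes) {
        // In 2-way AFR the link carries every other frame.
        double gbps = m.w * m.h * bytesPerPixel * (m.fps / 2.0) * 8.0 / 1e9;
        std::printf("%-8s needs ~%.1f Gbps over the link\n", m.name, gbps);
    }
}
```

By this arithmetic even 1080p60 AFR needs roughly 2 Gbps for back-buffer traffic alone, and 4K60 around 8 Gbps, which supports the point that bandwidth is the major issue.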
 

I have looked at both presentations (Maxwell era and now) and they do seem subtly different.
The Multi-Projection Acceleration seemed to relate to creating viewports in VR that enable different levels of detail for a specific "window" within a single rendering pass, rather than single-pass rendering of the left and right displays.
To me it seemed MPA was designed to overcome some issues with the Maxwell VR capability.
With Pascal they also specifically presented Lens Matched Shading as part of Single Pass Rendering, 1 hr 4 min into the presentation.

Just curious, was Maxwell able to environment-correct three independent screens (wrap-around/surround graphics) in a single pass?
This is what Single Pass Rendering ties into as well with Pascal; around 53-55 min into the presentation.
Hence why I am curious how it is implemented, and whether NVIDIA is introducing subtle differences in this presentation compared to the previous VR/Maxwell material.

Edit:

Ah, in Maxwell something similar may have been the Viewport Multicast; I'm not sure how well it compares, and it's something that, fingers crossed, reviewers will check out.
Cheers
 
To be fair, yesterday's disclosure was aimed at the gamer audience of Dreamhack. That's not the kind of venue where you give a lot of technical detail; you concentrate more on the wow factor. If the tech briefings stay as superficial as yesterday's live stream then, I concur, there's something fishy.
 

True, but the press was also teased and invited with the Power of 10 enigma. And, as far as I know, it's not like the press had access to much more information either. What would be the problem with giving the press packs with more information while still putting on the pony show? I guess we will see, but ever since the way the GP100 presentation was handled, I have felt something is rotten in the state of Denmark...
 
OK, I have tried to view Maxwell and Pascal from an equal perspective, but I cannot.
Even Maxwell's Viewport Multicast has constraints and focuses on viewports in the context mentioned earlier: voxelization, cube maps, and variable-resolution shadows/detail within the same window/display.

Cheers
 
Yes, there were two slides. But the numbers Huang babbled over and over again during the last five minutes were only "2x performance, 3x efficiency".
Of course that's what he babbled over and over, because it makes it look so much better. Earlier in the show, a slide showed some 5-10% better efficiency than Maxwell, depending on power draw.
 
Close to nothing has been said about the 1070's performance, and I sincerely doubt it'll match a Titan X, much less be faster.
$370 does sound great for something (that might be) close to the 980 Ti's performance at a smaller TDP, though. With this, Polaris 10 might get pushed below $299.
Does it though? We know even the slowest Polaris 10 should be at 970/R9 290 level for VR. Then there are rumors of 980 Ti-like performance in 3DMark for the top Polaris 10. That's around a 40% difference between the two data points; not impossible to cover with one chip if you make three or more models from it (which is likely, since they only have two chips).

1070 should be around 980 Ti performance, too.
 
Roy recently said that their Polaris cards are not comparable to the 1070 and 1080 because those are high end.....
Make of that what you will.
Cheers
 

You're probably overthinking it. All of those use cases rely on the same ability to project the same scene geometry to multiple viewports in a single pass.
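For the curious, here is roughly what that shared ability looks like in stock OpenGL terms: a sketch of a geometry shader (nothing vendor-specific; the uniform name `uViewProj` is made up) that replicates every triangle to four viewports in one geometry pass. As I understand it, the vendor's viewport multicast and fast-passthrough hardware essentially make this replication cheap instead of costing a full GS invocation per view:

```cpp
// GLSL geometry shader (as a C++ string) that broadcasts each triangle to
// four viewports in a single pass, using GS instancing and gl_ViewportIndex.
// Assumes the vertex shader passes positions through untransformed.
static const char* kMulticastGS = R"(
#version 410 core
layout(triangles, invocations = 4) in;      // one GS invocation per view
layout(triangle_strip, max_vertices = 3) out;

uniform mat4 uViewProj[4];                  // one matrix per viewport

void main() {
    for (int i = 0; i < 3; ++i) {
        gl_Position = uViewProj[gl_InvocationID] * gl_in[i].gl_Position;
        gl_ViewportIndex = gl_InvocationID; // route to viewport 0..3
        EmitVertex();
    }
    EndPrimitive();
}
)";
```

Set four viewport rectangles with glViewportIndexedf and every draw call lands in all four views at once; that same mechanism underlies the voxelization, cube-map, and VR use cases listed above.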
 
Other than being clumsy to install, a dedicated bridge should have nothing but benefits compared to PCIe as long as the bandwidth is available. And Pascal seems to double the available bandwidth, so I don't see the problem.
So... 2.0-2.2 Gbps? You do know PCIe 3.0 has almost the same bandwidth per lane, right?
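For reference, the PCIe 3.0 per-lane figure works out as follows (standard public signaling rate and encoding, nothing vendor-specific):

```cpp
// PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding,
// so usable per-lane bandwidth stays close to the 8 Gbps raw rate.
#include <cstdio>

int main() {
    const double gtPerSec    = 8.0;           // PCIe 3.0: 8 giga-transfers/s per lane
    const double encoding    = 128.0 / 130.0; // 128b/130b line-code overhead
    const double gbpsPerLane = gtPerSec * encoding;
    std::printf("PCIe 3.0: ~%.2f Gbps per lane (~%.2f GB/s)\n",
                gbpsPerLane, gbpsPerLane / 8.0);
}
```

That comes to roughly 7.9 Gbps (about 1 GB/s) per lane, which is the point of comparison against the bridge figures quoted above.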
 
Possibly, but all of this stems from my original question, which pertained to single-pass rendering for multiple displays, whether that means the left+right lenses, three displays, etc.

I have gone through the previous VRWorks and Maxwell presentations and none of them showed that capability; their focus was on a single window/display.
Now with Pascal they are showing the ability to do this both for multiple displays (so you can wrap three monitors and have the 3D environment corrected) and for VR.
I appreciate I could be wrong with regard to what the presentation showed at 55 min and 1 hr 4 min, but I have not seen any information from NVIDIA before Pascal to suggest otherwise.
Anyone with NVIDIA Maxwell-VR papers saying otherwise?

Edit:
In some ways it could be deemed serious, as that would mean 30 minutes of the presentation were meaningless, and it should be criticised for misrepresenting the technology implemented.

Cheers
 