Nvidia Pascal Announcement

It's changed; it's definitely active now on Maxwell 2 cards. I don't know if it's the same path as on AMD, but it's turned on by default.
 
It's changed; it's definitely active now on Maxwell 2 cards. I don't know if it's the same path as on AMD, but it's turned on by default.
Yes, but I am talking exclusively about AoTS, in the context of the previous posts about the current benchmark :)
Ext3H also raises a point I had not considered: not only does it potentially mean multiple DX12 rendering paths associated with async compute, but also more complexity in NVIDIA's driver when it comes to DX12.
In an ideal world it will all sync up nicely, but NVIDIA is not heavily engaged with every game project at a low programming level.

If you are talking about AoTS, do you remember where you saw them removing that DX12 path to disable async compute for NVIDIA?

Cheers
 
Actually, it's not just software. You need a hardware implementation (the way you encode the colors, the brightness, etc.). AMD has practical support for HDR in Fury and, I think (not sure), in the 380 (games and pictures, but not videos, since the standards for HDR video weren't complete at the time).

Robert Hallock stated it would become available on current 300-series graphics cards, not just the 380.
 
If you are talking about AoTS, do you remember where you saw them removing that DX12 path to disable async compute for NVIDIA?

Cheers
They didn't remove the DX12 path; they only removed async compute from NVIDIA's DX12 path at one point.
 
They didn't remove the DX12 path; they only removed async compute from NVIDIA's DX12 path at one point.
I am specifically talking about it as it was in my OP and the DX12 rendering path...
All my posts are in that context, as can be seen by the way I am quoting Kollock...
Sorry if I shortened it to just 'path' a few posts later out of laziness, expecting it to be understood in the context of the previous posts :)

Nov 2015 said:
Personally, I think one could just as easily make the claim that we were biased toward Nvidia as the only 'vendor' specific code is for Nvidia where we had to shutdown async compute. By vendor specific, I mean a case where we look at the Vendor ID and make changes to our rendering path. Curiously, their driver reported this feature was functional but attempting to use it was an unmitigated disaster in terms of performance and conformance so we shut it down on their hardware. As far as I know, Maxwell doesn't really have Async Compute so I don't know why their driver was trying to expose that. The only other thing that is different between them is that Nvidia does fall into Tier 2 class binding hardware instead of Tier 3 like AMD which requires a little bit more CPU overhead in D3D12, but I don't think it ended up being very significant. This isn't a vendor specific path, as it's responding to capabilities the driver reports.

From our perspective, one of the surprising things about the results is just how good Nvidia's DX11 perf is. But that's a very recent development, with huge CPU perf improvements over the last month. Still, DX12 CPU overhead is still far far better on Nvidia, and we haven't even tuned it as much as DX11.

Mid Feb 2016 said:
Async compute is currently forcibly disabled on public builds of Ashes for NV hardware. Whatever performance changes you are seeing driver to driver doesn't have anything to do with async compute.
I can confirm that the latest shipping DX12 drivers from NV do support async compute. You'd have to ask NV how specifically it is implemented.
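
In code terms, the vendor gating Kollock describes in the first quote would boil down to something like this (a purely hypothetical sketch of checking the adapter's PCI vendor ID and deciding whether to enable the async compute submission path; this is not Oxide's actual code):

```cpp
#include <windows.h>
#include <dxgi.h>
#include <cstdio>
#pragma comment(lib, "dxgi.lib")

// Hypothetical illustration: read each adapter's PCI vendor ID and decide
// whether to enable an async-compute submission path for it.
int main() {
    IDXGIFactory1* factory = nullptr;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    IDXGIAdapter1* adapter = nullptr;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);

        // 0x10DE = NVIDIA, 0x1002 = AMD (standard PCI vendor IDs).
        const bool useAsyncCompute = (desc.VendorId != 0x10DE);
        wprintf(L"%s (vendor 0x%04X): async compute path %s\n",
                desc.Description, desc.VendorId,
                useAsyncCompute ? L"enabled" : L"disabled");
        adapter->Release();
    }
    factory->Release();
    return 0;
}
```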

This sort of reminds me of arguing about the FireStrike results :)
Cheers
 
AFAIK all current GPUs can output r10g10b10a2 and r11g11b10... And since there is no fixed and well-defined standard, as MS suggested at the last GDC conference, you can do it all manually...

edit: the only issue is DisplayPort/HDMI support...
 
AFAIK all current GPUs can output r10g10b10a2 and r11g11b10... And since there is no fixed and well-defined standard, as MS suggested at the last GDC conference, you can do it all manually...

edit: the only issue is DisplayPort/HDMI support...
NVIDIA had, at least at some point, disabled 10-bit support on GeForce cards while it was enabled on Quadros.
 
I've seen some links about Nvidia's HDR support, but they only say it exists, not which standard it uses, if any. And they only mention it for pictures and games, not for videos (what I want the most). Compare that to the AMD side and the full article they did about it.
 
NVIDIA had, at least at some point, disabled 10-bit support on GeForce cards while it was enabled on Quadros.
But not by disabling the corresponding texture formats, only by limiting what format they allow on the HDMI/DVI/DP link. The rest is just software, and possibly the driver applying some additional tone mapping to ensure that legacy applications are not accidentally using the entire dynamic range.
 
I tried to compare a few metrics for the 1080 and the prior gen products, related to the marketed power numbers, transistor count, and area.
The error bars are most likely very wide, but there are some things that might be interesting.

The Titan X has a 250W power budget with 8 billion transistors and 601mm2 of area.
The 1080 has 180W from 7.2B transistors. Is there something more concrete than the rumored 300-330mm2 for area?
The GTX 980 has 165W for 5.2B transistors and 398mm2.

In terms of headline power versus transistor count, it is 31.3 W/Gtransistor for the Titan X, 25 for 1080, and 31.7 for the 980.
I think it might be interesting since we see that Maxwell at 28nm is rather consistent, and Pascal is not close to being half the power per transistor that the process node should be able to afford.
One unknown is what the narrower GDDR5X bus (vs the Titan) might do for power consumption. It seems like it could be more efficient, and that could allow for a little more power draw at the transistor level than the calculations above show.
That might indicate where in the ~30% more performance at iso-power / 50-60% less power at iso-performance tradeoff this specific implementation landed. That might have implications for any such iso-clock product, which could let a mobile or Nano-type product dive down in power to somewhere between the rumored ranges of Polaris 11 and 10.

In terms of area, it's .42 and .41 W/mm2 for the 28nm GPUs. Without firm numbers for Pascal, I am not sure. Using 300 or 330 gets .54 - .6 W/mm2.

What might be interesting when compared with GCN at 28nm and maybe Polaris 10:
I'm getting 30.9 W/Gtrans for Fury (~19.7 Nano), 44.4 for the 390X, and 38 for the 380X.
Power density is .46W/mm2 (.29), .62, and .53 respectively.
(I'm giving both the 1080 and Nano the same benefit of the doubt for their estimated board power versus what an 8-pin could allow.)
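
For reference, the arithmetic behind those ratios is just board power divided by transistor count and by die area. A quick sketch that reproduces the numbers; the 1080 die size and the AMD board power / transistor / area inputs are my own assumptions rather than anything confirmed:

```cpp
#include <cstdio>

// Board power (W), transistor count (billions) and die area (mm^2).
// Only the Titan X, GTX 980 and the 1080's power/transistor figures come
// from the post above; the rest are assumed inputs used to back out the
// same ratios.
struct Gpu { const char* name; double watts; double gtrans; double mm2; };

int main() {
    const Gpu gpus[] = {
        { "Titan X (Maxwell)", 250.0, 8.0, 601.0 },
        { "GTX 1080",          180.0, 7.2, 314.0 },  // area assumed, within the rumored 300-330 range
        { "GTX 980",           165.0, 5.2, 398.0 },
        { "Fury",              275.0, 8.9, 596.0 },  // assumed board power / transistors / area
        { "Nano",              175.0, 8.9, 596.0 },  // assumed
        { "390X",              275.0, 6.2, 438.0 },  // assumed
        { "380X",              190.0, 5.0, 359.0 },  // assumed
    };
    for (const Gpu& g : gpus) {
        printf("%-18s %5.1f W/Gtransistor   %4.2f W/mm^2\n",
               g.name, g.watts / g.gtrans, g.watts / g.mm2);
    }
    return 0;
}
```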

If Polaris 10 were to be 2x-2.5x as power-efficient as Tonga, then it's ~15.2-19.0 W/Gtrans, which sounds like what would happen if Polaris did not push the clock much beyond where GCN already is.
There is some slack there, although given the rumored 100mm2 or so less die area, power density concerns may limit how much the silicon could be pushed in order to position Polaris 10 against the 1070, besides unknowns related to where GCN's preferred clock range might be at this node. I saw some rumored 150W numbers for a ceiling (which would be worse power-density-wise than 28nm).

What is also interesting, given Nvidia's clock range and the apparent loss of about 2/3 of the power-efficiency gain at iso-performance (at the transistor level), is where 1.7 GHz might fit once corrected for GCN's higher power at lower clocks and AMD's FinFET slide concerning the power/clock curve. The 1080 seems to be trading efficiency for area savings at this point. Its power density seems to be around the 390X's, so whether there is much desirable room to go higher for a mass-market product is unclear to me.
 
NVIDIA had, at least at some point, disabled 10-bit support on GeForce cards while it was enabled on Quadros.
I am not talking about the proprietary OGL extension to bypass DWM output... In windowed mode, only Microsoft can do something (i.e. allow a 10/11-bit mode in the compositor, which is going to be in Redstone or Redstone 2).
As for render targets, there are no issues using the 10- or 11-bit formats... And in full-screen mode you can output 10 or 11 bits per colour channel on every currently available GPU.
Current issues are related mostly to DP/HDMI output standard formats.
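
If anyone wants to check on their own card, a minimal D3D12 sketch along these lines would do it (the support flags a given driver reports will vary, and the DP/HDMI link format is a separate question that this query does not answer):

```cpp
#include <windows.h>
#include <d3d12.h>
#include <dxgiformat.h>
#include <cstdio>
#pragma comment(lib, "d3d12.lib")

// Ask the D3D12 runtime whether the 10/11-bit formats are usable as render
// targets and for display scan-out on the default adapter.
int main() {
    ID3D12Device* device = nullptr;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device)))) {
        printf("No D3D12 device available.\n");
        return 1;
    }

    const DXGI_FORMAT formats[] = { DXGI_FORMAT_R10G10B10A2_UNORM, DXGI_FORMAT_R11G11B10_FLOAT };
    const char* names[]         = { "R10G10B10A2_UNORM",           "R11G11B10_FLOAT" };

    for (int i = 0; i < 2; ++i) {
        D3D12_FEATURE_DATA_FORMAT_SUPPORT support = { formats[i] };
        if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_FORMAT_SUPPORT,
                                                  &support, sizeof(support)))) {
            printf("%-18s render target: %s, display: %s\n", names[i],
                   (support.Support1 & D3D12_FORMAT_SUPPORT1_RENDER_TARGET) ? "yes" : "no",
                   (support.Support1 & D3D12_FORMAT_SUPPORT1_DISPLAY) ? "yes" : "no");
        }
    }
    device->Release();
    return 0;
}
```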
 
There is some slack there, although given the rumored 100mm2 or so less die area, power density concerns may limit how much the silicon could be pushed in order to position Polaris 10 against the 1070, besides unknowns related to where GCN's preferred clock range might be at this node.

Given how Polaris 10 is confirmed to be the higher-performance part, I'm sure it's supposed to be a ~230mm^2 chip. I haven't heard of a <100mm^2 GPU, not even the lower-end Polaris 11, but it could be that...
 
Given how Polaris 10 is confirmed to be the higher-performance part, I'm sure it's supposed to be a ~230mm^2 chip. I haven't heard of a <100mm^2 GPU, not even the lower-end Polaris 11, but it could be that...

That 100mm2 was the relative difference in die area compared to some of the estimates for the 1080/1070.
 
Regarding this HDR thing - that's just 10bpc output, right? I've run 10bpc on a Radeon 6950 and a GTX 970. I have a BenQ BL3200PT monitor.

I experimented a bit with Alien Isolation's deep color setting. I really couldn't see anything different. I assume higher color depth should reduce banding problems.
 
Regarding this HDR thing - that's just 10bpc output, right? I've run 10bpc on a Radeon 6950 and a GTX 970. I have a BenQ BL3200PT monitor.

I experimented a bit with Alien Isolation's deep color setting. I really couldn't see anything different. I assume higher color depth should reduce banding problems.
No, it isn't just 10bpc; it's much, much more than that. I suggest you read more about it.
 