He says something like "software based real time ray traced global illumination" around the 10:10 mark.

I think it's SSGI (Screen Space Global Illumination) and not real-time ray tracing...
We only have actual consumption figures for the Xbox Series X. Let's also not forget the PS5 has a 256-bit bus and 16GB of GDDR6. So doubling the CUs alone isn't going to linearly increase the power consumption.
XSX power consumption is already known... Despite it having a 310W PSU, its power usage is about 210W while running Gears 5. That is for everything, meaning RAM, CPU/GPU, SSD and so on. Other games use a lot less power.
If you shave off 50W for all the additional components (as an educated guess), you're left with ~160W for a 52CU RDNA2 GPU at 1.8GHz.
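To make that arithmetic explicit, here's a minimal sketch in Python; every number is either this thread's estimate or a labelled guess, and the linear CU scaling at the end is deliberately naive:

# All figures are this thread's estimates or illustrative assumptions,
# not measurements.
wall_draw_w = 210          # XSX at the wall running Gears 5
other_components_w = 50    # educated guess: RAM, CPU, SSD, fan, VRM losses
gpu_power_w = wall_draw_w - other_components_w   # ~160W

cu_count = 52
per_cu_w = gpu_power_w / cu_count                # ~3.1W per CU at 1.8GHz

# Naive linear scaling to a hypothetical 80CU part at the SAME clock.
# Real scaling is not linear: shared logic (memory PHY, display, etc.)
# doesn't double, and voltage/frequency changes dominate.
naive_80cu_w = per_cu_w * 80                     # ~246W
print(f"GPU-only estimate: {gpu_power_w}W, per CU: {per_cu_w:.1f}W, "
      f"naive 80CU: {naive_80cu_w:.0f}W")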
It's possible that they haven't expected the doubling of FP32 units in Ampere.
It's not that they would, it's that you'd get a lot less of a performance increase by making just a wider Turing, even with higher clocks. So it is possible that AMD expected that instead of what Ampere turned out to be. Still, I wouldn't count on this. They both tend to know quite a lot about each other well in advance of each generation's release.

Even then, nVidia wouldn't have released a new generation slower than (or as fast as) the old one (even if it was the top of the line).
What is the original performance/power point?
I just don't see why the clocks the cards are shipping with aren't the originally intended ones.
We do have a PS5 clocked at 2.23GHz on what seems to be a ~150-200W power budget for the APU.
It's a 350W PSU for the PS5 and a 340W PSU for the PS5 All Digital. I think it has more than a 200W power budget for the APU.
They were originally targeting the RTX 2080 Ti. With the release of the RTX 3080, and the performance and higher power that it has, AMD probably needed to do the same to compete. Thus the higher clocks and power.
That ~160W estimate is still without portions of the GPU being utilized, like the RTRT hardware. It's entirely unknown what impact that may have.
But the XB1X consumed ~170W at the wall in Gears, and the XSX consumes ~210W. Unless you're suggesting the CPUs consume next to zero power, the APU budget shouldn't even be close to 200W.
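As a rough illustration of that point, here's a sketch of the wall-to-APU decomposition; the PSU efficiency and component figures are assumptions picked for illustration, not measurements:

# Rough wall-to-APU decomposition. Every figure below is an assumption
# for illustration, not a measurement.
xsx_wall_w = 210        # measured at the wall in Gears 5 (per this thread)
psu_efficiency = 0.90   # assumed; wall draw includes AC/DC conversion loss

dc_power_w = xsx_wall_w * psu_efficiency        # ~189W on the DC side
ram_ssd_fan_w = 35      # assumed non-APU components
cpu_w = 35              # assumed Zen 2 CPU load in a GPU-bound game

apu_w = dc_power_w - ram_ssd_fan_w              # ~154W for the whole APU
gpu_only_w = apu_w - cpu_w                      # ~119W for the GPU portion
print(f"APU ~{apu_w:.0f}W, GPU-only ~{gpu_only_w:.0f}W")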
And I'm sure Lisa Su told you that personally?
You mean you're actually extrapolating data from an almost identical architecture, on an almost identical process, and likely on a slightly worse silicon bin, and suggesting we compare that to PC GPUs?? You can't do that! Because... reasons.
Well, since the RDNA2 HW seemingly cannot do RT concurrently, I'd say it shouldn't have much of an impact at all, if any.
I don't know, really. Maybe they really just hit their power limits or some current limit at those points in time? I am no Furmark expert by any means. In the videos I linked, Furmark framerates varied massively. Any ideas why?
As a texture-type instruction, we know that a BVH instruction cannot issue in parallel with a texture/vmem operation, but is that the same as them not working concurrently? Outside of those initial few cycles, it could be hundreds of cycles before the buses used by the BVH instruction are needed by it. Are we sure a wavefront cannot issue a memory operation, or maybe another BVH? Texturing and memory ops can be issued freely until a waitcnt instruction is encountered and not enough have resolved.

RDNA2 cannot do texturing concurrently, AFAIK. It depends how much power the RT units are taking, which is not known.
2.0GHz and 2.4GHz are both "multi gigahertz clock frequencies", but the resulting power consumption of a chip at those clocks can be drastically different.
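A quick sketch of why that is: dynamic power scales roughly with f * V^2, and voltage has to rise with frequency, so power grows much faster than the clock. The voltages below are illustrative assumptions, not RDNA2 figures:

# Why 2.0GHz vs 2.4GHz matters: dynamic power scales roughly as f * V^2,
# and V itself must rise with f, so power grows much faster than clocks.
f0, v0 = 2.0, 0.95      # GHz, volts (assumed operating point)
f1, v1 = 2.4, 1.10      # GHz, volts (assumed: higher clock needs more voltage)

relative_power = (f1 / f0) * (v1 / v0) ** 2
print(f"{f1}GHz vs {f0}GHz: ~{relative_power:.2f}x dynamic power "
      f"for a {f1 / f0:.2f}x clock increase")
# -> ~1.61x the power for a 1.2x clock bump, before leakage makes it worse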
You doubt that AMD will adjust their clocks according to what NV has launched already?

I think we all know that. However, claiming that AMD did not intend to clock at x.xx GHz (rumoured) speed and only did so as a reaction to NV is speculation at best.
You doubt that AMD will adjust their clocks according to what NV has launched already?
They can of course lower them just as well as increase them, but both possibilities are essentially a given at this point. They would be stupid not to.
Even normal load-stores are quite liberal: operations can be freely reordered, and only RF writeback is in program order. The texture load-store path has been supporting varying latency and a huge swarm of capabilities since GCN anyway; for example, address coalescing or the lack thereof can cause a load instruction to take a varying number of cycles to complete, even though multiple load instructions can be issued back-to-back. RDNA enhanced it further by adding a low-latency path bypassing the samplers, and RDNA 2's BVH intersection seems to be merely a (new) cherry on the "filtering/pre-processing" pie.
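For anyone unfamiliar with the issue/waitcnt model being discussed, here's a toy sketch of the semantics: vmem-class instructions (loads, samples, BVH intersections) issue without stalling, and the wavefront only blocks at an explicit waitcnt. This is a simplified illustration of the ISA behaviour, not a model of the actual RDNA2 pipeline:

# Toy model of issue/waitcnt semantics, not of real RDNA2 hardware.
from collections import deque

class Wavefront:
    def __init__(self):
        self.outstanding = deque()   # in-flight vmem/texture/BVH ops

    def issue(self, op):
        # vmem-class ops (loads, samples, BVH intersections) issue
        # back-to-back without stalling; completion happens later.
        self.outstanding.append(op)
        print(f"issued  {op} (vmcnt={len(self.outstanding)})")

    def waitcnt(self, vmcnt):
        # s_waitcnt vmcnt(N): block until at most N ops are outstanding.
        # In this toy model, in-flight ops simply resolve in order here.
        while len(self.outstanding) > vmcnt:
            done = self.outstanding.popleft()
            print(f"retired {done}")
        print(f"waitcnt({vmcnt}) satisfied")

wave = Wavefront()
wave.issue("image_bvh_intersect_ray")  # BVH op occupies the issue slot...
wave.issue("buffer_load_dword")        # ...but the next vmem op issues freely
wave.issue("image_sample")
wave.waitcnt(0)                        # only here does the wavefront stall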