Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Cerny said Kraken on top of the 5.5GB/s raw bandwidth typically gives 8-9GB/s effective bandwidth, but it can go as high as 22GB/s. That's 4:1.
I think that ~22GB/s is the throughput of the link that connects the decompressor to main RAM. That speed would be achieved when decompressing data with very low entropy. Since the decompressor outputs formats that are fed directly to the GPU, such data does actually exist, even if it's a small portion of the total. 2:1 is probably closer to what they expect typical content to compress to.
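
To put numbers on that, a minimal sketch; the ratios below are simply derived from the figures Cerny quoted, not measured per-content values:

```python
# Effective throughput = raw SSD bandwidth x compression ratio.
RAW_GBPS = 5.5  # PS5 raw SSD bandwidth

def effective(raw_gbps, ratio):
    """Decompressed output rate for a given compression ratio."""
    return raw_gbps * ratio

print(effective(RAW_GBPS, 8 / 5.5))  # ~8 GB/s, i.e. ~1.45:1 typical
print(effective(RAW_GBPS, 9 / 5.5))  # ~9 GB/s, i.e. ~1.64:1 typical
print(effective(RAW_GBPS, 4.0))      # 22 GB/s at the 4:1 best case
```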

you mean milliamps right ;)

hahah 198 amps could charge 5 Teslas simultaneously

Power = current*voltage. 198 amps * 1.32V is only 261W.
 
AMD has claimed a 50% uplift in power efficiency for RDNA2 over RDNA1, and they specifically mention higher clocks as one contributor:

[AMD slides claiming a 50% performance-per-watt improvement for RDNA2]


It remains to be seen how much of that 50% comes from better IPC and how much comes from higher clocks.

The 5700 had a typical clock (game clock) of around 1700MHz. If you assume that, of the total ~50% improvement, a 35% uplift comes from increased clocks, then we should expect a 5700 XT equivalent card at the same power budget to be clocked at 1700 * 1.35 ≈ 2.3GHz, like you suggested.
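
Napkin math for that split, using the ~15% IPC figure someone recalls below (illustrative only; real perf/W gains don't decompose into a clean product):

```python
# Illustrative decomposition of AMD's claimed ~50% perf/W uplift into
# an IPC factor and a clock factor. Napkin math only.
GAME_CLOCK_MHZ = 1700  # RX 5700 game clock, approximate
IPC_FACTOR = 1.15      # "15% from IPC", recalled from an AMD slide
CLOCK_FACTOR = 1.35    # remainder attributed to higher clocks

print(f"combined uplift: {IPC_FACTOR * CLOCK_FACTOR:.2f}x")       # ~1.55x, i.e. ~50%
print(f"implied clock: {GAME_CLOCK_MHZ * CLOCK_FACTOR:.0f} MHz")  # ~2295 MHz
```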




Did he say how much memory is available to developers?
Even if he did, that wouldn't tell us how much DDR4 the OS is using for its features, e.g. video recording.
Assuming the DDR4 is there at all.
15% from IPC, IIRC it was in some slide.
 
Power = current*voltage. 198 amps * 1.32V is only 261W.
oh snap!
thanks mate haha.

I can't believe I didn't think of doing a simple equation to work that out.
Insane number of amps pumping there
 

James Stanard answered a few things:

James Stanard said:
Sorry. We're not ready to share more publicly, but we have revealed more to Xbox licensees. I am actively improving the Xbox Texture Compressor (XBTC) and will be sharing frequent updates with the Xbox community as they come.

James Stanard said:
One of the most overlooked elements is our new sampler feedback streaming. It allows us to elegantly stream individual texture pages (rather than whole mips) based on GPU texture fetches. Special filtering hardware allows graceful fallback to the resident mip levels.
This fundamentally amplifies our memory size because we don't have to hold texture pages in memory that aren't needed. It also reduces our streaming bandwidth demands by only streaming what we need.
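
To make the "memory amplification" concrete: D3D12 tiled resources use 64 KiB tiles, and BC7 is 1 byte per texel, so one tile covers 256x256 texels. A back-of-the-envelope sketch (the 16-tiles-needed figure is just an example of mine):

```python
TILE_BYTES = 64 * 1024  # standard D3D12 tiled-resource tile size

def tiles_in_mip(width, height, bytes_per_texel=1):  # BC7 ~ 1 byte/texel
    mip_bytes = width * height * bytes_per_texel
    return mip_bytes // TILE_BYTES, mip_bytes

tiles, mip_bytes = tiles_in_mip(4096, 4096)
print(tiles, "tiles;", mip_bytes / 2**20, "MiB for the whole mip")  # 256 tiles; 16 MiB
# If the camera only needs, say, 16 of those tiles this frame:
print(16 * TILE_BYTES / 2**20, "MiB actually resident/streamed")    # 1 MiB
```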

Richard Geldreich said:
This "memory amplification" approach is what Halo Wars 1 did on Xbox 360. We decompressed BC1-5 textures into a GPU memory cache, using something like crunch. Except, we didn't have sampler feedback, so we had to estimate which mips were needed.
When the user clicks to a completely different part of the map, the engine needs a bunch of new mipmap levels. We had a visual "pop" as the highest mip was unpacked into the GPU tex cache. Thankfully in an RTS the HVS [human visual system] doesn't notice it after moving to a new part of the map.
James Stanard said:
We've made this all more graceful and eliminated the pops. Sampler Feedback makes this system easier to use, but being able to clamp mips on a per-tile basis and custom texture filtering hardware make this broadly useful.
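
For context, the "estimate which mips were needed" approach can be approximated from view distance alone. A hypothetical sketch of that kind of heuristic (the function and its isotropic-footprint assumption are mine, not anything from Halo Wars):

```python
import math

def estimate_mip(texture_size, texel_world_size, distance,
                 screen_height_px, fov_y_rad):
    """Estimate the mip level where one texel covers roughly one pixel.

    Isotropic approximation of what GPUs do per-pixel with UV
    derivatives; engines without sampler feedback guessed residency
    with heuristics like this.
    """
    # World-space size of one screen pixel at this distance.
    pixel_world_size = 2.0 * distance * math.tan(fov_y_rad / 2.0) / screen_height_px
    # Texels under one pixel; log2 of that is the mip level to use.
    texels_per_pixel = pixel_world_size / texel_world_size
    mip = max(0.0, math.log2(max(texels_per_pixel, 1e-6)))
    max_mip = int(math.log2(texture_size))
    return min(int(mip), max_mip)

# A 4096^2 ground texture where each texel covers 2.5 cm, viewed from 50 m:
print(estimate_mip(4096, 0.025, 50.0, 1080, math.radians(60)))  # -> mip 1
```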

Long explanation of sampler feedback:

https://twitter.com/JamesStanard/status/1241061511071596544
James Stanard said:
Textures come in multiple resolutions called "mip levels". So far-away textures are blurry and don't look all "pixely" (aliased), but close-up textures look sharp and clear. /1
As you get closer to a surface, you need to start reading the higher resolution texture. To save on memory, you might not store those high resolution textures and instead stream them off disk (streaming out textures that are now farther away from you.) /2
The highest resolution of a texture can be several megabytes. But it's possible that you don't need the whole texture if some of the texture is close to you, while most of it is still far away. Think about a texture that paints the ground, like the size of a football field. /3
If you only need high resolution on the corner of the texture closest to you, you should only have to stream in that portion of the texture. This portion of the texture might only be a quarter megabyte. /4
So not only do you only need to make space for a quarter megabyte in memory (as opposed to the multi-megabyte mip level), you only need to stream a quarter megabyte from disk. /5
The final piece of the puzzle is how do you know which part of the texture you need to stream. Sampler Feedback gives you that information. Basically, you sample the texture, and the feedback unit tells you if you tried to read the higher resolution texture region. /6
With this feedback, you initiate a streaming request, and when that completes, you notify the GPU it's okay to read that high resolution texture. But this can happen for each of the individual regions of the texture. /7
When you can't read the detailed texture region, the GPU automatically falls back to reading the less detailed region, even carefully blending around the edges so it's not obvious. But streaming speeds are such that it should only take a frame or two to up-res. /end
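
Condensing the thread into code, here's a hypothetical, heavily simplified per-frame loop (all names are mine; on real hardware the feedback map is written by the GPU and residency is communicated via per-tile min-mip clamps):

```python
# Hypothetical sketch of sampler-feedback-driven tile streaming.
# "feedback_requests" stands in for the GPU-written feedback map:
# the set of (tile_x, tile_y, mip) regions shaders tried to sample
# but that were not resident.

resident = set()   # tiles currently in GPU memory
in_flight = set()  # tiles with pending disk reads

def request_tile_from_disk(tile):
    pass  # placeholder for the platform's async file I/O

def on_frame(feedback_requests, io_completions):
    # 1. Newly loaded tiles become resident; tell the GPU it may now
    #    sample them (in D3D12 terms: update the min-mip clamp).
    for tile in io_completions:
        in_flight.discard(tile)
        resident.add(tile)

    # 2. Kick off a streaming request for every region the GPU wanted
    #    but couldn't read. Until it lands, the sampler falls back to
    #    the coarser resident mip, blending around tile edges.
    for tile in feedback_requests:
        if tile not in resident and tile not in in_flight:
            in_flight.add(tile)
            request_tile_from_disk(tile)

# e.g. frame N: the GPU sampled tile (3, 5) at mip 0, not yet resident
on_frame(feedback_requests={(3, 5, 0)}, io_completions=set())
```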

https://twitter.com/Remij010/status/1241104425143259136
Twitter user said:
Presumably it's standard for RDNA2 since RDNA2 fully supports DX12U's featureset. If PS5 fully supports RDNA2.. then it likely does as well. It's exposed through DX12U on Series X and PS5 devs likely can code for it directly.
https://twitter.com/JamesStanard/status/1241105373735432192
James Stanard said:
No, this isn't part of RDNA2. Xbox Velocity Architecture is all custom to Xbox. We developed a lot of custom tech for Xbox Series X just like Sony did for PS5.
Remij010 then asked "Just to be clear, what I meant wasn't that SFS (and XVA features) were a part of RDNA2 on PS5.. but rather RDNA2 from a hardware standpoint could support those features on PS5.. should Sony expose them. It's not a component in hardware exclusive to Series X, right?"
Unfortunately no answer.

About BCPack:

https://twitter.com/JamesStanard/status/1241076025477357568
James Stanard said:
It's a new compression codec specifically designed for game textures. They are almost always "block compressed" (BC) to begin with. We compress these textures even further, but for obvious reasons, we didn't want to call it "BCCompress".
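
For a sense of scale: BC formats are fixed-rate (BC1 is 8 bytes per 4x4 block, BC7 is 16), and BCPack then squeezes those blocks further. Quick arithmetic, assuming the ~2:1 figure discussed further down (not an official BCPack number):

```python
def bc_size_bytes(width, height, bytes_per_block):
    """Size of a block-compressed texture: 4x4 texel blocks, fixed rate."""
    return (width // 4) * (height // 4) * bytes_per_block

tex = bc_size_bytes(4096, 4096, 16)       # BC7 stores 16 bytes per block
print(tex / 2**20, "MiB as BC7")          # 16.0 MiB in memory
print(tex / 2 / 2**20, "MiB after ~2:1")  # 8.0 MiB on disk, hypothetical ratio
```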

A little about audio:

https://twitter.com/PresskottCore/status/1241112536780546048
Twitter user said:
And what is used for sound processing, can it utilise eac fully saturated?
https://twitter.com/JamesStanard/status/1241113225757724672
James Stanard said:
I'm a graphics guy, so I don't know much about it. I've just heard we've got an amazing dedicated sound chip for handling all sorts of complex 3D positional effects like Dolby Atmos.
 
There seems to be some misinformation in there for sure.
DX12U supports SFS.
RDNA 2 and Turing are both announced to be DX12U-compliant.

So both the hardware and software stacks have to support it.
Unless those only support it in software, and the XSX supports it in hardware.

XVA cannot be an extension of HBCC, which is what I thought it was earlier. If it were HBCC, it couldn't be on Turing.

We need more information on this; something is amiss.
edit: nvm, the XVA is a mixture of DX12U and 2 hardware components. The 2 hardware components are specific to XSX, but not specific to DX12U.
 
Nice post with all the sources applied. Still some form of secret sauce this gen with the audio and BCPack thingies ;) The memes down those Twitter threads are lol, won't link to them, they don't belong here :p
 
It's just those tweets people have posted.





His tweets sound like someone curious about BCPack rather than someone who owns the tech.
BCPack allows MS a compression of 2:1 while Sony has ~1.6:1. Is that of any significance if [Sony's] raw speed is faster than [MS's] compressed?

This guy is just wrong. The BCPack compression is already factored into the figures MS has provided for decompressed throughput.
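
Putting the public numbers side by side makes the point; the 8.5 below is just the midpoint of Cerny's "typically 8-9" range:

```python
# Publicly quoted figures in GB/s; "effective" = after decompression.
XSX_RAW, XSX_EFFECTIVE = 2.4, 4.8  # Microsoft's stated raw/compressed rates
PS5_RAW, PS5_EFFECTIVE = 5.5, 8.5  # midpoint of Cerny's "typically 8-9"

print("XSX implied ratio:", XSX_EFFECTIVE / XSX_RAW)            # 2.0 - BCPack included
print("PS5 implied ratio:", round(PS5_EFFECTIVE / PS5_RAW, 2))  # ~1.55
```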

XVA cannot be an extension of HBCC, which is what I thought it was earlier. If it were HBCC, it couldn't be on Turing.

I don't think this logically follows. There's no reason nVidia can't have implemented similar features in their own hardware.
 
This guy is just wrong. The BCPack compression is already factored into the figures MS has provided for decompressed throughput.



I don't think this logically follows. There's no reason nVidia can't have implemented similar features in their own hardware.
They would have talked about it back when Turing was announced; I just think it's highly unlikely.
But HBCC was on the Radeon VII, and that doesn't support DX12U or SFS. And neither does RDNA1.
Turing does have texture-space shading, though.
 