That RDNA 1.8 Consoles Rumor *spawn*

It needs to be done in real time if you do it after loading it into memory, after streaming it from the SSD. This could potentially allow you to get your mip 0 (the highest res one) using 1/4 of the data from SSD, or for no transfer from SSD at all if you use a mip level already in memory.

You can do it before shipping the game, and better yet, it's guaranteed to work everywhere at nearly no performance hit. If storage space is a concern, then you can use texture compression as well ...

And texture streaming is an orthogonal issue altogether. Inferencing shaders can't really beat texture sampling hardware when it comes to these tasks, and they have optimized compression paths too ...

I/O traffic ironically stopped being a bottleneck once the new consoles introduced SSDs, so using ML to minimize this traffic isn't going to matter in the slightest. A better case for ML texture upscaling would've been on systems using HDDs, since there high I/O traffic would prove to be a bottleneck ...
 
You can do it before shipping the game, and better yet, it's guaranteed to work everywhere at nearly no performance hit. If storage space is a concern, then you can use texture compression as well ...

And texture streaming is an orthogonal issue altogether. Inferencing shaders can't really beat texture sampling hardware when it comes to these tasks, and they have optimized compression paths too ...

I/O traffic ironically stopped being a bottleneck once the new consoles introduced SSDs, so using ML to minimize this traffic isn't going to matter in the slightest. A better case for ML texture upscaling would've been on systems using HDDs, since there high I/O traffic would prove to be a bottleneck ...

Microsoft is using ML to take a lower-res texture and up-res it on the fly:
https://www.windowscentral.com/microsoft-wants-use-machine-learning-improve-poor-game-textures
You were talking about machine learning and content generation. I think that's going to be interesting. One of the studios inside Microsoft has been experimenting with using ML models for asset generation. It's working scarily well. To the point where we're looking at shipping really low-res textures and having ML models uprez the textures in real time. You can't tell the difference between the hand-authored high-res texture and the machine-scaled-up low-res texture, to the point that you may as well ship the low-res texture and let the machine do it... Like literally not having to ship massive 2K by 2K textures. You can ship tiny textures... The download is way smaller, but there's no appreciable difference in game quality. Think of it more like a magical compression technology. That's really magical. It takes a huge R&D budget. I look at things like that and say — either this is the next hard thing to compete on, hiring data scientists for a game studio, or it's a product opportunity. We could be providing technologies like this to everyone to level the playing field again

So by the sound of that quote, it looks like MS is willing to do the work so game developers can reap the benefits of it. You say SSDs remove the bottleneck, but I'd like to ask how. Even the PS5's SSD pales in comparison to the GDDR RAM in the console.

As this person said, you can ship much lower-res textures, which means more games will fit on a user's SSD; at the same time downloads will be faster and, for those on metered connections, cost less. Then you can move the low-res textures into RAM for the GPU to upscale in real time, so less bandwidth is needed both when the graphics card reads them and when the SSD has to transfer them. If they can use this together with SFS, they may be able to use even less bandwidth.
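Just to put rough numbers on the savings being claimed (a quick Python sketch; the texture size, the uncompressed RGBA8 format and the full mip chain are illustrative assumptions, not figures from any actual game):

```
# Rough, illustrative numbers only: uncompressed RGBA8 (4 bytes/texel), full mip chain.
def mip_chain_bytes(top_mip_size, bytes_per_texel=4):
    total, size = 0, top_mip_size
    while size >= 1:
        total += size * size * bytes_per_texel
        size //= 2
    return total

full = mip_chain_bytes(4096)      # hand-authored 4K texture, shipped as-is
shipped = mip_chain_bytes(1024)   # ship only 1024 and below, up-res the top two mips
print(round(full / 2**20, 1), "MiB vs", round(shipped / 2**20, 1), "MiB")
# ~85.3 MiB vs ~5.3 MiB: roughly 16x less data to download, install and stream,
# in exchange for running the up-res step at load/stream time.
```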

I don't see the downside here.
 
What’s the point of this? The Pro was more Vega than the One X. FP16 and HBCC didn’t come close to being game changers for Vega PC GPUs. The PS4 didn’t need all the features added to later versions of GCN to become the best-selling console of its gen, with plenty of good-looking games to boot.

It’s not like every RDNA 2 feature must be present or these two consoles will be failures. The XSX’s biggest advantage will come from its roughly 20% more TFLOPS and 25% more local memory bandwidth. The PS5 has the raw bandwidth advantage with its SSD.
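For reference, a quick sanity check in Python of those two percentages, using only the publicly quoted figures (52 CUs at 1.825 GHz vs 36 CUs at up to 2.23 GHz, 560 GB/s vs 448 GB/s):

```
# Quick check of the quoted percentages against the public spec numbers.
def tflops(cus, clock_ghz):
    return cus * 64 * 2 * clock_ghz / 1000.0   # 64 FP32 lanes per CU, 2 ops (FMA)

xsx, ps5 = tflops(52, 1.825), tflops(36, 2.23)
print(round(xsx, 2), "vs", round(ps5, 2), "TF ->", f"{xsx / ps5 - 1:.0%} more")
# 12.15 vs 10.28 TF -> 18% more (a bit more whenever the PS5 GPU drops below its max clock)
print(f"{560 / 448 - 1:.0%} more bandwidth on the XSX's fast pool vs the PS5's 448 GB/s")
# 25% more bandwidth
```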

The unique arch features of both consoles may cancel each other out, or may not be readily employed or exploited by third-party devs (how many resources are going to be spent on features that are not standard across platforms?).

It’s like we are making a mountain out of a molehill to beat a dead horse. We don’t have enough info about the hardware, and the bits of info we are getting of late aren’t doing much to change that reality.
 
It needs to be done in real time if you do it after loading it into memory, after streaming it from the SSD. This could potentially allow you to get your mip 0 (the highest res one) using 1/4 of the data from SSD, or for no transfer from SSD at all if you use a mip level already in memory.

So in the same way you'd trigger streaming from the SSD, you trigger an up-res instead. Smaller downloads, smaller installs, less data needed to be streamed from the SSD. I can see why some developers are looking into it.
So why have an ultra-fast SSD then?

I see ML for textures as something more useful for Stadia-like consoles.
 
So by the sound of that quote, it looks like MS is willing to do the work so game developers can reap the benefits of it. You say SSDs remove the bottleneck, but I'd like to ask how. Even the PS5's SSD pales in comparison to the GDDR RAM in the console.

Simple: SSDs have higher I/O bandwidth than HDDs do, but of course it isn't going to match the speed of VRAM. More importantly, why are you concerned about SSDs not being able to match the speeds offered by VRAM in the case of streaming? In the near future, SSDs will be capable of streaming 3 raw uncompressed 24-bit 4096x4096 textures per 16.6 ms frame, which is multiple times more texel density than a 4K display itself can show, so I'm failing to see how, from your perspective, I/O traffic will be an issue at all ...
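To put a rough number on that claim (a back-of-the-envelope Python sketch; the 3 bytes/texel and 16.6 ms figures follow from the statement above, and everything is order-of-magnitude only):

```
# Back-of-the-envelope: what would "3 raw 24-bit 4096x4096 textures per frame" require?
bytes_per_texture = 4096 * 4096 * 3        # 24-bit = 3 bytes per texel, no mip chain
per_frame = 3 * bytes_per_texture          # three such textures every 16.6 ms
required_rate = per_frame / 0.0166 / 1e9   # sustained GB/s
print(round(per_frame / 2**20), "MiB per frame ->", round(required_rate, 1), "GB/s sustained")
# ~144 MiB per frame, ~9.1 GB/s sustained; i.e. in the same ballpark as the
# (compressed) throughput figures quoted for the new consoles' SSDs.
```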

Also, ML isn't the magical paradigm shift in data compression you think it is, one that will somehow revolutionize our current texture streaming systems. It's just another data compression method with its own faults ...

As this person said, you can ship much lower-res textures, which means more games will fit on a user's SSD; at the same time downloads will be faster and, for those on metered connections, cost less. Then you can move the low-res textures into RAM for the GPU to upscale in real time, so less bandwidth is needed both when the graphics card reads them and when the SSD has to transfer them. If they can use this together with SFS, they may be able to use even less bandwidth.

I don't see the downside here.

Well, I can already see a downside, because ML upscaling isn't anywhere near the lossless conversion process they seem to hint at. Sometimes using ML to do texture upscaling will noticeably diverge from the original texture.

By comparison, doing texture compression as an offline preprocess and then using the texture sampler's hardware block to do the decompression will get you a near-ground-truth result, close to the quality of the uncompressed texture. This way developers can maximize the signal-to-noise ratio, controlling the quality/data-loss tradeoff depending on the compression formats they use. Best of all, developers don't have to train any black-box models to get texture compression technology right. Texture compression is so much simpler to deal with, and you don't burn shader ALU operations either, which could be used for more useful things ...
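For context, the fixed-rate formats that sampler hardware decodes give you guaranteed ratios (the block sizes below are defined by the BC formats themselves; the 4096x4096 RGBA8 texture is just an example):

```
# Fixed-rate block compression: 4x4 texel blocks, 8 bytes (BC1) or 16 bytes (BC7).
def bc_size(width, height, block_bytes):
    return (width // 4) * (height // 4) * block_bytes

w = h = 4096
raw = w * h * 4                                    # RGBA8, 4 bytes per texel
print(raw // 2**20, "MiB raw RGBA8")               # 64 MiB
print(bc_size(w, h, 16) // 2**20, "MiB as BC7")    # 16 MiB, fixed 4:1
print(bc_size(w, h, 8) // 2**20, "MiB as BC1")     # 8 MiB, fixed 8:1
# The ratio and the decode cost are guaranteed (the sampler decompresses for free);
# the only variable is how much quality the content loses, which you can inspect offline.
```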

Sure, you could do inferencing on a 128x128 input and output a 4096x4096 result, but it isn't going to be pretty. You could fix these quality issues by increasing the input resolution of the textures, but then you'd have to give up on shipping a small amount of data anyway, negating your initial premise of the advantage behind using ML. We also can't guarantee that a trained model will consistently work the way we expect it to, compared to a hardware decompression block, which will give us more consistent results ...

In conclusion, it's pretty hard to beat 2+ decades worth of optimizations behind texture samplers ... (even Larrabee prototypes had texture sampling hardware and future GPUs will still have them despite having less graphics state than their predecessors did)
 
Source ?

Tiled resources were a bad idea back when GCN was introduced, and they're still a bad idea now with RDNA. I'm not holding out much hope that RDNA2 will improve upon their implementation in this regard ...

Source? The problem encountered with tiled resources during the tile determination step of texture streaming has nothing really to do with tiled resources per se, but with how the texture indirection was being used and how the residency map was being updated (two render passes for an exact solution as per David Cornell: one for determining the texture coordinates and another for translating the coordinates to a memory address using texture indirection). Sampler feedback greatly facilitates the tile determination stage (indeed, sampler feedback is the ultimate tile determination process; nothing can beat it by definition).
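To make the tile-determination step concrete, here is a toy Python model of the reduction you end up doing either way (tile size, texture size and function names are invented for illustration, not any real API):

```
# Toy model of tile determination: reduce per-pixel sample requests to the set of
# texture tiles that need to be resident. Tile/texture sizes and names are made up.
TILE_TEXELS = 128     # assume square 128x128-texel tiles at every mip level
TEX_SIZE = 4096       # mip 0 is 4096x4096

def tile_of(u, v, lod):
    """Map a normalized UV plus a mip level to the (mip, tile_x, tile_y) it touches."""
    mip_size = max(TEX_SIZE >> lod, 1)
    x = min(int(u * mip_size), mip_size - 1)
    y = min(int(v * mip_size), mip_size - 1)
    return (lod, x // TILE_TEXELS, y // TILE_TEXELS)

# The "exact" software approach described above: one pass records per-pixel (u, v, lod),
# a second pass/readback reduces that buffer into the tiles to stream in. Sampler
# feedback lets the hardware write out this information directly at sampling time.
per_pixel_requests = [(0.12, 0.80, 2), (0.13, 0.80, 2), (0.90, 0.10, 0)]
needed_tiles = {tile_of(u, v, lod) for (u, v, lod) in per_pixel_requests}
print(needed_tiles)   # e.g. {(2, 0, 6), (0, 28, 3)}
```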
 
I am Italian and I can tell you that multiplayer.it is not a reliable source about computer graphics programming. And what is going on demonstrates that Rosario Leonardi shouldn't talk about such technical things to an inadequate site (a videogame website where more than half of the content is clickbait and the other half is advertising masked as news).
Good morning!

RDNA 1, RDNA 2, or something in between: it doesn't matter as much, and that's normal, imho. What matters is that he mentioned there isn't ML stuff on the console. Something like DLSS would be the consoles' salvation.

 
Good morning!

RDNA 1, RDNA 2, or something in between: it doesn't matter as much, and that's normal, imho. What matters is that he mentioned there isn't ML stuff on the console. Something like DLSS would be the consoles' salvation.

I think there were some hints that the Series X may have some ML, right?
 
Mesh shader is the Microsoft/nVidia name for the same thing as AMD's primitive shader.
I actually looked into it and the answer is a "Yes-maybe".

Primitive shaders as per AMD:

The implementation of what AMD terms primitive shaders starts with the patent "Combined World-Space pipeline shader stage" (https://patentimages.storage.googleapis.com/ee/f6/07/493c36aae3ec7b/US20180082470A1.pdf).
The world-space shader regroups the vertex, hull and geometry shader functions and enables or disables them as required.
Mesh shaders theoretically provide the following advantages:

(1) The Input Assembler stage is eliminated, meaning that index compression is possible and, more generally, that mesh shader input is user-defined.
(2) Rendering of primitives does not need to be done in API-mandated order, so one screen-space pipeline will not stall another as in the classical graphics pipeline.
(3) Culling of invisible primitives can be done before the vertex/hull/geometry/tessellation stage (or world-space shader stage), yielding huge computational savings.

Those advantages are brought forward by AMD in the filing "Optimizing Primitive Shaders", which describes AMD's equivalent of mesh shaders (https://patentimages.storage.googleapis.com/66/54/00/c86d30f0c1c61e/US20200193703A1.pdf).
The Input Assembler is done away with, and the rendering work characteristic of the "world-space shader stage" is deferred (except for work relating to positional data) until invisible/insignificant primitives are culled, in a so-called "deferred attribute shading stage". The latter shader stage is explicitly driven by user-provided shader code that takes up the role of the primitive assembler, in a move away from fixed-function hardware.
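As a toy illustration of advantage (3), culling whole groups of primitives before any attribute shading is done (pure conceptual Python with an invented meshlet layout and invented names; it has nothing to do with the actual hardware or patent implementations):

```
# Toy model: cull whole meshlets against a simplified view frustum *before* any
# attribute shading, so UVs/normals/etc. are only computed for surviving geometry.
from dataclasses import dataclass

@dataclass
class Meshlet:
    center: tuple       # bounding-sphere center (x, y, z) in view space
    radius: float
    vertex_count: int

def survives_frustum(m, near=0.1, far=100.0, half_fov_slope=1.0):
    # Extremely simplified symmetric-frustum test against the bounding sphere.
    x, y, z = m.center
    if z + m.radius < near or z - m.radius > far:
        return False
    limit = z * half_fov_slope + m.radius
    return abs(x) <= limit and abs(y) <= limit

meshlets = [Meshlet((0, 0, 10), 1.0, 64),     # in front of the camera
            Meshlet((500, 0, 10), 1.0, 64),   # far off to the side
            Meshlet((0, 0, 500), 1.0, 64)]    # beyond the far plane
visible = [m for m in meshlets if survives_frustum(m)]
print(len(visible), "of", len(meshlets), "meshlets survive;",
      sum(m.vertex_count for m in visible), "vertices need attribute shading")
# 1 of 3 meshlets survive; 64 vertices need attribute shading
```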

So far, "world-space shader" seems to comply with D3D definition of "primitive shader" and its further optimisation into Deferred attribute shading stage seems to be a match for D3D's definition of mesh shaders, right?

Wrong.

In a filing predating the aforementioned patent applications, "deferred attribute shading stage" is subsumed into a...primitive shader stage as described in the aptly named filing: "Primitive Shaders"
(https://patentimages.storage.googleapis.com/16/34/77/be4393dc5704c5/US20180082399A1.pdf).

So which is which? Guess we will know only after the teardowns.

As for the XSX, we know that it indeed has mesh shading because of two filings from Xbox ATG members:

(1) Index Buffer Compression, which explicitly identifies the mesh shader stage and explains its functioning (obviously the absence of the input assembler is a big hint) (https://patentimages.storage.googleapis.com/54/e8/07/67037358a9952f/US20180232912A1.pdf)
(2) Compact visibility state for GPUs compatible with hardware instancing, which describes the use of visibility buffers for what amounts to the culling stage of AMD's deferred attribute shading stage (https://patentimages.storage.googleapis.com/4f/08/aa/909db72030cd0b/WO2019204064A1.pdf)
 
Good morning!

RDNA 1, RDNA 2, or something in between: it doesn't matter as much, and that's normal, imho. What matters is that he mentioned there isn't ML stuff on the console. Something like DLSS would be the consoles' salvation.


PS5 being between RDNA 1 and 2 was already publicly known for a long time (GitHub, rumors and other leaks). Then MS advertised their VRS heavily, and the ML speculation followed. The Sony engineer gets attacked because he's just that, a Sony engineer (or was?).
 
I'm putting Mark Cerny's quote here... because it certainly belongs in this thread...

"I'd like to make clear two points that can be quite confusing: First we have a custom amd gpu based on rdna2 tech. What does that mean?

Amd is continuously improving and revising their tech. For rdna2 their goals were roughly speaking to reduce power consumption by rearchitecting the gpu to put data closer to where it's needed, to optimize for performance, and to add more advanced features. But that feature set is malleable which is to say we have our own needs for ps5 and that can factor into what the amd roadmap becomes. So collaboration is born. If we bring concepts to amd that are felt to be widely useful, then they can be adopted into rdna2 and used broadly including in pc gpus. If the ideas are sufficiently specific to what we're trying to accomplish like the gpu cache scrubber, then they end up being just for us. If you see a similar discrete gpu available as a pc card roughly at the same time we release our console, that means our collaboration with amd succeeded in producing tech useful in both worlds. It doesn't mean we at sony simply incorporated the pc part into our console."
AMD really got all their bases covered.
 
The guy is Italian and he talked to the site.

We have known it is the AMD implementation since "Road to PS5". Don't read the added comment from Gavin Stevens, who doesn't know what he is talking about. No access to any devkit.

https://www.eurogamer.net/articles/...s-and-tech-that-deliver-sonys-next-gen-vision

The site is not very important. What is important is the message, and if Sony does a decent interview it should be to explain which features they chose and why. Reading multiple Sony patents, it seems ray tracing and VRS are there, but I'm not sure about ML and sampler feedback.

And they must explain which PS5 features are missing from the RDNA 2 GPU (if there is more than the GPU cache scrubbers), and whether those are purely custom or will appear later on a more advanced RDNA GPU.

There are zero patents from Sony pertaining to VRS per primitive, as opposed to per screen area (due to display curvature). I am more and more convinced that Sony somehow missed out on it.
 
I actually looked into it and the answer is a "Yes-maybe".

Primitive shaders as per AMD:

The implementation of what AMD terms primitive shaders starts with the patent "Combined World-Space pipeline shader stage" (https://patentimages.storage.googleapis.com/ee/f6/07/493c36aae3ec7b/US20180082470A1.pdf).
The world-space shader regroups the vertex, hull and geometry shader functions and enables or disables them as required.
Mesh shaders theoretically provide the following advantages:

(1) The Input Assembler stage is eliminated, meaning that index compression is possible and, more generally, that mesh shader input is user-defined.
(2) Rendering of primitives does not need to be done in API-mandated order, so one screen-space pipeline will not stall another as in the classical graphics pipeline.
(3) Culling of invisible primitives can be done before the vertex/hull/geometry/tessellation stage (or world-space shader stage), yielding huge computational savings.

Those advantages are brought forward by AMD in the filing "Optimizing Primitive Shaders", which describes AMD's equivalent of mesh shaders (https://patentimages.storage.googleapis.com/66/54/00/c86d30f0c1c61e/US20200193703A1.pdf).
The Input Assembler is done away with, and the rendering work characteristic of the "world-space shader stage" is deferred (except for work relating to positional data) until invisible/insignificant primitives are culled, in a so-called "deferred attribute shading stage". The latter shader stage is explicitly driven by user-provided shader code that takes up the role of the primitive assembler, in a move away from fixed-function hardware.

So far, "world-space shader" seems to comply with D3D definition of "primitive shader" and its further optimisation into Deferred attribute shading stage seems to be a match for D3D's definition of mesh shaders, right?

Wrong.

In a filing predating the aforementioned patent applications, "deferred attribute shading stage" is subsumed into a...primitive shader stage as described in the aptly named filing: "Primitive Shaders"
(https://patentimages.storage.googleapis.com/16/34/77/be4393dc5704c5/US20180082399A1.pdf).

So which is which? Guess we will know only after the teardowns.

As for the XSX, we know that it indeed has mesh shading because of two filings from Xbox ATG members:

(1) Index Buffer Compression, which explicitly identifies the mesh shader stage and explains its functioning (obviously the absence of the input assembler is a big hint) (https://patentimages.storage.googleapis.com/54/e8/07/67037358a9952f/US20180232912A1.pdf)
(2) Compact visibility state for GPUs compatible with hardware instancing, which describes the use of visibility buffers for what amounts to the culling stage of AMD's deferred attribute shading stage (https://patentimages.storage.googleapis.com/4f/08/aa/909db72030cd0b/WO2019204064A1.pdf)

Well found.
So there are some real problems versus the XSX... That, combined with the extreme clock of this console's GPU and the odd solution found to control temperature spikes, leads me to a prudent wait-and-see with regard to buying it... I can't say it's a bad console, but I'm also not totally convinced.
 
Well found.
So there are some real problems versus the XSX... That, combined with the extreme clock of this console's GPU and the odd solution found to control temperature spikes, leads me to a prudent wait-and-see with regard to buying it... I can't say it's a bad console, but I'm also not totally convinced.
Man... don't do that. We nerds like to speculate about tech, but never lose sight of the fact that you are buying consoles to play games.
 
Source? The problem encountered with tiled resources during the tile determination step of texture streaming has nothing really to do with tiled resources per se, but with how the texture indirection was being used and how the residency map was being updated (two render passes for an exact solution as per David Cornell: one for determining the texture coordinates and another for translating the coordinates to a memory address using texture indirection). Sampler feedback greatly facilitates the tile determination stage (indeed, sampler feedback is the ultimate tile determination process; nothing can beat it by definition).

I think you're overcomplicating what Sampler Feedback is. The feature, as I like to think about it, is really nothing more than a feedback texture containing the requested MIP level for each sampled location.

Having actually looked at the spec for myself, in the case of streaming there is also a MinMip/'residency' texture describing the currently resident MIP level. Herein lies the problem with the MinMip texture: since it reflects the current tiled-resource mappings, updating it involves making calls to the UpdateTileMappings API to change those mappings, which pretty much kills any hope of this texture streaming system being performant ...
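As a toy model of that cost structure (pure Python sketch; the real API being referred to is ID3D12CommandQueue::UpdateTileMappings, but the data layout, tile coordinates and function names below are invented for illustration):

```
# Toy model: each newly resident tile forces a tiled-resource mapping update before
# the MinMip/residency texture can advance, so streaming pays a per-tile binding cost.
mapping_updates = 0
resident = set()    # tiles currently mapped to heap memory
min_mip = {}        # (tile_x, tile_y) -> finest resident mip for that tile of the texture

def update_tile_mappings(tile):
    # Stand-in for the real UpdateTileMappings call; here we only count invocations.
    global mapping_updates
    mapping_updates += 1

def stream_in(requests):
    """requests: (tile_x, tile_y, mip) triples gathered from the feedback texture."""
    for tx, ty, mip in requests:
        if (tx, ty, mip) not in resident:
            update_tile_mappings((tx, ty, mip))   # the binding cost being complained about
            resident.add((tx, ty, mip))
        min_mip[(tx, ty)] = min(mip, min_mip.get((tx, ty), 15))

stream_in([(0, 6, 2), (0, 6, 2), (28, 3, 0)])
print(mapping_updates, "mapping updates,", min_mip)
# 2 mapping updates, {(0, 6): 2, (28, 3): 0}
```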

Seeing as AMD specifically discouraged using that API before, and they don't recommend using tiled resources on RDNA, I'm going to assume that recommendation remains true for RDNA2 until they demonstrate a robust reduction in the binding costs ...

Sampler feedback is about as valuable as tiled resources are when it comes to texture streaming, which amounts to nearly nothing ...
 
Well found.
So there are some real problems versus the XSX... That, combined with the extreme clock of this console's GPU and the odd solution found to control temperature spikes, leads me to a prudent wait-and-see with regard to buying it... I can't say it's a bad console, but I'm also not totally convinced.

Before saying that, at least wait for the Digital Foundry face-off; maybe devs like Billy Khan, lead engine programmer at id Software, praised the console for a reason. ;)

And he has devkits, unlike the people speculating here. ;)

 