Unreal Engine 5, [UE5 Developer Availability 2022-04-05]

I'm not sure if it's public info, but I think the PS5 may have less RAM available than the XSX? Sony has not publicly given out the reservation numbers for threads and RAM yet.

They did not give the reservation; maybe it is a 3GB OS reservation like the PS4's, but at the end of the day it doesn't change the fact that loading will be fast.
 
I'm not sure if it's public info, but I think the PS5 may have less RAM available than the XSX? Sony has not publicly given out the reservation numbers for threads and RAM yet.

There was one recent message by a verified DICE employee who mentioned in passing how it makes no sense to compare movies/shows with games that have to fit within 12GB of RAM.
 
I've no doubt that loading will be really, really fast this generation. After using (the updated?) Quick Resume, I recorded video of quick-resuming Battlefield V and Control: Ultimate Edition into empty memory (I don't have a capture card, so I had to use my phone):

https://streamable.com/z6zp25
https://streamable.com/ba4g8o
BFV is a pretty demanding X1XE game while Control: UE is a Scarlett game; the speed is really impressive.
 
I'd love to know what big developers would rather have had if given the option:

Let's use Xbox Series X:

16GB RAM + fast NVME SSD (actual specs)

Or...

20GB RAM + SATA SSD

The second option gives you unified memory bandwidth, but I'm assuming you lose much of the benefit of DirectStorage, and possibly SFS, with the SATA SSD.

25% more RAM but lose 4000% IO performance.

Decisions decisions.

I assume by unified you mean that all the memory would be on matching chips running at the faster speed. That advantage would almost certainly be instantly lost, as all that extra memory would be used as a cache and rarely read from, because of the slower IO.
 
25% more RAM but lose 4000% IO performance.
...

Sorry to correct you a bit, but it is impossible to lose more than 100% if you can't go into negative numbers.
Something can be 4000% faster, but you can't be 101% slower (if you don't go backwards). 100% slower would already mean that you don't do anything ;)
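The correction above can be sketched numerically. Drive speeds here are illustrative assumptions (a raw XSX-class NVMe rate vs a typical SATA SSD), not official console figures:

```python
# Percentage asymmetry: a drive N times faster is (N-1)*100% faster,
# but the slower drive can never be more than (just under) 100% slower.

def percent_faster(fast, slow):
    """How much faster 'fast' is than 'slow', in percent."""
    return (fast / slow - 1) * 100

def percent_slower(slow, fast):
    """How much slower 'slow' is than 'fast', in percent (always < 100)."""
    return (1 - slow / fast) * 100

nvme = 2.4   # GB/s, assumed raw NVMe throughput
sata = 0.55  # GB/s, assumed typical SATA SSD throughput

print(percent_faster(nvme, sata))  # ~336% faster
print(percent_slower(sata, nvme))  # ~77% slower, never reaching 100%
```

So the real NVMe-vs-SATA gap is closer to "~336% faster" than 4000%, and going the other way it caps out below 100% slower.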

Back to the IO-bandwidth vs memory:
Back to the IO bandwidth vs memory:
I'm still not convinced that high IO bandwidth will change much above a certain point. There will always be scenarios where one thing is better than the other, but I don't think that e.g. going above 1GB/s will change that much more; that should already be good for 90% of all scenarios, and everything beyond it will be used less and less efficiently. With that much IO bandwidth and low latencies, you can already load a lot of data into memory and use a small buffer. The buffer might get smaller with more bandwidth, but not by that much.
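A rough way to see why the buffer only shrinks so far: in a simple model, the buffer must hold whatever gameplay consumes while a read request is still in flight. All numbers below are made-up assumptions for illustration:

```python
# Simple streaming-buffer model: to consume data at rate `consume_gbps`
# while requests take up to `latency_s` seconds to be serviced, the
# buffer must cover what is consumed during that window.

def min_buffer_gb(consume_gbps, latency_s, safety=2.0):
    """Smallest buffer (GB) that hides the drive's latency window,
    padded by a safety factor for spikes."""
    return consume_gbps * latency_s * safety

# A faster drive mostly shrinks the latency window, not the formula:
print(min_buffer_gb(1.0, 0.5))  # 1.0 GB/s consumption, 0.5 s window -> 1.0 GB
print(min_buffer_gb(1.0, 0.1))  # same consumption, faster drive -> 0.2 GB
```

Past a certain point the window is already tiny, so extra bandwidth buys only marginally less buffering, which matches the diminishing-returns argument above.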

Also, it might be counter-productive to use too small a buffer if you want to use raytracing, since even things behind you might come into sight.


Don't get me wrong, having more is almost always better, but it gets less effective.

On the other hand, you trade memory for bandwidth. The HDD of the old gen didn't do much to produce memory contention problems. But now we have the GPU, the (much faster) CPU, and the SSD all fighting for memory bandwidth. Losses through memory contention are no longer that small (especially if you read something quite small that decompresses into memory as a big "file"). And when you have something bandwidth-demanding like RT going on, you might decide to take the route of a larger in-memory buffer and instead save some memory bandwidth for better calculations, ...
It always depends.
 
It's an example of how the ability to keep more data in VRAM can be advantageous over being able to fill VRAM up quickly (but still much slower than the data already being there).



All of these things represent potential advantages of having faster IO; no one's denying that. As I said, each solution (more VRAM + slower IO vs less VRAM + faster IO) has its pros and cons. The idea that I'm pushing back on is that smaller VRAM + fast IO is universally better under all circumstances.

All of the above scenarios you mentioned could potentially be done (and done better at that) with more VRAM, albeit the developer would be limited in other ways. For example, the fast-travel scenario: most games have fixed fast-travel entry points. So if you can store 50% of your game content in VRAM at any given time, you can potentially pre-cache the entry points of all of the fast-travel options while leaving the areas away from the entry points to be streamed in post-travel. This would potentially allow for faster fast-travel transitions than loading the data in from the SSD.

Beam down from ship to planet - assuming there's more than 1 planet in the game then you pre-cache the one (or several) that are closest to the ship. Again, the result is no transition screen required at all because you're not limited by data transfer speed since that data's already in VRAM.

I'm not saying there aren't scenarios where the fast IO is simply better, full stop. Of course there are. But equally there are scenarios where having more VRAM would be better, full stop.



But as I mentioned earlier, you wouldn't have to fill all the memory just to start the game. It may only require 4GB to actually get into the game, while on a 32GB system the remaining 28GB is pre-cached while playing. With a SATA SSD, for example, that would be only about 4 seconds of initial load time, with the rest of the data filled in over roughly another 30 seconds of play time. Yes, the initial load would be a few seconds faster with the NVMe drive, but after that you have 32GB of RAM to play with, and if you don't actually need more than 1.1GB/s (with decompression) worth of streaming during gameplay (which I'll wager most games won't), then the only downside is slower initial loads, and maybe fast travels if you can't pre-cache them with the extra RAM.
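The arithmetic behind that scenario, spelled out. The SATA effective rate is the 1.1GB/s (with decompression) from the post; the NVMe effective rate is my own assumption of an XSX-class post-decompression figure:

```python
# Load-time estimates for the "start small, pre-cache the rest" idea.

def load_seconds(gigabytes, effective_gbps):
    """Seconds to move the given data at an effective (decompressed) rate."""
    return gigabytes / effective_gbps

sata_eff = 1.1  # GB/s effective, SATA SSD + decompression (from the post)
nvme_eff = 4.8  # GB/s effective, assumed NVMe-class figure

print(round(load_seconds(4, sata_eff), 1))   # ~3.6 s to get into the game
print(round(load_seconds(28, sata_eff), 1))  # ~25.5 s to pre-cache the rest
print(round(load_seconds(4, nvme_eff), 1))   # ~0.8 s initial load on NVMe
```

So the SATA machine is in the game within a handful of seconds and fully pre-cached within about half a minute, which is the trade being described.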

Am I saying a 32GB PS5 with a SATA SSD would have been the better design choice? No. I'm simply saying it would provide a different mix of advantages and disadvantages vs the current design.
I don't think anyone is saying it's 100% better in all scenarios, but what we (or at least I) are saying is that it hits several spots which more VRAM does not, e.g. cost and being easier for devs to work with/design around. We know Cerny went around talking to devs about what they wanted, and fast IO was top of the list.

Even in your examples, on PS5 the dev 'just' has to load the data - they don't have to plan anything like 'load parts of the nearest or most likely 'jump to' points'...it can all be done on the fly.

Of course having more VRAM would be better - I can't see how anyone would deny that (outside of impact to cost of course), but it's the fast IO feeding that RAM that's removing more barriers to game design.

I'm not sure why you think 4GB would be enough to start a game; the current view is that it's around 8-10GB, right? But your example needs to be better or we're just standing still - even so, that's just the initial load. We know background loading can affect performance while it's happening (for the next 30 seconds or so), and then you have all the design requirements on top.

Think of it this way: games have been designed the same way forever, even though PCs have had massive advantages in GPU, CPU and IO - yet now we can move away from that. But if the consoles hadn't made the strides they have, we'd still be restricted to the lowest common denominator, which would be an IO of what, 500MB/s tops? So you'd probably go back to loading screens for the first ~XX seconds and have janky fast travel and designs built around the IO bottleneck... which would be better than where we were, but not where we can be in a couple of years when the advantages are fully realised.
 
(Can I ask just for shadows and small objects not to pop in when I approach them, but to be there from afar? :eek: )
I think most pop-ins like that are due to the engine inaccurately approximating the LOD?

Pop-ins caused by asset streaming should mostly look like: a low detail model and texture mip is displayed, then after a while the object is rendered in full detail.
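A minimal sketch of the distance-based LOD selection being described, with hysteresis so an object doesn't pop back and forth at a boundary. The distances and level count are made up for illustration; this is not any engine's actual heuristic:

```python
# Distance-based LOD selection with hysteresis. A bad threshold choice
# causes the visible pops described above; streaming delays instead show
# up as a temporary low-detail stand-in.

LOD_DISTANCES = [10.0, 30.0, 80.0]  # beyond each distance, drop one level

def select_lod(distance, current_lod, hysteresis=2.0):
    """Pick an LOD level; hysteresis stops flicker right at a boundary."""
    target = sum(1 for d in LOD_DISTANCES if distance > d)
    # Only switch once we've moved clearly past the boundary.
    if target > current_lod and distance > LOD_DISTANCES[current_lod] + hysteresis:
        return current_lod + 1
    if target < current_lod and distance < LOD_DISTANCES[current_lod - 1] - hysteresis:
        return current_lod - 1
    return current_lod

print(select_lod(5.0, 0))   # 0: close, full detail
print(select_lod(35.0, 1))  # 2: clearly past the 30-unit boundary
print(select_lod(31.0, 1))  # 1: inside the hysteresis band, no pop
```

If the thresholds are tuned too aggressively for the view distance, the switch happens close enough to the camera to read as a pop, which is the "inaccurate LOD approximation" case rather than a streaming one.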
 
I think most pop-ins like that are due to the engine inaccurately approximating the LOD?

Pop-ins caused by asset streaming should mostly look like: a low detail model and texture mip is displayed, then after a while the object is rendered in full detail.

Sure.

My "tone" was more like: it's nice to see this crazy hardware, new techs using it, etc., but the main things I notice in games are related to view distance and shadow casting, which are "old" problems. Like, in CP2077, you see crazy RT stuff, and the next second NPCs not casting shadows and small elements popping in right and left :)
 
Sure.

My "tone" was more like: it's nice to see this crazy hardware, new techs using it, etc., but the main things I notice in games are related to view distance and shadow casting, which are "old" problems. Like, in CP2077, you see crazy RT stuff, and the next second NPCs not casting shadows and small elements popping in right and left :)
Ah! I totally agree, QoL improvements like you mentioned help a lot and are often overlooked.
 
Sorry to correct you a bit, but it is impossible to lose more than 100% if you can't go into negative numbers.
Something can be 4000% faster, but you can't be 101% slower (if you don't go backwards). 100% slower would already mean that you don't do anything ;)

Ok, I missed a word. I meant lose the 4000% increased IO performance. SATA is such a step down it should be allowed negative numbers, though.

I've gotten lost - for "more" RAM, are folks discussing 20GB, or a bigger gen-on-gen figure like 60GB? Certainly nobody can argue with what wins cost-wise.
 
That's because the shader isn't computing the correct light propagation on the surface and normalizing that approximation with an accurate enough occlusion term. The other thing Lumen doesn't handle is area lights. I'm tired of the same old lighting setup with a hierarchy of cubes, spheres, etc. to approximate illumination from a pre-pass. I want a complete light loop for every light source (not just directional), factoring in its size as well as a decent inverse-square falloff, and I want my BRDFs to use importance sampling on the surface, as well as rays shot from the light source and to the material, using a proper PDF and sampling function.
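As a toy illustration of the kind of importance sampling being asked for: cosine-weighted hemisphere sampling for a diffuse surface, with its matching PDF. This is purely a sketch of the textbook technique, not any engine's implementation:

```python
# Cosine-weighted hemisphere sampling in a z-up local frame, with the
# PDF pdf(theta) = cos(theta) / pi that makes it importance sampling
# for a Lambertian BRDF.
import math, random

def sample_cosine_hemisphere():
    """Sample a direction about the surface normal and return (dir, pdf)."""
    u1, u2 = random.random(), random.random()
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    x, y = r * math.cos(phi), r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))  # z = cos(theta), always > 0 here
    pdf = z / math.pi
    return (x, y, z), pdf

# For a Lambertian BRDF (albedo/pi), cos(theta) and the PDF cancel,
# so every sample contributes exactly the albedo: zero variance.
albedo = 0.7
direction, pdf = sample_cosine_hemisphere()
contribution = (albedo / math.pi) * direction[2] / pdf
print(round(contribution, 6))  # 0.7
```

The cancellation in the last lines is exactly why matching the PDF to the BRDF matters: the estimator's variance collapses when the sampling distribution tracks the integrand.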

You can be tired all you want of performance optimizations and trade-offs, but they aren't going away from real-time rendering, regardless of whether you go all-in on RT or other solutions. Metro Enhanced still simplifies significant portions of its light transport down to a voxel-like grid of probes, which causes many of the problems you are complaining about.

If you can throw performance considerations out the window and daydream about a hypothetical magical real-time engine that can path trace triangles with hundreds of paths per pixel, each with hundreds of bounces, then at least compare that to a hypothetical voxel/SDF cone tracer with voxel resolution down to pixel level. Both of them are equally impossible, and both can be improved upon to the point of approaching ground truth if performance is of no concern.
 
I don't think anyone is saying it's 100% better in all scenarios, but what we (or at least I) are saying is that it hits several spots which more VRAM does not, e.g. cost and being easier for devs to work with/design around. We know Cerny went around talking to devs about what they wanted, and fast IO was top of the list.

Even in your examples, on PS5 the dev 'just' has to load the data - they don't have to plan anything like 'load parts of the nearest or most likely 'jump to' points'...it can all be done on the fly.

Of course having more VRAM would be better - I can't see how anyone would deny that (outside of impact to cost of course), but it's the fast IO feeding that RAM that's removing more barriers to game design.

I don't really disagree with any of this. My main pushback was against the developing narrative that the fast IO makes more VRAM essentially irrelevant, and to a large degree even more powerful GPUs irrelevant outside of "more resolution and faster framerates". I still think the new IO is awesome and obviously opens up significant new approaches to game design that more RAM and faster GPUs don't (although RT can certainly simplify the development process).

I'm not sure why you think 4GB would be enough to start a game; the current view is that it's around 8-10GB, right? But your example needs to be better or we're just standing still

Given 8GB is where many high-end GPUs currently sit, I'm pretty sure that modern games don't need that much or more data in VRAM just to be able to start up the gameplay. The vast majority of that is likely to be used for pre-caching data. Resident Evil Village, for example, loads from cold on my HDD in 13 seconds. That's a max of about 1.5GB. And yet in the graphics menu, with all settings maxed, it tells me it needs something like 7-8GB. So the initial load is most certainly smaller than the total use of VRAM.

- even so, that's just the initial load. We know background loading can affect performance while it's happening (for the next 30 seconds or so), and then you have all the design requirements on top.

But isn't this entire discussion based on the new IO enabling fast streaming during gameplay? Whether the streaming is from a slower drive or a faster drive, it's still going to be happening. And in fact streaming is the norm today anyway, so I don't see how this is any different from the current situation. Also, the hardware decompressors in the consoles (along with DirectStorage on the PC) should massively mitigate the overhead.

Think of it this way: games have been designed the same way forever, even though PCs have had massive advantages in GPU, CPU and IO - yet now we can move away from that. But if the consoles hadn't made the strides they have, we'd still be restricted to the lowest common denominator, which would be an IO of what, 500MB/s tops? So you'd probably go back to loading screens for the first ~XX seconds and have janky fast travel and designs built around the IO bottleneck... which would be better than where we were, but not where we can be in a couple of years when the advantages are fully realised.

I definitely agree that the move to SSDs as a baseline is very significant, and taken in the context of the history of console generations and their impact on game design, it's definitely the most standout change of this generation. What I don't necessarily think is that it's the biggest contributing factor to what we would consider "next-gen graphics" this generation, i.e. that a PS4 with the PS5 IO would be creating better-looking games than a PS5 with the PS4 IO.

Also, while the high-speed NVMe drives are obviously awesome and do open up options not possible on slower SSDs, I do wonder whether the more significant jump was from HDD to SSD, with the high-speed NVMe aspect being of lesser importance. As I understand it, Nanite, for example, is possible even on SATA SSDs, and the main enabling factor was the reduced latency and near-instant seek times vs an HDD. I'm sure the 5x speed improvement helped too, though. Based on the comments made at one of the UE5 presentations, it sounds like their streaming requirements were in the hundreds of MB/s rather than GB/s, so it makes me wonder as to the full extent of the additional utility brought by the ultra-fast drives. Reduced load times are the obvious one, which includes things like portals or even doors that lead to new environments (e.g. going from outdoors to indoors). These are obviously important, but perhaps not more so than the paradigm shift that basic SSD use enables via the likes of Nanite.
 
On the other hand, you trade memory for bandwidth. The HDD of the old gen didn't do much to produce memory contention problems. But now we have the GPU, the (much faster) CPU, and the SSD all fighting for memory bandwidth. Losses through memory contention are no longer that small (especially if you read something quite small that decompresses into memory as a big "file"). And when you have something bandwidth-demanding like RT going on, you might decide to take the route of a larger in-memory buffer and instead save some memory bandwidth for better calculations, ...
It always depends.

Nah man, had SSDs been viable before, they would have been a win for any disk-based console. Storage IO has been one of the main bottlenecks for most games using robust streaming engines. And most games not using robust streaming engines were just leaving graphical variety, quality and level scope on the table to avoid having to tackle that problem (and filling their games with loading screens and segmenting their worlds into small contained areas).

As an example, Andy Gavin, ND co-founder and lead software engineer in their early years, implemented a streaming system for Crash Bandicoot on PS1, and he claims that was one of its fundamental advantages over other games, and why it was so visually rich. Even some of their performance trickery was enabled by streaming - like how they baked all the level culling ahead of time (since levels were completely on-rails), but the consequent pre-baked display lists would not have fit in PS1 memory had they not been able to stream them in chunks throughout the level.
 
Nah man, had SSDs been viable before, they would have been a win for any disk-based console. Storage IO has been one of the main bottlenecks for most games using robust streaming engines. And most games not using robust streaming engines were just leaving graphical variety, quality and level scope on the table to avoid having to tackle that problem (and filling their games with loading screens and segmenting their worlds into small contained areas).

As an example, Andy Gavin, ND co-founder and lead software engineer in their early years, implemented a streaming system for Crash Bandicoot on PS1, and he claims that was one of its fundamental advantages over other games, and why it was so visually rich. Even some of their performance trickery was enabled by streaming - like how they baked all the level culling ahead of time (since levels were completely on-rails), but the consequent pre-baked display lists would not have fit in PS1 memory had they not been able to stream them in chunks throughout the level.
Yes, streaming is important, and only now with SSDs is it really possible with minimal buffering, but memory bandwidth is also important and still quite "rare". There is no big cache like in AMD's latest graphics cards that reduces bandwidth needs. IO is important up to a certain point (wherever that is - I would say it is below 1GB/s for current chips, and everything above that is in the region of diminishing returns). This might also be very scenario-dependent.

E.g. the new R&C game uses it for fast traversal, but still needs to "load" for 1-2s to really change the world. With that "load" time, you can still see that a buffer in memory is being filled with the new data (and getting processed). Everything else needs less streaming speed, else they wouldn't need 1-2s of reading to buffer on a world change. The question is how much data they need to buffer so everything can be streamed in on demand (a few frames ahead). Damn those NDAs... but maybe with the next developer conference we will get some insights.
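The "stream a few frames ahead" idea can be put in rough numbers. The effective (post-decompression) throughput used here is an assumed figure for a PS5-class drive, not a measured one:

```python
# Back-of-envelope look-ahead budget: how much data can arrive per
# frame, and how big a few-frames-ahead buffer that implies.

def per_frame_budget_mb(effective_gbps, fps):
    """Data deliverable per frame in MB at a given effective throughput."""
    return effective_gbps * 1000.0 / fps

def lookahead_buffer_mb(effective_gbps, fps, frames_ahead):
    """Buffer size (MB) needed to stay a given number of frames ahead."""
    return per_frame_budget_mb(effective_gbps, fps) * frames_ahead

print(per_frame_budget_mb(5.5, 60))     # ~91.7 MB arriving per 60 fps frame
print(lookahead_buffer_mb(5.5, 60, 4))  # ~366.7 MB buffered 4 frames ahead
```

On these assumptions a few hundred MB of look-ahead buffer already covers several frames, which is why the 1-2s world-change reads stand out as the exception rather than the steady-state streaming load.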
 
I'd love to know what big developers would rather have had if given the option:

Let's use Xbox Series X:

16GB RAM + fast NVME SSD (actual specs)

Or...

20GB RAM + SATA SSD

The second option gives you unified memory bandwidth, but I'm assuming you lose much of the benefit of DirectStorage, and possibly SFS, with the SATA SSD.
Is there still a big price difference between a SATA SSD and an NVMe SSD?

If the savings from SATA weren't enough to offset 4GB, we could've ended up with 16GB & a SATA SSD :runaway:

If they both equated to the same cost, I personally would go with the NVMe SSD. Strengthen the weakest link in the chain and have a more balanced system that's more flexible in terms of game design, imo.
 
I don't really disagree with any of this. My main pushback was against the developing narrative that the fast IO makes more VRAM essentially irrelevant, and to a large degree even more powerful GPUs irrelevant outside of "more resolution and faster framerates". I still think the new IO is awesome and obviously opens up significant new approaches to game design that more RAM and faster GPUs don't (although RT can certainly simplify the development process).
I think there's a difference between not necessary and irrelevant. It's a bit like converting everyone from petrol/diesel to electric, in that instance we have finite resources vs costs in the gaming space (if that makes any sense!). As you say, most games are being built with lower VRAM in mind and this would not be just because of consoles, the PC equivalent makes it disproportionately more expensive.

But isn't this entire discussion based on the new IO enabling fast streaming during gameplay? Whether the streaming is from a slower drive or a faster drive, it's still going to be happening. And in fact streaming is the norm today anyway, so I don't see how this is any different from the current situation. Also, the hardware decompressors in the consoles (along with DirectStorage on the PC) should massively mitigate the overhead.
One scenario is streaming just what it needs when it needs it; the other is streaming everything it might possibly want at any one time. It's like apples to oranges really, and then we have bottlenecks of dataflow to consider... which I mention later (regarding Cerny's presentation).

I definitely agree that the move to SSDs as a baseline is very significant, and taken in the context of the history of console generations and their impact on game design, it's definitely the most standout change of this generation. What I don't necessarily think is that it's the biggest contributing factor to what we would consider "next-gen graphics" this generation, i.e. that a PS4 with the PS5 IO would be creating better-looking games than a PS5 with the PS4 IO.
I kind of get where you're coming from, and I had a similar conversation pre-launch with a friend. In my mind, the quality and variety of the textures is part of the graphical make up - I know it's not rendering grunt that a GPU does, but still, if the SSD affords more detailed and varied textures in my mind that makes a game have better graphics.

Also, while the high-speed NVMe drives are obviously awesome and do open up options not possible on slower SSDs, I do wonder whether the more significant jump was from HDD to SSD, with the high-speed NVMe aspect being of lesser importance. As I understand it, Nanite, for example, is possible even on SATA SSDs, and the main enabling factor was the reduced latency and near-instant seek times vs an HDD. I'm sure the 5x speed improvement helped too, though. Based on the comments made at one of the UE5 presentations, it sounds like their streaming requirements were in the hundreds of MB/s rather than GB/s, so it makes me wonder as to the full extent of the additional utility brought by the ultra-fast drives. Reduced load times are the obvious one, which includes things like portals or even doors that lead to new environments (e.g. going from outdoors to indoors). These are obviously important, but perhaps not more so than the paradigm shift that basic SSD use enables via the likes of Nanite.
I guess it all depends where you sit regarding Cerny's presentation on the SSD and the explanation that the speed is just one aspect; whilst it helps, it's resolving all the other bottlenecks along the way that unlocks the advantages.

I really do think it's early days to write off the impact of SSDs - we already have the XSX multi-game QR feature which is pretty amazing and games just scratching the possibilities.
 
Is there still a big price difference between a SATA SSD and an NVMe SSD?

If the savings from SATA weren't enough to offset 4GB, we could've ended up with 16GB & a SATA SSD :runaway:

If they both equated to the same cost, I personally would go with the NVMe SSD. Strengthen the weakest link in the chain and have a more balanced system that's more flexible in terms of game design, imo.
Well, the chip in the Xbox at least is quite expensive. It is a single chip that is capable of delivering 2.4GB/s (=> low latencies). There are not many chips on the market that can do that. It might have been cheaper for MS to go with a bigger M.2 SSD that has multiple chips, but that might just be too big, especially for an external solution. A two-chip solution might have worked externally with one chip on each side, but then they might not have been able to cool it enough. And while two chips at half frequency normally use less power than one at double frequency, the latencies then go up again. So it really depends on the use case MS had in mind when they made the "one-chip" decision. Maybe the lower latencies were needed for what they had in mind.

And the price of a SATA SSD vs M.2 is mostly about the same. Only the really fast M.2 drives are expensive, but most of them just throttle after a while because of overheating issues.
 
The buffer might get smaller with more bandwidth, but not by that much.

...

But now we have the GPU, the (much faster) CPU, and the SSD all fighting for memory bandwidth. Losses through memory contention are no longer that small (especially if you read something quite small that decompresses into memory as a big "file"). And when you have something bandwidth-demanding like RT going on, you might decide to take the route of a larger in-memory buffer and instead save some memory bandwidth for better calculations, ...
It always depends.

Except that's not how it works in the new system: in most cases, the buffer doesn't get smaller, it goes away!

For instance, a texture is streamed straight from the SSD into the GPU's own cache pipelines, bypassing RAM altogether, and so not using space there or any bandwidth. Not even the CPU is used in any significant way, as the object goes from the SSD to a hardware decompressor and then straight to the GPU.

And raytraced reflections also rarely work the way you think, as there is rarely enough raytracing power to get a high-res, infinite-depth reflection of the actual game world you would see if you turned around.

I actually wonder if it would not be cheaper to just calculate the correct coordinates to mirror and transpose the meshes to the mirror location, and just render them as if they were actual geometry...
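The mirrored-geometry idea comes down to reflecting each vertex across the mirror plane and rendering the result as ordinary geometry. A minimal sketch of that transform (pure math, no engine API):

```python
# Reflect a point across the plane {x : dot(n, x) = d}, where n is the
# unit normal of the mirror and d its offset. Applying this to every
# vertex of a mesh yields the "mirrored copy" to render normally.

def reflect_point(p, n, d):
    """Reflect point p across the plane with unit normal n and offset d."""
    dist = sum(pi * ni for pi, ni in zip(p, n)) - d  # signed distance to plane
    return tuple(pi - 2.0 * dist * ni for pi, ni in zip(p, n))

# Mirror on the plane x = 2 (normal pointing along +x):
n, d = (1.0, 0.0, 0.0), 2.0
print(reflect_point((5.0, 1.0, 3.0), n, d))  # (-1.0, 1.0, 3.0)
```

One caveat with this classic planar-reflection trick: the reflection flips the mesh's winding order, so a renderer would also need to swap its face-culling mode for the mirrored pass.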
 