The pros and cons of eDRAM/ESRAM in next-gen

sebbbi · May 7, 2014

DSoup said:
The too. My understanding of alpha blending is that it's basically compositing two or more things, so surely you need more reads than writes!?!

You need an equal number of backbuffer reads and writes. Blending = read the backbuffer value, combine the existing value with the pixel shader output, write the value back to the backbuffer. One read and one write.

The shader itself obviously reads the particle texture as well. I explained this in my post. However, the particle texture is DXT compressed (1 byte per pixel) and the backbuffer is RGBA16F (8 bytes per pixel), so the particle texture read is insignificant in the usual case. In the tiled case, the particle texture read is much more significant because the backbuffer reads and writes come mostly from the cache (and thus consume zero BW after the first read and first write). That's why I concluded that in the tiled case you'd likely have roughly 2x reads compared to writes.
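As a rough sketch of the arithmetic above (not a measurement), the per-pixel memory traffic can be tallied like this. The 8 and 1 bytes per pixel come from the post; the overdraw count of 8 layers and the function name `blend_traffic` are assumptions chosen for illustration, and the cache is idealized as absorbing everything except one backbuffer read and one write.

```python
# Bytes per pixel, taken from the post above.
BACKBUFFER_BPP = 8  # RGBA16F backbuffer
TEXTURE_BPP = 1     # DXT-compressed particle texture


def blend_traffic(layers, backbuffer_cached):
    """Return (read_bytes, write_bytes) hitting memory for one screen pixel
    covered by `layers` alpha-blended particles (idealized model)."""
    # The particle texture is sampled once per blended layer.
    texture_reads = layers * TEXTURE_BPP
    if backbuffer_cached:
        # Tiled case: only one backbuffer read and one write reach memory.
        bb_reads = BACKBUFFER_BPP
        bb_writes = BACKBUFFER_BPP
    else:
        # Untiled case: every layer reads and writes the backbuffer in memory.
        bb_reads = layers * BACKBUFFER_BPP
        bb_writes = layers * BACKBUFFER_BPP
    return texture_reads + bb_reads, bb_writes


LAYERS = 8  # assumed overdraw, not a figure from the thread
untiled = blend_traffic(LAYERS, backbuffer_cached=False)
tiled = blend_traffic(LAYERS, backbuffer_cached=True)

for name, (reads, writes) in (("untiled", untiled), ("tiled", tiled)):
    print(f"{name}: {reads} B read, {writes} B written, ratio {reads / writes:.2f}")
```

With the assumed 8 layers, the untiled read/write ratio stays close to 1, while the tiled ratio comes out at 2, which is in line with the "roughly 2x" estimate. The exact ratio depends entirely on the assumed overdraw.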

Shifty Geezer · May 7, 2014

DSoup said:
The too. My understanding of alpha blending is that it's basically compositing two or more things, so surely you need more reads than writes!?!

That's what I understood, thinking that you have one particle that you write multiple times to the framebuffer. Instead, each blend reads the framebuffer and reads the particle, blends them, and writes out. Seems grossly inefficient, but as such, a local scratchpad RAM would provide a fast working buffer pretty ideal for the task.

sebbbi · May 7, 2014

Shifty Geezer said:
That's what I understood, thinking that you have one particle that you write multiple times to the framebuffer. Instead, each blend reads the framebuffer and reads the particle, blends them, and writes out. Seems grossly inefficient, but as such, a local scratchpad RAM would provide a fast working buffer pretty ideal for the task.

This is how it's always been done on immediate mode renderers. On an immediate mode renderer (Intel, AMD, NVIDIA GPUs) you don't know what future draw calls or primitives there might be. You need to process every particle separately.

PowerVR TBDR hardware, on the other hand, first collects all the draw calls and all the triangles, and then renders one small portion of the screen at a time. This way you don't need to repeatedly write to and read from main memory when you do alpha blending or when you have overdraw. The downside of this method is that you need to store all the transformed triangle data somewhere in memory, because you need to bin all of it to the tiles first and then access it later. Storing and later fetching this data consumes memory bandwidth. It's only a win if you have high overdraw, and too high a triangle density hurts as well (because you need to store & load more temporary data to and from memory).
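The trade-off described above can be put into a toy cost model. This is only a sketch under stated assumptions: the 48 bytes per stored triangle, the 1280x720 target and both workloads are hypothetical numbers, and the model ignores texture reads, caches, and triangles that land in more than one tile.

```python
def immediate_mode_bytes(pixels, overdraw, bytes_per_pixel):
    """Framebuffer traffic when every blended layer reads and writes memory."""
    return pixels * overdraw * 2 * bytes_per_pixel


def tbdr_bytes(pixels, bytes_per_pixel, triangles, bytes_per_triangle):
    """Traffic when blending stays on-chip: binned triangle data is stored
    once and fetched once, and each finished pixel is written out once."""
    binning = triangles * bytes_per_triangle * 2
    resolve = pixels * bytes_per_pixel
    return binning + resolve


PIXELS = 1280 * 720   # assumed render target size
BPP = 8               # RGBA16F, as in the earlier post
TRI_BYTES = 48        # hypothetical size of one transformed triangle

# Workload A: heavy blending, few triangles (assumed numbers).
a_imr = immediate_mode_bytes(PIXELS, overdraw=8, bytes_per_pixel=BPP)
a_tbdr = tbdr_bytes(PIXELS, BPP, triangles=100_000, bytes_per_triangle=TRI_BYTES)

# Workload B: almost no overdraw, very dense geometry (assumed numbers).
b_imr = immediate_mode_bytes(PIXELS, overdraw=1, bytes_per_pixel=BPP)
b_tbdr = tbdr_bytes(PIXELS, BPP, triangles=1_000_000, bytes_per_triangle=TRI_BYTES)

print("high overdraw:", a_imr, "vs", a_tbdr)
print("dense geometry:", b_imr, "vs", b_tbdr)
```

Under these assumed numbers the tiled approach moves far fewer bytes in the high-overdraw case and far more in the dense-geometry case, which is the shape of the trade-off the post describes.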

Deleted member 11852 · May 7, 2014

sebbbi said:
You need an equal number of backbuffer reads and writes. Blending = read the backbuffer value, combine the existing value with the pixel shader output, write the value back to the backbuffer. One read and one write.

Really appreciate the explanation. My basic understanding of modern graphics technologies is severely lacking.

Starx · May 8, 2014

http://gamingbolt.com/crytek-not-ex...-to-be-patched-but-expects-unique-tech-for-it

Ike Turner · May 8, 2014

Starx said:
http://gamingbolt.com/crytek-not-ex...-to-be-patched-but-expects-unique-tech-for-it

"Patched"? wth is this BS clickbait nonsense? As if this was some kind of software issue....smh

liolio · May 9, 2014

Makes sense to me. As I read it, there won't be a software release that makes it self-managed, working as a cache, so that devs no longer have to work around its size limitation.

pjbliverpool · May 9, 2014

It's interesting how views can change on a particular architectural aspect depending on the competition. With the PS3 as the only comparison, the XB360's edram was considered a great advantage of the architecture, while this generation a better implementation of it is considered a hindrance compared with the PS4 architecture.

Nesh · May 9, 2014

pjbliverpool said:
It's interesting how views can change on a particular architectural aspect depending on the competition. With the PS3 as the only comparison, the XB360's edram was considered a great advantage of the architecture, while this generation a better implementation of it is considered a hindrance compared with the PS4 architecture.

Well yeah. The PS3 was more complex than the 360 and it had memory bottlenecks. The 360 had unified memory + the edram. Comparing the two like for like, the edram could demonstrate its advantages.

This time around though, the PS4 isn't like the PS3. Its memory solution is better and simpler. If the PS4 had gone for 8GB GDDR3 and the GPU performance was just like the X1's, the esram's advantages over the PS4's solution would have been an obvious plus.

But considering what the two consoles actually offer, its advantage isn't really there. It just appears to be a different, more complicated route towards achieving the same goal. The PS4's route overshadows the X1's better implementation of esram compared to the 360's edram.

It's only in relative terms that the esram "isn't a better" solution.

Globalisateur · May 9, 2014

pjbliverpool said:
It's interesting how views can change on a particular architectural aspect depending on the competition. With the PS3 as the only comparison, the XB360's edram was considered a great advantage of the architecture, while this generation a better implementation of it is considered a hindrance compared with the PS4 architecture.

The X360 memory architecture is different from the XB1's. The X360 edram was only used for framebuffer tasks: think of the 10MB as a specialized GPU cache dedicated to very fast framebuffer ROP operations, which is great if you use, for instance, MSAA compared to quincunx on PS3.

That was enough because the X360 had, for its time, fast unified GDDR3 memory.

You can compare the XB1 with the PS2 memory architecture, where developers had trouble fitting all textures in the 4MB vram (and I remember clearly that developers back in the PS2 era complained for almost exactly the same reasons some XB1 devs complain now: video memory is too small, impossible to fit enough textures and modern video buffers in 32MB).

The esram is really video memory where you would ideally need to put the framebuffer but also textures, video buffers, etc. in order to get adequate bandwidth, because main memory is rather slow. That wasn't the case with the X360: its unified GDDR3 memory was fast enough for textures and everything else needing high bandwidth.
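To get a feel for how quickly "modern video buffers" can fill 32MB, here is a back-of-the-envelope footprint calculation. The render target set below is purely hypothetical (one RGBA16F colour target, two RGBA8 targets and a 32-bit depth buffer); only the 32MB pool size comes from the thread, and the names are made up for illustration.

```python
ESRAM_BYTES = 32 * 1024 * 1024  # 32MB pool discussed in the thread

# Hypothetical render target set: name -> bytes per pixel (assumptions).
TARGETS = {
    "hdr_color_rgba16f": 8,
    "gbuffer0_rgba8": 4,
    "gbuffer1_rgba8": 4,
    "depth_32bit": 4,
}


def footprint(width, height, targets=TARGETS):
    """Total bytes needed to hold every target at the given resolution."""
    return width * height * sum(targets.values())


for width, height in [(1920, 1080), (1600, 900)]:
    size = footprint(width, height)
    verdict = "fits" if size <= ESRAM_BYTES else "does not fit"
    print(f"{width}x{height}: {size / 2**20:.1f} MiB, {verdict} in 32MB")
```

With this assumed target set the full 1080p buffers overflow the pool, while a lower resolution fits. A different set of formats would give a different answer, so treat this as an illustration of the sizing pressure rather than a claim about any particular game.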

Shifty Geezer · May 9, 2014

Globalisateur said:
The esram is really video memory where you would ideally need to put the framebuffer but also textures, video buffers, etc. in order to get adequate bandwidth, because main memory is rather slow...

Except you can access main RAM concurrently, including reads and writes of framebuffers and, I believe, MRT. So it's really just split pools with fully compatible functionality, unlike other platforms that either had to work exclusively from RAM due to memory topology, or couldn't access main RAM due to it having very restricted access/BW.

It's probably best not to think of XB1 in terms of any other machine. I can't recall a parallel.

Rangers · May 9, 2014

Nesh said:
Well yeah. The PS3 was more complex than the 360 and it had memory bottlenecks. The 360 had unified memory + the edram. Comparing the two like for like, the edram could demonstrate its advantages.

This time around though, the PS4 isn't like the PS3. Its memory solution is better and simpler. If the PS4 had gone for 8GB GDDR3 and the GPU performance was just like the X1's, the esram's advantages over the PS4's solution would have been an obvious plus.

But considering what the two consoles actually offer, its advantage isn't really there. It just appears to be a different, more complicated route towards achieving the same goal. The PS4's route overshadows the X1's better implementation of esram compared to the 360's edram.

It's only in relative terms that the esram "isn't a better" solution.

It does offer one major advantage, a consolation prize if you will: the ability to make do with cheaper DDR3 RAM.

So far, of course, MS has not leveraged this at all, mostly imo due to Kinect expenses hiding it.

If it was only the standalone box, it seems likely to me that it could retail for $349 without too much pain, should MS choose to. In that case the advantage would be more visible.

From bkilian's posts, it seems MS never considered anything but an ESRAM/EDRAM+DDR3 design. It wasn't a choice they made versus GDDR5, or a choice made with foresight to enable 8GB instead of 4GB (this was a late decision, apparently, well after the ESRAM), that then somehow backfired when Sony achieved 8GB GDDR5. They never had anything else in mind. I don't really care for how wedded MS seems to be to the ESRAM/EDRAM idea. I didn't even love it in the 360, but at least there the disadvantages (less die area for the GPU) were not visible due to the competition sporting a comparable GPU. All that said, I don't think XOne is a horrible design by any means.

HTupolev · May 9, 2014

pjbliverpool said:
It's interesting how views can change on a particular architectural aspect depending on the competition. With the PS3 as the only comparison, the XB360's edram was considered a great advantage of the architecture

There have been plenty of complaints about the eDRAM size in the 360, with the developers of Halo running around practically saying "tiling sucks."

Obviously it looks like an advantage when you look at the PS3 exclusives that look like they're running at 240p whenever there are transparencies on-screen, but even so you got complaints.

Globalisateur · May 9, 2014

Shifty Geezer said:
Except you can access main RAM concurrently, including reads and writes of framebuffers and, I believe, MRT. So it's really just split pools with fully compatible functionality, unlike other platforms that either had to work exclusively from RAM due to memory topology, or couldn't access main RAM due to it having very restricted access/BW.

It's probably best not to think of XB1 in terms of any other machine. I can't recall a parallel.

Of course, the specifics are different from the PS2 architecture. I am sure the PS2's 4MB vram wasn't, like you said, as flexible as the XB1's 32MB of fast memory.

But both memory architectures are really similar: a big pool of slow memory + a small pool of fast memory, and both sets of developers have really similar complaints about it -> 4MB wasn't enough for their games / 32MB isn't enough for many developers (evidently not first party developers, but some others, yes).

It didn't prevent developers from making great games on PS2. But PS2 games often looked different from first party Xbox/gamecube games. We could easily recognize a PS2 game just by looking at those same reused textures, even in a game like Jak & Daxter (which wasn't a launch game and was one of the best-looking PS2 games, even today), compared to games having a great amount of textures and lighting effects like Splinter Cell on Xbox and Rogue Leader (launch game) & Resident Evil 4 on gamecube. Even Sonic Adventure & Toy Commander (both launch games) on my beloved dreamcast had more varied & better textures in one screen/level than most PS2 launch period games.

I remember the PS2 era very clearly. I bought one at launch, and I remember like it was yesterday the 4MB bottleneck argument, the graphical comparison with my dreamcast/xbox/gamecube games, and how developers complained about it in various media, right from the first PS2 specs announcement.

So I am very confident that there are very strong similarities between the alleged PS2 and XB1 4MB/32MB bottlenecks.

Shifty Geezer · May 9, 2014

Globalisateur said:
Of course, the specifics are different from PS2 architecture.

They're fundamentally different. PS2's GPU could only access VRAM, so everything had to fit. XB1's GPU OTOH can access both RAM and VRAM, so it has no capacity limitations. You could ignore the ESRAM completely as I understand it and then XB1 would look just like a laptop with shared GPU+CPU memory. It's therefore as unique as XB360 was.

chris1515 · May 9, 2014

Globalisateur said:
Of course, the specifics are different from the PS2 architecture. I am sure the PS2's 4MB vram wasn't, like you said, as flexible as the XB1's 32MB of fast memory.

But both memory architectures are really similar: a big pool of slow memory + a small pool of fast memory, and both sets of developers have really similar complaints about it -> 4MB wasn't enough for their games / 32MB isn't enough for many developers (evidently not first party developers, but some others, yes).

It didn't prevent developers from making great games on PS2. But PS2 games often looked different from first party Xbox/gamecube games. We could easily recognize a PS2 game just by looking at those same reused textures, even in a game like Jak & Daxter (which wasn't a launch game and was one of the best-looking PS2 games, even today), compared to games having a great amount of textures and lighting effects like Splinter Cell on Xbox and Rogue Leader (launch game) & Resident Evil 4 on gamecube. Even Sonic Adventure & Toy Commander (both launch games) on my beloved dreamcast had more varied & better textures in one screen/level than most PS2 launch period games.

I remember the PS2 era very clearly. I bought one at launch, and I remember like it was yesterday the 4MB bottleneck argument, the graphical comparison with my dreamcast/xbox/gamecube games, and how developers complained about it in various media, right from the first PS2 specs announcement.

So I am very confident that there are very strong similarities between the alleged PS2 and XB1 4MB/32MB bottlenecks.

It is a different situation: assets are the same between PS4 and Xbox One... There are only differences in resolution and framerate between the PS4 and Xbox One...

steveOrino · May 9, 2014

pjbliverpool said:
It's interesting how views can change on a particular architectural aspect depending on the competition. With the PS3 as the only comparison, the XB360's edram was considered a great advantage of the architecture, while this generation a better implementation of it is considered a hindrance compared with the PS4 architecture.

The 360's eDRAM was only an advantage early on, until deferred rendering became dominant and you had to tile. Really, the biggest assets of the 360 were its unified memory and a very good GPU in comparison to the RSX.

Embedded RAM is an economic choice in both instances (360/X1), especially in amounts that small.... More so with the X1, to make sure the price was south of $600 USD.

MrFox · May 9, 2014

It doesn't look like the choice was an economic one, unless they completely missed the mark with their projection of GDDR5 cost. They were targeting 4GB at some point ($14 difference between DDR3 and GDDR5) and ended up with 8GB ($28 difference).

All else being equal, the cost of the die area for the ESRAM practically cancels out that difference (something like 1/4 of a $100 chip?), and it would have been a more expensive proposition, comparatively, had they gone with the original plan of 4GB. In short, their 4GB solution of DDR3+ESRAM would have been more expensive than GDDR5 and no ESRAM.
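The comparison can be written out as a small calculation. The $14 per 4GB premium and the "1/4 of a $100 chip" ESRAM estimate are the rough figures from this post, not verified costs; the function name is made up for illustration, and the premium is assumed to scale linearly with capacity.

```python
GDDR5_PREMIUM_PER_4GB = 14.0   # USD, rough figure quoted in the post
ESRAM_DIE_COST = 100.0 / 4     # USD, the post's guess: 1/4 of a $100 chip


def ddr3_esram_minus_gddr5(capacity_gb):
    """Cost of DDR3+ESRAM relative to GDDR5 without ESRAM, in USD.
    Positive means the DDR3+ESRAM design is the more expensive one."""
    # Assumes the GDDR5 premium scales linearly with capacity.
    gddr5_premium = GDDR5_PREMIUM_PER_4GB * capacity_gb / 4
    return ESRAM_DIE_COST - gddr5_premium


for capacity in (4, 8):
    delta = ddr3_esram_minus_gddr5(capacity)
    print(f"{capacity}GB: DDR3+ESRAM minus GDDR5 = {delta:+.0f} USD")
```

Under these rough figures the DDR3+ESRAM design comes out more expensive at 4GB and roughly break-even at 8GB, which is the point being made. Board costs and future node shrinks are not modelled.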

Shifty Geezer · May 9, 2014

At launch, maybe. However, as I understand it, the plan with the ESRAM is long-term reduction. A few node shrinks down the line and the ESRAM should be very cheap. How cheap compared to GDDR5 is anyone's guess though, but you have to factor in board costs as well.

Sadly B3D has never had an electronics engineer expert with enough behind-the-scenes knowledge to really compare such choices. We can only guess based on broad knowledge and the usual teardowns and expert opinions.

Nisaaru · May 9, 2014

When the next GPU generation switches to HBM, which seems to be the next big thing, GDDR5 will be pretty much history, while DDR3 surely has a longer life cycle. But then I don't really think it matters if they can just redesign the memory setup in a console refresh. The hardware and software are too complex to depend on exact memory timings.
