Embedded Memory in GPUs too expensive?

jpr27

Regular
I know this kind of crosses the line with a thread I was looking at in the console chat section, so let me know if I should repost it there :oops:

I was reading how the Xbox 2 GPU "could possibly have" 10MB of embedded memory (not sure on the PS3). I have read some of the benefits this achieves, but wondered why there isn't more extensive use of it in the PC GPU area? (I'm not including the workstation portion of cards as I know nothing about them) :oops: Is the cost just too prohibitive? Die size? Heat?

Does anyone feel that this will be a focus in the next generation of PC GPUs or in the very near future? If not, what other alternatives might be in the "pipeline"? :devilish:
 
Embedded memory is a potential way to go in the PC market. But current GPUs are doing just fine maxing out their size by simply adding more pipelines. In the console market it makes a bit more sense to go embedded, for the simple reason that the console can be made quite a bit cheaper if local graphics memory is done away with entirely. Sharing memory with the CPU, however, is always a risky business as far as performance is concerned, and giving the GPU its own dedicated (embedded) memory is a great way to keep GPU performance up while keeping costs down at the same time.

That said, I think that embedded memory in the desktop space will probably see its first incarnation at the low end, where it can be used to do away with onboard graphics memory entirely.

But in the high-end, I think current video cards like the GeForce 6600 GT, which manages to have pretty good efficiency even with 8 pipelines and only a 128-bit bus, show that we can go quite a bit further in increasing the number of pipelines before memory bandwidth really begins to be a problem.

And then you have to consider that pixel shaders may further reduce the need for large memory bandwidth, as the more time is spent processing per pixel, the less memory bandwidth is needed per pixel.
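
As a rough illustration of that last point, here's a back-of-the-envelope sketch; the clock speed, pipeline count and bytes of memory traffic per pixel are invented figures for illustration, not measurements of any real card:

```python
# Back-of-the-envelope sketch with invented numbers: if every pixel generates a
# fixed amount of framebuffer/texture traffic, the bandwidth a chip can actually
# consume is capped by how fast its pipelines emit pixels, and that rate drops
# as shaders get longer.
CORE_CLOCK_HZ   = 500e6   # assumed core clock
PIPELINES       = 8       # assumed pipeline count (6600 GT-class)
BYTES_PER_PIXEL = 16      # assumed colour + Z + texture traffic per pixel

for shader_cycles in (1, 2, 4, 8, 16):
    pixels_per_second = CORE_CLOCK_HZ * PIPELINES / shader_cycles
    demanded_gbs = pixels_per_second * BYTES_PER_PIXEL / 1e9
    print(f"{shader_cycles:2d} cycles/pixel -> ~{demanded_gbs:5.1f} GB/s of memory traffic")
```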
 
3DO / Matsushita MX chipset - first proposed consumer use of embedded memory in graphics processor - canned

Verite 4400E ~125M transistors - 12 MB eDRAM - canned

BitBoys Glaze3D / Xtreme Bandwidth Architecture - 9 MB eDRAM - canned

Graphics Synthesizer - 4 MB eDRAM - in PS2

Flipper - 3.12 MB eDRAM - in GC

GS I-32 - 32 MB eDRAM, used in GSCube and maybe other applications?

Xbox 2 VPU ~10 MB embedded memory? (hopefully more)

Revolution VPU - some amount of embedded memory likely

PS3 GS3/Visualizer - some fairly large amount of embedded memory likely.

(probably missed a few)

I'd like to see the closest desktop equivalent of the Xbox 2 VPU have some embedded memory, but I'm not holding out too much hope. We'll see.

Embedded memory is very good, or can be: a massive increase in bandwidth. But it may not be needed in the desktop environment, as more work gets done per pixel as time goes on.
 
Chalnoth said:
But in the high-end, I think current video cards like the GeForce 6600 GT, which manages to have pretty good efficiency even with 8 pipelines and only a 128-bit bus, show that we can go quite a bit further in increasing the number of pipelines before memory bandwidth really begins to be a problem.

And then you have to consider that pixel shaders may further reduce the need for large memory bandwidth, as the more time is spent processing per pixel, the less memory bandwidth is needed per pixel.


I understand that adding more pipelines would help, but I was thinking more along the lines of texture swapping. With 1024x1024 textures becoming the norm in the future, wouldn't embedded RAM help keep the pipes fed? Or am I wrong about what embedded RAM would mainly be used for?
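
(For a sense of scale, and purely as an illustration assuming uncompressed 32-bit texels:)

```python
# Rough sizing, assuming uncompressed 32-bit RGBA texels (illustrative only):
width = height = 1024
bytes_per_texel = 4
base_mib = width * height * bytes_per_texel / 2**20   # 4 MiB for the top level
with_mips_mib = base_mib * 4 / 3                      # a full mip chain adds ~1/3
print(base_mib, "MiB base level,", round(with_mips_mib, 2), "MiB with mipmaps")
# A single such texture already rivals a ~10 MB eDRAM pool, which is one reason
# eDRAM discussions tend to centre on framebuffer/Z traffic rather than textures.
```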
 
Take normal CPUs as a comparison. Such internal memory would essentially be an internal cache, but since it would be that big, there would be no need for an external one. And since internal memory is always much faster than external, it's the better choice. But there would be other problems:
the size of the chip and heat, and probably a higher cost due to the dense memory portion (as on CPUs with more L1 or L2 cache).
 
Here is an excerpt from the ATI Hardware IRC chat held at Rage3D on Dec. 3, 2003:
<Ratchet> Will ATI start to use embedded / on-chip memory anytime soon?
We could tell you but then we would have to kill you. done.
<SirEric> We do have on package memory products now. done.
last question...
<SirEric> (note: our chips have LOTS of onchip memory already. done)
source: ( http://www.rage3d.com/board/showthread.php?s=3651af711870f15b4dc8d8388147dcb5&threadid=33728806 )

(related threads at Rage3D: http://www.rage3d.com/board/showthread.php?s=3651af711870f15b4dc8d8388147dcb5&threadid=33728821

http://www.rage3d.com/board/showthread.php?s=3651af711870f15b4dc8d8388147dcb5&threadid=33728805

http://www.rage3d.com/board/showthread.php?s=3651af711870f15b4dc8d8388147dcb5&threadid=33728886 )
 
From what I know about semiconductors, the internal memory of that size would double the price of chips, let alone that the yields would go down the toilet due to increased failure rate.
 
_xxx_ said:
From what I know about semiconductors, the internal memory of that size would double the price of chips, let alone that the yields would go down the toilet due to increased failure rate.

Yields don't have to suffer do they, if the RAM is designed with redundancy (a la large on-die caches on eg. Itanium 2).
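
To illustrate the redundancy argument, here is a toy Poisson yield model; the defect density and area numbers are invented for illustration, not real process data:

```python
import math

# Toy Poisson yield model: yield ~ exp(-defect_density * area).
# Spare rows/columns let the RAM array tolerate a few defects, so in the
# best case only the logic area still "counts" against yield.
D0        = 0.4   # assumed defects per cm^2
LOGIC_CM2 = 1.5   # assumed logic area
EDRAM_CM2 = 0.8   # assumed eDRAM array area

no_repair   = math.exp(-D0 * (LOGIC_CM2 + EDRAM_CM2))  # every defect kills the die
with_repair = math.exp(-D0 * LOGIC_CM2)                # RAM defects fixed via spares
print(f"no redundancy: {no_repair:.1%}, with RAM redundancy: {with_repair:.1%}")
```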
 
Due to the large cost of embedded memory, embedding FB memory only really works where you are able to constrain things like target resolution and bit depth, i.e. it's reasonably suited to a closed system like a console, but is problematic in the desktop space where these things can't be constrained...

John.
 
That's what I was thinking. On a console the author has total control over the size of render targets and the memory footprint of texture maps, geometry, etc., and can structure the application so that it doesn't spill active data/buffers into slow main RAM, where performance would rapidly plummet. He also knows the size of this embedded RAM will be constant for all machines the application runs on.

On the PC, where the user has much of this control (selectable resolutions) and where there's no constant size for the fast on-chip RAM, this would be almost impossible. I would guess that this is one of the main obstacles eDRAM faces in the PC market.
 
Chalnoth said:
But in the high-end, I think current video cards like the GeForce 6600 GT, which manages to have pretty good efficiency even with 8 pipelines and only a 128-bit bus, show that we can go quite a bit further in increasing the number of pipelines before memory bandwidth really begins to be a problem.

With high-end graphics cards costing over £400 and people talking about high-speed RAM chip shortages, I'd say it's already a problem.
 
nutball said:
_xxx_ said:
From what I know about semiconductors, the internal memory of that size would double the price of chips, let alone that the yields would go down the toilet due to increased failure rate.

Yields don't have to suffer do they, if the RAM is designed with redundancy (a la large on-die caches on eg. Itanium 2).
Adding eDRAM requires extra processing steps during manufacturing, which can increase the defect rate even in the logic portion of the chip.
 
Wunderchu's link section brought up the example I was trying to point out about the usage of embedded RAM. qballshalls2002 made a point which is pretty much the foundation of my question.

qballshalls2002 wrote:

Here's my best example: PlayStation 2.

The PS2 graphics chip's raw external bus bandwidth is 1.2 GB/s. It's also noted that it does a full 48 GB/s. Now where does that enormous amount of bandwidth come from? eDRAM. There is a small amount of eDRAM on the Sony Graphics Synthesizer built into the PS2. Now, with 4 MB at 48 GB/s, it has the speed to move textures in real time with no wait.

I'm not sure if the numbers he posts are the PS2's real numbers, but in regard to textures: if the bandwidth gain and the time to move (or store) textures hold true, wouldn't this help not only with texture swaps but with FSAA/AA as well? I could imagine the possibilities if true.
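
For what it's worth, the commonly quoted GS specs do line up: the eDRAM sits behind a 2560-bit internal bus (1024-bit read + 1024-bit write + 512-bit texture) at 150 MHz, while the path into the GS is 64 bits wide at the same clock. A quick check:

```python
# Quick sanity check of the commonly quoted PS2 Graphics Synthesizer figures.
CLOCK_HZ          = 150e6   # GS / eDRAM clock
INTERNAL_BUS_BITS = 2560    # 1024-bit read + 1024-bit write + 512-bit texture
EXTERNAL_BUS_BITS = 64      # bus from the Emotion Engine into the GS

internal_gbs = INTERNAL_BUS_BITS / 8 * CLOCK_HZ / 1e9   # -> 48.0 GB/s
external_gbs = EXTERNAL_BUS_BITS / 8 * CLOCK_HZ / 1e9   # -> 1.2 GB/s
print(internal_gbs, "GB/s on-chip vs", external_gbs, "GB/s external")
```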

What about the negative comments I've read comparing this to CPU L1 & L2 caches? Can L1/L2-style caches be implemented in such a way for textures in GPUs? Although from what I have read in the forums, not many people like the idea of L1 or L2 caches in GPUs. What are the negatives of using this alternative?

All in all, from the feedback I've seen, it seems cost is the biggest factor at this time. Do you all agree on that?

Thanks for all the feedback

***There is no luck... Only the will and desire to succeed ! ***
 
arjan de lumens said:
nutball said:
_xxx_ said:
From what I know about semiconductors, the internal memory of that size would double the price of chips, let alone that the yields would go down the toilet due to increased failure rate.

Yields don't have to suffer do they, if the RAM is designed with redundancy (a la large on-die caches on eg. Itanium 2).
Adding eDRAM requires extra processing steps during manufacturing, which can increase the defect rate even in the logic portion of the chip.

Depends on the eDRAM fab technology. NEC's eDRAM technology avoids the high-temperature step of the traditional eDRAM process, so it has no effect on the logic.

Another major NEC Electronics technology advantage is the ability to fabricate the capacitors at the heart of embedded DRAM cells at about half the temperature of commodity DRAM and well below the temperature used in a normal CMOS logic process. This low-temperature process is important because NEC Electronics fabricates CMOS logic before the embedded DRAM capacitors, so temperatures must be kept low to avoid degrading the performance of the CMOS logic.

The success of NEC Electronics's low-temperature capacitor process can be seen by comparing the CMOS transistor characteristics before and after embedded DRAM capacitor formation. These measurements show that the transistor performance is identical in both cases. Using the NEC Electronics process, your CMOS logic will run at the same high speed either with or without embedded DRAM.
 
JohnH said:
Due to the large cost of embedded memory, embedding FB memory only really works where you are able to constrain things like target resolution and bit depth, i.e. it's reasonably suited to a closed system like a console, but is problematic in the desktop space where these things can't be constrained...

John.

Well, I imagine that they'd be able to put 32 MB of eDRAM on chip relatively soon, which is more than enough for 1600x1200 or lower resolutions with AA. This would almost certainly be feasible and accepted when you're looking at something like the budget market, where consumers don't have such lofty expectations of the product. This sort of setup would allow board producers to put only 32-64 MB of onboard RAM for textures with a 64-bit bus and see very little (if any) performance drop.
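
For reference, a quick sizing sketch; I'm assuming an uncompressed 32-bit colour buffer plus 32-bit Z/stencil per sample, with the front buffer kept in external memory:

```python
# Framebuffer footprint at 1600x1200 (assumes 4-byte colour + 4-byte Z/stencil
# per sample, no compression, front buffer held off-chip).
WIDTH, HEIGHT = 1600, 1200
BYTES_PER_SAMPLE = 4 + 4

for aa_samples in (1, 2, 4):
    mib = WIDTH * HEIGHT * BYTES_PER_SAMPLE * aa_samples / 2**20
    print(f"{aa_samples}x AA: {mib:5.1f} MiB")   # ~14.6 / 29.3 / 58.6 MiB
```

So with that (assumed) layout, 2x multisampling would just about fit in 32 MB at 1600x1200, while 4x would need compression, tiling, or a lower resolution.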

It wasn't that long ago that we all owned video cards that were physically limited to resolutions below 1600x1200; even now there are not many people with monitors capable of that resolution (especially in the target market). So any limits placed on resolution or AA by the eDRAM are not likely to affect anyone buying those cards.

The only real con to eDRAM is its cost versus the money saved by using narrower buses, less/cheaper RAM, and the smaller PCB that follows from both. Once that price threshold is crossed, I believe eDRAM will be adopted in a heartbeat, since it will be able to hold performance constant (possibly even increase it) while prices drop.
 
MfA said:
At 90nm they'd have to dedicate something like 1/4-1/3 of the die to memory.

So, going by die size alone (i.e. ignoring process problems with eDRAM), how much does that add to the cost?

Which I suppose also begs the question of how much the RAM, the few extra square inches of PCB, the traces, etc. all cost as well.
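
As a very rough way to frame that question, here is a toy die-cost model; the wafer cost, die sizes and defect density are all made-up numbers purely for illustration:

```python
import math

# Toy die-cost model: cost per good die = wafer cost / (dies per wafer * yield),
# with a simple Poisson yield term and no edge-loss correction.
def die_cost(area_cm2, wafer_cost=4000.0, wafer_diameter_cm=30.0, d0=0.4):
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    dies_per_wafer = wafer_area / area_cm2
    yield_fraction = math.exp(-d0 * area_cm2)
    return wafer_cost / (dies_per_wafer * yield_fraction)

logic_only = 2.0                  # assumed logic-only die, cm^2
with_edram = logic_only * 4 / 3   # grow the die by 1/3 for the memory
print(f"logic only: ${die_cost(logic_only):.0f}, with eDRAM: ${die_cost(with_edram):.0f}")
```

With those made-up inputs the larger die comes out roughly 70-80% more expensive, since dies-per-wafer and yield both move against you at once; the real answer obviously depends on the actual process and defect density.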
 