AMD: R7xx Speculation

_xxx_ · Oct 21, 2007

Ailuros: what? I don't understand a word of what you're saying. What IS the width to the daughter die?

CarstenS · Oct 21, 2007

I reckon it'd be a 2048-wide route from Xenos to its edram daughter.

hoom · Oct 22, 2007

B3D Xenos article says 32GB/s from Xenos to daughter die, 256GB/s internal between ROPs & the eDRAM.
I'd presume the bus width Xenos to daughter to be only 128bit given that there is already a 128bit bus to RAM + whatever widths to the CPU & south bridge.

_xxx_ · Oct 22, 2007

Quasar said:
I reckon it'd be a 2048-wide route from Xenos to its edram daughter.

No way in hell is it more than 256, I'd rather say 128.

nAo · Oct 22, 2007

1999 - PS2/GS - 2560 bit bus, 1024 read (fb) + 1024 write (fb) + 512 read (texture)
The way edram is meant to be implemented(tm).

Ailuros · Oct 22, 2007

_xxx_ said:
No way in hell is it more than 256, I'd rather say 128.

...and how do you reach 256GB/s bandwidth from the ROPs to the eDRAM with a 500MHz frequency?

500MHz@128bit = 16GB
500MHz@256bit = 32GB
500MHz@512bit = 64GB
500MHz@1024bit = 128GB
500MHz@2048bit = 256GB/s

Now tell me what is wrong with the above math
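
As a quick check of the figures above, here is a minimal sketch of the usual peak-bandwidth formula (clock x bus width / 8). The function name and the `transfers_per_clock` parameter are illustrative assumptions, not something stated in the thread. With one transfer per clock, 2048 bits at 500MHz comes out at 128 GB/s; the listed numbers match only if two transfers per clock are assumed (or, equivalently, separate read and write paths of that width).

```python
def bandwidth_gb_per_s(clock_mhz, width_bits, transfers_per_clock=1):
    """Peak bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = width_bits / 8
    return clock_mhz * 1e6 * bytes_per_transfer * transfers_per_clock / 1e9


# Compare one transfer per clock against the assumed two per clock.
for width in (128, 256, 512, 1024, 2048):
    single = bandwidth_gb_per_s(500, width)
    double = bandwidth_gb_per_s(500, width, transfers_per_clock=2)
    print(f"500MHz@{width}bit: {single:g} GB/s (x1), {double:g} GB/s (x2)")
```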

Apocros · Oct 22, 2007

Ailuros said:
...and how do you reach 256GB/s bandwidth from the ROPs to the eDRAM with a 500MHz frequency?

500MHz@128bit = 16GB
500MHz@256bit = 32GB
500MHz@512bit = 64GB
500MHz@1024bit = 128GB
500MHz@2048bit = 256GB/s

Now tell me what is wrong with the above math

The arithmetic is sound, but why is the inter-die interface assumed to be running at 500MHz?

3dcgi · Oct 22, 2007

Ailuros said:
...and how do you reach 256GB/s bandwidth from the ROPs to the eDRAM with a 500MHz frequency?

500MHz@128bit = 16GB
500MHz@256bit = 32GB
500MHz@512bit = 64GB
500MHz@1024bit = 128GB
500MHz@2048bit = 256GB/s

Now tell me what is wrong with the above math

2048 bit is internal to the daughter die. Others are discussing the interface width between the dies.

CarstenS · Oct 22, 2007

3dcgi said:
2048 bit is internal to the daughter die. Others are discussing the interface width between the dies.

*Urgks*
You're right - sorry. I also reckoned we were talking about edram-2-rop bandwidth.

trinibwoy · Oct 22, 2007

Quasar said:
*Urgks*
You're right - sorry. I also reckoned we were talking about edram-2-rop bandwidth.

I thought the rops were part of the daughter-die.....wasn't that the whole point of the edram setup......to take care of the high-bandwidth rop tasks?

Demirug · Oct 22, 2007

trinibwoy said:
I thought the rops were part of the daughter-die.....wasn't that the whole point of the edram setup......to take care of the high-bandwidth rop tasks?

Yes, the ROPs are part of the memory interface logic of the daughter die.

Mintmaster · Oct 23, 2007

nAo said:
1999 - PS2/GS - 2560 bit bus, 1024 read (fb) + 1024 write (fb) + 512 read (texture)
The way edram is meant to be implemented(tm).

Don't worry, we'll see that again now that Sony isn't the only one doing EDRAM with substantial logic.

Ailuros · Oct 23, 2007

3dcgi said:
2048 bit is internal to the daughter die. Others are discussing the interface width between the dies.

I realize that; the issue is that the joke of a speclist on the previous page sounds like the product of someone's funky imagination. In the unlikely case that someone builds something like that, it is more likely that some marketing monkey is quoting the ROP-to-eDRAM bandwidth than that the bandwidth jumps by a factor of more than 5 in just one generation.

I don't even dare to ask how something with 64 ROPs would fit into a meager 20MB of embedded RAM if we're talking high resolutions, unless of course those are R600 ROPs *runs for his life*

nAo · Oct 23, 2007

I'm not worried, but I also don't care about edram as long as we keep this way of rendering stuff. Give me something like REYES and then we can put edram to some nice use ALL the time

Ailuros · Oct 23, 2007

...and would the transistor budget needed for a healthy portion of eDRAM be worth it, considering the benefits? (honest question)

hoom · Oct 23, 2007

I'd have thought transistor budget would be a lot more open to eDRAM if we're talking multi-chip on 45nm?

Ailuros · Oct 24, 2007

hoom said:
I'd have thought transistor budget would be a lot more open to eDRAM if we're talking multi-chip on 45nm

As nAo said, what would you need eDRAM for on a typical rasterizer? One idea would be to use it as something like an MSAA partial framebuffer as on Xenos, but the question remains whether 20MB is anywhere near enough for 64 ROPs (even more so if those are capable of single-cycle 4xMSAA).

Arun · Oct 24, 2007

Ailuros said:
As nAo said, what would you need eDRAM for on a typical rasterizer?

If you've got an IMR (heck, even a TBDR as I argued in the past, but probably to a lesser extent) then using eDRAM means you won't have to go off-chip as much, saving power. This makes it a viable architecture for IMR handhelds... And it's exactly what NV is doing there next round (and for, let us say, something else), but heh, I'm digressing.

As for a desktop part where the power requirements from going off-chip don't matter as much - well, one advantage is that it might allow you to have cheaper RAM (or a less wide memory bus) for a given amount of performance. So it might give better performance/dollar for the consumer, and certainly would increase the ASPs of the IHV: instead of selling a $65 GPU with $35 of VRAM, they could sell a $90 GPU with $10 of VRAM. Clearly, that makes them more money.

As for the technical advantages - well, indeed, not that much. However, some eDRAM could also be used as on-chip buffer for the geometry shader (rather than just the framebuffer), potentially accelerating that nicely. And it might come in handy for GPGPU too as I argued in the past (for Z-RAM, but it's the same thing really; even more so with Hynix replacing DRAM with Z-RAM completely for commodity chips for example!)

Also, it is important to realize you don't need a truckload of eDRAM or to use tiling. If you're smart, you can use compression for what you're writing to eDRAM too. So, say that your framebuffer is 40MiB and you have ~4:1 color compression with MSAA. You could just use 10MiB of eDRAM, and in the worst case you're caching 1/4th of the full color information, saving 25% bandwidth. In the best case, 4:1 compression is achieved and everything fits into eDRAM, saving 100% framebuffer bandwidth.
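
The worst-case and best-case figures above can be reproduced with a small sketch. It assumes framebuffer traffic is spread evenly over the buffer and that the compressed footprint is simply the framebuffer size divided by the achieved ratio; the function name `edram_hit_fraction` is made up for illustration.

```python
def edram_hit_fraction(framebuffer_mib, edram_mib, compression_ratio):
    """Fraction of framebuffer traffic that stays in eDRAM.

    Assumes uniform access across the framebuffer and a compressed
    footprint of framebuffer_mib / compression_ratio.
    """
    compressed_mib = framebuffer_mib / compression_ratio
    return min(1.0, edram_mib / compressed_mib)


worst = edram_hit_fraction(40, 10, 1.0)  # no compression achieved
mid = edram_hit_fraction(40, 10, 2.0)    # only 2:1 achieved
best = edram_hit_fraction(40, 10, 4.0)   # full 4:1 achieved
print(worst, mid, best)
```

Under these assumptions the saving scales linearly with the compression ratio actually achieved, until the whole compressed framebuffer fits on-chip.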

eDRAM is a lot more attractive once you realize all past and current designs are awfully naive and you could be way smarter about it.

nicolasb · Oct 24, 2007

eDRAM might be useful for IGP chips, or for a Fusion/Torrenza type thing.

Mintmaster · Oct 24, 2007

Ailuros said:
As nAo said, what would you need eDRAM for on a typical rasterizer?

For a "typical rasterizer" you don't need it. It's only useful when you have additional design constraints, such as with handhelds or consoles where tiny die sizes limit the bus width.

Memory speed increases a lot slower than GPU speed for a given die size. Similarly, memory speed cannot compensate for a smaller bus as a given GPU scales down with time.

One idea would be to use it as something like an MSAA partial framebuffer as on Xenos, but the question remains whether 20MB is anywhere near enough for 64 ROPs (even more so if those are capable of single-cycle 4xMSAA).

Why is the EDRAM size correlated to the ROP throughput? That decision should be based on resolution and how finely the scene can be tiled (if at all).
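
To put rough numbers on the resolution and tiling point, here is a hedged sketch of the framebuffer footprint and the resulting tile count. It assumes uncompressed 32-bit color plus 32-bit depth/stencil per sample, perfectly packed tiles, and treats the thread's "20MB" as 20 MiB; the function names are illustrative only.

```python
import math


def framebuffer_mib(width, height, samples, bytes_color=4, bytes_depth=4):
    """Uncompressed color + depth footprint in MiB (assumed formats)."""
    return width * height * samples * (bytes_color + bytes_depth) / 2**20


def tiles_needed(width, height, samples, edram_mib):
    """Number of tiles if each tile must fit entirely in eDRAM."""
    return math.ceil(framebuffer_mib(width, height, samples) / edram_mib)


for w, h in ((1280, 720), (1920, 1200), (2560, 1600)):
    size = framebuffer_mib(w, h, 4)
    print(f"{w}x{h} 4xMSAA: {size:.1f} MiB, {tiles_needed(w, h, 4, 20)} tile(s) in 20 MiB")
```

Note that ROP count does not appear anywhere in this estimate; only resolution, sample count and per-sample format do.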
