Why a daughter die?

blakjedi

Veteran
I know ATI has a daughter die capable of doing all the maths necessary for Z passes, stencil, AA etc... but why? Why didn't they just build the logic into the parent die? Would a parent die with on-chip eDRAM (logic-less) behave any differently?

I'm sure that as the X360 goes through its lifecycle they will consolidate the chipset for cost savings... what will they do with the eDRAM then? I'm not sure if anyone has the answer; I just wanna hear what people think.
 
They simply took certain bandwidth-intensive operations, such as the ones you mentioned, and gave them their own bandwidth on the daughter die so that they don't bottleneck the rest of the system.

That way, those operations can be done on the daughter die and have essentially no impact on the rest of the system. It all seems very logical to me.
 
BenQ, I do get that... I guess what I'm really asking is: if they consolidate the chips in the future, the bandwidth becomes internal bandwidth anyway... so do you need a separate chip to get the bandwidth benefit?
 
It's easier to fab a 220M-transistor chip plus a 110M-transistor chip than a single 330M-transistor chip.

Also, the eDRAM technology used should, at this point in time, be produced with better yields in NEC's fabs.

In the near future, you can expect that both ICs will be produced on a smaller process and potentially on a single die.
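
For what it's worth, here's a back-of-the-envelope sketch of the yield argument: with a simple Poisson defect model, a defect on a monolithic die scraps the whole (large) die, while with separate dies you only throw away the die the defect actually landed on. The defect density and die areas below are made-up round numbers purely for illustration, not published Xenos figures.

import math

# Simple Poisson yield model: yield = exp(-D0 * area).
# D0 and the die areas are illustrative assumptions, not real data.
D0 = 0.5                  # defects per cm^2 (assumed)
PARENT_AREA = 1.8         # cm^2, the big logic die (assumed)
DAUGHTER_AREA = 0.8       # cm^2, the smaller eDRAM die (assumed)

def die_yield(area_cm2, d0=D0):
    """Fraction of dies with zero defects under a Poisson model."""
    return math.exp(-d0 * area_cm2)

def silicon_per_good_part(*areas):
    """Average silicon area fabbed per good copy of each die."""
    return sum(a / die_yield(a) for a in areas)

print(f"monolithic yield: {die_yield(PARENT_AREA + DAUGHTER_AREA):.0%}")
print(f"parent yield: {die_yield(PARENT_AREA):.0%}, daughter yield: {die_yield(DAUGHTER_AREA):.0%}")
print(f"silicon per good part, monolithic: {silicon_per_good_part(PARENT_AREA + DAUGHTER_AREA):.2f} cm^2")
print(f"silicon per good part, split: {silicon_per_good_part(PARENT_AREA, DAUGHTER_AREA):.2f} cm^2")

With those assumed numbers the split approach burns roughly 40% less silicon per good part, and that's before accounting for the eDRAM process living in a different (NEC) fab with its own defect density.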
 
The eDRAM would work just as well on the GPU, in fact even more efficiently, as there could be a faster connection between the GPU shaders and the eDRAM. However, eDRAM manufacturing isn't as easy as conventional logic production, it seems, so by separating the two they can be sure of getting better yields = lower cost.

In future a single consolidated chip is quite possible, though presumably with no improvement to access BW (the 32 write, 16 read to the daughter die).
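
As a rough sanity check on what that link supports: assuming those figures are GB/s, and assuming each pixel sends 32-bit colour plus 32-bit Z across the link (with any MSAA expansion happening on the eDRAM side), the write path works out to about 8 pixels per GPU clock. Purely illustrative arithmetic; the per-pixel size is my assumption.

# Rough sketch of what the quoted parent -> daughter write figure would support.
# Both the GB/s interpretation and the bytes-per-pixel are assumptions.
LINK_WRITE_GBPS = 32          # write bandwidth quoted above, assumed GB/s
BYTES_PER_PIXEL = 4 + 4       # 32-bit colour + 32-bit Z (assumed)
GPU_CLOCK_HZ = 500e6          # parent (GPU) clock

pixels_per_second = LINK_WRITE_GBPS * 1e9 / BYTES_PER_PIXEL
pixels_per_clock = pixels_per_second / GPU_CLOCK_HZ

print(f"{pixels_per_second / 1e9:.1f} Gpixels/s -> {pixels_per_clock:.0f} pixels per GPU clock")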
 
blakjedi said:
BenQ, I do get that... I guess what I'm really asking is: if they consolidate the chips in the future, the bandwidth becomes internal bandwidth anyway... so do you need a separate chip to get the bandwidth benefit?
Actually, if the ICs were on the same die, they could have a larger bandwidth.

And the fact that future chips may be on a single die will not affect the bandwidth at all, for obvious compatibility reasons.
 
It makes things easier for the layout designers.
I imagine that the bulk of the main chip design was probably ported from ATI's PC graphics line. It's not easy to design a 500 MHz chip with 10 MB of eDRAM sitting there consuming lots of power; they'd basically have to redo everything from scratch.
A daughter die forces a partition between the two designs, which makes things more manageable even though they sacrifice some bandwidth.
 
Jawed said:
Clocks, too?

Do we know the clock rate on the EDRAM daughter die?

Jawed

I believe 2 GHz was tossed around, but that was in the early days of the disclosure of the 256 GB/s of bandwidth and the assumption that it was between the daughter die and the parent die (when in fact it is between the daughter-die logic and the eDRAM).

I do not remember if the memory interface width or frequency was ever officially announced. I have wondered about this as well... having the daughter-die logic, especially Z, stencil, alpha blending, and other features, running at a high frequency could mean some good things for shadowing and such.

I would be interested in more information on this as well.
 
In the case of RSX we have the "Normalizer" ALU in the pixel shaders, but it seems that is not the case for Xenos.

Is the Xenos daughter core the equivalent of the "Normalizer" ALU in the pixel shaders of RSX?

Thanks.
 
Acert93 said:
Jawed said:
Clocks, too?
Do we know the clock rate on the EDRAM daughter die?
I believe 2 GHz was tossed around, but that was in the early days of the disclosure of the 256 GB/s of bandwidth and the assumption that it was between the daughter die and the parent die (when in fact it is between the daughter-die logic and the eDRAM).
Surely we can make an educated guess.

2 Tb/s on a 2048 bit bus = 1 GHz (double GPU clock - makes sense)

Or 500 MHz on 4096 bit bus (BIG bus!)
Or 2 GHz on 1024 bit bus.

I'd guess 2048 bit bus @ 1GHz.
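
Making the arithmetic explicit: 256 GB/s is 256 × 8 ≈ 2 Tb/s, so any width/clock pair whose product hits that figure works. A quick check of the candidates above:

# 256 GB/s between the daughter-die logic and the eDRAM is 256 * 8 = 2048 Gb/s,
# so bus_width_bits * clock_Hz must come to roughly 2 Tb/s.
TARGET_BITS_PER_S = 256e9 * 8     # ~2.05 Tb/s

for width_bits in (1024, 2048, 4096):
    clock_hz = TARGET_BITS_PER_S / width_bits
    print(f"{width_bits}-bit bus -> {clock_hz / 1e6:.0f} MHz")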
 
Urian said:
In the case of RSX we have the "Normalizer" ALU in the pixel shaders, but it seems that is not the case for Xenos.

Is the Xenos daughter core the equivalent of the "Normalizer" ALU in the pixel shaders of RSX?

Thanks.

What does the normalizer do?
 
Shifty Geezer said:
2 Tb/s on a 2048 bit bus = 1 GHz (double GPU clock - makes sense)

Or 500 MHz on 4096 bit bus (BIG bus!)
Or 2 GHz on 1024 bit bus.

I'd guess 2048 bit bus @ 1GHz.

Bolded my choice. Think I've seen it mentioned somewhere.

Though in truth you can have different clocks operating on a single die, so it's perhaps not necessary to make a big deal out of different clocks being a good reason to have (or keep, going into the future of 65nm or 45nm devices) a separate daughter die.

Jawed
 
Shifty Geezer said:
Surely we can make an educated guess.

2 Tb/s on a 2048 bit bus = 1 GHz (double GPU clock - makes sense)

Or 500 MHz on 4096 bit bus (BIG bus!)
Or 2 GHz on 1024 bit bus.

I'd guess 2048 bit bus @ 1GHz.

Thanks Shifty :D

I kind of know how to do the math for such, but I think I get my answers wrong more often than not :oops: It is always nice to have someone in the industry give a real answer rather than me goofing one up! :LOL:

Thanks again!

Btw, wasn't the PS2 eDRAM 4096 bits?

Like Jawed, I believe I had heard 2 GHz. But that could be collective amnesia ;)
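
On the PS2 question: if memory serves, the GS split its eDRAM interface into 1024-bit write, 1024-bit read and 512-bit texture-read ports at 150 MHz (so 2560 bits in total rather than 4096), which works out to the usual 48 GB/s figure. Quick check of that arithmetic, with the port widths being my recollection rather than anything official:

# Sanity check on the PS2 Graphics Synthesizer comparison (figures from memory).
GS_PORT_WIDTHS_BITS = (1024, 1024, 512)   # write, read, texture-read ports (recalled)
GS_CLOCK_HZ = 150e6

total_bits = sum(GS_PORT_WIDTHS_BITS)
bandwidth_gbs = total_bits * GS_CLOCK_HZ / 8 / 1e9
print(f"{total_bits}-bit aggregate bus @ 150 MHz -> {bandwidth_gbs:.0f} GB/s")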
 