NEC Electronics on Xenos (downspec in die interconnect bandwidth?)

Developers already have full-scale development kits. Hell, the beta kits had a 400 MHz GPU. Some things simply need common sense.
 
Panajev2001a said:
The likely thing is that the diagram of the link between Xenos and the daughter die is wrong, counting only one-way bandwidth when there should be separate read and write buses.
What's the point of having separate read/write buses if the RAM you're attaching to is single-ported? There'd be none, as you couldn't read/write simultaneously anyway.

Besides, a 2×512-bit bus between parent and daughter dies? I don't think so. :oops:

Besides again... 2×22.4 GB/s is much faster than the stated original (and likely still true) 32 GB/s interconnect speed. So again, I would say this is just some graphic artist or suchlike at NEC confusing one number (RAM bandwidth) with another (interconnect bandwidth) and making an erroneous slide, thus setting teh intarweb aflame with rampant speculation. ;)
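Just to put some arithmetic behind that, here's a quick sketch of what bus width each figure would imply. The 500 MHz link clock is my assumption, nothing official:

```python
# What bus width would each bandwidth figure imply?
# Assumption (mine, not a confirmed spec): the inter-die link
# runs at the GPU's 500 MHz clock, one transfer per clock.
LINK_CLOCK_HZ = 500e6

figures_gb_s = {
    "original interconnect figure": 32.0,
    "NEC slide, one direction": 22.4,
    "NEC slide, read + write": 44.8,
}

for label, gb_s in figures_gb_s.items():
    width_bits = gb_s * 1e9 / LINK_CLOCK_HZ * 8
    print(f"{label}: {gb_s} GB/s -> {width_bits:.0f}-bit bus")

# original interconnect figure: 32.0 GB/s -> 512-bit bus
# NEC slide, one direction: 22.4 GB/s -> 358-bit bus (an odd width)
# NEC slide, read + write: 44.8 GB/s -> 717-bit bus (odder still)
```

Only the 32 GB/s figure falls out as a sane power-of-two width at that clock, which is another reason I lean towards the slide simply being wrong.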
 
OT because this is just a misprint, but when were the last spec changes for Xbox, GC and PS2 before they launched? I recall the numbers but not the dates.
 
Huh, this could perhaps explain some of the troubles devs have been having getting AA/DOF/MoBlur and so forth working at the same time on the Xbox 360.

ATi was indeed happy with Xenos production, but does Xenos production == (GPU + eDRAM), or just the GPU?

Also, perhaps this has to do with the way MS and NEC define data transfer? In other words, maybe NEC is saying the max read bandwidth is 22.4 GB/s and the max write bandwidth is 22.4 GB/s, but not at the same time, while MS is saying the max possible is 22.4 GB/s read/write plus 11.2 GB/s write/read, so simultaneous read/write equals 32 GB/s? ... except 11.2 + 22.4 doesn't equal 32... yeah, I'm smart. NEXT THEORY!
 
Mefisutoferesu said:
Huh, this could perhaps explain some of the troubles devs have been having getting AA/DOF/MoBlur and so forth working at the same time on the Xbox 360.
Samples are calculated inside the eDRAM chip, so this is unlikely to be an issue.

I've not heard anything different from ATI either way, but given the figure it seems to me to be a misprint/misunderstanding.
 
Well, that's true, Dave, but I was thinking about tiling specifically. Since 2xAA forces two tiles, perhaps the slower bandwidth causes problems there, or is that still kinda dumb? Sorry if it's logically totally off.
 
The tile copying process is a trivial drain on bandwidth; it shouldn't make more than a very marginal difference. I still say this is a typo though, because it would be a drastic performance hit for Xenos overall, and not something MS would be able to keep under wraps like this without it leaking out one way or another. It would be pretty much the same as Sony cutting GDDR3 bandwidth in PS3 down to 70% of original: a massive slash that would cut a huge chunk of fillrate out of Xenos if true. Forget free AA, just for starters; AA would have a big performance penalty with a slower die interconnect.
 
Why would the AA take a hit? That's all on the internal BW of the eDRAM, which is still 256 GB/s. The GPU<>eDRAM BW is for the transfer of buffer data for processing: the initial Z pass and then the pixel data. As I understand it, that's quite low consumption. The high BW is needed for render-to-texture type operations, where the data for different textures needs to be passed to the eDRAM, processed, and returned to a texture in memory.

One of the numbers fiends can probably provide details on the maximum requirement of a 720p, ~1 million pixel, HDR render, but by my reckoning it'll be a few gigabytes a second at most for FP10. Even if this is a spec change, which I doubt, I can't see it having much of an impact, unless I totally fail to understand Xenos' render process :oops:
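Actually, let me have a stab at it myself. A back-of-envelope sketch, with my own assumptions (one FP10 colour resolve per frame at 60 fps, AA samples never leaving the daughter die) rather than any official figures:

```python
# Back-of-envelope traffic over the GPU<>eDRAM link for a 720p render.
# Assumptions (mine): FP10 = 4 bytes/pixel (10:10:10:2 packed in 32 bits),
# 60 fps, one colour resolve per frame; AA sample traffic stays inside
# the eDRAM die and never crosses the link.
pixels = 1280 * 720          # ~0.92 million pixels
bytes_per_pixel = 4          # FP10 colour
fps = 60

resolve = pixels * bytes_per_pixel * fps / 1e9
print(f"colour resolve: {resolve:.2f} GB/s")       # ~0.22 GB/s

# Even allowing ~8x that for a Z copy, tiling overlap and a few
# render-to-texture passes per frame:
print(f"generous total: {8 * resolve:.2f} GB/s")   # ~1.77 GB/s
```

So call it a couple of GB/s in practice, which is a small slice of either 32 GB/s or 22.4 GB/s.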
 
Shifty Geezer said:
Why would the AA take a hit? That's all on the internal BW of the eDRAM, which is still 256 GB/s. The GPU<>eDRAM BW is for the transfer of buffer data for processing: the initial Z pass and then the pixel data. As I understand it, that's quite low consumption. The high BW is needed for render-to-texture type operations, where the data for different textures needs to be passed to the eDRAM, processed, and returned to a texture in memory.

One of the numbers fiends can probably provide details on the maximum requirement of a 720p, ~1 million pixel, HDR render, but by my reckoning it'll be a few gigabytes a second at most for FP10. Even if this is a spec change, which I doubt, I can't see it having much of an impact, unless I totally fail to understand Xenos' render process :oops:

you're on the right path, IMO. even if we assume a cut in the GPU-edram bw down to 70%, that still would not hurt xenos that much, given that the major bw eaters are kept off that link.
 
darkblu said:
you're on the right path, IMO. even if we assume a cut in the GPU-edram bw down to 70%, that still would not hurt xenos that much, given that the major bw eaters are kept off that link.

I think the concern is that the reduction in bandwidth is a symptom of the entire GPU being clocked down to 70% of original. So it's not just eDRAM functions taking a hit, it's EVERYTHING being impacted significantly. I can't imagine they would ship it with a 350 MHz GPU instead of working out the kinks.

The comment about the beta kits having a higher clock (400 MHz) really points to this article being a misunderstanding or typo. There are so many BW numbers in the 360's system diagram that I would think it's relatively easy to make such a mistake.

J
 
expletive said:
I think the concern is that the reduction in bandwidth is a symptom of the entire GPU being clocked down to 70% of original. So it's not just eDRAM functions taking a hit, it's EVERYTHING being impacted significantly. I can't imagine they would ship it with a 350 MHz GPU instead of working out the kinks.

The comment about the beta kits having a higher clock (400 MHz) really points to this article being a misunderstanding or typo. There are so many BW numbers in the 360's system diagram that I would think it's relatively easy to make such a mistake.

J

well, i thought there was this possibility that Xenos ran @ 700 and the edram interface at half that, no? if that's the case then the only reduction would be in the edram bw. just assume the Xenos SiP clock was held back by the edram clock - you cut down 30% of the latter but that allows you to boost the former by quite some - now that'd be one 'loss' i personally would not mind ; )
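fwiw, the numbers line up suspiciously well with that theory. a quick sketch, assuming (purely my guess, not a known spec) the link moves 64 bytes per clock:

```python
# if the same link simply runs at a lower clock, NEC's figure drops out
# exactly. the bytes-per-clock value is my assumption, not a confirmed spec.
BYTES_PER_CLOCK = 64  # e.g. one 512-bit transfer per clock

for clock_mhz in (500, 350):
    gb_s = BYTES_PER_CLOCK * clock_mhz * 1e6 / 1e9
    print(f"{clock_mhz} MHz -> {gb_s:.1f} GB/s")

# 500 MHz -> 32.0 GB/s   (the original figure)
# 350 MHz -> 22.4 GB/s   (NEC's figure - exactly half of a 700 MHz GPU clock)
```

i.e. 22.4 GB/s is exactly what you'd get from the same bus at half of 700 MHz, which is why the figure by itself doesn't alarm me.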
 
Dave Baumann said:
Samples are calculated inside the eDRAM chip, so this is unlikely to be an issue.

I've not heard anything different from ATI either way, but given the figure it seems to me to be a misprint/misunderstanding.
Just wondering, is this your 'official statement' on this, or are you waiting for additional clarification?

J
 
darkblu said:
well, i thought there was this possibility that Xenos ran @ 700 and the edram interface at half that, no? if that's the case then the only reduction would be in the edram bw. just assume the Xenos SiP clock was held back by the edram clock - you cut down 30% of the latter but that allows you to boost the former by quite some - now that'd be one 'loss' i personally would not mind ; )

What about Xenos running at 350 MHz with a synchronous clock to the memory? That would be a downgrade across the board, wouldn't it?

J
 
expletive said:
What about Xenos running at 350 MHz with a synchronous clock to the memory? That would be a downgrade across the board, wouldn't it?

J

undoubtedly. i was just considering a less grim alternative to the above : )

btw, i'm not sure you'd call a halved edram interface 'asynchronous' - it'd still be synchronous, just factored down. the essential part here being that each and every mem access by the GPU would still take a constant time to be carried out; the only thing that changes from the POV of the GPU is the effective BW per clock.
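to put that in numbers, here's a sketch of what the GPU would actually see per clock in each scheme (illustrative figures only, nothing confirmed):

```python
# from the GPU's POV the only thing that changes is bytes per GPU clock.
# all figures here are illustrative, not confirmed specs.
def bytes_per_gpu_clock(link_gb_s, gpu_mhz):
    return link_gb_s * 1e9 / (gpu_mhz * 1e6)

print(bytes_per_gpu_clock(32.0, 500))  # 64.0 -> 1:1 clocks, 64 B each clock
print(bytes_per_gpu_clock(22.4, 700))  # 32.0 -> 2:1 clocks, 32 B per GPU clock
```

accesses still complete in a fixed number of GPU clocks; you just get half the bytes per clock.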
 
Shifty Geezer said:
700 MHz sounds very optimistic!

yes, i agree. that's why i hypothesised that if we assume it was the edram daughter part which held the clock bottleneck, then devising a 2:1 GPU-to-edram clock scheme would actually play out rather well.
 
darkblu said:
btw, i'm not sure you'd call a halved edram interface 'asynchronous' - it'd still be synchronous, just factored down. the essential part here being that each and every mem access by the GPU would still take a constant time to be carried out; the only thing that changes from the POV of the GPU is the effective BW per clock.

Gotcha. I wonder what the actual tradeoff would be on the eDRAM side to get a 40% boost (200 MHz more than the claimed 500 MHz) on everything else...
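Just to frame it, here's the rough shape of that tradeoff under the 2:1 theory (all of these numbers hypothetical, of course):

```python
# Hypothetical tradeoff under the 2:1 clock theory - none of these
# figures are confirmed specs.
gpu_old_mhz, gpu_new_mhz = 500, 700
link_old_gbs, link_new_gbs = 32.0, 22.4

print(f"GPU throughput: {gpu_new_mhz / gpu_old_mhz - 1:+.0%}")    # +40%
print(f"eDRAM link BW:  {link_new_gbs / link_old_gbs - 1:+.0%}")  # -30%
```

If Shifty's reckoning above is right and the link only carries a couple of GB/s in practice, that looks like a trade worth making.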


IIRC, all of the developers were very pleased with the final hardware once it was received. Considering they were coming from a 400 MHz GPU and were pleased with the performance gains, I can't see how we're back down to 350 MHz...

J
 