RSX = Stream Processor!?!

mrdarko said:
now i am confused....

mintmaster, could you try to explain what barbarian is hinting at in his last post?
I'm not sure, but my guess is that it's a misinterpretation somewhere along the communication chain. I have no doubt that DXT1 will be heavily encouraged by NVidia given RSX's low bandwidth. However, a 4096x4096 DXT1 texture is 8MB. If you had that much cache, you'd be much better off using it like Xenos does, for framebuffer data. I assume they were talking about how a DXT1 texture makes good use of the cache: it can sit in there compressed and be decompressed on the fly as needed.

By the way, why do we still have people thinking RSX will have 32 shader pipes? 22.4GB/s / 32 pipes / 2 DP3/MADs per pipe / 550MHz = 0.64 bytes per operation. WTF are you performing FP operations on? That would also equal only 1.27 bytes per texture fetch. Furthermore, I'm not even counting z or framebuffer bandwidth, the latter of which is by itself 35GB/s at 16 pixels per clock (in other words, impossible even without blending or FP16).

I know some calculations don't need a lot of data, but surely you guys get my point. The last thing RSX needs is more shader pipelines.
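For anyone who wants to check the arithmetic in the post above, here is a minimal sketch. The bandwidth, clock, and pipe figures are the ones quoted in the post; the 4 bytes per pixel colour write and all function names are assumptions made for illustration only.

```python
# Figures quoted in the post above.
BANDWIDTH = 22.4e9   # bytes/s of local video memory bandwidth
CLOCK = 550e6        # Hz
PIPES = 32           # the rumoured pipe count being argued against
OPS_PER_PIPE = 2     # DP3/MAD operations per pipe per clock


def bytes_per_op(bandwidth, pipes, ops_per_pipe, clock):
    """Memory bytes available per shader operation."""
    return bandwidth / (pipes * ops_per_pipe * clock)


def bytes_per_fetch(bandwidth, pipes, clock):
    """Memory bytes available per texture fetch (one fetch per pipe per clock)."""
    return bandwidth / (pipes * clock)


def framebuffer_bandwidth(pixels_per_clock, bytes_per_pixel, clock):
    """Bytes/s needed just to write colour, with no blending and no z traffic."""
    return pixels_per_clock * bytes_per_pixel * clock


if __name__ == "__main__":
    print(f"bytes per op:    {bytes_per_op(BANDWIDTH, PIPES, OPS_PER_PIPE, CLOCK):.3f}")
    print(f"bytes per fetch: {bytes_per_fetch(BANDWIDTH, PIPES, CLOCK):.3f}")
    # Assumption: 32-bit colour, i.e. 4 bytes written per pixel.
    print(f"framebuffer:     {framebuffer_bandwidth(16, 4, CLOCK) / 1e9:.1f} GB/s")
```

Under those assumptions the framebuffer writes alone come out above the 22.4GB/s that is available, which is the point being made.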

EDIT: aaaaa00, good catch. :D
 
All GPUs have caches; in general IHVs don't like to give numbers or information on them. I'm not going to comment on RSX, just to say current PC chips have caches measured in KB.

ATI presented some information on their texture cache at GDCE. One thing ATI does which may be different from NV is that the texture cache holds uncompressed texels, i.e. DXT1 is 4 bits per texel, but ATI's cache stores it post-decompression at 4 bytes per texel. IIRC Gamecube and PSP have similar behaviour.
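To put a number on the difference between caching compressed and decompressed texels, here is a rough sketch. The 8KB cache size is purely a hypothetical figure picked for illustration (no real cache size is given in this thread), and the function name is made up.

```python
DXT1_BITS_PER_TEXEL = 4     # 8-byte block covering 4x4 texels
RGBA8_BITS_PER_TEXEL = 32   # 4 bytes per texel after decompression


def texels_in_cache(cache_bytes, bits_per_texel):
    """How many texels fit in a cache of the given size."""
    return cache_bytes * 8 // bits_per_texel


if __name__ == "__main__":
    cache_bytes = 8 * 1024  # hypothetical 8KB texture cache (assumption)
    compressed = texels_in_cache(cache_bytes, DXT1_BITS_PER_TEXEL)
    decompressed = texels_in_cache(cache_bytes, RGBA8_BITS_PER_TEXEL)
    print(compressed, decompressed, compressed // decompressed)
```

Whatever the real cache size is, the same cache covers 8x as many texels if DXT1 data stays compressed in it, at the cost of decompressing on every read.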
 
DeanoC said:
IIRC Gamecube and PSP have similar behaviour.
GC stores everything compressed actually - it makes sense both for bandwidth reasons (GCN embedded RAM can't serve 32-bit texels in a single cycle) and because the cache is configurable as a scratchpad - in which case you WANT to keep stuff in it as small as possible.

I am obviously not gonna comment exactly for PSP - let's just say it's kinda halfway between the two things.
 
Mintmaster said:
By the way, why do we still have people thinking RSX will have 32 shader pipes? 22.4GB/s / 32 pipes / 2 DP3/MADs per pipe / 550MHz = 0.64 bytes per operation. WTF are you performing FP operations on? That would also equal only 1.27 bytes per texture fetch. Furthermore, I'm not even counting z or framebuffer bandwidth, the latter of which is by itself 35GB/s at 16 pixels per clock (in other words, impossible even without blending or FP16).

I don't subscribe to the 32-pipe theory, but..

Would it perhaps not be 22.4GB/s + 25.6GB/s - CPU consumption?

Don't know how much that'd help :p Even with 24 pipes, though, your calculation would yield less than a byte per operation.
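A quick sketch of those two variations, using the same formula as the quoted post. The 25.6GB/s figure is the second bandwidth term from the question above; treating all of it as available to the GPU (zero CPU consumption) is an assumption that gives a best case only. The function name is made up.

```python
def bytes_per_shader_op(bandwidth_gb_s, pipes, ops_per_pipe=2, clock_mhz=550):
    """Memory bytes available per DP3/MAD operation."""
    return bandwidth_gb_s * 1e9 / (pipes * ops_per_pipe * clock_mhz * 1e6)


if __name__ == "__main__":
    # 24 pipes on the 22.4GB/s of local memory alone.
    print(f"24 pipes, 22.4GB/s: {bytes_per_shader_op(22.4, 24):.2f} bytes/op")
    # 32 pipes with both pools added together, assuming the CPU uses none of it.
    print(f"32 pipes, 48.0GB/s: {bytes_per_shader_op(22.4 + 25.6, 32):.2f} bytes/op")
```

So 24 pipes on local memory alone stays under a byte per operation, and even the best case with both pools only gets 32 pipes a bit over one byte.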
 
I'm pretty sure Barbarian is referring to DXT1, perhaps he can clarify?

SRAM normally takes 6 transistors per bit and can go as low as 4 transistors per bit. Assuming 4T/bit, or 32T/byte:

512KB cache -> ~ 16 Million transistors
1MB cache -> ~ 32 Million transistors
2MB cache -> ~ 64 Million transistors
4MB cache -> ~ 128 Million transistors
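The table above can be reproduced with a few lines, counting only the storage cells. Tags, decoders, and sense amps are ignored, so real arrays would be larger; the function name is made up for illustration.

```python
def sram_transistors(cache_bytes, transistors_per_bit=4):
    """Transistors in the storage cells alone (no tags, decoders, or sense amps)."""
    return cache_bytes * 8 * transistors_per_bit


if __name__ == "__main__":
    for kb in (512, 1024, 2048, 4096):
        four_t = sram_transistors(kb * 1024)
        six_t = sram_transistors(kb * 1024, transistors_per_bit=6)
        print(f"{kb:5d}KB: {four_t / 1e6:6.1f}M at 4T/bit, {six_t / 1e6:6.1f}M at 6T/bit")
```

With the more common 6T cell every figure in the table grows by half.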

4MB of cache would fit nicely with my speculation in this thread:

Can SRAM be designed for unified cache/eDRAM functionality (RSX)?

... though that sounds like way too much to be on one die...

However, a 1MB cache sounds reasonable with a further 8x compression: effectively an 8MB DXT1 texture (4k x 4k) would always fit, or a 256KB cache would do for 2k x 2k textures. This suggests that the TMUs could always work on the compressed texture, which would obviously be a great bandwidth-saving feature...

Edit: typos...
 
Thinking about it more, I can't see a "further" 8x compression, which would effectively make it 64x...so my guesstimate is that it's 2k x 2k DXT1 textures compressed 8x, which would mean 2MB of cache...
 
You have to remember though that texture cache may not necessarily be embedded on the gpu (in terms of the 3dfx at least IIRC). It wasn't for the 3dfx and I'm sure that much cache wouldn't be embedded on the RSX...IMO that would be a massive waste of the transistor budget.
 
ROG27 said:
You have to remember though that texture cache may not necessarily be embedded on the gpu (in terms of the 3dfx at least IIRC). It wasn't for the 3dfx and I'm sure that much cache wouldn't be embedded on the RSX...IMO that would be a massive waste of the transistor budget.

Yeah, as I suggested above, too much cache would suggest 2 dies. You'd be looking at some sort of MCM package. However, in the linked thread above, speculation was that if you could modify SRAM to also act as eDRAM for PS2 GS backwards compatibility, it would be multiple wins...

Without knowing the texture res, the cache could be,

4k x 4k -> 8MB
2k x 2k -> 2MB
1k x 1k -> 512 KB
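Those sizes follow from the DXT1 layout, where each 4x4 block of texels takes 8 bytes. A small sketch, assuming a single mip level only (a full mip chain would add roughly a third on top); the function name is made up.

```python
def dxt1_bytes(width, height):
    """Size of one DXT1 mip level: 8 bytes per 4x4 texel block."""
    blocks_x = (width + 3) // 4   # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * 8


if __name__ == "__main__":
    for side in (4096, 2048, 1024):
        print(f"{side} x {side}: {dxt1_bytes(side, side) // 1024}KB")
```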

The 512 KB cache would match the PPE's L2 cache in CELL. Given the consoles' limited memory, it could well be these lower res textures...
 
Why do people think Barbarian is more inclined to leak real info than the rest of the devs? I expect he's as much a tease as the rest of them. In fact, I heard tell of an inter-developer competition to see who can get the biggest responses. I'm not sure of the rules, but there's something to do with factoring the number of words of the post with the level of response, which is why they also pass brief comments. That might just be a far-fetched rumour but you never know.
 
Speculation is in chaotic oscillation between "RSX is a G70 on 90nm" and "RSX will have 8MB of SRAM and/or G80 features etc..." :D. I'm really not sure Barbarian is the one smoking the good stuff...
 
Shifty Geezer said:
...I'm not sure of the rules, but there's something to do with factoring the number of words of the post with the level of response, which is why they also pass brief comments. That might just be a far-fetched rumour but you never know.

Well, I'm not a dev, but let me try:

"Voxels, Nurbs, Ray-tracing...Oh My!"
 
I remember that some time ago someone brought up the possibility of using 1T-SRAM, which could cut the transistor count to as little as 1/6 of what a regular SRAM needs. Sony has licensed the right to use 1T-SRAM, right?
 
Can any devs comment on how, according to PSM, "The final ps3 dev kit is alot faster than they thought (according to the developer)" and according to Engadget blog, "the final hardware kits released to developers in January were even more powerful than originally anticipated"?
 
ROG27 said:
Can any devs comment on how, according to PSM, "The final ps3 dev kit is alot faster than they thought (according to the developer)" and according to Engadget blog, "the final hardware kits released to developers in January were even more powerful than originally anticipated"?

Link, kind Sir... :D
 