SM = Streaming Multiprocessor
RF = Register File
Right, that's why I said per SM rather than per cluster. The changes I'm talking about are at the SM level (and then higher up at the scheduler, and then out into the RF).
Uhm, but 2x8x10 equals 160... That simply doesn't add up to the 240 figure...
So in each cluster there should be another 8 SPs that aren't counted in the SMs?
Like Rys said, the changes are at the SM level. That doesn't have anything to do with the number of SMs per cluster (it could still be 3, for a total of 240 SPs).
The 8 SP you're referring to is an SM.
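To make the arithmetic in the posts above concrete, here's a quick sketch of the count. The 8 SPs per SM and the 10 clusters come from the thread; the function name and passing SMs per cluster as a parameter are just for illustration.

```python
def total_sps(clusters, sms_per_cluster, sps_per_sm=8):
    """Total scalar processors = clusters x SMs per cluster x SPs per SM."""
    return clusters * sms_per_cluster * sps_per_sm

# 2 SMs per cluster, as assumed in the "2x8x10" post
two_per_cluster = total_sps(clusters=10, sms_per_cluster=2)

# 3 SMs per cluster, as suggested in the reply
three_per_cluster = total_sps(clusters=10, sms_per_cluster=3)

print(two_per_cluster, three_per_cluster)  # prints: 160 240
```

So the 240 figure only needs a third SM in each cluster, not extra SPs sitting outside the SMs.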
OK, so it could be the inclusion of a scheduler and RF for each SM (instead of one for each cluster), leaving each SM completely independent from the others in terms of thread processing? :smile:
Each SM already has its own scheduler and RF in G80, AFAIK. When it comes to CUDA, all considerations are always per SM: registers, threads, shared memory, etc. It looks like the only things the SMs share are the TMUs and L1 cache, and they are independent otherwise.
Edit: Heh, pretty much just like that ^
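Since the per-SM point comes up a lot with CUDA, here's a rough sketch of how those per-SM limits bound the number of thread blocks resident on one SM. The limits are the G80-class (compute capability 1.0) figures; the kernel numbers in the example call are made up, and the estimate deliberately ignores allocation granularity, so treat it as an approximation rather than what the hardware or NVIDIA's occupancy calculator does exactly.

```python
# Per-SM limits for G80-class hardware (CUDA compute capability 1.0).
G80_SM = {
    "registers": 8192,        # 32-bit registers per SM
    "shared_mem": 16 * 1024,  # bytes of shared memory per SM
    "max_threads": 768,       # resident threads per SM
    "max_blocks": 8,          # resident thread blocks per SM
}

def blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block, sm=G80_SM):
    """Simplified estimate of resident blocks per SM (ignores allocation granularity)."""
    limits = [
        sm["max_blocks"],
        sm["max_threads"] // threads_per_block,
        sm["registers"] // (regs_per_thread * threads_per_block),
    ]
    if smem_per_block > 0:
        limits.append(sm["shared_mem"] // smem_per_block)
    # The tightest per-SM resource decides how many blocks fit.
    return min(limits)

# Hypothetical kernel: 256 threads/block, 10 registers/thread, 4 KB shared memory/block
print(blocks_per_sm(256, 10, 4096))  # prints: 3
```

Nothing in that calculation refers to the cluster, which is the point: registers, threads and shared memory are all budgeted per SM.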
nVidia slides from CJ clearly show that in official numbers it's 240. Of course, perhaps they really will be more powerful due to the "rediscovered MUL".
And while I'm at it, I don't think it's 240 either.
So mystery solved, I suppose...
Not quite solved. Those resources (scheduler and RF) have been per-SM from the beginning, and their basic architecture doesn't really change with this new chip (although the RF is a different size now).
But what does IU mean, in your opinion? And what's that "local memory"? Maybe they have embedded the L2 cache into each SM?
Yeah, I said upstream that it's 240 FP32 SPs.
Ouchie on the possible price! G200-A3 with GDDR5 as an $899 Ultra-like product?
L2 is still pooled (but proportionally bigger; there's quite a lot of SRAM on this thing, although nothing compared to RV770).
Some people under NDA are hinting that Nvidia has even faster products than the GTX 280 (65 nm) on its schedule for 2008, and it's not the GT200b. Any info on that?
That picture's not really representative.
And what about RV770, are you thinking of a further increase of cache compared to R600/RV670?