Do the transistors matter?

weaksauce

360 CPU - 165M
Xenos - 230 in core, about 100 in edram

Cell - 234M, with all the SPUs, or whatever it's supposed to have eight of.
RSX - 300+

And looking at how the eDRAM only manages to handle the HD resolution and AA, Xenos has about 70M fewer transistors to work with. The 360 CPU is short on some transistors too.

But does it matter, at all?
 
weaksauce said:
360 CPU - 165M
Xenos - 230 in core, about 100 in edram

Cell - 234M, with all the SPUs, or whatever it's supposed to have eight of.
RSX - 300+

And looking at how the eDRAM only manages to handle the HD resolution and AA, Xenos has about 70M fewer transistors to work with. The 360 CPU is short on some transistors too.

But does it matter, at all?

Some people will say it does, others say it doesn't. I personally think it does, because there are fewer transistors for the logic.
 
Theoretically yes, it matters. It's the transistors that do the work. All things being equal, and if the designs are equally intelligent, then more transistors have to equal better results. The only way more transistors don't provide a benefit is if they're not used as efficiently.

Due to the diversity of the machines, it's yet to be seen whether in practical terms the machine with more transistors does a better job, though. You should also count the eDRAM transistors in the GPU as valid, because they serve a purpose that the logic would otherwise need to fulfil. Hence the difference in transistor counts between XB360 and PS3 is probably pretty much only the difference in counts between Cell and XeCPU.
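
To put rough numbers on that, here is a quick tally using only the counts quoted earlier in the thread (in millions). It is a sketch, not verified data: RSX was given as "300+", so 300 is treated as a lower bound, and the variable names are made up for illustration.

```python
# Transistor counts in millions, as quoted in this thread (not verified).
xbox360 = {"cpu": 165, "gpu_core": 230, "gpu_edram_die": 100}
ps3 = {"cpu": 234, "gpu": 300}  # RSX was quoted as "300+", so 300 is a floor

xbox360_gpu = xbox360["gpu_core"] + xbox360["gpu_edram_die"]
gpu_gap = ps3["gpu"] - xbox360_gpu      # PS3 GPU minus 360 GPU incl. eDRAM
cpu_gap = ps3["cpu"] - xbox360["cpu"]   # Cell minus 360 CPU
total_gap = sum(ps3.values()) - sum(xbox360.values())

print(f"GPU gap: {gpu_gap}M, CPU gap: {cpu_gap}M, total gap: {total_gap}M")
```

On these figures the 360's GPU is slightly ahead once the eDRAM die is counted, and the overall gap comes mostly from the CPUs. The real RSX count could shift that in either direction.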
 
Shifty Geezer said:
Theoretically yes, it matters. It's the transistors that do the work. All things being equal, and if the designs are equally intelligent, then more transistors have to equal better results. The only way more transistors don't provide a benefit is if they're not used as efficiently.

Due to the diversity of the machines, it's yet to be seen whether in practical terms the machine with more transistors does a better job, though. You should also count the eDRAM transistors in the GPU as valid, because they serve a purpose that the logic would otherwise need to fulfil. Hence the difference in transistor counts between XB360 and PS3 is probably pretty much only the difference in counts between Cell and XeCPU.

About the eDRAM: if you chose not to use AA, which they probably never will, but still, would you then be able to free the core up for something else?
 
Athlon 64 (Venice - 512KB L2) - 76 million transistors
Athlon 64 FX (San Diego - 1MB L2) - 114 million transistors
Pentium 4 (Prescott - 1MB L2) - 125 million transistors
Pentium-M (Dothan - 2MB L2) - 140 million transistors

Considering that the Athlon 64/FX trounces the P4 and runs 1GHz slower doing so, I'd say #transistors /= performance. Within the same architecture, however, you can get more performance with more transistors by adding extra cache.
 
scooby_dooby said:
Don't NVIDIA cards usually have much higher transistor counts than ATI for roughly the same performance?

Not really. Take the X1800XT and the 7800GTX for comparison. The X1800XT has 321 million transistors and the 7800GTX has 302 million. Throw in some wild hardware differences: the X1800XT has only 16 pixel pipelines while the 7800GTX has 24, and the X1800XT has only 16 texture units while the 7800GTX again has 24. Then add the fact that, comparing the standard model of each, the X1800XT wins most benchmarks, plus the differences in clock speeds, etc. My conclusion: no, there is no way you can compare performance by number of transistors across two wildly different architectures.

Looking at the A64s and P4s once again shows that you can't.

Overall: transistor count and clock speed tell us little about performance. You have to have a fundamental understanding of how all of those items are being used to grasp a chip's performance.
 
Transistor usage is much more important than transistor count.

Before transistor counts tell you anything, you have to know precisely how many of those transistors are redundant systems built in to improve yields instead of performance.
 
OtakingGX said:
Athlon 64 (Venice - 512KB L2) - 76 million transistors
Athlon 64 FX (San Diego - 1MB L2) - 114 million transistors
Pentium 4 (Prescott - 1MB L2) - 125 million transistors
Pentium-M (Dothan - 2MB L2) - 140 million transistors

Considering that the Athlon 64/FX trounces the P4 and runs 1GHz slower doing so, I'd say #transistors /= performance. Within the same architecture, however, you can get more performance with more transistors by adding extra cache.

That's a very simplified way of looking at it. There are more factors that need to be accounted for. As for this discussion, a higher transistor budget does allow for higher performance if that budget can be used to its full potential. In regards to PS3 and Xb360 - who knows? Each new architecture and its design is a gamble on the vendor's part - a guess at the future and at what developers will want to use. More dedicated logic or more flexible units? The designs and philosophies are quite different, and we won't know until a few years down the road which design choices turn out to be the better ones - and even then, the answers will be mixed: you'll have certain developers who love developing on one platform and extract loads of performance for their visions while not liking the other, and vice versa.
 
Also keep in mind that Cell should be pretty much on par with Xenon (the 360 CPU) in regards to logic gates. 1 MB of cache memory is about 50 million transistors on a conservative estimate. Xenon includes 1 MB of shared cache, while Cell has 512 KB + 8 * 256 KB for the SPE local stores (about 2.5 MB in total, or 125 million transistors) of fast on-chip SRAM.
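
For anyone who wants to check the per-megabyte figure, here is a minimal sketch. It assumes a plain 6-transistor SRAM cell and counts only the storage array (no tags, decoders or sense amps); the function name is invented for the example.

```python
def sram_transistors(capacity_bytes, transistors_per_bit=6):
    """Transistors in the storage array alone, assuming a 6T cell by default."""
    return capacity_bytes * 8 * transistors_per_bit

KB = 1024
MB = 1024 * KB

one_mb = sram_transistors(1 * MB)
cell_sram_bytes = 512 * KB + 8 * 256 * KB   # L2 plus eight SPE local stores
cell_sram = sram_transistors(cell_sram_bytes)

print(f"1 MB: {one_mb / 1e6:.1f}M transistors")
print(f"Cell: {cell_sram_bytes / MB:.1f} MB, {cell_sram / 1e6:.1f}M transistors")
```

Under that assumption 1 MB lands at roughly 50 million transistors and 2.5 MB at roughly 126 million, which lines up with the estimate in the post.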
 
Long story short... yes and no. Transistors do matter, but only to a certain degree. While you could get more performance per clock from using more transistors, more transistors also mean more heat and power consumption, and that makes it harder to run the processor at higher clock speeds.

Take the Radeon X850 and the GeForce 6800 Ultra for example... both video cards performed roughly the same overall, but the Radeon X850 had 160 million transistors and the GeForce 6800 Ultra had 220 million. The Radeon X850 ran at 540MHz and the GeForce 6800 Ultra ran at 400MHz... but in the end they both performed the same (maybe a very slight advantage to the Radeon X850). The difference? The yield on the Radeon X850 was for a long time a lot better than on the GeForce 6800 Ultra, and as a result ATI made more money (profit) on those GPUs than nVidia did with their GeForce 6800 Ultras.

As for the current GPUs (Radeon X1800 and nVidia 7800 series)... I don't think either GPU has very good yield at the moment, due to the high transistor count of both. In that case, between the two cards, it comes down to HOW you use those transistors and the clock rate, both of which become more important than the number of transistors.
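
As a rough illustration of the X850 vs 6800 Ultra trade-off, the sketch below multiplies transistor count by clock speed using only the figures quoted in this post. That product is a made-up proxy for the example, not a benchmark or any standard metric.

```python
# Figures as quoted in the post above (transistors in millions, clock in MHz).
cards = {
    "Radeon X850": {"transistors_m": 160, "clock_mhz": 540},
    "GeForce 6800 Ultra": {"transistors_m": 220, "clock_mhz": 400},
}

def switching_budget(card):
    """Crude proxy: transistors (millions) x clock (MHz). Not a benchmark."""
    return card["transistors_m"] * card["clock_mhz"]

budgets = {name: switching_budget(c) for name, c in cards.items()}
ratio = budgets["Radeon X850"] / budgets["GeForce 6800 Ultra"]

for name, value in budgets.items():
    print(f"{name}: {value}")
print(f"X850 / 6800 Ultra = {ratio:.2f}")
```

The two products come out within about 2% of each other, which fits the "fewer transistors, higher clock, same result" picture, though it says nothing about how well either design actually uses its transistors.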

Transistors do matter, but so does the clock speed of the processor... but I believe HOW those transistors are used is even more important than both.
 
PiNkY said:
Also keep in mind that Cell should be pretty much on par with Xenon (the 360 CPU) in regards to logic gates. 1 MB of cache memory is about 50 million transistors on a conservative estimate. Xenon includes 1 MB of shared cache, while Cell has 512 KB + 8 * 256 KB for the SPE local stores (about 2.5 MB in total, or 125 million transistors) of fast on-chip SRAM.

Hrm... I am not totally sure how many transistors are needed per bit of L2 memory, but I believe it is 6 transistors per bit, if I remember right. XENON has 1MB of L2, which at 6 transistors per bit (48 transistors per byte) comes out to roughly 49 million transistors subtracted from the 165 million. On the Cell processor you have 512KB of L2 + 256KB of local store per SPE (PS3's Cell has 7 ACTIVE SPEs instead of the full 8). ArsTechnica reports each SPE as having 21 million transistors, of which 7 million are logic and the other 14 million are SRAM. That comes out to 512KB (at 6T per bit) + 256KB (14M of SRAM per SPE) * 7, or about ~122.6 million transistors subtracted from the 234 million. That is not all though... remember ONE SPE was disabled for yield, and that removes another 21 million transistors from the 234 million. In total... the Cell processor has roughly 90 million transistors that are not SRAM and were not disabled, and XENON has roughly 117 million transistors that are not SRAM.
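
The subtraction can be sketched as below. It assumes a 6T SRAM cell, binary kilobytes, and the per-SPE figures attributed to ArsTechnica above; rounding and the choice of binary vs decimal megabytes will move the results by a couple of million. The helper name is invented for the example.

```python
KB = 1024
T_PER_BIT = 6  # assumed 6T SRAM cell, as in the post

def sram_m(capacity_bytes):
    """SRAM array transistors, in millions."""
    return capacity_bytes * 8 * T_PER_BIT / 1e6

# XENON: 165M total, 1 MB of L2
xenon_logic = 165 - sram_m(1024 * KB)

# Cell: 234M total, 512 KB of L2, 7 active SPEs with 14M of SRAM each,
# and one whole 21M SPE disabled for yield
cell_l2 = sram_m(512 * KB)
cell_logic = 234 - cell_l2 - 7 * 14 - 21

print(f"XENON non-SRAM: {xenon_logic:.1f}M")
print(f"Cell non-SRAM, active: {cell_logic:.1f}M")
```

With these inputs XENON ends up a bit under 115M and Cell a bit under 90M of active non-SRAM transistors.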

So yeah, I guess you are right... both processors should have a roughly equivalent number of logic gates, perhaps a little more on XENON compared to the Cell. But as I said before... it is how you use those transistors that also matters, and on that note only time will tell which implementation will be better suited to the gaming environment.

*EDIT* Updated Cell's transistor count for the correct amount.
 
The GameMaster said:
Hrm... I am not totally sure how many transistors are needed per bit of L2 memory, but I believe it is 6 transistors per bit, if I remember right. XENON has 1MB of L2, which at 6 transistors per bit (48 transistors per byte) comes out to roughly 49 million transistors subtracted from the 165 million. On the Cell processor you have 512KB of L2 + 256KB of local store per SPE (PS3's Cell has 7 ACTIVE SPEs instead of the full 8). ArsTechnica reports each SPE as having 21 million transistors, of which 7 million are logic and the other 14 million are SRAM. That comes out to 512KB (at 6T per bit) + 256KB (14M of SRAM per SPE) * 7, or about ~122.6 million transistors subtracted from the 220 million. That is not all though... remember ONE SPE was disabled for yield, and that removes another 21 million transistors from the 220 million. In total... the Cell processor has roughly 76.4 million transistors that are not SRAM

234M - 21M - cache (512*1024*8*6 = ~25M) - SRAM (98M) = ~90M

In fairness, though, we don't know if there'd be any redundant transistors within the cores/SPUs either. It is a nice illustration of the control logic vs core count tradeoff in Cell though; they've certainly got a lot out of their transistor budget from an execution POV. And of course, the memory structure is kind of fundamental to its performance. Which I guess just kind of illustrates what everyone's saying here! ;)
 
Let's not forget that transistors are often counted in different ways, so you can only really compare transistor counts if the chips come from the same company.
 
All this talk of transistors and eDRAM vs SRAM efficiency made me think of all the research Sony/Toshiba are pouring into floating body eDRAM. In the future, imagine the area-hungry SRAM on each SPU replaced with eDRAM using the floating body effect. Transistor counts for logic and memory would be close to parity, which would reverse the current trend on CPUs, where the bulk of transistors are memory.

The future of CELL looks very bright if SOI eDRAM can be pulled off. Now all Kutaragi needs to do is work out an alliance with nVidia similar to the one forged with IBM and Toshiba on CELL.
 
weaksauce said:
360 CPU - 165M
Xenos - 230 in core, about 100 in edram
Xenos logic should be more than 255 Mtransistors. (edram is 80+ Mtransistors)
Cell - 234M, with all the spu's or whatever it's supposed to have eight of.
Cell DD2 is more than 250 Mtransistors.

But does it matter, at all?
maybe ;)
 
Brimstone said:
All this talk of transistors and eDRAM vs SRAM efficiency made me think of all the research Sony/Toshiba are pouring into floating body eDRAM. In the future, imagine the area-hungry SRAM on each SPU replaced with eDRAM using the floating body effect. Transistor counts for logic and memory would be close to parity, which would reverse the current trend on CPUs, where the bulk of transistors are memory.

The future of CELL looks very bright if SOI eDRAM can be pulled off. Now all Kutaragi needs to do is work out an alliance with nVidia similar to the one forged with IBM and Toshiba on CELL.
Isn't eDRAM terrible on latency?
 
Everyone has summed it up well. It matters, but so does efficient chip design and clock speed.
nAo said:
Xenos logic should be more than 255 Mtransistors. (edram is 80+ Mtransistors)
I thought it was ~232M for logic and ~100M for eDRAM + logic on the daughter die.
 
one said:
Isn't eDRAM terrible on latency?

Traditional eDRAM isn't on SOI, while the floating body eDRAM will be on the SOI itself, allowing for very fast speeds. I don't know for sure if it would be fast enough or what problems it might present.

Maybe if someone asked Peter Hofstee, he might answer on the possibility of replacing the SRAM on CELL with Floating Body CELL eDRAM at some point in the future? He posts on the IBM CELL website, correct?
 
Nicked said:
I thought it was ~232M for logic and ~100M for eDRAM + logic on the daughter die.
I thought the 'other die' was 100M, 80 mem + 20 logic.
 