RSX pipeline speculation?

RSX pipeline configuration?

  • Option A: 8 VS pipes, 2-issue: 24 PS pipes, 5-issue

    Votes: 39 59.1%
  • Option B: 12 VS pipes, 3-issue: 20 PS pipes, 5-issue

    Votes: 5 7.6%
  • Option C: 4 VS pipes, 4-issue: 24 PS pipes, 5-issue

    Votes: 10 15.2%
  • Option D: 20 VS pipes, 2-issue: 16 PS pipes, 6-issue

    Votes: 3 4.5%
  • Option E: Other! Please specify...

    Votes: 9 13.6%

  • Total voters
    66

j^aws

Veteran
At E3, 136 shader ops per cycle were presented for RSX, which can be equated to 136 instructions/cycle.

Also, 51 billion dot products per second were presented for CELL+RSX combined. From this it was deduced that RSX would have 52 DOTS/cycle --> 52 vec4 units.

http://www.beyond3d.com/forum/showpost.php?p=484757&postcount=119

An overclocked 7800 would have 56 DOTS/cycle (24 PS pipes x 2 vec4 + 8 VS pipes x 1 vec4) and therefore does not agree with the 52 DOTS/cycle figure. However, it does agree with 136 instructions/cycle.

We know that each PS pipe has 2 vec4 units and each VS pipe has 1 vec4 unit. Also, PS pipes are 5-issue and VS pipes are 2-issue, i.e.


PS issue rate * PS pipes + VS issue rate * VS pipes = 136

2*PS pipes + 1* VS pipes = 52

PS pipes > 0 and integer
VS pipes > 0 and integer

Now solving for,

PS issue rate = 5,6,7
VS issue rate = 2,3,4,5

(VS:PS pipes)  VS issue 2     VS issue 3     VS issue 4     VS issue 5
PS issue 5     FAIL           12 VS:20 PS    4 VS:24 PS     FAIL
PS issue 6     20 VS:16 PS    FAIL           FAIL           FAIL
PS issue 7     FAIL           FAIL           FAIL           FAIL

So in order to get a valid solution to the above equations, either the PS pipes are modified for 6-issue or the VS pipes are modified for 3-issue or 4-issue.
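
The grid above can be reproduced with a quick brute-force check. This is only a sketch under the two constraints already stated (136 shader ops/cycle and 52 dot products/cycle, with 2 vec4 units per PS pipe and 1 per VS pipe); the function name `solve_pipes` is made up for illustration.

```python
def solve_pipes(ps_issue, vs_issue, total_ops=136, total_dots=52):
    """Return (vs_pipes, ps_pipes) satisfying both constraints, or None.

    Constraints (from the post above):
        ps_issue * ps_pipes + vs_issue * vs_pipes == total_ops
        2 * ps_pipes + 1 * vs_pipes == total_dots
    Both pipe counts must be positive integers.
    """
    for ps_pipes in range(1, total_dots // 2 + 1):
        vs_pipes = total_dots - 2 * ps_pipes
        if vs_pipes > 0 and ps_issue * ps_pipes + vs_issue * vs_pipes == total_ops:
            return vs_pipes, ps_pipes
    return None


if __name__ == "__main__":
    for ps_issue in (5, 6, 7):
        for vs_issue in (2, 3, 4, 5):
            result = solve_pipes(ps_issue, vs_issue)
            label = "FAIL" if result is None else f"{result[0]} VS:{result[1]} PS"
            print(f"PS issue {ps_issue}, VS issue {vs_issue}: {label}")
```

Passing `total_dots=56` with 5-issue PS and 2-issue VS gives 8 VS:24 PS, which is Option A, the plain 7800 layout.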

Summary

Option A

8 VS pipes, 2-issue
24 PS pipes, 5-issue

Modified 7800 and basically disagrees with the dot product numbers from E3.

Option B

12 VS pipes, 3-issue
20 PS pipes, 5-issue

Option C

4 VS pipes, 4-issue
24 PS pipes, 5-issue

Option D

20 VS pipes, 2-issue
16 PS pipes, 6-issue

Option E

Other!

Please specify...
 
Well... my current assumption is that the RSX is an overclocked, uncrippled GeForce 7800 GTX (with the differences being a FlexIO interface instead of a PCIe interface and a 128-bit memory interface instead of the 256-bit memory interface).

Though given the clock speed target AND the presence of a 128-bit memory interface, I am inclined to believe it may be closer to a GeForce 7800 GT. If you remember back to the GeForce 6 series, the biggest differences between the 6800 series and the 6600 series were clock speed, transistor count, pipeline count and memory interface. The GeForce 6600 GT had 8 pixel pipelines and 3 vertex pipelines, a transistor count of about 140 million, and a clock speed of 500MHz. Compare this to the 6800 GT, which had a transistor count of about 220 million, a clock speed of 350MHz, and all of its pipelines intact (16 pixel/6 vertex). The last difference is that the 6600 series had a 128-bit memory interface compared to the 256-bit memory interface of the 6800 series.

If that precedent holds, then the RSX would end up having 16 pixel pipelines / 6 vertex pipelines, a transistor count of 200-220 million, the 128-bit memory interface, and a clock rate of 500-550MHz, which would be very similar to what the specifications of the GeForce 7600 GT should look like.

That is my line of thought anyway... time will tell if that proves accurate.
 
The GameMaster said:
If that precedent holds, then the RSX would end up having 16 pixel pipelines / 6 vertex pipelines, a transistor count of 200-220 million, the 128-bit memory interface, and a clock rate of 500-550MHz, which would be very similar to what the specifications of the GeForce 7600 GT should look like.

..

For starters, they already said RSX would be over 300m transistors. I don't see how 6vs/16ps would really fit with their other figures either.

Interesting topic, but I'm not gonna venture to guess what the exact makeup is. Hopefully we'll find out sooner rather than later though.
 
For all you 'sneaky' voters that are choosing 'other', you're not explaining why! Please do... ;)
 
My assumptions would be that the RSX is at the very least 1 step ahead of the 7800 which is somewhere in between the G70 and G80.

GameMaster, you are predicting the RSX by comparing the ratio of transistor counts and VS/PS pipes of the last-generation 6 series, which may very well be inaccurate. It doesn't work that way in console GPUs. Take Xenos, for example. Does it have a high transistor count without the eDRAM daughter die included? The G70 may have a similar transistor count to the RSX (>300M), but the RSX is customised for a console, just as Xenos is relative to R520.

The specs which Nvidia announced earlier aren't final. In addition, Nvidia has released a clarification to TeamXbox that the RSX will indeed be more powerful than the G70.
Nvidia is keeping tight-lipped at the moment because they are reluctant to disclose roughly what their next PC part, the G80 or whatever it is called, will be like.
 
The only logical conclusion is that it is an overclocked 7800 GTX (same pipe numbers) with maybe a few feature changes (who knows?) and a FlexIO interface -- considering the transistor count is only 2 million off the 7800 GTX (and there shouldn't be those useless 20 million transistors for video acceleration). I wouldn't put much weight on the dot product calculation because it was figured out by different people than the original (entire system dot product numbers) -- people still have trouble replicating the gflops numbers for Cell and XeCPU, so how can we assume that people can magically get an accurate split of the dot product between Cell and RSX?

Don't cross out possibilities based on forum goer calculations.
 
Bobbler said:
... I wouldn't put much weight on the dot product calculation because it was figured out by different people than the original (entire system dot product numbers) --

I don't know who you're referring to but those numbers were derived from the entire system by myself and verified by Panajev,

http://www.beyond3d.com/forum/showpost.php?p=485014&postcount=133

people still have trouble replicating the gflops numbers for Cell and XeCPU,

I don't know who these people are but both can be replicated easily...

how can we assume that people can magically get an accurate split of the Dot product of Cell and RSX?

There's no magic. It's lateral thinking and maths...if you don't understand the derivation then just ask and I'll explain...

Don't cross out possibilities based on forum goer calculations.

Who's crossing out possibilities? In fact this Poll adds an extra 3 possibilities to the 'defacto' overclocked 7800 possibility...
 
Jaws said:
I don't know who you're referring to but those numbers were derived from the entire system by myself and verified by Panajev

What I'm saying is that with all the fudginess that goes on with the numbers, can you honestly say you know for certain the split that Sony chose when they came up with this 51B dot products figure? I don't put much weight in it because the fudgy creators of the number aren't the ones that are splitting it -- there is a bit of unknown in how they came up with it (as you alluded to in the post that you linked in your original post). I don't have any real doubts about people's abilities to calculate numbers (those are probably the numbers I'd come up with as well, but since I don't know exactly how Sony got the numbers I can't put much weight in them); it's Sony that I have some doubt in (not because they are technically inept, but because they may bend/warp the truth a bit here and there).

Also I vaguely remember quite a few people having trouble coming up with the exact same gflop numbers that Sony/MS did.
 
gflop numbers

Bobbler said:
Also I vaguely remember quite a few people having trouble coming up with the exact same gflop numbers that Sony/MS did.

Yes a lot of articles said CELL cannot do 217.6 or ~218Gflops but forget that PPE of Cell = 1 core of Xenos. Total Xenos = 115 Gflops so one core = 115/3 = 1.5 x SPE gflop rating. So CELL = 7 SPE + (1.5 x SPE) = 217.6 Gflops. So if Xenos rating correct, then CELL rating correct. Article writers = forgetful.
 
Mistype

ihamoitc2005 said:
Yes a lot of articles said CELL cannot do 217.6 or ~218Gflops but forget that PPE of Cell = 1 core of Xenos. Total Xenos = 115 Gflops so one core = 115/3 = 1.5 x SPE gflop rating. So CELL = 7 SPE + (1.5 x SPE) = 217.6 Gflops. So if Xenos rating correct, then CELL rating correct. Article writers = forgetful.

Sorry, I meant Xenon, not Xenos.
 
Bobbler said:
What I'm saying is that with all the fudginess that goes on with the numbers, can you honestly say you know for certain the split that Sony chose when they came up with this 51B dot products figure?

Calculate for CELL, Calculate for RSX, then add them up.

Bobbler said:
I don't put much weight in it because the fudgy creators of the number aren't the ones that are splitting it -- there is a bit of unknown in how they came up with it (as you alluded to in the post that you linked in your original post). I don't have any real doubts about people's abilities to calculate numbers (those are probably the numbers I'd come up with as well, but since I don't know exactly how Sony got the numbers I can't put much weight in them); it's Sony that I have some doubt in (not because they are technically inept, but because they may bend/warp the truth a bit here and there).

Well, your argument seems to be based on Sony 'fudging' the 51 B dots/sec. The problem with that argument is that the combined CELL+RSX number is LOWER than it should be with an overclocked 7800. Unless Sony has now reached a new low and fudges low, it doesn't make sense. As I already mentioned for Option A, the original dot number from E3 could be a mistake, but that 51 B number is lower than it should be.

Bobbler said:
Also I vaguely remember quite a few people having trouble coming up with the exact same gflop numbers that Sony/MS did.

The common problem is that a single PPE/XeCPU core is rated at 12 Flops/cycle according to the PR Gflops numbers. What was expected was 8 from VMX and 2 from the FPU, making 10 Flops/cycle. However, the rating counts 4 from the FPU because of 2-way SIMD, as in the GameCube's CPU, Gekko. Now these numbers are more like fudging...
 
^^

XeCPU= 12 Flops/cycle x 3 Cores x 3.2 GHz
~ 115 GFlops

CELL PS3 = [12 Flops/cycle x 1PPE + 8 Flops/cycle x 7 SPE] x 3.2 GHz
~ 68 x 3.2 GHz
~ 218 GFlops
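
The same arithmetic as a small script, for anyone who wants to plug in other counts. This is just a sketch of the peak-rate sums above; the helper name `peak_gflops` is made up, and the per-cycle figures (12 for a PPE/XeCPU core, 8 for an SPE) are the PR ratings quoted in this thread, not measured numbers.

```python
CLOCK_GHZ = 3.2  # clock used for both chips in the posts above


def peak_gflops(flops_per_cycle, units, clock_ghz=CLOCK_GHZ):
    """Peak GFlops = flops per cycle per unit * number of units * clock in GHz."""
    return flops_per_cycle * units * clock_ghz


# XeCPU: 3 cores at 12 Flops/cycle each
xecpu_gflops = peak_gflops(12, 3)

# CELL in PS3: 1 PPE at 12 Flops/cycle plus 7 SPEs at 8 Flops/cycle each
cell_gflops = peak_gflops(12, 1) + peak_gflops(8, 7)

# One XeCPU core expressed in SPE ratings (the "1.5 x SPE" mentioned earlier)
core_in_spes = peak_gflops(12, 1) / peak_gflops(8, 1)

if __name__ == "__main__":
    print(f"XeCPU ~ {xecpu_gflops:.1f} GFlops")
    print(f"CELL  ~ {cell_gflops:.1f} GFlops")
    print(f"1 core = {core_in_spes:.1f} x SPE")
```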
 
I chose Option A. How's it work with the DOT product number? 6 SPEs. :D One is tied up for the OS. That brings me to 50B DOTS if you use 56 DOT/cycle for RSX then. 1B DOTs missing. I chalk it up to a rounding error. :lol Come on, 2% error is acceptable in polls. It's acceptable in forum speculation. AMIRITE? PEACE.
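
For reference, the sums being compared here can be sketched as below. The big assumption, labeled in the code too, is one dot product per SPE per cycle at 3.2 GHz with RSX at 550 MHz; that per-SPE rate is picked because it reproduces the 51 B figure with 52 DOTS/cycle, not because the E3 slides spell it out. The function name is made up.

```python
def dots_per_second_billions(spes, rsx_dots_per_cycle,
                             cell_ghz=3.2, rsx_ghz=0.55):
    """Combined CELL+RSX dot products per second, in billions.

    ASSUMPTION: each SPE retires 1 dot product per cycle and the PPE is
    not counted. The clocks (3.2 GHz, 550 MHz) are the ones used in this
    thread.
    """
    cell = spes * 1 * cell_ghz
    rsx = rsx_dots_per_cycle * rsx_ghz
    return cell + rsx


if __name__ == "__main__":
    cases = {
        "7 SPEs + 52 DOTS/cycle": (7, 52),
        "7 SPEs + 56 DOTS/cycle": (7, 56),
        "6 SPEs + 56 DOTS/cycle": (6, 56),
    }
    for label, (spes, rsx) in cases.items():
        print(f"{label}: {dots_per_second_billions(spes, rsx):.1f} B dots/sec")
```

Under that assumption the three cases come out at roughly 51.0, 53.2 and 50.0 billion, so the 6-SPE reading is about 2% short of the E3 number.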
 
MechanizedDeath said:
I chose Option A. How's it work with the DOT product number? 6 SPEs. :D One is tied up for the OS. That brings me to 50B DOTS if you use 56 DOT/cycle for RSX then. 1B DOTs missing. I chalk it up to a rounding error. :lol Come on, 2% error is acceptable in polls. It's acceptable in forum speculation. AMIRITE? PEACE.
I thought some of the IBM documents noted SPEs would not be a good place for an OS to run?

Could the small difference between Sony's slides from E3 and G70 be related to a typo? That has been known to occur on slides. G70@550MHz is very close performance wise. ::shrugs:: I am sure we will know sooner or later what modifications have been done.
 
Stop with the crazy flop ratings, they don't mean anything.
The 136 instructions number would imply a GTX, but nVidia doesn't speak about future products, so it's not a fact and could well change.
Until we know for sure, my guess goes to GTX@550MHz with FlexIO and TurboCache.
 
I haven't voted yet, but at the same time as the days go by I get further and further from the notion that it's just a GTX in console's clothing. The implication from past interviews has been that G70 and RSX development began concurrently and originated from the same architectural base. But at the same time there comes a point where I start doubting whether the move to 90nm and a different bus system alone would warrant the extreme differences in time for the chips' respective tape-outs. I mean, there can always be a spin problem, so it's not like I'm ruling that out... but with G70's tape-out now approaching a six-month-and-counting gap to RSX's upcoming tape-out...

Well, I'll be a little disappointed if it's just a tweaked GTX. Not that I'll be complaining loudly though, just more along the lines of 'get it over with already if that's what it is.'

My hope however would be a situation analogous to R520:R580::G70:RSX, where RSX obtains an incremental gain in feature set in that same 'half-generation' sort of way as it seems the R580 will.
 
xbdestroya said:
My hope however would be a situation analogous to R520:R580::G70:RSX, where RSX obtains an incremental gain in feature set in that same 'half-generation' sort of way as it seems the R580 will.
I am not sure R580 is getting any significant feature additions/improvements over R520, at least if R530 is any indication. It seems, to me, very unlikely that RSX would be to G70 what R580 will be to R520.

What appears to be happening is R580 is going to be a "Shading Stud" (what current apps need this shading power? Not sure). ATI decoupled the TMUs and ROPs and has a new threaded scheduler. With R580, if they do go the "48 fragment pixel shader" route it would seem to me it may end up in an array, ala Xenos.

R520 seems, architecturally, to be a half-step toward Xenos--minus the array and unified shaders (and eDRAM of course). Architecturally, Xenos/R520 share decoupled TMUs and a much more efficient scheduler in addition to quite a few features (e.g. FP16/10 + MSAA, 3Dc, Adaptive AA, even physics ability on the GPU). So while conjecture, if R580 is what we think it is (4-1-3-1) then treating it like an array, like Xenos, seems probable.

In which case, barring any really interesting new features (which I am sure DX10 will require), R600 will be mainly the introduction of unified shaders to the desktop PC. It seems ATI has decided to use R5xx as a test bed and has taken incremental steps by moving forward design ideas that are beneficial to current-gen GPU/API technology.

Based on the Sony figures alone, a 520=>580 style evolution for G70=>RSX is not possible (you are tripling the pixel shaders from R520 to R580). Of course, this is assuming things have not changed... who knows.

Personally I am hoping for some peppy ideas (like ATI's memory configuration on R520) + G70 @ 550MHz + FlexIO + HDR/MSAA support at the same time. I think people forget that G70 is Nvidia's flagship model. We are talking about a 28% bump in frequency to the fastest GPU currently on the market (let me check my watch... yep, still the fastest :LOL: ) If Nvidia could have introduced some really radical performance increases and such into their GPUs they would have. I am certain they wish certain situations (like no HDR + MSAA support) were not the realities they are, but they were design decisions for this generation (and in the HDR/AA case Kirk seems to indicate this may not change soon).

Sony/NV went the tried and true route and have an adaptation of NV's flagship GPU. That in itself is pretty amazing. Obviously we are not seeing the same movement from NV as we are with ATI (NV has, until recently, been against a unified shader architecture (USA), so some of ATI's ideas do not necessarily apply). G70 is a tried and true design and has proven to be a great refresh of NV40. Not much to complain about there!

MS/ATI went the riskier route with a quasi-DX10 part, but as we are finding out, a lot of the "neat" ideas in Xenos are in R520 as well. I think 12hrs from now we will begin to see a little bit more about how some of these design ideas work in the real world and whether they were the right direction to go. Of course Xenos is a custom part and dissimilar to G70/R520 in many ways... and I think that kind of underscores the GPUs in general: RSX is an adaptation of an excellent, flagship model, desktop GPU. Xenos is an adaptation of the DX10 roadmap and ATI's next gen part (which shares some common features with R520).

With RSX I am expecting more of the same--which is a good thing. Just a different approach than ATI. NV's big pay day was with the NV40, whereas ATI took it much easier with R420.

Anyhow, for those reasons I don't expect to see RSX parallel the type of R520=>R580 development we will see. I could be wrong though! :p
 
Jaws said:
For all you 'sneaky' voters that are choosing 'other', you're not explaining why! Please do... ;)
I voted "Other" so that I could see what others have voted, and because I have no idea what's the difference between "2-issue" and "5-issue" :)
 
Acert93 said:
I thought some of the IBM documents noted SPEs would not be a good place for an OS to run?

Could the small difference between Sony's slides from E3 and G70 be related to a typo? That has been known to occur on slides. G70@550MHz is very close performance wise. ::shrugs:: I am sure we will know sooner or later what modifications have been done.
Maybe not OS, but whatever functionality has been rumored to be tied to an SPE. I've read in a few different places that the OS/drivers will tie up a single SPE. I'm assuming Sony will want to have some default functionality built into the firmware/OS/driver layer that handles basics like game controls and audio and so on. So I figure dropping a single SPE can allow the RSX to keep that 56DOT/cycle like Jaws noted, and still approach that figure stated at E3. Then again, I'm assuming something must have changed in the GPU, since it really has been a long time since the G70 landed. It can't take that long to get a 30% clock bump on a smaller process, can it? PEACE.
 