More info about RSX from NVIDIA

What sort of performance advantages should RSX be expected to see over G70 in a closed-environment like the PS3, where bandwidth concerns are alleviated to a degree and the CPU should do a much better job of keeping the GPU fed?
 
Fafalada said:
ERP said:
7 ops for the DP, 1 for the RSQ, and 4 for the scale; that makes 12 in my book.
Oh crap :oops: Lack of sleep and all that, I counted scale as one op... Nevermind then :p

Unless you're processing quaternions, why would you want to normalize an r4 vector?

an r3 vec would be: 5 ops DP, 1 RSQ, 3 scales = 9 in total
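
For anyone following the counting, here is a rough sketch of it. It assumes purely scalar ops with no fused multiply-add, so every multiply and every add counts as one op; the function name is made up for illustration.

```python
def normalize_op_count(n):
    """Scalar op count for normalizing an n-component vector.

    Assumption: no fused multiply-add, each mul and add is one op.
    """
    dot = n + (n - 1)   # n multiplies plus (n - 1) adds for the dot product
    rsq = 1             # one reciprocal square root
    scale = n           # one multiply per component
    return {"dot": dot, "rsq": rsq, "scale": scale, "total": dot + rsq + scale}

print(normalize_op_count(4))  # r4 case: 7 + 1 + 4
print(normalize_op_count(3))  # r3 case: 5 + 1 + 3
```

Under those assumptions the total is simply 3n, which gives 12 for r4 and 9 for r3.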
 
Do you mean the total vertex load typically outweighs the total pixel load? I'm sure that's not the case. There's a reason why pixel shaders have always outnumbered vertex shaders..the load is "typically" weighted toward pixel shading.
I wasn't talking about total load, I was talking about per-shader load. For an individual execution of a block of shader code, a pixel shader will be lighter than a vertex shader. The reason we have more pixel shader units is because more pixels are rendered than polygons. A few instructions per pixel is a huge load, but if execution of vertex shaders can go that much faster, you can at least save that much more time.

And actually, there are plenty of cases in practice where you end up being vertex shader limited due to the fact that you have so few vertex pipelines.
 
xbdestroya said:
What sort of performance advantages should RSX be expected to see over G70 in a closed-environment like the PS3, where bandwidth concerns are alleviated to a degree and the CPU should do a much better job of keeping the GPU fed?
How are bandwidth concerns alleviated in a closed environment?
 
ralexand said:
How are bandwidth concerns alleviated in a closed environment?

Well, I was really just referring to the bandwidth offered via the FlexIO as opposed to PCI Express.
 
This looked familiar:
ps.gif

If you check the link location, it's from Dave's original NV40 review. My bet is that the texture units still aren't decoupled from ALU1. Also, that review seemed to indicate a single scalar per pipe. If Xenos also has free norm like I alluded to earlier, then here is my per-clock breakdown.

RSX - ( VS = 8 * ( vec4 + scalar + non-filtered tex ) ) + ( PS = 48 vec4 + 24 scalar + 24 norm ) or
RSX - ( VS = 8 * ( vec4 + scalar + non-filtered tex ) ) + ( PS = 24 vec4 + 24 scalar + 24 norm + 24 tex )
RSX** - ( UNI = 32 vec4 + 32 scalar + 24 norm + 24 tex + 8 non-filtered tex )
Xenos - ( UNI = 48 vec4 + 48 scalar + 16 norm + 16 tex + 16 non-filtered tex )

** total number of units combined in vertex array and pixel array
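
To show where the combined RSX** line comes from, here is a small tally sketch. The per-pipe unit counts are just the guesses from this post, not confirmed specs, and the 8 vertex / 24 pixel pipe split is an assumption read off those numbers; all the names are made up for illustration.

```python
from collections import Counter

# Assumed pipe counts, inferred from the breakdown above (speculative).
VS_PIPES = 8
PS_PIPES = 24

# Assumed units available per pipe per clock.
vs_pipe = Counter({"vec4": 1, "scalar": 1, "non-filtered tex": 1})
ps_pipe_texturing = Counter({"vec4": 1, "scalar": 1, "norm": 1, "tex": 1})
ps_pipe_no_tex = Counter({"vec4": 2, "scalar": 1, "norm": 1})

def scale(units, pipes):
    """Multiply a per-pipe unit count by the number of pipes."""
    return Counter({name: count * pipes for name, count in units.items()})

def combined(ps_pipe):
    """Sum the vertex array and pixel array into one per-clock total."""
    return scale(vs_pipe, VS_PIPES) + scale(ps_pipe, PS_PIPES)

rsx_texturing = combined(ps_pipe_texturing)
rsx_no_tex = combined(ps_pipe_no_tex)

print(dict(rsx_texturing))
print(dict(rsx_no_tex))
```

With texturing this reproduces the 32 vec4 + 32 scalar + 24 norm + 24 tex + 8 non-filtered tex line; without texturing the vec4 count rises to 56 because the first ALU is freed up.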
 
RSX** - ( UNI = 32 vec4 + 32 scalar + 24 norm + 24 tex + 8 non-filtered tex )
RSX** - ( UNI = 56 vec4 + 32 scalar + 24 norm + 24 tex + 8 non-filtered tex )
 
Unless you're processing quaternions, why would you want to normalize an r4 vector?
Same reason we have DOT4 in the ISA even though most of the time everyone just uses DOT3.
I was under the impression that NV4x already supported free Normalize4, so it would be weird if they downgraded G70 in that respect.
 
Some machine translation from the blog feature that One found in his PS3 dev kit thread; though not directly RSX related, it plays into the questions I was asking.

In case of the PS3, Cell side is, developing geometry, making form decide, it can do collision processing and interactive processing. Anyhow because it is super high speed, with the operation it is not discouraged. The operational result remains as an enormous geometry data, but this it transfers to the RSX which is the GPU with the Redwood of the high-speed communication interface. Doing tessellation and shading processing inside the RSX, real it finishes in the picture. As for the PS3 because bus width to be wide ability of the Cell is high, differs from the CPU and method of using the GPU the PC rather, it is Cell side to, it is the case that it can be popular.
 
xbdestroya said:
Some machine translation from the blog feature that One found in his PS3 dev kit thread; though not directly RSX related, it plays into the questions I was asking.

In case of the PS3, Cell side is, developing geometry, making form decide, it can do collision processing and interactive processing. Anyhow because it is super high speed, with the operation it is not discouraged. The operational result remains as an enormous geometry data, but this it transfers to the RSX which is the GPU with the Redwood of the high-speed communication interface. Doing tessellation and shading processing inside the RSX, real it finishes in the picture. As for the PS3 because bus width to be wide ability of the Cell is high, differs from the CPU and method of using the GPU the PC rather, it is Cell side to, it is the case that it can be popular.
And can you translate the translation?
 
I think it is 2*(vec4 + scalar) per pixel pipeline, not 2 vec4 + 1 scalar (2 vec4 + 2 scalar + 1 norm = 5 instructions)
total = 56 scalar, not 32
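
The arithmetic behind the two scalar totals, as a quick sketch. The 8 vertex and 24 pixel pipe counts and the one-scalar-per-vertex-pipe figure are assumptions taken from the numbers being discussed, not confirmed specs.

```python
VS_PIPES = 8    # assumed vertex pipes, one scalar unit each
PS_PIPES = 24   # assumed pixel pipes

def total_scalar(scalar_per_pixel_pipe):
    """Combined scalar units per clock across vertex and pixel arrays."""
    return VS_PIPES * 1 + PS_PIPES * scalar_per_pixel_pipe

one_scalar_total = total_scalar(1)   # 2 vec4 + 1 scalar per pixel pipe
two_scalar_total = total_scalar(2)   # 2 * (vec4 + scalar) per pixel pipe

# Issue width per pixel pipe under the second reading: 2 vec4 + 2 scalar + 1 norm.
instructions_per_pixel_pipe = 2 + 2 + 1

print(one_scalar_total, two_scalar_total, instructions_per_pixel_pipe)
```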


2005-6-21-16-10-14-654986702.gif
 
What Rockster is saying is that you can't use the first ALU if you're sampling a texture too in the same cycle. At least this is how NV40 acts, dunno about G70.
 
I'm confused about these numbers. Are they real or not? Are they just G70 numbers? Are the RSX numbers projections of the G70 numbers based on the GHz increase?
 
ralexand said:
xbdestroya said:
Some machine translation from the blog feature that One found in his PS3 dev kit thread; though not directly RSX related, it plays into the questions I was asking.

In case of the PS3, Cell side is, developing geometry, making form decide, it can do collision processing and interactive processing. Anyhow because it is super high speed, with the operation it is not discouraged. The operational result remains as an enormous geometry data, but this it transfers to the RSX which is the GPU with the Redwood of the high-speed communication interface. Doing tessellation and shading processing inside the RSX, real it finishes in the picture. As for the PS3 because bus width to be wide ability of the Cell is high, differs from the CPU and method of using the GPU the PC rather, it is Cell side to, it is the case that it can be popular.
And can you translate the translation?

What they're saying is that the Cell can work on stuff like physics, collisions, geometry, etc. and export the data to RSX. They're also saying that, due to the bandwidth offered between Cell and RSX, this sort of operation is encouraged rather than discouraged.
 