More info about RSX from NVIDIA

Chalnoth said:
Jawed said:
I'm sorry, you lost it there. I specifically excluded AA and AF in order to avoid the possibility of muddying the waters with changes in texturing efficiency or memory utilisation efficiency (AA) or, indeed, due to swapping across the PCI Express bus due to the frame buffer's size causing textures to swap out.
Heh, then the games are all CPU-limited, and your argument fails again.

Hahahahaha, you get better. Check the benchmarks and see how SLI 7800GTX is faster in, for example, SC:CT. 54.49 versus 100.9. CPU-limited my arse.

Jawed
 
Shifty Geezer said:
Any game will be limited based on its own requirements and nothing else (in a balanced system anyhows). A game of loads of tiny, simple objects physically interacting will be CPU bound. A game of simple gameplay with fantastic graphics will be GPU bound. All depends what the dev's trying to do.
Nah, BS. Any game that tries to push the limits of the platform will be no more CPU-limited than GPU-limited. In the end, the only games that are going to be one or the other are going to be cross-platform games (games that don't try to push the limits won't be limited by either the CPU or GPU).
 
Jawed said:
Hahahahaha, you get better. Check the benchmarks and see how SLI 7800GTX is faster in, for example, SC:CT. 54.49 versus 100.9. CPU-limited my arse.
See if you can find three other games that are also GPU-limited without FSAA/Anisotropic filtering (or HDR, for that matter), and also take a look at the fillrate graphs: fillrate performance is still increasing with resolution for each card, indicating that for all cards involved there is still some degree of CPU limitation going on (likely the game uses a multipass rendering algorithm, and one or more of the passes turns out to be heavily CPU-limited, for whatever reason).

But, it won't matter anyway, since you still can't take out other factors that may have changed between the cores. You could downclock the 7800 so that its fillrate/memory bandwidth matched the 6800 Ultra, but then you'd be down on ROPs, and you can't be sure that the games you're running are going to be 100% shader limited anyway.

In the end, you just can't use games to attempt to compare specific features of a core against another. There are far too many factors involved. This is where synthetic benchmarks are important: they actively seek to isolate specific features.
 
pjbliverpool, pixel shaders are 4D, so you can't co-issue a vec4 + scalar/sfu. It's either vec4 or vec3 + scalar/sfu or vec2 + vec2, for a total of 16 programmable flops per pipeline in the pixel shader array.
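For what it's worth, here's a quick back-of-the-envelope sketch (my own illustration, not from the thread) of what that 16 flops/pipe figure implies, assuming the commonly quoted 24 pixel pipes and 550 MHz clock, and counting a MAD as two flops across two 4-wide ALUs:

Code:
// Rough peak-rate arithmetic for the pixel shader array.
// All figures are assumptions taken from the discussion, not official numbers.
#include <cstdio>

int main() {
    const int    pipes        = 24;    // assumed pixel pipelines
    const int    flopsPerPipe = 16;    // 2 ALUs x 4 components x (mul + add)
    const double clockGHz     = 0.55;  // assumed 550 MHz clock

    std::printf("Peak programmable pixel-shader rate: %.1f Gflops\n",
                pipes * flopsPerPipe * clockGHz);   // 24 * 16 * 0.55 = 211.2
    return 0;
}

That only counts the two main ALUs issuing vec4, vec3 + scalar or vec2 + vec2; whether the mini ALUs or a separate scalar unit add anything on top is exactly what gets argued later in the thread.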
 
Chalnoth: I can assure you there are plenty of current generation console games that are CPU limited one way or another, even AAA titles (emh.. the Jak series.. even if it pushes spectacular graphics, it's CPU limited most of the time).
CELL and XBOX360's CPU are probably blazing fast processors, but it's way more difficult to optimize your code for a CPU than for a GPU.
A GPU's workload is way more manageable; CPUs pose multiple problems on multiple levels. Moreover, in a standard project there are many more programmers working on CPU code than GPU code, and it's very difficult to make sure everyone is writing 'good' code.
I'm not saying next gen games will be all CPU limited, I'm just saying we'll see many CPU limited games. Maybe things will get better with second or third generation games.
 
Rockster said:
It would seem easy, especially early on, to get PPE bound on CELL fairly quickly.
The turmoil that Sony is inflicting on developers' personal lives is laughable. How are you going to explain to a good-looking woman that you are having difficulties with your PPE? Conversely, if you tell her that you are bound by it, she may be curious enough to have a look. :LOL:

Now back to your regularly scheduled program
 
Chalnoth said:
Shifty Geezer said:
Any game will be limited based on its own requirements and nothing else
Nah, BS.
BS?! How rude!

Any game that tries to push the limits of the platform will be no more CPU-limited than GPU-limited. In the end, the only games that are going to be one or the other are going to be cross-platform games (games that don't try to push the limits won't be limited by either the CPU or GPU).
Oh yes? Here's an idea for a game. It's a puzzle game using simple cubes of varying properties and you have to align them in various ways. What makes it different is the cubes are suspended in a 3 dimensional volumetric space with fully realised pressure changes throughout the entire volume. It's through manipulation of densities and pressure changes that you align the blocks.

Tell me how that game would even get close to the limits of a GPU before finding that the number of objects and the size of the volume are limited by the processing capabilities of the CPU.
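To make the point concrete, here's a tiny sketch (purely illustrative, not from any actual game) of the kind of per-frame CPU work such a volumetric puzzle implies: one naive pressure-relaxation step over an N×N×N grid. The grid size and relaxation constant are arbitrary placeholders.

Code:
// One naive pressure-diffusion step over an N^3 volume (CPU side).
// Illustrative only: grid size and constant k are made-up placeholders.
#include <vector>

void diffuse_pressure(const std::vector<float>& p, std::vector<float>& next,
                      int N, float k) {
    auto at = [N](int x, int y, int z) { return (z * N + y) * N + x; };
    for (int z = 1; z < N - 1; ++z)
        for (int y = 1; y < N - 1; ++y)
            for (int x = 1; x < N - 1; ++x) {
                // Relax each cell towards the average of its six face neighbours.
                float nb = p[at(x-1,y,z)] + p[at(x+1,y,z)] +
                           p[at(x,y-1,z)] + p[at(x,y+1,z)] +
                           p[at(x,y,z-1)] + p[at(x,y,z+1)];
                next[at(x,y,z)] = p[at(x,y,z)] + k * (nb / 6.0f - p[at(x,y,z)]);
            }
}

Even a modest 128^3 grid means touching roughly two million cells every simulation step, while the GPU is only being asked to draw a handful of simple cubes.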
 
nAo said:
Chalnoth: I can assure you there are plenty of current generation console games that are CPU limited one way or another, even AAA titles (emh.. the Jak series.. even if it pushes spectacular graphics, it's CPU limited most of the time).
How can you tell?
 
Chalnoth said:
nAo said:
Chalnoth: I can assure you there are plenty of current generation console games that are CPU limited one way or another, even AAA titles (emh.. the Jak series.. even if it pushes spectacular graphics, it's CPU limited most of the time).
How can you tell?

Performance Analyser. I think nAo has a good idea of the anatomy and the workings of Jak games. ;)
 
Chalnoth said:
That doesn't even sound like it'd be CPU-limited to me.
You don't know a lot about volumetrics then! ;)
In any case, how likely is it to be GPU bound before being CPU bound, if you're only drawing simple cubes but calculating very complex interactions?
 
Shifty Geezer said:
You don't know a lot about volumetrics then! ;)
In any case, how likely is it to be GPU bound before being CPU bound, if you're only drawing simple cubes but calculating very complex interactions?
Well, the way you described it, it didn't sound like it would require very complex fluid dynamics calculations.

And even then, if the developer was serious, he might consider offloading some of the calculations onto the GPU.
 
Rockster said:
pjbliverpool, pixel shaders are 4D, so you can't co-issue a vec4 + scalar/sfu. It's either vec4 or vec3 + scalar/sfu or vec2 + vec2, for a total of 16 programmable flops per pipeline in the pixel shader array.

Can you not get extra scalars from the two mini ALUs?
 
pjbliverpool said:
Can you not get extra scalars from the two mini ALUs?
You can't on NV40, dunno about G70. Nvidia's numbers seem to imply there is an additional scalar op per pixel pipe.
To be fair, NV40 pixel pipes also have an additional reciprocal unit that can perform a reciprocal per clock without stalling any other units (even the 16-bit nrm), but it's not mentioned anymore for G70, AFAIK.
 
NV g70 Slides said:
frame buffer:
fp32 - SSAA only [ed: no MSAA, no blending]

seems like the fp32 framebuffer is meant essentially for feedback - to the CPU, or to itself as a texture. I like that 8)
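As a rough illustration of that feedback path (my own sketch, not from the slides), this is what rendering into an fp32 texture looks like through EXT_framebuffer_object; the result can then be bound as a texture for a later pass or read back to the CPU. Assumes GLEW and a live GL context; error checking omitted.

Code:
// Create an fp32 colour target (ARB_texture_float) attached to an FBO.
// Sketch only: no MSAA, no blending, no error checking.
#include <GL/glew.h>

GLuint make_fp32_target(int w, int h, GLuint* fboOut) {
    GLuint tex, fbo;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, w, h, 0, GL_RGBA, GL_FLOAT, 0);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    glGenFramebuffersEXT(1, &fbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                              GL_TEXTURE_2D, tex, 0);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);

    *fboOut = fbo;
    return tex;  // sample this in a later pass, or glReadPixels it back to the CPU
}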
 
Nvidia's numbers seem to imply there is an additional scalar op per pixel pipe.
NVidia's don't; that original HardSpell chart did. Nvidia's official pixel shader doc:
[attached image: 1119063771Y3O0GyEDBw_3_3_l.jpg]
 
Rockster said:
Nvidia's numbers seem to imply there is an additional scalar op per pixel pipe.
NVidia's don't; that original HardSpell chart did. Nvidia's official pixel shader doc:
Emh.. and what about this?! Where does the additional scalar op come from? (Note this slide seems to come from Nvidia too..)
[attached image: 1119063771Y3O0GyEDBw_3_6_l.jpg]
 
Chalnoth said:
Jawed said:
Hahahahaha, you get better. Check the benchmarks and see how SLI 7800GTX is faster in, for example, SC:CT. 54.49 versus 100.9. CPU-limited my arse.
See if you can find three other games that are also GPU-limited without FSAA/Anisotropic filtering (or HDR, for that matter)...

Sorry, the onus of proof is on you. I've given you an example of a game where the increased capability of the G70 pipeline makes no difference. Now explain why it and others like it are showing no improvement beyond clock/pipeline-count.

What's the use of fancy technology when games don't get faster because of it? We don't play synthetic benchmarks.

Jawed
 
Jawed said:
What's the use of fancy technology when games don't get faster because of it? We don't play synthetic benchmarks.

While I would not defend every "fancy technology" as delivering better games... it would seem that if it does accelerate performance in a benchmark, it theoretically could help *future* games designed around the added performance in certain areas.

Obviously games are designed to the strengths of the target HW. So I guess the real question is: Is this extra performance something that can/will be utilized?

My observation is that GPU IHVs have gotten better at recognizing bottlenecks in the system and introducing new features and performance increases in the areas that matter the most. There are always exceptions (a recent example: FP16 seems very underpowered on NV40), but it does seem IHVs are more realistic and wise about where they invest their transistor budget.

So, is this an area where games--especially on a closed platform like a console--could benefit from this extra shading performance, or are the other bottlenecks (memory, pipelines, fillrate) significant enough that this bump in performance won't be seen in most apps?
 