It may sound silly, but it's possible (regarding the RSX).

ERP said:
Devs have at least some info.


But what kind of information do developers have if they've only received an SDK with a 2.4GHz Cell plus a GeForce 7800GTX (available since July), with only 2GB/sec of bandwidth between them?
 
Heinrich4 said:
But what kind of information do developers have if they've only received an SDK with a 2.4GHz Cell plus a GeForce 7800GTX (available since July), with only 2GB/sec of bandwidth between them?

Well, the "information" they have is that the final thing will run at 3.2GHz and have a lot more bandwidth. So they can work around that idea.
 
I still think it would be interesting to test a "simulated RSX" with a 7800GTX overclocked on the core to around 550 and downclocked on the memory to 350 - to see what happens to HDR performance in say Far Cry (without AA) and FEAR at 1024x768 and 1280x960.
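Roughly, the arithmetic behind those clocks, assuming the rumoured 128-bit/700MHz GDDR3 setup on RSX against the GTX's usual 256-bit bus (a back-of-envelope sketch, not confirmed specs):

```cpp
// Back-of-envelope bandwidth arithmetic behind the "core up to ~550, memory
// down to 350" idea. Assumes RSX uses a 128-bit GDDR3 bus at 700MHz and the
// 7800GTX its usual 256-bit bus at 600MHz -- the RSX figure is rumoured,
// not confirmed.
#include <cstdio>

// GDDR3 is double data rate, so effective transfer rate is 2x the clock.
double bandwidthGBs(double memClockMHz, int busWidthBits)
{
    double transfersPerSec = memClockMHz * 2.0 * 1e6;
    return transfersPerSec * (busWidthBits / 8.0) / 1e9;
}

int main()
{
    printf("RSX (assumed) 128-bit @ 700MHz : %.1f GB/s\n", bandwidthGBs(700.0, 128)); // ~22.4
    printf("7800GTX stock 256-bit @ 600MHz : %.1f GB/s\n", bandwidthGBs(600.0, 256)); // ~38.4
    printf("7800GTX       256-bit @ 350MHz : %.1f GB/s\n", bandwidthGBs(350.0, 256)); // ~22.4
    // A GTX with its memory at 350MHz lands on roughly the same 22.4GB/s as
    // the rumoured RSX setup, which is the whole point of the experiment.
    return 0;
}
```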

Someone round here must have a 7800GTX they'd like to play with for educational purposes...

Jawed
 
london-boy said:
Well, the "information" they have is that the final thing will run at 3.2GHz and have a lot more bandwidth. So they can work around that idea.

That sounds good, and it seems certain, but can developers really carry out an accurate "simulation" of the final PS3 hardware in an SDK without the "real" interaction between Cell and the GPU (which is supposed to be PS3's strongest point)?

(Unless the GPU really is based more on Cell technology than on the G70... or has something Cell-like in it.)
 
Jawed said:
I still think it would be interesting to test a "simulated RSX" with a 7800GTX overclocked on the core to around 550 and downclocked on the memory to 350 - to see what happens to HDR performance in say Far Cry (without AA) and FEAR at 1024x768 and 1280x960.

Someone round here must have a 7800GTX they'd like to play with for educational purposes...

Jawed

If someone could do this then how would the difference in bandwidth be factored into the outcome?
Also, what about framerates? Consoles never exceed 60, as far as I know and hear. So let's say the configured card runs the games at around 80FPS; it's throwing an extra 20FPS in there. If you program for a locked 60, then how could the extra 20FPS of resources be used? Would they be a factor, or would that extra power be minimal?
 
Oh, and changing the memory to XDR might be cheaper than doubling the bus. I wish they had put XDR in there with a 256-bit bus: at XDR's 3.2GHz data rate that would be around 100GB/s, which sounds very nice.
 
Synergy34 said:
If someone could do this then how would the difference in bandwidth be factored into the outcome?
With a bit of luck, in comparison with other bandwidth capabilities (e.g. default 7800GTX bandwidth) - we can see what aspects of what games are affected. It's a pretty simple experiment to try.

If game performance at the resolutions I suggested is unaffected, even with AA or HDR, then that's a good start. It means there are probably no PC games around that are next-gen enough to indicate how well RSX will cope with its limited bandwidth.

Also, what about framerates? Consoles never exceed 60, as far as I know and hear. So let's say the configured card runs the games at around 80FPS; it's throwing an extra 20FPS in there. If you program for a locked 60, then how could the extra 20FPS of resources be used? Would they be a factor, or would that extra power be minimal?
Let's cross that bridge if we get to it.

First the experiment needs running.

Jawed
 
You won't get effective raytracing hardware. It doesn't exist. The experimental and niche raytracing hardware out there doesn't manage full-scene raytracing with all the trimmings at 30+ fps in realtime.
 
Jawed said:
With a bit of luck, in comparison with other bandwidth capabilities (e.g. default 7800GTX bandwidth) - we can see what aspects of what games are affected. It's a pretty simple experiment to try.
In this comparison by Dave, he underclocked the memory to 350MHz and overclocked the 7800GTX core to 500MHz to simulate the PS3, then compared with a stock GTX at 430/600.

FarCry with HDR, and the lowered memory speed, saw a 10% drop in framerate at 1024x768, and a 40% drop at 1280x1024. On the stock GTX the drops were 5% and 19.6% respectively.

Overall, with HDR, at 1024x768 the peak framerate was 6.5% lower than the stock GTX; at 1280x1024 it was 26% lower than the stock GTX.


Splinter Cell 3 with HDR and lowered memory speed, saw a 21.3% drop in framerate at 1024x768, and a 26.5% drop at 1280x1024. On the stock GTX the drops were 18.4% and 23% respectively.

Overall, with HDR, at 1024x768 the peak framerate was 21% lower than the stock GTX; at 1280x1024 it was 20% lower than the stock GTX.
 
scooby_dooby said:
In this comparison by Dave, he underclocked the memory to 350MHz and overclocked the 7800GTX core to 500MHz to simulate the PS3, then compared with a stock GTX at 430/600.

FarCry with HDR, and the lowered memory speed, saw a 10% drop in framerate at 1024x768, and a 40% drop at 1280x1024. On the stock GTX the drops were 5% and 19.6% respectively.

With HDR at 1024x768 the peak framerate was 6.5% lower than the stock GTX; at 1280x1024 it was 26% lower than the stock GTX.


Splinter Cell 3 with HDR and lowered memory speed, saw a 21.3% drop in framerate at 1024x768, and a 26.5% drop at 1280x1024. On the stock GTX the drops were 18.4% and 23% respectively.

With HDR at 1024x768 the peak framerate was 21% lower than the stock GTX; at 1280x1024 it was 20% lower than the stock GTX.


But when you factor in that the game code RSX will be running has been coded exclusively for it and tweaked to perfection, that changes a lot of things.
 
This is just a test to show the effect of halving the memory bandwidth on the 7800 GPU. The specific framerates are meaningless; the interesting part is the performance penalties associated with HDR and AA.

For example, we can see that in FarCry at 1024x768 it can implement 4xAA with virtually no penalty; however, at 1280x1024 it takes a 27% hit from 4xAA. In comparison, the stock GTX manages 4xAA at 1280x1024 with only a 5% hit. So the decreased bandwidth is really coming into play at the higher resolution.

On the other hand, Splinter Cell 3 at 1024x768 takes a huge hit for 4xAA (27%), especially compared to the stock 7800 (a 5% hit), so the decreased bandwidth is clearly having a big effect on Splinter Cell even at 1024x768, which is lower than 720p resolution. At 1280x1024 it evens out a bit, but it still takes a significantly larger hit for 4xAA (33%) vs the stock GTX (24%).
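For anyone wondering where those hit percentages come from, they're just the relative framerate drop between the plain run and the 4xAA run. A trivial sketch (the framerates here are hypothetical placeholders, not the actual numbers from Dave's comparison):

```cpp
// How a "% hit from 4xAA" figure falls out of two benchmark runs: it's just
// the relative framerate drop. The framerates below are hypothetical
// placeholders, not the actual numbers from Dave's comparison.
#include <cstdio>

double percentHit(double fpsBaseline, double fpsWithFeature)
{
    return (1.0 - fpsWithFeature / fpsBaseline) * 100.0;
}

int main()
{
    double fpsNoAA = 60.0; // hypothetical framerate without AA
    double fps4xAA = 44.0; // hypothetical framerate with 4xAA enabled
    printf("4xAA hit: %.1f%%\n", percentHit(fpsNoAA, fps4xAA)); // ~26.7%
    return 0;
}
```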
 
scooby_dooby said:
In this comparison by Dave, he underclocked the memory to 350MHz and overclocked the 7800GTX core to 500MHz to simulate the PS3.
You can't just underclock the 7800GTX's memory to 350 and pretend it will give you a good measure of what PS3's performance is going to be; even Dave acknowledges that in that thread.

PS3 is going to have a different GPU (and it's not like we know a lot about it at the moment) that will be working with a CPU that is quite different from current PC CPUs; it's going to have an additional memory pool with its own bandwidth, and it will be running games that are specifically written for the system.
 
scooby_dooby said:
This is just a test to show the effect of halving the memory bandwidth on the 7800 GPU. The specific framerates are meaningless; the interesting part is the performance penalties associated with HDR and AA.

For example, we can see that in FarCry at 1024x768 it can implement 4xAA with virtually no penalty; however, at 1280x1024 it takes a 27% hit from 4xAA. In comparison, the stock GTX manages 4xAA at 1280x1024 with only a 5% hit. So the decreased bandwidth is really coming into play at the higher resolution.

On the other hand, Splinter Cell 3 at 1024x768 takes a huge hit for 4xAA (27%), especially compared to the stock 7800 (a 5% hit), so the decreased bandwidth is clearly having a big effect on Splinter Cell even at 1024x768, which is lower than 720p resolution. At 1280x1024 it evens out a bit, but it still takes a significantly larger hit for 4xAA (33%) vs the stock GTX (24%).

But again, RSX will have games specifically coded for it that will remove most of the bottlenecks, and it most definitely won't be CPU-bound.
 
Jacob said:
You can't just underclock the 7800GTX's memory to 350 and pretend it will give you a good measure of what PS3's performance is going to be; even Dave acknowledges that in that thread.
No, but I think it's fair to see what RSX might be like if it's not very different from an overclocked G70 on a lower speed memory bus. And that's what Jawed et al were addressing - what's the effect of a substantial memory BW drop on the G70 architecture. They weren't making any guesses at the PS3's total system performance.
 
I would prefer to "imagine" this GPU (assuming RSX really is a "full G70") as a GPU with 22.4GB/sec of GDDR3 plus 15-20GB/sec over FlexIO to the Cell CPU (and its 1.75MB of SPE local memory) for access to the XDR RAM, rather than just a G70 at 550/350.

(And that's without counting possible fixed functions, some other extras, and the absence of CPU bottlenecks.)
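Summing those up very roughly (using only the headline figures quoted above; how much of the FlexIO link is usable in practice is anyone's guess):

```cpp
// Rough sum of the bandwidths mentioned above: local GDDR3 plus the FlexIO
// link to Cell/XDR. These are the headline figures from the post; real-world
// usable bandwidth would be lower.
#include <cstdio>

int main()
{
    double gddr3 = 22.4;      // GB/s, RSX local GDDR3 (rumoured)
    double flexioLow = 15.0;  // GB/s, FlexIO in one direction
    double flexioHigh = 20.0; // GB/s, FlexIO in the other direction
    printf("Combined (low)  : %.1f GB/s\n", gddr3 + flexioLow);  // ~37.4
    printf("Combined (high) : %.1f GB/s\n", gddr3 + flexioHigh); // ~42.4
    return 0;
}
```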
 
That's all irrelevant. These are benchmarks from current-generation games; any reference to the PS3 and its extra features needs to be put in context with the fact that next-gen games will be much, much more graphically demanding than Splinter Cell 3 or FarCry!

And this isn't a reference to the PS3; it's an approximation of the PS3 GPU and its benchmarks, showing the performance penalties associated with the decreased bandwidth and with effects like 4xAA and HDR lighting.
 
I would counter that by saying that a big factor is overdraw and the number of polys.

It seems reasonable to expect that Cell can be used to perform a lot of early culling, for example, and therefore reduce overdraw, which also impacts the number of polys rendered by RSX.

What's more interesting to see is that the AA and HDR hits at 1280p are in the region of 10%, give or take. With a bit of luck they can stay in that region and frame rates can hold up at around 60fps.

It's really just a question of how much vertex pre-processing, adaptive tessellation, sorting etc. Cell can assist with. There must be diminishing returns here, and they would vary depending on the graphics engine and game style, I expect.

For example, an SPE's 256K of LS is enough to hold the high-res top level of a hierarchical-Z buffer (230400 bytes), corresponding to one Z value per quad of back-buffer pixels. So perhaps you could build a 2-SPE culling engine on Cell: one SPE holding the high-res Z and performing all queries and updates, with the other SPE holding the remaining levels of the Z hierarchy and performing occlusion tests, etc. Maybe that's just pie in the sky :smile:
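Something like this, very roughly, written as plain C++ rather than SPE code; the quad layout is inferred from the 230400-byte figure, and the depth precision and API shape are made up purely for illustration:

```cpp
// Rough sketch of the coarse-Z occlusion query described above. The layout
// (one conservative Z value per 2x2 quad of a 1280x720 back buffer, i.e.
// 640 x 360 = 230400 one-byte entries) is inferred from the 230400-byte
// figure in the post; depth precision and the interface are illustrative.
#include <algorithm>
#include <cstdint>
#include <vector>

const int QUADS_X = 1280 / 2; // 640 quads across
const int QUADS_Y = 720 / 2;  // 360 quads down

// Each entry holds the FARTHEST depth seen in its 2x2 pixel quad
// (0 = near plane, 255 = far plane), so the query below can never
// wrongly cull something that might still be visible.
std::vector<uint8_t> coarseZ(QUADS_X * QUADS_Y, 255);

// Fold a quad's four full-resolution depths down into its coarse entry.
void updateQuad(int qx, int qy, const uint8_t z[4])
{
    coarseZ[qy * QUADS_X + qx] = std::max({z[0], z[1], z[2], z[3]});
}

// Conservative occlusion query: true only if the object's nearest depth is
// behind the farthest stored depth in every quad its screen-space bounding
// rectangle touches -- i.e. it is definitely hidden and Cell could cull it
// before it is ever sent to RSX.
bool isOccluded(int qx0, int qy0, int qx1, int qy1, uint8_t objectNearestZ)
{
    for (int qy = qy0; qy <= qy1; ++qy)
        for (int qx = qx0; qx <= qx1; ++qx)
            if (objectNearestZ <= coarseZ[qy * QUADS_X + qx])
                return false; // possibly visible in this quad, don't cull
    return true;
}

int main()
{
    // Mark one quad as covered by a nearby occluder at depth 10.
    const uint8_t occluderDepths[4] = {10, 10, 10, 10};
    updateQuad(5, 5, occluderDepths);
    // An object whose nearest point is at depth 200, covering only that
    // quad, is definitely behind the occluder and can be skipped.
    bool culled = isOccluded(5, 5, 5, 5, 200);
    return culled ? 0 : 1;
}
```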

Jawed
 