8 ROPs on the RSX. Bad engineering decision?

Mobius1aic

Assuming all the info we've been spoon-fed is correct, the RSX GPU in the PlayStation 3 is outfitted with 8 render output processors. Since it's based on the G70 core found in the GeForce 7800 line of GPUs, I can't help but notice that the 7800 G70s are equipped with 16 ROPs. Was this a reasonable engineering decision aimed at lower power consumption and/or manufacturing overhead? Also, are the 8 ROPs a bottleneck for z-buffering games at full HD resolutions, or is the bottleneck more in other parts of the rendering pipeline, such as the limited memory bandwidth and limited VRAM, which may be the reason for halving the ROP count in the first place? I find this issue kind of perplexing, especially given Sony's earlier promises of full 1080p games, and given that RSX is essentially a repackaged G70. Any insight into this would be appreciated.
 
The decision process was fairly simple, I suppose; G71: 256-bit bus -> 16 ROPs, RSX: 128-bit bus -> 8 ROPs (with not-so-incredibly-fast GDDR3). In both cases the ROPs could be the bottleneck rather than bandwidth; it's always possible to construct corner cases, but I don't think the design is unbalanced, quite the contrary. In Xenos' case, I guess the decision to have only 8 ROPs was related to the cost and complexity of a larger external bus between the main chip and the eDRAM.
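
As a rough sanity check on that balance, here's a back-of-the-envelope sketch in Python. It assumes the 500 MHz core / 650 MHz GDDR3 figures the thread settles on below, and an uncompressed 32-bit colour + 32-bit Z write per pixel, so treat it as an illustration rather than a definitive analysis:

```python
# Rough check: can 8 ROPs saturate a 128-bit GDDR3 bus?
# Assumed figures (the thread's 500/650 spec, not official documentation).
rops            = 8
core_clock_hz   = 500e6        # assumed RSX core clock
bytes_per_pixel = 4 + 4        # 32-bit colour + 32-bit Z, no blending or AA

# Peak ROP write traffic if every ROP retires one pixel per clock
rop_traffic_gbs = rops * core_clock_hz * bytes_per_pixel / 1e9

# Peak GDDR3 bandwidth: 128-bit bus, DDR so two transfers per memory clock
vram_bw_gbs = (128 / 8) * 650e6 * 2 / 1e9

print(f"peak ROP write traffic: {rop_traffic_gbs:.1f} GB/s")  # ~32.0 GB/s
print(f"peak GDDR3 bandwidth:   {vram_bw_gbs:.1f} GB/s")      # ~20.8 GB/s
```

Even with only 8 ROPs the raw write traffic already exceeds what the 128-bit bus can absorb (before colour/Z compression), so doubling the ROP count would mostly add idle units.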
 
8 ROPs on RSX is not a bottleneck on the PS3. G70 uses a 256-bit GDDR3 memory interface, so it has 16 ROPs in action, whereas RSX is strictly on a 128-bit GDDR3 interface. The 8-ROP design fits that demand.

If you want to point out weak areas of the PS3 design by comparing it with the PC architecture, it won't really work. The PS3 doesn't need more bandwidth like a 256-bit GDDR3 interface, because its architecture is combined with FlexIO from Rambus. A PC, on the other hand, has the PCIe bottleneck and needs more GDDR3 bandwidth.

As for 1080p output on the PS3, I think RSX can handle it quite easily, as you can see in Ridge Racer 7, Ninja Gaiden Sigma and all the other 1080p-supported games. It depends on whether developers want to show their work at 720p or 1080p, not on the PS3 and its architecture.

By the way, many 8-ROP cards on the PC can support 1080p output or higher too. However, if RSX had 16 ROPs like G70 or G71, it would seem to outclass the Xenos GPU with twice the fillrate. That would have been good for PS3 users.
 
Well sure, most midrange cards have 8 ROPs, like the 8600s, and they handle 1080p video very well, same with the 8400s and 8500s. However, these cards are tailored more for video at that resolution, not 3D-graphics z-buffering, which I'd assume is a lot more intensive on the ROPs considering they have to calculate the depths, etc. The maximum viable rendering resolution I'd go after with a newer game on an 8600 with 256 or 512 MB would probably be 1440 x 900. Anything higher would be killing the 8600, I'd assume, unless it's an older title.
 
I see. However, the 8600 was targeted at home video and media entertainment in the HD era. Its 3D performance is not as strong as the 7800/7900GTX models, as we've seen.

RSX is based on a modified 7800/7900GTX, I think, in some areas of 3D game applications. The 8600 can't beat RSX at all.

As far as I know from the "RSX Secret" quote, there are some advantages of RSX over the 7800GTX.

About your point on a Z-buffer bottleneck on RSX via its 8 ROPs: in my opinion the Z-buffer of the GeForce family has been adjustable between 16-bit, 24-bit and 32-bit Z since the Riva TNT era.

If you remember the old-school GeForce family, the number of ROPs and the Z-buffer format go hand in hand. I think the whole GeForce 7 series (RSX included) and above shouldn't have that kind of bottleneck at all.

Sorry if I've misunderstood some of your points about the Z-buffer.
 
I think the 8600s are underestimated for their 3D capabilities. In many cases they've proven to be quite a nice midrange card. However, the most bang for the buck was the initial 8800GTS (I have a 320 :D), then the 8800GT, and now the 9600.

And one more thing to mention: my laptop's GPU (G72M) only had 2 ROPs (3:4:4:2 configuration), but it handled 1280 x 800 quite decently with older games that didn't call for as much z-buffer capability. Given that I'm talking about games such as Far Cry, Half Life 2 and Call of Duty 2 in DX7 mode, that seems kind of impressive. I also had it overclocked upwards of 550 MHz to boost performance a bit.

However, I still can't help feeling that 8 ROPs wasn't enough for the RSX. Considering games like Call of Duty 4 resort to below-720p rendering, down to 600p (with 2xAA though), I can't help but think there is some inherent bottleneck in the z-buffer phase if they're resorting to 600p. But hell, I think the RSX is an underwhelming GPU to be using in the first place. Maybe if Sony had gotten in bed with Nvidia a bit better, they could've had a GeForce 8xxx-series-esque graphics core instead (hmmmm....... 64 stream processors, 16-20 TMUs, 8-16 ROPs?). Then the PS3 could certainly be whipping the 360 on the GPU side, as well as having the Cell BE to be reckoned with working in unison on graphics. Could've been the perfect storm, though it would have cost Sony a LOT more to produce.
 
I think the 8600s are underestimated for their 3D capabilities.
In many cases they've proven to be quite a nice midrange card.
However, the most bang for the buck was the initial 8800GTS
(I have a 320), then the 8800GT, and now the 9600.

I think maybe Nvidia and AMD (ATi) positioned their mid-range cards,
like the 8600 and Radeon HD 2400 series, as a choice for video editing
and heavy media applications rather than 3D games.


And one more thing to mention: my laptop's GPU (G72M) only
had 2 ROPs (3:4:4:2 configuration), but it handled 1280 x 800
quite decently with older games that didn't call for as much
z-buffer capability. Given that I'm talking about games such as
Far Cry, Half Life 2 and Call of Duty 2 in DX7 mode, that seems
kind of impressive. I also had it overclocked upwards of 550 MHz
to boost performance a bit.

550 MHz x 2 ROPs can produce around 1.1 Gpixels/s (max),
whereas 550 MHz x 8 ROPs would be 4.4 Gpixels/s (max).
So RSX has at least 4 times the ROP throughput of
your G72M at 550 MHz.

If you say an overclocked 550 MHz G72M can "boost performance a bit",
then a 550 MHz RSX might be 4 times that "boost" over its
500 MHz clock (per the rumours).
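
As an aside, here's the same arithmetic in code form. The 2-ROP and 8-ROP counts and the 550 MHz figure come from the posts above, while 500 MHz is the RSX clock discussed further down, so these are assumptions rather than measurements:

```python
def pixel_fillrate_gpix(rops: int, clock_mhz: float) -> float:
    """Theoretical peak pixel fillrate: one pixel per ROP per clock."""
    return rops * clock_mhz * 1e6 / 1e9

print(pixel_fillrate_gpix(2, 550))   # ~1.1 Gpixels/s, overclocked G72M
print(pixel_fillrate_gpix(8, 500))   # ~4.0 Gpixels/s, RSX at 500 MHz
print(pixel_fillrate_gpix(8, 550))   # ~4.4 Gpixels/s, RSX if it were 550 MHz
```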

However, I still can't help feeling that 8 ROPs wasn't enough for
the RSX. Considering games like Call of Duty 4 resort to below-720p
rendering, down to 600p (with 2xAA though), I can't help but
think there is some inherent bottleneck in the z-buffer phase
if they're resorting to 600p. But hell, I think the RSX is an
underwhelming GPU to be using in the first place. Maybe if
Sony had gotten in bed with Nvidia a bit better, they could've had
a GeForce 8xxx-series-esque graphics core instead
(hmmmm....... 64 stream processors, 16-20 TMUs, 8-16 ROPs?).

600p may be a limit of the Xenos GPU's 10 MB eDRAM, not of RSX, as this chart suggests:

http://www.watch.impress.co.jp/game/docs/20060426/3dhd09.jpg

CoD4 is a multiplatform game, so if they went native 720p (1280x720) on both versions, Xbox 360 users would end up with some kind of downgraded game on their system, and the PS3 version would cost more than the Xbox 360 version.
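
A rough size calculation shows why the eDRAM matters here. It assumes 4 bytes of colour and 4 bytes of depth/stencil per sample, and the commonly reported 1024x600 figure for CoD4, so it's an estimate rather than anything official:

```python
def framebuffer_mb(width: int, height: int, msaa: int = 1,
                   bytes_per_sample: int = 4 + 4) -> float:
    """Approximate colour + depth/stencil footprint in MiB."""
    return width * height * msaa * bytes_per_sample / (1024 ** 2)

print(framebuffer_mb(1280, 720))          # ~7.0 MiB  -> fits in 10 MB eDRAM
print(framebuffer_mb(1280, 720, msaa=2))  # ~14.1 MiB -> overflows, needs tiling
print(framebuffer_mb(1024, 600, msaa=2))  # ~9.4 MiB  -> just about fits
```

1280x720 with 2xAA overflows the 10 MB and would force tiling, while 1024x600 with 2xAA just squeezes in, which is the usual explanation for that resolution choice on the 360 side.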


Then the PS3 could certainly be whipping the 360 on the GPU
side, as well as having the Cell BE to be reckoned with working
in unison on graphics. Could've been the perfect storm, though
it would have cost Sony a LOT more to produce.

I'd prefer a full-option 8800GTX instead, friend. :eek:
 
I see. However, the 8600 was targeted at home video and media entertainment in the HD era. Its 3D performance is not as strong as the 7800/7900GTX models, as we've seen.

The 8600GTS should give a 7900GT a run for its money, and the 7900GT is faster than a 7800GTX. I think it was the 8600GT that was noticeably slower.

As far as I know from the "RSX Secret" quote, there are some advantages of RSX over the 7800GTX.

It seems most of them were about integrating RSX more tightly with the Cell.
 
7800GTX is 430/650 MHz and RSX is 550/700 MHz. That's one way to say RSX is the clear winner.

As many friends of mine use 8600GT cards, I've seen that 8600 cards perform lower than the 7800GTX/7900GTX in benchmarks and many games. However, it was the best-value choice for HD movie playback or video editing applications, like the Radeon HD 2400.
 
7800GTX is 430/650 MHz and RSX is 550/700 MHz. That's one way to say RSX is the clear winner.

Except it's RSX 500/650. Unless you can provide a document pertaining to the released hardware that shows it's 550/700, you have no argument that stands up to the lots and lots of input and evidence that RSX received a downclock. If you want to believe your PS3 is running at that spec, fine, but on this forum it's nonsense to argue console GPUs if you're not going to adhere to the accepted specifications.
 
7800GTX is 430/650 MHz and RSX is 550/700 MHz. That's one way to say RSX is the clear winner.

As many friends of mine use 8600GT cards, I've seen that 8600 cards perform lower than the 7800GTX/7900GTX in benchmarks and many games. However, it was the best-value choice for HD movie playback or video editing applications, like the Radeon HD 2400.

But don't forget the 7800GTX comes with a 256-bit bus and 16 ROPs versus RSX's 128-bit bus and 8 ROPs.

GeForce 7800 GTX: 16 ROPs; 10.3 GTexels/s fillrate; 38.4 GB/s memory bandwidth.

GeForce 7800 GTX 512 MB (Ultra): 16 ROPs; 13.2 GTexels/s fillrate; 54.4 GB/s memory bandwidth.
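
Those headline figures fall straight out of unit count times clock. A quick sketch of where they come from, taking G70's 24 texture units and the reference clocks (430/600 MHz for the GTX, 550/850 MHz for the 512 MB card) as assumptions consistent with the numbers above:

```python
def texel_fillrate_gtex(tmus: int, core_mhz: float) -> float:
    """Peak bilinear texel fillrate in GTexels/s."""
    return tmus * core_mhz * 1e6 / 1e9

def mem_bandwidth_gbs(bus_bits: int, mem_mhz: float) -> float:
    """Peak GDDR3 bandwidth in GB/s (DDR: two transfers per clock)."""
    return (bus_bits / 8) * mem_mhz * 1e6 * 2 / 1e9

# GeForce 7800 GTX: 430 MHz core, 600 MHz GDDR3, 256-bit bus
print(texel_fillrate_gtex(24, 430), mem_bandwidth_gbs(256, 600))  # ~10.3, 38.4
# GeForce 7800 GTX 512: 550 MHz core, 850 MHz GDDR3, 256-bit bus
print(texel_fillrate_gtex(24, 550), mem_bandwidth_gbs(256, 850))  # ~13.2, 54.4
```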
 
I see. However, in the PC architecture the PCIe bottleneck will trash your 7800GTX/GTX(512) performance by around 40-50%; a 16-ROP 7800GTX may take somewhere around a 40-50% performance hit.

However, the 128-bit memory interface with 8 ROPs on RSX is a good fit, with the performance tuned sensibly. Given the PS3 architecture, it's easy to say RSX will perform better than a PC 7800GTX/GTX(512) side by side.

Those 7800GTX numbers look more impressive, but in your PC, because of the bottlenecks, you only get 40-50% of the performance you paid for. I think in a closed-box environment like the PS3 we'll soon see things that 7800GTX/GTX(512) cards on a PC can't show.
 
I see. However, in the PC architecture the PCIe bottleneck will trash your 7800GTX/GTX(512) performance by around 40-50%; a 16-ROP 7800GTX may take somewhere around a 40-50% performance hit.

PCIe 16x won't bottleneck a 7800GT, and it doesn't bottleneck a G80 either (8800GTX). Now, if you have SLI where each bus runs at 8x, then it may bottleneck at very high resolutions/IQ enhancers, though mobos have been coming with 2 x 16x PCIe connections for graphics cards for a long time now.

However, the 128-bit memory interface with 8 ROPs on RSX is a good fit, with the performance tuned sensibly. Given the PS3 architecture, it's easy to say RSX will perform better than a PC 7800GTX/GTX(512) side by side.

8 ROPs were used because of the VRAM limitation (128-bit). If it were 256-bit then they most likely would have wanted 16 ROPs.

Those 7800GTX numbers look more impressive, but in your PC, because of the bottlenecks, you only get 40-50% of the performance you paid for. I think in a closed-box environment like the PS3 we'll see things that you all can't.

40-50% estimated by what/whom? There are different types of loads you can put on a GPU; different engines/game designs put different strains on the GPU.

I think in a closed-box environment like the PS3 we'll soon see things that 7800GTX/GTX(512) cards on a PC can't show.

When you say PS3 you are talking about the RSX and the Cell; the Cell can and does help out RSX, as has been stated by some devs in interviews etc. But RSX alone? I doubt it. Not against the Ultra, anyway.
It is also very clear that AA and anisotropic filtering are very costly for the console GPUs, hence why they are not used much, while they're fairly inexpensive on G70/G71 GPUs or better. :smile:
 
Let's see, friends. The PC architecture is like water falling from a waterfall into a canal, then a river, and finally the ocean, whereas the PS3 architecture is like water flowing from the head of a cylindrical pipe to its tail. Which flow rate is better? Think about it.

PCIe x16 is the bottleneck: it can only provide 4 GB/s of peak bandwidth to your PC's GPU card. Compare that with the bandwidth numbers for the 16-ROP, 256-bit 7800GTX above: how many times the difference?

With FlexIO, the PS3 subsystem can move at least 50 GB/s around the machine. On the CPU-to-GPU interface between Cell and RSX there is 35 GB/s, as you can see. How many times the difference?

That 40-50% performance hit comes from the x86 CPU, its chipset, the latency of its memory type, the PC hardware, the OS, DirectX, the GPU driver, etc. I don't think it's hard for you all to see.
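
To put the numbers being thrown around side by side, here's a small table in code. The PCIe figure is per direction (as the replies below note), the FlexIO 35 GB/s is the commonly quoted 20 GB/s out + 15 GB/s in split, and the RSX VRAM figure assumes the 650 MHz memory clock mentioned earlier, so these are peak claims rather than measurements:

```python
# Commonly quoted peak figures in GB/s; real sustained rates are lower.
links = {
    "PCIe 1.x x16 (per direction)":        4.0,
    "PCIe 1.x x16 (both directions)":      8.0,
    "FlexIO Cell<->RSX (20 out + 15 in)": 35.0,
    "RSX 128-bit GDDR3 VRAM (650 MHz)":   20.8,
    "7800 GTX 256-bit GDDR3 VRAM":        38.4,
}
for name, gbs in links.items():
    print(f"{name:38s} {gbs:5.1f} GB/s")
```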
 
PCIe x16 is the bottleneck: it can only provide 4 GB/s of peak bandwidth to your PC's GPU card. Compare that with the bandwidth numbers for the 16-ROP, 256-bit 7800GTX above: how many times the difference?

That's only a bottleneck for GPU <> system RAM and CPU connectivity. If you copy everything the GPU needs into its VRAM, you have stupid amounts of BW these days to play with. The ROPs don't need to access anything over PCIe, so that's an irrelevant limitation to the ROP performance of the PC versus RSX.
 
PCIe x16 is the bottleneck: it can only provide 4 GB/s of peak bandwidth to your PC's GPU card. Compare that with the bandwidth numbers for the 16-ROP, 256-bit 7800GTX above: how many times the difference?

PCIe 16x does 4 GB/s in each direction at the same time (8 GB/s total). Also, you don't need as much bandwidth between the GPU and the rest of the system as between the GPU and VRAM, unless you intend to use system RAM as VRAM.

With FlexIO, the PS3 subsystem can move at least 50 GB/s around the machine.

Not however it likes without penalties, my friend.

On the CPU-to-GPU interface between Cell and RSX there is 35 GB/s, as you can see. How many times the difference?

IIRC, if the CPU is going to help out the RSX it needs to access the VRAM pool, and thus more bandwidth is needed.

That 40-50% performance hit comes from the x86 CPU, its chipset, the latency of its memory type, the PC hardware, the OS, DirectX, the GPU driver, etc.

If anything it's because of the OS and API, and 40-50%? I doubt it. Even if it were so, then I find it even more fascinating how a G70/G71 can easily hold up against the console GPUs in multiplatform games.
 
ROPs are only one part of the 3D pipeline. ROPs alone cannot produce anything on your screen, right? So if you want to compare what appears on your screen, you'll need to consider all the components of each system together.
 
In the PC architecture it's not only the software side that produces bottlenecks; in fact, some of the hardware creates serious bottlenecks too. On the software side there are many ways to solve them: by re-coding some OS handling, drivers, APIs, reducing the memory footprint, etc.

On the hardware side, by contrast, nothing can be done except changing some components of your system. On the Intel platform the biggest bottleneck is its chipsets; on the AMD platform, on the other hand, the bottleneck is the CPU pipeline, which isn't wide-issue enough to parallelise like Intel's. On the GPU side, both AMD and Nvidia can produce TITANIC-like GPUs (8800GTX, HD3870 and above, etc.), but if you don't match them with a high-performance CPU and chipset it's like driving your speedboat in your swimming pool.
 
On the GPU side, both AMD and Nvidia can produce TITANIC-like GPUs (8800GTX, HD3870 and above, etc.), but if you don't match them with a high-performance CPU and chipset it's like driving your speedboat in your swimming pool.

:LOL: That's a fun analogy. OK, so where does the limitation in render resolution effectively begin? Is it bandwidth, or more of a VRAM-related problem?

And from my experience, I don't think the PCIe bus on my laptop (the one with the G72M) or my desktop was really the bottleneck like some of you suggest. The real bottleneck was its 2 x 32-bit memory interface (350 MHz stock, I OCed it to 425 MHz) and its VRAM setup of 64 MB dedicated + 192 MB shared, which I think affected newer games with larger texture maps. Like I said, I could run Call of Duty 2 in DX7 with everything maxed, including textures; disabling DX9 mode of course disabled shaders, which saved the hassle of those calculations as well as the extra memory to store them. Far Cry ran close to max spec and Half Life 2 pretty much at maximum spec, neither with AA, which was too much of a hit. Either way, I think the G72M in my laptop was pretty efficient for a low-end dedicated GPU despite the 64-bit memory bus, split memory pools and PCIe connectivity. It did things I wasn't expecting out of a 3:4:4:2 GPU. I could even do Call of Duty 2 in DX9 mode when overclocked (450 MHz factory, I pushed it to 570 MHz for games) at 800 x 600 decently though :LOL:
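
For what it's worth, the memory side of that G72M really is tiny; a rough estimate, assuming DDR signalling (two transfers per clock) on the 2 x 32-bit bus at the clocks mentioned above:

```python
def vram_bandwidth_gbs(bus_bits: int, mem_clock_mhz: float) -> float:
    """Peak local VRAM bandwidth in GB/s, assuming DDR (2 transfers/clock)."""
    return (bus_bits / 8) * mem_clock_mhz * 1e6 * 2 / 1e9

print(vram_bandwidth_gbs(64, 350))  # ~5.6 GB/s at stock
print(vram_bandwidth_gbs(64, 425))  # ~6.8 GB/s overclocked
```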
 