ATI - PS3 is Unrefined

aaronspink · Dec 28, 2005

m1nd_x said:
Is the PS3 capable of delivering 1080p@30fps to two HDTV's? I know it has two outputs for it, but I'm talking about hardware-wise, is it up to snuff for a decent looking game?

Can it display? Sure. Don't reckon the performance is going to be anything to write home about though. Realistically, the devs are going to be pushing the graphics on their games such that a vast majority of them won't support beyond 720p natively. I am of course assuming that the RSX provides support for scaling (which seems pretty reasonable).

Aaron Spink
speaking for myself inc.

one · Dec 28, 2005

aaronspink said:
There is a fundamental difference. The devs have to do little to nothing in order to enable the faster graphics cards. OTOH, the devs would have to devote reasonable resources into taking advantage of the second display, which will be utilized by a small minority of the market (if that). Therefore the likelihood of actual games utilizing the dual display interfaces is ~0.

So, displaying charts for character status, racing courses, maps, text chats, web browser, and other 2D data onto the second display requires devs to devote reasonable resources? :???:

a688 · Dec 28, 2005

one said:
So, displaying charts for character status, racing courses, maps, text chats, web browser, and other 2D data onto the second display requires devs to devote reasonable resources?

Well it isn't the DS where two screens are the default (and only) configuration. Extra work has to be done to make it work on both single screen and dual screen configurations (detecting, layout, etc). So yes. If two screens are the minimum then they lose a LOT of possible market share. Name all of the computer games that are dual monitor capable. There aren't that many because it takes more time, resources, and money.
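To make the "detecting, layout, etc" point concrete, here is a minimal sketch of the kind of branch a game would need if a second display is optional. Everything in it is hypothetical: `choose_layout`, the `Layout` type and the panel names are made up for illustration and do not correspond to any PS3 or PC API.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    """Which UI elements end up on which screen (hypothetical type)."""
    main_screen: list
    second_screen: list


def choose_layout(display_count):
    """Pick a HUD layout from the number of attached displays.

    Assumption: the platform reports a simple display count. Real
    detection would also have to deal with resolution, hot-plugging
    and so on, which is where the extra development work comes from.
    """
    panels = ["map", "chat", "inventory"]
    if display_count >= 2:
        # Dual screen: gameplay stays clean, 2D panels move to screen two.
        return Layout(main_screen=["gameplay"], second_screen=panels)
    # Single screen: the same panels have to be overlaid on the gameplay view.
    return Layout(main_screen=["gameplay"] + panels, second_screen=[])


for count in (1, 2):
    print(count, choose_layout(count))
```

Both branches have to be designed and tested, which is the cost being argued about, even when the second-screen content is only 2D data.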

mckmas8808 · Dec 28, 2005

a688 said:
Well it isn't the DS where two screens are the default (and only) configuration. Extra work has to be done to make it work on both single screen and dual screen configurations (detecting, layout, etc). So yes. If two screens are the minimum then they lose a LOT of possible market share. Name all of the computer games that are dual monitor capable. There aren't that many because it takes more time, resources, and money.

Didn't Ken say this was built into the OS? And didn't Sony say that one person could be watching a Blu-ray movie, while another person could be on the internet on a second screen? I think this too is incorporated into the OS.

Glad to see people fighting against a positive thing that they wouldn't even have to use. :???:

one · Dec 28, 2005

a688 said:
Well it isn't the DS where two screens are the default (and only) configuration. Extra work has to be done to make it work on both single screen and dual screen configurations (detecting, layout, etc). So yes. If two screens are the minimum then they lose a LOT of possible market share. Name all of the computer games that are dual monitor capable. There aren't that many because it takes more time, resources, and money.

Don't you play an RPG like Dungeon Siege? You have to open the inventory to equip items and you switch those screens by hitting keys even though things around you are moving. Or for chat, it's usually overlaid when playing an MMORPG or FPS. It would surprise me if putting those things on the second screen really requires that many development resources.

Besides, there will be cases which don't bother game developers, like having a web browser in the second screen to see an online walkthru while playing a game. These functions can be supplied by the OS.

AlphaWolf · Dec 28, 2005

mckmas8808 said:
Didn't Ken say this was built into the OS? And didn't Sony say that one person could be watching a Blu-ray movie, while another person could be on the internet on a second screen? I think this too is incorporated into the OS.

Glad to see people fighting against a positive thing that they wouldn't even have to use.

What do watching movies and browsing the internet have to do with developing games?

mckmas8808 · Dec 28, 2005

AlphaWolf said:
What do watching movies and browsing the internet have to do with developing games?

Nothing actually. I was just stating what I think I heard and asking for confirmation. But 'one' does have a great point when he says...

one said:
Besides, there will be cases which don't bother game developers, like having a web browser in the second screen to see an online walkthru while playing a game. These functions can be supplied by the OS.

Langsuyar · Dec 28, 2005

macabre said:
Welcome to the board!
There are also quite a few threads where Lair is a topic. I remember a recent one where people talked about the skin deformation of the dragons etc.
I know you are probably busy, but maybe you could step in and make some comments in there, if you are allowed to of course.

Thanks! Unfortunately everything directly related to Lair and the PS3 is guarded by NDAs, so I can't go into any of that. I can only comment on issues that are relatively universal to the industry (like next-gen art workflow) or that have already been mentioned in a public forum (for example during that GDC session or the more recent IGDA talk).

mckmas8808 · Dec 28, 2005

Langsuyar said:
Thanks! Unfortunately everything directly related to Lair and the PS3 is guarded by NDAs, so I can't go into any of that. I can only comment on issues that are relatively universal to the industry (like next-gen art workflow) or that have already been mentioned in a public forum (for example during that GDC session or the more recent IGDA talk).

Okay. Do you guys see most of the industry using the scanning techniques that you talked about in the IGDA talk come next-gen? If so, do you also feel like the Alfred Molina face demo shown off by Sony will be done in real-time in future next-gen games?

And if it can be done, will it be too expensive for over 90% of the developers to use or will it be common place in a couple of years?

P.S. Good luck on the game. And if the gameplay is as good as the graphics I will definitely buy it.

Shifty Geezer · Dec 28, 2005

Alpha_Spartan said:
I was referring to all of the legal shit + taking their engineering focus off of consoles and onto the PC market which is their toast and butter.

But XB produced nForce 2. As for the legal stuff, I'm not sure what that was, but large companies are invariably in half a dozen legal challenges at any point. XB gave nVidia a very popular line in motherboard chipsets, and a 20 million selling GPU with higher than usual profits, using a variation on existing tech which was followed by the Ti4xxx series that was well received and sold well. I don't see where their focus was taken away from the PC space or how they lost anything due to their XB work.

Titanio · Dec 28, 2005

Jawed said:
A lot of Xenos's vertex/geometry work will take place in render-phases that require no pixel shading.

What proportion of a frame's processing do you think they'd require then? The smaller the proportion, the more just RSX's vertex shaders could keep up, with processing distributed across the entire length of the frame. If you need results at a certain time, get your SPUs going.

The original proposition was:

"I may be the case that if devs start targeting large-scale vertex processing on Xenos this could cause issues on non-unified processors."

All I'm saying is that PS3 as a system, not just the non-unified RSX, could sustain such "large-scale" vertex processing and also "large-scale" pixel processing simultaneously, be that for one rendering pass or more generally across the frame's processing.

Jawed said:
With RSX running roughly six times slower at vertex shading during these phases, and not having anything like as rich a feature set (can't create or delete vertices, can't vertex-texture worth a damn)

A SPU can create and destroy geometry fine. In fact if I wanted to mimic, or exceed even, "Geometry Shaders" - a feature of DX10, the API you're so keen to align Xenos with - I'd be much happier using a SPU than Xenos's very fixed function tessellator.

Jawed said:
you're forced to soak up lots of Cell FLOPs (with an instruction set that doesn't have implicit load/store/permute (swizzle) and can't co-issue part-vector operations, hence less effective FLOPs than a GPU) to keep up.

Your "GPU flops are more effective than CPU flops" point rings rather hollow depending on what you're doing. And don't try and mould a SPU to execute exactly as a Vertex Shader - take advantage of its own strengths (which are considerable indeed). Functionally there is little contest, you could do things with an SPU that would be impossible to do with a Xenos (or RSX) shader (or could only be done in a rather rigid fashion with the tessellator). And still have a lot of pixel shading power left over. Taking Xenos to a point that would require extensive use of Cell would leave it with..nothing, for pixel shading. You may consider it compensatory for RSX, but at least it can compensate, and then some - the same can't be said for situations where X360 would be in a bind vs PS3. I don't know..maybe that's the mark of an architecture that's flexible, and that has legs..

mckmas8808 · Dec 28, 2005

Titanio said:
Your "GPU flops are more effective than CPU flops" point rings rather hollow depending on what you're doing. And don't try and mould a SPU to execute exactly as a Vertex Shader - take advantage of its own strengths (which are considerable indeed). Functionally there is little contest, you could do things with an SPU that would be impossible to do with a Xenos (or RSX) shader (or could only be done in a rather rigid fashion with the tessellator). And still have a lot of pixel shading power left over. Taking Xenos to a point that would require extensive use of Cell would leave it with..nothing, for pixel shading. You may consider it compensatory for RSX, but at least it can compensate, and then some - the same can't be said for situations where X360 would be in a bind vs PS3. I don't know..maybe that's the mark of an architecture that's flexible, and that has legs..

Hey Titanio, can I get your opinion on something? You remember the Alfred Molina head demo where the CELL was calculating lighting, SSS, and all that other technical stuff, right? Would it be sensible to do that same amount of calculating of lighting and SSS using the CELL in a real game? Or would it be better to use the RSX? Or would it be even better to use both?

Jawed · Dec 28, 2005

Titanio said:
What proportion of a frame's processing do you think they'd require then? The smaller the proportion, the more just RSX's vertex shaders could keep up, with processing distributed across the entire length of the frame.

This is about maximum geometry per pass and skirting bottlenecks. If you have 5% of the frame's render time to perform geometry pre-processing, then Xenos will deliver 6x the shading power in the same time.

So that gives you the ability to perform tessellation (which, incidentally, isn't fixed function) and, say, more lighting-shadow passes, e.g. 12 lights instead of 6.

More geometry creates less of a bottleneck than in RSX.
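The 5% and 6x figures above can be turned into a quick back-of-the-envelope calculation. This is only a sketch of the arithmetic: the 30 fps frame rate is an assumption picked for illustration, `slower_part_fraction` is a made-up helper, and the model treats the workload as scaling perfectly with shading throughput, which real hardware will not do.

```python
def slower_part_fraction(phase_fraction, throughput_ratio):
    """Fraction of the frame a part with 1/throughput_ratio the vertex
    throughput would need to finish the same geometry-only work."""
    return phase_fraction * throughput_ratio


FPS = 30.0                   # assumed frame rate, not stated in the post
PHASE_FRACTION = 0.05        # "5% of the frame's render time" from the post
THROUGHPUT_RATIO = 6.0       # "6x the shading power" from the post

frame_ms = 1000.0 / FPS
fast_ms = frame_ms * PHASE_FRACTION
slow_ms = frame_ms * slower_part_fraction(PHASE_FRACTION, THROUGHPUT_RATIO)

print(f"frame budget:          {frame_ms:.2f} ms")
print(f"phase on faster part:  {fast_ms:.2f} ms")
print(f"same work, 6x slower:  {slow_ms:.2f} ms")
```

Under these assumptions the same geometry work that fits in 5% of the frame on the faster part would take 30% of the frame on the slower one, unless the poly count is cut or the work is moved elsewhere (the SPUs, in the argument that follows).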

If you need results at a certain time, get your SPUs going.

Oh, I agree. But this topic is about Xenos and RSX, one being refined (both as a console-specific GPU and generally) and the other being brutish.

For example, getting an array of SPUs to crunch through an early z-pass probably might make a lot of sense in many cases.

Oh dear, where are you going to get the fill-rate for Cell to do that? Why do you think GPUs have fixed-function hardware, including hierarchical-Z and z-test in the ROPs, to accelerate those tasks? Whoops.

A SPU can create and destroy geometry fine.

Of course it can, that's what CPUs have been doing for years now.

In fact if I wanted to mimic, or exceed even, "Geometry Shaders" - a feature of DX10, the API you're so keen to align Xenos with - I'd be much happier using a SPU than Xenos's very fixed function tessellator.

So, DX10 geometry shaders are a waste of time then, hmm?... Oh dear. You need a better argument than that.

Your "GPU flops are more effective than CPU flops" point rings rather hollow depending on what you're doing.

Er, actually we're talking about geometry and vertex shading, Xenos's home turf.

The comparison of a CPU FLOP and a GPU FLOP is more than valid. In a GPU

ADD r1.xy, r1.xy, r2.zw

runs in one clock cycle. VMX/SPE takes longer because the swizzle needs to be performed separately (at least one extra clock, maybe 2 - permute takes 4 clocks on SPE, but it could be co-issued with another vector operation on previous clock cycles, provided that the previous instruction didn't set r2). Sure, that's a silly example, but the point stands. GPUs are built for vector maths in a way that SPEs and VMX don't quite get - that's why Fafalada keeps moaning about VMX. SPEs are "more general purpose", to put it bluntly.
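As a rough illustration of the swizzle point, the sketch below models `ADD r1.xy, r1.xy, r2.zw` two ways: once with the swizzle and write mask folded into a single operation, and once with an explicit permute followed by a full-width add. The function names are hypothetical and the operation counts are a toy tally of issued instructions, not cycle counts from any hardware manual; a real SIMD sequence may also need a select to emulate the write mask.

```python
LANES = {"x": 0, "y": 1, "z": 2, "w": 3}


def swizzle(vec, pattern):
    """Reorder the components of a 4-vector, e.g. pattern 'zw' or 'zwzw'."""
    return [vec[LANES[c]] for c in pattern]


def gpu_style_add(r1, r2):
    """ADD r1.xy, r1.xy, r2.zw with swizzle and write mask folded into one op."""
    src = swizzle(r2, "zw")                      # free: part of the ADD itself
    result = [r1[0] + src[0], r1[1] + src[1], r1[2], r1[3]]
    return result, 1                             # one issued operation


def simd_style_add(r1, r2):
    """Same result using a separate permute, then a full-width vector add."""
    ops = 0
    tmp = swizzle(r2, "zwzw")                    # explicit permute/shuffle
    ops += 1
    summed = [a + b for a, b in zip(r1, tmp)]    # add across all four lanes
    ops += 1
    # Keep only .xy from the sum; z and w of r1 are left untouched.
    result = summed[:2] + r1[2:]
    return result, ops


r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]
print(gpu_style_add(r1, r2))
print(simd_style_add(r1, r2))
```

Both paths produce the same vector; the difference is only in how many operations have to be issued, which is the efficiency gap being described.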

And don't try and mould a SPU to execute exactly as a Vertex Shader - take advantage of its own strengths (which are considerable indeed).

True, data re-ordering, packing and a variety of techniques can recover some of the efficiency that's lost in translating GPU shader programs into CPU shader programs. Cell starts with an awfully big disadvantage, though.

Functionally there is little contest, you could do things with an SPU that would be impossible to do with a Xenos (or RSX) shader (or could only be done in a rather rigid fashion with the tessellator).

Bear in mind that doing geometry/vertex-only passes, Xenos has roughly the same programmable FLOPs (for what that's worth, not much, since they're not even the same kind of FLOPs) as the whole of Cell, plus it has the extra capabilities of fixed-function hardware (e.g. culling and clipping).

And still have a lot of pixel shading power left over. Taking Xenos to a point that would require extensive use of Cell would leave it with..nothing, for pixel shading.

No because a lot of geometry/vertex shading work in advanced engines is done independently of pixel shading. e.g. the workload during stencil shadow calculation doesn't invoke any pixel shading - it's purely vertex work and z/stencil fill-rate.

D3 is a great example of this, with its extraordinarily low-poly environments/characters based on the fact that the CPU has to perform a lot of the shadowing. Even though shadowing is only a small proportion of the overall frame render time, it creates a huge bottleneck. Therefore the only solution is to keep the poly count really low.

I'm not disputing that Cell can help - all I'm saying is that geometry can create its own bottlenecks that are independent of pixel shading. Xenos has the flexibility to assign its computing power to whatever the current bottleneck is - RSX has none of that flexibility, it consists of stages that have fixed peaks. At certain times while rendering a frame, the pixel shader pipelines will be entirely idle.
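The unified-versus-fixed argument can be sketched as a toy model. The unit counts are assumptions: 48 unified ALUs follows the "3 arrays of 16" figure mentioned later in the thread, while the 8 vertex + 24 pixel split for the fixed design is assumed here for illustration and is not stated in the thread. Treating every unit as equally fast is a deliberate simplification, and the function names are made up.

```python
def fixed_split_time(vertex_work, pixel_work, vertex_units, pixel_units):
    """Fixed stages: each kind of work can only use its own units, so the
    phase lasts as long as the slower of the two stages."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_time(vertex_work, pixel_work, total_units):
    """Unified pool: every unit can be assigned to whatever work exists."""
    return (vertex_work + pixel_work) / total_units


# A vertex-only phase such as stencil shadow volume work: no pixel shading.
vertex_work, pixel_work = 480.0, 0.0

fixed = fixed_split_time(vertex_work, pixel_work, vertex_units=8, pixel_units=24)
unified = unified_time(vertex_work, pixel_work, total_units=48)

print(f"fixed split: {fixed:.1f} time units (pixel units idle)")
print(f"unified:     {unified:.1f} time units")
print(f"ratio:       {fixed / unified:.1f}x")
```

With these assumed counts the vertex-only phase comes out six times longer on the fixed split, in line with the "roughly six times slower" figure quoted earlier, while the 24 pixel units contribute nothing during that phase.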

You may consider it compensatory for RSX, but at least it can compensate - the same can't be said for situations where X360 would be in a bind vs PS3.

Eh? You've got vastly more efficient shading power in Xenos allied to a more graphics-oriented instruction set in Xenon (DP3/4 plus freely interchangeable AoS/SoA formatting for vectors) than that of SPEs (though Xenon's VMX units are still not up there with GPUs).

I don't know..maybe that's the mark of an architecture that's flexible, and that has legs..

No, it's the mark of an architecture that gets away with a retrograde GPU design by falling back on the CPU. If Xenos wasn't around I'm sure we'd all think PS3 was lovely, but with DX10 knocking on the door, RSX looks distinctly old-fashioned. Brutish, definitely, but old-fashioned.

Jawed

zidane1strife · Dec 28, 2005

Jawed said:
This is about maximum geometry per pass and skirting bottlenecks. If you have 5% of the frame's render time to perform geometry pre-processing, then Xenos will deliver 6x the shading power in the same time.

So that gives you the ability to perform tessellation (which, incidentally, isn't fixed function) and, say, more lighting-shadow passes, e.g. 12 lights instead of 6.

But for shadow-light, hasn't nvidia got some optimizations or something? I mean they always say Ultra-Shadow #X something, X2-or-something performance increase, and they've done that a few times already, improving over past implementations. I'm sure there are other similar things done; these scenarios are clearly not outside GPU devs' minds, and they clearly will try to do as much as possible to improve perf in such cases to outdo the competition, especially when such perf is at play with regards to some top-selling games like D3.

More geometry creates less of a bottleneck than in RSX.

Sure doesn't look like it

Laa-Yosh · Dec 28, 2005

mckmas8808 said:
You remember the Alfred Molina head demo where the CELL was calculating lighting, SSS, and all that other technical stuff, right?

I'd be totally surprised if Cell could calculate SSS in realtime, especially at HD resolution close-ups. As I mentioned back when the demo was first presented, because the head is completely static, it is more likely that the SSS data has been precalculated.

Titanio · Dec 28, 2005

Jawed said:
This is about maximum geometry per pass and skirting bottlenecks. If you have 5% of the frame's render time to perform geometry pre-processing, then Xenos will deliver 6x the shading power in the same time.

So that gives you the ability to perform tessellation (which, incidentally, isn't fixed function) and, say, more lighting-shadow passes, e.g. 12 lights instead of 6.

More geometry creates less of a bottleneck than in RSX.

Agreed, but again, use Cell for that 5% of frametime if you wish. And I'm sure you can find something for the pixel shaders to be doing.

Jawed said:
Oh dear, where are you going to get the fill-rate for Cell to do that? Why do you think GPUs have fixed-function hardware, including hierarchical-Z and z-test in the ROPs, to accelerate those tasks? Whoops.

I've seen it mentioned more than a couple of times by devs here. I'm not sure if you're saying RSX's fillrate would be a bottleneck here, but it wouldn't need to be a part of this at all. If you're talking about Cell's fillrate - I don't know, what is its fillrate? SPU-to-local memory?

Jawed said:
So, DX10 geometry shaders are a waste of time then, hmm?... Oh dear. You need a better argument than that.

No, I'm saying if you want to do similar work on a system without geometry shaders, SPUs would provide you with a more flexible and general purpose model to work with than Xenos's tessellator.

Jawed said:
The comparison of a CPU FLOP and a GPU FLOP is more than valid. In a GPU

ADD r1.xy, r1.xy, r2.zw

runs in one clock cycle. VMX/SPE takes longer because the swizzle needs to be performed separately (at least one extra clock, maybe 2 - permute takes 4 clocks on SPE, but it could be co-issued with another vector operation on previous clock cycles, provided that the previous instruction didn't set r2). Sure, that's a silly example, but the point stands.

I don't know if it does, because there's plenty a CPU could do with its flops, if you want to put it that way, that a Xenos shader also couldn't.

Jawed said:
True, data re-ordering, packing and a variety of techniques can recover some of the efficiency that's lost in translating GPU shader programs into CPU shader programs. Cell starts with an awfully big disadvantage, though.

I wouldn't bet against its capability as a vertex processor..

You could uniquely use Cell for vertex processing in ways that would leave Xenos shaders stumbling over themselves, also. Aside from flexibility, geometry creation, etc., leave aside your "fraction of a frame" bursts of vertex processing, and consider what would happen if a PS3 dev adopted frame-long vertex processing - really really really heavy vertex processing - across an SPU array and RSX's vertex shaders.

The original point was to present a usage of Xenos that would be challenging for PS3. I guess my point is that a) the specific example given wouldn't really and ultimately b) there are many more ways that PS3 can bend X360 over a barrel than vice versa.

Jawed said:
No because a lot of geometry/vertex shading work in advanced engines is done independently of pixel shading. e.g. the workload during stencil shadow calculation doesn't invoke any pixel shading - it's purely vertex work and z/stencil fill-rate.

Doesn't mean you can't be doing pixel shading in parallel..

Frankly your arguments are a little questionable whilst you also state, for example, that RSX is a SM2.0a chip. I'm not sure if one could trust your insight, Jawed.

Jawed · Dec 28, 2005

zidane1strife said:
But for shadow-light, hasn't nvidia got some optimizations or something? I mean they always say Ultra-Shadow #X something X2-or something performance increase, and they've done that a few times already improving over past implementations.

Ultra Shadow is a technique for applying a clip plane to stencil shadows I believe. It's not used in any game as far as I know (it can be turned on in D3, but makes no difference).

The other technique that NVidia introduced to help with stencil shadow volumes is the double-rate Z/stencil in the ROPs of the 6 and 7 series. It isn't a part of the vertex shading, per se, but relates to marking up the stencil buffer with "in shadow" or "out of shadow" state, as the shadow volume for each light is projected into 3D.

I'm sure there are other similar things done; these scenarios are clearly not outside GPU devs' minds, and they clearly will try to do as much as possible to improve perf in such cases to outdo the competition, especially when such perf is at play with regards to some top-selling games like D3.

ATI put double-rate Z/stencil into Xenos.

(Though it's missing from other recent GPUs, i.e. R520 and R580 - but not RV530.)

Jawed

Panajev2001a · Dec 28, 2005

For stencil shadow volume generation, and for geometry processing with lighting, textures, etc. turned off, the SPEs can take care of it all, leaving maybe those portions of the scene in which you used vertex texturing to displace/animate some of the on-screen geometry, even though you could do texture fetches on the SPEs if you preferred not to divide geometry processing during this phase between the SPEs and RSX. Not to say that Xenos's ability to dedicate 3 arrays of 16 unified shader ALUs to the task is not something very interesting indeed.

Faf and many others preferred the AoS form you found in the EE's VUs, with broadcasting and swizzling, but really, now that they have touched the VFPU, VMX-128 does not cut it either.

Yes, you have a DotProduct instruction and you have the ability of keeping data in AoS form (I have the feeling no studio will do this though if they want things optimized [and a bit portable] as many developers already on PSTwo ended up using the SoA form for vectors [which worked very well with broadcasting vector fields in math operations]), but what pisses people off about the PPE is not really VMX...

I do not find the comment comparing the SPEs to the PPE and its VMX/FPU/FXU units for graphics fair: a unified register file and the ability to perform logical, permute, floating-point and fixed-point math operations on all the registers without jumping through crazy hoops help things out immensely.

I'd still like a FlexIO adapted Xenos inside PLAYSTATION 3, but as a system PLAYSTATION 3 should really kick ass even without RSX.

Sometimes a system is much more than the sum of its parts: look at the jump in performance from POWER4+ based platforms to POWER5 based platforms.

Jawed · Dec 28, 2005

Titanio said:
Agreed, but again, use Cell for that 5% of frametime if you wish. And I'm sure you can find something for the pixel shaders to be doing.

The point still stands, Xenos's ceiling is far higher than RSX's.

I've seen it mentioned more than a couple of times by devs here. I'm not sure if you're saying RSX's fillrate would be a bottleneck here, but it wouldn't need to be a part of this at all. If you're talking about Cell's fillrate - I don't know, what is its fillrate? SPU-to-local memory?

Yes, I've hypothesized about the data structures you could draw up for hierarchical-Z residing in LS. It consumes a hell of a lot of Cell (around half). I think it would make a nice experiment, but I think it's at the heart of why Sony bought in RSX: fixed function hardware trumps ill-used programmable hardware.

No, I'm saying if you want to do similar work on a system without geometry shaders, SPUs would provide you with a more flexible and general purpose model to work with than Xenos's tessellator.

You don't get it: I'm saying Cell's gonna be helping RSX. I'm contrasting RSX with Xenos which has a broader and more efficient capability. Gah.

I don't know if it does, because there's plenty a CPU could do with its flops, if you want to put it that way, that a Xenos shader also couldn't.

Yes, of course. But that stuff tends not to be graphics.

You could uniquely use Cell for vertex processing in ways that would leave Xenos shaders stumbling over themselves, also.

Feel free to enlighten me.

Aside from flexibility, geometry creation, etc., leave aside your "fraction of a frame" bursts of vertex processing, and consider what would happen if a PS3 dev adopted frame-long vertex processing - really really really heavy vertex processing - across an SPU array and RSX's vertex shaders.

Come back when you find such a case. You should check out some of RoOoBo's graphs.

The original point was to present a usage of Xenos that would be challenging for PS3. I guess my point is that a) the specific example given wouldn't really and ultimately b) there are many more ways that PS3 can bend X360 over a barrel than vice versa.

No, Richard's comparison was between Xenos and RSX. That's where the thread started and it's what I've been fleshing-out. I haven't even touched on the anaemic framebuffer capabilities of RSX, which was prolly a big part of his comparison.

Frankly your arguments are a little questionable whilst you also state, for example, that RSX is a SM2.0a chip. I'm not sure if one could trust your insight, Jawed.

You should look at a comparison of the SM2a and SM3 profiles:

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=5#over

Take away RSX's check-box features: vertex texturing which is unusably slow and dynamic branching which is unusably slow (it can only be used as a run-time static branch, i.e. every pixel follows the same path) and you're left with RSX being better than SM2a on:

double-length vertex shaders, 512 instead of 256
geometry instancing
position register
face register
more general purpose and constant registers (on an architecture that grinds to a halt if you use more than about 6 general purpose registers, yeah, nice)
and that's it

Jawed

mckmas8808 · Dec 28, 2005

With all due respect

Jawed said:
You should look at a comparison of the SM2a and SM3 profiles:

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=5#over

Take away RSX's check-box features: vertex texturing which is unusably slow and dynamic branching which is unusably slow (it can only be used as a run-time static branch, i.e. every pixel follows the same path) and you're left with RSX being better than SM2a on:

double-length vertex shaders, 512 instead of 256

geometry instancing

position register

face register

more general purpose and constant registers (on an architecture that grinds to a halt if you use more than about 6 general purpose registers, yeah, nice)

and that's it

Jawed

Not to sound smart or piss you off or anything, but are you saying that the RSX is close to a SM2a GPU, while Xenos is close to a SM4.0? If this is true then I can't see how the PS3 will keep up after the first year (as far as graphics go)?
