RSX pixel shaders vs Xenos pixel shaders


rounin
I'm not sure if there has been a topic already made on this issue, but I thought it would be worth the risk since it's gotten me very curious, and recently I haven't seen much debate on it. I realize that a lot of the 3D gurus are very enthusiastic about the Xenos GPU, partly because of its new architecture and partly because the chip was covered extensively by a certain someone on a certain site ;) Part of the reason for this thread is also because someone made a thread comparing vertex shading and treating Xenos as a shading monster :D

If we were to take the assumption that the RSX is just a G71 with a 128-bit bus (same 24PS/8VS configuration) and compare it to the Xenos (48 unified shaders), how would the two compare in terms of pixel shading? Basically, does it make sense to assume that one unified shader pipeline does the same work as one pixel shader in the G71? What is the tradeoff for that big number (if applicable) versus a traditional architecture?

Given that in a closed console environment, devs are able to much better optimize for the hardware, would a unified shader approach give that big of a real-world advantage in terms of end results versus a traditional approach? I ask because I was under the impression that a unified shader approach allows for better load balancing in the PC world where games need to run on a diverse range of hardware.
 
rounin said:
If we were to take the assumption that the RSX is just a G71 with a 128-bit bus (same 24PS/8VS configuration) and compare it to the Xenos (48 unified shaders), how would the two compare in terms of pixel shading?


For the most part, one G71 pipeline is equivalent to two Xenos ALUs. But this is neither strictly true nor a particularly good measure of performance.

Xenos can read fewer textures per clock, so G71 should perform better in texture-limited operations. In cases where you are geometry bound, G71's pixel shaders will sit idle where Xenos's will be utilised to speed up geometry processing, making it faster there.

G71 ALUs are 4 wide and can be split 2+2 or 3+1; Xenos ALUs are 5 wide and are split 4+1.

G71 has additional mini-ALUs per pipe used for "special ops" like sin, cos, and rcp, and a potentially "free" 16-bit normalise.

G71 still supports FP16 math, which in effect doubles its register file size when 16-bit math is in use.

Xenos can do a texture fetch in parallel with ALU ops; G71 occupies an ALU during texture ops.

Xenos has MUCH better dynamic branch support.

The EDRAM on Xenos means a pipe cannot block on a wait to destination memory.

But in the end it's not simple, it depends a lot on the shader, and the source data.
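To make the co-issue point above concrete, here's a toy packing model. Everything in it is my own simplification for illustration — the greedy adjacent pairing, the issue rules, and the invented op stream — not either chip's real scheduler:

```python
# Toy model of per-clock co-issue: one G71 pixel ALU (4-wide, splits 2+2
# or 3+1) vs one Xenos ALU (5-wide, splits 4+1). Greedy adjacent pairing
# and the made-up op stream are simplifications, not a real scheduler.

def clocks_needed(widths, can_pair):
    clocks, i = 0, 0
    while i < len(widths):
        if i + 1 < len(widths) and can_pair(widths[i], widths[i + 1]):
            i += 2                    # two ops co-issue in one clock
        else:
            i += 1                    # op issues alone
        clocks += 1
    return clocks

# G71: two ops fit if they fill the 2+2 or 3+1 split (a 4-wide op issues alone).
g71_pair = lambda a, b: sorted((a, b)) in ([1, 1], [1, 2], [1, 3], [2, 2])
# Xenos: one vector op (up to 4-wide) co-issues with one scalar op.
xenos_pair = lambda a, b: min(a, b) == 1 and max(a, b) <= 4

ops = [3, 1, 4, 2, 2, 1, 3, 4, 1, 1]  # widths of an invented instruction stream
print("G71 clocks:  ", clocks_needed(ops, g71_pair))    # 6
print("Xenos clocks:", clocks_needed(ops, xenos_pair))  # 7
```

Swap a few widths around and the advantage flips, which is exactly the point: it depends on the shader.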
 
rounin said:
Basically, does it make sense to assume that 1 unified shader line does the same work as 1 pixel shader in the G71?

The pixel shader units on the G71 have two ALUs, with one of those ALUs also being used for texturing. Each ALU also has a sub-unit that does some additional math, called a mini-ALU. Each pixel shader on Xenos has one ALU. The texture units on Xenos are separate from the pixel shaders, and there are 16 of them.

RSX
- 550 MHz
- 8 vertex units
- 24 pixel pipelines
- 24 texture units
- 48 pixel ALUs + mini-ALUs (burdened with doing texture work)
- more overall horsepower

Xenos
- 500 MHz
- 48 pixel ALUs (burdened with doing vertex work also)
- 16 separate texture units
- better at shading many small fragments, and code branching
- load balancing, and less chance of stalls

Different strengths, and developers will exploit those differences, but the end result will be similar simply because they are not that far apart in overall power.
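For what it's worth, the "more overall horsepower" claim reduces to a back-of-the-envelope flop count. A minimal sketch, assuming every ALU retires one 4-wide multiply-add per clock and ignoring the mini-ALUs, texture co-issue cost, scalar units, and vertex/pixel sharing entirely:

```python
# Peak programmable-shader MADD rates from the listed specs.
# Ceilings only: assumes one 4-wide multiply-add (8 flops) per ALU per clock
# and ignores mini-ALUs, texture co-issue, and scalar units entirely.

FLOPS_PER_MADD4 = 4 * 2  # 4 components x (multiply + add)

rsx_ps = 24 * 2 * 550e6 * FLOPS_PER_MADD4  # 24 pipes x 2 ALUs @ 550 MHz
rsx_vs = 8 * 550e6 * FLOPS_PER_MADD4       # 8 vertex units (vector part only)
xenos  = 48 * 500e6 * FLOPS_PER_MADD4      # 48 unified ALUs @ 500 MHz

print(f"RSX pixel ALUs : {rsx_ps / 1e9:5.1f} GFLOPS")  # 211.2
print(f"RSX vertex ALUs: {rsx_vs / 1e9:5.1f} GFLOPS")  #  35.2
print(f"Xenos unified  : {xenos / 1e9:5.1f} GFLOPS")   # 192.0
```

On that naive count, RSX's pixel ALUs alone edge out Xenos's unified pool; everything else in this thread is about how much of each ceiling survives contact with a real shader.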
 
Edge said:
Different strengths, and developers will exploit those differences, but the end result will be similar simply because they are not that far apart in overall power.


Xenos has 48 ALUs which can do both pixel and vertex shaders; don't treat them separately. Because of the reduced overhead from the USA, the overall output is more than RSX regardless of the 50 MHz gap in favor of RSX. The USA is designed to reduce overhead and improve efficiency, and with ATI's overhead historically less than Nvidia's anyway, I expect a lot more from Xenos in the future when developers start using Memexport.
 
kabacha said:
Xenos has 48 ALUs which can do both pixel and vertex shaders; don't treat them separately. Because of the reduced overhead from the USA, the overall output is more than RSX regardless of the 50 MHz gap in favor of RSX. The USA is designed to reduce overhead and improve efficiency, and with ATI's overhead historically less than Nvidia's anyway, I expect a lot more from Xenos in the future when developers start using Memexport.

I said "burdened with doing vertex work also", indicating their unified nature.

Overall output is more than RSX? Sure, in certain circumstances, but certainly not in all, and Memexport is hardly going to be that big of a deal. RSX has more overall horsepower, but I agree Xenos has some efficiencies that will help it close the gap, or surpass RSX, but not by much.

Anyway, all this has been discussed a million times here. Better for the thread starter to start digging and reading those older threads than for all this back-and-forth argument to start up again.
 
Indeed. I have been following these forums for a long time, and all in all I was of the impression that the two are roughly equivalent, with different strengths and different weaknesses. But recent discussions have led me to think that Xenos is suddenly a super (shader) monster and that, coupled with its USA, it will allow for much better graphics through efficiency. As a result, I would like the last paragraph of my original post cleared up.
 
Efficiency is a word that's been thrown around for a long time concerning Xenos, but how efficient is it, and how do you quantify that? You can't really, and that largely drives the constant arguments that cannot really be settled.
 
Edge said:
Efficiency is a word that's been thrown around for a long time concerning Xenos, but how efficient is it, and how do you quantify that? You can't really, and that largely drives the constant arguments that cannot really be settled.

Of course the flip side is to completely ignore and dismiss the architectural differences in how they utilize their resources and the tradeoffs in the respective designs. We know from the PC side of things that comparing various IHVs' graphics chips based on paper metrics can lead to pretty erroneous results.

A simple example ERP outlined is "the EDRAM on Xenos means a pipe cannot block on a wait to destination memory". Pretty straightforward, and more relevant in many ways than comparing paper spec-marks that are detached from an architectural context. Similarly, a large number of arguments are built upon faulty comparisons like fillrate. Technically G71 has more fillrate than Xenos, but when memory limitations are factored in, the fillrate of G71 with a 128-bit 700 MHz GDDR3 bus is actually less than what the GPU is capable of. And when AA is in use this situation changes even more. An over-reliance on "the numbers look the same, therefore they must perform the same" ignores how the chips were designed to run. RSX has more float performance than CELL, yet no one would argue RSX is a better CPU.
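To put rough numbers on the fillrate example (the ROP count and per-pixel traffic below are my own illustrative assumptions, and real chips compress colour and z, so treat this purely as a sketch of the mechanism):

```python
# How a 128-bit GDDR3 bus can cap fillrate below what the ROPs can do.
# ROP count and bytes-per-pixel are assumptions; compression is ignored.

mem_bw = (128 // 8) * 700e6 * 2   # 16 bytes/clock, 700 MHz, DDR -> 22.4 GB/s
rop_fill = 8 * 550e6              # assume 8 ROPs @ 550 MHz -> 4.4 Gpix/s

bytes_per_pixel = 4 + 4           # 32-bit colour write + 32-bit z, no AA
bw_fill = mem_bw / bytes_per_pixel

print(f"ROP-limited fillrate: {rop_fill / 1e9:.1f} Gpix/s")
print(f"BW-limited fillrate : {bw_fill / 1e9:.1f} Gpix/s")  # ~2.8 -- the real cap
```

Turn on 4xAA and the per-pixel traffic multiplies while the bus stays the same, which is why the gap widens; Xenos sidesteps this particular calculation by giving colour and z their own eDRAM bandwidth.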

Of course on the flip side you have no problem saying "but the end result will be similar simply because they are not that far apart in overall power", and seem to dismiss any efficiency gains as nominal. Yet looking over ERP's post you see comments like "Xenos has MUCH better dynamic branch support." Where does a comment like this fit within the framework of your conclusion?

It is all perspective; and in reality it will come down to game design and developer skill/budget. And as ERP said:

ERP said:
But in the end it's not simple, it depends a lot on the shader, and the source data.

Ultimately some game designs are better suited to various hardware designs. The X800 and 6800 were night and day in Doom 3 / Half-Life 2 (at least at first). Very little information gleaned from the spec sheets would indicate such divergent benchmark results. All these chips have shortcomings; it is the job of the developers to minimize those areas in their engine and game designs and to emphasize their strengths. And with consoles, the GPU is just one element. Developers are free to utilize the entire system to their benefit if they so wish. And ultimately, different developers with different backgrounds and design goals will arrive at different conclusions about what is best for them. And in the broader context, what is good for a small 15-man development team may not be what is best for a 150-person development team.

On the positive side, Mintmaster has done a good job lately focusing discussion on specific points of an architecture. Frequently many of the non-technical posters here want definitive "which one is faster" type answers. This is really the wrong question, for many reasons. When you break it down, like ERP did above, you can get a feel for the strengths and weaknesses of each design. E.g. ERP noted Xenos having strong dynamic branching performance and vertex texturing. So a game that leans on such features could be substantially faster on one system if a suitable workaround cannot be found on the other. But the reality is we should not expect PS3 developers to push heavy vertex texturing; instead they will pursue design goals that better match the PS3's hardware.

Of course this leads threads into a more discussion-oriented direction instead of the "I am a fan and want self-validation that my platform of choice is the best" or "I have a platform of choice and am going to do everything I can to downplay the competition", both of which tend to dominate and ruin a lot of threads.
 
Acert93 said:
"Xenos has MUCH better dynamic branch support." Where do such comments as this fit within the framework of your conclusion?

I indicated as much also, as I wrote my post before reading his. My conclusion was "better at shading many small fragments, and code branching". What more do you want? You want me to say 30 percent better, or what? But at what scene, or what game, etc.? In other words, I can't give a specific answer, and neither can you. Why don't you answer rounin instead of debating with me?
 
Acert93 said:
Of course the flip side is to completely ignore and dismiss the architectural differences in how they utilize their resources and the tradeoffs in the respective designs. We know from the PC side of things that comparing various IHVs' graphics chips based on paper metrics can lead to pretty erroneous results.

No one is necessarily ignoring the architectural differences between the designs. Even putting aside the maxim that different is not necessarily better, I think some people are concentrating solely on the merits of the novel architecture, and unfairly dismissing those of the conventional architecture. While certain things can be pointed out, the significance is always exaggerated or downplayed.

For example, while it may be obvious that "the EDRAM on Xenos means a pipe cannot block on a wait to destination memory", it remains contentious whether the cost of the implementation could have been spent on other aspects of the hardware. Is it not a compromise between pixel quality vs post-processing, or programmable vs fixed function? Comparing the 128-bit bus on the RSX (identical to Xenos's, incidentally) to a supposed PC counterpart isn't fair, especially if you would also want to dismiss the XDR bandwidth that is available when required and not contended for at other times (unlike on the 360). One could observe perhaps that "Xenos has MUCH better dynamic branch support". Even if true, how important is dynamic branching support outside of benchmarking tools? What of vertex shader intensive situations? Can you say that it happens most of the time? Why should the ability to dedicate all ALUs to vertex shading for one instance be significant?

Perhaps your argument is that paper metrics cannot provide a simple answer to which system is more powerful. But if I were to argue for it, I would say that even ignoring the vertex shaders on the RSX, the Xenos ALUs at best will only be able to match RSX's pixel shading capacity, regardless of how you reorganise the ALU arrays. This is based on numbers alone.
 
onanie said:
Even if true, how important is dynamic branching support outside of benchmarking tools? What of vertex shader intensive situations? Can you say that it happens most of the time? Why should the ability to dedicate all ALUs to vertex shading for one instance be significant?

I hope you remember those questions in 3 or 4 years' time.

But if I were to argue for it, I would say that even ignoring the vertex shaders on the RSX, the Xenos ALUs at best will only be able to match RSX's pixel shading capacity, regardless of how you reorganise the ALU arrays. This is based on numbers alone.

You wouldn't be surprised if people disagreed with you, based solely on numbers, would you?
 
TurnDragoZeroV2G said:
I hope you remember those questions in 3 or 4 years' time.
You wouldn't be surprised if people disagreed with you, based solely on numbers, would you?
I would be surprised. Do you have anything to offer?
 
onanie said:
Perhaps your argument is that paper metrics cannot provide a simple answer to which system is more powerful. But if I were to argue for it, I would say that even ignoring the vertex shaders on the RSX, the Xenos ALUs at best will only be able to match RSX's pixel shading capacity, regardless of how you reorganise the ALU arrays. This is based on numbers alone.

It's funny you mention this. This is a forum of tech-heads and comparing specs is a given, but the thought of real-world use is not always entertained here. As a student of Interaction Design, I'm very tempted to point out that the GOALS of the users (game developers in this situation) should be what is emphasized in the comparison of the hardware--function first, form second.

Having an exciting new architecture that theoretically should have some efficiency benefits doesn't answer this question: "Is this GPU going to meet the needs and wants of the user in a typical 'day in the life of...' situation?" That is a much more sophisticated question than "What are the specs?" The real test of the unified architecture is in this particular scenario (closed box context), and not anywhere else. Just because in the PC/open platform space you get a 15 to 20 percent theoretical efficiency gain through load balancing and better dynamic branching does not mean that is going to translate into the closed box environment. The efficiency gain may only be 5% when all is said and done...which could easily be outweighed by a raw performance difference of 15-20%.

Having a more mature development environment on the software end also factors heavily into the usefulness of the conventional architecture to its users. I'm of the camp that believes gaming should become more CPU-centric. Having developers not have to relearn the way they utilize the GPU allows them to focus more of their efforts on things like physics, simulation, physics-based transitional animations, environment-based AI schemes, collision detection, better art direction, etc.
 
ROG27 said:
It's funny you mention this. This is a forum of tech-heads and comparing specs is a given, but the thought of real world use is not always entertained here. As a student of Interaction Design, I'm very tempted to point out that the GOALS of the users (game developers in this situation) should be what is emphasized in the comparison of the hardware--function first, form second
Indeed, function first, form second. Some would argue though that if form is there, then function can be figured out :) but the former statement is certainly more sensible for design. The users, i.e. developers, would certainly know best what function they'd like, and perhaps they've already entertained some of the questions I've asked.
 
onanie said:
For example, while it may be obvious that "the EDRAM on Xenos means a pipe cannot block on a wait to destination memory", it remains contentious whether the cost of the implementation could have been spent on other aspects of the hardware. Is it not a compromise between pixel quality vs post-processing, or programmable vs fixed function? Comparing the 128-bit bus on the RSX (identical to Xenos's, incidentally) to a supposed PC counterpart isn't fair, especially if you would also want to dismiss the XDR bandwidth that is available when required and not contended for at other times (unlike on the 360).
There are a lot of things to consider with eDRAM vs the PS3 approach. First of all, it allows developers to use the 512MB however they want without any headaches or performance issues. They can use 100MB of textures or 400MB of textures, if the game code allows for it. How much is this flexibility worth? Secondly, it'll probably save MS some money to only have one 128-bit bus. There are fewer memory chips and the board layout is simpler. (I know packaging the two dies together costs something, but that could disappear in the future.) Thirdly, there's more stuff you need on the GPU in order to manage without eDRAM (e.g. compression logic, a more complex memory controller, etc.), so this offsets a bit of the transistor cost. Fourthly, the additional memory contention from the CPU pales in comparison to the demands from the colour and z clients in GPUs without eDRAM. The bulk of the eDRAM probably yields well too, since a tiny amount of redundancy can cover for an error anywhere.

Add all this up, and it's tough to say whether there's a big cost disadvantage.

One could observe perhaps that "Xenos has MUCH better dynamic branch support". Even if true, how important is dynamic branching support outside of benchmarking tools? What of vertex shader intensive situations? Can you say that it happens most of the time? Why should the ability to dedicate all ALUs to vertex shading for one instance be significant?
For dynamic branching, of course there's no use in current games, because it's only been usable in hardware for six months or so. It's a new tool that enables new effects. Demos have shown huge practical benefits, and demos aren't benchmarking tools.

The polygons a game sends to a chip are not uniform in size. Vertex to pixel ratios span many orders of magnitude. Very, very few polygons lie within the range where both pixel and vertex shaders are mostly occupied simultaneously. For any given polygon, the ratio changes as the player moves around. I've done sophisticated workload analysis before, and it's like this all the time.

There are also lots of vertices without any pixels. For any character or object you draw (with the exception of some things like terrain), about half the triangles are backfaces and get culled. Of the remaining polygons, some are off-screen, as you can't waste CPU time getting rid of every polygon outside the viewing frustum.
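A toy simulation shows the mechanism being described here. Everything in it is invented for illustration (the triangle mix, the per-vertex and per-pixel costs, and the serial per-triangle processing), and the unit counts are deliberately chosen so both designs have the same total ALUs, isolating the balancing effect rather than raw width:

```python
# Toy load-balancing model: 8 VS + 40 PS fixed units vs 48 unified units.
# Same total ALU count, so only the ability to rebalance differs.
# Triangle sizes, costs, and serial processing are invented for illustration.
import random
random.seed(1)

def split_time(v, p):   return max(v / 8, p / 40)  # idle stage can't help the other
def unified_time(v, p): return (v + p) / 48        # all units share the work

t_split = t_unified = 0.0
for _ in range(10_000):
    pixels = 10 ** random.uniform(-1, 3)   # 0.1 .. 1000 px: ratios span decades
    v = 3 * 8                              # 3 verts x 8 VS instructions
    p = pixels * 10                        # 10 PS instructions per pixel
    t_split += split_time(v, p)
    t_unified += unified_time(v, p)

print(f"unified speedup over split: {t_split / t_unified:.2f}x")
```

The exact number is meaningless, but however you skew the mix, the fixed split leaves one pool idle whenever the vertex-to-pixel ratio drifts away from the hardware's ratio, and it always drifts.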

Perhaps your argument is that paper metrics cannot provide a simple answer to which system is more powerful. But if I were to argue for it, I would say that even ignoring the vertex shaders on the RSX, the Xenos ALUs at best will only be able to match RSX's pixel shading capacity, regardless of how you reorganise the ALU arrays. This is based on numbers alone.
You think texturing is going to disappear? For any texture lookup, one of RSX's ALUs is occupied. Consider a shader with 8 texture instructions and 20 vector math. Xenos is limited by texture units, so it outputs 2 pixels per clock. RSX will output 24*2/28 = 1.7 per clock, assuming perfect dual issue. Note that some of the math units in Xenos are idle, so it's not an ideal case for it. If I really wanted to cherry pick, I could give an example where Xenos is over 10 times as fast. I don't know of any situation where the converse is true.
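Spelling that arithmetic out (same numbers as the post; "perfect dual issue" for RSX is the generous assumption being granted here):

```python
# Shader with 8 texture fetches and 20 vector math ops, per the example above.
tex, math = 8, 20

# Xenos: 16 texture units run alongside the 48 ALUs, so take the worse limit.
xenos = min(16 / tex, 48 / math)   # min(2.0, 2.4) -> 2.0 px/clock, texture-bound

# RSX: a fetch occupies one of a pipe's two ALU slots, so all 28 ops
# compete for 24 pipes x 2 issue slots (perfect dual issue assumed).
rsx = 24 * 2 / (tex + math)        # 48 / 28 -> ~1.71 px/clock

print(f"Xenos {xenos:.2f} px/clk vs RSX {rsx:.2f} px/clk")
```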

As for vertex shading consuming resources, consider a short 10-instruction pixel shader operating on a small 10×10 pixel rectangle with a simple transformation vertex shader: 1000 PS instructions, 8 VS instructions for the two vertices (in fact you could get away with one vertex per quadrilateral). 99% of the time is spent in pixel shading. Vertex shading will barely make a dent in the shading power available for pixel shading.
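Worked through, reading the 8 VS instructions as roughly 4 per vertex (my reading of the numbers above):

```python
# Pixel vs vertex work for the 10x10 rectangle on a unified design.
ps_work = 10 * 10 * 10   # 100 pixels x 10 PS instructions = 1000
vs_work = 2 * 4          # 2 vertices x ~4 transform instructions = 8
print(f"vertex share: {vs_work / (ps_work + vs_work):.1%}")  # ~0.8%
```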

Xenos doesn't have handicapped pixel shading like a bunch of you guys are pretending. Yes, it's a bigger die size in total, but there are lots of new rendering possibilities with it.
 
In the end I think it's going to be up to Cell to prove "itself". It has never been a war of GPUs. The GPUs seem comparable, with their pros and cons; Cell simply looks that much better IMO than XCPU. So for me Cell is the difference, not the GPUs.
 
ERP, not sure if you are aware, but the scalar units on Xenos's shaders are special function units as well.
 
ROG27 said:
Just because in the PC/open platform space you get a 15 to 20 percent theoretical efficiency gain through load balancing and better dynamic branching does not mean that is going to translate into the closed box environment. The efficiency gain may only be 5% when all is said and done...which could easily be outweighed by a raw performance difference of 15-20%.
ROG27, you make all those points, but you fail to acknowledge that it could be completely the other way around.

PC games have pretty low polygon counts. It's not that devs really want to do this, but they need the game to run well on lower-end and previous-gen cards. Simply reducing the resolution will assist weaker pixel shaders, but it doesn't reduce the polygon count. If Xenos lets you run a 48-instruction vertex shader at 500 Mverts per second, then they'll be free to increase polygon counts.
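That 500 Mverts/s figure is just the ALU count divided by the shader length; the RSX line below is my own comparison under the same (unrealistic) assumption that every clock goes to vertex work:

```python
# Peak rate for a 48-instruction vertex shader, one instruction/ALU/clock.
xenos = 48 * 500e6 / 48   # all 48 unified ALUs -> 500 Mverts/s
rsx   = 8 * 550e6 / 48    # 8 dedicated vertex units -> ~92 Mverts/s
print(f"Xenos {xenos / 1e6:.0f} Mverts/s vs RSX {rsx / 1e6:.0f} Mverts/s")
```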

Games don't use displacement mapping because even though many cards support vertex texture fetch, they do it veerrrry slowly, and that isn't worth the space you can save with displacement mapping. There are plenty of other uses for VTF too.

The same with dynamic branching. I personally think ATI made a mistake in wasting so much die space on this feature for the PC space, because it'll probably be 2007 before any game uses it simply due to the install base. But on a closed platform you can use these things extensively. And it's not 15-20%, it's 2x to 10x, depending on the effect.
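A crude model of where a "2x to 10x" dynamic-branching win can come from: an early-out (say, skipping expensive lighting on pixels a cheap test rejects) only pays off when every pixel in a hardware branch batch takes it, so batch size matters enormously. The batch sizes, costs, and independence assumption below are all invented for illustration; real skip masks are spatially coherent, which favours small batches even more than this suggests:

```python
# Early-out speedup vs branch granularity. A batch skips the expensive
# path only if ALL of its pixels do. Numbers are invented for illustration.

def speedup(skip_frac, batch, full=100, cheap=10):
    p_batch_skips = skip_frac ** batch        # treats pixels as independent
    avg = p_batch_skips * cheap + (1 - p_batch_skips) * full
    return full / avg

for batch in (64, 1024):                      # fine vs coarse granularity
    print(f"batch {batch:4d}: {speedup(0.995, batch):.1f}x")  # ~2.9x vs ~1.0x
```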

Right now we have stencil shadows and PCF shadow maps, which NVidia devoted plenty of hardware to accelerating, but in the future neither will be used. Instead, Variance Shadow Maps (co-invented by AndyTX on these forums) will give us fast, pretty, and mostly artifact-free shadows. And Xenos has a feature that increases their usability.

Xenos is not about making a dev relearn how to use a GPU, it's about opening new doors. For the most part you can use it exactly the same as a PC GPU. It's the same way the super powerful CPU's on these consoles, esp. CELL, can let you do new things (except there are plenty more headaches than with the GPUs).
 
Mintmaster said:
If I really wanted to cherry pick, I could give an example where Xenos is over 10 times as fast. I don't know of any situation where the converse is true.
But why would a developer write that code for RSX then? If RSX isn't very good at dynamic branching, then don't use it.

It seems to me that you're looking at this from a very PC-centric point of view, but actually, console developers have to be a much more crafty bunch if they want their games to look good.

For example, in lots of real-world situations, particularly where SM2.0 was used, the GeForce FX series came up short of R300-based cards, yeah? This was a major issue in the PC space because general code involving SM2.0 effects didn't run well on the hardware. But on a console it wouldn't have been a massive issue, since all the code would have been written for the FX's architecture, thus avoiding any architectural deficiencies. Sure, it wouldn't have looked quite as good because a lot of the time you'd have to cut corners or resort to lower precision, but it would be a lot closer than looking at Half-Life 2 benchmarks would suggest.

Console developers do stuff that PC developers couldn't dream of doing on equivalent PC hardware because they can code very specifically, knowing that everyone in the world will have the same piece of kit. An old but clichéd example is MGS2's rain, which probably could have been done using pixel or vertex shaders on PC hardware, but because those weren't available it relied on PS2's large fillrate.

So I think to say 'this type of code will run faster on Xenos' is a misleading statement, because developers would never run the same code on RSX. They'll just find a faster workaround suited to the console, even if they have to trade off image quality vs. performance and hide it with art assets (Shadow of the Colossus is a good example of a game that does this a lot).

Hmmmm...I'm not really sure what the point of my post is. Also, it's quite possibly a complete load of rubbish. I think what I'm trying to say is that I think final games will probably be a lot more comparable than 'this type of code will run 10x faster on Xenos' suggests.
 