So this is what I am getting out of the 360 and PS3...

3roxor said:
Sorry blakjedi, but you are completely wrong :!:

Comparing mostly renders on PS3 (with the exception of HS)

Why? Because it happens to be that deanoC is the only developer who confirmed that here.... :rolleyes:

Yep, and that's good enough for most of the Web too.

to games running (at more than 5fps) on last generation's PC hardware is kinda far fetched dont you think?

Ehm... by last-generation PC hardware you mean those two PowerMac G5s the Xbox beta kit is currently running on... :rolleyes:

And the X850s they were running on, and oh, the dual SLI 6800s that everything "running" on PS3 was on.... get my drift?

(actual game running at more than 5fps) has been universally hailed as Gears of War.... so in reality your perception is inverse to reality.

Yup, at a fluent 15fps.. It also wasn't "universally" hailed as the most impressive next-generation game.... so in reality your perception is inverse to reality.

Ok...

BTW, everything you've seen is running on high-end PCs... "Comparing mostly renders on PS3" = you :?

Hey, THEY (Guerilla Games) said, "It's basically a representation of the look and feel of the game we're trying to make," and most of the other demos were *rendered* as frames then streamed together to provide the illusion of gameplay.

Which doesn't make your opinion wrong, just not as informed as you make it out to be.


*shrug*

No doubt PS3 will generate amazing games, but I don't think you can PROVE that based on what you've seen to date.

Not in a court of law, "but I'll stick to it until proven otherwise.."

try again

Nope, 'cause it's pointless. Have a great day.
 
Actually, I am pretty sure that when PS3 launches, high-end PCs will already have the same performance. By the end of 2006, high-end PCs will be clearly faster.

Of course, console games have higher budgets and are optimized a lot more.

Differences in graphics and gameplay between PS3 and Xbox 360 will depend on budget and the talent of programmers and artists.

I mean, there are still a lot of people who prefer GT4's graphics over Forza's, despite missing mipmapping, missing AA, less texture detail...

And the difference between PS2 and Xbox is huge compared to the difference between PS3 and Xbox 360, where we most certainly will not see a clear winner.
 
Titanio said:
That aside, there is a clock difference and an unknown, but highly likely, performance degradation on unified shaders vs a dedicated shader. It's a little complicated to weigh up all the plusses and minuses..

The major drawbacks of unified shaders are load balancing and transistor real estate. There is nothing intrinsically saying that a unified shader ALU needs to be slower/less efficient than a dedicated ALU. The issue is how much transistor real estate is needed to make an ALU friendly to both types of work, and how to ensure that the hardware intelligently divides up the workload. Load balancing would be a KILLER if not done correctly. Kirk has stated the issue of load balancing explicitly as a problem; conversely, ATI is confident in their arbitrator design.

Basically, you cannot take what Kirk/NV says as a universal rule. A problem for NV is not necessarily a problem for ATI. See below for why.

After Kirk's claim that some devs were pushing VS off Xenos and using it only for pixel shading, the more I thought about it, the more I consider that to be a very undesirable situation. It would suggest some bad things for Xenos, in fact, if that were more desirable in the typical case.

:LOL:

You are kidding, right?!

This comes from the SAME Kirk interview where he downplays FP16 (HDR) and MSAA and says we won't see it for a long time (at least from NV).

Oops, his competitor doesn't seem to have this issue. ;)

Just PR FUD. Seriously, I would NOT take anything NV says about ATI or what ATI says about NV seriously without some firm evidence.

That said, I am not sure you can be in a position to discuss whether it is desirable or not to use Xenon for vertex tasks because, simply: 1.) there is no direct information on how it performs, 2.) the Xenon-Xenos cache link is designed for the purpose of streaming vertex data, and 3.) vertex engines are one of the FEW things in game development that have a proven track record of being threaded separately from the core game logic.

Seriously, quoting Kirk's "anonymous" source is about as reliable as listening to the MS rep saying, "They have 1 CPU core and 7 other 'things'".

Kirk has a vested interest in downplaying unified shaders. His company has not gone that direction, NOR will they be going that direction anytime soon.

Ditto FP16 w/ MSAA. He clearly downplays it and basically blames DEVELOPERS.

And of course he trumpets their architectural advantages at every corner (remember SM 3.0 and the ATI smear campaign by Jen-Hsun? He was pretty mad when someone mentioned R420 in the same breath as his precious cutting-edge NV40).

Basically I would not trust Kirk at all when it comes to talking about ATI's products that have yet to launch. Specifically, developers are just NOW getting hardware. For Kirk to be talking about hardware that has been out a month is pretty telling. From the news posts I have read the Beta kits are still using G5s, sooo.... what can we gather from Kirk?

PR, downplay and smear, and praise your own efforts.
 
Acert93 said:
The major drawbacks from unified shaders is load balancing and transistor realestate. There is nothing intrinsically stating that a unified shader ALU would need to be slower/less effecient than a dedicated ALU.

I think in the general case it's fairly reasonable to expect that a more general unit will lose some performance versus a dedicated one with specific tasks, no? Especially if you're talking about very refined and mature dedicated units ;)

Acert93 said:
You are kidding, right?!

This comes from the SAME Kirk interview where he downplays FP16 (HDR) and MSAA and saying we wont see it for a long time (at least from NV).

Don't misunderstand me! I don't accept that what he's saying is necessarily true, that devs are actually doing that - maybe they are, maybe they aren't, I don't know. I was simply entertaining the hypothetical, and responding to a post that was doing likewise. As far as I can see, if that was being done, or if devs felt they had to do it, it would not be a good thing. You'd be giving up one of Xenos's greatest strengths - the greater utilisation across two tasks, vertex and pixel - while also eating into your available CPU power. If devs in the typical case felt they had to do that, it would make me wonder why, and how this could be a more desirable situation than taking advantage of some of the key benefits Xenos is supposed to bring. In such a situation, if devs did want to put all vertex work on the CPU, they would IMO be better off overall with a fully dedicated chip (i.e. all dedicated pixel shading hardware), or even a "traditional" dedicated architecture, where pixel performance may well hold up to what Xenos can offer going full whack at it, and they would still have VS units left over to avoid doing vertex tasks on the CPU.

(BTW, I don't think Kirk was downplaying HDR or MSAA individually? HDR performance would seem to be a major focus of G70)
 
Acert93 said:
The major drawbacks from unified shaders is load balancing and transistor realestate. There is nothing intrinsically stating that a unified shader ALU would need to be slower/less effecient than a dedicated ALU.
Though I agree with your post in the main, has it not been stated lots of times that a unified shader is not as efficient at a task as a specialised shader for that task? There have never been any associated figures for how much, but as I figure it, Unified Shaders are balanced with:

+ efficiency in operation (load balancing), − lower performance relative to specialised shaders :: fixed-op shaders

If the load balancing and shader performance issues are more negative than the efficiency is positive, the hardware is outperformed by existing methods.

I don't think the advantage or disadvantage can be too much either way; if it were, either ATi wouldn't go near unified coz it's rubbish, or they'd have gone unified with everything and so would nVidia. Given there's no rush to introduce unified shaders to PC GPUs, ATi can't see it as being amazingly more powerful than conventional systems.
 
Titanio said:
I think in the general case it's fairly reasonable to expect that a more general unit will lose some performance versus a dedicated one with specific tasks, no?

No, I don't agree with that.

I think at this point it is a huge assumption to say "a more versatile ALU is a less efficient ALU".

Like I said, in essence, there is a tradeoff on performance-per-transistor (unified shader units ARE larger), but there is no reason they must be less efficient per unit.

Further, ATI is suggesting that this design, overall, is far superior in efficiency to a traditional design. Basically you get more out of this design than you would out of a traditional design per transistor.

So you have Kirk, who is anti-Unified Shader at this point, bemoaning it; and ATI, who is pro-Unified Shader, singing its praises.

Who to believe?

Simple: it all goes back to business, investors, and money. NV is not switching because they believe they can squeeze more life (investment, $) out of their current design, and unified shaders pose a problem for their current design decisions. Unified shaders are the future, but not the present.

ATI is switching because they have overcome the issues that were hurdles for them (like intelligent load balancing and getting ALU performance:transistor ratios under control to end up on the positive) so it is better than what they have now.

Remember, MS could have easily chosen to go with a standard GPU design. Kirk's rhetoric aside, there must be benefits to the design. From a common-sense standpoint unified shaders should perform better in more situations and different game genres, and make more efficient use of the transistors (i.e. more transistors will be active more often, instead of PS sitting idle during VS-intense scenes/games and vice versa).

MS had the opportunity to evaluate traditional hardware (R300, R420, R520) and ATI's new stuff (R400 and ATI's WGF 2.0 projections). I think it is awfully pessimistic to assume MS chose the worse of the two options.

I was simply entertaining the hypothetical, and responding to a post that was doing likewise.

When entertaining discussion it is important to consider the source.

Besides the reasons mentioned above, I am curious how much time Kirk and his dev friend even had to look at the hardware. We know that it has been less than a month and the Kirk interview is like a week old. To be listening to poo-pooing from Kirk, who has a vested interest AGAINST this design, is about as reliable as listening to doomsayers who had one month with a CELL.

Needless to say it is not worth commenting on--ESPECIALLY because it is in the same article where he blows the difficulties of MSAA+HDR out of proportion.

Basically, you have to read EVERYTHING Kirk says from the NV standpoint. Kirk has a long history of poo-pooing on ATI. Just read many of his older interviews. The fact is that his competitor is not as stupid as he often seems to paint them.

You'd be giving up one of Xenos's greatest strengths - the greater utilisation across two tasks, vertex and pixel - while also eating into your available CPU power. If devs in the typical case felt they had to do that, it would make me wonder why, and how this could be a more desireable situation than taking advantage of the benefits Xenos is supposed to bring.

While there is no point commenting on Kirk's make-believe developer friend's issues (funny how this person is reporting back to NV and not ATI! hahaha), I would note that IF a developer is using the CPUs for some vertex work, it is not as bad as it sounds because 1.) it was designed with that in mind and 2.) that would mean having all 48 ALUs dedicated to pixel shading.

While I think we can totally disregard what Kirk says here, in the end Kirk is only highlighting a strength of the system:

• It can act as a traditional GPU where Xenos handles PS and VS.
• Xenos can act as a dedicated PS unit and use Xenon to generate vertex data to maximize the visual look.

That versatility, the same versatility that many PS3 fans are excited about with CELL<>RSX, is designed into the 360.
 
Shifty Geezer said:
Acert93 said:
The major drawbacks from unified shaders is load balancing and transistor realestate. There is nothing intrinsically stating that a unified shader ALU would need to be slower/less effecient than a dedicated ALU.

Though I agree with your post in the main, has it not been stated lots of times that a unified shader is not as efficient at a task as a specialised shader for that task? There have never been any associated figures for how much, but as I figure it, Unified Shaders are balanced with

Actually, Kirk is the only one in the past who I have heard harp on U.S. efficiency. He also complained about load balancing and how it was almost impossible.

While Kirk has changed his tune (he is now on record stating they WILL do a U.S. part in the future), the fact remains: ATI obviously has overcome these issues.

Just like ATI does not seem to have issues with FP16+MSAA.

Kirk is an expert in his company and his design. But he also works PR and is not familiar with all of ATI's technologies.

Anyhow, while I could be wrong, I have not read any primary data (or anything NOT from NV) indicating a U.S. ALU is, or must be, less efficient/slower.

I am more than happy to concede that a U.S. array will be less efficient in transistors.

I don't think the advantage or disadvantage can be too much either way; if it were, either ATi wouldn't go near unified coz it's rubbish, or they'd have gone unified with everything and so would nVidia. Given there's no rush to introduce unified shaders to PC GPUs, ATi can't see it as being amazingly more powerful than conventional systems.

ATI actually was going to bring a U.S. product to market (R400) and from what I have heard from others here it was a "debacle". The gist I was told was that the design was too feature-heavy but did not have the right balance of performance for the market.

Which makes sense. As chips get larger they are able to add features that, two years previous, were unthinkable. Think about adding 20M transistors for video decoding (and then having it not work!) on a 50M-transistor GPU. It would NEVER happen!

Obviously in ATI's design there is the arbitrator, a hardware tessellator, and shader ALU units that are larger than dedicated units. That could have been a bad combo on a smaller process.

Similarly, the GPU market has a long history of feature-rich cards that underperformed until the sequel. R300 is basically the first GPU to introduce new features that were actually usable in games.

It sounds like, as the competition between ATI and NV got hot and heavy, ATI could not risk offering a feature-rich card that would underperform relative to the market. So they went with the R420 design.

Two other points to add.

1. You are correct, traditional designs are not "bad" nor are they dead. The fact NV is sticking with it says a lot. They are confident that there is a LOT more power to get out of this design.

2. The lack of a unified shader architecture on the market is also in large part due to the Longhorn delays. One of the BIG selling points of U.S. units is that there is a unified language and you can pass information between VS units and PS units back and forth. All of this requires a new API, yet MS is taking their sweet time with Longhorn. So even *if* U.S. units were superior, it would be tough to overlook the API issues.

But looking forward it seems everyone, even NV, sees that a U.S. architecture is the future. As chips get larger and have more pipelines/ALUs, it just makes too much sense.

Think of a 100-ALU GPU. With the 16:6 ratio of the NV40/R420 we are looking at:

Traditional Design: 72 PS units, 28 VS units
Unified Design: 100 flexible ALUs*

Obviously having 100 flexible units means you can knock out a VS-heavy scene about 4x faster. It also means you can really chew through PS-heavy tasks quickly as well. You have fewer units sitting idle. Obviously this means the GPU can handle not only different parts of games better, but it can also handle differently designed games. What if someone designs a very vertex-shader-heavy game with very minimal pixel shading? Or more realistically, such a card would be extremely useful in the CAD market.

*Of course there would be fewer U.S. ALUs because they ARE less efficient in transistor space.
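To put rough numbers on that "about 4x" claim, here is a back-of-envelope sketch (purely illustrative: made-up workload figures, an idealised unified pool that balances perfectly, and dedicated units that simply go idle when their side of the work runs out):

```python
# Toy comparison of a traditional 72 PS / 28 VS split versus 100 unified ALUs.
# Workloads are arbitrary "ALU-cycles"; the unified pool is assumed to
# load-balance perfectly, while dedicated units idle once their work is done.

def frame_time_dedicated(vertex_work, pixel_work, vs_units=28, ps_units=72):
    # Each pool runs independently; frame time is set by the slower one.
    return max(vertex_work / vs_units, pixel_work / ps_units)

def frame_time_unified(vertex_work, pixel_work, total_units=100):
    # A perfectly balanced unified pool just chews through the total.
    return (vertex_work + pixel_work) / total_units

for vtx, pix in [(90, 10), (50, 50), (28, 72), (10, 90)]:
    d = frame_time_dedicated(vtx, pix)
    u = frame_time_unified(vtx, pix)
    print(f"vertex {vtx:>2} / pixel {pix:>2}: dedicated {d:.2f}, "
          f"unified {u:.2f}, speedup {d/u:.1f}x")
```

With a workload that happens to match the 28:72 hardware split the two come out even; the further the mix drifts toward vertex-heavy, the closer the unified pool gets to that roughly 3-4x figure.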

Of course that just rehashes some of the benefits of the design that we all know. But it seems even now NV has conceded that unified shaders are in their future as well.

But that has not stopped them from trashing it in the past OR present. Just like they were trashing MSAA + HDR ;)

Ohhh and just like ATI talked down SM 3.0.

It DOES go both ways.
 
Acert93 said:
Titanio said:
I think in the general case it's fairly reasonable to expect that a more general unit will lose some performance versus a dedicated one with specific tasks, no?

No, I don't agree with that.

I think at this point it is a huge assumption to say "a more versitile ALU is a less effecient ALU".

The general trend is increased generality = less performance for a specific task.

Vertex work and pixel work are not yet exactly the same. With a dedicated piece of hardware you can make assumptions and optimisations that you can't on a piece of hardware that has to handle more than one task.


Acert93 said:
Further, ATI is suggesting that this design, overall, is far superior in effeciency than a traditional design. Basically you get more out of this design than you would out of a traditional design per-transistor.

With arbitrary instruction mixes, perhaps. But if your mix typically maps well to a dedicated architecture's built-in "proportions", so to speak..

Also, I don't know about "far more" utilisation when comparing to a closed architecture. Devs can build for a specific chip, with a specific ratio of pixel:vertex shaders, something you can't do on PCs where this varies. So I think it's somewhat more necessary in PCs. Even with ATi's own comparison to PCs, their quoted high end in terms of utilisation was 70%. Where might the high end be with a dedicated architecture in a console? Higher still, I'd think. Compared to the claimed 95% efficiency, the gap may not be all that big with a dedicated chip in a closed system (far more?).
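As a quick illustration of what that utilisation gap is actually worth, here is a tiny hypothetical calculation (the same ALU count is assumed on both sides purely for simplicity, which ignores the transistor-cost differences discussed above; the percentages are just the quoted or assumed figures, not measurements):

```python
# Hypothetical effective-throughput comparison: 48 unified ALUs at a claimed
# 95% utilisation versus a dedicated design at various utilisation levels.
unified = 48 * 0.95              # ~45.6 "effective" ALUs
for util in (0.70, 0.80, 0.90):  # 70% is ATI's quoted PC figure; higher is a closed-box guess
    dedicated = 48 * util
    print(f"dedicated at {util:.0%}: {dedicated:.1f} effective ALUs "
          f"vs unified {unified:.1f} ({unified / dedicated - 1:+.0%})")
```

If a dedicated chip in a closed box really does run closer to 90% than 70%, the headline advantage shrinks from roughly a third to single digits, which is the point being made here.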

Acert93 said:
Remember, MS could have easily chosen to go with a standard GPU design. Kirk's rhetoric aside, there must be benefits to the design. From a common sense standpoint unified shaders should perform better in more situations, different game genres, and make more effecient use of the transistors (i.e. more transistors will be active more often, instead of PS sitting idle during VS intense scenese/games and vice versa).

No doubt, but remember that the "fixed" ratio in dedicated hardware is not arbitrarily chosen. There is a common case it's aimed at.

A little side note, but interesting to ponder: consider what greater utilisation with "out there" instruction mixes is actually going to buy you frame-to-frame.

Acert93 said:
MS had the oppurtunity to evaluate traditional hardware (R300, R420, R520) and ATI's new stuff (R400 and ATI's WGF 2.0 projections). I think it is awefull pessimistic to assume MS chose the worse of the two options.

We can't assume MS's only primary concern was sheer performance. Obviously Xenos is designed around flexibility and developer ease, which suggests other priorities alongside good performance. They were also dealing with a smaller target (in terms of transistors) for the shader array than they might otherwise have had, and unified shaders may have offered the best solution given that - a way to get the most out of what they were working with (this is not to downplay its power at all, btw). So for their situation it may have been the best choice, and I'm sure it was, but that doesn't mean it offered the most performance, or will offer more performance than competing "traditional" designs.

As for the rest, I think we're crossing wires or something. Forget about Kirk, if you wish; his comments on that are not important to me or even to what I was talking about. I'm not saying anything either about using Xenon for some vertex work. However, the issue of using all of Xenos for pixel shading and letting Xenon handle all vertex work is what I was addressing, and I have very strong doubts about the desirability of that in the typical case.
 
Acert93 said:
Obviously having 100 flexible units means you can knock out a VS heavy scene about 4x faster. It also means you can really chew through PS heavy tasks quickl as well. You have less units sitting idle. Obviously this means the GPU can handle not only different parts of games better, but it can also handle differently designed games. What if someone designs a very vertex shader heavy game with very minimal pixel shading? Or more realistically, such a card would be extremely usefull in th CAD market.
On that point, though I agree, I do question its validity. Yes, US can bring all shaders to bear on pixel shading, but they need vertex data to work on. And they can all transform vertices, but they need pixel shading to amount to anything.

If you think of it like car manufacturing, Ford showed the difference between workers building every part and workers focussing on a single part. When workers handled manufacture from beginning to end, applying diverse skills, they made cars. When they specialised in only doing a specific part, improvements all along the line meant cars were made MUCH faster. I wonder how this relates to GPUs, or whether it does at all?

You will need to transform vertices, and shade them. You could transform all the vertices, save that data, then use the same shaders to pixel shade them. Or you can divide the shaders into those transforming vertices and those shading, working on the data as soon as they get it. This sets up a 'production run'.

The key strength of US isn't being able to do ALL of one or the other. It's being able to switch, so when the vertex data has swamped the pixel shaders due to complicated shader programs, those vertex shaders, rather than waiting, can chip in and help get through the backlog, and then pick up where they left off. And vice versa for pixel shaders.

That makes a lot of sense, but I wonder how often this happens? If it doesn't happen that much then I guess US is a waste. Do we have any profiles for existing GPUs and games to see how much time they spend sitting idle?
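No real profiles to hand, but a toy frame-by-frame model gives a feel for the question (everything here is invented purely for illustration: a hypothetical 28/72 split, a random vertex share per frame, and an idealised unified pool that never stalls):

```python
# Toy model (not real profiling data): how much does a dedicated VS/PS split
# sit idle as the vertex:pixel mix drifts from frame to frame?
import random

random.seed(1)
VS_UNITS, PS_UNITS = 28, 72           # hypothetical dedicated split
TOTAL = VS_UNITS + PS_UNITS           # unified pool of the same size

idle = []
for frame in range(1000):
    vtx_share = random.uniform(0.1, 0.5)    # invented per-frame vertex fraction
    work = 100.0                            # arbitrary ALU-cycles per frame
    vtx, pix = work * vtx_share, work * (1 - vtx_share)
    t = max(vtx / VS_UNITS, pix / PS_UNITS)  # frame ends when the slower pool finishes
    idle.append(1 - (vtx + pix) / (t * TOTAL))

print(f"average dedicated-pool idle time: {sum(idle) / len(idle):.0%}")
# An ideal unified pool would show ~0% idle in this toy model.
```

Whether real games drift that far from the hardware's built-in ratio is exactly the profiling question being asked, so treat whatever this prints as a what-if, not an answer.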
 
Good points.

Shifty, I think the specific example you raise is why there is the goal of going to long, much longer, shaders.

Because you are definitely right that you need vertex and pixel data to work on... but if your shader code starts getting to be, on average, hundreds upon hundreds of operations (the Luna demo has some 300+ shaders), then it makes more sense.

But yes, overall, the idea of unified shaders is to auto-load-balance. You probably won't see many extremes of 100% one or the other, but who knows, with much longer shader code we may begin to see such situations.

Which raises yet another reason why we WON'T see what these beasts can do until 2007.

Devs are so familiar with writing short shaders to compensate for the limited abilities of the DX7/DX8 era that it will take time to test the boundaries of the new hardware and then compile libraries of nice shader code to use in games.

Until then it will be a LOT of trial and error.
 
I think ATi already calculated the VS/PS ratio in most games that gave them a "sweet spot" as far as load-balancing is concerned. IIRC, this sweet spot can also be tweaked by developers. This part of the C1 design has been veiled in secrecy so details are scanty.

I think it all depends on how the API implements vertex and pixel shaders. Maybe there's some way they can converge the way the two use hardware and design the transistors around that. Maybe they already have. MS made the API, they wrote the compilers, they designed part of the GPU. I strongly believe that USA will make its PC debut when Longhorn WGF is introduced. Xbox 360 is the bridge between the past and the future. It's a training ground for the future of PCs; that being multi-core CPUs with GPU/NB hybrids using USA. Microsoft has the world's best software engineers. I firmly believe that hardware design follows software design, not vice versa. Maybe I'm biased because of my CS discipline, and you hardware engineers will probably get defensive :p, but that's what I believe. If MS actually created a more general way to implement VS and PS, then everything else is moot.
 
David Kirk mentioned that the shader implementation will be independent of the Longhorn API. It's only unified from a software standpoint and they're free to design the hw however they feel is most efficient.

I guess that would mean there will be limited control over shader balancing through the API.
 
Alpha_Spartan said:
I think ATi already calculated the VS/PS ratio in most games that gave them a "sweet spot" as far as load-balancing is concerned. IIRC, this sweet spot can also be tweaked by developers. This part of the C1 design has been veiled in secrecy so details are scanty.

I think it all depends on how the API implements vertex and pixel shaders. Maybe there's some way they can converge the way the two use hardware and design the transistors around that. Maybe they already have. MS made the API, they wrote the compilers, they designed part of the GPU. I strongly believe that USA will make its PC debut when Longhorn WGF is introduced. Xbox 360 is the bridge between the past and the future. It's a training ground for the future of PCs; that being multi-core CPUs with GPU/NB hybrids using USA. Microsoft has the world's best software engineers. I firmly believe that hardware design follows software design, not vice versa. Maybe I'm biased because of my CS discipline, and you hardware engineers will probably get defensive :p, but that's what I believe. If MS actually created a more general way to implement VS and PS, then everything else is moot.
But software can't extract things not laid out in hardware... you have a budget there.

How large a die area is taken up by the load balancer / arbiter / sequencer in Xenos? Was it hinted at somewhere? IIRC ATI said it's fast because it's implemented in hardware.
 
Acert, ATI talked down SM3.0 because truthfully there was nothing of value (Far Cry maybe?) that took advantage of it (is there anything else yet?), and anything that would, would be available in time for R520... I think ATI made the right call..
 
If you don't add the features, nothing's gonna be written to use them ;) How many games exist to use developer-aided load-balancing on unified shaders...

As long as no-one adds SM3.0 to their parts, no-one will write SM3.0 shaders.
 
blakjedi said:
Acert, ATI talked down SM3.0 because truthfully there was nothing of value (Far Cry maybe?) that took advantage of it (is there anything else yet?), and anything that would, would be available in time for R520... I think ATI made the right call..

Build it and they will come.

While SM 3.0 was a bullet point at the time, it still was something ATI talked down. That is separate from the fact that NV's implementation appears to offer only a marginal increase in performance in SM 3.0 games, and ATI already supports GI (one of the few SM 3.0 features... if I remember correctly FP16 is not even an SM 3.0 requirement).

We may find that ATI's SM 3.0 part is really a lot better... or we may find SM 3.0 is a little overblown. We don't know quite yet.

Shifty said:
As long as no-one adds SM3.0 to their parts, no-one will write SM3.0 shaders.

Exactly,

But with both consoles supporting SM 3.0 we should see a big shift. I suspect for the next 4-5 years SM 3.0 will be the baseline, starting Fall 2006. Sadly it looks like NV's NV30 really neutered PS 2.0's lifecycle :( Sure, it saw some support, but with only one generation of hardware from both IHVs providing solid PS 2.0 performance, and with the huge install base for SM 3.0, I think we will see most fall 2006 titles on SM 3.0 (with a PS 1.4 fallback?)
 
Acert93 said:
But with both consoles supporting SM 3.0 we should see a big shift. I suspect for the next 4-5 years SM 3.0 will be the baseline, starting Fall 2006. Sadly it looks like NV's NV30 really neutered PS 2.0's lifecycle :( Sure, it saw some support, but with only one generation of hardware from both IHVs providing solid PS 2.0 performance, and with the huge install base for SM 3.0, I think we will see most fall 2006 titles on SM 3.0 (with a PS 1.4 fallback?)

I disagree... unless you can back that up with numbers on who's using what ;) It just seems that with all those R3xx+ parts out there, there's enough weight to stick with SM2.0 as the baseline.

Plus, with UE3.0 having SM2.0 as the baseline or the fallback...

It's kind of a tricky situation. I mean, you've got Ubisoft's stance with Chaos Theory, but then you've got Epic's SM2.0 baseline with UE3.0, and many more developers have licensed that.
 
3roxor said:
(Killzone and Heavenly Sword were running at 5fps and sped up towards 60fps?)

The Heavenly Sword E3 demo currently runs at between 5 and 20fps (roughly). The army scenes are the 5 and the actual fight scene is roughly 20+fps.

With lots and lots of work still to do for the PS3, the target of 30fps seems achievable. It's fairly common, even without a new platform, for games to be under framerate at this stage of development.
 
Acert93 said:
So you have Kirk, who is anti-Unified Shader at this point, bemoaning it; and ATI, who is pro-Unified Shader, singing its praises.

Who to believe?

Actually, the most curious thing about that interview is that he was admitting they will have to do it; he was just downplaying it for now.

His comment on developers liking it raises the question of what he actually heard - did a developer say they didn't like it? In which case, why, as it's transparent to them; or did he just hear that some devs may use the CPU for VS work (which is something MS designed the CPU to be able to do), and he just put the rest together?
 