Generally, there's no need for branch prediction in a GPU. (Though GPUs and the driver software, together, can make use of hinted information about branches.)
This reflects the fact that a GPU's pipeline is quite different from a CPU's.
In general, a GPU runs one instruction repeatedly, clock after clock, say a thousand times, on (say) 4 or 16 pixels simultaneously. Only after all the pixels in the batch have run that instruction does the GPU move on to the next instruction in the program.
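Purely as an illustration of that ordering (the batch size, program length and execute_instruction function here are all made up, not any real GPU's internals), the execution order is instruction-major rather than the pixel-major loop you'd write on a CPU:

Code:
#include <stddef.h>

#define BATCH_SIZE 1000   /* made-up number of pixels in a batch */
#define NUM_INSTR  8      /* made-up shader program length       */

/* Stand-in for "run shader instruction i on one pixel's value". */
static float execute_instruction(size_t i, float value)
{
    return value + (float)i;   /* dummy arithmetic, just so this compiles */
}

/* Sketch of the execution order only: every pixel in the batch runs
 * instruction i before any pixel moves on to instruction i + 1. */
static void run_batch(float pixels[BATCH_SIZE])
{
    for (size_t i = 0; i < NUM_INSTR; ++i) {        /* one instruction...        */
        for (size_t p = 0; p < BATCH_SIZE; ++p) {   /* ...across the whole batch */
            pixels[p] = execute_instruction(i, pixels[p]);
        }
    }
}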
The GPU pipeline is there to organise the vast amounts of data that a GPU can crunch through - 10 to 100x what a typical CPU can do.
The relatively simple instruction set of a GPU, and the outwardly quite limited concepts that can be programmed in a shader language, mean that instruction decoding and branch prediction are essentially irrelevant side-issues in pipeline organisation (a generalisation, of course - they don't totally disappear).
When a branch occurs, the GPU has the entire remainder of the batch in which to organise the data/instructions to run next (e.g. if pixel 667 is the first pixel to branch then there are still roughly 333/16 = 20 clocks to go). Though the 1000th pixel might be the little bugger that branches first, in which case you'd get a glitch.
Predication is used in the GPU to differentiate which pixels run which instructions:
Code:
....if pixel is in shadow then
1001 make
1001 this
1001 pixel
1001 darker
....end if
So the first and last pixels (in a group of four, here - the 1s in the 1001 mask) run the code; the other two pixels failed the branch, so those instructions have no effect on them.
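In C-ish pseudocode (just a sketch of the idea, not how the hardware is actually wired, and the names are made up), predication amounts to always computing the branch body and letting a per-pixel predicate bit decide whether the result is kept:

Code:
#define QUAD 4   /* a group of 4 pixels, as in the example above */

/* Sketch of predication for the shadow example: the body is computed for
 * every pixel, and a per-pixel predicate decides whether the result is
 * written back. With in_shadow = {1, 0, 0, 1} this matches the 1001 mask. */
void darken_if_in_shadow(float colour[QUAD], const int in_shadow[QUAD])
{
    for (int p = 0; p < QUAD; ++p) {
        float darker = colour[p] * 0.5f;                 /* "make this pixel darker"  */
        colour[p] = in_shadow[p] ? darker : colour[p];   /* keep or discard per pixel */
    }
}

Nothing actually branches here: both outcomes exist and the mask merely selects which one each pixel keeps, which is part of why no branch prediction is needed.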
---
A CPU runs a different instruction (in general) on each successive clock. A CPU deals with relatively small amounts of data, but generally that data is a moderately disorganised slew scattered all across memory. Caching on modern CPUs takes care of that problem, but still leaves the problem of latency in fetching data.
One way a modern CPU pipeline deals with latency is by preparing alternative instructions (in a different hardware thread) to execute should the data not arrive on time.
The pipeline also has to deal with not knowing which instruction will execute after a branch in the code (a simple if...then...else, or a loop). The CPU can speculatively fetch its best guess (or, in some designs, both the branch-fail and branch-succeed paths) and have those instructions ready to roll, or even start executing them. That's the basic idea behind branch prediction and speculative execution. It's costly in terms of transistors on the CPU, but that's what you get for having such laggardly data flows.
A CPU's pipeline tends to be fairly long, 20 to 30 stages. If a branch prediction is wrong then the entire contents of the pipeline become worthless, and the program effectively stalls until instructions from the correct path have travelled the pipeline's full length and execution can get going again.
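As a rough illustration (a toy C sketch, not a measurement of any particular CPU): the branch in the first loop below is essentially a coin flip on random data, so the predictor is wrong about half the time and the pipeline gets flushed over and over; a branch-free version of the same loop gives the predictor nothing to get wrong.

Code:
#include <stddef.h>

/* Toy example: if data[] is random, this branch is unpredictable, so a
 * 20-30 stage pipeline pays the full flush penalty on roughly half the
 * iterations. Sort the data first and the same loop behaves far better. */
long sum_large_values(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (data[i] >= 128)          /* hard to predict on random data */
            sum += data[i];
    }
    return sum;
}

/* Branch-free alternative: the condition just selects a value, which the
 * compiler can typically turn into a conditional move instead of a jump. */
long sum_large_values_branchless(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (data[i] >= 128) ? data[i] : 0;
    return sum;
}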
Of course I'm talking in rough terms about AMD/Intel CPUs in PCs.
---
GPUs benefit from having a pipeline that's always "1 instruction" long, for any given program. Hence there's no need for branch prediction, as such.
(There is a time overlap in the pipeline, as the new instruction starts up while the old instruction is just finishing, so it's not technically 1 instruction long, but that's essentially how it appears.)
NVidia GPUs with dynamic branching have a 6 clock latency on executing a branch (it may be less now; it seems to have changed because of driver improvements). I can't for the life of me work out why, because it should be "hidden" by the 1-instruction pipeline. So I'm missing something there.
Jawed