ATI & Nvidia investigating dual core GPUs

Uttar said:
...
Most clueless post of the month, I think. This is from the same guy who started the "NVIDIA is going SLI because they can't compete" thread.
Every single one of his posts is a record of its own. I think some action might be in order here.

Uttar

You are entitled to think as you wish; I am not going for any kind of record, I only have seven posts. I do believe that Nvidia cannot compete with ATI in single-slot solutions anymore. I thought that this news would be interesting to some. I can see that you really stand behind Nvidia, and if I have offended you with my opinions then I am sorry. Take whatever action is appropriate, I am shaking in my boots.
 
Redeemer said:
I do believe that Nvidia cannot compete with ATI in single-slot solutions anymore.

Would you care to elaborate on why you think this at some point? In your thread titled as much you never really offered why you think so. I think most people view the 6800 and X800 lines as equally performant on current titles, though it remains to be seen how they do with future ones. In the middle and bottom segments Nvidia seems to be dominating, especially considering their expanded feature set. I offered my take that SLI is a bonus before the thread was locked.

I am not saying we should rehash it all here, and I am not sure why those hostile comments towards this thread were so quick to go up, but I would venture a guess that it has something to do with that last thread and you not defending a position or even explaining why you see it that way. Such posts are often just fuel for the fire. This one, however, seems perfectly legit to me. Just a rehash of an Inq story.
 
wireframe said:
In the middle and bottom segments Nvidia seems to be dominating, especially considering their expanded feature set.

That coupled with the PE's dominance over the Ultra makes this round a real slugfest. I think they pretty much stand even now but Nvidia has the momentum. Maybe he can foretell the future and knows Nvidia will have no answer to the R520.
 
What makes me believe that Nvidia cannot compete with ATI's single-slot solutions is the fact that there is no word on Nvidia's next generation of GPUs. We all know about the R520; it is due in a few months, but there is only smoke around Nvidia's challenge to ATI's R520. Which means that Nvidia might not have an answer for ATI this year except for SLI with tweaked 6800 Ultra GPUs. Of course this is just speculation that some have dismissed out of hand.
 
Fodder -

Not necessarily. Just give each of the N chips 1/N the amount you would give the single chip, and allow one chip to make requests of another chips memory (at some latency/bandwidth penalty).
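Just to make that concrete, here is a toy sketch; the chip count, capacities and latency figures are all invented for illustration, not anything either vendor has described:

```python
# Toy model of N chips each owning 1/N of the memory; a request to another chip's
# memory still works but pays a latency penalty.
NUM_CHIPS = 2
MEMORY_PER_CHIP = 256 * 2**20   # assume 256 MiB per chip, 512 MiB total
LOCAL_LATENCY_NS = 50           # assumed local DRAM access latency
REMOTE_PENALTY_NS = 150         # assumed extra cost of crossing the inter-chip link

def owning_chip(address: int) -> int:
    """Map a flat address to the chip whose local RAM holds it."""
    return (address // MEMORY_PER_CHIP) % NUM_CHIPS

def access_latency_ns(requesting_chip: int, address: int) -> int:
    """Latency a given chip sees when reading the given address."""
    if owning_chip(address) == requesting_chip:
        return LOCAL_LATENCY_NS
    return LOCAL_LATENCY_NS + REMOTE_PENALTY_NS

print(access_latency_ns(0, 0x0100_0000))   # chip 0 reading its own memory: 50 ns
print(access_latency_ns(0, 0x1100_0000))   # chip 0 reading chip 1's memory: 200 ns
```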
 
Redeemer said:
What makes me believe that Nvidia cannot compete with ATI's single-slot solutions is the fact that there is no word on Nvidia's next generation of GPUs. We all know about the R520; it is due in a few months, but there is only smoke around Nvidia's challenge to ATI's R520. Which means that Nvidia might not have an answer for ATI this year except for SLI with tweaked 6800 Ultra GPUs. Of course this is just speculation that some have dismissed out of hand.

Any random three-month snapshot can give a "missing the forest for the trees" view.

Certainly there is a *bit* to be at least moderately concerned about NV-ways. The last major process shift was damn cruel to them, and they are behind ATI on this one. Does that mean this one will be "damn cruel" to them as well? Not necessarily. But anyone who thinks back to the predictions for NV30 in the spring of '03 has a right to feel at least a little queasy seeing NV behind again this time, with still enough lead time to the target date to slip even further. And Jen-Hsun specifically side-stepped an opportunity on their conference call to say this process shift won't be as hard as that one (not that I blame him; he earned his merit badge on "watch your mouth on process changes" the hard way).

But "can't compete" on those grounds is certainly premature.
 
Perhaps another benefit of having stackable or dual core GPUs is a consolidation of your whole line of chips. Flawed but workable single cores could be used for your lower budget line, single cores for your middle price line, and then the dual core breed and up for your higher end products. Since all the cores are virtually the same, your drivers would be a lot easier and cheaper to maintain, hopefully with much better stability. The board design for each category could be very similar, with similar components except for additional lanes or paths on your higher end products, thus possibly making the manufacturing end easier and more profitable.

The next generation wouldn't have to be a leap from a 300 million transistor design to a 600 million one and so on, but more like going from a 200 million transistor stackable design to 300 million, then to 400 million, and so on. I would say designing a less complicated but stackable chip would be easier than designing one huge chip.

It makes sense to me to start breaking the core down into units instead of just making bigger and bigger cores, especially if you reach a point where the complexity of a single core is just too much to handle.
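To make the consolidation idea concrete, here is a toy binning sketch; the quad counts, defect rules and tier names are all invented for illustration, not anything either vendor has described:

```python
# Hypothetical binning for a stackable-core product line. Each core is assumed to
# have 4 pixel quads; a core counts as fully working only if all 4 survived.
def bin_package(working_quads_per_core: list[int]) -> str:
    """Assign a package to a product tier based on how much of it works."""
    fully_working = sum(1 for quads in working_quads_per_core if quads == 4)
    if fully_working >= 2:
        return "high end (dual core)"
    if fully_working == 1:
        return "mid range (single core)"
    if any(quads >= 2 for quads in working_quads_per_core):
        return "budget (partially disabled core)"
    return "scrap"

print(bin_package([4, 4]))   # high end (dual core)
print(bin_package([4, 1]))   # mid range (single core)
print(bin_package([2]))      # budget (partially disabled core)
```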
 
I guess I'm more ignorant about the dual core concept than I thought. I thought that dual core CPUs are still manufactured on (as?) one die, whereas it appears these dual core GPUs are almost like SLI'ed separate chips. Are dual core CPUs also separate dies integrated into a common package, or like the Cell (a single "CPU" as an aggregate of smaller, more competent subunits)?

I'm not sure how the former translates to more efficient yields, as both ATi and nV are selling partially faulty GPUs as lower SKUs (6800s from 6800GTs/Us, 6200s/6600s from 6600GTs, X800Ps from X800XTs, etc.). I'm guessing the overall yield of sellable NV43-sized chips is higher than that of NV41- or NV40-sized ones, even taking into account functionality binning?

Is this b/c GPUs are moving away from the pipe and even quad structure, so defects aren't as easily compartmentalized?
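For what it's worth, a rough back-of-the-envelope with the classic Poisson yield model (all the numbers below are assumed) shows where a small-die advantage could come from: each small die can be tested before packaging, so good dies can be paired up freely, whereas a single defect writes off an entire big die.

```python
# Rough Poisson yield sketch, Y = exp(-defect_density * area), with assumed numbers,
# comparing one big die against a package built from two individually tested
# half-size dies.
import math

D = 0.5                       # assumed defects per cm^2
BIG_AREA = 3.0                # assumed area of one big monolithic die, in cm^2
SMALL_AREA = BIG_AREA / 2     # half-size die
BIG_DIES_PER_WAFER = 100      # assumed wafer capacity, measured in big-die slots

y_big = math.exp(-D * BIG_AREA)      # ~22% of big dies come out defect-free
y_small = math.exp(-D * SMALL_AREA)  # ~47% of half-size dies come out defect-free

good_big = BIG_DIES_PER_WAFER * y_big            # ~22 sellable big-die parts per wafer
good_small = 2 * BIG_DIES_PER_WAFER * y_small    # ~94 good half-size dies per wafer
dual_packages = good_small / 2                   # pair up any two good dies: ~47 parts

print(f"big-die parts per wafer:     {good_big:.0f}")
print(f"dual-die packages per wafer: {dual_packages:.0f}")
```

Of course this says nothing about the functionality binning you mention, which salvages partly defective dies in both scenarios; the sketch only shows the mix-and-match effect of testing small dies before pairing them.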
 
Pete - for the most part they (dual core CPUs) have been announced to be on one die. This should give better performance than placing 2 dies on a package.

However, what I'm speculating is that if you are constrained by yield or manufacturing ability, then going with multiple "more feasible" chips (in a package or on a board) is a good solution.

I see multi-GPU setups as a means to extend maximum performance possible in a given GPU generation - I think this will become more important as GPU design cycles lengthen.
 
Multiple core GPUs may be referring to the long term direction that 3d graphics processing seems to be taking. This has more to do with the markets and companies that address those markets than anything else.

CPU manufacturers address the broad corporate and home markets.
Their focus is on a single solution with broad application and low cost.

GPU manufacturers address the 3d entertainment and professional markets. 3d graphics and physical simulation are computation bound. This market has always been driven by floating point performance. Initially, 3d vendors focused on the video and rendering aspects of 3d graphics, since it was the most immediate problem to be solved and lent itself to solutions using low precision fixed point. However, in the long run, it will be about solving real-time 3d physical simulation in general.

Since GPU manufacturers target this market, they are the most motivated to design the solutions to its challenges. Initially, it was more frame buffer performance, then more color precision, then anti-aliasing, then vertex processing, then pixel shaders and shadow processing, etc.

In the long run it will be about providing the highest general purpose programmable floating point performance coupled with some specific rendering and display hardware.

CPU manufacturers have not targeted this market, so they will not likely be the vendors that solve the challenges it presents. General purpose, programmable, very high performance floating point will likely move to the GPU, since those companies are targeting the markets that require it.

To meet these needs in the long run, GPU manufacturers will likely adopt some of the techniques employed by CPU manufacturers, with the use of more dynamic logic, higher frequencies, on-chip caches, general purpose registers, etc. The GPU is therefore likely to consist of several cores: a rendering core handling the raw rendering and display; a specialized graphics core handling 3d-graphics-specific tasks optimized in hardware, such as hidden surface removal and anti-aliasing; and a general purpose set of high frequency floating point processors with a very large number of concurrent ALUs and their own on-chip cache. All this with a memory bandwidth architecture to match.

This is all pure speculation, but it seems to be the way things are heading.
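Just to write that speculated split down concretely (every name and number below is invented, this is nothing more than the paragraph above restated):

```python
# Purely speculative sketch of the three-way core split described above.
from dataclasses import dataclass

@dataclass
class RenderCore:
    """Raw rendering and display: blending, output, scan-out."""
    rop_units: int = 16

@dataclass
class GraphicsCore:
    """3d-specific tasks in hardware: hidden surface removal, anti-aliasing."""
    hierarchical_z: bool = True
    aa_samples: int = 6

@dataclass
class FloatCore:
    """General purpose, high frequency floating point array with its own cache."""
    concurrent_alus: int = 48
    cache_kib: int = 256

@dataclass
class SpeculativeGPU:
    render: RenderCore
    graphics: GraphicsCore
    compute: FloatCore

print(SpeculativeGPU(RenderCore(), GraphicsCore(), FloatCore()))
```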
 
Jawed said:
If we consider the concept of a unified shader design with a farm of "ALUs" assigned work units (shader code) against a pool of pixels/vertices, would this be a strong argument against a dual-GPU graphics card?

I'm thinking that latency across multiple GPUs' discrete pixel/vertex pools becomes prohibitive, or that the architectural benefits of using a farm<->pool architecture are heavily depleted by halving (at least) the possibility that an ALU can work on a pixel/vertex. i.e. if an ALU in the farm becomes "free" to work on a pixel, but the pixel that's "ready" at that moment is in the other GPU's pool, then you've lost some of the benefits of the unification.

Alternatively, if you consider that the unified model is designed to hide the latency of memory, and that a moderate increase in latency caused by a multi-GPU architecture can be overcome by a small increase in the overall capacity of the farms and pools, then maybe multi-GPU isn't fatal.
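A toy model of that trade-off, with the costs and the 50/50 split of ready work entirely made up, just to put a number on how often the "free ALU, but the ready pixel is in the other pool" case bites:

```python
# Tiny simulation: a freed-up ALU prefers work from its own GPU's pool, pays extra
# to reach into the other GPU's pool, and stalls if neither has anything ready.
import random

random.seed(0)
LOCAL_COST, REMOTE_COST, IDLE_COST = 1, 3, 1   # assumed relative costs in cycles

def freed_alu_cost(local_ready: bool, remote_ready: bool) -> int:
    """Cycles a freed-up ALU spends before it gets useful work done."""
    if local_ready:
        return LOCAL_COST
    if remote_ready:
        return REMOTE_COST        # crossing to the other GPU's pool
    return IDLE_COST              # nothing ready anywhere: stall

costs = []
for _ in range(10_000):
    local = random.random() < 0.5     # assume a 50% chance our own pool has ready work
    remote = random.random() < 0.5
    costs.append(freed_alu_cost(local, remote))

print(sum(costs) / len(costs))        # ~1.5, versus 1.0 if all work sat in one shared pool
```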

Perhaps a single shared pool between multiple cores would be required. Blimey, what kind of memory controllers are you talking about then?

It's also interesting to think about whether ATI's "super-tiled" (R300) architecture naturally progresses (by way of increased granularity) into a unified architecture. This transition seems to require an increase in granularity in both functionality and time. If super-tiling provides such a neatly load-balanced approach to a multi-GPU architecture, then I suppose a unified architecture would follow quite smoothly, being nothing more than a finer-grained version of the super-tiled architecture.

Well, I expect I'm talking to myself...

Jawed
What if one of the cores were used as a geometry engine/vertex shader, and the other as a fragment/pixel pipeline? If you work under a unified architecture you don't need to design and make two different cores. Is it possible? And can you create a link between the cores that would be fast enough to make such a thing work?
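Rough numbers (all assumed) for the link such a split would need, since every post-transform vertex has to cross from the geometry core to the pixel core:

```python
# Back-of-the-envelope link requirement; both figures below are assumptions.
vertices_per_second = 600e6    # assumed peak vertex rate for a high-end part
bytes_per_vertex = 64          # position plus a handful of interpolated attributes

link_bytes_per_second = vertices_per_second * bytes_per_vertex
print(f"{link_bytes_per_second / 1e9:.0f} GB/s")   # ~38 GB/s
```

That is in the same ballpark as the local memory bandwidth of a current high-end board, which is why the speed of the inter-core link is exactly the right thing to ask about.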
 
Redeemer said:
What makes me believe that Nvidia cannot compete with ATI's single-slot solutions is the fact that there is no word on Nvidia's next generation of GPUs. We all know about the R520; it is due in a few months, but there is only smoke around Nvidia's challenge to ATI's R520. Which means that Nvidia might not have an answer for ATI this year except for SLI with tweaked 6800 Ultra GPUs. Of course this is just speculation that some have dismissed out of hand.

So, you would not put any weight behind an interpretation that Nvidia is taking it easy because they only need to add a few pipelines and turn up the clocks to compete? That this is not as exciting sounding as a "whole new architecture" and that if you look at it from another angle you see that ATI is just playing catch up? Nvidia may be in a comfortable position where they don't want to reveal what they are doing because they have nothing to gain. Perhaps they are working under the assumption or knowledge that R520 is 24 pipelines, ~500MHz, 1200MHz effective memory, and approximately the same feature set as NV40. Why would they want to reveal any big changes when this would only mean nudging up the NV40?

An Asus rep at CeBIT made the strange statement (strange, I thought, for a company selling both Nvidia and ATI products) that their dual 6800 Ultra will remain the fastest solution for the next 12 months, even after the launch of the R520 or any other discrete board.

I think the R520 has reached mythical proportions on forums because it has been bumped around in the roadmaps. I think R520 is nothing much like the R400 (which was "too advanced") that people have been following rumors about. I think people are just holding on to that "too advanced" too much. Maybe back then, but a lot has changed. This is not to say that I don't think R520 will be a great piece of equipment. It just seems to me that people are expecting a little bit too much out of it straight away.
 
Yes, it's bad company strategy to let details of your next-generation products leak out too soon, since that stops sales to enthusiasts who must have the best dead in their tracks.

Tom
 
wireframe said:
I think the R520 has reached mythical proportions on forums because it has been bumped around in the roadmaps. I think R520 is nothing much like the R400 (which was "too advanced") that people have been following rumors about. I think people are just holding on to that "too advanced" too much. Maybe back then, but a lot has changed. This is not to say that I don't think R520 will be a great piece of equipment. It just seems to me that people are expecting a little bit too much out of it straight away.

Hey, as long as it's twice as fast as the X850XTPE, supports SM3, and costs about $500, it'll be good enough for me.

I'm expecting NVidia to have a competitive part this summer, too. The mobile 6800 Ultra Go (or whatever it's called) has received far too little attention so far. This part appears to offer truly sensational performance, but there's been no explanation as to why.

Jawed
 
The point is that it probably won't be twice as fast as the X850XTPE, as a few sources have said. The doubling of performance typically comes every two years, so a 6800U SLI setup (when it works) should be top dog until R600 and whatever NV50/60 is now called. The X series is just a year old, so I wouldn't expect miracles. Marc at Hardware.fr just pegged the next gen to hit around May/June and be 30-50% faster. That sounds reasonable to me just a year out from the apparently low-yield X800 and 6800.
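Quick sanity check on those figures, assuming the "doubling every two years" rule of thumb holds:

```python
# One year of a two-year doubling cadence implies a gain of 2**(1/2) - 1,
# i.e. roughly 41%, which sits right inside the 30-50% range quoted above.
print(f"{2 ** 0.5 - 1:.0%}")   # ~41%
```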
 
Fodder,

Well - if you tile the frame buffer and z/stencil buffers across chips then you can guarantee that access to them always goes to local RAM. For uploaded geometry, you could process multiple batches of geometry in parallel: upload different models in round-robin fashion to the GPUs and then attempt to process geometry from local memory as much as possible. Programs, I assume, are relatively small, so you could probably just duplicate them in each GPU's local RAM. That leaves textures - I don't know how much of an increase in latency non-local memory access would incur, but you're right, it would require even higher latency tolerance from the GPUs, and duplicating them across each GPU's RAM is a gigantic waste of space.

One thing that might help minimize non-local texture access is to break textures up into nxn tiles and have each GPU compete for tile ownership on a frame by frame basis.
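A minimal sketch of both ideas; the tile size, GPU count and ownership rules below are all assumptions made up for illustration:

```python
# Frame buffer and texture space are cut into fixed-size tiles; each tile is owned
# by one GPU, so most accesses stay local and only the leftovers cross the link.
TILE = 32          # assumed tile edge in pixels/texels
NUM_GPUS = 2

def framebuffer_tile_owner(x: int, y: int) -> int:
    """Checkerboard ownership of frame/z/stencil tiles keeps those accesses local."""
    return ((x // TILE) + (y // TILE)) % NUM_GPUS

def texture_tile_owner(tex_id: int, u_tile: int, v_tile: int, frame: int) -> int:
    """Per-frame reshuffle of texture tile ownership, standing in for the GPUs
    'competing' for tiles from frame to frame."""
    return hash((tex_id, u_tile, v_tile, frame)) % NUM_GPUS

# The GPU that owns a frame buffer tile shades its fragments; any texture tile it
# needs but does not own is fetched over the inter-chip link at higher latency.
print(framebuffer_tile_owner(100, 100))
print(texture_tile_owner(tex_id=7, u_tile=3, v_tile=1, frame=42))
```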
 