I still don't understand large parts of it, but from what I gather I can safely say that you're wrong about a number of things, and it's probably pointless to highlight them, as the language barrier may be a tough one to break. Ask yourself this: what would be the inherent reason for a huge performance jump in DX10 for the R600, allowing it to surpass the G80 (aside from the "NV30 sucked at DX9" party line)? What are the huge differences that make the jump from DX9 code to DX10 code an occasion for R600 muscle flexing? What makes you think that the 320 stream processors (omg, big number) are underutilized ATM?
The problem is how ATI is advertising the HD-2900XT: there are not actually 320 stream processors on that chip. There are only 64 real processors, but each is capable of 5 operations per shader clock. The 320 individual stream processing units in R600 are arranged in 4 SIMD arrays of 80 units each, and each functional unit is a 5-way superscalar shader processor. First, most of the stream processors are simpler and aren't capable of special function operations. In every block of five stream processors, only one can handle either a special function operation or a regular floating point operation; the other four handle only regular operations. The special function stream processor is also the only one able to handle integer multiply, while the others can perform simpler integer operations. Second, each of the five stream processors in a block must run instructions from one thread, so all five are busy only when that thread has five independent operations to issue.
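To make the "one thread per block of five" point concrete, here is a rough sketch in Python. It is a toy model and not AMD's real shader compiler: the greedy packing rule, the function names, and the op lists are all my own assumptions for illustration. It only shows how a chain of dependent operations leaves most of the five slots empty.

```python
# Toy model of one R600 shader block: 5 ALU slots per clock, all fed from a
# single thread. The packing rule and all names here are illustrative
# assumptions, not the actual hardware scheduler or compiler.

SLOTS_PER_UNIT = 5   # five stream processors per block (from the post)
UNITS = 64           # 64 blocks * 5 slots = 320 stream processors


def pack_bundles(ops):
    """Greedily pack ops into bundles of at most 5 independent operations.

    `ops` is a list of (name, deps) tuples in program order, where `deps` is
    the set of op names whose results must come from an earlier bundle.
    Returns a list of bundles, each a list of op names.
    """
    done = set()
    pending = list(ops)
    bundles = []
    while pending:
        bundle = []
        for name, deps in pending:
            if len(bundle) == SLOTS_PER_UNIT:
                break
            # An op may issue only if everything it needs finished earlier.
            if deps <= done:
                bundle.append(name)
        if not bundle:
            raise ValueError("dependency can never be satisfied")
        done.update(bundle)
        pending = [(n, d) for n, d in pending if n not in done]
        bundles.append(bundle)
    return bundles


def utilization(bundles):
    """Fraction of ALU slots actually filled across all bundles."""
    used = sum(len(b) for b in bundles)
    return used / (len(bundles) * SLOTS_PER_UNIT)


# Five independent ops fill one bundle completely.
independent = [(f"op{i}", set()) for i in range(5)]

# Five ops where each needs the previous result: one op per bundle.
chain = [("op0", set())] + [(f"op{i}", {f"op{i - 1}"}) for i in range(1, 5)]

print(utilization(pack_bundles(independent)))
print(utilization(pack_bundles(chain)))
```

Real shader code sits somewhere between these two extremes, which is why the 320 figure alone says little about delivered performance.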
Although the unified shader concept is similar between the two cores, the way they go about presenting this functionality is a bit different. Whereas the G80 has 128 aptly-named Unified Shaders, the R600 has 320 Stream Processors. Clearly 320 is a bigger number than 128, but as we know in the hardware world, bigger numbers don’t always mean something is better. The fact of the matter is that Stream Processors are different from Unified Shaders. ATI’s Stream Processors are an integral part of the superscalar architecture implemented on the R600. There are 320 of those processors on the R600, but some of them are standard ALUs and some of them are special-function ALUs.
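As a back-of-the-envelope check on the 320 versus 128 comparison, here is a small Python sketch. The clock figures are my assumptions: 742 MHz for the R600 core (the HD 2900 XT reference clock) and 1.5 GHz for the G80 shader domain, which is the upper figure quoted in this post. It counts only one MAD per ALU per clock and ignores everything else (texturing, special functions, memory), so treat it as rough arithmetic and not a benchmark.

```python
# Rough peak-rate arithmetic. Clock values are assumptions for illustration.

def peak_mads_per_second(alus, clock_hz, utilization=1.0):
    """Peak MAD rate: ALU count * clock * fraction of ALU slots kept busy."""
    return alus * clock_hz * utilization


R600_ALUS = 320          # 64 blocks * 5 slots (from the post)
G80_ALUS = 128           # 8 groups * 16 scalar SPs (from the post)
R600_CLOCK_HZ = 742e6    # assumption: HD 2900 XT reference core clock
G80_CLOCK_HZ = 1.5e9     # assumption: upper shader clock quoted in the post

r600_peak = peak_mads_per_second(R600_ALUS, R600_CLOCK_HZ)
g80_peak = peak_mads_per_second(G80_ALUS, G80_CLOCK_HZ)

# Slot utilization the R600 would need just to match the G80 figure.
break_even = g80_peak / r600_peak

print(f"R600 peak: {r600_peak / 1e9:.1f} GMAD/s")
print(f"G80 peak:  {g80_peak / 1e9:.1f} GMAD/s")
print(f"R600 break-even utilization: {break_even:.2f}")
```

Under these assumptions the R600 has to keep a bit more than four of every five slots busy to match the G80's peak, so the bigger number only wins when the shader code packs well.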
In contrast, NVIDIA's G80 has up to 8 groups of 16 (128 total) fully generalized, fully decoupled, scalar stream processors, and keep in mind that the SPs in G80 run in a separate clock domain and can be clocked as high as 1.5GHz. In ATI's R600, each functional SP unit can handle 5 scalar floating point MAD instructions per clock, and one of the five shader processors can also handle transcendental operations. Each shader processor also has a branch execution unit that handles flow control and conditional operations, plus a number of general purpose registers to store input data, temporary values, and output data.
In conclusion, R600 still has a chance: we have to wait for driver updates from AMD and see how DX10 path code handles the R600 architecture. I could be wrong about R600's future; in some ways R600 is still a failed design, and I will accept that.