So.. What will Nvidia Bring to Counter the R520?

martrox said:
There really isn't a good reason for nVidia to do anything.
There are plenty of reasons for NV to do a new high-end chip, as there always were. Price/performance ratio is what counts, not raw performance. 6800U SLI will most probably beat R520 in all benchmarks, but it will cost twice as much, and on top of that come SLI's numerous bugs, an 800W power supply, a very expensive motherboard, and the higher noise of an SLI setup. Plus there is ATI's AMR, which could match SLI in performance gains. Enough reasons, I think.
 
martrox said:
My point with .11u is that nVidia will get little if any added core speed out of it. Their product line is still very strong and competitive. While ATI does have higher performing models in the high end, nVidia can always counter that they, via SLI, have the performance crown. Not counting the hefty cost of entry, few can argue with this. There really isn't a good reason for nVidia to do anything. Anyone here think a 520 will blow away an SLI 6800GT/Ultra from a "best 15 benchmarks and games as presented by damn near every hardware site" point of view?


I think you are assuming that IBM's .13u process, which the NV40 is made on, is as good or advanced as TSMC's .11u and .13u lines. This might not be the case, and if NV had produced the NV40 at TSMC it might have been able to clock it higher, even on the regular .13u non-low-k lines. They probably opted not to because of their bad experiences with the NV3x series on .13u at TSMC. Seeing how NV is getting very good yields on .11u with a relatively high transistor count NV43 at 500MHz, they might indeed have a good shot at clocking a tweaked NV40 at the same core speed or possibly even a tad higher.
 
I was thinking that the most likely scenario is that both ATI and Nvidia will have 24 pixel pipes plus 8 vertex pipes in their spring/summer V|GPUs, as well as versions with 16 and 12 pixel pipes plus 6 and 4 vertex pipes, respectively. Kinda like how it is now with the various versions of R420 and NV4x.

would it be beyond reason that both companies also had 32 pixel pipe versions? people thought it was very unlikely that R420 would have 16 pixel pipes. most thought it would be 8 'extreme' pipes. maybe 12.

oops, forgot: 32 pixel pipes probably isn't worth it with only a 256-bit memory bus, so 24 pipes is probably the max.
 
I don't think we're yet at the transistor densities required for 32 pipelines and SM3+ hardware (which would be roughly 450-500 million transistors).
 
Chalnoth said:
Oh, they would have saved more than zero, that's for sure. But probably not that many.
FP16 is for data; all the hardware is FP32 anyway. There are some FP16-related areas in the pipeline control logic, but I doubt they amount to more than 1M transistors.
 
There's probably a bit more than that in the NV40, due to the NRM instruction, texture filtering, and blending.
 
Chalnoth said:
There's probably a bit more than that in the NV40, due to the NRM instruction, texture filtering, and blending.
Well, there is no FP32 NRM, texture filtering, or blending, so you can't really "save" transistors here; you can only cripple the chip's functionality. Cutting half of the pixel pipelines would "save" transistors in the same way.
 
FP32 NRM can be achieved in the ALUs, and in fact it is. The FP16 NRM is an extra, dedicated function, so that it can be achieved "for free" (or, at least, interleaved with other instructions operating on the ALUs).
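To make the "NRM in the ALUs" point concrete, here is a minimal sketch of a normalize written as three generic ALU steps (dot product, reciprocal square root, multiply), once at full precision and once with every intermediate rounded to half precision. The function names are made up for illustration, Python doubles stand in for the FP32 path, and nothing here claims to mirror how NV40 actually wires its FP16 NRM unit.

```python
import math
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def nrm_full(v):
    """Normalize as three generic ALU steps (stand-in for the FP32 path)."""
    d = v[0] * v[0] + v[1] * v[1] + v[2] * v[2]   # dp3
    r = 1.0 / math.sqrt(d)                        # rsq
    return [c * r for c in v]                     # mul

def nrm_half(v):
    """Same three steps, with every intermediate rounded to FP16."""
    v = [to_fp16(c) for c in v]
    d = to_fp16(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    r = to_fp16(1.0 / math.sqrt(d))
    return [to_fp16(c * r) for c in v]

def length(v):
    """Euclidean length, used to check how close a result is to unit length."""
    return math.sqrt(sum(c * c for c in v))

if __name__ == "__main__":
    v = [3.0, 4.0, 12.0]
    print("full precision length:", length(nrm_full(v)))
    print("half precision length:", length(nrm_half(v)))
```

The half-precision result is still close to unit length for a vector like this, which is roughly why a dedicated FP16 normalize is good enough for things like per-pixel normals.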
 
Chalnoth said:
I don't think we're yet at the transistor densities required for 32 pipelines and SM3 hardware (which would be roughly 450-500 million transistors...and that's just assuming nothing more advanced than NV4x).
If NV40 is 220M transistors, why would you expect more than twice that for twice the pipelines? I'd say it would be more like 350M transistors.
 
Slight Shift in Speculations...

Will the R520 be pure FP32 like the current core is pure FP24? Or will ATi pull off FP32/FP16 similar to what Nvidia is doing?

Will they offer FP16 blending? Or will blending operations be done at FP32, or through some interesting use of hardware to accomplish "the same thing as blending" without it actually being blending, a la Nvidia?
 
Will the R520 be pure FP32 like the current core is pure FP24? Or will ATi pull off FP32/FP16 similar to what Nvidia is doing?
With the shader speeds that we're talking about for the next generation, I would be *very* surprised if R520 had partial precision. Even more so given ATI's comments regarding _pp in the past...

As for blending, I have no idea. Isn't it a requirement of SM3.0, though?
 
The Baron said:
Will the R520 be pure FP32 like the current core is pure FP24? Or will ATi pull off FP32/FP16 similar to what Nvidia is doing?
With the shader speeds that we're talking about for the next generation, I would be *very* surprised if R520 had partial precision. Even more so given ATI's comments regarding _pp in the past...

As for blending, I have no idea. Isn't it a requirement of SM3.0, though?

It's not a requirement, but "it's on our own SM3 parts" according to Mr. Huddy :)
 
Hellbinder said:
Slight Shift in Speculations...

Will the R520 be pure FP32 like the current core is pure FP24? Or will ATi pull off FP32/FP16 similar to what Nvidia is doing?

Will they offer FP16 blending? Or will blending operations be done at FP32, or through some interesting use of hardware to accomplish "the same thing as blending" without it actually being blending, a la Nvidia?
Pure FP32 in the shader, FP16 blending I expect. Blending is too bandwidth hungry to be useful for FP32 yet.
The only beneficial way I see to do blending in a "non-standard" way would be to drop the color ROPs and do the blending calculation in the shader as a last step, being able to access the framebuffer there. But there are other drawbacks to this, and I don't really think it's worth it.
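A rough back-of-the-envelope sketch of why FP32 blending is so bandwidth hungry: a blended pixel has to be read from the framebuffer and written back, so the traffic doubles with the per-channel width. The 16 pixels per clock and 400 MHz below are illustrative placeholders, not a claim about any specific chip, and the model ignores compression, caching, and Z traffic.

```python
def blend_bytes_per_pixel(bits_per_channel, channels=4):
    """Framebuffer traffic for one blended pixel: read dst, then write result."""
    pixel_bytes = bits_per_channel // 8 * channels
    return 2 * pixel_bytes  # one read + one write

def blend_bandwidth_gb_s(bits_per_channel, pixels_per_clock, core_mhz):
    """Peak blend traffic in GB/s if every ROP blends on every clock."""
    bytes_per_clock = blend_bytes_per_pixel(bits_per_channel) * pixels_per_clock
    return bytes_per_clock * core_mhz * 1e6 / 1e9

if __name__ == "__main__":
    # Hypothetical part: 16 pixels/clock at 400 MHz (illustrative numbers only).
    for bits in (8, 16, 32):
        print(f"{bits}-bit channels: {blend_bandwidth_gb_s(bits, 16, 400):.1f} GB/s")
```

Whatever the exact clocks, the FP32 figure is double the FP16 one and four times the 8-bit one, which gives a feel for how far peak FP32 blending would sit from what a 256-bit bus can deliver.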
 
Xmas said:
If NV40 is 220M transistors, why would you expect more than twice that for twice the pipelines? I'd say it would be more like 350M transistors.
1. I would expect both vertex and pixel pipeline counts to be doubled. That's most of the transistors in the chip. The only things that would remain the same are the video processor and I/O.
2. I wouldn't expect the chip to remain the same, but have some additional improvements....so maybe I misspoke in my previous post.

So, for my lower range I doubled the transistors of the NV40 and rounded it to 450M. Any extra would be related to other advancements, hence the upper range. So yes, I should have left out the "without improved tech," as I was obviously including it.
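The two estimates in this exchange (350M versus 450M) mostly differ in how much of the chip is assumed to scale with pipeline count. A minimal sketch of that arithmetic, using only the 220M figure quoted in the thread; the split into "fixed" and "scalable" transistors is a simplifying assumption, and the helper names are hypothetical.

```python
NV40_TRANSISTORS_M = 220  # figure quoted in the thread, in millions

def scaled_transistors(base_m, fixed_m, pipeline_scale):
    """Scale only the pipeline-related part of the chip; keep the rest fixed."""
    scalable_m = base_m - fixed_m
    return fixed_m + scalable_m * pipeline_scale

def implied_fixed(base_m, target_m, pipeline_scale):
    """Fixed (non-scaling) budget implied by a target estimate.

    Solves target = fixed + (base - fixed) * scale for fixed.
    """
    return (base_m * pipeline_scale - target_m) / (pipeline_scale - 1)

if __name__ == "__main__":
    # Doubling everything, i.e. assuming no fixed part at all.
    print(scaled_transistors(NV40_TRANSISTORS_M, 0, 2))
    # Fixed budget implied by the 350M estimate for twice the pipelines.
    print(implied_fixed(NV40_TRANSISTORS_M, 350, 2))
    # Fixed budget implied by the 450M estimate; a negative value means
    # the estimate includes more than a plain doubling.
    print(implied_fixed(NV40_TRANSISTORS_M, 450, 2))
```

Read this way, the 350M guess assumes about 90M transistors do not scale with the pipelines, while the 450M lower bound is a straight doubling with a little rounding up.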

Edit: Oh, and since ATI will have to make the move to FP32 precision with the R5xx, one would think that the use of FP16 will suddenly become much more compelling for ATI to implement, even if the only benefit is faster computation for special functions (nrm, sqrt, pow, etc.).
 
Xmas said:
Pure FP32 in the shader, FP16 blending I expect. Blending is too bandwidth hungry to be useful for FP32 yet.
The only beneficial way I see to do blending in a "non-standard" way would be to drop the color ROPs and do the blending calculation in the shader as a last step, being able to access the framebuffer there. But there are other drawbacks to this, and I don't really think it's worth it.
Well, for the purposes of non-graphics computing, I personally think it would be highly useful to allow access to an FP32 framebuffer within the shader.
 