eight shader units for R520

I think Dave's hinting that R520 will be topping out at something like 400MHz due to first-cut 90nm process limits (thinking of heat density constraints, which seem to have been a factor in the first generation of AMD A64s at 90nm - which mean that the fastest current A64s are still using 130nm).

That was roughly completely diametrically opposed to what I was hinting at...
 
DaveBaumann said:
I think Dave's hinting that R520 will be topping out at something like 400MHz due to first-cut 90nm process limits (thinking of heat density constraints, which seem to have been a factor in the first generation of AMD A64s at 90nm - which mean that the fastest current A64s are still using 130nm).

That was roughly completely diametrically opposed to what I was hinting at...

:oops:
 
DaveBaumann said:
I would suggest that you'd need to consider why the discussion for 24 pipelines is interesting; what does it actually gain, especially when you figure in memory technologies - pixel fill-rate is hardly an issue now, is it? So you would only be looking to augment shader performance. Then consider the process adoption differences - why would there be a need to make more pipelines?
Well, more pipelines is quite possibly the easiest way to increase shader throughput. You can always add more shader units in series within a single pipeline, but those will typically not be as efficient as separate pipelines (similar to how two texture units and one pipeline aren't as efficient as two pipelines each with one texture unit, but this gets even more severe if the separate units don't have the same capabilities). Now, much like the GeForce 6600, there may be reasons to not bother having support for each pipeline to output a pixel every clock. This may save transistors in a place that will rarely have performance implications.
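
A toy model of the texture-unit comparison in the post above, as a rough sketch only. It assumes every pixel needs a fixed number of texture fetches, each texture unit does one fetch per clock, and a pipeline works on one pixel at a time. The function names are made up for illustration, and real hardware is far more complicated than this.

```python
import math

def cycles_one_pipe_two_tmus(pixels, fetches_per_pixel):
    """One pipeline with two texture units: one pixel at a time,
    up to two fetches per clock for that pixel (assumed model)."""
    return pixels * math.ceil(fetches_per_pixel / 2)

def cycles_two_pipes_one_tmu(pixels, fetches_per_pixel):
    """Two pipelines with one texture unit each: two pixels at a time,
    one fetch per clock for each pixel (assumed model)."""
    return math.ceil(pixels / 2) * fetches_per_pixel

for fetches in (1, 2, 3):
    shared = cycles_one_pipe_two_tmus(100, fetches)
    split = cycles_two_pipes_one_tmu(100, fetches)
    print(f"{fetches} fetch(es)/pixel: 1 pipe x 2 TMUs = {shared} clocks, "
          f"2 pipes x 1 TMU = {split} clocks")
```

In this model the two layouts only tie when the fetch count is even; with an odd count the second unit on the shared pipeline sits idle for one clock per pixel.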
 
Alstrong said:
How about lowering the clock speed with more pipes?

Dave's just said that's the wrong direction.

Which leaves me clutching at straws: 700MHz, but with only 16 pipes?

Jawed
 
I still think the R520 is just going to be an R420 with a bunch of checkbox features, and around 35% faster.

24 and 32 pipes with 10+ VS units is something I can see Nvidia doing, but not ATI.
 
I'm thinking more along the lines of transistor count. ATI is usually much more conservative with transistors. SM 3.0+ and 24+ pipes and 10+ VS units is an insane amount of transistors, even at 90nm.
 
Going from 130nm to 90nm allows ATI, in a best-case scenario, to fit twice as many transistors (or even slightly more) in roughly the same die space. I don't believe the suggestions here of an SM3+ card with 24 pixel pipes and 8 vertex pipes could push it near twice as many transistors (judging from ATI's statements of an additional 60 million transistors for their SM3 implementation and my own guesstimates of how many transistors an additional two quads and vertex pipes would add).
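
The "twice or slightly more" figure falls out of ideal area scaling, which can be sketched as below. This assumes a perfect linear shrink where density goes with the square of the feature-size ratio (real processes rarely achieve that), and the 160 million baseline is only an assumed starting figure for illustration; the function names are hypothetical.

```python
def ideal_density_gain(old_nm, new_nm):
    """Ideal transistor-density gain from a linear shrink.
    Area scales with the square of the feature size (best case only)."""
    return (old_nm / new_nm) ** 2

def extra_transistors_in_same_area(base_transistors, old_nm, new_nm):
    """How many additional transistors fit in the same die area, best case."""
    return base_transistors * (ideal_density_gain(old_nm, new_nm) - 1)

gain = ideal_density_gain(130, 90)
print(f"ideal density gain, 130nm -> 90nm: {gain:.2f}x")

# Assumed baseline of 160 million transistors, for illustration only.
headroom = extra_transistors_in_same_area(160e6, 130, 90)
print(f"best-case headroom in the same area: {headroom / 1e6:.0f} million")
```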
 
Thanks for taking the time to share your thoughts, Entropy. :)

DaveBaumann said:
I think Dave's hinting that R520 will be topping out at something like 400MHz due to first-cut 90nm process limits (thinking of heat density constraints, which seem to have been a factor in the first generation of AMD A64s at 90nm - which mean that the fastest current A64s are still using 130nm).

That was roughly completely diametrically opposed to what I was hinting at...

So you're implying that more than 24 pipes would in effect be overkill because of the possibility of increasing core clocks beyond current limits? And would this be based solely on the merits of the 0.09u low-k process or does the highly nebulous Fast-14 technology somehow factor into your thinking here? :?
 
kemosabe said:
Thanks for taking the time to share your thoughts, Entropy. :)

DaveBaumann said:
I think Dave's hinting that R520 will be topping out at something like 400MHz due to first-cut 90nm process limits (thinking of heat density constraints, which seem to have been a factor in the first generation of AMD A64s at 90nm - which mean that the fastest current A64s are still using 130nm).

That was roughly completely diametrically opposed to what I was hinting at...

So you're implying that more than 24 pipes would in effect be overkill because of the possibility of increasing core clocks beyond current limits? And would this be based solely on the merits of the 0.09u low-k process or does the highly nebulous Fast-14 technology somehow factor into your thinking here? :?

Is everybody a fillrate fetishist around here? Dave's comment sounded to me like questioning the rationale for increasing to 24 pipes at likely memory speeds for this generation. "What does it actually gain?" was his question. We know that ATI seems more loath than NV to spend gates just for the checkbox --the transistor budget has to justify itself.

Now, Chalnoth put forward a possible rationale --I'd like to hear Dave's thoughts on that... Maybe the 8 new rumored pipes are pseudo shader-only pipes? Or if not shader-only, shader-mostly?
 
DaveBaumann said:
I would suggest that you'd need to consider why the discussion for 24 pipelines is interesting; what does it actually gain, especially when you figure in memory technologies - pixel fill-rate is hardly an issue now, is it? So you would only be looking to augment shader performance. Then consider the process adoption differences - why would there be a need to make more pipelines?

There is at least 1 company that has announced a new high performance projector technology.

http://www.es.com/news/2004+press+archive/070604.asp

I guess it depends on how long it takes for those technologies to move to the desktop market.
 
Dave's saying: why increase the fillrate when available memory bandwidth can't even feed current pixel fillrates?
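
A back-of-envelope sketch of that point. All the numbers here are illustrative assumptions rather than specs for any real card: 16 pipes at 500MHz, 12 bytes of traffic per pixel (32-bit colour write plus 32-bit Z read and Z write, with no compression, blending or AA), and a 256-bit bus at 1000MHz effective. The function names are hypothetical.

```python
def fill_bandwidth_needed(pipes, core_mhz, bytes_per_pixel):
    """Bytes/s of memory traffic if every pipe outputs a pixel every clock."""
    return pipes * core_mhz * 1e6 * bytes_per_pixel

def memory_bandwidth(bus_bits, effective_mhz):
    """Peak bytes/s for a bus of the given width and effective data rate."""
    return bus_bits / 8 * effective_mhz * 1e6

# Illustrative figures only (assumptions, not real card specs).
need = fill_bandwidth_needed(pipes=16, core_mhz=500, bytes_per_pixel=12)
have = memory_bandwidth(bus_bits=256, effective_mhz=1000)

print(f"needed for peak fill: {need / 1e9:.0f} GB/s")
print(f"available:            {have / 1e9:.0f} GB/s")
print(f"shortfall factor:     {need / have:.1f}x")
```

With those assumed figures, peak fill already wants about three times the bandwidth on offer, so extra pipes would only help where the work is shader-bound rather than bandwidth-bound.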
 
Well, Dave's obviously hinting at more shader units per pipe to increase shader performance, rather than just adding pipelines, which would then be limited by bandwidth anyway.

In one of those ATI slides they state that the ALU:tex ratio will increase, which would also support this.
 
How about triple-monitor gaming? Doesn't that need all the fillrate it can get, Dave? Maybe Nvidia has something in the works. Look at Doom III: you need SLI for the Ultra setting for it to shine and to play it at a high resolution. 1600x1200 with 8x AA and 16x aniso would be more likely with more fillrate and SLI. As for memory bandwidth, other guy: Nvidia can go VSA to give dedicated memory to each GPU, which takes care of the bandwidth issue. Also, Samsung has GDDR4 in the works. So there are multiple options.
 
That's what I believe, too, tEd. Of course, smaller granularity is more efficient, but doubling the number of pipes is far more expensive than doubling the number of ALUs per pipe. Not considering the mini ALUs, R420's arithmetic to texture ratio is 1:1, while R500's is 3:1 (vec4). It seems plausible that R520 is somewhere in between, having two full ALUs per pipeline, but still having 16 pipes.

Considering NV47 however, if that part actually exists, NVidia certainly had less time and resources to design that chip. So it's more likely they reuse the existing design and add some more pipes. Additionally, the ROPs are already decoupled, they're at a lower clock (at least in the high-end) and they already have two shader units, though not identical, per pipe.
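
The pipes-versus-ALUs trade-off in this post can be put in rough numbers. This is a sketch only: it counts full ALUs and texture units per clock, ignores the mini ALUs (as the post does), assumes one texture unit per pipe, and the configurations are the speculative ones from the thread, not confirmed designs. The `config` helper is hypothetical.

```python
def config(pipes, alus_per_pipe, tex_per_pipe=1):
    """Peak per-clock ALU and texture throughput for a pipe layout.
    One texture unit per pipe is an assumption of this sketch."""
    alu = pipes * alus_per_pipe
    tex = pipes * tex_per_pipe
    return {"alu_per_clock": alu, "tex_per_clock": tex, "alu_to_tex": alu / tex}

options = {
    "16 pipes, 1 ALU each": config(16, 1),
    "16 pipes, 2 ALUs each": config(16, 2),
    "32 pipes, 1 ALU each": config(32, 1),
}

for name, c in options.items():
    print(f"{name}: {c['alu_per_clock']} ALU ops/clk, "
          f"{c['tex_per_clock']} tex/clk, ratio {c['alu_to_tex']:.0f}:1")
```

Doubling the ALUs per pipe and doubling the pipes give the same peak ALU rate here, but the second also doubles the texture units (and everything else in a pipe), which is where the extra transistor cost comes from.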
 
now this conversation has moved beyond my ability to understand. :cry:


Ah well, I'll just wait for ATI's R520 announcement, press release, interview, conference, demos, etc. It seems to me, though, that ATI is likely to "own" the spring/summer and perhaps fall, while Nvidia works on its next-gen GPU (fall 2005 unlikely, prolly spring 2006) as well as its contribution to the PS3 GPU.
 