Will R580 have 48 shader 'pipes?'

Will R580 have 48 shader fragment 'pipes?'

  • Yes, just like RV530 has 12.

    Votes: 86 60.6%
  • No way, it's just too many trannies.

    Votes: 56 39.4%

  • Total voters
    142
No, RV515 has a boring, old-fashioned, memory controller and no ring-bus. So you can't base any comparisons on it.

Jawed
 
I'm a bit surprised so many have voted "no way". It may be expensive and a bit low-yield, and thus against ATI's history of not producing "monster chips" like the NV40, but it's surely not out of the question.
 
Jawed said:
No, RV515 has a boring, old-fashioned, memory controller and no ring-bus. So you can't base any comparisons on it.

Jawed
Thanks! Another guess: one PS quad of R520 covers approx. 3.65% of the whole die. Another 8 quads = about +30%, plus some other things (more complex UTDP? ring stops? register arrays?). Choosing "D" (don't kick me) as the part that is necessary for each quad (2.59% each, +20.7% total), the die of R580 would be 50% larger. This result is comparable to the RV515/RV530 difference, but isn't a 430 mm² die too much for a 90nm process? Couldn't R580 be simplified in any way?
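To make the arithmetic in that guess easy to check, here is a minimal sketch. The per-block percentages are the ones quoted above; the 288 mm² base area for R520 is an assumption, back-computed from the 430 mm² figure, and the names are just illustrative.

```python
# Shares of the R520 die as quoted in the post above (read off the die photo).
PS_QUAD_SHARE = 3.65   # % of the die per pixel-shader quad
PART_D_SHARE = 2.59    # % of the die per "D" block, assumed needed once per quad
EXTRA_QUADS = 8        # "another 8 quads" on top of what R520 already has

# Assumption: R520 die area in mm^2, implied by the 430 mm^2 estimate.
R520_DIE_MM2 = 288.0


def r580_die_estimate(base_mm2=R520_DIE_MM2):
    """Return (growth in percent, estimated die area in mm^2)."""
    growth_pct = EXTRA_QUADS * (PS_QUAD_SHARE + PART_D_SHARE)
    return growth_pct, base_mm2 * (1 + growth_pct / 100)


growth, area = r580_die_estimate()
print(f"growth: +{growth:.1f}%, estimated die: {area:.0f} mm^2")
```

This only scales the two blocks named in the post; anything shared between quads, or any extra ring stops and register arrays, would move the result in either direction.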
 
I think we need an RV530 die photo :cry:

When you scale-up an R520 pixel shader quad (excluding texturing) to make an R580 pixel shader quad, it's worth remembering that it doesn't get 3x larger.

That's because all three quads in each unit (there being four units in total) share the same instruction decode/fetch logic, and at least some of the register fetch logic. There's prolly other stuff being shared. Well, I hope it works like that, anyway...

The unit does need a crossbar, which isn't required in R520, since texture results have to be directed to the correct shader quad.

I wrote some relevant stuff here:

http://www.beyond3d.com/forum/showpost.php?p=591777&postcount=98

Jawed
 
So after all this drivel about pipes and ALUs ;) how much faster will the R580 be compared to the R520?

Let's assume a clock speed of 650 MHz core/800 MHz mem for the R580, just for the sake of argument, and leave the R520 (X1800 XT) at its stock 625 MHz/750 MHz.

If I was a betting man (and I am:D), I would guess the average performance of the R580 to be about 150% that of the R520(X1800XT). Anyone have other guesses on this?
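For a rough sanity check on that guess, here is a small sketch comparing paper numbers only. It assumes 16 shader ALUs for R520 and 48 for R580 (the figures being debated in this thread), the clocks assumed above, and an unchanged memory bus width; none of that is confirmed for R580.

```python
# Speculative paper specs; the R580 row is the hypothetical discussed in the thread.
R520 = {"shader_alus": 16, "core_mhz": 625, "mem_mhz": 750}
R580 = {"shader_alus": 48, "core_mhz": 650, "mem_mhz": 800}


def alu_rate(chip):
    """Peak shader issue rate in ALU-MHz (ALU count times core clock)."""
    return chip["shader_alus"] * chip["core_mhz"]


# Ratio of peak shader rates, and of memory clocks (same bus width assumed).
alu_ratio = alu_rate(R580) / alu_rate(R520)
mem_ratio = R580["mem_mhz"] / R520["mem_mhz"]

print(f"peak shader rate: {alu_ratio:.2f}x, memory clock: {mem_ratio:.2f}x")
```

The shader side grows by roughly 3x on paper while memory bandwidth barely moves, so an average gain well below 3x is what you would expect unless a game is heavily shader-bound.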
 
I don't see any possible way for the R580 to clock equal to or higher than the R520 XT. The Radeon X1800 XT is already very hot. The R580 will have to be clocked lower to keep heat in check.
 
No real idea what the upcoming R580 is supposed to be; however, IMO, I really like the forward thinking of the RV530, so what I would like to see is a 16-pipeline version of the RV530. I am rather enamored of the RV530's design; it just lacks the raw pixel-pushing power to keep up with and pass its MSRP counterparts in green.

I can't even pretend to know anything about how GPUs are designed or built, so I haven't an idea whether an RV530 x four design would be additive, resulting in a 16-4-12-8 product. Or are only certain parts multiplicative, so that a 4-1-3-2 would result in a 16-pipeline part that equates to (4x) 4-1-3-2, i.e. 16 render outs, 16 texture units, 48 shader pipelines with 2 z compare units?

Sorry if my drivel seems repetitive, as I have pondered this in previous postings, mostly pertaining to the X1600/RV530:

http://www.beyond3d.com/forum/showthread.php?t=24232 (1600 reviews)

I'm just not all that "up" on what is what exactly...
I mentioned in another post that, IMO, ATI has used the RV series before to launch parts based on future tech (as was the case with the RV3xx/RV4xx, I believe) that dealt with the move to a smaller process, so it doesn't seem all that alien that ATI would again use the RV series to launch a newer process/feature set while at the same time allowing the process to mature for future implementation.

Anywho, that's my opinion, that of someone who hasn't a doctorate or umpteen years of experience in microprocessor design/manufacturing or advanced API use and programming... just the mindless ramblings of a proactive consumer wanting to know more about how things work and what to look forward to.
 
IMO 4x RV530, but with a 2x256-bit ring and 8 VS. Should have more than 400 million trannies.
 
No, but it can be estimated well. I think the question is how much of a pixel quad's space the ALUs occupy, and how the quad size will scale when the ALUs are multiplied.
 
Megadrive1988 said:
R580 will have either 16 or 24 pixel pipelines (and ROPs), but within the 16 (or 24) pipes, there are said to be 3 fragment shader ALUs per pipe, instead of 1.

If R580 has 16 pixel pipes, then it'll have 48 fragment shader ALUs.

Yea, you do have a point.

I have a question, guys: why can't R580 have 32 pixel pipelines? OK, let's say 32:1:3:1? Why 16 ROPs and fragment processors?
 
Supreet Virdi said:
Yea, you do have a point.

I have a question, guys: why can't R580 have 32 pixel pipelines? OK, let's say 32:1:3:1? Why 16 ROPs and fragment processors?

:oops:

Well, that's one large, no, HUGE chip. 32 ROPs, 32 TMUs, and 96 fragment / shader units!

You have a certain transistor budget (die space) target to work with, so you have to decide where you want to spend your transistors to give you the most return.

In ATI's estimation, shader ops are going to be relatively more important to increase in performance than ROPs. This is why R580 will be 16:1:3:1, instead of something like 24:1:1:1.

I think most people here would agree with this direction, given that going above 16 rops at R520 clockspeeds will likely have diminishing returns...now that we're getting beyond 1600x1200 resolutions.
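To see what the a:b:c:d shorthand used here implies in raw unit counts, a minimal sketch follows. It assumes the fields mean pipes (ROPs), then texture units, shader ALUs and Z units per pipe, which is how the notation is being used informally in this thread; the `expand` helper and `UnitCounts` type are made up for illustration.

```python
from typing import NamedTuple


class UnitCounts(NamedTuple):
    rops: int
    texture_units: int
    shader_alus: int
    z_units: int


def expand(config: str) -> UnitCounts:
    """Expand 'pipes:tex:alus:z' (per-pipe ratios) into total unit counts."""
    pipes, tex, alus, z = (int(field) for field in config.split(":"))
    return UnitCounts(pipes, pipes * tex, pipes * alus, pipes * z)


# The configurations mentioned in the posts above.
for cfg in ("16:1:3:1", "24:1:1:1", "32:1:3:1"):
    print(cfg, expand(cfg))
```

Under that reading, 16:1:3:1 gives 48 shader ALUs on 16 ROPs, while 32:1:3:1 doubles everything to 32 ROPs, 32 TMUs and 96 shader units, which is where the transistor budget argument bites.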
 
I tend to suspect they think they are overpowered on ROPs, but consider that a lesser evil than spending the transistors and skull sweat to disassociate the ROPs ala NV right now.

Certainly a 4-1 ROP advantage over X1600 does not look justified.
 
Joe, thanks for clarifying.

But if I am not mistaken the ALU arrangement on 7800GTX is 24:1:2:1? (vec3+scalar or vec4 class?) and on 110nm fabrication also?

XBOX 360 (R500) can carry out 96 billion shader instructions per sec? 16:1:3:2 ?

So why can't R580 have three shader processors along with more ROPs?
 
The notations you have got are not comparable. The "3" actually relates to parallel fragment pipes for the R5xx series (i.e. there are 3 parallel fragment pipelines in that part of the pipeline). G70 has two ALUs, but they are operating on the same pixel (technically R5xx also has two ALUs per fragment pipe, but the second is not as capable).
 
Yeah, it's somewhat difficult to compare the architectures. nVidia has within each pixel pipeline a texture unit that is shared with a MAD unit, a second MAD unit, and some special function units. nVidia demands that in order for these units to be made use of, subsequent instructions within a pixel shader must make use of them (the compiler may be doing some intelligent reordering, of course), and there must be enough register space to make use of all units.

ATI's disassociated pipelines and multithreading give their architectures fewer execution units overall, but they are better able to make use of those execution units.
 
So most probably R580 is 16:1:3:1? But I am more inclined towards 24:1:3:2 or 32:1:3:1.

Can the second shader ALU also be a full vec4 (4-element) unit, with a texture unit of course?
 