What are Extreme Pipelines?

WaltC said:
Certainly not, as ATi never publicly announced an "extreme" pipe, much less defined one (and isn't going to, unless Dave was kidding.) When companies do something I would call "immoral, deceptive and indicative of the scum that they are," it's because of what they publicly state about their products *officially* that turns out to be intentional fabrication and/or deliberate, premeditated misrepresentation. I can't see how that view deserves any argument.

Who was arguing? I just asked a question. :?
 
Hellbinder said:
WHAT?? :devilish:

You better go check what I posted on Nvnews almost a year ago.

I posted a lot of things, a lot of different times. Not all of them were intended to be totally accurate. What I have never failed to say, even when messing around, is that it's a *16* pixel architecture.

Oh HB, don't be like that. I was referring to something you wrote similar to "sometimes 8 pipelines are like 16" or something which really to me bolstered the rumor of 8 extreme pipes. No worries anyhow, I still read your posts without hesitation.
 
trinibwoy said:
Who was arguing? I just asked a question. :?

I believe I answered it. Heh...;)

Come on now, this is just informal chatting, not "3d-Graphics Hour With Your Host Captain Kirk, Brought to You in Thrilling Technocolor By Our Sponsor, TWIMTBP," is it?...;)
 
WaltC said:
trinibwoy said:
Who was arguing? I just asked a question. :?

I believe I answered it. Heh...;)

Come on now, this is just informal chatting, not "3d-Graphics Hour With Your Host Captain Kirk, Brought to You in Thrilling Technocolor By Our Sponsor, TWIMTBP," is it?...;)

I should go sleep. Can't afford to stroll into work late again :( Any chance of hard NDA lifted info before 1am EST ?
 
Doomtrooper said:
Inquirer likes to brag how correct it was, their record isn't looking so good right now 8)
Inquirer is always correct. They make sure to cover every possible angle of an issue, so something is always right ;)
 
Chalnoth said:
Inquirer is always correct. They make sure to cover every possible angle of an issue, so something is always right ;)

Yup...it's called the "shotgun" approach to journalism...... ;)
 
I guess this is an extreme pipeline:

21748480_fe8a37eca4.jpg



Derek
 
martrox said:
Chalnoth said:
Inquirer is always correct. They make sure to cover every possible angle of an issue, so something is always right ;)

Yup...it's called the "shotgun" approach to journalism...... ;)

I think journalism is a bit of a stretch. Most journalists don't seem to print every rumour that gets passed their way.
 
DerekBaker said:
I guess this is an extreme pipeline:

21748480_fe8a37eca4.jpg
Actually, this is quite similar to the NV40 pixel shader. Instead of the term "co-issue" you just have "1 3-comp vect op + 1 scalar op", and instead of "dual-issue" you just have two ALUs.
Some differences of course remain: NV40 has more flexible co-issue (it can issue 4+0, 3+1 or 2+2, while the R420 can only do 4+0 or 3+1), and the R420 has the advantage of a dedicated texture address ALU (NV40 needs to sacrifice 1 fp ALU for that). There are of course small differences (like the NV40's free fp16 normalize and the NV40's mini-ALUs; no idea if the R420 has something similar), but overall the raw shading power seems to be quite similar.
Oh, and the R420 pixel shader pipe looks quite similar to the R3x0 pipe, except that there's now a second ALU, so there's nothing really "extreme" about it.
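To make the co-issue comparison a bit more concrete, here is a rough Python sketch. The split lists are taken straight from the post above; everything else (the `can_co_issue` name, and the assumption that a narrower op can sit in a wider slot) is just my illustration, not anything either vendor has documented.

```python
# Co-issue splits per chip, as listed in the post (4-component vector ALU).
CO_ISSUE_SPLITS = {
    "NV40": [(4, 0), (3, 1), (2, 2)],
    "R420": [(4, 0), (3, 1)],
}


def can_co_issue(chip, width_a, width_b):
    """Return True if two ops of the given component widths fit one split.

    Assumption (hypothetical): an op fits any slot at least as wide as it is,
    and the two ops may be assigned to the slots in either order.
    """
    for slot_a, slot_b in CO_ISSUE_SPLITS[chip]:
        fits = width_a <= slot_a and width_b <= slot_b
        fits_swapped = width_b <= slot_a and width_a <= slot_b
        if fits or fits_swapped:
            return True
    return False


if __name__ == "__main__":
    for chip in ("NV40", "R420"):
        for pair in [(3, 1), (2, 2), (2, 1)]:
            print(chip, pair, can_co_issue(chip, *pair))
```

Under those assumptions the only pairing that separates the two chips is 2+2, which fits the "quite similar overall" conclusion.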
 
From memory, the only informed source I ever heard saying 8 extreme pipes, or 8 uber pipes, was Dave B himself, right here! After he coined the phrase to the masses it got bandied about heaps, because DB is in the know and DB doesn't lie.
 
Hm, I wonder why the ALU1 blocks are smaller than the ALU2 ones... maybe it's not that much different from R300 after all.
 
In comparison to this rough analysis of NV40 versus R3xx and MDolenc's fillrate tester:
(counting per listed assembly op in the shaders)

IPC/pipe is still about 1 for the add/mad shaders, so there seems to be no improvement in handling these ops (0.97 versus the 0.98 I got for the R3xx).

For the PerPixelLighting test, X800 IPC seems to be about 1.42 versus 1.24 for the 9800.

The NV40 was 1.32 full precision, and 1.96 partial precision, I presume due to extracting normalization for its special functionality. Based on that assumption, partial precision performance outside of the assumed normalization functionality, calculated by counting normalization as 1 extracted instruction instead of however many assembly instructions, indicates an IPC of about 1.27.

Overall, this seems to indicate a minor improvement in general case IPC for R3xx->R420, though the cause isn't fully clarified. My guesses remain something like faster execution of ops like rsq (how many clocks on R3xx?) or dp3 ability added to 2nd ALU (this latter seems to fit the 2 or 3 dp3 opportunities that seem likely given dependency and the impact relative to R3xx IPC).
 
Geeforcer said:
No, all lies! Both Nvidia and ATI completely redesigned NV40 and R420 in the last two months! These are *new* NV40 and R420, which are really NV45 and R480! Because there is no such thing as design lockdown and adding another 4-8 pipelines to chip takes 3 seconds! Oh, and Judgment Day is next week.

...Hell Freezes and Beyond3D users don't have favorite 3D chip maker anymore.
 
demalion said:
The NV40 was 1.32 full precision, and 1.96 partial precision, I presume due to extracting normalization for its special functionality. Based on that assumption, partial precision performance outside of the assumed normalization functionality, calculated by counting normalization as 1 extracted instruction instead of however many assembly instructions, indicates an IPC of about 1.27.

Why would you extract free norm vs any of the other "special functions" like POW, LIT, SINCOS, etc? If you just want to measure vector IPC, you also need to correct for special functions of ATI's scalar ALUs as well. That doesn't seem to be a fair comparison.

Normalization takes 3 instructions. This wouldn't lower IPC, it would raise it. First of all, it's an additional instruction which can be issued in parallel. Secondly, it effectively raises the number of instructions issued by 3, since you are able to dispatch 3 instructions in the time needed to dispatch 1. For example, a 5-instruction shader consisting of an NRM (3 instructions) + vector op, plus a final combine, would execute in 2 cycles. On another architecture, it would take 4 cycles.
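The arithmetic in that example can be written out as a quick Python sketch. The instruction counts and the 2-cycle and 4-cycle figures are the ones from the example above; the names (`SHADER`, `ipc`) are only for illustration, and the cycle counts are taken as given rather than derived from any scheduling model.

```python
# The example shader: each entry is (op, assembly instructions it counts as).
SHADER = [
    ("nrm", 3),        # normalize, counted as 3 assembly instructions
    ("vector_op", 1),  # issued alongside the normalize in the example
    ("combine", 1),    # final combine
]


def ipc(shader, cycles):
    """Instructions per clock: total assembly instructions over cycles taken."""
    instructions = sum(count for _, count in shader)
    return instructions / cycles


if __name__ == "__main__":
    print("with free normalize:", ipc(SHADER, cycles=2))
    print("other architecture: ", ipc(SHADER, cycles=4))
```

Same 5 instructions either way, so halving the cycle count doubles the measured IPC, which is why pulling the normalize out of the count changes the picture so much.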
 
Does the X800 series have an L2 cache? I believe DaveB said it does not in his review. If this is a significant amount of RAM, it could explain why there is such a discrepancy between the 6800U and X800 in performance per clock cycle, and in transistor count.
 
Tahir said:
Does the X800 series have an L2 cache? I believe DaveB said it does not in his review. If this is a significant amount of RAM, it could explain why there is such a discrepancy between the 6800U and X800 in performance per clock cycle, and in transistor count.
Possible. Hard to tell though. Even if we know a lot more about the structure of the NV40 than we did with NV3x, too many internal details are unknown. For instance, ATI could have larger L1 texture caches than Nvidia, which might negate the potential performance disadvantage of not having an L2 texture cache.
 