ATI R500 patent for Xenon GPU?

Jaws said:
xbox2_scheme_bg.gif

What exactly is the "VPU" designation in the lower right hand corner of each CPU core? It doesn't appear in any other block diagram, to my knowledge.
 
Jawed said:
Will quad pixel processing be a part of this architecture? If so, how many TMUs per quad?

Jawed

Yeah, the leaked diagram mentions 2*2 pixel quads, though I can't recall the patent mentioning it.

Earlier, I was musing on 3 ALUs and 1 TMU per US unit. This kinda doesn't fit with the quad. Keeping the same ratio, the next quad-friendly slot would be 12 ALUs and 4 TMUs per US unit. Obviously the TMUs are decoupled from the ALUs, as mentioned in the patent.

US = 12 ALU + 4 TMU

4*US = 48 ALU + 16 TMU

So 4 US units/cores seem to fit the R500 leaked spec.
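To sanity check the scaling above, here's a rough sketch. The names (`ALUS_PER_TMU`, `us_unit`, `total`) are made up for illustration, and the 3:1 ratio and the quad size of 4 are only the assumptions from this post, not something the patent or the leak confirms.

```python
# Hypothetical sketch of the arithmetic in this post; nothing here comes from the patent.
ALUS_PER_TMU = 3   # assumed ratio: 3 ALUs for every TMU
QUAD_SIZE = 4      # a 2*2 pixel quad

def us_unit(tmus):
    """Return (ALUs, TMUs) for one US unit with the given number of TMUs."""
    return ALUS_PER_TMU * tmus, tmus

def total(units, tmus_per_unit):
    """Return (ALUs, TMUs) summed over several identical US units."""
    alus, tmus = us_unit(tmus_per_unit)
    return units * alus, units * tmus

print(us_unit(QUAD_SIZE))    # (12, 4): one quad-friendly US unit
print(total(4, QUAD_SIZE))   # (48, 16): four US units
```

Any other TMU count per unit that is a multiple of 4 would stay quad friendly too; 4 is just the smallest one that does.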
 
My understanding isn't brilliant about R300/R420 PS architecture, but I just want to muse on a few things:

In these older architectures, there's 1 TMU per pipe, in other words a total of 16 TMUs.

In R420 I count the following ALUs:

- 4D + 1D vertex shader ALUs = 2 ALUs
- 2x(3D + 1D) pixel shader ALUs + a texture address ALU (1D?)= 5 ALUs

So the old-school ratio of ALUs to TMUs is 7:1. But you've got four types, VS 2.0, PS2.0, PS1.4 and texture address.

I'm presuming R500 uses 3 ALUs like this:

- VS 3.0 4D + 1D (vector + scalar) + 1D texture address = 3 ALUs

OR

- PS 3.0 4D (vector) + 1D (scalar - redundant?) + 1D (texture address) = 3 ALUs

So this gives 3 ALUs per TMU, which gets us back to 48 ALUs and 16 TMUs. But now there's only three types of ALU: 4D, 1D (both VS/PS) and texture addressing.

What's interesting (in terms of performance and transistor count!) is that R420 has a total of 92 ALUs and 16 TMUs, but R500 would appear to be aiming to get by with less than half the number of ALUs (48 ). Blimey.

I suppose that goes to show how many ALUs are sitting idle, on average, in R420...

Jawed
 
Jawed said:
...
What's interesting (in terms of performance and transistor count!) is that R420 has a total of 92 ALUs and 16 TMUs, but R500 would appear to be aiming to get by with less than half the number of ALUs (48 ). Blimey.

I suppose that goes to show how many ALUs are sitting idle, on average, in R420...

Jawed

That kinda sounds too good to be true!

I expect the transistor count to shift from the ALUs to the pixel and vertex reservation stations, i.e. large local memory sizes, and also to control logic complexity in the arbiter, eDRAM and any shared cache. I still expect the R500 to be a transistor beast! :p
 
Duh, "just over half" rather than "less than half". Oh and that 7:1 ratio was a bit dodgy, too (sigh).

But anyway, it seems to me that there's gonna have to be some serious increase in horsepower per ALU.

I've been counting dimensions of ALUs. R420 comes out as having 174D in total spread across its 92 ALUs. R500 in the 4D+1D+1D architecture I've speculated, would have 96D spread across 48 ALUs.

So, on average (LOL) an R500 ALU is more "powerful" (has more dimensions) than an R420 ALU (only just)... OK, I admit it, this is all getting a bit stupid.
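For anyone who wants to check the dimension counting, here's a minimal sketch. The per-pipeline ALU widths are just my reading of the R420 counts in this thread (6 vertex pipes, 16 pixel pipes) plus the speculated 4D+1D+1D R500 layout, so treat every number as an assumption; `tally` is a made-up helper.

```python
# ALU widths (in dimensions) per pipeline, as assumed in this thread.
r420_vertex = [4, 1]             # 4D + 1D per vertex pipeline
r420_pixel = [3, 1, 3, 1, 1]     # 2x(3D + 1D) + 1D texture address per pixel pipeline
r500_unit = [4, 1, 1]            # speculated 4D + 1D + 1D per TMU

def tally(groups):
    """groups: list of (pipeline_count, alu_widths). Returns (ALUs, dimensions)."""
    alus = sum(count * len(widths) for count, widths in groups)
    dims = sum(count * sum(widths) for count, widths in groups)
    return alus, dims

r420 = tally([(6, r420_vertex), (16, r420_pixel)])
r500 = tally([(16, r500_unit)])

print(r420, round(r420[1] / r420[0], 2))   # (92, 174) 1.89
print(r500, round(r500[1] / r500[0], 2))   # (48, 96) 2.0
```

So the average works out to roughly 1.89D per ALU against 2D per ALU, which is where the "only just" comes from.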

Jawed
 
nAo said:
r420 = 92 ALUs? :devilish:
Jawed said:
But you've got four types, VS2.0, PS2.0, PS1.4 and texture address.

I don't know if you can count texture addressing ALUs. Also, I'm wondering whether the texture address ALU lives inside a graphics processing engine or inside the ALU engine. In other words, I'm not sure which "side" of the arbiter a texture address calculation would be executed.

Jawed
 
Jawed... no offense, but what are you talking about? :)
Where did you get your 92 ALUs figure?!?
 
Since I can't comment on R500 this is how I'd count R420....

Vertex ALUs 1x6
Pixel ALUs 16x2

and 2 is being generous for the Pixel ALUs since they're not orthogonal.

So being generous I count 38, and it's effectively less than that because 16 of those are "mini" ALUs.

It's unclear to me what the Texture ALUs actually compute in R520, but I wouldn't count them against total ALUs.
 
Jawed said:
6 vertex shader pipelines, each with two ALUs:

http://www.beyond3d.com/reviews/ati/r420_x800/diagram/vertex.jpg

16 pixel shader pipelines, each with five ALUs:

http://www.beyond3d.com/reviews/ati/r420_x800/diagram/shadercore.jpg

If each pixel shader only has 3 ALUs and the vertex shader only has 1 ALU, then fair enough. That's what I'm asking for clarification on, amongst other things...
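Roughly, the competing counts in this thread come down to what you plug in per pipeline. A small sketch, where `count_alus` is a made-up helper and the per-pipe figures are only the ones quoted in these posts:

```python
def count_alus(vertex_pipes, alus_per_vertex_pipe, pixel_pipes, alus_per_pixel_pipe):
    """Total ALUs for a given per-pipeline assumption."""
    return vertex_pipes * alus_per_vertex_pipe + pixel_pipes * alus_per_pixel_pipe

# Counting every block in the diagrams: 2 per vertex pipe, 5 per pixel pipe.
diagram_count = count_alus(6, 2, 16, 5)
# The "generous" count from earlier in the thread: 1 per vertex pipe, 2 per pixel pipe.
generous_count = count_alus(6, 1, 16, 2)
# The "3 ALUs per pixel shader, 1 per vertex shader" case asked about above.
middle_count = count_alus(6, 1, 16, 3)

print(diagram_count, generous_count, middle_count)   # 92 38 54
```

The spread from 38 to 92 on the same chip is really the whole disagreement: it depends on what you're willing to call an ALU.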

Jawed
ALU is a pretty broad concept, so you can't compare architectures with ALUs if ALUs are different things in different GPUs.
It's very hard to make meaningful comparisons but you should at least count how many operations per clock each GPU (or each ALU) can do.
 
blakjedi said:
Jaws said:

What exactly is the "VPU" designation in the lower right hand corner of each CPU core? It doesn't appear in any other block diagram, to my knowledge.

As far as I know, the thing called "VPU" within each CPU core is a VMX unit or AltiVec unit, basically a SIMD / floating point unit of some sort. Unless I am mistaken.

I believe it is Xenon CPU's equivalent of the VMX unit that is part of the PPE within the current Cell Processor.
 
If you take into account the recent DeanoC comments, the patent, and the talk about procedural work and such, it makes us think that they need some brute force, but from the CPU(s) or from part of it ;) ... something special and unique ...
 
Deepak said:
Jaws and Jawed? :oops:
Both seem to be made from same genetic material judging by their posts. :)
Back in the mid-80s my email/news signature was:

Jawed
some kind of shark joke

According to Google, seems I stopped October 97 (there was a /dev/nul of internet presence for most of the years between).

As a hint, my name isn't pronounced like it's spelt. I used to explain it, but I'm past caring now...

Jawed
 
the wording of the supposed leak suggests to me that a single thread is either a group of 64 pixels or a group of 64 vertices.
 