#1 |
|
Member
Join Date: Mar 2004
Location: Australia
Posts: 97
|
hi All,
Firstly, apologies if this question has been raised before. I know the basic concepts of pipelining (i.e. prefetch, decode, execute, store, blah blah blah), but in terms of graphics pipelines I hear from people that the NV40 pipeline architecture is 16x1 and R420 is 12x1. What exactly do these numbers mean? For example, the Inquirer.org site recently claimed that NV40 might be 32x0. What does the 0 mean in this case? I'm kind of confused by the naming convention used in NxY pipelining. thanks! -Bahadir |
|
|
|
|
|
#2 |
|
Stealth Nerd
Join Date: Jul 2003
Location: Sunny Melbourne
Posts: 1,112
|
Take a 4x2 architecture. It has 4 pixel pipelines, so it can work on 4 pixels per clock. Each pipeline has two TMUs, so it can do two texture operations on each pixel, per clock. The 32x0 mode refers to a different sort of calculation where nothing is actually rendered - not incredibly useful, but conversely not completely useless either.
__________________
Human Rights [X---------|----------] Robert Menzies |
|
|
|
|
|
#3 |
|
Senior Member
Join Date: Feb 2002
Location: gjethus, Norway
Posts: 1,256
|
For pixel pipelines, NxY is usually taken to mean 'capable of rendering N pixels per clock cycle, with Y textures applied to each pixel'. So 8x2 will, for example, imply 8 pixels per clock, with 2 textures applied to each pixel. (If you want to apply more than Y textures, you can do so, but at the cost of rendering fewer pixels per clock cycle.)
Y=0 as in 32x0 implies that the chip supports a mode where it can render 32 pixels per clock cycle, but only if you turn off texturing. (In the case of Nvidia, there is usually also the added condition that you must only write Z values, not color values, to the framebuffer. This extra condition tends to lead to endless terminology confusion and funny terms like 'zixels'.) |
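To put numbers on the NxY reading above, here is a rough Python sketch. The function names and the 400 MHz clock in the usage lines are made up for illustration, and the rule that extra textures cost whole extra trips through the pipe is a simplifying assumption, not a description of any particular chip.

```python
from math import ceil


def pixels_per_clock(n_pipes: int, tmus_per_pipe: int, textures: int) -> float:
    """Peak pixels per clock for an idealised NxY part applying `textures` textures.

    Assumption: applying more than Y textures costs ceil(textures / Y) trips
    through the pipe, so the pixel rate drops by that factor.
    """
    if textures == 0:
        # No texturing requested: every pipe can emit a pixel each clock.
        return float(n_pipes)
    if tmus_per_pipe == 0:
        raise ValueError("an Nx0 mode has no texture units to apply textures with")
    trips = ceil(textures / tmus_per_pipe)
    return n_pipes / trips


def fill_rate_mpixels(n_pipes: int, tmus_per_pipe: int, textures: int,
                      clock_mhz: float) -> float:
    """Peak fill rate in megapixels per second at a (hypothetical) core clock."""
    return pixels_per_clock(n_pipes, tmus_per_pipe, textures) * clock_mhz


if __name__ == "__main__":
    print(pixels_per_clock(8, 2, 2))            # 8x2 with two textures
    print(pixels_per_clock(8, 2, 4))            # 8x2 with four textures
    print(pixels_per_clock(32, 0, 0))           # 32x0, texturing off
    print(fill_rate_mpixels(16, 1, 1, 400.0))   # 16x1 at an assumed 400 MHz
```

Under these assumptions an 8x2 part drops from 8 to 4 pixels per clock when asked for four textures, which is the "fewer pixels per clock cycle" cost mentioned above.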
|
|
|
|
|
#4 |
|
Member
Join Date: Mar 2004
Location: Australia
Posts: 97
|
thanks for the replies.
So basically an NxY architecture means it is capable of rendering N pixels in parallel, with Y texture units applied to each? So if pixel x belongs to texture S, and pixel y belongs to texture T, a 2x2 architecture should be able to render them in one pass? Also, with 32x0, are you implying that it will only process 32 pixels in one pass if only z depth values are written? |
|
|
|
|
|
#5 | ||
|
Stealth Nerd
Join Date: Jul 2003
Location: Sunny Melbourne
Posts: 1,112
|
Quote:
Quote:
__________________
Human Rights [X---------|----------] Robert Menzies |
||
|
|
|
|
|
#6 | |||
|
Senior Member
Join Date: Feb 2002
Location: gjethus, Norway
Posts: 1,256
|
Quote:
Quote:
Quote:
|
|||
|
|
|
|
|
#7 |
|
Regular
Join Date: Feb 2002
Location: California
Posts: 4,732
|
It's a shadow acceleration mode. If doing shadow buffers, it allows you to fill the buffer at up to 2x the fillrate. If doing stencil shadow volumes, it allows you to write stencils at up to 2x the fillrate.
|
|
|
|
|
|
#8 | |
|
Member
Join Date: Mar 2004
Location: Australia
Posts: 97
|
Quote:
|
|
|
|
|
|
|
#9 | ||
|
Senior Member
Join Date: Feb 2002
Location: gjethus, Norway
Posts: 1,256
|
Quote:
With this understanding, a 2x2 renderer starts by picking (x,y) coordinates for two pixels, and then, for each of the two pixels, computes two sets of {s,t} texture coordinates and then looks up the associated texture data from two texture maps. All this once per clock cycle. |
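A toy model of that single clock cycle, in Python. Everything here is hypothetical: the wrap-around mapping from screen position to {s,t}, the textures stored as nested lists, and the names `texture_coords` and `render_clock` are only there to show the shape of the work, not how real hardware computes it.

```python
def texture_coords(x, y, tex):
    """Hypothetical mapping from screen (x, y) to texel {s, t}: plain wrap-around."""
    height = len(tex)
    width = len(tex[0])
    return x % width, y % height


def render_clock(pixels, textures, n_pipes=2, tmus_per_pipe=2):
    """One clock of an idealised NxY renderer.

    Takes up to n_pipes pixels, and for each one looks up one texel from each
    of up to tmus_per_pipe texture maps. Returns [((x, y), [texel, ...]), ...].
    """
    work = []
    for (px, py) in pixels[:n_pipes]:
        texels = []
        for tex in textures[:tmus_per_pipe]:
            s, t = texture_coords(px, py, tex)
            texels.append(tex[t][s])
        work.append(((px, py), texels))
    return work


if __name__ == "__main__":
    checker = [[0, 1], [1, 0]]
    gradient = [[10, 20], [30, 40]]
    # Three pixels are offered, but a 2x2 part only takes two per clock.
    print(render_clock([(0, 0), (1, 0), (2, 0)], [checker, gradient]))
```

The third pixel in the usage example is simply left for the next clock, which is the sense in which N limits pixels per clock and Y limits texture lookups per pixel.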
||
|
|
|
|
|
#10 |
|
Member
Join Date: Mar 2004
Location: Australia
Posts: 97
|
thanks
|
|
|
|
|
|
#11 |
|
Junior Member
|
Thanx from me too, that helped clear up the muddy image I had in my mind regarding pipeline architectures.
|
|
|
|
|
|
#12 |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
I should point out that this isn't a 'real' view of how things now work internally, but a legacy method from the 3dfx days. It's still used now as a convenience method for explaining it simply.
|
|
|
|
|
|
#13 |
|
Junior Member
Join Date: Apr 2003
Posts: 32
|
Here is a question that might expand on Bahadir's question. Say we have 16x1: in order to change state to 32x0, does the pipeline have to be purged before reconfiguring itself from one to the other? And what sort of performance penalties are we talking about to go from 16x1 to 32x0 if the pipelines have to be purged? Do we even know how long these pipes are on these chips?
rets |
|
|
|
|
|
#14 |
|
Join Date: May 2002
Location: New York, NY
Posts: 12,678
|
I would suspect that it would "switch modes" only when there's a change in some global rendering variables, such as whether or not to output color. So yes, the pipelines would definitely have to be flushed. This should, however, be a tiny performance hit, as you should only do this a couple of times per frame.
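For a feel of why a couple of flushes per frame should be a tiny hit, here is a back-of-the-envelope sketch in Python. All the numbers in the usage lines (400 MHz clock, 60 frames per second, 1000 cycles lost per flush) are invented placeholders; the real pipe length is exactly what nobody in this thread knows.

```python
def flush_overhead(switches_per_frame: int, cycles_per_flush: int,
                   clock_hz: float, frames_per_second: float) -> float:
    """Fraction of a frame's clock cycles lost to pipeline flushes.

    cycles_per_flush is a guess at how many cycles the pipe sits idle while it
    drains and refills; it is an assumption, not a measured figure.
    """
    cycles_per_frame = clock_hz / frames_per_second
    return switches_per_frame * cycles_per_flush / cycles_per_frame


if __name__ == "__main__":
    # Two mode switches per frame, assumed 1000 idle cycles each, 400 MHz, 60 fps.
    fraction = flush_overhead(2, 1000, 400e6, 60.0)
    print(f"{fraction:.4%} of the frame")
```

Even with a deliberately pessimistic guess for the flush cost, the lost fraction stays well under a tenth of a percent as long as the switch only happens a few times per frame.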
__________________
April 20, 1979 - America must never forget. |
|
|
|
|
|
#15 | |
|
Junior Member
Join Date: Feb 2004
Posts: 15
|
Quote:
Especially the upcoming technology. |
|
|
|
|
|
|
#16 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,951
|
Search for the term "Quad pipeline" here.
Basically pipelines from DX8 onwards really operate on a quad of pixels - a 2x2 pixel section from a triangle. The reason for this is that there are some instructions that have dependencies on neighbouring pixels. The upshot of this is that a "4x1" pipeline is actually a "single quad" pipeline. Radeon 9800, having an "8x1" pipeline, is operating on two quads at any one time. A modern 2 pixel pipeline is still working on a quad, but doing it over two (or more) cycles. |
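The quad bookkeeping above can be sketched in a few lines of Python. The helper names are made up, and the model assumes every pipe handles exactly one pixel of a quad per clock, which is a simplification.

```python
from math import ceil

PIXELS_PER_QUAD = 4  # a quad is a 2x2 block of pixels


def quads_per_clock(n_pipes: int) -> float:
    """How many whole 2x2 quads an N-pipe part gets through per clock (idealised)."""
    return n_pipes / PIXELS_PER_QUAD


def cycles_per_quad(n_pipes: int) -> int:
    """Clocks needed to finish one quad; at least one even with spare pipes."""
    return max(1, ceil(PIXELS_PER_QUAD / n_pipes))


def group_into_quads(pixels):
    """Bucket covered pixels by the 2x2 screen-aligned quad they fall in."""
    quads = {}
    for x, y in pixels:
        quads.setdefault((x // 2, y // 2), []).append((x, y))
    return quads


if __name__ == "__main__":
    print(quads_per_clock(4), quads_per_clock(8))   # "4x1" and "8x1"
    print(cycles_per_quad(2))                       # a 2 pixel pipeline
    print(group_into_quads([(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]))
```

In the last usage line the pixel at (2, 0) lands in a quad of its own, which hints at the granularity cost: a quad gets processed as a unit even when a triangle only covers part of it.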
|
|
|
|
|
#17 |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
It's all a question of bottlenecks inside the engine. You have various things you need to be able to do per clock cycle, and if there's any you don't have enough of, that limits your performance (a 'bottleneck'):
- buffer reads
- generate any interpolated values
- execute pixel shader instructions
- generate sampler addresses
- look up textures
- buffer writes
There might be other bottlenecks too. Saying it's an '8x1' allocates various values:
- 8 Z reads
- 8 colour reads
- 8 interpolators
- 8 pixel shader instructions
- 8 sampler addresses
- 8 texture reads
- 8 Z writes
- 8 colour writes
Because 8x1 doesn't give much information, we see confusing things like '8x1/16x0', which means '8x1, but can do 16 Z reads and 16 Z writes'. It's a useful 'quick fix' but it's got limited relevance to how things work. (At least, I think that's what people who use it mean. I don't actually know, it seems to be a bit of a woolly term!)
It may be more complex still. It may have dependencies on renderer state, or these numbers aren't integers, or different parts of the pipeline may share resources with other parts of the engine (e.g. you will see that the first section says 'buffer reads' and the second explicitly separates them into Z and colour - there's no reason that necessarily has to be the case). As Dave says, there's also granularity - there may be smallest chunks of data that can be processed.
The further you get into GPU performance the more bottleneck and bubble analysis starts to take over your life |
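The bottleneck view can be sketched as "whichever resource runs out first sets the rate". This Python snippet is only an illustration: the resource names, the per-pixel demands and the function name are invented, and it ignores shared resources, renderer state and granularity.

```python
def limiting_rate(units_per_clock, demand_per_pixel):
    """Return (pixels per clock, name of the limiting resource).

    units_per_clock:   resource name -> how many the chip can do each clock
    demand_per_pixel:  resource name -> how many one pixel needs
    Resources a pixel does not use (demand 0 or absent) cannot be the bottleneck.
    """
    rates = {
        name: units_per_clock[name] / need
        for name, need in demand_per_pixel.items()
        if need > 0
    }
    limiter = min(rates, key=rates.get)
    return rates[limiter], limiter


if __name__ == "__main__":
    # A hypothetical '8x1/16x0' part: 8 of everything, but 16 Z reads and writes.
    chip = {
        "z_reads": 16, "colour_reads": 8, "interpolators": 8,
        "shader_instructions": 8, "sampler_addresses": 8,
        "texture_reads": 8, "z_writes": 16, "colour_writes": 8,
    }
    # A pixel that needs four shader instructions and two texture reads.
    shaded = {"z_reads": 1, "shader_instructions": 4, "texture_reads": 2,
              "z_writes": 1, "colour_writes": 1}
    # A Z-only pixel, as in an initial depth pass.
    z_only = {"z_reads": 1, "z_writes": 1}
    print(limiting_rate(chip, shaded))
    print(limiting_rate(chip, z_only))
```

With these made-up demands the shaded pixel is held to 2 pixels per clock by shader instructions, while the Z-only pixel reaches 16, which is one way to read the '16x0' half of the label.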
|
|
|
|
|
#18 |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
So since I and Chalnoth both touched on it I'd better briefly mention bubbles: a bubble is what you get when anything further down the pipe has to wait for some event higher up the pipe.
An extreme example is when the application reads the back buffer - the pipeline has to be completely flushed so that you can guarantee all rendering operations to that back buffer have completed. A lot of effort and quite a bit of silicon go into avoiding bubbles. |
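A bubble is easy to see in a toy shift-register model of a pipe. This Python sketch is purely illustrative: the fixed depth, the one-item-per-stage rule and the names `run_pipeline` and `count_bubbles` are assumptions, and a real GPU pipe is far less regular.

```python
def run_pipeline(inputs, depth):
    """Push items through a `depth`-stage pipe, one stage per clock.

    `inputs` holds one entry per clock: an item, or None when the top of the
    pipe had nothing ready. Returns what leaves the pipe on each clock, with
    None marking an empty slot.
    """
    stages = [None] * depth
    out = []
    for item in list(inputs) + [None] * depth:  # extra clocks drain the pipe
        out.append(stages[-1])
        stages = [item] + stages[:-1]
    return out


def count_bubbles(out):
    """Empty slots between the first and last real output (fill and drain excluded)."""
    real = [i for i, item in enumerate(out) if item is not None]
    if not real:
        return 0
    return sum(1 for item in out[real[0]:real[-1] + 1] if item is None)


if __name__ == "__main__":
    # The top of the pipe stalls for one clock between 'b' and 'c'.
    out = run_pipeline(["a", "b", None, "c"], depth=3)
    print(out)
    print(count_bubbles(out))
```

The single stalled clock at the top shows up as one empty slot at the bottom three clocks later; a full flush is the extreme case where the pipe has to empty out completely before any new work goes in.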
|
|
|
|
|
#19 | |
|
Member
Join Date: Mar 2004
Location: Australia
Posts: 97
|
Quote:
So how does "8x1" fit in the picture? I'm confused :? |
|
|
|
|
|
|
#20 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,951
|
We're not talking "Geometry Quads" here, but pixel quads. Once you've gone through the geometry setup and evaluated the triangle to screen space the triangle is split up into 2x2 pixel regions (or quads) and then processed by a "quad" of pixel pipelines.
|
|
|
|
|
|
#21 |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
Blame our crappy terminology. 'Quads' in this context are 'Quad-pixels' not 'Quadrilaterals'. I'm not a big fan of it either, but we're stuck with it
|
|
|
|
|
|
#22 |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,951
|
|
|
|
|
|
|
#23 |
|
Senior Member
Join Date: Feb 2002
Location: CT
Posts: 2,024
|
A search on the term "proxel" (the smiley in that post is a link, BTW) will drop you into the middle of some long discussions of this that go into a lot of detail from the different perspective of trying to understand and describe some architectures when this complexity was being exhibited. That term is one I made up (and no one else uses, so don't learn it :P) to try to discuss pixel shading with some relation to the more easily understood pixel and texel terms, and the discussions touch on the topic question from many angles.
If you find the discussions confusing instead of revealing, just disregard. |
|
|
|
|
|
#24 | |
|
Junior Member
Join Date: Feb 2002
Posts: 83
|
Quote:
BTW, how many textures are used in popular games? As far as I know, Quake uses 3 textures in some places and Serious Sam uses three textures; how about UT2004, Far Cry, or Painkiller? Is quad-texturing widely used now?
__________________
Rong"Rookie"Huo Former Beyond3d Boys,Err,not Bit boys |
|
|
|
|
|
|
#25 |
|
Join Date: May 2002
Location: New York, NY
Posts: 12,678
|
The number of textures used will vary from surface to surface, and will vary widely depending upon the game.
For a complex surface, I would suspect 4 textures would be a minimum. Anyway, the hypothetical "32x0" that we're talking about here would have nothing to do with textures, but rather with rendering z and stencil data. Such an architecture would accelerate an initial z-pass (something that is necessary for shadow volumes, but is also helpful in allowing hardware to not have to render anything that will later be covered up), as well as stencil shadow volume rendering. It may also be possible for such an architecture to accelerate other shadowing techniques, but that would depend upon the hardware implementation.
__________________
April 20, 1979 - America must never forget. |
|
|
|