Old 15-Jun-2007, 13:14   #851
Silent_Buddha
Regular
 
Join Date: Mar 2007
Posts: 8,952
Default

Well, if more people were like me and wouldn't buy a game unless it either has in-game AA support or works with driver-forced AA... then maybe software companies would take notice. Either that or they'll just stop porting their console games over to PC.

Either way, as in the past, I refuse to play or pay for anything that doesn't allow the use of AA in some form. Vanguard being the exception. Peer pressure from a multitude of friends can be painful.

Regards,
SB
Silent_Buddha is offline   Reply With Quote
Old 15-Jun-2007, 13:26   #852
AlexV
Heteroscedasticitate
 
Join Date: Mar 2005
Posts: 2,354
Default

As good news for the rest of the universe, us normal ppl actually buy stuff because we enjoy it, find something attractive in it, or are fooled by some ad or woman or whatever. You're one odd consumer, that's certain... you were harping about the 2900 being hot and needing an additional PSU, you're generally in love with the tent modes (that generally tend to blur waaaay too much), and you buy games based on AA support? AA support adds something around... wait... zilch to the intrinsic value of a game or to the intrinsic quality of its graphics. I think the vast majority of game consumers would rather stare at a nice image with complex materials and complex lighting, done through deferred rendering or whatever and without AA, than stare at Q3 with 128x AA. But to each his own, and please don't take this as a criticism directed at your person.
__________________
Donald Knuth: Science is what we understand well enough to explain to a computer. Art is everything else we do.
AlexV is offline   Reply With Quote
Old 15-Jun-2007, 17:51   #853
caffeinated
Member
 
Join Date: Nov 2002
Location: X.X.X.X:80
Posts: 305
Default

Quote:
Originally Posted by Morgoth the Dark Enemy View Post
With all due respect, you`re being exceptionally dense here, for no apparent reason other than arguing.
You could be right, but I assure you that my "dense-ness" wasn't feigned. In any case, I'd rather move on than argue. Through the following discussion, more information was revealed, and that is what counts.

Quote:
Originally Posted by Morgoth the Dark Enemy View Post
And I really wish that R6:V would go away from testing suites, as it`s a POS game, a POS port, a POS implementation of UE3 and ultimately a POS that takes optimizing time away from other more interesting things.
I haven't played it enough to make that particular judgement...but then again, the fact that I have it and *haven't* played it much due to it being rather dull also says something
__________________
Experiencing the joyous quiet of Sunon --> Panaflo.

Last edited by caffeinated; 15-Jun-2007 at 17:59.
caffeinated is offline   Reply With Quote
Old 16-Jun-2007, 19:58   #854
caffeinated
Member
 
Join Date: Nov 2002
Location: X.X.X.X:80
Posts: 305
Default

I still think that if A card can do it, despite it being "impossible" then B card should be able to do it. If it can't, that is a deficiency. It is, in my own view, disingenuous to attempt to explain away a shortcoming by claiming "yes but it isn't fair!".
__________________
Experiencing the joyous quiet of Sunon --> Panaflo.
caffeinated is offline   Reply With Quote
Old 16-Jun-2007, 20:07   #855
digitalwanderer
Dangerously Mirthful
 
Join Date: Feb 2002
Location: Winfield, IN USA
Posts: 15,292
Default

Quote:
Originally Posted by caffeinated View Post
I still think that if A card can do it, despite it being "impossible" then B card should be able to do it. If it can't, that is a deficiency. It is, in my own view, disingenuous to attempt to explain away a shortcoming by claiming "yes but it isn't fair!".
Fair enough, that's your opinion....but that doesn't make it the definitive answer on the matter y'know.
digitalwanderer is offline   Reply With Quote
Old 16-Jun-2007, 20:31   #856
caffeinated
Member
 
Join Date: Nov 2002
Location: X.X.X.X:80
Posts: 305
Default

Quote:
Originally Posted by digitalwanderer View Post
Fair enough, that's your opinion....but that doesn't make it the definitive answer on the matter y'know.
No, it doesn't. It does beg the question, however, the *important* question, of how it can be implemented. Direct OS calls aren't possible in every case, so then you have to find a way to make it happen by a different route. That's all. Let's find ways to fix problems if we are going to point them out, and sometimes that is something as simple as a change in the thought process.
__________________
Experiencing the joyous quiet of Sunon --> Panaflo.
caffeinated is offline   Reply With Quote
Old 17-Jun-2007, 06:36   #857
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by Razor1 View Post
With MSAA there is no way to access the pre-resolved samples is there?
If walls could speak...
__________________
[twitter]
More samples, we need more samples! [Dean Calver]
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way
nAo is offline   Reply With Quote
Old 17-Jun-2007, 14:48   #858
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,862
Send a message via Skype™ to Jawed
Default

Quote:
Originally Posted by nAo View Post
If walls could speak...


Jawed
Jawed is online now   Reply With Quote
Old 17-Jun-2007, 15:04   #859
NocturnDragon
Member
 
Join Date: Feb 2002
Posts: 393
Default

Quote:
Originally Posted by nAo View Post
If walls could speak...
Are you implying that the data is actually there and accessible, but developers cannot use it because no API has an interface to it (except DX10, of course)?
__________________
NocturnDragon
Q :Why did the chicken cross the road? Evolutionist: Pure chance. Evolutionist: Only the fittest chickens survive crossing the road. Creationist: God created the chicken on the other side of the road. There is no proof it ever was on this side.
NocturnDragon is offline   Reply With Quote
Old 18-Jun-2007, 22:41   #860
DeanoC
Senior Member
 
Join Date: Feb 2003
Location: Sheppey, UK
Posts: 1,439
Default

Quote:
Originally Posted by NocturnDragon View Post
Are you implying that the data is actually there and accessible, but developers cannot use it because no API has an interface to it (except DX10, of course)?
Erm how do you think MSAA works? I mean there has to be a pre-resolved buffer and the GPU is gonna do the resolves so...

A driver is after all just using the hardware at a lower level than the normal API. Now whether the GPU has the kind of access to the pre-resolved buffer as you'd like (for example, is it a fixed-function resolve or can you access it like any other texture) is another question entirely. But most hardware can consume MSAA buffers cos they're really just linear textures that are wider than non-MSAA buffers (a 1280x720 4xMSAA buffer is really just a 2560x1440 buffer at the pre-resolve stage).
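To make the buffer arithmetic concrete, here is a minimal Python sketch that treats a 4x MSAA target as a plain buffer holding a 2x2 block of samples per pixel and box-filters it down to one value per pixel. The 2x2 sample layout, the function names and the box filter are assumptions for illustration only; real hardware uses its own sample patterns, tiling and compression, so this is not how any particular GPU stores or resolves the data.

```python
def pre_resolve_size(width, height, samples_x=2, samples_y=2):
    """Size of the buffer if every pixel stores samples_x * samples_y samples."""
    return width * samples_x, height * samples_y


def box_resolve(buffer, samples_x=2, samples_y=2):
    """Average each samples_x by samples_y block of a 2D list into one pixel."""
    out_h = len(buffer) // samples_y
    out_w = len(buffer[0]) // samples_x
    count = samples_x * samples_y
    resolved = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0.0
            # Sum every sample that belongs to output pixel (x, y).
            for sy in range(samples_y):
                for sx in range(samples_x):
                    total += buffer[y * samples_y + sy][x * samples_x + sx]
            row.append(total / count)
        resolved.append(row)
    return resolved


print(pre_resolve_size(1280, 720))  # (2560, 1440)
```

A single-pixel buffer such as `[[0, 4], [4, 8]]` resolves to `[[4.0]]`, the mean of its four samples.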
__________________
Riding the stormy ship called indie game development
DeanoC is offline   Reply With Quote
Old 19-Jun-2007, 12:02   #861
Xmas
Off-season
 
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
Default

Quote:
Originally Posted by DeanoC View Post
Erm how do you think MSAA works? I mean there has to be a pre-resolved buffer and the GPU is gonna do the resolves so...

A driver is after all just using the hardware at a lower level than the normal API. Now whether the GPU has the kind of access to the pre-resolved buffer as you'd like (for example, is it a fixed-function resolve or can you access it like any other texture) is another question entirely. But most hardware can consume MSAA buffers cos they're really just linear textures that are wider than non-MSAA buffers (a 1280x720 4xMSAA buffer is really just a 2560x1440 buffer at the pre-resolve stage).
The context implied getting samples into the shader, which most likely means texture reads.

To have the MSAA buffer as just a simple texture would require disabling color compression, which could substantially reduce the usefulness of MSAA.
Xmas is offline   Reply With Quote
Old 19-Jun-2007, 15:11   #862
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by Xmas View Post
To have the MSAA buffer as just a simple texture would require disabling color compression, which could substantially reduce the usefulness of MSAA.
That would be a naive implementation, a better implementation would give you all the subsamples you need in your pixel shader automatically uncompressed: I expect R600 to implement something like that.
__________________
[twitter]
More samples, we need more samples! [Dean Calver]
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way
nAo is offline   Reply With Quote
Old 19-Jun-2007, 15:20   #863
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,862
Send a message via Skype™ to Jawed
Default

Quote:
Originally Posted by Xmas View Post
The context implied getting samples into the shader, which most likely means texture reads.
I don't see why the RBEs can't have a "backward" datapath to the shader pipeline. After all, there's a forward path.

I'm sure I saw somewhere a hint that the 8KB of read/write cache, per SIMD, is used as the path for this. I interpret this to mean that the RBEs have a write path to the R/W cache and this cache is then used as the input for the AA-resolve shader, as though they are per-pixel attributes. The R/W cache is also designed to share data between "neighbouring pixels", a feature of tented-AA - though obviously texture-based AA can do the same.

The RBEs are the perfect units to fetch the render target because they're designed to do this anyway, and because the RBEs will use the least amount of bandwidth doing so (since the RBEs have access to the compression flags). Though the bandwidth saving would diminish as per-frame triangle count increases (assuming ~constant overdraw factor).

Which would run faster? The texture units, in point-sampling mode, can fetch 80 samples per clock. What's the sample rate for the RBEs? 64 colour samples per clock?

Clearly, though, the programmer-exposed concept of shader AA in D3D10 is texture-based, e.g. for HDR or deferred-rendering where each sample is accessible. That fact does rather undermine my argument.

Also, I'm thinking there's a very strong likelihood that ATI has used the multiple-concurrent context support of R600 to effect the shader AA pass:

Code:
   Frame 0 rendering   AA
|---------------------|--|
                         Frame 1 rendering   AA
                      |---------------------|--|
so that AA resolve for frame 0 is able to perform texture fetches while frame 1 is also able to perform texture fetches as it starts rendering.

Jawed
Jawed is online now   Reply With Quote
Old 19-Jun-2007, 19:36   #864
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Send a message via ICQ to OpenGL guy
Default

Quote:
Originally Posted by Jawed View Post
Also, I'm thinking there's a very strong likelihood that ATI has used the multiple-concurrent context support of R600 to effect the shader AA pass:

Code:
   Frame 0 rendering   AA
|---------------------|--|
                         Frame 1 rendering   AA
                      |---------------------|--|
so that AA resolve for frame 0 is able to perform texture fetches while frame 1 is also able to perform texture fetches as it starts rendering.
This would require double buffering the AA buffers as you can't still be resolving the old frame while beginning rendering on the new.
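The buffer hazard can be sketched with a toy timing model in Python. It assumes, purely for illustration, that rendering takes 10 time units, the resolve takes 2, and that the hardware could overlap a resolve with the next frame's rendering at all; none of these numbers or names come from R600. The only rule enforced is that a frame cannot start rendering into an AA buffer that is still being resolved.

```python
RENDER_TIME = 10   # assumed cost of rendering one frame (arbitrary units)
RESOLVE_TIME = 2   # assumed cost of the AA resolve (arbitrary units)


def schedule(num_frames, num_aa_buffers):
    """Return (frame, buffer, render_start, resolve_end) for each frame."""
    buffer_free_at = [0] * num_aa_buffers  # when each AA buffer is fully resolved
    render_free_at = 0                     # when the previous frame finished rendering
    timeline = []
    for frame in range(num_frames):
        buf = frame % num_aa_buffers
        # Rendering must wait for the previous frame's rendering AND for this
        # buffer's pending resolve, otherwise unresolved samples get overwritten.
        render_start = max(render_free_at, buffer_free_at[buf])
        render_end = render_start + RENDER_TIME
        resolve_end = render_end + RESOLVE_TIME
        render_free_at = render_end
        buffer_free_at[buf] = resolve_end
        timeline.append((frame, buf, render_start, resolve_end))
    return timeline


for buffers in (1, 2):
    print(buffers, schedule(3, buffers))
```

With one buffer, frame 1 cannot start until time 12; with two buffers it can start at time 10, which is the overlap drawn in the diagram. Whether the hardware can actually run both at once is a separate question that this model does not answer.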
__________________
I speak only for myself.
OpenGL guy is offline   Reply With Quote
Old 19-Jun-2007, 19:41   #865
pelly
Member
 
Join Date: Sep 2002
Location: San Jose, CA
Posts: 159
Default

I'd love to see the second part of B3D's coverage for this card.....I know staff said they're waiting on a few items.....but couldn't the article be posted now (or soon) with the "special bits" being added as an update once the info is available?

By the time we see things, there will likely be some major driver changes....with some possible "key" performance enhancements...
pelly is offline   Reply With Quote
Old 19-Jun-2007, 20:06   #866
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,862
Send a message via Skype™ to Jawed
Default

Quote:
Originally Posted by OpenGL guy View Post
This would require double buffering the AA buffers as you can't still be resolving the old frame while beginning rendering on the new.
Hmm, so R600 can't use regions of memory for distinct render targets, in order to "double-buffer"? The hierarchical- Z and -stencil buffers aren't regionalised either?

What happens when Aero is running? e.g. you have the 3D interface running, which is a z-buffered render target and within that multiple 3D applications each of which has its own "private" z-buffered render target.

Or, erm, isn't that possible? I thought it was...

Jawed
Jawed is online now   Reply With Quote
Old 19-Jun-2007, 20:13   #867
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Send a message via ICQ to OpenGL guy
Default

Quote:
Originally Posted by Jawed View Post
Hmm, so R600 can't use regions of memory for distinct render targets, in order to "double-buffer"? The hierarchical- Z and -stencil buffers aren't regionalised either?
You lost me. If you have a single AA buffer, you can't resolve it and render to it at the same time. You don't know where the application is going to render so it may corrupt data that still needs to be resolved. In any event, as Eric mentioned, we are using shader resolve which means the 3D pipeline is being used for the resolve operation in any event.
Quote:
What happens when Aero is running? e.g. you have the 3D interface running, which is a z-buffered render target and within that multiple 3D applications each of which has its own "private" z-buffered render target.

Or, erm, isn't that possible? I thought it was...
That's no problem at all because everything has their own private buffers. Aero has one Z buffer, each 3D app has their own, etc.
__________________
I speak only for myself.
OpenGL guy is offline   Reply With Quote
Old 19-Jun-2007, 20:23   #868
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,862
Send a message via Skype™ to Jawed
Default

Quote:
Originally Posted by OpenGL guy View Post
You lost me. If you have a single AA buffer, you can't resolve it and render to it at the same time. You don't know where the application is going to render so it may corrupt data that still needs to be resolved. In any event, as Eric mentioned, we are using shader resolve which means the 3D pipeline is being used for the resolve operation in any event.

That's no problem at all because everything has their own private buffers. Aero has one Z buffer, each 3D app has their own, etc.
Ah, I was guessing that because each application can have its own private buffers, a game could use two buffers and flip between them for the purposes of AA resolve, so that there'd be no AA clash.

Oh well.

So what you're saying is that this is a texture unit based resolve, much the same as D3D10 exposes for programmers. Presumably this means that there's no bandwidth or ALU cycle savings to be made, by detecting compressed tiles. Or does R600 dump the compression information out to a second texture, allowing the AA resolve shader to navigate the AA data efficiently?

Jawed
Jawed is online now   Reply With Quote
Old 19-Jun-2007, 21:07   #869
Xmas
Off-season
 
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
Default

Quote:
Originally Posted by nAo View Post
That would be a naive implementation, a better implementation would give you all the subsamples you need in your pixel shader automatically uncompressed: I expect R600 to implement something like that.
Yes, of course. I was referring to Deano's "MSAA buffers being just linear textures" which isn't true when compression comes into play.

Quote:
Originally Posted by Jawed View Post
I don't see why the RBEs can't have a "backward" datapath to the shader pipeline. After all, there's a forward path.
You can have a data path from anywhere to anywhere. I didn't mean to imply any limitation of that kind.
Xmas is offline   Reply With Quote
Old 19-Jun-2007, 21:21   #870
OpenGL guy
Senior Member
 
Join Date: Feb 2002
Posts: 2,291
Send a message via ICQ to OpenGL guy
Default

Quote:
Originally Posted by Jawed View Post
Ah, I was guessing that because each application can have its own private buffers, a game could use two buffers and flip between them for the purposes of AA resolve, so that there'd be no AA clash.
The app could do this, but there'd be no savings as the pipeline would still be in use for the resolve.
Quote:
So what you're saying is that this is a texture unit based resolve, much the same as D3D10 exposes for programmers. Presumably this means that there's no bandwidth or ALU cycle savings to be made, by detecting compressed tiles. Or does R600 dump the compression information out to a second texture, allowing the AA resolve shader to navigate the AA data efficiently?
I'm not at liberty to go into specifics, but we do take steps to do things efficiently.
__________________
I speak only for myself.
OpenGL guy is offline   Reply With Quote
Old 19-Jun-2007, 21:35   #871
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Let's say the R600 shader core can read back a compression flag which tells us whether a tile's worth of subsamples is compressed (all samples equal) or not. One would need dynamic branching to dynamically skip the extra texture reads/extra math; imho that does not sound like a good idea.
What if we could read such a flag and fill a render target with a mask that we can use later to automatically early-skip every pixel that has compressed subsamples (or the other way around)?
Then our resolve pass could be decoupled into 2 full-screen passes, one processing every subsample belonging to a pixel and the other sampling just one subsample per pixel.
AA resolves wider than one pixel would need some special care at mask generation time though.
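The two-pass idea can be sketched on the CPU in Python. Everything here is a hypothetical stand-in: the mask is computed per pixel by comparing subsamples, whereas the post imagines it coming from a per-tile hardware compression flag, and the "early-skip" is just an `if` rather than a depth or stencil test.

```python
def build_mask(samples):
    """samples[y][x] is a list of subsample values.

    Returns True where all subsamples of a pixel are equal ("compressed").
    """
    return [[all(s == px[0] for s in px) for px in row] for row in samples]


def resolve_two_pass(samples):
    mask = build_mask(samples)
    height, width = len(samples), len(samples[0])
    out = [[None] * width for _ in range(height)]

    # Pass 1: compressed pixels only need a single subsample read.
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                out[y][x] = float(samples[y][x][0])

    # Pass 2: the remaining pixels read and average every subsample.
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                px = samples[y][x]
                out[y][x] = sum(px) / len(px)
    return out


frame = [[[1, 1, 1, 1], [0, 0, 4, 4]]]
print(build_mask(frame))        # [[True, False]]
print(resolve_two_pass(frame))  # [[1.0, 2.0]]
```

For a filter wider than one pixel, a pixel would presumably have to be marked uncompressed whenever any neighbour inside the filter footprint is, which is the extra care at mask generation time mentioned above.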
__________________
[twitter]
More samples, we need more samples! [Dean Calver]
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way

Last edited by nAo; 19-Jun-2007 at 21:40.
nAo is offline   Reply With Quote
Old 20-Jun-2007, 01:01   #872
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,862
Send a message via Skype™ to Jawed
Default

If the compression flags (which are just a 2D structure) are dumped into memory to be used as a texture, it should be simple to do this in a single pass. In the AA-resolve pixel shader, each pixel knows whether it's part of a compressed tile or not. The thread size of the GPU (16 in RV610, 32 in RV630, 64 in R600) determines coherence, which means that the execution time is constant for uncompressed tiles (it's also constant for compressed tiles). We don't know the size of the compression tiles...

This patent document:

Method and apparatus for anti-aliasing using floating point subpixel color values and compression of same

refers to tiles of 2x2 and 4x4 pixels. The size of tiles may depend on the colour precision in the render target... The number of samples per pixel (2, 4 or 8) may also affect the tile size.

Also it uses a 3-level compression scheme: uncompressed, partially compressed and fully compressed and seems to be designed around the concept of an fp16 pixel format (i.e. 64-bit per pixel)

So, ahem, juggling this data is a bit more complicated than I was thinking.

---

If the render target is in a 32-bit per pixel format, presumably the texture fetch for a destination pixel can be performed in a single texture load operation using a "32-bit fetch". This will actually fetch 128-bits, which is 4 AA samples' worth of colour data. The AA resolve shader then needs to "unpack" these samples.
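As a rough illustration of the unpacking step, the Python sketch below splits one 128-bit value into four 32-bit samples and averages them per channel. The bit layout (sample 0 in the lowest 32 bits, red in the lowest byte of each sample, 8 bits per channel) and the function names are assumptions made up for this example, not a description of how R600 packs its render targets.

```python
def unpack_samples(word128):
    """Split a 128-bit integer into four 32-bit samples, lowest bits first."""
    return [(word128 >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


def unpack_rgba8(sample):
    """Split a 32-bit sample into (r, g, b, a), assuming r is the low byte."""
    return tuple((sample >> (8 * c)) & 0xFF for c in range(4))


def resolve_pixel(word128):
    """Average the four samples of one pixel, channel by channel."""
    channels = [unpack_rgba8(s) for s in unpack_samples(word128)]
    return tuple(sum(ch[c] for ch in channels) / 4 for c in range(4))


# Two samples with full red, two black samples.
word = 0xFF | (0xFF << 64)
print(resolve_pixel(word))  # (127.5, 0.0, 0.0, 0.0)
```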

Jawed
Jawed is online now   Reply With Quote
Old 28-Jun-2007, 00:21   #873
ShaidarHaran
hardware monkey
 
Join Date: Mar 2007
Posts: 3,898
Default

Quote:
Originally Posted by caffeinated View Post
Err...should people instead be spinning that the HD 2900XT has spectacular drivers?

Call of Juarez had noticeable rendering issues. That doesn't automatically translate to "shoddy drivers", and I don't think anyone here tried to "spin" it that way. Also note that the application being renamed in R6:V, *according to the screenshots I saw*, seemed only to apply to the 2900XT - there was no notation that such a thing had to be done with the G80 screenshots. Again, that doesn't automatically translate to "shoddy drivers" but if it isn't doing something that G80 is doing...well, I think you can draw the conclusion that they at least need work.
Rendering errors in CoJ, eh?
http://www.geforce3d.net/index/node/35?page=0%2C5
R600's not alone there.
ShaidarHaran is offline   Reply With Quote
Old 02-Jul-2007, 18:09   #874
zealotonous
Junior Member
 
Join Date: Feb 2007
Posts: 29
Default

Quote:
Originally Posted by ShaidarHaran View Post
Rendering errors in CoJ, eh?
http://www.geforce3d.net/index/node/35?page=0%2C5
R600's not alone there.
I can't seem to locate the rendering issues you are presumably referencing. Can you be a bit more specific and tell us where to look?
zealotonous is offline   Reply With Quote
Old 02-Jul-2007, 19:25   #875
AlexV
Heteroscedasticitate
 
Join Date: Mar 2005
Posts: 2,354
Default

Screen-door artifacts, which no one seems to understand, are inherent to the algorithm employed for anti-aliasing alphas and have jacksquat to do with this or that card. The R600's artifacts in CoJ were of a different sort (with the early drivers used for the first batch of reviews, it was blurrier than the G80 in a number of scenes, most apparently in the one with the fireplace at the end), but I think they're fixed or significantly reduced now, as the benchmark produces fairly clean graphics on the R600 nowadays, in my experience.
__________________
Donald Knuth: Science is what we understand well enough to explain to a computer. Art is everything else we do.
AlexV is offline   Reply With Quote
