So, who thinks their predictions were right about the NV40?

DemoCoder said:
Man you guys are tough to please. 2x performance of previous gen + new features, and still not impressed. It used to be that 2x performance was a pretty impressive gain for people.

There is no doubt that the NV40 is an extremely impressive piece of kit, but personally, I like a high-quality picture. I would rather some of those extra frames the NV40 gives us be used to improve IQ. Giving me 200 frames per second is all well and good, but I would happily sacrifice 50 or a hundred of those for improved IQ. Once the framerate gets above a high enough level, I'd rather have more IQ.

I'm disappointed that Nvidia have given us a great product but seem to have neglected to improve IQ beyond what we've already seen with the Radeons. Nvidia have obviously focused on getting back the performance crown, but in doing so I think they left a gap in their product: really improved IQ.
 
I just don't think it's credible to expect to be able to run 8xMSAA on current games (i.e. those released in the last 6 months or coming in the next 6) at reasonable resolutions. On older games, sure. But I can't even run 6xMSAA today on most games (e.g. BF Vietnam, FC, CallOfDuty, etc.). It seems like a bragging right only.

See, I wish ATI or NVidia would go with something like FAA (but a real Z3-style method), so I'd get 8-64x sparse-sampled AA on edges.

Just think about the bandwidth alone needed for a 16-pipe card to write 8 samples per pipe. Compression will help somewhat. I guess I'll wait and see whether R420 can run Doom3 or HL2 with 8xMSAA @ 1280x1024. I'm withholding judgement.
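To put a rough number on it (all the clocks and per-sample formats below are my own illustrative assumptions, not vendor specs):

```python
# Raw sample-write bandwidth for 8xMSAA on a 16-pipe part.
# All figures are illustrative assumptions, not vendor specs.

pipes = 16          # pixels written per clock
samples = 8         # MSAA samples per pixel
bytes_color = 4     # 32-bit colour per sample
bytes_z = 4         # 24-bit Z + 8-bit stencil per sample
core_mhz = 400      # assumed core clock

bytes_per_clock = pipes * samples * (bytes_color + bytes_z)
peak_gb_s = bytes_per_clock * core_mhz * 1e6 / 1e9

print(bytes_per_clock, "bytes/clock")          # 1024 bytes/clock
print(round(peak_gb_s), "GB/s uncompressed")   # ~410 GB/s
# Compare that with the ~35 GB/s a 256-bit bus at 1.1GHz effective
# actually delivers; hence the reliance on colour/Z compression.
```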

Bouncing,
For me, part of improving IQ is enabling games to achieve higher levels of visual effects. Most FPS today already look good enough to me with 4xAA. I'm just not overly excited by the diminishing returns of higher AA levels, unless you're going to go to something like 64x stochastic as in CGI rendering. Instead, I want games to use the extra power and bandwidth to deliver graphics like D3, HL2, Unreal3, EQ2, etc. Not just UT2004 with slightly better edge AA that's hard to notice unless you take a screenshot.

Temporal AA would be something different, but apart from 3dfx, no one ever seems to have pursued it.
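The idea itself is simple enough; a conceptual sketch, with made-up jitter offsets rather than anyone's actual sample positions:

```python
# Conceptual temporal AA: alternate between two 2-sample patterns each
# frame, so at high framerates the eye integrates an effective 4-sample
# pattern. Offsets are made up for illustration.

PATTERN_A = [(0.25, 0.25), (0.75, 0.75)]   # sub-pixel offsets, even frames
PATTERN_B = [(0.75, 0.25), (0.25, 0.75)]   # mirrored offsets, odd frames

def sample_offsets(frame_index):
    """Return this frame's sub-pixel sample positions."""
    return PATTERN_A if frame_index % 2 == 0 else PATTERN_B

for f in range(4):
    print(f, sample_offsets(f))
# The catch: below roughly 60fps the alternation stops fusing and shows
# up as edge shimmer instead of extra smoothness.
```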
 
DemoCoder said:
Man you guys are tough to please. 2x performance of previous gen + new features, and still not impressed. It used to be that 2x performance was a pretty impressive gain for people.

The NV40 is a damned impressive product, but until we get a chance to see the R420 strutting its stuff, the money burning a hole in my pocket is gonna carry on burning.

But whereas before it was definitely gonna be an R420, now it's 50/50.
 
DC,

And the 8x-sample AA mode NVIDIA provides is more playable right now? Let's see: it's a hybrid MS/SSAA sampling mode, where 8x gives you just another 4*4 grid instead of the 8*8 an 8xRGMS mode would give. Fill-rate and bandwidth requirements are several times higher than with pure MSAA, so performance is way lower, and at way lower resolutions at that.

To counter your last sentence: will I be able to run D3 on the NV40 with what they market as 8xAA?

***edit: if it helps any, I don't need exotic AA modes for FPS games; guess what, I don't have time to look for jaggies that much. However, I do need them for racing/flight/space etc. sims, where I don't need excess framerates and inevitably have more time to notice details.
 
Did I ever claim 8x mixed mode was a performance winner? I merely pointed out that a bug in the current implementation makes it 50% worse than it should be. Just like I pointed out that the GT2 UltraShadow numbers are strange. Again, it was a technical point, and it's being taken out of context. Where did I claim NVidia's mixed 8x was great, or that any future fillrate-hungry games would run with it? Humus, are you taking note of how this starts?


I simply don't think the bandwidth and gate costs of higher MSAA modes justify the marginal improvements in IQ they give. Now, if we're talking Z3/FAA/TBDR techniques, then scaling up the # of samples is way more efficient and justified. I'd have more confidence if I could run more games at 6xMSAA, which I can't. Therefore, I do not expect to be running high-end visual-candy games coming out this year at 8xMSAA.

Sure, maybe on flight sims you'll be able to do it. Not my cup of tea. Then again, flight sims usually don't need pixel shaders either. You could probably software-render flight sims with CPU-computed edge AA.
 
DemoCoder said:
Humus, are you taking note of how this starts?

DC....please don't take these things SO personally. The fact that the NV40 can't be everything to everyone is a given. At some point you need to consider that you don't need to constantly defend nVidia. The company is run by frail humans, and humans make mistakes - remember the FX series? You do yourself a disservice by constantly defending them in places where they don't need or deserve defending. You have a great deal of knowledge that many, myself included, appreciate your sharing with us. But please understand that you come across to some as a zealot. Accept that nVidia "may" have made a misstep by not including more FSAA choices. That doesn't mean that, as a product taken as a whole, it's not worlds above not only the FX series, but what ATI has out currently, too.
 
Of course you didn't claim anything of the sort about the 8x mixed mode, yet it's still a valid counterpoint to your bandwidth argument regarding 8xMSAA.

A user opens the control panel, looks for AA modes, checks 8xAA, and starts playing. Which of the two options above is more usable?

I simply don't think the bandwidth and gate costs of higher MSAA modes justify the marginal improvements in IQ they give.

4*4 vs 8*8 is marginal? Just as marginal as the IQ improvement between 2x and 4xRGMS, yes? It's about twice as much.
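To make the grid comparison concrete, here's a toy edge-equivalent-resolution calculation; the sample positions are illustrative stand-ins, not the actual hardware patterns:

```python
# Edge-equivalent resolution (EER): with a sparse or rotated pattern, N
# samples can each sit on a distinct row and column, so a near-vertical
# edge sees N gradient steps per pixel. Hybrid/ordered-grid modes waste
# samples on shared rows and columns. Positions below are illustrative.

def eer(sample_positions):
    xs = {x for x, _ in sample_positions}
    ys = {y for _, y in sample_positions}
    return len(xs), len(ys)

rgms_4x = [(1, 0), (3, 1), (0, 2), (2, 3)]        # 4x rotated grid
sparse_8x = [(i, (3 * i) % 8) for i in range(8)]  # hypothetical 8x sparse
hybrid_8xS = [(x + d, y + d)                      # 2xRG x 2x2 OGSS-style
              for d in (0, 1) for x in (0, 2) for y in (0, 2)]

for name, p in [("4xRGMS", rgms_4x), ("8x sparse", sparse_8x),
                ("8xS-style", hybrid_8xS)]:
    print(name, "EER = %dx%d" % eer(p))
# -> 4xRGMS 4x4, 8x sparse 8x8, 8xS-style 4x4: eight samples, but no
#    more edge resolution than 4x sparse.
```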

Now, if we're talking Z3/FAA/TBDR techniques, then scaling up the # of samples is way more efficient and justified.

Errrrr..... ;)

Sure, maybe on flight sims you'll be able to do it. Not my cup of tea. Then again, flight sims usually don't need pixel shaders either. You could probably software-render flight sims with CPU-computed edge AA.

I mentioned all kinds of sims, not just flight sims, and yes, shaders will get used more and more in those too, especially the racing sims.

Good luck with CPU-computed edge AA on, let's say, FS2004; I'd need those CPU resources elsewhere.

There's a clear difference of preferences here, yet I don't see anything irrational, or vastly impossible for this generation, in what I would have preferred to see. If it makes sense from a technical point of view to expose 8x or 16x sample hybrid modes on an accelerator, then a far less demanding 8x MSAA mode isn't an exaggeration.
 
DemoCoder said:
Bouncing,
For me, part of improving IQ is enabling games to achieve higher levels of visual effects. Most FPS today already look good enough to me with 4xAA. I'm just not overly excited by the diminishing returns of higher AA levels, unless you're going to go to something like 64x stochastic as in CGI rendering. Instead, I want games to use the extra power and bandwidth to deliver graphics like D3, HL2, Unreal3, EQ2, etc. Not just UT2004 with slightly better edge AA that's hard to notice unless you take a screenshot.

I can notice poor AA almost straight away, especially when moving, so when I saw the "telegraph cable" pictures in the reviews, it stood out a mile. I can tell within seconds when I've forgotten to force AA on the control panel for UT2K4.

I'd agree with your points on IQ being more than just AA, but it seems that NV40 still has lower AA quality at 4x than R300. I was hoping for more than that from a part which is stellar in so many other ways, because AA is such an all-encompassing way of improving IQ across the board.

As I said before, it's difficult to go backwards when you are used to a certain level of IQ, so I'm disappointed that Nvidia didn't see fit to make a more balanced AA part for NV40.
 
martrox said:
DemoCoder said:
Humus, are you taking note of how this starts?

DC....please don't take these things SO personally. The fact that the NV40 can't be everything to everyone is a given.

Martrox, you misunderstand what I'm taking issue with above. I responded to a thread with a mere technical point: the "8x" setting being tested is 2xMS + 4xSS, not 4xMS + 2xSS, as an explanation for why benchmarks seemed to show 8x running not just half as fast as 4xMSAA but at a quarter of the speed. I did not claim that this made 8x "usable", "fast", "desirable", or able to run games like D3, HL2, FarCry, CallofDuty, etc. in such modes at acceptable rates. Yet my statements were being used to suggest that I was claiming such a thing. Moreover, the implication became that because I was "down" on the concept of 8xMSAA, I must be pro-8xS mode, which is ludicrous.
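To spell out the arithmetic behind that (a rough, fill-rate-only model of my own; bandwidth and compression are ignored):

```python
# SSAA multiplies the pixels you actually shade; MSAA only multiplies
# samples per shaded pixel, which costs bandwidth rather than fill rate.
# A rough fill-rate-only model; bandwidth and compression ignored.

def shading_cost(ms_factor, ss_factor):
    """Shaded-pixel cost relative to no AA (MSAA samples are 'free' here)."""
    return ss_factor

for ms, ss in [(2, 4), (4, 2)]:
    print(f"{ms}xMS + {ss}xSS = {ms * ss}x samples, "
          f"{shading_cost(ms, ss)}x the shading work")
# 2xMS + 4xSS = 8x samples, 4x the shading work  -> ~1/4 the framerate
# 4xMS + 2xSS = 8x samples, 2x the shading work  -> ~1/2 the framerate
```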

My last post was a reference to Humus's prediction: "DC will post about a technical issue and get bashed for bias."

I am not posting these tech issues to defend Nvidia specifically; I am pointing out info that I received first-hand from talking to various people at the event, plus my own philosophical conclusions on what I think about the card, or the importance of shader compilers now (compilers being my area of interest).
 
Bouncing, all I can say is, it's up to PowerVR to save us. You want to scale up to 16-64x sparse AA to match prerendered scenes. Moreover, you want some temporal AA too, to smooth out instability in simulation and framerate. The only architecture I see achieving this is a tile-based renderer of some sort.

PVR5, where art thou? Do you have something special in store for us?
 
DemoCoder said:
Man you guys are tough to please. 2x performance of previous gen + new features, and still not impressed. It used to be that 2x performance was a pretty impressive gain for people.

I think you are both off base, BTW, because it's not simply a matter of resolution. Future games that need more fillrate and ALU power are going to run more than 2x slower on older HW.

No, I'm not that hard to please. I already said the NV40 looks like a solid chip all around with no glaring weaknesses.

I said there are two "relatively minor" quibbles though....one being AA, and the other being aniso...if they are able to do it their old way but don't offer the option.

On AA, I have to disagree. Lots of titles are CPU-limited today (see UT2K4), and these cards (NV40 and R420) should be running at higher than 4x MSAA.

This doesn't mean I'm not thrilled about NV40, I am. :) But it's not perfect either.
 
Are ya guys really sure that the NV40 is bandwidth limited? The (p)review at 3dcenter.de indicates otherwise.

My impression about the NV40 is that it's damn fast. I wasn't expecting that much speed from a 400MHz core. However, I sincerely hope that a later driver will bring back the angle independent AF.

There are 2 things which I'm quite disappointed with, though:

(1) Heat. I'm all for silent computing. It might be that the NV40's cooling solution isn't that loud. But if it heats up the PC, it's equally bad for me. I'd prefer by far a solution which exhausts the air out of the PC through the PCI slot.

(2) AA. If the NV40 had a true 8x rotated/sparse MSAA mode, we could use it on almost every existing game (except perhaps some of the newer, more demanding ones). That's a major concern for me.
 
Joe DeFuria said:
No, I'm not that hard to please. I already said the NV40 looks like a solid chip all around with no glaring weaknesses.

My comment was only for Quitch and Ailuros.


I said there are two "relatively minor" quibbles though....one being AA, and the other being aniso...if they are able to do it their old way but don't offer the option.

I'm skeptical on this. I think they saved transistors by going the ATI route. Are we witnessing people flipping sides on this? I thought ATI's AF implementation limitations weren't a big deal? (I never really got too involved in those arguments)

On AA, I have to disagree. Lots of titles are CPU-limited today (see UT2K4), and these cards (NV40 and R420) should be running at higher than 4x MSAA.

Yeah, but D3, EQ2, and HL2 are different issues. They are CPU-limited sometimes, but GPU-limited at others. It depends on the level, how much physics is being triggered, and how many entities are active.

BTW, isn't it time someone shipped a Havok accelerator? :)
 
No, it's bickering time. I have about a month or less for the NV40, then the R420 will be next, and after that..................hey gimme dat keyboard back o_O
 
DemoCoder said:
BTW, isn't it time someone shipped a Havok accelerator? :)

I was wondering about that myself; it's about the only physics engine games use...why not off-load some more work from the CPU to the GPU? :)
 
Twice the performance is a lot; the extra few notches of resolution in high-end games do make a difference, as does the stability for FPS games.

I don't think the IQ debate is over by a long shot. For one, it's so close to the R3xx core in FSAA and AF that I suspect tangible differences will largely be in the base vanilla no-AF/no-AA rendering and how the drivers interpret scenes. The IQ wars for this generation, IMO, will be based on developer tuning and how transparent the drivers are.

I rarely ever use 6x AA with my 9800; it really slows things down too much in general. Indeed, even 4x runs slowly in some of the games I like to play at my preferred resolution (like Call of Duty).

This generation of cards should be able to give me a stable 60fps with 4x at the good resolutions, and that's why I will be upgrading to one or the other in the next 3 or 4 months.
 
For now I just have three complaints about the NV40 Ultra: the need for two Molex connectors, the required 480W PSU (though I really think a good 400W or lower PSU will do just fine), and, worst of all, I was really hoping for better IQ (AA) than the R300, a card that entered the market two years ago...

Nevertheless, the performance and features that it brings are just great; it's a very strong card. Very nice, NVIDIA. Now I'll just have to wait for the R420 to make my decision.

Now we only need another company like PVR in the market. Come on :)
 
DemoCoder said:
My last post was a reference to Humus's prediction: "DC will post about a technical issue and get bashed for bias."

That was a scarily accurate prediction from Humus. :oops:

I must admit that the NV40 was somewhat better than I expected. The true FP performance of the NV40 in particular was astonishing: there was no visible register-penalty difference between full and partial precision. Is it confirmed that the NV40 always uses full FP32 precision in full-precision pixel shaders? (Thinking about the banding in Far Cry's pixel shading...)
 
DemoCoder said:
I'm skeptical on this. I think they saved transistors by going the ATI route. Are we witnessing people flipping sides on this? I thought ATI's AF implementation limitations weren't a big deal? (I never really got too involved in those arguments)

ATI's AF implementation isn't really a big deal, IMO.

However, I've always said that I would prefer to have both: an option for "fast" aniso and an option for "no compromises" aniso. If the 6800 has the ability to do no-compromises aniso but doesn't enable the option in the driver, that would be unfortunate.

If ATI doesn't offer the option at all...that would be unfortunate too.

My point is, nVidia could score points if they offered both. That being said, if I had to pick one or the other, I'd pick the ATI/6800 implementation as a gamer. Artists / designers / professionals would probably disagree though.

Yeah, but D3, EQ2, and HL2 are different issues.

So we think, anyway. ;)

They are CPU-limited sometimes, but GPU-limited at others. It depends on the level, how much physics is being triggered, and how many entities are active.

Correct...but MSAA is typically not GPU-limited AFAIK; it's bandwidth-limited.
 
Evildeus said:
Mintmaster,
I also see bandwidth being the most limiting factor for the R420.
Let's say:

Core: 500MHz vs 412MHz for the XT
Memory: 600MHz vs 365MHz
Pipelines: 16 vs 8

If I calculate correctly, per pipe that makes
(8/16) * (412/500) * (600/365) = 0.67. That means ATI will need to increase its bandwidth savings by 50% to match the per-pipe bandwidth of the 9800XT.

Of course, if the X800 XT core has the same frequency as the 9800, or even the 6800U, it would decrease the need for extreme BW / BW-saving techniques.
I see what you're saying, but I'm saying the 9500 PRO is an excellent indicator of how bandwidth affects the R300 architecture.

Assume these numbers for R420 (clocks chosen for a reason):
Core clock: 550 MHz
Memory clock: 540 MHz
Bus width: 256 bit
Pipelines: 16

Now look at the 9500 PRO:
Core clock: 275 MHz
Memory clock: 270 MHz
Bus width: 128 bit
Pipelines: 8

Now you can see that if you effectively double the 9500 PRO's architecture (pipes and memory bus width), then double the memory/core clock speeds, the result is R420. I'm claiming that in terms of performance, 4 x 9500 PRO = R420. Of course, that's simplifying things a lot, but it's a good starting estimate. R420 will likely have 6 vertex engines as opposed to 8, but that shouldn't affect its performance too much. Hopefully there are some architectural improvements as well to make it even better, such as better compression or a better shader pipe. They'll probably need it to compete with NV40 per clock (although that's a rather irrelevant battle).
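Here's a quick sanity check of that scaling with the speculative numbers above (DDR assumed on both boards, so the effective memory rate is twice the clock):

```python
# Checking the "R420 = 4 x 9500 PRO" estimate. Clocks are the speculative
# figures above, not confirmed specs.

def fill_gpix(core_mhz, pipes):
    return core_mhz * pipes / 1000.0              # Gpixels/s

def bw_gb_s(mem_mhz, bus_bits):
    return 2 * mem_mhz * (bus_bits / 8) / 1000.0  # GB/s, DDR

r420_fill, r420_bw = fill_gpix(550, 16), bw_gb_s(540, 256)
r9500_fill, r9500_bw = fill_gpix(275, 8), bw_gb_s(270, 128)

print(f"fill ratio: {r420_fill / r9500_fill:.1f}x")   # -> 4.0x
print(f"bandwidth ratio: {r420_bw / r9500_bw:.1f}x")  # -> 4.0x
# Both scale by exactly 4, so the bandwidth-per-pixel balance of the
# 9500 PRO is preserved; that's why those clocks were "chosen for a reason".
```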

Do you think people will be interested if I start a thread with speculative R420 performance? :D As if we need another rumour-ish thread on next gen architectures...
 