Nvidia CEO: working on something twice as fast?

MDolenc said:
MuFu: Perhaps depth-stencil buffer separation? :?

Is that good? :LOL:

I have no idea about the software side of things. I was just wondering what some of you more informed guys would guess JC might "request" in terms of simple changes between NV30 and NV35. You'll have to take my word for it that his input has been considered in the refresh.

MuFu.
 
MuFu said:
Could well be over twice as fast in some situations based on specs, yeah. Said this before, but I really hope they resist the temptation to equip it with a dustbuster (of sorts) and shoot for a more conservative clockspeed, say 550-600MHz. Having said that, if they develop a quiet, highly efficient cooling solution and hit 650MHz+ then that'd be pretty neat and they'll have a nice flagship until ATi shoves R400 up their ass.

NV35 has *a lot* of fixes/optimisations compared to NV30, on which it is very closely based (256-bit bus etc. aside). There are also some enhancements/changes based on input from a certain Mr John Carmack of id Software fame, which I don't really know anything about right now.

Any ideas?

MuFu.

Thanks for this delicious information, MuFu! :)
I really don't see how they could get away without a dustbuster if they want to use anything higher than 500MHz, however. IIRC, low-k is about 20% cooler. And it doesn't do miracles, AFAIK.

As for optimizations, let me guess... scheduling optimizations? I still consider that the most likely cause of problems for the NV30, and the reason why the drivers are so complex compared to past architectures.

Changes JC wants? Well, the easiest thing to do is look at his .plan file.

The future is in floating point framebuffers. One of the most noticeable
things this will get you without fundamental algorithm changes is the ability
to use a correct display gamma ramp without destroying the dark color
precision. Unfortunately, using a floating point framebuffer on the current
generation of cards is pretty difficult, because no blending operations are
supported, and the primary thing we need to do is add light contributions
together in the framebuffer. The workaround is to copy the part of the
framebuffer you are going to reference to a texture, and have your fragment
program explicitly add that texture, instead of having the separate blend unit
do it. This is intrusive enough that I probably won't hack up the current
codebase, instead playing around on a forked version.

So, better support for FP framebuffers probably.
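To make the precision point concrete, here's a toy Python sketch of my own (nothing from the .plan itself, and the pass values are made up) showing why per-pass quantisation in an 8-bit framebuffer wrecks the darks once a gamma ramp is applied:

```python
# Toy numeric sketch (made-up numbers, not GPU code): why accumulating
# light in an 8-bit framebuffer and then applying a display gamma ramp
# destroys dark colour precision, while a float framebuffer does not.

GAMMA = 2.2

def to_8bit(x):
    """Quantise a linear value to the nearest 8-bit framebuffer entry."""
    return round(max(0.0, min(1.0, x)) * 255) / 255

# Three dim light passes, each contributing a small linear value.
passes = [0.002, 0.002, 0.002]

# 8-bit path: every blend result is quantised before the next pass.
fb8 = 0.0
for p in passes:
    fb8 = to_8bit(fb8 + p)

# Float path: accumulate at full precision, quantise only at scan-out.
fbf = sum(passes)

# Apply the display gamma ramp at the end, then quantise for display.
out8 = to_8bit(fb8 ** (1.0 / GAMMA))
outf = to_8bit(fbf ** (1.0 / GAMMA))

print(out8 * 255, outf * 255)  # the 8-bit path has drifted visibly brighter
```

With blending done in float and only the final scan-out quantised, the dark values stay honest; quantising after every blend compounds the rounding error exactly where gamma expands it most.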

Also, in the B3D interview:
In your opinion, why is it that the existing framebuffer content for a given pixel isn't a standard input to Pixel Shaders (which is a S3 DeltaChrome feature/advantage)?

All software developers want this, but the hardware developers insist that dedicated blenders vastly simplify the write ordering hazards.


Uttar
 
Uttar said:
Thanks for this delicious information, MuFu! :)
I really don't see how they could get away without a dustbuster if they want to use anything higher than 500MHz, however. IIRC, low-k is about 20% cooler. And it doesn't do miracles, AFAIK.

OK, but wasn't the Dustbuster a last-minute rushed effort? Surely with more warning they'd be able to get something more satisfactory implemented?

So, better support for FP framebuffers probably.

Oh, I do hope so, but I don't hold out much hope for it. We might get the limited blending functionality like that available in fixed-function OpenGL. I think what the programmers really want is the frame-buffer content as input to a pixel program, but reading the last minutes of the OpenGL ARB it looks like the hardware boys all ganged up and said a big "Non!" to this bit of OpenGL 2.0.
 
MuFu said:
Is that good? :LOL:

I have no idea about the software side of things. I was just wondering what some of you more informed guys would guess JC might "request" in terms of simple changes between NV30 and NV35. You'll have to take my word for it that his input has been considered in the refresh.

MuFu.

I am sure you are right when you say his input has been "considered", but the nice thing about considering something is that you aren't locked into it...;) Carmack is merely a single consideration out of many, many more nVidia will also have to "consider"... I for one hope their implemented considerations include Carmack's input, but aren't limited to it, either.
 
WaltC said:
I am sure you are right when you say his input has been "considered", but the nice thing about considering something is that you aren't locked into it...;) Carmack is merely a single consideration out of many, many more nVidia will also have to "consider"... I for one hope their implemented considerations include Carmack's input, but aren't limited to it, either.

Sorry, bad wording on my part. Some of Carmack's ideas have definitely gone into the hardware. This is in addition to the fixes (hardware fog etc.). I assume his input didn't consist of him jumping up and down on a table in front of Jen-Hsun shouting "FASTER DAMMIT, FASTER!!!". :LOL:

MuFu.
 
Oh this is a no brainer...

550-600MHz core and DDRII memory, 256-bit bus, same CineFX architecture, merely updated with insignificant features (still no VS/PS 3.0 compliance), 0.13µm (possibly manufactured using low-k, but I have my doubts about that...), which pretty much sums it all up.

The next major technological breakthroughs (although not as major as the DX9 generation) are R400 & NV40, which promise many interesting improvements, among which is dynamic branching with adequate speed (currently, NV30 has major problems in this regard).
 
alexsok said:
Oh this is a no brainer...

550-600MHz core and DDRII memory, 256-bit bus, same CineFX architecture, merely updated with insignificant features (still no VS/PS 3.0 compliance), 0.13µm (possibly manufactured using low-k, but I have my doubts about that...), which pretty much sums it all up.

The next major technological breakthroughs (although not as major as the DX9 generation) are R400 & NV40, which promise many interesting improvements, among which is dynamic branching with adequate speed (currently, NV30 has major problems in this regard).

As I said before, my bets are on the NV30 scheduling system being badly messed up. It would explain the bad branching speed, everything acting as only 4 pipelines, and a lot of other stuff.
Anyway, that's pretty much in line with what I expect. But why do you doubt nVidia is using low-k?


Uttar
 
If vidcard companies continue to go all speed crazy like they are, with faster cores and more transistors, how the hell are they going to cool them? Most 2GHz+ CPUs need relatively big heatsinks and fans, and they only have about half the transistors of a high-end vidcard. How will graphics companies fit something substantial enough to cool a vidcard running at, say, even 1GHz onto a videocard? I think cooling is the next major hurdle.
 
I don't think someone who spoke both English and Japanese could understand that Babelfish translation...
 
MuFu said:
Is that good?
Well, dev rel guys always shout: clear stencil when you clear the z-buffer if you have stencil, otherwise we have to preserve stencil (and since they are the same memory you have some hoops to jump through). Of course, this thing can be turned around: if you clear stencil only, they'll again have to jump through some hoops to preserve the z-buffer. If you separate them, you don't have to worry about preserving one when clearing the other.
This would further improve performance in Doom 3 (similar to two sided stencil).
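A toy Python sketch of my own (assuming the usual packed D24S8 layout; obviously not real driver or hardware code) of why the separation helps:

```python
# Toy sketch (assumed D24S8 packing; not real hardware or driver code):
# with depth and stencil sharing one 32-bit word, clearing depth while
# preserving stencil forces a read-modify-write on every word. With
# separate buffers, each clear is a blind fill.

DEPTH_CLEAR = 0xFFFFFF  # 24-bit "far plane" clear value

def clear_depth_packed(buf):
    """Clear depth only: must read each word to keep its stencil byte."""
    return [(DEPTH_CLEAR << 8) | (word & 0xFF) for word in buf]

def clear_depth_separate(depth, stencil):
    """Separate buffers: depth is a plain fill, stencil is untouched."""
    return [DEPTH_CLEAR] * len(depth), stencil

packed = [(0x123456 << 8) | 0x7F, (0x000001 << 8) | 0x03]
print([hex(w) for w in clear_depth_packed(packed)])
```

The packed path has to touch every existing word just to keep the stencil byte alive; separate buffers turn the same clear into a fill that never reads memory at all.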

Uttar said:
As I said before, my bets are on the NV30 scheduling system being badly messed up.
This post from Tom Forsyth made me think about pixel shader speeds of NV30. Remember what nVidia actually did with pixel shaders on NV30... they are executed from video memory, and the TMUs are able to return multiple texels per clock if they can predict the fetch position ahead of time. There's just a lot of driver work to be done to make all of this work. I guess we'll see quite some improvements here as the driver matures...
 
MDolenc said:
MuFu said:
Is that good?
Well, dev rel guys always shout: clear stencil when you clear the z-buffer if you have stencil, otherwise we have to preserve stencil (and since they are the same memory you have some hoops to jump through). Of course, this thing can be turned around: if you clear stencil only, they'll again have to jump through some hoops to preserve the z-buffer. If you separate them, you don't have to worry about preserving one when clearing the other.
This would further improve performance in Doom 3 (similar to two sided stencil).

Got ya, cheers. :)

MuFu.
 
If vidcard companies continue to go all speed crazy like they are, with faster cores and more transistors, how the hell are they going to cool them? Most 2GHz+ CPUs need relatively big heatsinks and fans, and they only have about half the transistors of a high-end vidcard. How will graphics companies fit something substantial enough to cool a vidcard running at, say, even 1GHz onto a videocard? I think cooling is the next major hurdle.

Switch to quantum-based graphics computing.


I kid, I kid :D
 