Nvidia GT300 core: Speculation

If we're anywhere near as wrong now as we were then, this upcoming generation could be very interesting. Anybody think we'll get better texture filtering this time around? Maybe a control panel option to force bicubic filtering in the shaders or something. It would be slow but probably more useful than the unnecessarily super high MSAA modes they offer now.

I'm not even sure if the methods described in the following patents are flawless/useful or if any of them have been or will be implemented (and for which markets):

http://v3.espacenet.com/publication...R=7456846B1&DB=EPODOC&locale=en_EP&CC=US&FT=D

http://v3.espacenet.com/publication...R=7446780B1&DB=EPODOC&locale=en_EP&CC=US&FT=D
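As an aside, here's a minimal CPU-side sketch in C (my own illustration, not something taken from those patents or from any driver option that actually exists) of what the quoted "force bicubic filtering in the shaders" idea would have to compute per sample, using Catmull-Rom weights over a 4x4 texel neighbourhood:

```c
#include <math.h>

/* Catmull-Rom cubic weights; t is the fractional texel offset in [0,1). */
static void catmull_rom_weights(float t, float w[4])
{
    float t2 = t * t, t3 = t2 * t;
    w[0] = 0.5f * (-t3 + 2.0f * t2 - t);
    w[1] = 0.5f * ( 3.0f * t3 - 5.0f * t2 + 2.0f);
    w[2] = 0.5f * (-3.0f * t3 + 4.0f * t2 + t);
    w[3] = 0.5f * ( t3 - t2);
}

/* Clamp-to-edge fetch from a single-channel, row-major texture. */
static float fetch(const float *tex, int w, int h, int x, int y)
{
    if (x < 0) x = 0; else if (x >= w) x = w - 1;
    if (y < 0) y = 0; else if (y >= h) y = h - 1;
    return tex[y * w + x];
}

/* Bicubic sample at texel-space coordinates (u, v): 16 fetches per sample. */
float sample_bicubic(const float *tex, int w, int h, float u, float v)
{
    int   x0 = (int)floorf(u), y0 = (int)floorf(v);
    float wx[4], wy[4], result = 0.0f;

    catmull_rom_weights(u - (float)x0, wx);
    catmull_rom_weights(v - (float)y0, wy);

    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            result += wx[i] * wy[j] * fetch(tex, w, h, x0 - 1 + i, y0 - 1 + j);
    return result;
}
```

Sixteen point fetches (or 4 bilinear fetches with the usual weight-folding trick) versus a single bilinear fetch is roughly where the "it would be slow" part comes from.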
 
Heh, it might be worth noting how incredibly wrong everybody was about G80. It's pretty entertaining reading those old speculation threads....
I'll stick my neck out here and suggest that apart from D3D11-specifics and perhaps some DWF-funk, it'll look much like GT2xx :LOL:

I think it'll be more like NV47->G70 than G71->G80.

If we're anywhere near as wrong now as we were then, this upcoming generation could be very interesting. Anybody think we'll get better texture filtering this time around? Maybe a control panel option to force bicubic filtering in the shaders or something. It would be slow but probably more useful than the unnecessarily super high MSAA modes they offer now.
Texture filtering appears to be gaining precision, as one of the D3D11 bullet points is "Required 8-bit subtexel, submip filtering precision". Plus BC6 and BC7 are two new texture formats for better IQ.

Jawed
 
I'll stick my neck out here and suggest that apart from D3D11-specifics and perhaps some DWF-funk, it'll look much like GT2xx
I remember you having a longer neck than that in the past though (that is when I wasn't under that weird embargo of yours LOL) :p

By the way, BC6/7 are texture formats for compressed HDR/LDR content. BC6 won't deliver higher IQ per se, since it isn't lossless. However, being able to compress HDR content should come in handy for saving quite a portion of memory consumption. If that saved memory gets invested elsewhere, it could indirectly raise image quality.
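To put rough numbers on the memory-saving point: both BC6 and BC7 use 16-byte blocks of 4x4 texels, i.e. 8 bits per texel, versus 64 bits per texel for uncompressed FP16 RGBA. A quick back-of-the-envelope calculation in C (the 2048x2048 texture size is just an example I picked):

```c
#include <stdio.h>

int main(void)
{
    const unsigned long long w = 2048, h = 2048;   /* example texture size */

    /* Uncompressed FP16 RGBA: 4 channels x 2 bytes = 8 bytes per texel. */
    unsigned long long fp16_bytes = w * h * 8ULL;

    /* BC6/BC7: 16-byte blocks covering 4x4 texels = 1 byte per texel. */
    unsigned long long bc_bytes = (w / 4) * (h / 4) * 16ULL;

    printf("FP16 RGBA: %llu MiB\n", fp16_bytes >> 20);  /* 32 MiB */
    printf("BC6/BC7  : %llu MiB\n", bc_bytes >> 20);    /*  4 MiB */
    return 0;
}
```

An 8:1 ratio against FP16 RGBA, which is the kind of saving that could then be spent on higher-resolution content instead.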
 
I'll stick my neck out here and suggest that apart from D3D11-specifics and perhaps some DWF-funk, it'll look much like GT2xx :LOL:

One could argue that the D3D11 specifics (tessellation is the only hardware update I can think of) and DWF are significant changes in themselves. But yeah, architecturally it probably won't be as big a change, since going unified and scalar was a massive transformation. That doesn't eliminate the potential for big changes in ROP and/or TMU hardware, though. And how do we know for sure that the MAD+SFU configuration is going to live on?

The mere fact that two companies could come up with vastly different approaches to accelerating the same DirectX 10 API should be evidence enough of the diversity of interpretations these guys could have for DX11.
 
NV47->G70
Haha, you do realize what you're saying there, don't you? :D

Anyway, here are my guesses: the most obvious change at a first glance will be a much lower degree of centralization. Second biggest change will be the flexibility for intermediary data to move from one unit to another with relatively acceptable performance if the developer feels like it. The third biggest change will be related to circuit techniques. The fourth biggest change will be related to program management. And only the fifth biggest change will be related to batch size! ;)

I definitely agree it won't be a G70->G80-like change, but I would definitely expect GT21x->GT3xx to be a bigger change than NV35->NV40...
 
You couldn't come up with a better example than that huh? Thankfully GT2x0 isn't as lame as anything NV3x *shrug*
 
You couldn't come up with a better example than that huh? Thankfully GT2x0 isn't as lame as anything NV3x *shrug*
I could, but it was the only one where the programmability improvement was only incremental; NV10->NV20, for example, could have given the wrong idea...
 
Anyway, while there's always room for further improvement in areas like filtering and antialiasing, and since we've seen unwanted steps backwards in filtering quality as on NV4x/G7x, I'd personally be happy if the current quality simply stays at today's level. Anything better than that is of course always welcome.
 
Texture filtering appears to be gaining precision as a bullet point for D3D11 is "Required 8-bit subtexel, submip filtering precision". Plus BC6 and BC7 are two new texture formats for better IQ.
Doesn't "some hardware" already support this? :) There were some nice screenshots some time ago on this board from some developer, but I don't remember the thread from the top of my head.

By the way BC6/7 are texture formats for compressed HDR/LDR content. BC6 won't deliver higher IQ per se since it isn't losless.
Why not? You can have 2 or even 3 interpolation pairs in a single block, which should improve quality quite a bit by removing texture blockiness.
 
I could, but it was the only one where the programmability improvement was only incremental; NV10->NV20, for example, could have given the wrong idea...
You could have done R300->R420. Incremental programmability improvement, twice the shading power per clock.
 
Haha, you do realize what you're saying there, don't you? :D
:oops: Hilarious, I actually meant NV40->G70, sigh.

Anyway, here are my guesses: the most obvious change at a first glance will be a much lower degree of centralization.
You mean the die won't look so centralised? Are you thinking that the MC/ROP/L2 architecture will take on a more distributed sense as we see in ATI GPUs?

I'm not even sure if the ATI architecture is more distributed.

So, puzzled what it is that will be less centralised.

Jawed
 
This isn't meant as taking part in the discussion; I'd just like to drop this bit of info:

The author of this post claimed that Ageia's next-gen chip was MIMD.
And as we know, Ageia was later bought by nVidia.
 
MDolenc said:
Doesn't "some hardware" already support this?
I'm also under the impression not just some, but most DX10 hardware supports 8-bit precision there. Maybe Intel or S3 doesn't? Or maybe I'm just wrong, hehe.
Jawed said:
You mean the die won't look so centralised? Are you thinking that the MC/ROP/L2 architecture will take on a more distributed sense as we see in ATI GPUs?
I'm not even sure if the ATI architecture is more distributed.
It's not; ATI moved from a less distributed approach as in R6xx to a more distributed approach in RV770 that is similar to G8x/GT2xx; not identical I'm sure, but very similar anyway.
Jawed said:
So, puzzled what it is that will be less centralised.
Input assembly, triangle setup, rasterization, etc. - this will also open up the possibility of programmable blending, probably with a performance penalty...
CouldntResist said:
Author of this post claimed that Ageia's next-gen chip was MIMD.
And as we know, later Ageia was bought by nVidia.
Yeah, and I'd be willing to believe them, given Ageia's troubles getting their SW to really benefit much from the HW; anything that much simpler and more flexible would have made sense from their POV. However, I very much doubt they've had any influence on the GT300 design process; it doesn't make much sense for them to, beyond everyone at NV being able to glance at their design. Either way it would have come too late to make a substantial impact.
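For what it's worth, the practical difference between fixed-function and "programmable" blending boils down to whether the per-pixel read-modify-write at the end of the pipe is a fixed equation or an arbitrary function of source and destination. A rough CPU-side model in C (my own illustration, not a claim about how GT300 would actually do it):

```c
typedef struct { float r, g, b, a; } rgba;

/* Fixed-function blending: the ROP implements a small set of equations,
 * e.g. (SRC_ALPHA, ONE_MINUS_SRC_ALPHA) applied to all four channels. */
rgba blend_fixed_alpha(rgba src, rgba dst)
{
    rgba out;
    out.r = src.a * src.r + (1.0f - src.a) * dst.r;
    out.g = src.a * src.g + (1.0f - src.a) * dst.g;
    out.b = src.a * src.b + (1.0f - src.a) * dst.b;
    out.a = src.a * src.a + (1.0f - src.a) * dst.a;
    return out;
}

/* Programmable blending: the shader reads the destination pixel and writes
 * back any function of (src, dst) - the flexibility brings ordering and
 * read-after-write hazards, hence the expected performance penalty. */
typedef rgba (*blend_fn)(rgba src, rgba dst);

rgba resolve_pixel(rgba src, rgba dst, blend_fn f)
{
    return f(src, dst);
}
```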
 
MDolenc:
I ran into that recently doing some ray marching of Perlin noise. It seemed like both had 6-bit subtexel precision for the texture coordinates going into the filter. I got a screenshot from the ATI version online (4850), while the same thing on NVIDIA (8600) had more jitter but basically the same precision. http://loonies.dk/filter.jpg
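For anyone wondering what N-bit subtexel precision actually means here: the hardware snaps the fractional part of the texture coordinate to a fixed-point grid before computing the filter weights. A toy 1-D model in C (the round-to-nearest snapping is my simplification; real hardware may truncate or snap at a different point in the pipeline):

```c
#include <math.h>

/* Linear filter between two texels, with the interpolation fraction
 * quantised to 'bits' of subtexel precision before the lerp. */
float lerp_quantised(float t0, float t1, float frac, int bits)
{
    float steps = (float)(1 << bits);              /* 64 at 6-bit, 256 at 8-bit */
    float q = floorf(frac * steps + 0.5f) / steps; /* snap fraction to the grid */
    return t0 + (t1 - t0) * q;
}
```

Sweeping frac from 0 to 1 in tiny steps gives a staircase with 64 levels at 6 bits and 256 at 8 bits; the banding visible in the linked filter.jpg is that staircase, and the D3D11 wording just mandates at least the 8-bit grid.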
 
Shouldn't nVidia make the whole chip DP-native so as to have better performance than Intel CPUs on double-precision computing?
 
Yep, and a branch-granularity of 1 pls. ;)

Honestly, that would make the chip too big.
 