Nvidia GT200b rumours and speculation thread

I really hope we'll get some official information on this chip after the R700 performance previews (are they still scheduled for tomorrow?) are published.

Maybe Nvidia wants to spoil the R700 launch like they did with the HD 4850's launch (9800 GTX+).
 
Wouldn't the card have to be 512bit if they enabled the remaining cluster?

Trini already answered, but if you want proof of that, look no further than any run-of-the-mill 8800 GT.
It has a disabled cluster, yet retains the 256bit memory interface and 512MB GDDR3 capacity of its 8800 GTS 512MB, 9800 GTX/GTX+ and 9800 GX2 cousins.
 
Well, they can independently disable shader clusters or ROP clusters (the latter tied to memory: 4 ROPs = 64bit). A shader cluster is two multiprocessors (8 SPs + SFU + registers) on G8x/G9x and three on GT200, so 16 SPs and 24 SPs respectively.

So with G92 you have 128SP/256bit (the full GPU), 112SP/256bit (8800 GT, 9800 GT) and 96SP/192bit (8800 GS).
GTX 280 is 240SP/512bit and GTX 260 is 192SP/448bit, so they have 10 and 8 shader clusters, and 8 and 7 ROP partitions, respectively.

Just recalling those boring facts because there seems to be some confusion around those dreadful "clusters". :)
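To make the cluster math concrete, here's a trivial sketch that just multiplies out the figures above (the helper name is made up, not anything official):

```python
# Sanity-checking the cluster arithmetic above. Each ROP partition
# (4 ROPs) carries a 64bit slice of the memory bus.
def config(sp_per_cluster, shader_clusters, rop_partitions):
    return shader_clusters * sp_per_cluster, rop_partitions * 64

# G8x/G9x parts: 16 SPs per shader cluster
print(config(16, 8, 4))   # full G92: (128, 256)
print(config(16, 7, 4))   # 8800 GT / 9800 GT: (112, 256)
print(config(16, 6, 3))   # 8800 GS: (96, 192)

# GT200 parts: 24 SPs per shader cluster
print(config(24, 10, 8))  # GTX 280: (240, 512)
print(config(24, 8, 7))   # GTX 260: (192, 448)
```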
 
I'd rather believe Nvidia vastly underestimated RV770. :)
Can't say I blame them. ATI's engineering didn't have much to brag about with their previous DX10 efforts.

NVidia may be in a tough spot now due to their commitment to CUDA. I think Jawed pointed this out, but it seems like they're stuck with 8-wide SIMDs. ATI basically has 16x5 SIMDs right now and there's no pressing need to go for better granularity. Even after 55nm scaling, the former is more than half the size of the latter, and despite increased utilization and clock speed, that's not even close to being small enough.

I think computational speed is starting to matter less, though. Games are probably using a bit more math, but it's not increasing as fast as GPU ability. We'll see if GT300 has some innovations there.
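For a rough sense of the math gap being discussed, here's a back-of-envelope peak-FLOPS comparison. The clocks and per-clock rates are the commonly quoted figures, my assumptions rather than anything from this thread:

```python
# Back-of-envelope peak programmable-shader FLOPS. Clocks and per-clock
# issue rates are the commonly quoted figures, not measurements.
def peak_gflops(alus, clock_mhz, flops_per_alu_per_clock):
    return alus * clock_mhz * flops_per_alu_per_clock / 1000.0

# GTX 280: 240 SPs @ 1296 MHz, MAD + MUL co-issue = 3 flops/clock
print(peak_gflops(240, 1296, 3))  # ~933 GFLOPS
# HD 4870 (RV770): 10 SIMDs x 16 x 5 = 800 ALUs @ 750 MHz, MAD = 2 flops/clock
print(peak_gflops(800, 750, 2))   # 1200 GFLOPS
```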
 
NVidia may be in a tough spot now due to their commitment to CUDA. I think Jawed pointed this out, but it seems like they're stuck with 8-wide SIMDs. ATI basically has 16x5 SIMDs right now and there's no pressing need to go for better granularity. Even after 55nm scaling, the former is more than half the size of the latter, and despite increased utilization and clock speed, that's not even close to being small enough.

In terms of math capability I think Nvidia can keep up even with the more expensive 8-way SIMD approach.

What I don't get is why GT200 seems to have a lot more supporting logic than RV770. For example, ALU+TEX on RV770 seems to be a larger percentage of the die than ALU+TEX on GT200, even with NVIO parceled out to a separate chip. Since a lot of the arbitration logic is part of the clusters, what is all the extra stuff on the GT200 die?
 
NVidia may be in a tough spot now due to their commitment to CUDA. I think Jawed pointed this out, but it seems like they're stuck with 8-wide SIMDs. ATI basically has 16x5 SIMDs right now and there's no pressing need to go for better granularity. Even after 55nm scaling, the former is more than half the size of the latter, and despite increased utilization and clock speed, that's not even close to being small enough.

I think computational speed is starting to matter less, though. Games are probably using a bit more math, but it's not increasing as fast as GPU ability. We'll see if GT300 has some innovations there.

Can you elaborate as to why you think NV is in a tough spot due to their commitment to CUDA?
 
Since a lot of the arbitration logic is part of the clusters, what is all the extra stuff on the GT200 die?
NVidia's connecting 10 clusters to 8 ROP partitions, whereas ATI's connecting 10 clusters to 4 MCs. The interconnection logic scales faster than either side being connected - it's a combinatorial explosion.

The sheer quantity of memory bus pins is prolly also a factor.
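A toy way to see that scaling, nothing more than counting crossbar endpoints:

```python
# A full crossbar between N shader clusters and M memory-side units has
# on the order of N*M routes, so the interconnect grows multiplicatively
# while each side only grows linearly.
def crossbar_routes(clusters, memory_units):
    return clusters * memory_units

print(crossbar_routes(10, 8))  # GT200: 10 clusters x 8 ROP partitions = 80
print(crossbar_routes(10, 4))  # RV770: 10 clusters x 4 MCs = 40
```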

Jawed
 
Can you elaborate as to why you think NV is in a tough spot due to their commitment to CUDA?
IMO they really don't want to move away from 8-wide SIMDs, as that is something they want to keep consistent in their GPU computing framework. ATI hasn't made any such commitment.

In terms of math capability I think Nvidia can keep up even with the more expensive 8-way SIMD approach.
Sure, but not at the same cost as ATI. NVidia has a long history of designing as optimally as possible for a given set of design constraints (NV3x aside). I'm pretty sure their current design can't get any smaller.

What I don't get is why GT200 seems to have a lot more supporting logic than RV770. For example, ALU+TEX on RV770 seems to be a larger percentage of the die than ALU+TEX on GT200, even with NVIO parceled out to a separate chip. Since a lot of the arbitration logic is part of the clusters, what is all the extra stuff on the GT200 die?
I don't think you're right about that. The ALU space is ~25% of the die on both. TEX space is about the same as ALU space on NV's DX10 chips, whereas for ATI the TEX area is a lot smaller than the ALUs. It works out to ~40% ALU+TEX on RV770 and ~50% ALU+TEX on GT200 and G92/G80.

It's not just the ALUs that are awesome in RV770; what ATI can do with 40 seemingly small TMUs is quite impressive compared to the 64 TMUs in G92. Xbit Labs tests some fairly texture-intensive shaders (see R580 vs. R520), and RV770 is still beating G92 in them (here).

Anyway, in the "extra stuff" there's still a lot of arbitration logic to decide which workloads go to which cluster. There's still all the rasterization w/ Z-cull, which needs to feed the shaders twice as fast to take advantage of twice the ROPs in GT200. IMO, 50% non-shader space on GT200 doesn't seem out of place compared to 60% in RV770, all things considered.
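If you turn those eyeballed percentages into absolute area, the comparison is starker. The die sizes below are the commonly cited approximations (~256mm² for RV770, ~576mm² for GT200), my numbers rather than anything measured in this thread:

```python
# Rough ALU+TEX area in mm^2 from the eyeballed fractions above.
# Die sizes are the commonly cited approximations, not measurements.
for name, die_mm2, alu_tex_frac in (("RV770", 256, 0.40),
                                    ("GT200", 576, 0.50)):
    print(name, die_mm2 * alu_tex_frac, "mm^2 ALU+TEX,",
          die_mm2 * (1 - alu_tex_frac), "mm^2 everything else")
```

So even at 50% ALU+TEX, GT200's "everything else" is nearly twice RV770's in absolute terms, which is what makes the supporting logic question interesting.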
 
IMO they really don't want to move away from 8-wide SIMDs, as that is something they want to keep consistent in their GPU computing framework. ATI hasn't made any such commitment.
If only CUDA were not so close to the hardware and a little bit more abstract...
They would probably have lost a bit of performance here and there, but they would not have found themselves in this situation.
I guess a future CUDA revision is going to address this issue.
 
Yup, that's why I think they're in a tough spot. They have to choose between changing some CUDA fundamentals and letting ATI keep the ALU per-mm2 efficiency crown (which may not be so bad). NVidia would rather not have to do either.

I'm sure they felt that they made the right choice when R600 was out, and still felt fine with RV670. Only with RV770 does this years-old decision look a bit restricting.
 
Well, remember that it does give them better branching granularity and dependent instruction throughput (though I think ATI can achieve the latter with minimal cost as well, as I've argued before), and I think they have the option to get even better granularity if they want.

I don't think it's too useful right now, but it could be in the future, especially for non-graphics loads. I think if NVidia improves its texturing, memory controller, and AA performance, the areal math inefficiency may not matter for games.

However, gaudy math numbers must look tempting for HPC customers too, and if AMD pushes FireStream hard then NVidia may have no choice but to do what you suggested.
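On the branching-granularity point, here's a simple expected-value sketch: if any thread in a SIMD batch takes a branch, the whole batch executes it, so smaller batches waste less work on rarely taken branches. The batch sizes below (32-thread warps vs. 64-thread wavefronts) are the usual figures, assumed here for illustration:

```python
# Expected fraction of wasted branch-body work as a function of batch size.
# A batch executes the branch body if at least one of its threads takes it.
def wasted_fraction(batch_size, p_taken):
    p_batch_runs = 1 - (1 - p_taken) ** batch_size
    useful = p_taken / p_batch_runs  # useful share of executed slots
    return 1 - useful

for batch in (32, 64):  # assumed warp/wavefront sizes
    print(batch, round(wasted_fraction(batch, 0.05), 3))
# Smaller batches waste a smaller fraction of the branch-body work.
```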
 
Slight tangent: what do you guys think of the bulk synchronous parallel processing model? There's a paper from Microsoft Research at SIGGRAPH 08 on it: BSGP: Bulk-Synchronous GPU Programming; scroll down a bit for the paper.
Thanks for the link, I just finished reading it. It looks like a very interesting model and implementation, and the fact that they developed a non-trivial application on it (the X3D parser) makes it a lot more credible.

However, I think it's more than just a slight tangent from the topic of this thread. Perhaps this part should be split off to a separate thread in the GPGPU forum?
 
Fudo says GT200b will be here in September, maybe even August:

http://www.fudzilla.com/index.php?option=com_content&task=view&id=8515&Itemid=1

We said a few months ago that Nvidia is driving two projects in parallel. The 65nm GT200 that got launched and branded as GTX 280 / 260 is out, and there is a 55nm GT200 chip that should be launched shortly.

Our sources are telling us that the 55nm version of the chip should be ready either in late August or in September, which means that the Radeon HD 4870 X2 will get some competition.

We believe that shader and core clocks of the 55nm GT200 are definitely going to be higher than the 65nm version's, and the chip itself should run a bit cooler.

This means R700 will get some better competition, and the fact that this dual card from ATI is going to end up faster than the GTX 280 doesn't mean ATI has already won the war.
 