Nvidia GT300 core: Speculation


Did you link that article as support for rpg's statement, or the opposite? According to the conclusion, RV770 has a much higher theoretical increase in MADD flops, while GT200 has a higher increase in actual performance, indicating lower utilization on the former.

Edit: Ah re-reading that thread I see you disqualified a bunch of pcgh's data and came up with your own conclusion. Don't think there's anything concrete there. This comparison should get easier with OpenCL (unless people write different versions of their apps for Nvidia and AMD hardware)
 
Edit: Ah re-reading that thread I see you disqualified a bunch of pcgh's data and came up with your own conclusion. Don't think there's anything concrete there.
There are almost no games that show the GFLOPs advantage of ATI providing a benefit to the gamer. Hardly shocking, really.

The graph is self-explanatory really, and I whittled it down to some basic facts, such as: the HD4870 is 45% faster in ALU-limited shaders, on average. I imagine drivers have changed the picture by now, but who's going to invest the requisite time to find out?

[Graph: b3da018.png]
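To make the utilisation point concrete, here's a toy pair of kernels (my own sketch in OpenCL C, not one of the pcgh shaders - the names and constants are made up). RV770's units are 5-wide VLIW, so a thread only approaches peak MAD rate when the compiler can pack roughly five independent operations into each instruction slot; GT200's scalar ALUs don't care either way.

Code:
__kernel void serial_mad(__global float *out, float a, float b, int n)
{
    // Dependent chain: every MAD needs the previous result, so a 5-wide
    // VLIW unit can fill only one lane per instruction word - far from peak.
    float v = (float)get_global_id(0);
    for (int i = 0; i < n; ++i)
        v = v * a + b;
    out[get_global_id(0)] = v;
}

__kernel void independent_mad(__global float *out, float a, float b, int n)
{
    // Five independent accumulators: the compiler can co-issue five MADs
    // per VLIW word, so RV770 gets much closer to its paper GFLOPS.
    // A scalar machine runs both kernels at roughly the same MADs/clock.
    float v0 = 0.0f, v1 = 1.0f, v2 = 2.0f, v3 = 3.0f, v4 = 4.0f;
    for (int i = 0; i < n; ++i) {
        v0 = v0 * a + b;
        v1 = v1 * a + b;
        v2 = v2 * a + b;
        v3 = v3 * a + b;
        v4 = v4 * a + b;
    }
    out[get_global_id(0)] = v0 + v1 + v2 + v3 + v4;
}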

This comparison should get easier with OpenCL (unless people write different versions of their apps for Nvidia and AMD hardware)
You mean like when matrix multiplication on ATI runs at >2x NVidia when both are "fully optimised"?

Jawed
 
512 bit GDDR5.
Ah ok. That would be a LOT of bandwidth.
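Rough numbers, assuming the rumoured 512-bit bus and plausible GDDR5 data rates of the day (the per-pin rates below are assumptions, not a spec; for comparison, the HD 4870's 256-bit/3.6 Gbps setup works out to ~115 GB/s by the same formula):

Code:
#include <stdio.h>

int main(void)
{
    const double bus_bits = 512.0;              /* rumoured bus width      */
    const double gbps[] = { 3.6, 4.0, 5.0 };    /* assumed GDDR5 rates/pin */

    for (int i = 0; i < 3; ++i) {
        /* GB/s = (bus width in bits / 8) * per-pin data rate in Gbps */
        printf("%.1f Gbps x 512-bit -> %.0f GB/s\n",
               gbps[i], bus_bits / 8.0 * gbps[i]);
    }
    return 0;
}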

What do you mean by "shader organization"?
Well, still using shader clusters that group a handful of SP units, some DP units and the TMUs together, with the ALUs running at a higher clock and, unlike ATI, still scalar. Though if it's really using MIMD, won't that actually decrease performance per die area further (at least for graphics)?
 
You mean like when matrix multiplication on ATI runs at >2x NVidia when both are "fully optimised"?

Sure, pick the example that has no dynamic branching and was hand tuned in IL :LOL: Btw, was Nvidia's "fully optimised" version done in PTX? I thought that was just high level stuff.
 
It wouldn't matter as the 4870 surpasses the theoretical maximum nV FLOPS for MM.

Yep, and every time this topic comes up we get examples of very specific highly tuned algorithms with no dynamic branching doing well on AMD hardware. Still waiting on more complete applications to emerge that showcase all these flops. It could be that AMD's development environment just sucks but I don't know enough about it to say for sure.
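For reference, the back-of-envelope arithmetic behind the quoted claim, using the usually quoted stock clocks and ALU counts (matrix multiplication can't make use of GT200's co-issued MUL, so MAD-only is the relevant NVidia ceiling):

Code:
#include <stdio.h>

int main(void)
{
    /* GTX 280: 240 scalar ALUs @ 1296 MHz, a MAD counts as 2 flops.
       The extra MUL is of no use to matrix multiplication.          */
    double gt200_mad_peak = 240 * 2 * 1.296;   /* ~622 GFLOPS */

    /* HD 4870: 800 ALU lanes @ 750 MHz, MAD = 2 flops.              */
    double rv770_mad_peak = 800 * 2 * 0.750;   /* 1200 GFLOPS */

    printf("GT200 MAD-only peak: %.0f GFLOPS\n", gt200_mad_peak);
    printf("RV770 MAD peak:      %.0f GFLOPS\n", rv770_mad_peak);
    return 0;
}

So an RV770 matrix multiply only has to run above ~52% of its own peak to exceed anything GT200 could theoretically manage.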

That's why I mentioned OpenCL previously, hopefully we'll get more real apps that run on AMD's stuff so we can make easier comparisons.
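As a trivial illustration of why OpenCL should help: the same host code picks up whatever GPUs are installed (NVidia, AMD or both) through one API, so only the tuning has to differ, not the whole application. A minimal sketch, error checking omitted:

Code:
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[8];
    cl_uint nplat = 0;
    clGetPlatformIDs(8, platforms, &nplat);

    for (cl_uint p = 0; p < nplat; ++p) {
        char pname[256];
        clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof pname, pname, NULL);

        cl_device_id devices[8];
        cl_uint ndev = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 8, devices, &ndev);

        for (cl_uint d = 0; d < ndev; ++d) {
            char dname[256];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof dname, dname, NULL);
            /* From here on, the same kernel source can be built with
               clBuildProgram for either vendor's device. */
            printf("%s : %s\n", pname, dname);
        }
    }
    return 0;
}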
 
Sure, pick the example that has no dynamic branching and was hand tuned in IL :LOL: Btw, was Nvidia's "fully optimised" version done in PTX? I thought that was just high level stuff.
You're trying to tell me that code written for NVidia wasn't hand-tuned (and why are you using that term pejoratively - it's the norm for performance-critical applications)?

And prunedtree's work wasn't done with IL, but with a custom front-end he's built for his own use. No different from people who build a Python front-end for NVidia, I guess.

Jawed
 
Well, still using shader clusters that group a handful of SP units, some DP units and the TMUs together, with the ALUs running at a higher clock and, unlike ATI, still scalar. Though if it's really using MIMD, won't that actually decrease performance per die area further (at least for graphics)?
That's an awful lot of things to stay unchanged, don't you think?
I believe that while NV will almost surely stay with the same design basics (i.e. they won't switch to superscalar ALUs or go for a TU pool a la R5x0), we may see the same design ideas implemented very differently in hardware.
 
Why worry about a card that's 9 months out when we should focus on the one that comes out in 3?
 
From what I've heard, NVIDIA is going to demo its GT300 at the end of September (if this rumour is true, of course).
As we can see, the new AMD GPU is pretty fast, so do you think GT300 will be faster than it?
I think NVIDIA needs a GPU that's about 20-30% faster to be successful.
 
Because of the delay, and because its GPU is much bigger and more expensive to produce... NVIDIA won't be able to sell its GPUs at the same price as RV870, so who will buy a more expensive card that isn't significantly faster than the competitor's?
 