NVIDIA GF100 & Friends speculation

Nope, based on 700 MHz, Damien's Number is right on the money.

Doesn't GF100 have 48 ROPs?

And the texture number should be multiplied by four if GF100 has 256 TF units, as you can see in AnandTech's overview, and Neliz said something like "quad TMUs - quad damage".
 
So, GF100 has only 32 ROPs?

Doesn't GF100 have 48 ROPs?

And the texture number should be multiplied by four if GF100 has 256 TF units, as you can see in AnandTech's overview, and Neliz said something like "quad TMUs - quad damage".

Nope, it has (as far as I know) 48 ROPs. But it can only do 256 zpc / 32 ppc, as I have been told. Obviously, I don't have a Fermi here to toy around with. :(
 
Especially since I know that there's nothing one could try against your unshakeable faith ;).
I'm sorry but I'm not the one who operates on "faith" here.
What you already know about GF100 leaves no reason to prefer a 5(8/9)70 over a GF3(6/8)0 (Eyefinity is nice and all, but if I ever wanted to play on several 1080p monitors I'd certainly want more performance than even a 5970 can provide, so I consider them even in that regard).
What we still need to know to make a final judgement is the price/performance ratio, of course. But I'll be very surprised if NV doesn't set the right prices, knowing almost everything about the competing AMD cards this time around.
Everything beyond that is "faith".

Unless they know Evergreen won't last long enough to be seriously limited in this regard.
So tessellation in EG is essentially useless? Great! You should tell that to everyone who bought that first-to-market DX11 solution and tell them not to worry -- AMD will kindly ask them to buy another, better card soon. Doesn't all of this sound great?

I hope they follow the sparse grid SSAA supported by the HD5xxx.
Yeah, I'm kinda disappointed that this wasn't mentioned anywhere in the whitepaper.
It does look like a pretty straightforward "hack" for any DX11 GPU. And it's surprisingly useful even on a 5850 card.
 
So, GF100 has only 32 ROPs?

No, it has 48 ROPs, but there are four 8-sample/clock rasterizers for a total of 32 across the chip. So although it could theoretically output 48 pixels per clock, on the input side of things there are only 32, and that defines the limit.
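To make the bottleneck argument concrete, here is a minimal Python sketch. Modelling the chip as a plain min() of the two stages is my own simplification and the function name is made up for illustration; the 4 x 8 and 48 figures are the ones quoted in this thread, not measured values.

```python
def pixels_per_clock(rasterizers: int, pixels_per_rasterizer: int, rops: int) -> int:
    """Peak pixel output per clock, limited by the slower of the two stages.

    Assumption: the rasterizers feed the ROPs, so the ROPs can never
    write more pixels per clock than the rasterizers produce.
    """
    raster_rate = rasterizers * pixels_per_rasterizer
    return min(raster_rate, rops)


# Figures as quoted in the thread for GF100 (rumoured, not measured).
print(pixels_per_clock(rasterizers=4, pixels_per_rasterizer=8, rops=48))  # 32
```

With these inputs the rasterizers are the limit, which is why the chart uses 32 rather than 48 pixels per clock.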

This is the first time I read that information. :!:

Rys mentioned it in the architecture thread :)

I don't think so... Nvidia clearly tuned the architecture to favor setup rate at the expense of texture fill rate.

That latter bit could cause them some grief. The jittered sample support is nice but it's not clear if DX11 even exposes that functionality. If it does then that particular task should be very fast on Fermi. For straight up Gather4 Fermi is still theoretically slower than Cypress.

And the texture number should be multiplied by four

Nope, it can only produce 64 filtered texels per clock. You multiply by 4 for the number of unfiltered samples but that's the same for AMD hardware as well. Damien's chart at hardware.fr lays out all the numbers nicely.
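A quick sketch of that distinction in Python, assuming the 700 MHz core clock mentioned earlier in the thread and 4 unfiltered samples per filtered (bilinear) texel. The function and constant names are hypothetical, and the clock is speculation at this point.

```python
FILTERED_TEXELS_PER_CLOCK = 64   # figure quoted above for GF100
SAMPLES_PER_FILTERED_TEXEL = 4   # assumption: bilinear footprint of 4 samples


def texel_rates(core_mhz: float) -> tuple[float, float]:
    """Return (filtered, unfiltered) texture rates in Gtexels/s and Gsamples/s."""
    filtered = FILTERED_TEXELS_PER_CLOCK * core_mhz / 1000
    unfiltered = filtered * SAMPLES_PER_FILTERED_TEXEL
    return filtered, unfiltered


filtered, unfiltered = texel_rates(700)  # 700 MHz is the thread's assumed clock
print(f"filtered: {filtered:.1f} Gtexels/s, unfiltered: {unfiltered:.1f} Gsamples/s")
```

The x4 only applies to the unfiltered sample count, so it should not be applied to the filtered texel number in the chart.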
 
I don't get Dave's comment. So AMD designs architectures based on how they will perform on current apps? That's ironic considering they have had an unsupported tessellator in their chips for a while now. It's doubly ironic considering that measured gains in current apps on Cypress are far below the theoretical improvement over RV770.

Yeah, it's definitely a very weird thing to say. Besides AMD, I only know one person who actually seems to think that ATI wasting die space on a useless tessellator in R600, RV670 and RV770 was actually a good thing...
 
                      5870    GF100
Mtriangles/s           850     2800
Gpixels/s             27.2     22.4
Gflops                2720     1433
Gtexels/s               68     44.8
Bandwidth (Gigs/s)     143      214

I only disagree with his hot clock frequency. I'm still betting on 1500 MHz.
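For anyone who wants to see where the chart numbers come from, here is a rough Python sketch. The per-clock unit counts are assumptions back-derived from the chart itself (GF100: 4 triangles, 32 pixels and 64 texels per clock, 512 ALUs at a 1400 MHz hot clock; 5870: 1 triangle, 32 pixels and 80 texels per clock, 1600 ALUs at 850 MHz), with 2 flops per ALU per clock for a multiply-add. Bandwidth is left out because it depends on memory clocks nobody has confirmed.

```python
def theoretical_rates(core_mhz, shader_mhz, tris_per_clk, pixels_per_clk,
                      texels_per_clk, alus):
    """Peak theoretical rates from clocks and per-clock unit counts."""
    return {
        "Mtriangles/s": tris_per_clk * core_mhz,
        "Gpixels/s": pixels_per_clk * core_mhz / 1000,
        "Gflops": alus * 2 * shader_mhz / 1000,  # 2 flops per multiply-add
        "Gtexels/s": texels_per_clk * core_mhz / 1000,
    }


# Assumed configurations, chosen to reproduce the chart above.
hd5870 = theoretical_rates(850, 850, 1, 32, 80, 1600)
gf100 = theoretical_rates(700, 1400, 4, 32, 64, 512)
gf100_1500 = theoretical_rates(700, 1500, 4, 32, 64, 512)  # 1500 MHz hot clock bet

for name, rates in (("5870", hd5870), ("GF100", gf100), ("GF100 @ 1500", gf100_1500)):
    print(name, rates)
```

Under these assumptions a 1500 MHz hot clock would only move the Gflops line (to 1536), since the other rates scale with the core clock.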

Archaeolept said:
do these numbers seem believable? While the triangle rate is awesome (and bandwidth), doesn't this look a little unbalanced for most current games?

Not at this point, no. Theoretical numbers are "great", but the efficiency of the architecture, and namely of the new cache hierarchy, which may balance everything by keeping the calculations "on die", can only be seen in real applications.
 
Nope, it has (as far as I know) 48 ROPs. But it can only do 256 zpc / 32 ppc, as I have been told. Obviously, I don't have a Fermi here to toy around with. :(

No, it has 48 ROPs, but there are four 8-sample/clock rasterizers for a total of 32 across the chip. So although it could theoretically output 48 pixels per clock, on the input side of things there are only 32, and that defines the limit.

Rys mentioned it in the architecture thread :)

Thx :!:
 
So, GF100 has only 32 ROPs?
48, but he says this number comes from the "No AA" case, which is limited somewhere else.

The rasterizer stays at 32 pixels/clock according to him, so without AA some ROPs would sit idle, while with AA the ROPs could become the limiting factor. The AA level at which that happens depends on the exact frequencies.
 
Besides AMD, I only know one person who actually seems to think that ATI wasting die space on a useless tessellator in R600, RV670 and RV770 was actually a good thing...
Do you know how much die space it cost on those GPUs?

Jawed
 
I'm sorry but I'm not the one who operates on "faith" here.
What you already know about GF100 leaves no reason to prefer a 5(8/9)70 over a GF3(6/8)0 (Eyefinity is nice and all, but if I ever wanted to play on several 1080p monitors I'd certainly want more performance than even a 5970 can provide, so I consider them even in that regard).
What we still need to know to make a final judgement is the price/performance ratio, of course. But I'll be very surprised if NV doesn't set the right prices, knowing almost everything about the competing AMD cards this time around.
Everything beyond that is "faith".

Buying a GF100 blindly right now is what I'd call faith, as it would hardly be an informed decision. There are more question marks than facts at the moment, so please, let me be sceptical, as I rarely have a lot of faith in anything or anyone.

So tessellation in EG is essentially useless? Great! You should tell that to everyone who bought that first-to-market DX11 solution and tell them not to worry -- AMD will kindly ask them to buy another, better card soon. Doesn't all of this sound great?

Please, it makes me nauseous when someone tries to tell me that I said something I didn't. Fermi's tessellation is clearly superb, but does this render Evergreen useless? Remains to be seen. Maybe Fermi's tessellation is overkill?

I've been lurking long enough in these forums to know your bias. But please let me remain sceptical, even though you may "find my lack of faith disturbing" ;).
 
I'm sorry but I'm not the one who operates on "faith" here.
What you already know about GF100 leaves no reason to prefer a 5(8/9)70 over GF3(6/8)0
There is one obvious reason... they're available.

And no, what we know wouldn't change my decision even if I were in need. Pure bottleneck-free "benchmarks" are not information but technical details which could very well have zero effect on performance. The little information we have says it's not a given that GF100 will be faster, even with those fancy "4 times or even more faster" charts.

Before today, we knew nothing but could speculate about what GF100's performance would be whereas we knew Cypress performance.

Today, we know little more except they worked hard to reduce bottlenecks, but they don't show any result proving it was relevant.
 
Why do people insist on comparing a dual GPU card to a single GPU card? I'm sorry, it may work on a $ to $ basis, but after that, it holds no water. Most people with a decent IQ will compare it to Cypress, not Hemlock.

People like you?

http://forum.beyond3d.com/showpost.php?p=1339955&postcount=156

I still think this launch was a fail on AMD's part. I think they should have told the sites: no Nvidia products in your launch reviews, only use our 4870, 4890 and/or 4870X2. The performance might have been a bit more awe-inspiring for people to go ga-ga over. But when you have benches being run with GTX285s staying within 20-30% of its performance, and GTX295s and, in several cases, 4870X2s beating it, the WOW factor just isn't there anymore.

From the Radeon 5800 review thread. My, the double standards are sure flying there...

So if it's Nvidia then it's fine to compare a dual GPU to a single. But if it's AMD then it's certainly not OK. My, how your comment on IQ must hurt.

Regards,
SB
 
I think that needs verifying.

An increase in triangles definitely increases the DS workload. Can DS keep up with the increase in triangles? Is DS texture-fetch limited?

In other words, which takes longer?: generating extra triangles in TS or interpolating their attributes in DS?

Jawed

Hang on, isn't triangle setup outside both DS and TS? Higher tess factors will surely increase time spent in both. So I can't see how DS or TS affect tri-setup.

However, in any real DS, dozens of ALU cycles will be spent interpolating the attributes, and I expect the TS to be faster than that, somewhere around 1 tri/clock.
 
I'm sorry but I'm not the one who operates on "faith" here.

And then you write this gem:


What you already know about GF100 leaves no reason to prefer a 5(8/9)70 over GF3(6/8)

Except for, oh, I don't know, what you're going to write next

What we still need to know to make a final judgement is the price/performance ratio, of course. But I'll be very surprised if NV doesn't set the right prices, knowing almost everything about the competing AMD cards this time around.

Everything beyond that is "faith".

There's so much double-talk in this post my head is spinning. It's hardly inconceivable that a 5870 might offer 75-80% of the performance of Fermi in most 2010 games for ~60% of the cost come this March or April. It will also be easier to obtain, and will probably run cooler and quieter. Your "faith" that NV will "set right prices" is kind of amusing too; NV will have target margins they need to hit. They might be forced to adjust for lowered margins due to the competitive landscape, but they can only go so low, and my biggest concern for Fermi is that Charlie, despite all his obvious biases, might not be too far off when writing that TSMC's 40nm process might not be the best for this chip. If yields are initially as bad as can be reasonably expected, don't you think that's gonna impact NV mgmt's initial pricing for Fermi parts?
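To put a number on that hypothetical, here is a tiny Python sketch of performance per dollar relative to Fermi. The 75-80% and ~60% figures are the guesses from the paragraph above, not benchmark or pricing data, and the function name is just for illustration.

```python
def relative_value(perf_fraction: float, cost_fraction: float) -> float:
    """Performance per dollar relative to the reference card (reference = 1.0)."""
    return perf_fraction / cost_fraction


# Hypothetical 5870 vs Fermi: 75-80% of the performance at ~60% of the cost.
low = relative_value(0.75, 0.60)
high = relative_value(0.80, 0.60)
print(f"5870 perf per dollar vs Fermi: {low:.2f}x to {high:.2f}x")
```

If those guesses held, the 5870 would deliver roughly a quarter to a third more performance per dollar, which is the gap Fermi's pricing would have to close.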

Like you wrote, I don't plan on multi-display gaming anytime soon, my single 30" works just fine. So a 5870 is plenty fast for me, so unless Fermi can offer more of a performance gap over it than 20-25% then its price better reflect that relative performance for this consumer. Otherwise, the better value for me, considering the pace at which I upgrade anyways, would be a 5870. But of course I'm waiting to see how yields, clock speeds, power, pricing, etc., work out before I decide to buy.
 