Isn't that the whole point of VLIW?
Yes, it's doing a good job, isn't it?
I'm interested to see how ATI fares with heavy LDS workloads, though. Much as on NVidia, avoiding bank conflicts is key. I can't tell which architecture is going to suffer more from stalls. I think NVidia has an advantage because accesses are treated like gathers and become the operand collector's problem, rather than making the ALUs stall immediately. NVidia, I think, only stalls once the hardware-thread population is exhausted (i.e. once there are no other threads left to hide latency with).
NVidia's too-small register file may be an issue though, since it means less latency-hiding is available to cover stalls.
A good way to avoid stalls on ATI is to increase vectorisation per work item, i.e. to increase instruction-level parallelism. But the register file still constrains that, ultimately.
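The ILP idea, sketched in plain C rather than real kernel code: instead of one dependency chain per work item, keep several independent accumulators so the VLIW slots (or the scoreboard, on NVidia) have work to issue while earlier results are in flight. The 4-way split here is just an illustrative choice:

```c
/* ILP sketch: one dependency chain vs. four independent chains. */
#include <stddef.h>

/* One accumulator: every add depends on the previous add. */
float sum_scalar(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four independent accumulators: 4-way ILP that a VLIW compiler can
   pack into fewer bundles. The trade-off is 4x the live values, which
   is exactly the register-file pressure that ultimately limits this. */
float sum_ilp4(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)   /* tail */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}
```

Same answer either way, but the second version trades registers for issue slots, which is the tension with the register file I'm pointing at.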
I think it's fair to say that local memory is going to be key to the performance of a lot of GPGPU algorithms, though I still think its future is like that of Cell's SPE LS. But there'll be an interim where the only competition, between ATI and NVidia, is purely local-memory focused. NVidia has already signalled at least a partial step away with the dual-function local-memory/L1 cache. Arguably the cache hierarchy in Fermi changes the game: some problems will like cache more than local memory.
But then I also suspect NVidia has a better handle on the classical vector-computer techniques, scan, scatter and gather, which is why I think the 3rd-generation GPGPU techniques that come out of people programming with Fermi will steal a march on OpenCL 1.0 techniques.
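For anyone not steeped in the vector-computer literature, "scan" here means prefix sum, the building block behind stream compaction, sorting and so on. A sequential C sketch of the exclusive variant (the parallel tree-based versions are what the GPUs actually run, but the contract is the same):

```c
/* Exclusive prefix sum: out[i] = sum of in[0..i-1], with out[0] = 0. */
#include <stddef.h>

void exclusive_scan(const int *in, int *out, size_t n)
{
    int running = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = running;   /* everything strictly before element i */
        running += in[i];
    }
}
```

Scatter and gather are then just indexed writes (`out[idx[i]] = x[i]`) and indexed reads (`x[i] = in[idx[i]]`), and scan supplies the indices.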
Where is OpenCL 2.0?
http://sa09.idav.ucdavis.edu/docs/SA09-OpenCLOverview.pdf
Page 8 says 1.1 is coming within 6 months and 2.0 is due in 2012, i.e. 2 years away.
Hopefully 1.1 catches up with D3D11-CS. It seems to me that Fermi/CUDA 3 has about an 18-month lead on OpenCL 2.0.
Jawed