NVIDIA Kepler speculation thread

CarstenS said:
Right, they did - or PR'ed to do so, depending on your point of view.

#1 is actually what they've been accused of for a long time. Now imagine the possibilities a new PhysX version would open, fully optimized for modern multi-core CPUs (apart from distributing multiple particle emitters over a number of cores) and their advanced instruction sets. Wouldn't that make a system shine which satisfies certain hardware requirements in order for the new PhysX stuff to work its magic?
Unless they patch the game itself to do more work on both AMD and Nvidia to the point that the game becomes heavily CPU bound, it would still be a net increase in the demands on the GPU and thus a net loss in a comparative benchmark of existing games.
 
Unless they patch the game itself to do more work on both AMD and Nvidia to the point that the game becomes heavily CPU bound, it would still be a net increase in the demands on the GPU and thus a net loss in a comparative benchmark of existing games.

I was talking - and that's purely hypothetical, mind you, and I don't have any insider info on this either! - about a new, optimized PhysX version (maybe just a different path in a DLL) for CPU PhysX that only gets used once an appropriate Nvidia GPU is detected (i.e. Kepler). Having pseudo-opened CUDA recently, they would be in a position to claim that everybody can develop their own optimized version as added value for their respective customers.

Again, I am not saying that this is going to happen, but rather that this is a scenario that could fit what Charlie's been describing.
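To make the hypothetical a bit more concrete, the gating itself would be trivial - something along these lines (purely a sketch; the detection call, solver names and dispatch are all invented, nothing here is real PhysX or driver code):

Code:
// Purely illustrative - the detection call, solver names and the whole
// dispatch are invented for this sketch; nothing here is real PhysX or driver code.
#include <cstdio>

enum class Vendor { Nvidia, Amd, Other };

// Stand-in for "ask the driver which GPU is present".
Vendor detectGpuVendor() { return Vendor::Nvidia; }

struct Scene { int bodies = 0; };  // toy scene

void stepLegacyScalar(Scene&, float)      { std::puts("legacy scalar path"); }
void stepWithSseAndThreads(Scene&, float) { std::puts("optimized multi-core/SSE path"); }

void stepSimulation(Scene& scene, float dt)
{
    // The "different path in a DLL" idea: pick a solver once, based on the GPU found.
    static const bool useOptimizedPath = (detectGpuVendor() == Vendor::Nvidia);
    if (useOptimizedPath)
        stepWithSseAndThreads(scene, dt);
    else
        stepLegacyScalar(scene, dt);
}

int main()
{
    Scene scene;
    stepSimulation(scene, 1.0f / 60.0f);
}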


edit: Yes, it's silly season
 
Yep, redirect code that was written for a low-thread-count, low-latency processor to a highly threaded, high-latency processor... what could possibly go wrong? How hard could it be? //Clarkson :LOL:
They can intercept the call to the CPU version and insert the GPU version of the call. That's not hard if you own the whole ecosystem. PS. Given how PhysX used to be done on x87 and probably still doesn't use vector SSE instructions, I kinda doubt there is a "low thread count, low latency" version at all. ;)
 
Sure, and the 30 watts I save on my CPU isn't much. And the 10 watts I save on each light bulb aren't much. And the 40 or 50 watts I save on the refrigerator might not be much.

If I ignore all of those and discount them as "not much" then at the end of the year I save 0 dollars.

Power savings converted to monetary savings has NEVER been about 1 large lump of electrical savings from 1 item. It's about the combined savings from multiple items.

Just from the computer alone I save over 100 watts in power due to choice of components without sacrificing much performance. A few watts from MB choice. A little bit from low voltage memory (lucky to get it on sale). A big chunk from video card. Another large chunk from CPU. A few watts from HDDs. And then a percentage of all that from the PSU. Passive cooling when able (no power used by fans), etc.

If you start discarding power savings from any single item you've already defeated the purpose of trying to reduce consumption and increase monetary savings (I'm doing this for the money and not the environment. :p).

The same principles apply to groceries, and just about anything else in the world.

Regards,
SB

You misunderstand. How much better does the power-hungry refrigerator work? Not at all. How much better does the incandescent light bulb work than a CFL? When you want to play a game, though, a power-hungry GPU does work significantly better than the integrated GPU. They are not comparable products.

Nor does comparing refrigerators and light bulbs to processors make sense. The efficient refrigerator adds more insulation, the efficient light bulb works on a different principle, while GPUs from one manufacturer and another work on very similar principles. I really think there is a difference between an inefficient wall wart and a GPU. When you replace the wall wart you get the exact same service; when you replace the GPU you no longer do.

Yes, it all adds up - I am not arguing it doesn't. I just do not believe we live in a world where that is driving the decisions for most purchasers of high-end GPUs. I do believe that at least some people think about light bulbs and refrigerators.

Personally I think AMD pissed Charlie off somehow, so he had a little cry and carry-on and is now caught between a rock and a hard place. Fanboys on both sides are kind of sitting around going WTF, mate?!?!?

That was my theory as well :)
 
The PhysX story is nonsense, in my opinion; Nvidia's cards are already the most powerful with PhysX because the code is Nvidia's middleware. Absolute nonsense, talking about nothing...

excuse my english please
 
They can intercept the call to the CPU version and insert the GPU version of the call. That's not hard if you own the whole ecosystem. PS. Given how PhysX used to be done on x87 and probably still doesn't use vector SSE instructions, I kinda doubt there is a "low thread count, low latency" version at all. ;)

I don't think you got the line you quoted at all. I haven't really looked into the API, but cpu-physx seems to be essentially single threaded and synchronous, and in that case there isn't really a good reason to batch things up. So the application is probably making a lot of small calls to it, and waiting for them to execute.
The GPU has very high throughput, but generally also very high latency, so you need to batch tasks up, and you should also run them asynchronously. So if you do a GPU backend implementation of the current synchronous cpu-physx API, it will probably be quite a bit slower than the CPU version.
Secondly, as the games using cpu-physx generally aren't very CPU limited (you could say they aren't doing that much physics processing), the potential gain of accelerating those calls is basically nothing.

But of course you could make a faster version of gpu-physx, which could benefit those two games from last year utilizing it - one of them may even be on the benchmark list ;)
CarstenS's suggestion is more reasonable, but it will take time before we see any effect of it.
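To spell that out: behind a synchronous per-call API, a GPU backend pays its full submit/readback latency on every little call instead of once per batch. A toy sketch of the effect (numbers and function names are invented for illustration - this is not the PhysX API):

Code:
#include <chrono>
#include <cstdio>
#include <thread>

using namespace std::chrono;

// Pretend one small physics task costs ~50 us on the CPU...
void cpuTask() { std::this_thread::sleep_for(microseconds(50)); }

// ...while any GPU submission costs a fixed ~500 us of submit/readback
// latency, however much work is in the batch (made-up figure).
void gpuSubmit(int /*tasksInBatch*/) { std::this_thread::sleep_for(microseconds(500)); }

const int kCalls = 100;

void perCallCpu() { for (int i = 0; i < kCalls; ++i) cpuTask(); }    // how a sync CPU API gets used
void perCallGpu() { for (int i = 0; i < kCalls; ++i) gpuSubmit(1); } // same call pattern, GPU backend
void batchedGpu() { gpuSubmit(kCalls); }                             // what the GPU actually wants

long long timeUs(void (*fn)()) {
    auto t0 = steady_clock::now();
    fn();
    return duration_cast<microseconds>(steady_clock::now() - t0).count();
}

int main() {
    std::printf("per-call CPU: %lld us\n", timeUs(perCallCpu));  // ~5,000 us
    std::printf("per-call GPU: %lld us\n", timeUs(perCallGpu));  // ~50,000 us - latency paid 100 times
    std::printf("batched GPU:  %lld us\n", timeUs(batchedGpu));  // ~500 us - latency paid once
}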
 
but cpu-physx seems to be essentially single threaded and synchronous
Actually, SDK 2.x supports asynchronous stepping.
It was "not designed for native parallelism", as docs honestly say, but supports minor threading inside single physics scene, and can run set of sub-scenes (called compartments) in parallel.

As for GPU acceleration of CPU PhysX games - unlikely. Even the latest SDK 3.2 does not support GPU acceleration for rigid bodies (not to mention joints, character controllers, sweeps and raycasts, etc.) - the stuff that constitutes 99% of the "physics" in current games.
 
You misunderstand. How much better does the power-hungry refrigerator work? Not at all. How much better does the incandescent light bulb work than a CFL? When you want to play a game, though, a power-hungry GPU does work significantly better than the integrated GPU. They are not comparable products.

Nor does comparing refrigerators and light bulbs to processors make sense. The efficient refrigerator adds more insulation, the efficient light bulb works on a different principle, while GPUs from one manufacturer and another work on very similar principles. I really think there is a difference between an inefficient wall wart and a GPU. When you replace the wall wart you get the exact same service; when you replace the GPU you no longer do.

Yes, it all adds up - I am not arguing it doesn't. I just do not believe we live in a world where that is driving the decisions for most purchasers of high-end GPUs. I do believe that at least some people think about light bulbs and refrigerators.

Is the refrigerator that draws 50 watts more capable of cooling faster or colder? Quite possibly. But the point would be that I don't need my food to be any colder than necessary and I don't need to cool it faster.

It's similar with video cards or CPUs. Comparing the 6970 to the 580 (same "generation" of GPUs here), the 6970 saves ~50 watts at load for ~10-15% less perf. Idle is only ~10 watts lower.

So is burning through 50 watts at load worth that 10-15% perf increase? For me, not really. The perf is close enough to be indistinguishable most of the time. The money savings at the end of the year, however, would be quite noticeable. And the 10w at idle is just 10w burning through money for no good reason.

Things get even worse if you compare the 7970, where idle is now ~20 watts less and long idle a ridiculous 30w less. On the flip side, load power is now only ~35 watts less (in games, not OCCT) but performance is 15-20% higher.

I'd love to compare it to Kepler, but at this point who knows when Kepler will be out, much less how it'll perform? Myself, I'd love to see Kepler come out at better perf/watt. If they also have DP output (rumored that they do) then I might actually be able to seriously consider Nvidia for the first time in a long time.

Anyway, even using the performance justification, you can still realise significant savings without significant sacrifices. Mmm, just like any other area where you budget to save money without sacrificing quality of life.

That said, not everyone is interested in saving money. And that's fine. It's all about the value they find in a product. I don't find value in (to me) wasted money so I have difficulty not considering things like this.

I'm sure that, just as people like me think it's foolish to throw away money like that, people who can support extravagant expenses while still saving for their future think it's foolish to limit my spending. :) Well, at least I hope they are saving for their future.

Regards,
SB
 
Actually, SDK 2.x supports asynchronous stepping.
It was "not designed for native parallelism", as docs honestly say, but supports minor threading inside single physics scene, and can run set of sub-scenes (called compartments) in parallel.

As for GPU acceleration of CPU PhysX games - unlikely. Even the latest SDK 3.2 does not support GPU acceleration for rigid bodies (not to mention joints, character controllers, sweeps and raycasts, etc.) - the stuff that constitutes 99% of the "physics" in current games.
I think Rigid Bodies not being GPU accelerated has something to do with the attached image from an older (2008) LRB-presentation, which clearly shows how Amdahl cripples Rigid Bodies even on a much more freely programmable µarch like Larrabee.
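(Amdahl in one line: with a parallel fraction p spread over N lanes, the speedup is 1 / ((1 - p) + p/N). If, say, only 60% of a rigid-body solve parallelizes - a number picked purely for illustration - even infinitely many lanes top out at 2.5x, and 32 lanes give you roughly 2.4x.)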


It's only natural for engineering to go for the low hanging fruits first. And it's only natural for marketing not to disclose that. :)
 
If you mean GK104:
In their reviewer's guide? No doubt about it.
In reviews using mostly integrated benchmarks of game applications? Possible.
On average in real-world in-game scenarios? Doubtful at best.

Nope, the big boy, $500 part. I'm under no illusions of GK104 being a world-beating part for $299. Well, that's hoping that it's $299 and nVidia isn't planning to pull a Tahiti.

So is burning through 50 watts at load worth that 10-15% perf increase? For me, not really. The perf is close enough to be indistinguishable most of the time. The money savings at the end of the year, however, would be quite noticeable.

Even if you live in CT with the highest electricity rates in the US (18c/kWh) and play games for 8 hours a day, every day for a year, your net increase in annual electricity costs would be $26. You really want us to believe that $26 a year is noticeable!? Also, I'm gonna assume you don't play games for a living, so the real cost is probably closer to $5 a year :)
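(Quick check on the arithmetic: 50 W x 8 h x 365 days ≈ 146 kWh, and 146 kWh x $0.18 ≈ $26.)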

Of course people are interested in saving money. However, there's simply nothing to be saved in this particular case. It's a dead-end argument. The numbers just do not make sense from a cost-savings angle.
 
A bit off topic, but if anyone wants to save on electricity while gaming, just turn on vSync where possible and/or turn on AA modes. It can cut GPU power by 50%-75% while still delivering silky smooth performance.
Of course that's not an option for some games and gamers, where 200 FPS still feels laggy and vSync introduces extra input lag.


Back on topic.
I'm in the down-to-earth camp regarding Kepler, and especially GK104. TSMC's transistors can only do so much for both AMD and nVidia, so for a given number of them I'd expect a certain power consumption and performance. Long gone are the days of huge low-hanging architectural gains that one camp could benefit from. Therefore, assuming similar die size / number of transistors, I'm expecting +/-25% performance while keeping power consumption in check!
In other words, I don't see clear winners in either camp after Kepler releases. There will be pros and cons to both, as there were in prior architectures.
Also, if nVidia is to gain some clock/power headroom from the maturing TSMC process, then I'd expect AMD to follow suit with a revised chip, similar to what they did in the RV770/90 era.

Time will tell :smile:
 
In other news....

http://semiaccurate.com/forums/showpost.php?p=151520&postcount=19

CharlieD said:
Did I mention that I am one exit up the 101 from Nvidia HQ right now where there are TONS of those cards?

Ask the following list of people who showed them a picture of a GK104 in the past week:

Chris Angelini (Toms Hardware)
Scott Wasson (TechReport)
Anand (Anandtech)
Ryan Shrout (PCPerspective)
Mark Hachman (PC Magazine)
David Kanter (Real World Tech)
Koen Crijns (HardwareInfo.nl)
Johan (Anandtech)
Andreas Stiller (CT/Heise)

So TONS of GK104 cards at nVidia HQ and industry heavyweights are in the loop, but nothing for us peasants? :( Charlie is hyper-defensive though; OBR seems to have him rattled.
 
Even if you live in CT with the highest electricity rates in the US (18c/kWh) and play games for 8 hours a day, every day for a year, your net increase in annual electricity costs would be $26. You really want us to believe that $26 a year is noticeable!? Also, I'm gonna assume you don't play games for a living, so the real cost is probably closer to $5 a year :)

Of course people are interested in saving money. However, there's simply nothing to be saved in this particular case. It's a dead-end argument. The numbers just do not make sense from a cost-savings angle.

Thanks Trini, you beat me to it. I am not saying that I mind the idea of being efficient. I like cool-and-quiet type of things. I liked the notion of using the IGP and GPU both to save power at idle. I just don't think someone spending $500 on a graphics card is going to be upset about $26/year. I would argue there is a far stronger case to be made that you can get 70% of the performance for $250. That is about 10 years' worth of electricity costs.
 
Even if you live in CT with the highest electricity rates in the US (18c/kWh) and play games for 8 hours a day, every day for a year, your net increase in annual electricity costs would be $26. You really want us to believe that $26 a year is noticeable!? Also, I'm gonna assume you don't play games for a living, so the real cost is probably closer to $5 a year :)
If you live in a place where air conditioning is needed to keep things cool enough, then you can probably triple that power usage.
 

And the other side is that a lot of the time the heat is not actually wasted - it just contributes to the general heating of your home. So when you have your home heating on (as opposed to the air-con), the actual cost is close to zero (i.e. just the difference between electric and gas heating).

So a difference of a few tens of watts is not very important compared to the actual performance of the card.
 
I've got a solution for Charlie. He can post an encrypted 7zip file with an image of GK104 and release the password after the NDA is up.
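(Something like "7z a -pSECRET -mhe=on gk104.7z gk104.jpg" with the 7-Zip command line should do it - if I remember the switches right, -p sets the password and -mhe=on also encrypts the file names - then just post the password once the NDA lifts.)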
 