Nvidia BigK GK110 Kepler Speculation Thread

I am sorry, but the "use the full potential of my CPU" argument is simply not valid in this case. When the top CPUs are struggling to deliver 40 FPS, you have a serious problem: your engine is suffering from a massive bottleneck that needs to be resolved, either by more optimization or by shifting the load elsewhere.

It's pretty obvious: CryEngine never had this kind of CPU dependency, not in its first version nor its second. The fact that it does now points to the nature of this newly created problem; it's as if they intentionally wanted to cripple the hardware to restore the glory days of the "Can it run Crysis?" era!

Eh? So, if a game engine pushes the GPU, it's forward-looking.

If a game engine pushes the CPU, it's bottlenecked? :p

There are still lots of things that can affect the overall graphical impression of a game that GPU compute is unsuited for, for example coherent physics affecting world geometry (vegetation that reacts not only to the character but to external forces such as wind, rain, projectiles, etc.). I really hate games that have, say, static trees and grass that never move. Those aren't directly related to "graphics", but they make a game look so much better when they are realistically animated with regard to external forces (physics simulations). Canned animations only get you so far before you start to see repeating patterns.

Which is also why ALL of Crytek's engines (from Far Cry to Crysis to Crysis 2 to Crysis 3) are hugely CPU bound.

It's also why any MMORPG that has a moderate amount of physics is generally CPU bound.

Theoretically, once APUs with competent GPGPU-capable IGPs become more commonplace, we'll be able to push that work to the IGP while the discrete graphics card renders the graphics.

Anyway, I'd argue that any game engine that doesn't push the CPU (becoming CPU bound) and GPU (becoming GPU bound) isn't really pushing the game engine envelope.

I fully expect that CPUs are going to get pushed pretty hard with the new generation of game engines that will be developed to take advantage of the new generation of game consoles.

Regards,
SB
 
How do you know it is CPU limited? It scales *exactly* with resolution going from 1680x1050 to 1920x1080.

The CPU scaling tests are done at 1920x1200, and only at medium quality. Here they gain a 36% higher framerate (55 -> 75 fps) from an 80% higher clock (2.5 -> 4.5 GHz).
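As a rough back-of-the-envelope check (my own Amdahl-style fit to those two data points, assuming the portion of frame time that doesn't track CPU clock stays constant):

\[
\frac{75}{55} \approx 1.36 = \frac{1}{(1-f) + f/1.8} \quad\Rightarrow\quad f \approx 0.6
\]

i.e. roughly 60% of the frame time at those settings scales with CPU clock, and the remaining ~40% does not.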

CPU dependent: yes. CPU limited: No

Cheers
Two samples don't give very much data. Was the game CPU limited until 3 or 3.5 GHz or some other value? Perhaps part of your benchmark is CPU-limited and other parts are not. Was the benefit evenly distributed across the 2.5 to 4.5 GHz range? If so, then I assert that part of the benchmark is very CPU-limited but the rest not so much.
 
Having part of the game CPU-limited is not the same as having the whole game limited by CPU.
I never said it did. What I was saying is that two data points (2.5 GHz and 4.5 GHz) don't tell you a lot and I gave some ideas on how to improve the data to better understand what is happening.
 
I never said it did.
Fair enough.
What I was saying is that two data points (2.5 GHz and 4.5 GHz) don't tell you a lot and I gave some ideas on how to improve the data to better understand what is happening.

Performance does scale linearly with CPU, but it also scales well with GPU performance. If it is valid to say the game is limited by CPU it is equally valid to say the game is limited by GPU.

IMO, Crytek has struck a fair balance between CPU and GPU; they certainly push the envelope, though. You'll need a state-of-the-art CPU and a state-of-the-art GPU to get the most from it.

Cheers
 
All this being said, there are still lower detail levels available in Crysis 3 for a still-great-looking game on lower-end hardware, and I can only applaud Crytek for being able to achieve good utilization of even higher-end hardware resources. That is something precious few developers have dared to do for too many years now.
 
That's on medium settings, not High or even Very High.

Okay, but again your link shows performance at very high scaling very nicely with GPU performance, up to 44 fps with a 680. Based on the scaling in that chart there's no reason to expect it won't keep going up with more GPU performance. I'm sure some 690 or Titan benchmarks would prove that point.
 
Performance does scale linearly with CPU, but it also scales well with GPU performance. If it is valid to say the game is limited by CPU it is equally valid to say the game is limited by GPU.
An 80% CPU speed gain only results in 36% more performance; not very good CPU scaling, really. As I said, you need to look at more data points. Perhaps all the benefit was in going from 2.5 to 3.5 GHz and the rest didn't buy anything at all. Perhaps only parts of the benchmark got faster (Fraps would tell you this).

Ideally, the gains would be where your performance was worst. Otherwise the gameplay experience won't seem much better.
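To illustrate why (a toy example with made-up frame times, not measurements from the game): the average fps can look healthy while the slowest frames, the ones extra CPU speed would most visibly help, still drag the experience down.

```cpp
// toy_frametimes.cpp -- average fps vs. the slowest frames (hypothetical data)
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    // Made-up frame times in milliseconds: mostly ~13 ms, with a CPU-bound
    // hitch every 100th frame.
    std::vector<double> ft;
    for (int i = 0; i < 1000; ++i)
        ft.push_back(i % 100 == 0 ? 45.0 : 13.0);

    double total_ms = 0.0;
    for (double t : ft) total_ms += t;
    std::printf("average: %.1f fps\n", 1000.0 * ft.size() / total_ms);

    // 99th-percentile frame time -- the "worst case" a player actually feels.
    std::sort(ft.begin(), ft.end());
    double p99 = ft[static_cast<std::size_t>(0.99 * ft.size())];
    std::printf("99th percentile: %.1f ms (%.1f fps)\n", p99, 1000.0 / p99);
    return 0;
}
```

Here the average works out to ~75 fps, but the worst 1% of frames sit at 45 ms (~22 fps); raising the average further barely changes how the game feels unless whatever causes those hitches gets faster.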
 
A long and hard analysis would need to be done on the threads, calls, and usage. Don't forget that some threads are surely dedicated to specific work, hence the differences if you benchmark the game here and there for CPU "performance". But if someone has access to the CryEngine source, it should be easy to get that information without having to analyze everything in real time.

The gain in performance (or frame time) is really variable, depending on the scene, the rendering, the simulation, and the calculations; all of this varies a lot from one situation to another (without even getting into optimization).

OpenCL said it right: you would need to improve performance a great deal in the places where it is lagging behind, but doing so will, at the same time, dramatically increase performance where you were already fast (until you hit a bottleneck, of course).
 
Just bought a Titan. I couldn't care less about the gaming performance of this card, so I have not done any gaming tests, but in terms of compute performance the card is very good:

(1) 2.5-5x faster than a GF110 at float-32; I have not tried float-64 yet.

(2) For parallel applications, 10-12x faster than a multi-threaded program running on a Sandy Bridge 3930K, again at float-32.

(3) I have not tried any exclusive GK110 features; I just rebuilt my code with compute capability 3.5, and that's it.

I configured the GK110 for compute only and the GF110 for display, so that my desktop won't freeze when I run CUDA programs; by doing this I also find the GK110's compute performance increases a little.
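For anyone curious, this is not the poster's code, just a minimal sketch of what that kind of setup typically looks like: build for compute capability 3.5 and explicitly pick the GK110 as the compute device, leaving the GF110 to drive the display (the device ordering and the Titan being the only sm_35 card in the box are my assumptions):

```cpp
// pick_gk110.cu -- build with: nvcc -arch=sm_35 pick_gk110.cu -o pick_gk110
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("device %d: %s (sm_%d%d)\n", i, prop.name, prop.major, prop.minor);

        // Use the compute-capability-3.5 card (the GK110/Titan) for compute,
        // so the Fermi card driving the display is left alone.
        if (prop.major == 3 && prop.minor == 5)
            cudaSetDevice(i);   // subsequent allocations/kernels go to this GPU
    }
    return 0;
}
```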

The only problem so far is that the two cards are too large for my case to accommodate, so I had to put hard drives where the FDD and DVD-ROM should reside; I may need to buy a new case, I guess.

Anyway, I think for CUDA users the card is impressive; basically a much cheaper Tesla K20.

I guess the reason why Nvidia released this card so cheap is that it wants to build tough competition against the upcoming Intel MIC, which I have also tested in my office; its performance is not as impressive as this card for many of the parallel applications I can think of to offload from CPUs. But sure, the programming side is easy: basically anyone who knows C++ and multi-threading can start programming on MIC in about half an hour.
 
Just bought a Titan. I couldn't care less about the gaming performance of this card, so I have not done any gaming tests, but in terms of compute performance the card is very good

I guess the reason why Nvidia released this card so cheap is that it wants to build tough competition against the upcoming Intel MIC, which I have also tested in my office; its performance is not as impressive as this card for many of the parallel applications I can think of to offload from CPUs.

Since you have tested both the Titan and the MIC and your results are that the MIC is not as impressive, how much does the MIC underperform the Titan?

But sure, the programming side is easy: basically anyone who knows C++ and multi-threading can start programming on MIC in about half an hour.
Do you think that the ease of programming the MIC is why it underperforms?

Or is it just that the MIC is not as powerful as the Titan?
 
Since you have tested both the Titan and the MIC and your results are that the MIC is not as impressive, how much does the MIC underperform the Titan?

Do you think that the ease of programming the MIC is why it underperforms?

Or is it just that the MIC is not as powerful as the Titan?

For the few parallel programs I wrote that can run on MIC, the MIC is about 2x faster than the dual-Xeon box in my office (the dual Xeon is about 70% faster than my 3930K at home). These tests take data-transfer time into account, as do my other benchmarks.

But MIC is easier to program with, since MKL supports it directly and the Intel compiler can build for it just like any other C program; you don't need to deal with the half-assed nvcc compiler, the weird build process, and the half-assed CUDA libraries.

Intel uses the same C language without much extension (OpenMP for multi-threading), and you can basically treat the MIC as a many-core CPU using the same multi-threading techniques. One drawback is that, at least at the time I was testing that toy, MIC only supports Linux.
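As a hedged illustration of that workflow (not his code; the array names, sizes, and the mic:0 target index are made up), the Intel compiler's offload model of that era let you push an ordinary OpenMP loop to the coprocessor with a pragma:

```cpp
// mic_offload.cpp -- build with the Intel compiler (icc); with a Knights Corner
// card present, the marked region runs on the coprocessor.
#include <cstdio>

int main() {
    const int n = 1 << 20;
    float *a = new float[n];
    float *b = new float[n];
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Same C/C++ plus OpenMP as on the host; the pragma ships the buffers to
    // the MIC, runs the block there, and copies b back.
    #pragma offload target(mic:0) in(a : length(n)) inout(b : length(n))
    {
        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            b[i] += a[i] * a[i];
    }

    std::printf("b[0] = %f\n", b[0]);
    delete[] a;
    delete[] b;
    return 0;
}
```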
 
In terms of theoretical throughput, GK110 is 2.3x faster than MIC at float-32 math ops and 1.3x at float-64. Sure, the large cache of MIC can help it handle more general tasks, but if your code benefits greatly from large caches, branch prediction, etc., then maybe it is more suitable to run on CPUs directly.
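For reference, a rough sketch of where ratios like that come from; the SKUs and clocks below are my own assumptions (a GTX Titan at its ~837 MHz base clock versus a Xeon Phi 5110P), not figures from the post:

\[
\begin{aligned}
\text{Titan FP32} &\approx 2688 \times 2 \times 0.837\ \text{GHz} \approx 4.5\ \text{TFLOPS} \\
\text{Titan FP64} &\approx 896 \times 2 \times 0.837\ \text{GHz} \approx 1.3\text{--}1.5\ \text{TFLOPS (clocks drop a bit with full-rate DP enabled)} \\
\text{Phi 5110P FP32} &\approx 60 \times 32 \times 1.053\ \text{GHz} \approx 2.0\ \text{TFLOPS} \\
\text{Phi 5110P FP64} &\approx 60 \times 16 \times 1.053\ \text{GHz} \approx 1.0\ \text{TFLOPS}
\end{aligned}
\]

which works out to roughly 2.2-2.3x at FP32 and ~1.3x at FP64.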
 
In terms of theoretical throughput, GK110 is 2.3x faster than MIC at float-32 math ops and 1.3x at float-64. Sure, the large cache of MIC can help it handle more general tasks, but if your code benefits greatly from large caches, branch prediction, etc., then maybe it is more suitable to run on CPUs directly.

Kepler's 30% advantage in flops is very tiny in the grand scheme of things.

More cache and branch prediction help all codes.

MIC does not have any branch prediction.

MIC has 2x more cache per core than Kepler, which makes a lot of difference for everything.
 