Crytek on PS3/X360 (+ more - great read)

Titanio

Nemo80 found this and mentioned it in a couple of other threads, but I thought it was worth its own thread.

A German game magazine, GameStar, has posted a PDF of an interview with Cevat Yerli, CEO of Crytek. It's quite an interesting read, more technically oriented than most of this type, and has some good info in it.

It's in German, and available here:

http://www.gamestar.de/dev/pdfs/crytek.pdf

Babelfish translation (readable enough, but can anyone do better?):

You are currently working on the next game and, at the same time, on CryEngine 2...

... did we say anything about that?

Not you, I did *g*. So where is the technical journey headed?

Roughly speaking, the journey is heading towards a streaming architecture, cross-platform support and multithreading (meaning multi-core and multi-CPU), and at least one shader will run on every pixel.

Are you staying with SM3.0, or jumping straight to SM4.0?

We support SM2.0 and upward

And upward?

And upward.

The new PC processors and the upcoming consoles rely heavily on multithreading. How do you use that potential?

We scale individual modules such as animation, physics and parts of the graphics with the CPU, depending on how many threads the hardware offers. We support multi-CPU systems as well as multithreading and multi-core. With three CPUs with two hardware threads each (dual core CPUs) it may be that we scale to six threads. We may also choose not to, depending on how fast the individual cores and/or CPU threads run. Either way, we are developing a system that analyzes how much threading power is available and then scales the engine accordingly.
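
To make the "analyze the available threads, then scale the modules" idea concrete, here is a minimal Python sketch. It is not Crytek's code: `update_animation`, `plan_workers`, `split` and `run_module` are hypothetical names, and reserving one thread for the main loop is an assumption. Python threads also will not speed up CPU-bound work because of the interpreter lock, so this only shows the structure, including the sequential fallback when one thread is all there is.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def update_animation(chunk):
    """Placeholder for a per-module workload (hypothetical)."""
    return sum(x * 2 for x in chunk)


def plan_workers(hardware_threads, reserved=1):
    """Leave `reserved` threads for the main loop, but keep at least one worker."""
    return max(1, hardware_threads - reserved)


def split(items, parts):
    """Split items into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_module(module, items, hardware_threads=None):
    """Run one module over its work items, scaled to the threads available."""
    threads = hardware_threads or os.cpu_count() or 1
    workers = plan_workers(threads)
    chunks = split(items, workers)
    if workers == 1:
        # Sequential fallback: same code path, no thread pool.
        return [module(chunk) for chunk in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(module, chunks))
```

Calling `run_module(update_animation, list(range(10)), hardware_threads=6)` splits the work across five workers, while `hardware_threads=1` runs the same module in a plain loop; the summed result is the same either way.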

x86, PowerPC, and PowerPC plus Cell: every architecture has its own threading organization.

The 360 resembles Hyperthreading. In principle there are three CPUs with two hyperthreads each. If you ask the hardware manufacturers, it is of course not like that. But if you analyze it as a software developer, it is nothing other than Hyperthreading. That is, you have six threads, but really only three times 1.5 threads. On the PlayStation 3 it looks different with Cell: the main CPU has two threads (somewhat better than Hyperthreading), and on top of that come seven synergistic processors. The eighth SPU present in the design was dropped.

Because of Yields?

A pure aspect of production. The SPUs are not as flexible as a conventional CPU, and therefore we scale there differently.

Hardly any code can be carried over unchanged between the individual architectures. At the very least, low-level calls have to be modified between x86 and PowerPC, but how portable is code between the two PowerPC-based consoles?

Not really portable either. We are, I believe, the only German company with PS3 devkits. So we can look into the hardware ourselves and try things out instead of only speculating. The PS3 is a system that needs special adaptations tailored specifically to the PS3 architecture; a simple port does not work.

So it is not the way Sony claimed, that you have a nice layer, your code, and everything runs wonderfully on Cell?

That's what they wish for. But that is still a long way off. The devkits are not that far along yet either. Going by the information provided with the devkits, you have to work at quite a low level to get anything out of the hardware.

Topic interprocess communication: how relevant are the differences between Multi Thread architectures?

A very relevant question. On the hardware side, the following things matter for multithreading: Do the threads run on genuine cores (does each have its own register set)? Or is there a hardware abstraction as with the PowerPC, where the threads (two here) do have genuine register sets of their own but still sit on the same core, so that both threads cannot work at once when issuing instructions to units that exist only once? Multithreading with Hyperthreading only tries to keep distributing instructions across the different superscalar units (math operations on the integer and floating-point units, load/store, etc.).

With several cores there is also the question of the bus connection: do they share the same bus to the peripherals and to main memory? How are the caches implemented, do all threads share the same cache? And: shared memory vs. independent local memory. A complete PowerPC core is also part of the Cell system. It is there not only as an independent processing unit but also as host for the individual Cell cores. In the Cell architecture, besides the PowerPC core, there are individual Cell cores in the system connected by an ultra-fast bus, and they can communicate with one another independently. If you exploit the parallelism optimally, this parallel architecture and the super-fast bus allow, at optimal utilization, linear scaling with the number of Cell cores.

How do multithreaded systems behave on single-core PCs?
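
A quick way to see what "linear scaling at optimal utilization" implies is Amdahl's law, which is not mentioned in the interview and is brought in here only as an illustration. The function name and the example fractions are assumptions; nothing below is measured Cell data.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Fully parallel work scales linearly; any serial remainder caps the gain.
    for fraction in (1.0, 0.9, 0.5):
        print(fraction, round(amdahl_speedup(fraction, 7), 3))
```

Only the fully parallel case gives a factor of seven on seven cores; with 90% of the work parallel the factor is 4.375, which is why the "optimal utilization" caveat in the answer matters.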

As developers we are in the stupidest situation we could possibly be in. We have to support 32 and 64 bit, single and multi-CPU, single- and multi-thread, Cell and non-Cell, as well as OpenGL and DirectX. The effort of developing a technology that uses all these parameters optimally is extremely high. The technical effort is at least twice as high as for the first CryEngine.

The step from 32 to 64 bit was surely simpler from a programming point of view than the one from single- to multithreading.

Unfortunately it is not a step. We cannot take that step yet; we have to support both.

Are there performance problems when the multithreaded CryEngine 2 runs on a single-core PC?

The code can run sequentially. You lose some efficiency that way, but what you gain through further optimizations is greater. So the price of losing some frame rate when the code runs on a single-thread CPU is so marginal that you can win it back through other code improvements. Most PC games are not properly optimized anyway, Far Cry included.

In the console segment, developers often get amazing things out of relatively modest hardware because they have to. If it worked exactly the same way on the PC, Doom 3 would probably need only a GeForce 4 Ti.

That is the point. Hardware development is so rapid that you only get a short time to get the most out of it. It is exactly the same with CPUs. If you go onto a CPU with a multithreaded renderer, you simply have to take the time to optimize. The biggest problem there is cache misses. In addition, you should avoid global memory shared between the individual threads. Put simply: if we both reach into the same pot, the pot must not change. If I grab an element just before you do, then you no longer get it, or it is no longer what you expected. To get around that, it is best to change things in one step and then hand the result on to the open memory, releasing it there for the other CPUs (unlocking).
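
The "change it in one step, then hand the result on and unlock" pattern can be sketched in Python roughly as follows. This is an illustration under assumptions, not engine code: `SharedResults`, `worker` and `run` are hypothetical names. Each thread does its work on private data and touches the shared "pot" only once, under a lock, to publish the finished result.

```python
import threading


class SharedResults:
    """The shared 'pot': only ever touched while holding the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def publish(self, key, value):
        # One short locked step; the lock is released when the block exits.
        with self._lock:
            self._data[key] = value

    def snapshot(self):
        with self._lock:
            return dict(self._data)


def worker(name, numbers, shared):
    # All intermediate state is local, so no other thread can see it change.
    local_total = 0
    for n in numbers:
        local_total += n * n
    shared.publish(name, local_total)


def run(jobs):
    """Start one thread per job and return the published results."""
    shared = SharedResults()
    threads = [
        threading.Thread(target=worker, args=(name, numbers, shared))
        for name, numbers in jobs.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared.snapshot()
```

For example, `run({"a": [1, 2, 3], "b": [4]})` returns `{"a": 14, "b": 16}` regardless of which thread finishes first, because neither thread ever reads or writes the other's intermediate state.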

How do I keep physics consistent across six threads?

That can be solved not only technically but also creatively. You have to find new approaches. The basic question is how we can scale our game qualitatively from a single thread to eight threads: either so that the gameplay stays the same but the game looks better, or so that I can play it better. On the PC you have to restrict yourself to cosmetic improvements, since, whatever power each machine has, there would otherwise be serious differences in gameplay. On the technically fixed consoles, suitable optimizations could also allow better gameplay. So as a multi-platform developer you have to look at two scaling axes: FX and gameplay. On the PC often only FX is scaled (higher texture resolution etc.). We would like to scale both FX and gameplay, for example via the intensity of the logic or the quality of the shaders.

How do things stand with portability between x86 PC and Xbox 360?
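
The two scaling axes described above (FX and gameplay) could be modelled along these lines. Everything here is hypothetical: the `ScalingProfile` fields, the thresholds and the agent counts are invented for illustration, and the only rule taken from the interview is that gameplay scales on fixed console hardware while the PC scales cosmetics only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingProfile:
    shader_quality: str  # FX axis
    ai_agents: int       # gameplay axis (stand-in for "intensity of the logic")


def choose_profile(hardware_threads, fixed_platform):
    """Pick FX and gameplay settings from the thread count (illustrative values)."""
    # FX axis: scales on every platform.
    if hardware_threads >= 6:
        shader_quality = "high"
    elif hardware_threads >= 2:
        shader_quality = "medium"
    else:
        shader_quality = "low"

    # Gameplay axis: only scales where every player has the same hardware.
    base_agents = 8
    if fixed_platform:
        ai_agents = base_agents * max(1, hardware_threads // 2)
    else:
        ai_agents = base_agents
    return ScalingProfile(shader_quality, ai_agents)
```

With these made-up numbers, a six-thread PC gets high-quality shaders but the baseline eight agents, while a six-thread console gets the same shaders and 24 agents.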

The architecture is fundamentally different, but still a lot more similar than Xbox 360 is to PS3. The CPUs of PC, Xbox 360 and PS3 really have only one relevant similarity, and that is multithreading. As a generic CPU, the 360 processor is the most efficient, but if you add in the seven SPUs of the PS3, then the achievement conditions look different again. Before we had the PS3 devkits, we thought PS3 and Xbox 360 would be closer to each other in development than PC and console. That does not seem to be the case, though *laughs*

So you are optimizing your engine for the SPUs of the PS3?

Definitely. That is a must for us, since we want to use the power of the PS3 fully. Accordingly, the PS3 gets almost a complete engine architecture of its own, a kind of sub-architecture within CryEngine 2.

Development of the console title Far Cry Instincts was still handed off externally. This time it is being ported?

Ahem ...

... the graphics interfaces surely cause additional effort, though?

Yes, because of OpenGL ES on the PS3 we have to rewrite the entire renderer. Strictly speaking, CryEngine 2 will have a special solution for every special problem. Even if that is abstracted away for a game developer, the technology underneath is optimized very specifically. Otherwise you cannot use the power fully. Alternatively it can of course be abstracted so that it runs on all systems, but then the strongest platform loses the most.

It's almost as if they were reading the minds of B3Ders with some of the topics discussed ;)
 

"But what does it all mean, Basil?"

A quick summary?
 
HappyBread said:

"But what does it all mean, Basil?"

A quick summary?

The main things I took from it:

- Along the lines of debate being had elsewhere here, X360's "two threads per core" is because of shared hardware more like "1.5 threads" (not news to many, but to some of us)
- Threading on the PPE in Cell is different from the XeCore (seemingly better) - this surprises me, I didn't think there was much difference at all between the two. And I wonder now what the differences are in this regard.
- They have PS3 kits and are making an "engine within an engine" so to speak, just for PS3.
- The gap between PS3 and X360 from a porting point of view is bigger than PC to X360.
- This bit has a wobbly translation, but he seems to be saying in one part that if you looked at them as "generic processors" X360's is more "efficient" but when you factor in Cell's SPUs, the "achievement conditions" look different.
- He talks a good deal about multi-threading/parallelism, what it can be used for and the issues affecting it, bringing up some points directly related to PS3/X360 differences. Seems relatively positive on how things could scale with Cell.
- They want to use physics to scale both gameplay and presentation (relevant to recent discussion i guess).

The whole thing is worth a read imo, but it would indeed be more comfortable with a human translation.
 
No, what they are trying to get at is the same thing with hyperthreading. Having two threads is not like having two cores, it is just used to help use up the idle execution units left in the processor.
 
onetimeposter said:
incorrect. as that would contradict when they say xbox 360 is very efficient

In a dual-threaded system where you are sharing hardware - and one of the points he raises about threading is whether each thread has its own hardware or is sharing hardware - you're highly unlikely to have performance like 2 threads with their own hardware. 1.5x seems more reasonable and realistic. It's not about the GPU, but that you're running two threads on one core.
 
Titanio said:
The main things I took from it:

- Along the lines of debate being had elsewhere here, X360's "two threads per core" is because of shared hardware more like "1.5 threads" (not news to many, but to some of us)
- Threading on the PPE in Cell is different from the XeCore (seemingly better) - this surprises me, I didn't think there was much difference at all between the two. And I wonder now what the differences are in this regard.
- They have PS3 kits and are making an "engine within an engine" so to speak, just for PS3.
- The gap between PS3 and X360 from a porting point of view is bigger than PC to X360.
- This bit has a wobbly translation, but he seems to be saying in one part that if you looked at them as "generic processors" X360's is more "efficient" but when you factor in Cell's SPUs, the "achievement conditions" look different.
- He talks a good deal about multi-threading/parallelism, what it can be used for and the issues affecting it, bringing up some points directly related to PS3/X360 differences. Seems relatively positive on how things could scale with Cell.
- They want to use physics to scale both gameplay and presentation (relevant to recent discussion i guess).

The whole thing is worth a read imo, but it would indeed be more comfortable with a human translation.


1) Is shared hardware meaning GPU-CPU? 1.5 CPU, the rest GPU?
2) Shorter threads was a given to be less efficient with SPUs
3) PS3 needs its own engine as its different as they say from PC/Xbox360
4) Carmack said the exact same thing
5) Likely the performance difference will be on and about the same
 
my god...
onetimeposter said:
incorrect. as that would contradict when they say xbox 360 is very efficient
sarcasm?.... hope so


2 threads in one core = hyperthreading... whats the surprise here?

2 threads sharing one Core will never result in a 2x performance jump... its more like a 25% increase at best

marketing got you...

also, he's saying Cell is better in its "hyperthreading", it's maybe a typo or he has a D2 CELL revision (2 VMX units in one core).
 
onetimeposter said:
1) Is shared hardware meaning GPU-CPU? 1.5 CPU, the rest GPU?

No. You have one core. Two threads sharing it. They don't have dedicated execution hardware for each thread - that effectively would be two cores for two threads. The "dual threaded" bit is that there is hardware there to facilitate fast switching between the two threads on the core.

onetimeposter said:
2) Shorter threads was a given to be less efficient with SPUs

Hmm? Anyway, not sure what this has to do with threading on the PPE..
 
Titanio said:
No. You have one core. Two threads sharing it. They don't have dedicated execution hardware for each thread - that effectively would be two cores for two threads. The "dual threaded" bit is that there is hardware there to facilitate fast switching between the two threads on the core.



Hmm? Anyway, not sure what this has to do with threading on the PPE..

Carmack in his interview said that while shorter threads were revolutionary for rendering, 3d imaging and cgi pre-rendered. its not good for game development and game efficiency
 
onetimeposter said:
Carmack in his interview said that while shorter threads were revolutionary for rendering, 3d imaging and cgi pre-rendered. its not good for game development and game efficiency

I'm very confused. What has this got to do with the PPE threading?
 
dskneo said:
my god...

sarcasm?.... hope so


2 threads in one core = hyperthreading... whats the surprise here?

2 threads sharing one Core will never result in a 2x performance jump... its more like a 25% increase at best

marketing got you...

also, he's saying Cell is better in its "hyperthreading", it's maybe a typo or he has a D2 CELL revision (2 VMX units in one core).

do you know the difference between the VMX units in X360 and PS3?
 
onetimeposter said:
do you know the difference between the VMX units in X360 and PS3?
i do.... and that does not change the fact that 2 threads sharing ONE core does not equal a 2x performance boost, no matter the amount of registers the VMX may or may not have.

i mean, come on... you actually believe 6 threads between 3 cores = 6 x the boost of one thread?

hyperthreading ensures that the CORE (one core) is running as close to its limits all the time. It does not mean it doubles the performance of ONE thread core. Not even close.
 
I thought hyperthreading shared one set of execution values, while the X360 cores have two sets?

"This is achieved by duplicating the architectural state on each processor, while sharing one set of processor execution resources. "
http://www.intel.com/technology/hyperthread/

I don't think anyone is claiming that the 2 threads double the performance.
 
In theory having two threads should never (in a hyperthreading type situation) give more performance than 1 perfectly done thread - there are only enough execution units for 1 thread at a time, and if that thread can fill those then the second thread isn't going to be doing much. The only reason for adding a second thread is to fill the gaps caused by latency (cache, mem access, etc, etc) -- it isn't because the core has enough resources for two perfect threads; they expect all threads to be rather imperfect.
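
The gap-filling argument can be checked with a toy model in Python. It assumes a core with a single issue slot per cycle and describes each thread as a list of stall lengths, one per instruction; `run_core` and `smt_speedup` are made-up names, and this is in no way how the X360 cores or the PPE actually schedule instructions.

```python
def run_core(threads):
    """Cycles for one single-issue core to issue every instruction of all threads.

    Each thread is a list of stall lengths: after issuing instruction i,
    the thread cannot issue again for threads[t][i] extra cycles.
    """
    next_instruction = [0] * len(threads)
    ready_at = [0] * len(threads)
    cycle = 0
    while any(next_instruction[t] < len(threads[t]) for t in range(len(threads))):
        for t in range(len(threads)):
            if next_instruction[t] < len(threads[t]) and ready_at[t] <= cycle:
                ready_at[t] = cycle + 1 + threads[t][next_instruction[t]]
                next_instruction[t] += 1
                break  # only one instruction can issue per cycle
        cycle += 1
    return cycle


def smt_speedup(thread_a, thread_b):
    """Throughput gain from sharing the core versus running the threads back to back."""
    back_to_back = run_core([thread_a]) + run_core([thread_b])
    return back_to_back / run_core([thread_a, thread_b])
```

Two "perfect" threads with no stalls (`[0] * 8` each) give `smt_speedup` of exactly 1.0, since the first thread never leaves a free slot. Two threads that stall one cycle after every instruction (`[1] * 8` each) interleave almost perfectly in this toy and reach 1.875; real workloads sit somewhere in between, which is where figures like the 1.5 mentioned in the interview come from.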

I'm not quite sure why any of this is a surprise -- it is hyperthreading all over again as far as I can tell (with a few changes, but same theory and purpose).

Maybe I just have no clue what I'm talking about, but it seems pretty straight forward to me.
 