scooby_dooby said:
From the CELL paper I read way back, they said that using cache produced unpredictable latencies, so they removed the cache and used a register instead(???) so that latencies were always bad, but at least they were predictable (they never mentioned the downside though).
They used local SRAM instead. The latencies are far from "always bad"; the exact opposite, in fact. Something like 4 to 6 cycles, I believe, possibly faster than L2 cache (although someone may want to correct me on that? I always assumed one of the big advantages of the local SRAM was its speed and/or access behaviour).
scooby_dooby said:
It's a good example of the one-sidedness of these tech papers though. They make something like removing cache sound like a good thing.
If it means, for example, that there is more silicon to use for more execution units, might that not be considered a good thing? You've got to accept that the choices they made, they made in the best interests of the chip. It's not like they were landed with a set of choices they subsequently had to justify or make "sound good". They had the choice to put cache in there or not, and they chose not to. It was a voluntary thing.
drpepper said:
Hey, I'd like to read up on them! Any interviews, essays, etc. on the topic at hand?
TIA
scificube said:
Please do. I for one would love some insight into our beloved developers' minds.
Here are a couple that spring to mind.
In a gamestar.de interview about Crytek's next engine, they were asked how they planned to use the CPU's potential. They said:
"We scale the individual modules such as animation, physics and parts of the graphics with the CPU, depending on how many threads the hardware offers."
John Carmack was asked how he was planning to use the CPU in his next engine, and he said:
"we’ve got the game and the renderer running as two primary threads and then we’ve got targets of opportunity for render surface optimization and physics work going on the spare processor, or the spare threads"
The NFactor 2 engine from Inis - that one with the freaky looking character with the hair - splits threads like this: a thread for the main game loop, and then a thread each for rendering, physics, hair simulation and audio.
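To make that kind of split concrete, here is a rough sketch of a main game loop driving one worker thread per subsystem. It is purely illustrative and not how Inis actually implemented NFactor 2: the names `SUBSYSTEMS` and `run_frames`, and the lockstep barrier at the start and end of every frame, are my own assumptions.

```python
import threading

# Subsystem names taken from the description above; everything else is hypothetical.
SUBSYSTEMS = ["rendering", "physics", "hair simulation", "audio"]


def run_frames(frames):
    """Run `frames` ticks with one worker thread per subsystem."""
    ticks = {name: 0 for name in SUBSYSTEMS}
    # Barrier parties: every subsystem thread plus the main game loop.
    barrier = threading.Barrier(len(SUBSYSTEMS) + 1)

    def worker(name):
        for _ in range(frames):
            barrier.wait()    # wait for the main loop to start the frame
            ticks[name] += 1  # stand-in for the subsystem's real per-frame work
            barrier.wait()    # report that this frame's work is finished

    threads = [threading.Thread(target=worker, args=(name,)) for name in SUBSYSTEMS]
    for t in threads:
        t.start()

    # Main game loop: release the workers, then wait for all of them to finish.
    for _ in range(frames):
        barrier.wait()  # start of frame
        barrier.wait()  # end of frame

    for t in threads:
        t.join()
    return ticks


if __name__ == "__main__":
    print(run_frames(3))
```

Each worker only ever writes its own entry in `ticks`, so no lock is needed here. A real engine would probably let subsystems run at different rates rather than in strict lockstep, but the lockstep version keeps the sketch short.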
Tim Sweeney told Anandtech how he was planning to use the SPUs - for physics, animation, particle systems, sound and "possibly" a few other areas. Perhaps most encouragingly, though, he said that the things the SPUs weren't very well suited for don't take much time anyway, and would run fine on the PPE - which would seem to rather explicitly support the choices STI made in terms of what they optimised and aimed the chip at.
There may be others, but these are the ones I can most readily remember.