Thoughts on next gen consoles CPU: 8x1.6Ghz Jaguar cores

Again, I think you should see the Jaguar as a successor to the PPE component only, and the 4 CUs as a successor to the SPE component. The Jag is 3-4x the GFLOPS of the PPE and should be quite a lot more efficient in most tasks. 4 CUs should amount to almost 2x the SPEs, and will be more efficient in most tasks as well. The 14 CUs for graphics deliver about 3x RSX GFLOPS, but again, should be quite a lot more efficient. Every component has access to the full 172 GB/s of bandwidth, instead of ~25 GB/s for RSX and ~25 GB/s for Cell. And the memory pool appears to be a unified 4GB, which is already 8x the memory of the PS3, which also had a split memory pool that made it harder to use fully.
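
To put rough numbers on that, here's the back-of-envelope peak-FLOPS arithmetic. Just a sketch: the 800MHz GPU clock, 8 FLOPs/cycle per Jaguar core and 25.6 GFLOPS per Cell element are the commonly quoted/rumoured figures, not confirmed specs.

[code]
// Back-of-envelope theoretical peak FLOPS for the components compared above.
// All inputs are commonly quoted figures / rumoured clocks, not confirmed specs.
#include <cstdio>

int main() {
    double jaguar = 8 * 8 * 1.6;        // 8 cores x 8 FLOPs/cycle x 1.6 GHz    = 102.4 GFLOPS
    double ppe    = 8 * 3.2;            // 1 core  x 8 FLOPs/cycle x 3.2 GHz    =  25.6 GFLOPS
    double spes   = 6 * 8 * 3.2;        // 6 game-usable SPEs                   = 153.6 GFLOPS
    double cu4    = 4 * 64 * 2 * 0.8;   // 4 CUs x 64 lanes x 2 FLOPs x 0.8 GHz = 409.6 GFLOPS
    double cu14   = 14 * 64 * 2 * 0.8;  // 14 CUs                               = 1433.6 GFLOPS

    printf("Jaguar vs PPE : %.1fx\n", jaguar / ppe);  // ~4x
    printf("4 CUs vs SPEs : %.1fx\n", cu4 / spes);    // ~2-2.7x depending on counting 6, 7 or 8 SPEs
    printf("14 CUs        : %.1f GFLOPS\n", cu14);
    return 0;
}
[/code]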

So my estimate is that in total, Orbis should be at least 8x as capable as the PS3, and because the system is far closer to a PC, I expect far more developers to get close to its maximum potential than was the case for PS3.

But we'll find out more soon hopefully ;)

14 CUs is 6x RSX, isn't it?
 
You're right, I didn't calculate that correctly. The RSX probably delivers about 250GFlops realistically, right?

The really interesting thing will be how much things like texture compression support can improve memory and bandwidth efficiency versus last gen consoles, and how much things like load-times will be a bottleneck, etc. And of course it will be interesting how much doing it at 1080p @ 60fps will take away from the 'extras'. But I certainly hope things like dynamic resolution for busy scenes become the norm, so that it would allow something like 1080p @ 60fps to become the norm as well.
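
For what I mean by dynamic resolution, even something as dumb as this would go a long way. A minimal sketch only, with all names and thresholds made up, not how any real engine does it:

[code]
// Minimal sketch of the "dynamic resolution" idea: drop the render height when
// the GPU frame time creeps over the 60 fps budget, raise it again when there
// is headroom. Names and thresholds are invented for illustration.
#include <algorithm>
#include <cstdio>
#include <initializer_list>

struct DynamicRes {
    float scale = 1.0f;                 // fraction of native 1080p height
    void update(float gpuFrameMs) {
        const float budgetMs = 16.6f;   // 60 fps target
        if (gpuFrameMs > budgetMs * 0.95f)
            scale = std::max(0.7f, scale - 0.05f);   // busy scene: shrink
        else if (gpuFrameMs < budgetMs * 0.80f)
            scale = std::min(1.0f, scale + 0.02f);   // headroom: grow back
    }
    int renderHeight() const { return static_cast<int>(1080 * scale); }
};

int main() {
    DynamicRes dr;
    for (float ms : {14.0f, 17.5f, 18.0f, 15.0f, 12.0f}) {
        dr.update(ms);
        printf("frame %.1f ms -> %d lines\n", ms, dr.renderHeight());
    }
    return 0;
}
[/code]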
 
If you scroll up a bit, it was my point that the SPEs are not really suited to usual general purpose code. We agree. ;)

Yes and no. If you need to perform the general purpose tasks in a sustained manner, you may still be able to vectorize or batch them for Cell or other similar architectures.

But it is generally a pain because you may need to revamp the code and data structures. Cell hardware has also been stagnant for 7-8 years now; modern processors should be able to run circles around it generally speaking -- unless the algorithm is very well suited to the SPU's characteristics, in which case there may be surprises.

For a "standard" processor like Jaguar, it is also much easier to reuse existing optimized code for good to great performance.

If Sony ported Linux to PS4 using 512MB of RAM, it should run *much* better than on the PS3, even without the GPU. This is because software has advanced too (it's more optimized for mainstream architectures). I would love to see what the speed-up is in various areas.
 
You're right, I didn't calculate that correctly. The RSX probably delivers about 250GFlops realistically, right?

The really interesting thing will be how much things like texture compression support can improve memory and bandwidth efficiency versus last gen consoles, and how much things like load-times will be a bottleneck, etc. And of course it will be interesting how much doing it at 1080p @ 60fps will take away from the 'extras'. But I certainly hope things like dynamic resolution for busy scenes become the norm, so that it would allow something like 1080p @ 60fps to become the norm as well.
Why do you think the current consoles don't support texture compression?
 
Eh, good question. Of course they do, but I thought they weren't nearly as efficient at it, didn't support it for all types of data, etc. Also, I remember there were some differences even in DXT1 support between the 360 and PS3?

Anyway, I read up a bit on the subject and came across this neat overview of old to new, with interesting comments as well from someone who worked on recent formats:

http://www.reedbeta.com/blog/2012/02/12/understanding-bcn-texture-compression-formats/
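
For the curious, BC1/DXT1 (the oldest format the article covers) really is tiny: 8 bytes per 4x4 block, two RGB565 endpoints and sixteen 2-bit palette indices. A rough decoder sketch, ignoring the 1-bit-alpha subtleties and not production code:

[code]
// Rough sketch of decoding one 8-byte BC1/DXT1 block: two RGB565 endpoints plus
// sixteen 2-bit indices into a 4-entry palette. Error handling and the
// transparency handling of the 3-colour mode are glossed over.
#include <cstdint>
#include <array>

struct RGB { uint8_t r, g, b; };

static RGB decode565(uint16_t c) {
    return { uint8_t((c >> 11) << 3), uint8_t(((c >> 5) & 0x3F) << 2),
             uint8_t((c & 0x1F) << 3) };
}

// 'block' is 8 bytes; 'out' receives the 16 texels of the 4x4 block.
void decodeBC1(const uint8_t block[8], std::array<RGB, 16>& out) {
    uint16_t c0 = block[0] | (block[1] << 8);
    uint16_t c1 = block[2] | (block[3] << 8);
    RGB p[4] = { decode565(c0), decode565(c1), {}, {} };
    if (c0 > c1) {           // 4-colour mode: two interpolated entries
        p[2] = { uint8_t((2*p[0].r + p[1].r) / 3), uint8_t((2*p[0].g + p[1].g) / 3),
                 uint8_t((2*p[0].b + p[1].b) / 3) };
        p[3] = { uint8_t((p[0].r + 2*p[1].r) / 3), uint8_t((p[0].g + 2*p[1].g) / 3),
                 uint8_t((p[0].b + 2*p[1].b) / 3) };
    } else {                 // 3-colour mode: midpoint + (transparent) black
        p[2] = { uint8_t((p[0].r + p[1].r) / 2), uint8_t((p[0].g + p[1].g) / 2),
                 uint8_t((p[0].b + p[1].b) / 2) };
        p[3] = { 0, 0, 0 };
    }
    uint32_t indices = block[4] | (block[5] << 8) | (block[6] << 16)
                     | (uint32_t(block[7]) << 24);
    for (int i = 0; i < 16; ++i)
        out[i] = p[(indices >> (2 * i)) & 0x3];
}
[/code]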

 
Would Sony and MSFT be allowed to use ISPC on their systems?
It seems like quite a great language for leveraging the numerous cores next-generation systems have at hand :)
In that paper the developers point out some cases where the language gives better results than the use of intrinsics, and productivity for coders doesn't seem to be in the same ballpark.

Quite an interesting paper, even though as usual it's a tad over my head :( (time to go back to school? my brain is screaming for "food"...).
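
For those who haven't looked at ISPC: the kind of kernel it's aimed at looks like the plain C++ below, except that in ISPC the loop becomes a foreach and the compiler maps iterations onto SIMD lanes for you, with no hand-written intrinsics. Hypothetical example, not from the paper:

[code]
// The kind of data-parallel kernel SPMD compilers like ISPC are aimed at,
// written here in plain C++ for illustration: independent iterations,
// straight-line arithmetic, contiguous data.
#include <cstddef>
#include <cmath>

void scale_bias(const float* in, float* out, std::size_t n,
                float scale, float bias) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::sqrt(in[i]) * scale + bias;
}
[/code]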
 
No, only the XCPU has twice the FLOPS - this agrees with what bg was saying

Sony's Jaguar is standard, which is why vgleaks are calling it Jaguar and not calling the XCPU Jaguar.

I still place a ? on that. Does anybody have confirmation of the FLOPS count for a vanilla 4-core Jaguar? So far it's only VGLEAKS saying it's 102.4 GFLOPS, but as with the 14+4 issue, we may have to get other sources when there are more technicalities involved.
 
I still place a ? on that. Does anybody have confirmation of the FLOPS count for a vanilla 4-core Jaguar? So far it's only VGLEAKS saying it's 102.4 GFLOPS, but as with the 14+4 issue, we may have to get other sources when there are more technicalities involved.
I believe.

8 FLOPs per cycle, 1.6 GHz, 8 cores.

8 × 1,600,000,000 × 8 = 102,400,000,000. So there ya go: 102.4 GFLOPS. Or half that for 4 cores = 51.2 GFLOPS.
 
I believe.

8 FLOPs per cycle, 1.6 GHz, 8 cores.

8 × 1,600,000,000 × 8 = 102,400,000,000. So there ya go: 102.4 GFLOPS. Or half that for 4 cores = 51.2 GFLOPS.

Well, Lherre said that "some" things in the latest Orbis target specs "doubled". My bet is CPU FLOPS, GPU L2 cache and texture cache (all of this plus 8 ACEs would make a good custom GPU design to improve both rendering and compute efficiency).
 
Both APUs would be custom designs. How custom is something it looks like we'll have to wait to find out. And then we can speculate on the merits of those customisations. Last time, the 360's version of the PPE core did have some advantages over the one in Cell, so it's definitely possible that a difference exists this time. And if Durango has DDR3, then more CPU FPU/SIMD prowess would perhaps be relatively more useful than with GDDR5, who knows?
 
I believe.

8 FLOPs per cycle, 1.6 GHz, 8 cores.

8 × 1,600,000,000 × 8 = 102,400,000,000. So there ya go: 102.4 GFLOPS. Or half that for 4 cores = 51.2 GFLOPS.

My ? goes to how Durango is getting the 200 GFLOPS rating when we're rating PS4 at 102.4 GFLOPS using simple math. I did the math too, so I understand where the 102.4 GFLOPS came from. I also think the 102.4 GFLOPS figure is most probably accurate.

Somehow these Durango cores would need to do something like 8 adds + 8 multiplies per clock.
If not that, then it would have to be either 3.2 GHz (lol) or 16 cores (clearly inaccurate)

[strike]or secret sauce[/strike]
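
Just to put numbers to those guesses (all hypothetical configurations, simple peak math only):

[code]
// Quick sanity check of the combinations mentioned above (all hypothetical):
#include <cstdio>

int main() {
    printf("8 cores  x  8 FLOPs/clk x 1.6 GHz = %5.1f GFLOPS\n", 8 *  8 * 1.6);  // 102.4
    printf("8 cores  x 16 FLOPs/clk x 1.6 GHz = %5.1f GFLOPS\n", 8 * 16 * 1.6);  // 204.8
    printf("8 cores  x  8 FLOPs/clk x 3.2 GHz = %5.1f GFLOPS\n", 8 *  8 * 3.2);  // 204.8
    printf("16 cores x  8 FLOPs/clk x 1.6 GHz = %5.1f GFLOPS\n", 16 * 8 * 1.6);  // 204.8
    return 0;
}
[/code]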
 
My ? goes to how Durango is getting the 200 GFLOPS rating when we're rating PS4 at 102.4 GFLOPS using simple math. I did the math too, so I understand where the 102.4 GFLOPS came from. I also think the 102.4 GFLOPS figure is most probably accurate.

Somehow these Durango cores would need to do something like 8 adds + 8 multiplies per clock.
If not that, then it would have to be either 3.2 GHz (lol) or 16 cores (clearly inaccurate)

[strike]or secret sauce[/strike]

What about 10 cores (2 used for the OS) at some higher clock speed, somewhere in the 2 GHz range (2.4 or so)? Could be why the SoC is having fab problems.
 
My ? goes to how Durango is getting the 200 GFLOPS rating when we're rating PS4 at 102.4 GFLOPS using simple math. I did the math too, so I understand where the 102.4 GFLOPS came from. I also think the 102.4 GFLOPS figure is most probably accurate.
[strike]or secret sauce[/strike]

Would the bespoke audio bits have a FLOPS rating? (The diagram seemed to put Kinect's MVEC inside the SoC, and there are persistent rumours that the audio system is special.)
 
What about 10 cores (2 used for the OS) at some higher clock speed, somewhere in the 2 GHz range (2.4 or so)? Could be why the SoC is having fab problems.

Would the bespoke audio bits have a FLOPS rating? (The diagram seemed to put Kinect's MVEC inside the SoC, and there are persistent rumours that the audio system is special.)

I have no damn idea other than it's fishy.
Most of the math in the rumors about Orbis/Durango adds up in some shape or form, but the Durango CPU currently doesn't.
 
For Durango, I predict two 4-core CPU clusters, with one core disabled for yield purposes and one reserved for the OS/background apps (leaving three cores per cluster for games).

For Orbis, Sony already stated eight cores, so that's all the cores in two four-core clusters. I predict they'll revise this down to 7 usable cores, reserving one to be disabled for yield purposes.

Cheers
 
If it is two four core clusters, it would seem rather silly to have one core disabled for yield purposes in one of the two. Either do it in both or not at all.

What we did hear is that the OS will use one core, though the rumor states that apparently it won't even reserve it fully, but we'll see.
 
If it is two four core clusters, it would seem rather silly to have one core disabled for yield purposes in one of the two. Either do it in both or not at all.

You'd give up about 15% of your performance (6 usable cores instead of 7) by always disabling two cores instead of one.

The OS could just number the cores in the cluster with four good cores as core 0-3, so games/applications always have the same view of CPU core configuration.
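
Something as simple as this mapping would do it. Purely a sketch with made-up names, generalising the idea of packing the surviving cores into a stable numbering; no idea how either OS actually handles it:

[code]
// Tiny sketch of the "renumber the cores" idea: the OS keeps a logical->physical
// map so a game always sees cores 0..N-1, whichever physical core was fused off
// for yield and whichever one the OS keeps for itself. Entirely hypothetical.
#include <array>
#include <cstdint>

struct CoreMap {
    std::array<int8_t, 8> logicalToPhysical{};  // -1 = not exposed to games

    // 'badCore' is the physical core disabled for yield, 'osCore' the one
    // reserved for the OS; the remaining cores are packed starting at logical 0.
    void build(int badCore, int osCore) {
        int logical = 0;
        logicalToPhysical.fill(-1);
        for (int phys = 0; phys < 8; ++phys)
            if (phys != badCore && phys != osCore)
                logicalToPhysical[logical++] = int8_t(phys);
    }
};
[/code]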

Cheers
 