Next Generation Hardware Speculation with a Technical Spin [2018]

Like having a super mega store with a doggy door as its sole entrance and exit.
 
I think the Xbox One X's CPU customizations seem a likely template for any future consoles, as in a much tighter symbiosis between the many cores of the GPU and the few, faster cores of the CPU.
 
Personally, I do not believe next gen consoles will use Jaguar CPU architecture again. I also think the idea of moving over to ARM is absolutely not happening.

95% chance it'll be the Zen 2/2+ family; that's what Sony and Microsoft are both going to use.
 
I'm going to guess 4 Zen 2/3(?) cores, 8 threads, with the rest of the silicon being used for graphics/compute cores.
 
SMT doesn't fit well in consoles, where code is optimized to keep the cores running as much as possible and to avoid tying up memory bandwidth with inappropriate calls... Better to have more (non-SMT) cores... IMHO balance is the key, and the One X is really great in terms of balance...
 
I already pointed out that doesn't happen. Maybe 5% of devs (actually far less) will manage their code at that low a level. The realities of economics mean you can't hand-hold a million lines of code as you would a SNES game, keep track of every transaction, and balance your code to never stall. I'd argue HPC is the place you'd see SMT removed if that made sense: HPC code can be manually tweaked line by line since you're running a single workload by and large, and the costs and savings are significant. If you could get 10% savings by removing SMT and writing unstallable code, it'd be worth it. However, you can't. It's hard enough writing multithreaded code that works efficiently, let alone code that makes optimal use of every CPU core.
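To illustrate the stall-hiding that SMT buys you without any hand-tuning, here's a minimal, hypothetical microbenchmark sketch (my own, Linux-only; the CPU IDs, buffer size, and iteration counts are assumptions you'd adjust for your machine's topology). A pointer-chasing thread is latency-bound and leaves the core's execution units idle; a compute-bound thread pinned to the sibling SMT thread of the same core can soak up those idle slots.

```cpp
#include <pthread.h>
#include <sched.h>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <random>
#include <thread>
#include <utility>
#include <vector>

// Pin the calling thread to one hardware thread (Linux/glibc only).
static void pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

// Latency-bound: every load depends on the previous one, so the core
// mostly sits stalled waiting on memory.
static void pointer_chase(const std::vector<int>& next, long steps) {
    int i = 0;
    for (long s = 0; s < steps; ++s) i = next[i];
    volatile int sink = i; (void)sink; // keep the loop from being optimized away
}

// Throughput-bound: pure ALU work that barely touches memory.
static void alu_spin(long iters) {
    unsigned x = 1;
    for (long s = 0; s < iters; ++s) x = x * 1664525u + 1013904223u;
    volatile unsigned sink = x; (void)sink;
}

int main() {
    // Build one big random cycle (Sattolo's algorithm) so the chase
    // misses cache: 1<<24 ints = 64 MB, larger than typical LLCs.
    const int N = 1 << 24;
    std::vector<int> next(N);
    std::iota(next.begin(), next.end(), 0);
    std::mt19937 rng(42);
    for (int i = N - 1; i > 0; --i)
        std::swap(next[i], next[std::uniform_int_distribution<int>(0, i - 1)(rng)]);

    auto run = [&](int chase_cpu, int alu_cpu) {
        auto t0 = std::chrono::steady_clock::now();
        std::thread a([&] { pin_to_cpu(chase_cpu); pointer_chase(next, 1L << 25); });
        std::thread b([&] { pin_to_cpu(alu_cpu); alu_spin(1L << 31); });
        a.join(); b.join();
        return std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();
    };

    // Which CPU IDs are SMT siblings of the same physical core varies;
    // check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list.
    std::printf("separate cores: %.2fs\n", run(0, 2));
    std::printf("SMT siblings:   %.2fs\n", run(0, 1));
}
```

Compile with g++ -O2 -pthread. If the two timings come out close, the otherwise-stalled core is absorbing most of the ALU thread's work for free, which is exactly the effect you'd otherwise have to hand-tune around.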
 
Personally, I do not believe next gen consoles will use Jaguar CPU architecture again. I also think the idea of moving over to ARM is absolutely not happening.

95% chance it'll be the Zen 2/2+ family; that's what Sony and Microsoft are both going to use.

I expect a Zen-based arch with a very Jaguar-like acceptance from the tech forum crowd. LOL.
 
I already pointed out that doesn't happen. Maybe 5% of devs (actually far less) will manage their code at that low a level. The realities of economics mean you can't hand-hold a million lines of code as you would a SNES game, keep track of every transaction, and balance your code to never stall. I'd argue HPC is the place you'd see SMT removed if that made sense: HPC code can be manually tweaked line by line since you're running a single workload by and large, and the costs and savings are significant. If you could get 10% savings by removing SMT and writing unstallable code, it'd be worth it. However, you can't. It's hard enough writing multithreaded code that works efficiently, let alone code that makes optimal use of every CPU core.
OK. But what about those old engines that rely on single-core performance?

For instance, at the same clock and the same number of cores, with an unoptimized port, which would be better for Broforce: CPU cores with or without SMT?
 
There have been examples of CPUs in the past that dropped some of the external address bits, as they weren't intended for systems that needed full addressing.

This shook loose an Amiga memory. The 68EC020 was one of these. It had 24-bit external addressing instead of the 32-bit addressing in the standard 68020.

Not really relevant to the discussion. I was just amazed that I had instant retrieval of that fact after all of this time not thinking about it.
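As a toy illustration of what dropping external address bits means (my own hypothetical C++ model, not anything from the 68K documentation): the CPU still computes full 32-bit addresses internally, but only A0-A23 reach the bus, so two addresses 16 MB apart alias the same external location.

```cpp
#include <cstdint>
#include <cstdio>

// Toy model of a 24-bit external address bus (as on the 68EC020):
// addresses are 32-bit internally, but only 24 lines leave the chip.
constexpr uint32_t kBusMask = 0x00FFFFFF;

uint32_t physical(uint32_t logical) { return logical & kBusMask; }

int main() {
    uint32_t a = 0x00123456;
    uint32_t b = 0x01123456; // 16 MB higher; the upper bits are dropped
    std::printf("%08X -> %06X\n", a, physical(a));
    std::printf("%08X -> %06X\n", b, physical(b)); // same bus address as a
}
```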

OK. But what about those old engines that rely on single-core performance?

For instance, at the same clock and the same number of cores, with an unoptimized port, which would be better for Broforce: CPU cores with or without SMT?

Games that aren't especially multithreaded wouldn't really be using the virtual SMT cores, right? So probably no difference?
 
OK then, let's try another case: a game like TLOUR using a custom and very efficient multithreaded engine (with fibers). Would the game run noticeably better with SMT?
 
It probably won't benefit them at all. They coded for full control; they don't want the system to switch threads, IIRC.
 
OK then, let's try another case: a game like TLOUR using a custom and very efficient multithreaded engine (with fibers). Would the game run noticeably better with SMT?

I don't see why not. You optimize for the target hardware. If the target hardware has SMT, you code with that capability in mind.

Edit: Did you mean running the current code unchanged? If so, are you comparing the same number of real cores the game was designed for, with SMT vs. without, or the same number without SMT vs. fewer cores with SMT?
 
It probably won't benefit them at all. They coded for full control; they don't want the system to switch threads, IIRC.
So what kind of engines, on consoles, could actually benefit from having SMT?
I don't see why not. You optimize for the target hardware. If the target hardware has SMT, you code with that capability in mind.

Edit: Did you mean running the current code unchanged? If so, are you comparing the same number of real cores the game was designed for, with SMT vs. without, or the same number without SMT vs. fewer cores with SMT?
Well, both unchanged and not. Would SMT-optimized code be faster than TLOUR's already heavily optimized multithreaded code?
 
If you've written a suitably optimal engine, SMT could potentially assign the wrong thread at the wrong time and block what you're trying to do. However, the scheduling ought to be making the same choices as the dev's code and so not trip itself up. That is, if there's no stall coming that needs hiding, the thread wouldn't be switched.

SMT won't result in a performance improvement over the best possible. If you write a piece of 100% perfect code, SMT doesn't increase performance; in that case it's unnecessary overhead, and worst case it can slow you down a bit. Realistically though, you add 5% more silicon to get 10% more performance, where that 10% better performance would otherwise require two/three/four times as much effort from the programmers to hand-tune. (Illustrative figures only, of course; it's impossible to quantify these things.)
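Working through those purely illustrative figures (the arithmetic is mine, the numbers are from the post above): spending 5% more area for 10% more throughput is still a net win in performance per unit of silicon,

$$\frac{\text{relative performance}}{\text{relative area}} = \frac{1.10}{1.05} \approx 1.048,$$

i.e. roughly a 4.8% gain per transistor spent, and the alternative route to that 10% is paid in programmer hours rather than die area.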
 
So what kind of engines, on consoles, could actually benefit from having SMT?

Well, both unchanged and not. Would SMT-optimized code be faster than TLOUR's already heavily optimized multithreaded code?
My response was specifically about ND's engine. Unfortunately there seems to be less information out there on how multicore programming works in other engines. But if I recall their presentation correctly, they want the cores to never switch threads away, because each thread is responsible for generating and switching fibers. Somewhere near the beginning of the presentation they mention that the OS would swap their threads between cores to run OS tasks, that this was detrimental to performance, and that it became the first thing they needed to lock down.

The way ND does multithreading aims for very high, equal core utilization. It ensures as much as possible that each available core is either working on some item or generating new items for other cores to work on. Without fibers, I think this kind of switching would carry too much overhead, but from what I remember of the presentation, fiber switches are minimal-cost. Fiber switching here is just a form of job pool, one they had full control over. The largest discernible difference from a standard job pool is that there is no master thread assigning work back onto the stack; they effectively decentralized that function (see the sketch after this post).

But a shared fiber pool is just one way to do multicore programming. Others opt to assign whole roles, like sound, animation, or AI, to specific cores, so the threads incur no switching overhead at all; they just work on what's been assigned to them. That may not saturate all the cores as well as a shared fiber pool, but as long as no core goes over budget, it works. And in that scenario SMT gives you two threads per core, which should further alleviate any bottlenecking on each core.

With SMT, the traditional shared thread/job pool also looks better. Small jobs, dispatched by a single controlling thread, are put into the pool as tasks for idle cores to pick up. SMT is probably going to help in that setup, since not all jobs are equal and you'll get better saturation of the CPU when a thread stalls.

I think overall SMT is a good thing, and I assume in the console space it could be a feature that developers can turn off if they desire.
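For reference, here's a minimal sketch of the "decentralized" shared pool described above. This is my own toy code, not ND's engine: a real implementation would use fibers and lock-free queues, while this uses plain std::thread and a mutex for brevity. The key property it keeps is that any job may push new jobs back into the pool, so no master thread hands out work.

```cpp
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Decentralized job pool: workers pull jobs from a shared queue, and
// jobs themselves may enqueue follow-up jobs; there is no dispatcher.
class JobPool {
public:
    explicit JobPool(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { worker(); });
    }
    void submit(std::function<void(JobPool&)> job) {
        {
            std::lock_guard<std::mutex> lk(m_);
            jobs_.push(std::move(job));
            ++pending_;
        }
        cv_.notify_one();
    }
    void wait_idle() { // returns once every submitted job has finished
        std::unique_lock<std::mutex> lk(m_);
        idle_.wait(lk, [this] { return pending_ == 0; });
    }
    ~JobPool() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
private:
    void worker() {
        for (;;) {
            std::function<void(JobPool&)> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(*this); // the job may submit more jobs: no central dispatcher
            {
                std::lock_guard<std::mutex> lk(m_);
                if (--pending_ == 0) idle_.notify_all();
            }
        }
    }
    std::mutex m_;
    std::condition_variable cv_, idle_;
    std::queue<std::function<void(JobPool&)>> jobs_;
    std::vector<std::thread> threads_;
    unsigned pending_ = 0;
    bool done_ = false;
};

int main() {
    // On SMT hardware, hardware_concurrency() is 2x the physical cores,
    // which is where the extra thread slots per core would pay off.
    JobPool pool(std::thread::hardware_concurrency());
    pool.submit([](JobPool& p) {
        std::printf("animate\n");
        p.submit([](JobPool&) { std::printf("then skin\n"); });
    });
    pool.wait_idle();
    std::printf("frame done\n");
}
```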
 
It's worth noting that the multithreading in Jaguar isn't SMT; it's clustered multithreading (CMT). The way threading workloads affect the cores is different, and what devs have to do to maximise core utilisation will differ between Jaguar and Zen.
 
The complete One X SoC is 7 billion transistors... an 8-core Ryzen alone is almost 5 billion...

That's why Jaguar, maybe improved, maybe with 16 cores, maybe with more cache, is still the first choice.
If it's improved, it's not Jaguar. It'd be a new architecture, something not on AMD's roadmap and bespoke for a console at (considerable) added cost.
 