So what kind of engines, on consoles, could actually benefit from having SMT?
Well, unchanged or not, would the optimized SMT code be faster than the already heavily optimized TLOUR multithreaded code?
My response was specifically about ND's engine. Unfortunately, there seems to be less information out there on how multicore programming works in other engines. But if I recall their presentation correctly, they want each core to never switch away from its thread, because that thread is responsible for generating and switching fibres. Somewhere near the beginning of the presentation, I do recall that threads would get swapped between cores for some reason, apparently to run OS tasks. That was detrimental to their performance and became the first thing they needed to lock down.
The way ND does multi-threading aims for very high, even core utilization. It ensures, as much as possible, that each available core is either working on some item or generating new items for other cores to work on. I think without fibres this type of switch would have too much overhead, but from what I remember of the presentation, fibre switches have minimal overhead. Fibre switching is just a form of job pool; in this case they had full control over how it would operate. From what I can see, the largest discernible difference between this and a standard job pool is that there is no master thread assigning work back onto the stack: they effectively decentralized that function.
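To make the "no master thread" point a bit more concrete, here is a rough sketch of a decentralized job queue. This is not ND's implementation: it uses plain Python threads rather than fibres (so a job cannot yield mid-way and resume later), and the names `run_decentralized` and `make_split_job` are made up for illustration. It only shows the part where any worker can both run jobs and generate new ones for whoever is free next.

```python
import queue
import threading

def run_decentralized(initial_jobs, num_workers=4):
    """Run jobs from one shared queue with no master scheduler thread.

    Each job is a callable that receives `submit`; it may call
    submit(new_job) to generate more work for any idle worker.
    """
    jobs = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def submit(job):
        jobs.put(job)

    def worker():
        while True:
            job = jobs.get()
            if job is None:              # sentinel: shut this worker down
                jobs.task_done()
                return
            try:
                value = job(submit)      # the job itself may enqueue more jobs
                if value is not None:
                    with results_lock:
                        results.append(value)
            finally:
                jobs.task_done()

    for job in initial_jobs:
        submit(job)
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    jobs.join()                          # returns once spawned jobs are done too
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    return results

def make_split_job(lo, hi):
    """Hypothetical job: sum range(lo, hi), splitting large ranges in two."""
    def job(submit):
        if hi - lo > 1000:
            mid = (lo + hi) // 2
            submit(make_split_job(lo, mid))
            submit(make_split_job(mid, hi))
            return None
        return sum(range(lo, hi))
    return job
```

Calling `sum(run_decentralized([make_split_job(0, 10_000)]))` starts from a single job that fans out into smaller ones, with the workers feeding each other instead of waiting on a central dispatcher.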
But having a shared fibre pool to work from is just one way to do multi-core programming. Others opt to assign full roles, like sound, animation, AI, etc., to specific cores, so the threads don't incur any switching overhead; they just work on what's been assigned to them. It may not saturate all the cores as well as the shared fibre pool, but as long as no core goes over budget, it will work. And in the SMT scenario, where you are given 2 threads per core, it should further help alleviate any possible bottlenecking on each core.
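For contrast, the fixed-role approach can be sketched like this. Again, this is only an illustration under my own assumptions: the function names, the role names, and the 16 ms budget are hypothetical, and actual pinning of a thread to a core is platform-specific and left out. Each role gets its own thread, nothing migrates between them, and the only question afterwards is whether any role blew its budget.

```python
import threading
import time

def run_frame_with_fixed_roles(subsystems, frame_input):
    """Run one frame with one dedicated thread per role.

    `subsystems` maps a role name (e.g. "audio") to a function of the
    frame input. Returns (outputs, timings), both keyed by role name.
    """
    outputs, timings = {}, {}
    lock = threading.Lock()

    def run_role(name, fn):
        start = time.perf_counter()
        result = fn(frame_input)         # this thread only ever does its own role
        elapsed = time.perf_counter() - start
        with lock:
            outputs[name] = result
            timings[name] = elapsed

    threads = [threading.Thread(target=run_role, args=(name, fn), name=name)
               for name, fn in subsystems.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # the frame ends when the slowest role ends
    return outputs, timings

def over_budget(timings, budget_s):
    """Return the roles whose measured time exceeded the per-frame budget."""
    return sorted(name for name, t in timings.items() if t > budget_s)
```

With something like `over_budget(timings, 0.016)` you would see which role is the bottleneck; the other threads simply sit idle for the rest of the frame, which is the utilization you give up compared with a shared pool.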
With SMT, traditional shared thread/job pools should improve as well. Mini jobs controlled by a single thread are assigned to the job pool as tasks for the idle cores to pick up and work on. SMT is probably going to be helpful if you set up for that type of workload, as not all jobs will be equal, and you're going to get better saturation of the CPU because when one thread stalls, the other hardware thread on the same core can keep it busy.
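A traditional pool of that kind, with one controlling thread handing out mini jobs of unequal size, might look roughly like the sketch below. `mini_job` and `run_mini_jobs` are made-up names. One relevant detail: `os.cpu_count()` reports logical CPUs, so on an SMT machine the pool is sized to the hardware threads rather than the physical cores. Keep in mind that CPython's global interpreter lock means this shows the structure only, not real CPU-bound speedups.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def mini_job(n):
    """Hypothetical unit of work whose cost grows with n."""
    return sum(i * i for i in range(n))

def run_mini_jobs(job_sizes, workers=None):
    """Submit every mini job from one controlling thread to a shared pool.

    By default the pool has one worker per logical CPU, which on an
    SMT system counts both hardware threads of each core.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The controlling thread queues all jobs; idle workers pick them up.
        futures = [pool.submit(mini_job, n) for n in job_sizes]
        # Results come back in submission order, whichever worker ran them.
        return [f.result() for f in futures]
```

Because the jobs are uneven, some workers finish early and immediately pull the next job, which is the kind of load balancing that extra hardware threads would help with.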
I think overall SMT is a good thing. And I assume that in the console space it could be a feature that developers can turn off if they desire.