- The OS runs on a separate ARM processor. AFAIR on Xenon the OS uses an entire thread on one of the cores.
I don't think so. The ARM core just handles some security and authentication stuff.
FWIW the Wii U CPU is 3 cores at 1.7× the Wii CPU clock. So 3 × 1.7 = 5.1× the Wii CPU. Then with more cache maybe you can call it 6×. I got that from a GAF post, but I guess 6× isn't a horrible generational leap, though back in the PS360 day it wasn't a great one either.
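That back-of-envelope math works out like this (the ~729 MHz Broadway and ~1.24 GHz Espresso clocks below are the commonly reported figures, not confirmed specs, and the whole thing assumes throughput scales linearly with cores and clock):

```python
# Rough scaling estimate from the post above -- assumed clocks, and a
# naive "cores x clock ratio" model that ignores cache, IPC, etc.
wii_clock_mhz = 729     # Wii "Broadway", reported figure
wiiu_clock_mhz = 1243   # Wii U "Espresso", reported figure
cores = 3

clock_ratio = wiiu_clock_mhz / wii_clock_mhz          # ~1.7x per core
naive_multiple = cores * clock_ratio                  # ~5.1x overall
print(f"clock ratio: {clock_ratio:.2f}x, naive total: {naive_multiple:.1f}x Wii")
```

The extra cache is where the hand-wavy "call it 6×" comes from; nothing in the arithmetic itself justifies it.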
I don't know for sure, but I think it may be too late.
That's what threading is. If you have a thread that has its own parallel code stream and execution units, you've got a core. Threading is about optimising use of execution units by running multiple streams of code through the processor. Depending on the number and type of execution units, threading can have no benefits at all.
For Wii U's CPU capability, we need to know first the peak throughput in terms of execution units, and then compare its efficiency vs. code running on Xenon.
I clarified that in my post but unsurprisingly this rabid thread is too fast for an edit. That it runs its own OS doesn't mean it's running the OS of the Wii (or Wii U for that matter).
In the early days of Cell discussion, someone posted a link about cache efficiency showing that cache misses on some architecture (I think x86) were pretty rare. Add in that devs have now gotten used to structuring their data to feed stream processors, and it should only be a small subset of workloads where SMT can be of any benefit. I'm thinking single-digit percentages. Certainly not order-of-magnitude gains on efficient game code, such that a hypothetical Xenon with multithreading and OoOE could compete with the same chip sans SMT and IOE at twice the frequency. SMT is great in a fully multitasking OS running all sorts of code, but should be of little significant value to well-written game code.

SMT will give the processor something to do while waiting on a cache miss, and will lessen the average impact of a branch misprediction if some of the instructions in flight are from the other thread. No processor can schedule around misses in the LLC, so unless there are no misses and no branch mispredicts you'll get some benefit, even if you're otherwise keeping the execution units fed.
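The stall-hiding argument can be put into a toy model (every number below is made up for illustration; none of it is measured Xenon behaviour):

```python
# Toy CPI model: SMT helps by overlapping one thread's cache-miss
# stalls with useful work from the sibling hardware thread.
def effective_cpi(base_cpi, miss_rate, miss_penalty, stall_hidden=0.0):
    """Cycles per instruction; stall_hidden is the fraction of
    miss-stall cycles covered by the other thread (0 = no SMT)."""
    return base_cpi + miss_rate * miss_penalty * (1.0 - stall_hidden)

# Rare misses (0.02% of instructions), a 500-cycle LLC miss penalty,
# SMT hiding 60% of the stall cycles -- all assumed values.
no_smt = effective_cpi(1.0, miss_rate=0.0002, miss_penalty=500)
smt = effective_cpi(1.0, miss_rate=0.0002, miss_penalty=500, stall_hidden=0.6)
print(f"SMT speedup: {no_smt / smt:.2f}x")
```

With misses that rare the gain stays in single digits, which is the first poster's point; crank `miss_rate` up to cache-hostile levels and the same model shows SMT paying for itself, which is the reply's point.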
Single-digit gain in performance, making assumptions as to what game engine code looks like. ERP has other ideas based on much more relevant experience.

You mean only a single-digit percentage of workloads will see any benefit, or you'd only see an average single-digit percentage gain?
No, I'm saying that those who believe OoOE or multithreading can make all the difference in a processor are mistaken. Multithreading is a cheap way, in terms of silicon, to gain extra performance, but it's a very different sort of extra performance from adding execution units, increasing clocks, etc. It depends on what you're doing and when. SMT only came up here because Rangers mentioned it, and it was worth covering what SMT brings to the table.

No one's stipulating order-of-magnitude gains...

You're claiming that Xenon's SMT was a waste of hardware, and that is the one complaint I have not heard anyone make.
COD never had these.

At the cost of things like SSAO and dynamic lighting...
And I'm pretty sure Blops 2 is lower than Blops 1?
In total throughput, Xenon will be more powerful. But in terms of executing game code, a lot depends on the code devs are using. I believe Xenon will be more powerful because I believe devs are writing optimised code that makes efficient use of the processor, but it's wrong to compare the straight numbers. GHz*threads is not at all accurate!
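A sketch of why the naive metric misleads (the numbers are illustrative, not benchmarks of either chip): two SMT threads share one core's execution units, so hardware threads don't multiply throughput the way cores or clock speed do.

```python
def naive_score(ghz, hw_threads):
    # The "GHz * threads" figure quoted in forum comparisons.
    return ghz * hw_threads

def sustained_score(ghz, cores, ipc):
    # Billions of instructions/sec actually retired, given an
    # assumed sustained IPC per core on typical game code.
    return ghz * cores * ipc

# Hypothetical Xenon-like part: 3.2 GHz, 3 cores, 2 threads each,
# with an assumed sustained IPC of 0.5 on in-order cores.
print(f"{naive_score(3.2, 6):g} GHz-threads")
print(f"{sustained_score(3.2, 3, 0.5):g} Ginstr/s sustained")
```

The first number is four times the second, and neither says anything about how efficiently real game code keeps the pipelines fed, which is the whole argument of this thread.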
There's a lot of existing code developers are and will be using. It was written and tuned for in-order, deep-pipeline processors, so it's not like an OoO CPU will run it significantly faster. If anything, it's harder to optimize for an OoO CPU because it's less predictable than its less "brainy" counterpart*. Unless existing code gets dropped on the floor and developers stop thinking about branching, pairing and whatnot, there's little reason to believe that Wii U will gain an advantage from architectural differences. It's also important to keep in mind that code won't suddenly be written for Wii U alone (not by 3rd parties, and not any time soon at least), so it will have to behave well on PS360. Calling GHz*threads inaccurate is really a straw man here: I haven't seen anyone doing the brain-dead math of "a*b = k*c*d, therefore 360 is k times more powerful than Wii U."
*Intel concluded some time ago that wimpy cores (deep pipelines, specialized, in-order) may give more raw power than brainy ones, but very few people know how to write decent code for them. Game developers are one of the rare breeds that do.
I'm not quite sure where Nintendo have gone with Wii U. Seems to be an awkward middle ground. Weak OoOE means, I presume, developers will still need to spend time optimising their code, especially given the lack of raw power in the CPU. If Nintendo had gone with simple cores with more grunt, I'd understand. If they went with a really easy-to-use CPU, I'd understand. But Wii U seems to offer not much of either. Almost as if their conversation with IBM went something like:
"We want it small and BC."
"Okay, we'll take the existing Broadway design and go multicore."
"But we want it better too."
"Okay, we can add some good out-of-order execution to make it easier to write for."
"Great. Only don't make it too big."
"Um, right. So we'll add...a little out-of-order support?"
"Yeah!"
I bet people would rather write nice code without thinking about the architecture. But they have been thinking about the architecture while writing code for some time now; otherwise we'd have seen less improvement on consoles over the past six years or so. If you write any high-frequency code for PS360, you'll face the need to optimize: streamline, get rid of jumps/calls, etc. A lot of engine code is reused from title to title. There's a lot of code that works great on PS360 and won't benefit much from an OoO CPU, and it's not like developers will start writing code cowboy-style: PS360 is still there as a target for their titles.

Intel's targeting a different market with PCs. You have legacy code and a massive range of developer abilities writing on a zillion different platforms. x86 has had a very strong requirement to make bad code run fast. In precision software-engineering jobs with a high quality of engineering, all those hardware extras don't achieve a great deal, in theory. However, as mentioned many times this gen, developers don't want to be writing CPU-hand-holding code. They'd much rather be able to whack code onto the console and have it run well without effort.
And yes, Intel is in a different market, so your statement validates my previous claim. Intel is in the "brainy" market: lots of code of varying quality. Console code is mostly written by people with a lot of expertise, and a lot of effort goes into optimization. This is why game developers can and do deal with the "wimpy" cores in PS360.