bgassassin
Regular
Are you mixing up wavefronts with "threads" (work items)? While the wavefronts are the actual hardware threads, in the context of GPUs it is usually the work items (of which there are 64 in a wavefront) that get called "threads". But there is no way you can fill even the very small Latte GPU with just 160 or 192 of these "threads". For starters, the VLIW architectures always interleave two wavefronts on a single SIMD to cover instruction latencies (the command processor keeps more wavefronts around to swap in when one hits a long-latency instruction [memory access] or control flow). That means you already need 256 "threads" (4 wavefronts) at minimum just to hide the ALU latencies on a tiny GPU with only two SIMDs. To run efficiently, you would want significantly more than that (an order of magnitude or something in that range).
And there is actually no efficient way to run fewer "threads" than there are in a wavefront. So 10 "threads" for GS doesn't make the slightest sense; 10 wavefronts (640 "threads") do.
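For anyone following the arithmetic in the quoted post, here is a rough sketch of it in Python. The two constants (64 work items per wavefront, two interleaved wavefronts per SIMD) come straight from the post; the function names are made up for illustration and are not any real API.

```python
# Figures taken from the quoted post.
WORK_ITEMS_PER_WAVEFRONT = 64
INTERLEAVED_WAVEFRONTS_PER_SIMD = 2


def min_wavefronts(num_simds: int) -> int:
    """Wavefronts needed just to cover ALU latency on every SIMD (hypothetical helper)."""
    return num_simds * INTERLEAVED_WAVEFRONTS_PER_SIMD


def work_items(wavefronts: int) -> int:
    """Number of "threads" (work items) contained in the given number of wavefronts."""
    return wavefronts * WORK_ITEMS_PER_WAVEFRONT


def wavefronts_occupied(num_work_items: int) -> int:
    """Whole wavefronts occupied by a number of work items (ceiling division)."""
    return -(-num_work_items // WORK_ITEMS_PER_WAVEFRONT)


if __name__ == "__main__":
    simds = 2  # the "tiny GPU with just two SIMDs" from the post
    print(min_wavefronts(simds), work_items(min_wavefronts(simds)))
    print(work_items(10))
    print(wavefronts_occupied(10))
```

With two SIMDs this gives 4 wavefronts and 256 work items, 10 wavefronts hold 640 work items, and 10 lone work items still occupy one whole wavefront.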
I probably was, because of some recent explanations that amounted to 1 thread : 1 shader. My understanding of threads is limited, and even that dates from a long time ago. So thanks for the clarification.
So how would you explain why Nintendo lists the numbers in this manner? It apparently even led a dev to say Latte had 192 ALUs.