Yes, I've read it, and some parts do not make sense at all because you are trying to reach a conclusion.
I'm not the one who's trying to reach a predetermined conclusion here.
Who spoke about FLOPS utilization?
The guys at the start of the discussion? "GA102 never reaches 30 TFLOPs in games because the shader processors will halt while waiting for other bottlenecks in the chip"
You in your previous post? "Does it mean Ampere will hit double of FP throughput respect to Turing at ISO clocks? No, it will depend on the workload. This is all."
I was speaking about hardware utilization, in particular second FP pipeline staying idle whenever an INT instruction must be executed.
H/w utilization means little if your chip is still efficient enough per transistor to be competitive. That's the beauty of GPUs - you can do things in a million different ways, and the only thing that matters is performance per price. So strictly speaking it doesn't even matter and all this discussion is pure theory.
But you are trying to push your view, even saying that "Ampere has increased HW utilization". If we were talking about ROPs or TMUs, I could agree, but we were specifically talking about the shader core, and in the shader core I ALWAYS have either the second FP or the INT pipeline idle, whereas in Turing that does not happen. So there is hardware in Ampere that is always unused. How you can write "hardware utilization increases" is beyond me.
Well, let's look at the information we have, then.
Do you see any "second FP OR the INT pipeline" here?
As I've said, the answer to that question is tied to the answer to how exactly Ampere handles INT execution.
If it's a separate SIMD then yeah, there will be more idle h/w in Ampere than in Turing when running the same code.
If it's the same SIMD that is used for FP32, then no, there will be less idle h/w here than in Turing.
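To put rough numbers on those two scenarios, here is a toy model in Python. It is only a sketch under loud assumptions, not a description of how the real scheduler works: each design is reduced to a count of SIMD blocks and issue ports, every instruction takes one cycle, there are no dependencies or memory stalls, and the names `Design`, `cycles` and `utilization` are made up for this post. The "separate INT SIMD" and "shared FP/INT SIMD" variants are the two hypothetical Ampere layouts from the argument above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    simd_units: int     # physical SIMD blocks counted as "hardware" (assumed)
    fp_per_cycle: int   # peak FP32 instructions issued per cycle
    shared_port: bool   # True: one FP-only port plus one port taking FP or INT


def cycles(d: Design, n_fp: int, n_int: int) -> int:
    """Cycles needed to issue the mix, ignoring stalls and dependencies."""
    if not d.shared_port:
        # Dedicated FP and INT ports run side by side (Turing-style).
        return max(n_fp, n_int)
    # INT can only use the second port; FP can use either one.
    total = n_fp + n_int
    return max(n_int, -(-total // 2))  # -(-x // 2) is ceil(x / 2)


def utilization(d: Design, n_fp: int, n_int: int) -> tuple[float, float]:
    """Return (hardware utilization, FLOPS utilization) for the mix."""
    c = cycles(d, n_fp, n_int)
    hw = (n_fp + n_int) / (d.simd_units * c)   # busy SIMD-cycles / available
    flops = n_fp / (d.fp_per_cycle * c)        # FP issued / peak FP
    return hw, flops


TURING = Design("Turing", simd_units=2, fp_per_cycle=1, shared_port=False)
AMPERE_SEPARATE = Design("Ampere, separate INT SIMD", 3, 2, True)
AMPERE_SHARED = Design("Ampere, shared FP/INT SIMD", 2, 2, True)

if __name__ == "__main__":
    # Illustrative mix only: 100 FP instructions plus 36 INT instructions.
    for d in (TURING, AMPERE_SEPARATE, AMPERE_SHARED):
        hw, flops = utilization(d, n_fp=100, n_int=36)
        print(f"{d.name}: h/w {hw:.3f}, flops {flops:.3f}")
```

For that illustrative mix the model gives h/w utilization of 0.680 for Turing, 0.667 for the separate-SIMD Ampere and 1.000 for the shared-SIMD Ampere, while FLOPS utilization is 1.000 for Turing and 0.735 for both Ampere variants. So the two metrics move independently, and the h/w comparison flips depending on which layout you assume (and, in the separate-SIMD case, on the instruction mix).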
You are continuing to move the target.
I'm dead solid on my target. It's you who constantly move between h/w and flops utilization - which aren't at all the same.
And then you will have the hardware on the INT pipeline completely unused.
Again, choose what you're talking about. It's either perf/flop which this discussion has started on or general h/w utilization. If it's the latter then there are two possible scenarios for Ampere, not one.
But since I understand that you are not trying to discuss honestly and are instead pushing your agenda here, I will stop discussing this.
Says the man who moved the goalposts at least twice in two consecutive posts.