I wonder how games steadily become more and more graphics-heavy and gameplay-lite (especially AAA titles on consoles)
I already explained how. Besides, not all games are AAA titles on consoles, and they are not even the most popular or profitable.
Yeah, that's why you have to use a denoiser and other tricks to hide the fact that there's a paltry number of rays in each image.
That's not nearly as bad as you're trying to imply.
Ray tracing introducing noise is a wrong assumption. Discretization, Monte Carlo integration and stochastic sampling cause the noise, and that machinery is required for physically based reflections/shadows/etc. with both ray tracing and rasterisation.
Get rid of these concepts and treat all materials as perfect mirrors or perfectly diffuse and you're done with noise, but that's a derp solution.
As for denoisers, they blur rough surfaces where the noise shows up, but that's not as critical as it sounds: integrating thousands of samples from all possible directions would still produce a very blurry reflection on a rough surface, because reflections on rough surfaces must be blurry; that's exactly what we are doing mathematically by integrating and averaging many samples.
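Here's a minimal toy sketch of that point (the 1D "environment" and the roughness/lobe model are made up purely for illustration, not taken from any engine): few samples give estimates that jump around (noise), many samples converge, but they converge to an average over a wide cone of directions, i.e. a blurred reflection.

```python
# Toy sketch: Monte Carlo estimate of reflected radiance on a rough surface.
# Everything here (environment function, Gaussian lobe) is a made-up example.
import numpy as np

rng = np.random.default_rng(0)

def environment(angle):
    # Toy 1D environment: a bright "light" around angle 0 plus dim ambient.
    return 5.0 * np.exp(-(angle / 0.05) ** 2) + 0.1

def reflected_radiance(view_angle, roughness, num_samples):
    # Average the environment over directions scattered around the mirror
    # direction; the spread stands in for surface roughness.
    samples = view_angle + roughness * rng.normal(size=num_samples)
    return environment(samples).mean()

view_angle = 0.02
for n in (4, 64, 4096):
    estimates = [reflected_radiance(view_angle, roughness=0.2, num_samples=n)
                 for _ in range(8)]
    # Few samples: the 8 estimates disagree (that's the visible noise).
    # Many samples: they agree, but on a blurred value, because the rough
    # lobe averages the environment over a wide range of directions.
    print(n, "samples:", np.round(estimates, 3))
```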
You're basically saying that we have to pay extra so that poor devs have to do less routine work. A noble endeavour, but I'll pass, thanks.
Another wrong assumption: guess who will pay, one way or another, for ever-increasing development costs?
The business will cover its expenses with microtransactions, microservices, loot boxes, DLCs, you name it.
You're probably using a different definition of "burst" than I am. Aside from RT there is rarely any single workload that occupies the GPU for more than 10% of total frame time. And during that time utilization is almost always poor.
I would consider geometry draw calls a burst workload: they are usually way below 1 ms, cache flushes and state changes can be required in between draw calls, and other overheads are possible. These are usually small, bursty and low-utilization (that's why async compute is usually overlapped with them).
I haven't seen a math bound frame in any game that I've profiled. If you're lucky you'll have one or two workloads during the entire frame that keep the ALUs 50% busy.
By math bound I simply mean that frame performance is limited by computations on the GPU die, whether in fixed-function blocks or SIMDs; obviously, 100% SIMD ALU utilization is a very rare case.
Following the roofline model, what I can say for sure is that the vast majority of frames are not bound by VRAM bandwidth.
There are plenty of articles on the performance impact of memory frequency; performance never scales linearly with memory frequency, for obvious reasons (the roofline model).
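A quick sketch of why that follows from the roofline model (the peak-compute, bandwidth and arithmetic-intensity numbers below are hypothetical, not any specific GPU): attainable throughput is min(peak compute, arithmetic intensity × bandwidth), so extra memory bandwidth only helps the passes that sit on the bandwidth slope of the roof.

```python
# Roofline model: attainable FLOP/s = min(peak_compute, arithmetic_intensity * bandwidth).
# All numbers below are hypothetical, chosen only to show the shape of the curve.
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    return min(peak_gflops, flops_per_byte * bandwidth_gbs)

peak = 20000.0                       # GFLOP/s, hypothetical GPU
for bw in (400.0, 500.0, 600.0):     # GB/s, pretend memory-frequency steps
    for ai in (2.0, 50.0):           # FLOP/byte: bandwidth-heavy vs compute-heavy pass
        print(f"BW={bw:>5} GB/s, AI={ai:>4}: {attainable_gflops(peak, bw, ai):8.0f} GFLOP/s")

# Only the low-AI (bandwidth-bound) pass scales with memory clocks; once
# AI * BW exceeds peak compute, extra bandwidth changes nothing, which is why
# whole-frame times rarely scale linearly with memory frequency.
```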
Also, regression models usually converge to low coefficients for the bandwidth metric, way lower than for the other metrics (especially when you combine all GPU metrics together), which suggests that games are rarely bandwidth-bound in general.
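For anyone unfamiliar with the approach, here is a sketch of the regression idea on entirely synthetic data (the counters, their ranges and the fitted numbers are fabricated for illustration, not real measurements): regress per-frame GPU time on a few per-frame counters and compare the fitted coefficients.

```python
# Sketch only: linear regression of frame time on GPU counters, with synthetic
# data. Counter names, ranges and coefficients are illustrative, NOT measurements.
import numpy as np

rng = np.random.default_rng(1)
n_frames = 500
alu_busy   = rng.uniform(0.2, 0.8, n_frames)   # fraction of frame the ALUs are busy
dram_bytes = rng.uniform(1.0, 4.0, n_frames)   # GB read+written per frame
draw_calls = rng.uniform(1e3, 5e3, n_frames)   # draw calls per frame

# Synthetic "ground truth": frame time driven mostly by ALU work and per-draw
# overhead, with only a small bandwidth term, plus measurement noise.
frame_ms = (10.0 * alu_busy + 0.3 * dram_bytes + 0.001 * draw_calls
            + rng.normal(0.0, 0.2, n_frames))

X = np.column_stack([alu_busy, dram_bytes, draw_calls, np.ones(n_frames)])
coef, *_ = np.linalg.lstsq(X, frame_ms, rcond=None)
print("coefficients (alu, dram, draws, const):", np.round(coef, 4))

# In real profiles, a persistently small bandwidth coefficient relative to the
# counter's range is what the comment above reads as evidence that frames are
# rarely bandwidth-bound.
```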