From how I understood that presentation in the context of his consultation: prior GPU designs had a fixed maximum amount of each specific memory type (register/threadgroup/tile) that you could allocate before spilling to higher-level caches/memory. In that design you usually end up with unused memory resources depending on the shader/kernel (compute = unused tile memory, graphics = unused threadgroup memory, etc.), and if you wanted to allocate more of a specific memory resource than the fixed budget allowed, the allocation would spill to slower/higher-latency caches and memory.
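A minimal toy model of that fixed-partition behavior (the budget numbers are made up for illustration and are not real hardware limits): each memory type gets its own fixed budget, and anything requested beyond a type's budget spills, even while another type's budget sits unused.

```python
# Toy model of a fixed-partition on-chip memory design. The budgets below are
# hypothetical illustrative numbers, not any real GPU's actual limits.
FIXED_BUDGETS_KB = {"register": 64, "threadgroup": 32, "tile": 32}

def spill_fixed(requests_kb):
    """Return how many KB of each memory type spill under fixed per-type budgets."""
    return {
        mem_type: max(0, requested - FIXED_BUDGETS_KB[mem_type])
        for mem_type, requested in requests_kb.items()
    }

# A compute kernel that wants extra threadgroup memory but no tile memory:
# the unused tile budget can't help it, so 16 KB spills anyway.
print(spill_fixed({"register": 48, "threadgroup": 48, "tile": 0}))
# -> {'register': 0, 'threadgroup': 16, 'tile': 0}
```

The point of the sketch is just that the spill decision is made per type, so an idle partition is wasted capacity.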
What dynamic caching does is let you flexibly carve out unused memory resources to allocate more of whichever memory types are actually in use. Occupancy improves in the sense that you avoid more cases of spilling to higher-latency memory, so your shader/kernel spends less time waiting/idling on memory accesses, but otherwise you won't see the hardware launch more waves. It's conceptually similar to Nvidia Volta's unified L1/shared memory pool, but it goes one step further and unifies the register space as well!
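The same idea can be sketched as a toy model where all memory types draw from one unified pool (again with made-up illustrative numbers, not real hardware limits): unused capacity of one type can be carved out for another, so a request that would spill under fixed per-type budgets can fit.

```python
# Toy model of a unified on-chip memory pool. 128 KB is a hypothetical number
# chosen only so the pool equals the sum of the fixed budgets it replaces.
UNIFIED_POOL_KB = 128

def spill_unified(requests_kb):
    """Return total KB that spill when all memory types share one pool."""
    total_requested = sum(requests_kb.values())
    return max(0, total_requested - UNIFIED_POOL_KB)

# A compute kernel requesting 48 KB of registers and 48 KB of threadgroup
# memory but no tile memory: 96 KB fits in the 128 KB pool, so nothing
# spills -- the unused tile capacity absorbs the extra threadgroup use.
print(spill_unified({"register": 48, "threadgroup": 48, "tile": 0}))  # -> 0
```

Under the fixed per-type budgets the same request would spill; here the idle tile capacity is simply reused, which is the occupancy win described above.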
On AMD, their latest hardware design can seemingly vary the number of waves in flight dynamically throughout the execution of a shader/kernel ...