Thanks for your answer, sebbi.
FYI, we sample a dynamic cube map for IBL (PBR). In the lighting pass we sample it at a predefined mip level computed from the material roughness. However, for low roughness we get texture aliasing on distant objects, so I want to clamp that computed mip level according to the derivatives (i.e. there is no need to sample mip level 0 for an object that's very far from the camera).
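To make the clamp concrete, here is a minimal sketch of what I have in mind, in Python rather than shader code. The linear roughness-to-mip mapping and the `texels_per_pixel` footprint measure are assumptions for illustration, not what our engine actually uses.

```python
import math

def roughness_to_mip(roughness, mip_count):
    # Assumed linear mapping; a real engine may use a tuned curve instead.
    return roughness * (mip_count - 1)

def footprint_mip(texels_per_pixel):
    # LOD at which one texel covers roughly one screen pixel.
    # texels_per_pixel is a hypothetical footprint measure derived from the derivatives.
    return max(0.0, math.log2(max(texels_per_pixel, 1e-8)))

def clamped_ibl_mip(roughness, texels_per_pixel, mip_count):
    # Never sample a sharper mip than the screen-space footprint allows.
    mip = max(roughness_to_mip(roughness, mip_count), footprint_mip(texels_per_pixel))
    return min(mip, mip_count - 1)
```

So a perfectly smooth material (roughness 0) on a distant object covering 8 texels per pixel would sample mip 3 instead of mip 0.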
The clamping mip level will be computed by sampling, using the derivatives, a dummy single-channel mipmapped cube map in which each mip level contains its own index (i.e. the first level is filled with 0s, the second with 1s, etc.). Sampling that cube map therefore returns the mip level that I can clamp with.
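A small sketch of why the dummy texture works, assuming trilinear filtering between mip levels. Since every texel in level i holds the value i, one scalar per level is enough to model it here; the function names are made up for the example.

```python
import math

def make_mip_index_chain(mip_count):
    # Level i is filled with the constant i, so a single scalar stands in for all its texels.
    return [float(i) for i in range(mip_count)]

def sample_trilinear(chain, lod):
    # Clamp the LOD to the valid range, then blend the two nearest mip levels.
    lod = min(max(lod, 0.0), len(chain) - 1)
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(chain) - 1)
    t = lod - lo
    return chain[lo] * (1.0 - t) + chain[hi] * t
```

Whatever LOD the hardware picks from the derivatives, the filtered result is that LOD itself (clamped to the chain), which is exactly the value I want to clamp with. With a normalized texture format the stored indices would presumably need rescaling.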
The question is whether to do that during geometry pass, and store the resulting miplevel to use for clamping in a gbuffer, or during lighting pass (in which case there is no gbuffer layout change, I will have to reconstruct the derivatives, and sample the dummy miplevel cube map using the computed derivatives).
I used to store some gradients in a G-buffer in the past, since I had a similar issue with shadow maps (cascade-VSM). However, we are bandwidth-limited (especially with MSAA), so I removed the explicit storage of derivatives in the G-buffer and now rebuild them in the PS. I sample the depth G-buffer at the neighbouring pixels (left/right and top/bottom), pick on each axis the neighbour with the smallest depth difference (to handle object discontinuities), and reconstruct the position from the selected neighbour's depth; the derivative is the delta between the selected neighbour and the currently shaded pixel.
This turned out to be faster in my case. I assume this is the solution you describe in your second paragraph (although I am doing this in a pixel shader, not using compute).
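Roughly, the reconstruction looks like this. It is a Python sketch rather than the actual shader: `view_pos` is a hypothetical pinhole reconstruction (pixel coordinates relative to the screen centre, linear view depth), and `depth_at` stands in for a depth G-buffer fetch.

```python
def view_pos(px, py, depth, focal=1.0):
    # Hypothetical pinhole reconstruction of a view-space position from pixel + linear depth.
    return (px * depth / focal, py * depth / focal, depth)

def axis_derivative(px, py, step, depth_at, focal=1.0):
    """Position derivative along one screen axis; step is (1, 0) or (0, 1)."""
    d0 = depth_at(px, py)
    dm = depth_at(px - step[0], py - step[1])  # left or top neighbour
    dp = depth_at(px + step[0], py + step[1])  # right or bottom neighbour
    p0 = view_pos(px, py, d0, focal)
    # Pick the neighbour whose depth is closest, to avoid crossing an object edge.
    if abs(dm - d0) <= abs(dp - d0):
        pm = view_pos(px - step[0], py - step[1], dm, focal)
        return tuple(a - b for a, b in zip(p0, pm))
    pp = view_pos(px + step[0], py + step[1], dp, focal)
    return tuple(a - b for a, b in zip(pp, p0))
```

The sign is kept consistent (always "towards +x" or "towards +y") regardless of which neighbour is selected, so the result can be used like a regular ddx/ddy.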
Now, with this cube map, the computations to create the reflection vector are a bit heavier, since we handle parallax, etc. I assume I could bypass the parallax steps when computing the gradients. However, this would mean a few G-buffer reads (4 depth + 2 normals), vs. a single write (geometry pass) + a single read (mip level) if I store the mip level in a G-buffer (but with overdraw).
Storing the mip level sounds like a good idea, although it makes my G-buffer format slightly fatter. I assume I will have to implement both approaches and test which one is faster, since I can't really guess in advance.
Long answer; I guess I am just thinking out loud, in fact.