Nvidia engineers spoke last year about chiplet design (I can't find the interview) and were very specific that future consumer products would remain monolithic dies for cost and complexity reasons.
I'd agree that it's probably highly unlikely in this case too, although not impossible.
After reading through the patent a few times, one of the first things I thought was that the strange rumours about GDDR6X made a little more sense if the ray tracing hardware was being at least partially decoupled from the rest of the GPU.
Interestingly enough, one of GDDR6's generally unused features is its dual-channel-per-package design: each package exposes two independent 16-bit channels.
It's not too much of a stretch to imagine tweaking that: either add a 3rd 16-bit channel (read-only or otherwise) with identical signaling specs, giving the ray tracing 'coprocessor' its own half-width memory bus.
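For a sense of scale, a single 16-bit channel works out to a decent chunk of bandwidth per package. Quick back-of-envelope (the per-pin data rate and 8-package card are my assumptions for illustration, not anything from the patent):

```python
# Back-of-envelope bandwidth for a single 16-bit GDDR6 channel.
# Assumed figures: 14 Gbps/pin (a common GDDR6 speed grade) and an
# 8-package card; neither number comes from the patent.

def channel_bandwidth_gbytes(pin_rate_gbps: float, width_bits: int) -> float:
    """Bandwidth of one channel in GB/s (decimal gigabytes)."""
    return pin_rate_gbps * width_bits / 8  # 8 bits per byte

per_package = channel_bandwidth_gbytes(14.0, 16)
print(per_package)      # 28.0 GB/s per package for the coprocessor's channel
print(per_package * 8)  # 224.0 GB/s aggregate across 8 packages
```

So even a half-width bus across a full complement of packages would give the coprocessor meaningful bandwidth of its own.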
Or alternatively, multiplex the coprocessor's memory access onto one of the existing two 16-bit channels.
i.e., most of the time the base GPU keeps its full memory bandwidth, but when the 'coprocessor' needs to access memory, it could lock the 2nd channel on the GDDR6 for its own use and release it when finished. At no point does the GPU completely stall or get locked out of memory; it just drops to half memory bandwidth from time to time. If you aren't ray tracing, the base GPU gets the full memory bandwidth 100% of the time and the coprocessor sits idle.
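That sharing scheme can be sketched as a toy model. To be clear, this is pure illustration of the lock/release idea described above: the class name, the acquire/release API, and the "GPU keeps exactly half" worst case are all my assumptions, not anything from the patent.

```python
# Toy model of the channel-sharing idea: one GDDR6 package with two
# independent 16-bit channels. The base GPU normally uses both; a
# hypothetical RT coprocessor can lock channel 1, dropping the GPU to
# half bandwidth on that package until the channel is released.

class Gddr6Package:
    def __init__(self) -> None:
        self.coproc_holds_ch1 = False  # channel 1 free at power-on

    def coproc_acquire(self) -> None:
        """Coprocessor takes exclusive ownership of channel 1."""
        self.coproc_holds_ch1 = True

    def coproc_release(self) -> None:
        """Coprocessor returns channel 1 to the GPU."""
        self.coproc_holds_ch1 = False

    def gpu_bandwidth_fraction(self) -> float:
        # The GPU never fully stalls: worst case it keeps channel 0,
        # i.e. half of the package's bandwidth.
        return 0.5 if self.coproc_holds_ch1 else 1.0

pkg = Gddr6Package()
print(pkg.gpu_bandwidth_fraction())  # 1.0 — not ray tracing, full bandwidth
pkg.coproc_acquire()
print(pkg.gpu_bandwidth_fraction())  # 0.5 — coprocessor owns channel 1
pkg.coproc_release()
print(pkg.gpu_bandwidth_fraction())  # 1.0 — back to full bandwidth
```

The key property of the scheme is visible in the model: the GPU's bandwidth fraction never goes below 0.5, and it's only below 1.0 while the coprocessor actually holds the channel.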
The patent does make a lot more sense in the context of a hypothetical datacentre product for Google Stadia and the like, though.
Take GA100, add an RTX 'coprocessor' to the interposer, sprinkle a little dual-ported HBM2 into the mix, and you have your halo graphics product without needing to make a separate die or compromise GA100's compute performance by removing functional units to make room for the RTX hardware. Given that GA100 is already right near the reticle limit, it's either add the ray tracing capability as a coprocessor on the package, or design a completely separate ~800+ mm² die for a maximum-effort RTX-capable graphics product.
If this is the approach Nvidia's taking, it's pretty easy to see how the rumour mill may be right about the coprocessor/chiplet, just wrong about which product segment.