There is some conceptual overlap between the two (avoiding divergence).
It's interesting how we got low-level tools for one (Turing just has a few key shader instructions that help pull it off), whilst raytracing gets a very big black box that in theory could have been a few shader instructions as well.
It shows how one (rasterization) is much further along and more stable in exposing low-level tools, while the other is just at the beginning.
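For a concrete example of that kind of shader-visible tool, warp-level match/vote intrinsics can be used to regroup divergent work before it hits the SIMD lanes. A minimal CUDA sketch (the kernel name and the leader-based policy are my own illustration, not a claim about what Turing's actual instructions do under the hood):

```cuda
// Minimal sketch: use warp "match" voting to find the lanes whose rays hit
// the same material, then let one lane per group act as a leader. A real
// scheme might have the leader enqueue one batched shading task instead of
// every lane taking its own divergent branch.
// Assumes a 1D block whose size is a multiple of 32, and CC 7.0+ for
// __match_any_sync.
__global__ void shadeHits(const int* hitMaterial, float* out, int numRays)
{
    int ray = blockIdx.x * blockDim.x + threadIdx.x;
    if (ray >= numRays) return;

    int mat = hitMaterial[ray];

    // Bitmask of currently active lanes in this warp that hit the same material.
    unsigned sameMat = __match_any_sync(__activemask(), mat);

    // Lowest lane in each group is the leader.
    int lane = threadIdx.x & 31;
    bool isLeader = (lane == __ffs(sameMat) - 1);

    // Placeholder work: the leader records how many lanes share its material.
    if (isLeader)
        out[ray] = (float)__popc(sameMat);
}
```

Whether the driver or compiler ever does something like this for you is exactly the sort of thing the black box hides.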
The thicker abstraction reminds me of the more arcane elements of texture and resource management, particularly in the older days when formats and architectural choices were as wide-ranging as the larger number of vendors and their more limited attempts at consistency/compatibility. Silicon can sort of be characterized as a ~2-dimensional space, with stepwise execution often being a ~1-dimensional affair. The units, paths, caches, memory subsystem, and DRAM tend to offer at most a 2-dimensional scheme, often with a very strong preference for movement along one axis (SIMD divergence, pixel quads, cache lines, DRAM pages, virtual memory tables, streaming/prefetch loads, etc.).
Keeping a soup of 3-dimensional geometry up front, mapping it early on to a screen space, and using well-researched methods to map those 2D elements onto more linear cache and DRAM structures has some nice effects in setting down direct relationships between items, resources, and execution. The rasterizer-directed, heavily threaded, SIMD hardware maps rather well to the problem of utilizing DRAM arrays and caches as we know them, and the direct relationships between elements in the pipeline in effect serve as compression in terms of data or hardware usage.
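A minimal sketch of the sort of 2D-to-linear mapping I mean, using plain Morton/Z-order interleaving (function names are mine; real hardware swizzles are vendor-specific and usually fancier):

```cuda
// Interleave the low 16 bits of x and y into a Morton (Z-order) index, so
// that 2D-adjacent texels tend to land in the same cache line or DRAM page.
__host__ __device__ inline unsigned spreadBits16(unsigned v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

__host__ __device__ inline unsigned mortonIndex(unsigned x, unsigned y)
{
    // texel (x, y) -> linear offset within a tile
    return spreadBits16(x) | (spreadBits16(y) << 1);
}
```

Neighbors in x and y end up close in the 1D index, which is what lets caches and DRAM pages behave as if the data really had two friendly dimensions.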
Texture-space rendering at least still keeps a 2D space for rendering, albeit no longer the same global 2D screen space as before. The mapping is still somewhat natural, though some of the assumptions that could be made in screen space no longer hold automatically, given the indirection added by the extra pass and the variability of the target space's properties versus the global screen. That exposes an extra bit of the process the silicon has to work a bit harder to map onto its capabilities.
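Roughly what I mean by the indirection, as a two-kernel CUDA sketch (the buffer layout, names, and nearest-neighbour lookup are simplifications of mine, not any particular engine's scheme):

```cuda
// Pass 1: shade in the object's UV parameterization.
__global__ void shadeInTextureSpace(float4* shadeTex, int texW, int texH)
{
    int u = blockIdx.x * blockDim.x + threadIdx.x;
    int v = blockIdx.y * blockDim.y + threadIdx.y;
    if (u >= texW || v >= texH) return;
    // ... evaluate lighting for the surface point that maps to (u, v) ...
    shadeTex[v * texW + u] = make_float4(0.f, 0.f, 0.f, 1.f);
}

// Pass 2: resolve to the screen by looking up the UV the rasterizer wrote for
// each pixel, then fetching the shaded value from texture space. This lookup
// is the extra level of indirection compared with shading in screen space.
__global__ void resolveToScreen(const float4* shadeTex, int texW, int texH,
                                const float2* screenUV, float4* frame,
                                int screenW, int screenH)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= screenW || y >= screenH) return;

    float2 uv = screenUV[y * screenW + x];
    int tu = (int)(uv.x * texW);
    int tv = (int)(uv.y * texH);
    if (tu < 0) tu = 0; if (tu >= texW) tu = texW - 1;   // nearest sample, clamped
    if (tv < 0) tv = 0; if (tv >= texH) tv = texH - 1;
    frame[y * screenW + x] = shadeTex[tv * texW + tu];
}
```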
RT functionality, and the functionality handled by the RT core, is a problem space with more dimensions than can be readily reduced, and as in the old days the players in the field have no consensus on which judgement calls should be baked into their methods or acceleration structures.
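To illustrate where those judgement calls live, here is an invented wide-node layout, not any vendor's actual format:

```cuda
// Branching factor and leaf size are tunable knobs; 2-, 4-, and 8-wide nodes
// all appear in the literature, and bounds may be stored quantized rather
// than as full floats.
constexpr int BRANCHING = 4;

struct WideBvhNode
{
    float childMin[BRANCHING][3];             // per-child AABBs
    float childMax[BRANCHING][3];
    unsigned childIndex[BRANCHING];           // node index, or primitive offset for a leaf
    unsigned char childPrimCount[BRANCHING];  // 0 = interior child, >0 = leaf with that many primitives
};
```

Branching factor, leaf size, bounds precision, and child addressing all vary between published BVH schemes, and a black-box acceleration structure lets each vendor keep picking differently.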
The RT core at least attempts to protect the vast majority of the SM compute hardware from floundering on a workload that behaves poorly with the granularity of the hardware, or the linearity built into DRAM.
The memory behavior seems to be a big reason why the fixed-function element is paired with the memory pipeline, much like how texturing is generally adjacent and still has internal operations specific to its handling of data with properties that can defy linear breakdowns.
In other ways, the BVH and RT hardware have a few parallels with TLB hardware, which is another case of handling spaces with more movement along other dimensions than the linear hardware would like. Granted, the adoption of a tree (albeit much flatter than many page table formats) and the indirection from traversal (not a directed walk down a hierarchy like page tables) can create that high-level impression almost by default. Perhaps some of the lessons learned over the years for managing a tree of virtual memory metadata will inform what happens with RT hardware, however.
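To make the contrast concrete, a sketch of the two traversal styles (both formats invented for illustration, and the ray/box test left as a stub):

```cuda
// A directed page-table walk: depth is fixed by the address format, and
// exactly one child is taken at each level (4 KiB pages assumed here).
__host__ __device__ unsigned long long walkPageTable(
    const unsigned long long* tables, unsigned long long rootTable,
    unsigned long long virtAddr, int levels, int bitsPerLevel)
{
    unsigned long long entry = rootTable;
    for (int level = levels - 1; level >= 0; --level)
    {
        int shift = 12 + level * bitsPerLevel;
        unsigned long long slot = (virtAddr >> shift) & ((1ull << bitsPerLevel) - 1);
        entry = tables[entry * (1ull << bitsPerLevel) + slot];  // next table, or the final translation
    }
    return entry;
}

// A data-dependent BVH traversal: a ray can enter both children of a node, so
// candidates are pushed and popped until the stack runs dry.
struct Node
{
    float bmin[3], bmax[3];
    int left, right;             // child node indices; left < 0 marks a leaf
    int firstPrim, primCount;
};

// Stub ray/box test so the skeleton stands alone; a real one would run the
// slab test against bmin/bmax.
__device__ bool hitsBox(const Node& n, const float* orig, const float* dir)
{
    (void)n; (void)orig; (void)dir;
    return true;
}

__device__ void traverse(const Node* nodes, const float* orig, const float* dir)
{
    int stack[64];
    int top = 0;
    stack[top++] = 0;                        // start at the root
    while (top > 0)
    {
        const Node& n = nodes[stack[--top]];
        if (!hitsBox(n, orig, dir)) continue;
        if (n.left < 0)
        {
            // ... intersect primitives [firstPrim, firstPrim + primCount) ...
            continue;
        }
        stack[top++] = n.left;               // unlike a page walk, both children
        stack[top++] = n.right;              // may need to be visited
    }
}
```

The page walk's depth and the child chosen at each level are fixed by the address format, while the BVH loop's length and memory footprint depend on the ray and on how the builder shaped the tree.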
edit--late correction: I blanked on the characterization of the tree depth facing the RT core. The externally visible flat tree would be separate from the particulars of the BVH traversal, whose specifics regarding depth, subdivision, duplication, and other vendor-specific tweaks lead to a more variable amount of depth and set of operations at each juncture.
Perhaps releasing implementations now can establish a foothold in the market if there is going to be a competition over which methods go into the next generation, and then perhaps which methods will emerge from the black box.
This could have been an interesting data point if we had primitive shaders to run on that workload. The number is in the same region as the Vega white paper's peak NGG discard rate, although the distance in terms of peak figures, PR blurbs, and a different architecture running someone else's unique workload makes it risky to infer much.