Instead of using OoO µop scheduling, Denver relies on a software optimization layer (Dynamic Code Optimization), which monitors performance counters and performs instruction reordering, loop unrolling, register renaming and so on for frequently executed "hot" parts of the code, then saves the optimized µop code into RAM for reuse. More info on DCO can be found here.
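For a rough mental model of that flow, here's a toy sketch of my own (the names, threshold, and stubs are all made up and say nothing about how Nvidia's DCO is actually structured): count executions of a region, translate it once it turns hot, and reuse the cached translation afterwards.

```c
/* Toy sketch of a profile-then-translate flow: hot regions get optimized once
 * and the translation is reused from a cache in RAM. Everything here is
 * hypothetical, not Nvidia's actual DCO. */
#include <stdint.h>
#include <stdio.h>

#define HOT_THRESHOLD 64            /* executions before a region counts as hot */
#define MAX_REGIONS   256

typedef struct {
    uint64_t guest_pc;              /* ARM entry point of the code region     */
    void    *optimized_trace;       /* translated uop trace cached in RAM     */
    uint32_t exec_count;            /* bumped as the region keeps executing   */
} region_t;

static region_t regions[MAX_REGIONS];

/* Stubs standing in for the real machinery. */
static void *translate_and_optimize(uint64_t pc)    /* reorder, unroll, rename... */
{
    printf("optimizing hot region at %#llx\n", (unsigned long long)pc);
    return (void *)1;               /* pretend this points into the hidden carve-out */
}
static void run_trace(void *trace)       { (void)trace; /* dispatch optimized uops  */ }
static void run_arm_decoder(uint64_t pc) { (void)pc;    /* baseline hardware decode */ }

static region_t *lookup_region(uint64_t pc)
{
    region_t *r = &regions[pc % MAX_REGIONS];       /* toy direct-mapped lookup */
    if (r->guest_pc != pc) {
        r->guest_pc = pc;
        r->exec_count = 0;
        r->optimized_trace = NULL;
    }
    return r;
}

void dispatch(uint64_t guest_pc)
{
    region_t *r = lookup_region(guest_pc);

    if (r->optimized_trace) {                       /* already translated: reuse it */
        run_trace(r->optimized_trace);
        return;
    }
    if (++r->exec_count >= HOT_THRESHOLD)           /* region just became hot */
        r->optimized_trace = translate_and_optimize(guest_pc);

    run_arm_decoder(guest_pc);                      /* cold code stays on the plain decoder */
}

int main(void)
{
    for (int i = 0; i < 100; i++)                   /* gets optimized on the 64th pass */
        dispatch(0x80001000ULL);
    return 0;
}
```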
Architecturally, what the processor exposes externally differs from what the hardware and the optimization software do internally.
For example, the µops written out in the optimized code use a VLIW format that is not 1:1 with the internal format. A decoder, simpler than the one used for ARM instructions, is still needed to expand the in-memory format, which multiplexes fields and, for the sake of code density, omits the signals that say which unit will execute a given µop.
The pipeline is also skewed so that a fetched bundle can contain a read-operate-write chain in a single bundle, something that at the ISA level generally has no match in standard cores outside of specific RMW instructions.
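As a small illustration of such a chain (the AArch64 sequence in the comment is what a compiler typically emits; the bundling claim is just my reading of the skewed-pipeline description, not a documented encoding):

```c
/* A dependent read-operate-write chain. On most cores the three operations
 * issue as separate, dependent µops; a sufficiently skewed pipeline could
 * place the whole chain in one fetched bundle. */
void bump(long *counter)
{
    *counter += 1;      /*  ldr x1, [x0]        read
                            add x1, x1, #1      operate
                            str x1, [x0]        write                          */
}
```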
The skewing itself is not entirely without precedent, although I'm not aware of any ARM microarchitecture that skews enough to cover both source memory reads and destination writes. More significant is how the architecture commits results to architectural state and to memory as a variable-length "transaction" at the granularity of a whole optimized subroutine, which neither ARM's ISA nor its µops permit.
That transactional nature is something I'm curious about in light of the recent Meltdown and Spectre disclosures, since Denver's architecture has an aggressive "runahead" mode for speculating data cache loads and an undisclosed method for tracking and unwinding speculative writes in the shadow of a cache miss. Per Nvidia circa 2014, the philosophy was to load whatever it could and then invalidate any speculative operations and queued writes, which specifically relies on cache side effects carrying over from the speculative path.
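To spell out why carried-over cache side effects matter, here's a minimal probe sketch. The x86 intrinsics are used purely because they're the most familiar primitives; on an ARM core the flush would be DC CIVAC and the timer would be the cycle counter, and the threshold is a placeholder that would need tuning.

```c
/* After a squashed speculative path, architectural state is rolled back but
 * cache contents are not, so a timed reload tells you whether the speculative
 * path touched a line. Sketch only; constants are placeholders. */
#include <stdint.h>
#include <x86intrin.h>              /* _mm_clflush, _mm_mfence, __rdtscp */

#define CACHED_THRESHOLD 100        /* placeholder cycles; machine-dependent */

static uint8_t probe_line[64];      /* the line the speculative path may touch */

void probe_prepare(void)
{
    _mm_clflush(probe_line);        /* evict before the speculative region runs */
    _mm_mfence();
}

int probe_was_touched(void)
{
    unsigned int aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*(volatile uint8_t *)probe_line;      /* timed reload */
    uint64_t t1 = __rdtscp(&aux);
    return (t1 - t0) < CACHED_THRESHOLD;        /* fast reload: speculation fetched it */
}
```

probe_prepare() would run before the speculative region and probe_was_touched() after the misspeculation has been unwound; everything architectural is gone by then, but the answer still comes back through the cache.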
Also unclear is how Denver tracked its writes, since its transactional memory method might have meant an in-core write buffer, or potentially updates to the L1 cache that could be undone. The latter case might mean additional side effects.
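To make the first possibility concrete, here's a toy model of a gated store buffer, purely my own construction rather than anything Nvidia has described: speculative stores queue up, drain to memory only at commit, and simply vanish on abort, which is the case that would leave no memory-side footprint beyond the loads.

```c
/* Toy model of the in-core write buffer possibility. My own illustration of
 * the idea, not Denver's actual mechanism. */
#include <stdint.h>

#define TXN_MAX_STORES 64

typedef struct {
    volatile uint8_t *addr;
    uint8_t           value;
} pending_store_t;

static pending_store_t txn_buf[TXN_MAX_STORES];
static int             txn_count;

void txn_store(volatile uint8_t *addr, uint8_t value)
{
    /* A real design would also forward these values to younger loads, and a
     * full buffer would force the transaction to end early; here we just drop. */
    if (txn_count < TXN_MAX_STORES) {
        txn_buf[txn_count].addr  = addr;
        txn_buf[txn_count].value = value;
        txn_count++;
    }
}

void txn_commit(void)               /* end of the optimized region: drain to memory */
{
    for (int i = 0; i < txn_count; i++)
        *txn_buf[i].addr = txn_buf[i].value;
    txn_count = 0;
}

void txn_abort(void)                /* misspeculation: the queued writes just vanish */
{
    txn_count = 0;
}
```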
The prefetch path alone seems like it could be susceptible to a Spectre variant, and even if the optimizer were changed to add more safeguards, there is a lag before it is invoked for cold code.
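As a sketch of the variant-1 shape of that concern (again my own illustrative code with made-up names, not a demonstrated Denver exploit): if runahead speculates past a bounds check while the bound itself is still a pending cache miss, a secret-dependent load can plant a cache footprint that a flush-and-reload probe like the one above would read back out.

```c
/* Illustrative bounds-check-bypass gadget. If the core runs ahead past the
 * check while 'len' is a cache miss, the secret-dependent load pulls one of
 * 256 probe pages into the cache, keyed on the out-of-bounds byte. */
#include <stdint.h>
#include <stddef.h>

size_t  len = 16;                   /* the bound; a miss on this opens the window   */
uint8_t data[16];                   /* in-bounds buffer; interesting bytes lie beyond */
static uint8_t probe[256 * 4096];   /* one page per possible byte value             */

void victim_gadget(size_t i)
{
    if (i < len) {                  /* run ahead of / mispredict this check         */
        uint8_t secret = data[i];   /* speculatively out of bounds when i >= 16     */
        (void)*(volatile uint8_t *)&probe[secret * 4096];   /* cache footprint      */
    }
}
```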
Denver was also quoted as having the full set of typical direct, indirect, and call branch prediction hardware that could be leveraged for Spectre variant 2.
Meltdown might come down to what is effectively a judgement call about when permissions are checked and transactions are aborted, and whether the pipeline's speculative writes affect the L1 in a way that other ARM cores' writes wouldn't.
Unfortunately, I think the one Nexus device that used Denver aged out of security updates just shy of our finding out what mitigations, if any, might be needed.
According to the following, at least some of the above apply to Xavier.
https://www.morningstar.com/news/ma...ystem-is-also-affected-by-security-flaws.html