AMD's paper could use some copy-editing. It mangled the V1-1 mitigation segment, and it looks like it failed to include the fixed code sequence. There are also some grammatical errors and some sloppily handled splits across page breaks.
I'd assume those who most need to know this would get the point, but it's unfortunate.
Mitigation V2-4 isn't particularly informative since it varies so much between implementations, which is exactly where more detail would be needed.
AMD seems to be saying that it intends to harden its return address predictor, since it doesn't expect future processors will need the option to flood it to prevent hostile code from setting a bad entry.
Interestingly, Intel is stating that it intends to have CPUs that do not need software mitigation for Meltdown and Spectre later this year.
https://www.slashgear.com/intel-spectre-and-meltdown-proof-cpus-coming-this-year-25517122/
"Changes to processor architecture are in the pipeline to permanently bypass the Meltdown and Spectre loopholes. However, it’ll take a little time to get them ready, and Intel says that the updated chips won’t be available on the market until later in 2018. It’s unclear what ranges Intel is prioritizing, since the security flaws affect so many models."
Given the lag time for significant design changes, I'm curious what could be done. Google became aware of the exploits last June. Getting from tapeout to launch for a chip arriving this year could take 2-3 quarters. It's uncertain how long Intel had to make changes, or how hacky those changes may be.
Meltdown might be amenable to something of a quicker fix. If the hardware is positioned to know that it's in the scenario where user code is hitting a kernel address, it might suppress the operation like AMD does with its specific check.
If that part isn't changed, perhaps the load pipeline can be made to disable forwarding or zero out the value in the faulting scenario. That wouldn't necessarily require new behaviors in the rest of the chip.
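To make the "zero out the value in the faulting scenario" idea concrete, here is a toy software model in C. It is only a sketch of the behavior being speculated about, not a description of any real load pipeline; every name in it (toy_line, toy_load, zero_on_fault) is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model: none of these names come from a real design. */
typedef struct {
    uint64_t data;        /* value sitting in the cache line */
    bool     kernel_page; /* page is marked supervisor-only */
} toy_line;

typedef struct {
    uint64_t forwarded;   /* value handed to dependent (possibly speculative) ops */
    bool     fault;       /* permission fault to be raised at retirement */
} toy_load_result;

/* Model of one load. With zero_on_fault set, a faulting load still raises
 * its fault later, but forwards 0 instead of the real data, so dependent
 * speculative operations never see the protected value. */
static toy_load_result toy_load(const toy_line *line, bool user_mode,
                                bool zero_on_fault)
{
    toy_load_result r;
    r.fault = user_mode && line->kernel_page;
    r.forwarded = (r.fault && zero_on_fault) ? 0 : line->data;
    return r;
}
```

Calling toy_load on a kernel line from user mode with zero_on_fault off forwards the real data alongside the fault, which is the Meltdown window; with it on, the fault is still reported but the forwarded value is 0. The point is that the change is local to the load path.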
Spectre is the more pervasive one. If Intel means no software workarounds at all (serializing before branch checks, retpolines, and the indirect branch control instructions and barriers), I'm curious what could be changed across multiple hardware units and scenarios in what is, publicly at least, a short time.
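For a sense of what the software side looks like for the bounds-check variant, here is a minimal C sketch. Note that it uses branch-free index masking rather than a serializing instruction, since masking can be written portably; it is similar in spirit to the index-masking helper in the Linux kernel. The function names are hypothetical, and a production version would also need compiler barriers so the optimizer can't turn the mask back into a branch.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns all-ones if index < size and zero otherwise, without a branch.
 * Assumes size <= LONG_MAX and that >> on a negative long is an arithmetic
 * shift (true on mainstream compilers, but implementation-defined in C). */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (sizeof(long) * CHAR_BIT - 1));
}

/* Hypothetical bounds-checked table read. Even if the branch below is
 * mispredicted, the masked index can only be 0 or an in-range value, so
 * the speculative load cannot reach outside the table. */
static uint8_t read_clamped(const uint8_t *table, unsigned long size,
                            unsigned long index)
{
    if (index >= size)
        return 0;
    index &= index_mask(index, size);
    return table[index];
}
```

The cost is a handful of ALU ops on every guarded access, and someone has to find and annotate every such access, which is the kind of burden a hardware fix would presumably remove.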
Not knowing the details of the hardware, perhaps there's enough information nearby for the pipeline to trap out if it detects a subset of instructions that can generate side-effects not rolled back by standard misprediction handling.
Adding a new cache partition seems expensive, and not speculating at all seems impractical.
Maybe expand the role of the line fill buffers to delay booting things out of the L1, although that's a complex set of areas to change.
Perhaps a variation on what is being done to counteract Meltdown in Power?
https://git.kernel.org/pub/scm/linu...t&id=aa8a5e0062ac940f7659394f4817c948dc8c0667
It purges the L1 upon exiting the kernel or hypervisor, since it apparently does catch permissions violations if a load misses the L1. That seems like a specific corner, perhaps a window opened up by the way-predicted data cache and parallel TLB and directory checks.
Perhaps a hack fix by Intel, if it's rushing, could be to make a Meltdown- or Spectre-dependent access prompt an L1 wipe or, less expensively, invalidate one line from other cache sets so that there's no identifiable pattern to the invalidations.
That requires tracking which operations loaded a value speculatively and then, through a dependence chain, launched a load with an address derived from it.
Some of that might be partially present, based on how the register allocations are tracked and freed for standard mispredicts.
The pipeline could then detect that this happened, and wipe all or part of the L1.
TSX's low-latency line invalidation capabilities might make it possible to do this with less overhead.
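The tracking idea above can be sketched as a toy model in C: a taint bit per register, propagated through dependent operations, with an L1 wipe when a squashed speculative load used a tainted address. This is purely illustrative. The structure, the names, and the choice to wipe at squash time are all my assumptions, not anything known about Intel's hardware.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_REGS     8
#define TOY_L1_LINES 16

/* Hypothetical toy core state: none of this reflects a real design. */
typedef struct {
    bool spec_taint[TOY_REGS];    /* register derived from a speculative load */
    bool l1_valid[TOY_L1_LINES];  /* valid bit per L1 line */
    bool wipe_pending;            /* a tainted address was used by a load */
} toy_core;

static void toy_init(toy_core *c)
{
    memset(c, 0, sizeof *c);
    for (int i = 0; i < TOY_L1_LINES; i++)
        c->l1_valid[i] = true;
}

/* Speculative load: dst = [addr_reg], filling the given L1 line. */
static void toy_spec_load(toy_core *c, int dst, int addr_reg, int line)
{
    if (c->spec_taint[addr_reg])
        c->wipe_pending = true;   /* address derived from speculative data */
    c->l1_valid[line % TOY_L1_LINES] = true;
    c->spec_taint[dst] = true;
}

/* ALU op: the result is tainted if either source is. */
static void toy_alu(toy_core *c, int dst, int a, int b)
{
    c->spec_taint[dst] = c->spec_taint[a] || c->spec_taint[b];
}

/* Branch resolution: on a mispredict with a pending wipe, invalidate the
 * whole L1. Either way the speculative taint state is cleared. */
static void toy_resolve(toy_core *c, bool mispredicted)
{
    if (mispredicted && c->wipe_pending)
        memset(c->l1_valid, 0, sizeof c->l1_valid);
    memset(c->spec_taint, 0, sizeof c->spec_taint);
    c->wipe_pending = false;
}

/* Helper for inspecting the model: number of valid L1 lines. */
static int toy_valid_lines(const toy_core *c)
{
    int n = 0;
    for (int i = 0; i < TOY_L1_LINES; i++)
        n += c->l1_valid[i];
    return n;
}
```

In this model a load, an ALU op on its result, and a second load using that result as an address will leave the L1 empty after a mispredict, while two independent speculative loads leave it alone. Real hardware would need the taint to ride along with the existing rename and rollback tracking, which is where the cost and complexity would actually be.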