- "Exclude AMD from the PTI enforcement. Not necessarily a fix, but if AMD is so confident that they are not affected, then we should not burden users with the overhead"
So now it's a matter of time to see if anyone can indeed trigger the attack on AMD; if so, the config will need to change to include AMD in PTI.
It's quite possible this specific vector won't hit AMD, for now.
Perversely, this exemption may leave AMD systems less secure over time: if additional attacks on KASLR appear, or some other exploit comes along that could be preempted by keeping the kernel unmapped while in user space, AMD systems will lack the protection that was the original intent of the isolation changes. Additionally, the isolation changes may be the basis for further mitigations against the side-channel attacks mentioned in some of the KAISER proposal's source documents, attacks to which AMD was vulnerable but which were ignored or listed as "future work".
I think it would behoove AMD to operate under the assumption that it won't forever enjoy the status of beneficiary of a less expensive but provably less safe memory-mapping decision.
Frankly, it may behoove AMD and Intel, and perhaps many stakeholders in hardware and software, to make some decisions about how to define and manage not just instruction or execution semantics, but also how knowable the behavior or history of a given section of the system, or stretch of code, can be.
Since these attacks exploit a 'design quirk' in the implementation of speculative execution inside the CPU pipeline, it would be interesting to know on how many different processors and architectures the attack is reproducible. While ARM processors take the lion's share today, there are still countless devices using some kind of PowerPC variant (and a simplified Linux as OS). Would those devices be equally at risk?
On a side note, in the x86 space, speculative execution has been employed since the mid-'90s (P6 and 6x86), and the concept is even older and certainly not limited to x86 architectures.
Why did nobody come up with this kind of attack before? Is this vulnerability a consequence of a CPU design fault, or has something changed at the OS level in the last few years that made this attack possible?
Multiple trends, such as:
- high-precision timers
- desire for more performance in more complex system corners (need to optimize where things haven't been highly optimized, like system calls, synchronization, load/store hazards, correlating behavior across domains)
- desire for better profiling, tuning, self-management (perf counters, trace data, event records)
- desire for easier development and support
- power efficiency/performance (e.g. aliasing to simplify hardware and save power, power/clock management)
- the increasing difficulty in exploiting browsers and operating systems
- virtualization (reduce overhead, latency in a manner that interacts with virtual memory and OS)
- hardened software leaning more heavily on isolation measures leveraging the kernel
- different cycles in hardware design versus software
- different cadences or siloed communication for the individual speculation measures
- increasingly sophisticated nation-state and criminal organizations engaged in IT crime
- maturation of data technologies and large revenues in businesses using data or making data a business
- time it took to take research from academic curiosity and realize it had broader potential
- the rise of server/cloud infrastructure that ripped out the original assumptions about local control of the hardware
And so on.
I have some thoughts on what some mitigations could be in the future, but also on what that might mean for designing and understanding these systems, or developing for them. Being able to reason about system behavior and performance, or to understand why things may fall apart, cuts both ways: it's a question of what someone is able to know, and the power that knowledge imparts is directed by the motivation of the person who has it.
Also, I'm curious about in-order architectures with run-ahead, or in-order architectures with explicit speculation measures that might have side effects, and about what other ARM vendors (Nvidia's Denver, Samsung's M cores, Qualcomm's server cores, etc.) are doing.
Similarly, if I have time I need to get some references on discussions about hardware speculation, speculation in general, and hardware/software management of system functions.
Linus Torvalds' strong words about speculation remind me of a raft of discussions about dynamic vs static performance measures, prediction vs predication or conditional moves, software vs hardware TLB management, weak memory ordering versus strong ordering with speculation, the decision by Linux and others to have kernel and user memory mappings in the same space, etc.
In isolation, I've found a number of such arguments interesting and compelling, but there's an interesting emergent property to the way all those items came together; at some point, everything was different.