CPU Security Flaws MELTDOWN and SPECTRE

The first changes for implementing Meltdown and Spectre mitigation for Linux ARM64 were reported on:
https://www.phoronix.com/scan.php?page=news_item&px=ARM64-Linux-4.16

Contrary to a news story indicating that Cavium had stated Meltdown affected ThunderX2 (Vulcan), it seems that this custom ARM core is being whitelisted as not needing KPTI.
Qualcomm's Falkor CPU may be in the reverse situation, where a workaround for TLB updates is partially removed because the old assumption about shared kernel and user mappings has changed with KPTI being on.
There seems to be a desire to expand the whitelist for cores known not to need Meltdown mitigation, including a possible wrinkle with checking for more than one core type and turning on KPTI if any core needs it (e.g. big.LITTLE).
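
A rough sketch of that "turn KPTI on if any core needs it" rule, just to make the logic concrete. The core names and the contents of the whitelist here are my own placeholders (only ThunderX2 being whitelisted comes from the pull discussed above), and none of this is taken from the actual kernel code:

```python
# Hypothetical whitelist of core types assumed not to need KPTI.
# Illustrative only; the real kernel list and its matching logic differ.
KPTI_NOT_NEEDED = {"thunderx2", "cortex-a53", "cortex-a55"}


def system_needs_kpti(cores):
    """Return True if any core in the system is not on the whitelist."""
    return any(core.lower() not in KPTI_NOT_NEEDED for core in cores)


# A mixed system: the little cores are whitelisted, the big ones are not,
# so the whole system ends up with KPTI enabled.
mixed = ["cortex-a55"] * 4 + ["cortex-a75"] * 4
print(system_needs_kpti(mixed))
print(system_needs_kpti(["thunderx2"] * 32))
```

The point is only that the decision is system-wide: one non-whitelisted core is enough to force the mitigation on for everything.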

Branch prediction hardening for ARM's standard cores and the custom cores is in this pull as well.
 
Just to make sure, what ARM calls "Variant 3" is in fact Meltdown, and "Variant 3a" some sort of Meltdown-related thing, right?
Spectre only has 2 variants
Very confident it is right.
The link to ARM in my post you quote shows it as Meltdown, as does the whitepaper that can be accessed same page.
Says:
What are the attack mechanisms?
There are three main variants of the exploits, as detailed by Google in their blogpost, that explain in detail the mechanisms:

  • Variant 1: bounds check bypass (CVE-2017-5753)
  • Variant 2: branch target injection (CVE-2017-5715)
  • Variant 3: rogue data cache load (CVE-2017-5754)
In addition, Arm has included information on a related variant to 3, noted as 3a, in the table below.
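
For anyone who wants the numbering in one place, here is a small lookup built from the list quoted above. The Spectre/Meltdown family labels are the commonly used names for those CVEs and are my addition, and the helper function is purely illustrative:

```python
# Variant numbering and CVEs as quoted above; family labels added for illustration.
VARIANTS = {
    "1": {"mechanism": "bounds check bypass", "cve": "CVE-2017-5753", "family": "Spectre"},
    "2": {"mechanism": "branch target injection", "cve": "CVE-2017-5715", "family": "Spectre"},
    "3": {"mechanism": "rogue data cache load", "cve": "CVE-2017-5754", "family": "Meltdown"},
}


def family(variant):
    """Map a variant label such as '2' or '3a' to a family name."""
    base = variant.rstrip("a")  # Arm lists 3a as a variant related to 3
    name = VARIANTS[base]["family"]
    return name + "-related" if variant != base else name


for label in ("1", "2", "3", "3a"):
    print(label, family(label))
```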

Just to clarify, my context was Meltdown: I was adding to the 3dilettante post I quoted, which was on the same subject and had important, relevant information that subtly changes some of what was said.
As part of the 3a caveat and mechanism on the Cortex they state in the paper:
Practicality of this side-channel
This side-channel can be used to determine the values held in system registers that should not be accessible. While it is undesirable for lower exception levels to be able to access these data values, for the majority of system registers, the leakage of this information is not material.
Note: It is believed that there are no implementations of Arm processors which are susceptible to this mechanism that also implement the Pointer Authentication Mechanism introduced as part of Armv8.3-A, where there are keys held in system registers.
.....
Software Mitigations
In general, it is not believed that software mitigations for this issue are necessary. For system registers that are not in use when working at a particular exception level and which are felt to be sensitive, it would in principle be possible for the software of a higher exception level to substitute in dummy values into the system registers while running at that exception level. In particular, this mechanism could be used in conjunction with the mitigation for variant 3 to ensure that the location of the VBAR_EL1 while running at EL0 is not indicative of the virtual address layout of the EL1 memory, so preventing the leakage of information useful for compromising KASLR.
So I would still be careful about the 3 models in the table (link in my OP) shown with the 3a vulnerability even with the caveat.
If anyone is interested, here is the paper on the Pointer Authentication Mechanism that may be implemented/used if supported.
https://www.qualcomm.com/media/documents/files/whitepaper-pointer-authentication-on-armv8-3.pdf
 
At this point, the executives for AMD and Intel have committed to some level of hardware-level fix for their upcoming designs. Intel is saying a design with hardware-level changes specific to Meltdown and Spectre will appear at some point later in 2018, and AMD gave similar language for Zen 2, which was recently announced as being design-complete.

What those changes could be, their level of effectiveness, or if later designs will have further improvement is unknown at this point.
AMD's whitepaper does have some hints for what some of the changes could be. It indicates that some future design with more than 32 entries in the Return Stack Buffer will no longer require that privileged code spam the return predictor per V2-3.
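
To picture what "spamming the return predictor" amounts to, here is a toy model of a return stack buffer as a fixed-size ring. The 32-entry size is an assumption taken from the number mentioned above, the target names are made up, and real hardware is obviously not a Python deque:

```python
from collections import deque

RSB_ENTRIES = 32          # assumed size, based on the figure mentioned above
BENIGN = "benign_target"  # hypothetical placeholder for a harmless return target


def stuff_rsb(rsb):
    """Push one benign entry per slot so no earlier prediction survives."""
    for _ in range(rsb.maxlen):
        rsb.append(BENIGN)


# User code leaves its own return predictions behind...
rsb = deque(maxlen=RSB_ENTRIES)
for i in range(40):
    rsb.append(f"user_ret_{i}")

# ...and privileged code overwrites all of them on entry.
stuff_rsb(rsb)
print(all(entry == BENIGN for entry in rsb))
```

The cost of the software approach is that the fill has to cover every entry each time, which is presumably why a hardware change that removes the need would be welcome.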
V2-2 hints at possible architectural changes in the future for controlling indirect branch speculation. The IBRS-related changes in V2-4 would be a form of architectural change, although partly software-driven. AMD didn't commit to how V2-4's features would carry forward, and it's a bit more complicated knowing which sub-features apply to which cores.
AMD's a bit quiet on V1 architectural changes, other than promising that the LFENCE serialization setting used for software fixes will be supported going forward.

Intel does promise a Meltdown fix in hardware.
Spectre seems to be more complicated to interpret, given the statement of hardware fixes versus a lack of clarity on the IBRS settings and other workarounds not being given some sort of expiration date.
One software/hardware direction from Intel's speculation paper (4.4) that could apply to V1 and V2 is memory protection keys on future hardware with hardware Meltdown mitigation. While not wholly in hardware, this is something of an indication that both Spectre variants could get additional protection.

For what it's worth, it appears that VIA's latest x86 core is not affected by Meltdown, but is Spectre-vulnerable.
Some of the language concerning the difficulty in exploiting Spectre may be reminiscent of AMD's language concerning the "near-zero" chance of exploiting V2.
https://fuse.wikichip.org/news/733/zhaoxin-launches-their-highest-performance-chinese-x86-chips/
"We’ve asked Zhaoxin if they are affected by the recent security vulnerabilities and they confirmed that the KX-5000 series is unaffected by Meltdown. They also noted that their chips are indeed affected by Spectre, adding that it requires a much more complex sequence of operations, making an attack incredibly difficult and impractical."


Unclear at this point is how deep the various changes can be, which may be a function of resources, how long the vendors knew of the vulnerabilities, how far they are willing to go to mitigate, and what existing features could be leveraged to get better time to market for mitigation.
Intel's memory key + Meltdown fix could be consistent with a change that more promptly checks for access faults, though there could be various ways this could be implemented with varying levels of continued speculation.
AMD's hinted direction for V2-3 may mean that it will be keying some of its RSB hardware by privilege level. Other elements might be available for an implied protection of the kernel, given the close presence of the TLBs.
Less clear is what might be available for cases where privilege boundaries are not being crossed in AMD's case.

Upon reflecting on AMD's current immunity from Meltdown, I wondered how painful KPTI could have been if it were vulnerable. AMD's ASID implementation seems more heavily linked to virtualization, with a single hypervisor at ID 0 and the rest being guest VMs. This seems closer to Intel's VPID than to the orthogonal PCID functionality that helps salvage some of the performance lost to KPTI.

Perhaps this less-flexible permissions model correlates with simpler hardware and to AMD's speculation stopping more quickly at privilege boundaries. It does seem to carry forward into its memory encryption measures, which apply at a VM-level.
Intel's situation has left it vulnerable due to its more permissive speculation window, although perversely that flexibility might have left room for elements that salvaged some of the performance lost like PCID or in the future might help more with Spectre (MPK).


I'm still not sure if it's better if Intel were to reach a point where it can dispense with KPTI, rather than AMD and Intel reaching a better support level for separate OS and user mappings.
How kludgey the hardware workarounds would be for designs nearly finished is unclear. If it's blocking speculation, it might hurt performance outside of the corner case. Measures that kick in specifically when such mis-speculation occurs or do more to encapsulate side effects might be out of the scope of a late-stage silicon tweak.
 
To cover some of the last questions over affected architectures:
MIPS has two speculative processors that are vulnerable to Spectre V1 and V2, but not Meltdown. Other licensed cores or derivatives are unknown. The Loongson line of MIPS cores has some decently speculative pipelines as an example, though how MIPS handles privileged mode may mean the same set of conditions is less likely than in the more readily co-resident user/kernel mapping environments of x86, POWER, and ARM.
https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/

Fujitsu's advisory has updated to indicate its SPARC processors are unaffected by Meltdown, with Spectre remaining under investigation.
http://support.ts.fujitsu.com/content/SideChannelAnalysisMethod.asp
From its blog, it goes further to state its mainframe architectures are secure by default due to the claimed impossibility of running hostile code. It claims its /390 based mainframes are immune to these exploits. I'm not sure where those fit architecturally relative to the S/390 Linux patch changes that include at least some branch prediction management changes.
Fujitsu claims its x86-based mainframes are secured by an intermediary translation layer that prevents arbitrary code from making it into the system, though some secondary products or third-party components wouldn't have this and could be affected.
http://blog.global.fujitsu.com/mainframes-mainly-unmoved-threat-spectre-meltdown/
 
This was guaranteed to happen, and it will not be the only evolution of Spectre/Meltdown; the latest paper identifies and demonstrates a further weakness using write requests with cache coherence protocols.
Interestingly, this paper involved one of Nvidia's senior R&D scientists/architecture designers.
MeltdownPrime and SpectrePrime: Automatically-Synthesized Attacks Exploiting Invalidation-Based Coherence Protocols https://arxiv.org/pdf/1802.03802.pdf
They do confirm they created and successfully tested this approach against an Intel CPU, but the concept applies more broadly as well.
In theory the patches (I am a bit wary to say all) should also mitigate this new attack vector, but it will be another headache for a HW solution.
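
A very stripped-down illustration of the idea of a write request leaking through an invalidation-based protocol. This is a toy with two private caches and made-up latencies, not the paper's actual attack code, and it does not model speculation or squashing at all; it only shows that an invalidation caused by one core is visible as a slow access on another:

```python
HIT, MISS = 1, 100  # made-up latencies in arbitrary units


class Core:
    def __init__(self):
        self.lines = set()  # addresses currently valid in this core's private cache

    def load(self, addr):
        """Return a made-up latency and fill the line on a miss."""
        if addr in self.lines:
            return HIT
        self.lines.add(addr)
        return MISS


def write_request(writer, others, addr):
    """Gain ownership for a write: every other cached copy is invalidated."""
    for core in others:
        core.lines.discard(addr)
    writer.lines.add(addr)


def leak_secret(secret, candidates=range(8)):
    attacker, victim = Core(), Core()
    for addr in candidates:  # prime: attacker caches every candidate line
        attacker.load(addr)
    # Stand-in for a write request issued with a secret-dependent address.
    write_request(victim, [attacker], secret)
    timings = {addr: attacker.load(addr) for addr in candidates}  # probe
    return max(timings, key=timings.get)  # the slow line reveals the address


print(leak_secret(5))
```

In the toy the attacker never reads the victim's data; it only notices which of its own lines went missing.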
 
Since cache is a globally distributed repository of state, it would seem to follow. I think some of the KAISER proof-of-concept work or the papers it cites note side-channel attacks that cross cores or exploit an LLC, which serves as some precedent.
An interesting element here in Nvidia's process is leveraging memory model validation tools to automatically scan through the design's state space and detect vulnerabilities.

Given where this is, I'm not sure how easily a hardware mitigation can exist without involving a broader architectural or platform change.

Long ramble:

Other considerations besides write invalidation may depend on elements like the cache protocol's policy for migrating ownership of cache lines, which may change the timing of reads to addresses based on whether an address had been read into a separate chip or coherent client. For example, some of AMD's later server processors will assign the cache line most recently loaded from another cache a status that gives it an owned or exclusive state for responding to snoops. If the prior cache was in a different socket, the same chip with separate L3s, or some interaction with exclusive victim cache invalidation, another thread may be able to detect that the cache line that is permitted to respond to a snoop is more distant.
For protocols that implement a forwarding state like Intel (Zen has an F state, unclear if the meaning is the same as Intel's), there may be timing games there as well. Protocols that do more to track locality and sharing would potentially give more knobs to turn to manipulate timings.
It might take a more complex threading scenario or additional cache manipulation to help reset the attacker thread's cache and cause it to re-snoop its prepared values.

Nvidia's focus was on write-invalidate and also write-allocate scenarios. The alternative to that would be a protocol that updated all copies of a cache line, whose downsides have left no common architecture that implements it. Depending on how such a cache were implemented, the possible synchronization overhead of updating lines might create a signal. While that might not be as long-lived, it may provide a more clear signal if exploited since it would involve actively contended lines that would be kept most recently used in the cache and less subject to spurious eviction. The amount of cache preparation needed might be reduced by the inversion of behavior as well, possibly requiring only a few actively selected lines to serve as the side channel.

Brainstorming even hacky fixes to more global versions of Spectre may point to why Intel has been reluctant to give a clear indication that its IBRS and other changes will ever go away.

Local to a core, more assertive changes may be to collect operations in buffers outside of the formally coherent cache, until speculation is resolved. This is done for writes, and there may be possible ways to hold a number of reads as part of a deferred miss handling scheme, perhaps short of allocating storage specifically for speculation.
I had thought of a possible hack for reducing the accuracy of a side channel by making the CPU's speculation rollback logic invalidate enough lines from all cache ways so that it's not easily known which ones were the subset that had been influenced by a secret value. It would seemingly leave a way of timing things based on whether the cache was perturbed at all as a side-channel, but that seems like it would be significantly more difficult to tease out given general operation.
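
To make the rollback hack a bit more concrete, here is a toy Prime+Probe model where the rollback logic knocks out extra lines in other sets. The set count, the number of decoy invalidations, and the function name are all made up for illustration; this is a sketch of the idea, not a claim about how any real core would do it:

```python
import random


def probe_after_rollback(secret_set, num_sets=64, decoys=0, seed=0):
    """Return the sets where the attacker sees a miss after priming every set."""
    rng = random.Random(seed)
    primed = set(range(num_sets))  # attacker holds a line in every set
    primed.discard(secret_set)     # mis-speculated victim load evicts one of them
    others = [s for s in range(num_sets) if s != secret_set]
    for s in rng.sample(others, decoys):  # hypothetical rollback invalidations
        primed.discard(s)
    return sorted(set(range(num_sets)) - primed)


print(probe_after_rollback(17))                  # no noise: only the secret set misses
print(len(probe_after_rollback(17, decoys=15)))  # noise: the secret hides among 16 misses
```

Even in this toy, the attacker still learns that something was perturbed, and repeated runs with different random decoys would narrow the candidates again, so it only raises the cost rather than closing the channel.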
It would seem like a bit more information could be encoded in cache state for a direct-mapped or fully-associative cache in that scenario.
The number of lines per way is a cap for what can be expressed with an N-way associative cache, whereas direct-mapped makes each way as informative as a line and fully-associative has the whole cache as a way. Random invalidations may be able to help noise up the picture, short of a full purge or some kind of cache rollback from a miss buffer or victim cache. It would take more work since there's more leeway for code to play around with cache sectors without associativity constraining things.

Unfortunately, such local solutions don't work for more global caching because cache hierarchies generally don't "push" very well. Outside of instructions that invalidate or may implicitly invalidate lines, which could have architectural exposure, it may not be readily evident from a functional or security standpoint which elements unknowingly assume the straightforward and context-agnostic cache coherence model.
Hiding speculation in this case could threaten to break coherence or consistency in a very complex and unforgiving portion of an implementation.
Perhaps fixes could be locally executed in a core, but global mitigation would take on a more transactional character.
Creating a sort of context ID for speculative epochs might help control what changes are visible or add indirection/hashes to non-final state, although not without cost. Potentially, tracking contexts that are stomping on the speculative sets of another context, or even their own, may need to prompt architectural or platform attention. A thread could find itself automatically constrained from jumping too far if its speculative and non-speculative accesses thrash, and a separate thread or process could be throttled or migrated to another core/cache if it becomes problematic.

Architecturally visible changes like protection keys, or specialization/hints for branch and memory instructions, could give context or make speculative state more clearly a property of the architecture. Some things, like Nvidia's tools-based approach, show that some elements of this already have frameworks and quantifiable properties. That may need to be extended to create measurements of how much a unit or storage element's state or history can express, how clearly, and for how long.
 
AMD has a Spectre/Meltdown-like security flaw of its own
Researchers find 13 vulnerabilities in AMD’s Ryzen and EPYC chips, which could let attackers install malware on highly guarded portions of the processor.
...
It's unclear how long it will take to fix these issues with AMD's processors. CTS-Labs said it hasn't heard back from AMD. The researchers said it could take "several months to fix." The vulnerabilities in the hardware can't be fixed.
https://www.cnet.com/news/amd-has-a-spectre-meltdown-like-security-flaw-of-its-own/
 
This seems more like the other class of exploits centered around the Intel management engine.
Some of these seem to require compromising the BIOS, or bad southbridge firmware. The CNET article is a bit light on detail.
I'm not sure why the vulnerability was disclosed so quickly.
 

Reportedly this is just Viceroy Research / CTS Labs trying to manipulate markets, something they're apparently headed to court for in Germany already.
The "flaws" apparently require you to actually manually flash malicious BIOSes on your machine
A quote about the supposed "masterkey vulnerability", for example, straight from their "whitepaper":
Exploiting MASTERKEY requires an attacker to be able to re-flash the BIOS with a specially crafted BIOS update.

edit:
Also,
"The researchers gave AMD less than 24 hours to look at the vulnerabilities and respond before publishing the report."
 
The exploits seem to require having compromised the system already, given the ability to reflash the BIOS or load malicious (yet vendor-signed) software.
Should it be confirmed that the PSP's ability to validate its payload can be compromised, it could allow for some kind of persistent exploit, if a system were intercepted prior to delivery or briefly handled by a compromised administrator.

I'm not sure how effective these attacks can be against security-conscious data centers, which would need to have their servers already compromised by an administrator or a local privilege escalation.

More thorough vetting of the claims seems necessary.

edit: There may be one specific point to this if the claims are verified with a proof of concept: whether the security model can be breached with regard to the multi-key SEV feature, which in part is meant to help mitigate compromised hosts.
 
I'm always a bit suspicious about these security firms coming from nowhere... (I'm not saying whether the breaches are real or not).
 
https://arstechnica.com/information...-in-amd-chips-make-bad-hacks-much-much-worse/

The four classes of vulnerabilities—dubbed Masterkey, Ryzenfall, Fallout, and Chimera—were described in a 20-page report headlined "Severe Security Advisory on AMD Processors." The advisory came with its own disclaimer that CTS—the Israeli research organization that published the report—"may have, either directly or indirectly, an economic interest in the performance" of stock of AMD or other companies. It also discloses that its contents were all statements of opinion and "not statements of fact." Critics have said the disclaimers, which are highly unusual in security reports, are signs that the report is exaggerating the severity of the vulnerabilities in a blatant attempt to effect the stock price of AMD and possibly other companies.

Still, Dan Guido, a chip security expert and the CEO of security firm Trail of Bits, told Ars that the paper accurately describes a real threat. After spending much of last week testing the proof-of-concept exploits discussed in the paper, he said, he has determined that the vulnerabilities they exploit are real.


"All the exploits work as described," he said. "The package that was shared with me had well-documented, well-described write-ups for each individual bug. They're not fake. All these things are real. I'm trying to be a measured voice. I'm not hyping them. I'm not dismissing them.
"
 
Yay, yet another sham business model created from the Meltdown stuff. Creating companies whose sole purpose is to come up with ridiculous means of "hacking" hardware of enormous companies for the purpose of stock manipulation?
 

It seems that a company went searching for vulnerabilities (good) and found some legitimate issues (good) but then chose to sensationalize them for maximum financial effect (bad). In some ways it's no different from short sellers who scrub through company books looking for anything that can be harmful if published, in the name of profit and "holding companies accountable". In the case of security vulnerabilities, though, the economic harm to innocent third parties from the race to disclose/profit may be quite substantial.
 
<Insert conspiracy theory connecting "CTSLabs" to Intel in order to sink AMD's ship here>




Also, Viceroy Research, which is under investigation in Germany (https://www.cnbc.com/2018/03/12/reu...ys-viceroys-prosieben-report-broke-rules.html), has released an "AMD obituary" in which they say AMD's stock is worth $0.00 and the only option for the company is to file for Chapter 11 bankruptcy (https://viceroyresearch.org/2018/03/13/amd-the-obituary/)
 
In other news: startup security firm TheOnionSecurityResearch announces it has proven the upcoming AMD CPUs Zen+ & Zen2 are vulnerable to new attacks we have titled PUTINDIDIT, DEFINITELYNOTCIA & WOULDWELIETOYOU, these new attacks allow a malicious user to force Zen+ & Zen2 CPUs to constantly display a slideshow of images of Putin doing various manly things, mostly topless.
The shocking thing about these attacks is their simplicity: All that is required is to trivially compromise senior AMD CPU design staff, insert malicious code into the AMD CPU design libraries or automated chip layout software respectively.
TheOnionSecurityResearch provided AMD 1hr notice before going to print but when contacted the AMD representative had no comment. The AMD spokesperson who answered the phone was obviously disturbed by our revelation and repeatedly attempted to deflect our inquiries by insisting that we choose a number between 1 and 7.
Given AMD's apparent refusal to address the issue, TheOnionSecurityResearch recommends that the US Govt immediately take the minimal sensible step of initiating Global Thermonuclear War against Russia.
 