CPU Security Flaws MELTDOWN and SPECTRE

Read the whole article and the supporting benchmarks, which show slight degradation for both Intel and AMD. The lag is felt on both Intel and AMD CPUs.

Their conclusion for both Intel and AMD processors was:
What lag? As far as I can see, they don't ever say they have lag. Disk performance with the new security patches is a well-benchmarked thing. AMD doesn't see the same performance loss as Intel does, and Intel only seems to lose performance as you increase queue depth, which on a desktop PC is going to be an extremely uncommon situation.
 
Why is it that switching between kernel and user mode has such a heavy performance impact anyway?

Because of the Meltdown mitigations. Kernel space now has to be mapped on every syscall and unmapped again on return from that call. Previously, kernel space was always mapped along with user space.

However Meltdown is only a problem on shared kernel machines. These new exploits can also access data in different VMs (and the Hypervisor). That's bad.

Cheers
 
Disk performance with the new security patches is a well-benchmarked thing. AMD doesn't see the same performance loss as Intel does, and Intel only seems to lose performance as you increase queue depth, which on a desktop PC is going to be an extremely uncommon situation.

The Meltdown mitigation patches add a cost to every syscall. The more syscalls, the bigger the cost. With normal file operations the number of syscalls is so low that it doesn't make much of a difference. Only when you do mass file (or network) operations, like a Unix find or du, do you see a significant cost. I benchmarked a du -s * on my projects folder before and after the Meltdown patches were applied in Ubuntu and saw a 25% increase in execution time. CrowdStrike's cloud-based firewall solution saw a 45% increase in processor load when the Meltdown patches were applied.
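
For anyone who wants to see the per-syscall tax on their own box, here is a rough Python sketch (not the du benchmark above, just an illustration). It times a call that stays in user space against os.stat(), which enters the kernel once per call. The helper names per_call_ns, noop and do_syscall are made up for this example, and the Python loop overhead is included in both numbers, so only the difference is meaningful. Run it before and after toggling the mitigation to compare.

```python
import os
import time


def per_call_ns(fn, n=200_000):
    """Average wall-clock cost of one call to fn, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(n):
        fn()
    return (time.perf_counter_ns() - start) / n


def noop():
    """User-space baseline: never enters the kernel."""
    return None


def do_syscall():
    """os.stat() enters the kernel once per call."""
    os.stat(".")


if __name__ == "__main__":
    base = per_call_ns(noop)
    sysc = per_call_ns(do_syscall)
    print(f"user-space call:       {base:8.1f} ns")
    print(f"stat() syscall:        {sysc:8.1f} ns")
    print(f"approx. kernel cost:   {sysc - base:8.1f} ns per call")
```

Multiply the per-call difference by the number of syscalls a workload makes and you get a feel for why find or du suffer while ordinary desktop use does not.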

The recent HPET debacle at Anandtech is also caused by Meltdown mitigations. HPET allows for fairly high resolution timing measurements using either counters or interrupts. The counters are memory mapped but privileged so you can only read them through a syscall. The interrupts also pay the syscall tax. The higher the resolution the higher the cost.

Cheers
 
The difference is that the Meltdown patches aren't applied to AMD CPUs.
 
Maybe unprivilege the timers then? Like, how could you rely on a timer result that's been heavily taxed anyway...?
 
HPET isn't supposed to be some nearly cycle accurate counter/timer. Better than PIT, yes.
There's a reason newer Windows versions don't use it by default (on new enough CPUs at least). The timestamp counter (TSC) is what you typically use nowadays, and that is readable from userspace. But note that old CPUs (before Core 2 or so) didn't have TSCs which were really usable for "serious" timing - nowadays they run at a constant frequency (not whatever happens to be the current boost frequency...), aren't stopped in higher C-states, and are synchronized across all cores.
That said, some of the games Anandtech was using showed a really massive difference with HPET. They must have queried the time A LOT - probably more than they should. I suppose no one really cared, since developers were all getting TSC-based timing, so it was cheap enough to not really be noticeable...
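
To put a number on "queried time A LOT", here is a small Python sketch under stated assumptions. clock_read_cost_ns measures what one clock read costs on the current machine (Python call overhead included, so it overstates the raw cost), and budget_share turns a per-read cost into the share of a frame budget it eats. Both function names and the example figures are hypothetical, not measurements from the Anandtech games.

```python
import time


def clock_read_cost_ns(clock=time.perf_counter_ns, n=500_000):
    """Average cost of one clock read, in nanoseconds (includes call overhead)."""
    start = time.perf_counter_ns()
    for _ in range(n):
        clock()
    return (time.perf_counter_ns() - start) / n


def budget_share(reads_per_frame, cost_ns, frame_ms=16.7):
    """Fraction of one frame's time budget spent purely on reading the clock."""
    return reads_per_frame * cost_ns / (frame_ms * 1e6)


if __name__ == "__main__":
    cost = clock_read_cost_ns()
    print(f"one clock read: {cost:.1f} ns")
    # Hypothetical workload: 10,000 time queries per 60 Hz frame.
    print(f"share of frame: {budget_share(10_000, cost):.2%}")
```

With a cheap user-space counter the share stays tiny even at thousands of reads per frame. Plug in a cost in the microsecond range, as a slow timer path could plausibly produce, and the same read count starts eating a visible slice of the frame.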
 
HPET isn't supposed to be some nearly cycle accurate counter/timer.
I didn't intend to imply that.

It's just that the performance impact we see with HPET on Intel CPUs is caused by the Meltdown mitigations. What should just be a syscall to escalate privilege levels and a read of a memory location is now a memory serializing event.

The same would be true whenever you poll hardware registers through syscalls.

Better than PIT, yes.
There's a reason newer Windows versions don't use it by default (on new enough CPUs at least). The timestamp counter (TSC) is what you typically use nowadays, and that is readable from userspace. But note that old CPUs (before Core 2 or so) didn't have TSCs which were really usable for "serious" timing - nowadays they run at a constant frequency (not whatever happens to be the current boost frequency...), aren't stopped in higher C-states, and are synchronized across all cores.

HPET facilitates two things: counters and timers. The new TSC counters (without the bugged C-state behaviour) are much better than HPET, with much higher resolution and much lower overhead. HPET event timers, though, are both lower overhead (less latency) and higher resolution than the alternative, ACPI. Some applications/drivers, mostly audio related, won't run without HPET.

I'm guessing forcing HPET on in Windows 10, as Anandtech did, forces both timers and counters to use HPET. Ideally you'd want TSC counters (if present) and HPET timers; I'm guessing this is the default Windows behaviour if HPET is enabled in the chipset. So yeah, the Anandtech measurements do have an issue that needs to be addressed.
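
The above is about Windows, which I can't show from code here. On the Linux side, though, which counter the kernel is using can be read straight from sysfs. A minimal Python sketch, assuming the standard clocksource0 sysfs directory exists on the machine (read_clocksource is just a name picked for this example):

```python
from pathlib import Path

# Standard Linux sysfs location; absent on other operating systems.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if unreadable."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current clocksource:   ", current)
    print("available clocksources:", " ".join(available))
```

On a machine with a usable TSC this would normally report tsc as the current source, with hpet listed among the available ones.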

Cheers
 
HPET facilitates two things: counters and timers. The new TSC counters (without the bugged C-state behaviour) are much better than HPET, with much higher resolution and much lower overhead. HPET event timers, though, are both lower overhead (less latency) and higher resolution than the alternative, ACPI. Some applications/drivers, mostly audio related, won't run without HPET.

I'm guessing forcing HPET on in Windows 10, as Anandtech did, forces both timers and counters to use HPET. Ideally you'd want TSC counters (if present) and HPET timers; I'm guessing this is the default Windows behaviour if HPET is enabled in the chipset. So yeah, the Anandtech measurements do have an issue that needs to be addressed.
I thought the local APIC is used for timers these days? The TSC even has a nifty TSC-deadline mode, which will raise an interrupt in the LAPIC when it reaches some value. Albeit I'm not sure everybody implements this; it might be Intel only, but the LAPIC certainly is not.
 
Windows Update didn't install the microcode update for me, even after installing 1803 recently, but when I downloaded it from the link it applied with no problem and it's working. Also, in my quick benchmarks (Geekbench) there was no performance loss, but that's expected for this sort of thing.
 
Google and Microsoft disclose new CPU flaw, and the fix can slow machines down
May 21, 2018

Microsoft and Google are jointly disclosing a new CPU security vulnerability that’s similar to the Meltdown and Spectre flaws that were revealed earlier this year. Labelled Speculative Store Bypass (variant 4), the latest vulnerability is a similar exploit to Spectre and exploits speculative execution that modern CPUs use.
...
“If enabled, we’ve observed a performance impact of approximately 2-8 percent based on overall scores for benchmarks like SYSmark 2014 SE and SPEC integer rate on client and server test systems,” explains Leslie Culbertson, Intel’s security chief.
...
“Microsoft previously discovered this variant and disclosed it to industry partners in November of 2017 as part of Coordinated Vulnerability Disclosure (CVD),” says a Microsoft spokesperson. Microsoft is now working with Intel and AMD to determine performance impacts on systems.
https://www.theverge.com/2018/5/21/...nerability-speculative-store-bypass-variant-4
 
I think the discovery, research on fixes, and design considerations of preventing further vulnerabilities is far more important than a 5-10% performance loss overall. I know I'm speaking from a consumer perspective and there are possibly significant commercial interests involved but these are simply things that need to be fixed and the spotlight on them is great.
 
Just imagine when the first generation of processors is on the market with HW fixes for S/M vulnerabilities. Suddenly, you'd get a performance increase gen-2-gen of not only 5 % but 10-15%!! ;)
 
Just imagine when the first generation of processors is on the market with HW fixes for S/M vulnerabilities. Suddenly, you'd get a performance increase gen-2-gen of not only 5 % but 10-15%!! ;)

That's assuming that hardware fixes don't come at the cost of performance enhancing hardware features (design choices, design shortcuts, design assumptions that are no longer valid in the face of these security threats, etc.).

Regards,
SB
 
Yes. Normally though, everything you have dedicated circuits for is much quicker than emulating it in software or via macros.
 
Another side-channel attack disclosed.

https://www.blackhat.com/us-18/brie...rotecting-your-cpu-caches-is-not-enough-10149
We present TLBleed, a novel side-channel attack that leaks information out of Translation Lookaside Buffers (TLBs). TLBleed shows a reliable side channel without relying on the CPU data or instruction caches. This therefore bypasses several proposed CPU cache side-channel protections. Our TLBleed exploit successfully leaks a 256-bit EdDSA key from libgcrypt (used in e.g. GPG) with a 98% success rate after just a single observation of signing operation on a co-resident hyperthread and just 17 seconds of analysis time.
 
Latest Win10 Cumulative update has some more related fixes https://support.microsoft.com/en-us/help/4343909/windows-10-update-kb4343909
  • Provides protections against a new speculative execution side-channel vulnerability known as L1 Terminal Fault (L1TF) that affects Intel® Core® processors and Intel® Xeon® processors (CVE-2018-3620 and CVE-2018-3646). Make sure previous OS protections against Spectre Variant 2 and Meltdown vulnerabilities are enabled using the registry settings outlined in the Windows Client and Windows Server guidance KB articles. (These registry settings are enabled by default for Windows Client OS editions, but disabled by default for Windows Server OS editions.)
  • Addresses an issue that causes high CPU usage that results in performance degradation on some systems with Family 15h and 16h AMD processors. This issue occurs after installing the June 2018 or July 2018 Windows updates from Microsoft and the AMD microcode updates that address Spectre Variant 2 (CVE-2017-5715 – Branch Target Injection).
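
The KB above covers the Windows registry route. For Linux boxes, a rough counterpart is the kernel's own per-vulnerability status files. The Python sketch below assumes a kernel that exposes the sysfs vulnerabilities directory and simply prints whatever it reports; mitigation_status is a name picked for this example, and the set of files depends on the kernel version.

```python
from pathlib import Path

# Linux-only sysfs directory; one file per known CPU vulnerability.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(base=VULN_DIR):
    """Map each vulnerability name to the kernel's one-line status string."""
    status = {}
    if not base.is_dir():
        return status
    for entry in sorted(base.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    for name, state in mitigation_status().items():
        print(f"{name:24} {state}")
```

Each line is either "Not affected", "Vulnerable", or a "Mitigation: ..." description, so it is easy to check whether entries such as meltdown, spectre_v2 or l1tf are covered.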
 
So, we can expect Intel PR to do a 180 in a few months' time when the new SKUs are available: from downplaying the performance impact of the Spectre/Meltdown mitigations to emphasizing how the new SKUs won't suffer massive performance degradation, thus forcing an upgrade cycle.

Cheers
 