Predict: Next gen console tech (9th iteration and 10th iteration edition) [2014 - 2017]

Maybe the PlayGo system is part of the reserved RAM on PS4. Sony uses a lightweight BSD system; sometimes I ask myself what they're doing with all that reserved RAM...
Haha, I think we're all asking that! :yes: But I strongly object to judgement without knowledge. It may well be that it's largely unused and therefore wasted, but that's knowledge we don't have.
 
The Xbox One shows why we may not get huge memory: they opted for 3GB for the OS and apps. As noted, apps tombstone and close, and a bit more memory would have been nice. The main memory is DDR3, so no doubt they could have gone huge if they wanted, but cost gets in the way; they want to keep the BOM as low as possible.

I can see a split memory setup next time though - not because ESRAM is poor, but because if apps become the big thing, isolating them from game memory will probably be wise to ensure memory bandwidth availability.
 
The Xbox One shows why we may not get huge memory: they opted for 3GB for the OS and apps. As noted, apps tombstone and close, and a bit more memory would have been nice. The main memory is DDR3, so no doubt they could have gone huge if they wanted, but cost gets in the way; they want to keep the BOM as low as possible.
But unlike Sony, we know that Microsoft has a plan for some of their OS memory reservation, which is support for universal apps. That's the frustrating thing about Sony's 3.5GB reservation - it serves no clear purpose.

Stacked DRAM will probably change everything but the same problems will exist. Maximum bandwidth shared between two (or more) high performance resources (GPU and CPU cores) will need to be split and arbitrated, and coherence still needs to be managed.
 
8 gigs of DRAM raises the hackles of OS developers, but breaking double or triple digits in capacity with volatile non-ECC memory makes a lot of system designers paranoid. HPC nodes and single servers deal with large capacities, but when there's that much capacity, the chances of someone having something important subject to transient events like interference, upsets, or power interruption rise, and 128 GB is a lot to leave hanging around, possibly for weeks or years with the latest suspend modes.
HPC is very interested in providing non-volatile node storage, particularly if any of the non-volatile RAM techs take off, although the pricing situation or true long-term reliability may not be there for that quantity for this market (for whatever time frame we're considering).
Consumer tech has been lax in considering the reliability and data integrity of all that cheap storage, and adding two orders of magnitude to the vulnerability surface might make people notice.

Aside from resilience to power-delivery concerns, there are some implications to having large amounts of DRAM. The consoles' suspend-to-RAM mode raises their standby power to 10W or over. If a measurable fraction of that is dedicated to keeping their 8GB pools refreshed, upping the ante by 16x to satisfy a numerological twitch would require some elaboration from a technology standpoint to indicate why it won't be drawing the Wii U's active power at standby.
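As a rough back-of-envelope (the refresh share of standby power below is an assumption for illustration, not a measured figure):

Code:
// Back-of-envelope for scaling suspend-to-RAM power with capacity.
// The 2 W refresh share is an assumption for illustration, not a measurement.
#include <cstdio>

int main() {
    const double standby_watts     = 10.0; // approximate suspend-to-RAM draw of current consoles
    const double refresh_watts_8gb = 2.0;  // assumed share of that spent keeping 8 GB in self-refresh
    const double capacity_factor   = 16.0; // 8 GB -> 128 GB

    // If self-refresh power scales roughly linearly with capacity,
    // the refresh share alone grows by 16x:
    const double standby_watts_128gb =
        (standby_watts - refresh_watts_8gb) + refresh_watts_8gb * capacity_factor;

    std::printf("estimated standby draw at 128 GB: ~%.0f W\n", standby_watts_128gb);
    // ~40 W with these assumptions, i.e. in the neighbourhood of a Wii U running a game.
    return 0;
}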
 
8 gigs of DRAM raises the hackles of OS developers, but breaking double or triple digits in capacity with volatile non-ECC memory makes a lot of system designers paranoid.
I manage a server farm where none of the servers have below 512GB and where we are able to monitor ECC bit error detection over time, and error numbers are low. But we have calculated that in real terms - that is, the cost of the bits in a server being used for parity error detection (one bit per 64-bit word) rather than actual storage, plus re-running a computation if an error is detected - non-parity RAM would serve us better with software error detection, which we're mandated to have as well.
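For anyone wondering what software error detection amounts to, here's a minimal sketch of the general idea - the names and checksum choice are mine for illustration, not our actual system:

Code:
// Minimal sketch of software error detection on non-ECC RAM: keep a checksum
// alongside long-lived data, verify it before use, and redo the work (or
// reload the data) if it no longer matches.
#include <cstdint>
#include <vector>

struct Checked {
    std::vector<std::uint64_t> data;
    std::uint64_t              sum = 0;
};

std::uint64_t fnv1a(const std::vector<std::uint64_t>& d) {
    std::uint64_t h = 1469598103934665603ull;           // FNV-1a style rolling hash
    for (std::uint64_t w : d) { h ^= w; h *= 1099511628211ull; }
    return h;
}

void seal(Checked& c)               { c.sum = fnv1a(c.data); }
bool still_intact(const Checked& c) { return fnv1a(c.data) == c.sum; }

// Caller pattern: if (!still_intact(block)) { recompute or reload block; seal(block); }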

I'm curious who is getting paranoid about this. Personally, the only machine with 8GB in my house is an 11" MacBook Air! Everything else has 16GB or 32GB. Obviously the RAM supplier, your electromagnetic environment and power supply are factors for big data environments.
 
I manage a server farm where none of the servers have below 512GB and where we are able to monitor ECC bit error detection over time, and error numbers are low. But we have calculated that in real terms - that is, the cost of the bits in a server being used for parity error detection (one bit per 64-bit word) rather than actual storage, plus re-running a computation if an error is detected - non-parity RAM would serve us better with software error detection, which we're mandated to have as well.
I am not sure consumer devices can count on the power quality of service for a server farm, or what level of software error detection we're expecting from game consoles.

I'm curious who is getting paranoid about this. Personally, the only machine with 8GB in my house is an 11" MacBook Air! Everything else has 16GB or 32GB. Obviously the RAM supplier, your electromagnetic environment and power supply are factors for big data environments.
Linus Torvalds is one. He used the recent Rowhammer exploit disclosures to hammer on that point again, although he's been a major proponent for more pervasive ECC for far longer.
 
I am not sure consumer devices can count on the power quality of service for a server farm, or what level of software error detection we're expecting from game consoles.
There is no commercially available equipment in the server farm, and for the memory capacity that we buy, non-ECC RAM is impossible to source.

Linus Torvalds is one. He used the recent Rowhammer exploit disclosures to hammer on that point again, although he's been a major proponent for more pervasive ECC for far longer.

Can you link? My, admittedly dated, reading of the row hammer exploit was that it was limited to very specific hardware and could be fixed with better hardware rather than paranoia about large RAM configurations. Indeed it could work on low-RAM configurations and the density of the RAM seemed irrelevant.
 
Can you link? My, admittedly dated, reading of the row hammer exploit was that it was limited to very specific hardware and could be fixed with better hardware rather than paranoia about large RAM configurations.

Torvalds' point was that row hammer would have had serious problems getting off the ground if the integrity of DRAM data had been taken seriously since capacities have become so large. He's harped on this for far longer than the latest densities for commodity DRAM, and it is a widespread problem for some major manufacturers for DRAM manufactured in recent years.
http://users.ece.cmu.edu/~omutlu/pub/dram-row-hammer_kim_talk_isca14.pdf (pg. 20)
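For concreteness, the access pattern that talk describes boils down to a loop like the following (a minimal sketch - the addresses are placeholders, picking ones that actually share a bank requires knowledge of the controller's address mapping, and whether anything flips depends entirely on the DRAM in question):

Code:
// Minimal sketch of the "row hammer" access loop: read two addresses that map
// to different rows in the same DRAM bank, flush them from the cache so every
// iteration goes to DRAM, and repeat.
#include <emmintrin.h>   // _mm_clflush, _mm_mfence
#include <cstdint>

void hammer(volatile std::uint8_t* x, volatile std::uint8_t* y, long iterations) {
    for (long i = 0; i < iterations; ++i) {
        (void)*x;                          // activate the row containing x
        (void)*y;                          // activate the row containing y
        _mm_clflush((const void*)x);       // force the next read to hit DRAM
        _mm_clflush((const void*)y);
        _mm_mfence();                      // keep the accesses ordered
    }
    // On vulnerable parts, enough activations within one refresh interval can
    // flip bits in the rows adjacent to x and y.
}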


It can be at least mitigated by increased refresh rates or changes in memory controllers or DRAM chips.
LPDDR4 appears to have some kind of mitigation, either for the row hammer scenario or it has some additional hardware measures that reduce the chance of row coupling inverting bits. DDR4 might be out of luck, depending on undisclosed and uncertain changes for any number of memory controllers and vendors.

The base point that DRAM is increasingly prone to physical problems that compromise integrity is part of his larger point, for which the row hammer exploit only served as a point of emphasis. ECC or parity for on-chip registers and caches is already highly widespread, even though that used to be a predominantly big-iron feature. Physical exploits also compromise any sort of virtual memory protections. I do not know how client software can be expected to recover if a physical row that happens to map to kernel space experiences an uncorrected and undetected bit flip.

Some of Torvalds' statements with regards to client ECC.
http://www.realworldtech.com/forum/?threadid=148351&curpostid=148363
http://www.realworldtech.com/forum/?threadid=148351&curpostid=148395
There are more, and over quite some time in the past.

Indeed it could work on low-RAM configurations and the density of the RAM seemed irrelevant.
The time frame for the most heavily affected manufacturers is rather late in the game relative to when Linus Torvalds started campaigning for ECC as standard. The emphasis on density - which means lower voltages, tightly packed rows, and stretched refresh periods - has made it easier to perform physical compromises of the devices, particularly with the cost pressure on commodity products. Stacked DRAM worsens a number of elements, such as how it uses a single array to service a whole burst--which means you can't tack on extra chips for ECC bits. Their density and thermal environments are also worse than they would be for DIMMs.
Row hammer is more of a spectacular example of a general trend that irritates Torvalds.
 
Torvalds' point was that row hammer would have had serious problems getting off the ground if the integrity of DRAM data had been taken seriously since capacities have become so large.

While he has a point, I don't know that these types of exploits are high on Microsoft and Sony's issue lists. The vulnerability itself is limited to specific hardware configurations, and being able to purposefully exploit this requires specific knowledge of the system in question. I also understand that rowhammer qualification testing is now becoming more common among OEMs, and there's no reason console manufacturers couldn't specify this qualification to their DRAM supplier.

Torvalds' second post is definitely interesting; he suggests that data corruption is a big problem, although there's no data to support the claim, nor is there any context. Is he talking about an 8GB machine working for a day or running for a year? Our data suggests differently, although it's quite possible that ECC DRAM is less error prone than non-ECC DRAM - excepting the parity detection built in.
 
While he has a point, I don't know that these types of exploits are high on Microsoft and Sony's issue lists.
The DRM-happy platform holders have a vested interest in mitigating exploits that could lead to access to the privileged stores or system reserve that might be holding their validation signatures and encryption keys. The monetary incentive for getting access is significant for well-funded criminal interests. OtherOS was enough motivation to compromise the PS3, and this was helped in part by successful bus glitching--something not related to row hammer, but another case where assuming all is well physically compromises assumptions made by system software.

The vulnerability itself is limited to specific hardware configurations and being able to purposefully exploit this requires specific knowledge of the system in question.
I suppose it depends on what is meant by specific when one component of this specificity is every DIMM manufactured for years on end. The consoles are very specific hardware configurations, and there are those that will learn them very well.

I also understand that rowhammer qualification testing is now becoming more common among OEMs, and there's no reason console manufacturers couldn't specify this qualification to their DRAM supplier.
Until there's another exploit they didn't think to test for. ECC and system monitoring would have provided more defense in depth, because the system would have been able to correct or at least detect the mass of errors related to physical manipulation of the system, which would allow it to clamp down on the iteration rate.
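As a sketch of what that detect-and-clamp-down could look like on a platform that exposes corrected-error counters (the sysfs path below is Linux's EDAC interface - a console OS would need its own equivalent, and the threshold is an arbitrary placeholder):

Code:
// Rough sketch: poll the platform's corrected-error counter and react when the
// rate spikes far beyond the normal background level.
#include <fstream>
#include <thread>
#include <chrono>
#include <cstdint>

std::uint64_t corrected_errors() {
    std::ifstream f("/sys/devices/system/edac/mc/mc0/ce_count");
    std::uint64_t n = 0;
    f >> n;
    return n;
}

void watch_for_hammering() {
    std::uint64_t last = corrected_errors();
    for (;;) {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        const std::uint64_t now = corrected_errors();
        if (now - last > 100) {   // arbitrary threshold, far above background rate
            // e.g. throttle the offending process, force extra refreshes,
            // or flag the system for inspection.
        }
        last = now;
    }
}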
If there is one area where I expect significant interest in either on-die memory, stacking under the APU, or to a lesser extent interposer solutions, it is making it physically difficult to compromise the signal lines or probe memory without seriously risking the destruction of the test system.

This would also be a counter pressure to my earlier reference to RAM types that can persist without power, since systems have been compromised by cryo-cooling RAM and yanking system power. Non-volatile RAM makes it even easier.

Torvalds' second post is definitely interesting; he suggests that data corruption is a big problem, although there's no data to support the claim, nor is there any context. Is he talking about an 8GB machine working for a day or running for a year? Our data suggests differently, although it's quite possible that ECC DRAM is less error prone than non-ECC DRAM - excepting the parity detection built in.
It's possible it's some indeterminate amount of time across many devices that make up the full range of the platform his organization winds up needing to bug fix or research.
Given the range of DRAM products out there, I'm not sure how many borderline bins are given to server-targeting products. Some value brands seem to be dodgier in the consumer space.
 
The DRM-happy platform holders have a vested interest in mitigating exploits that could lead to access to the privileged stores or system reserve that might be holding their validation signatures and encryption keys.

What I mean is, this is an exploit that can be prevented at the design stage. Only an oversight would allow this exploit to compromise future machines.

I suppose it depends on what is meant by specific when one component of this specificity is every DIMM manufactured for years on end. The consoles are very specific hardware configurations, and there are those that will learn them very well.

Google did comprehensive testing and found that not all machines were susceptible, although they anonymized the results. Like I said, specific configurations. And no doubt future consoles will have some form of exploit, but this particular exploit, and its particular relevance to the 'paranoia' about large RAM sizes, can certainly be protected against. You'll also note from the Google article linked to above that JEDEC are mandating rowhammer mitigation measures (next row refresh) as part of LPDDR4, and this will no doubt find its way into other DRAM standards.

Kernel ASLR also helps. Not implemented as standard in Linux, but it is in Windows and OS X.
 
Google did comprehensive testing and found that not all machines were susceptible, although they anonymized the results. Like I said, specific configurations.
They declared their sample size was not representative, and they could not determine the year of the DRAM's manufacture. There are laptops from the same vendor, same CPU, and same year with different results. From those numbers, 5 of 8 models are compromised and it is possible for there to be multiple false negatives for the same configuration.
 
From those numbers, 5 of 8 models are compromised and it is possible for there to be multiple false negatives for the same configuration.
Last post because this is an egregious thread derail. There's no commonality which pinpoints where the issue lies, although I wouldn't like to own Model #5 with DRAM vendor B :nope:. I don't think you can say a model is compromised when, even using the same model, it's the CPU architecture and DRAM vendor that introduce variations in testing.

But as I said above, the fundamental cause of this is known and can be engineered around. Nobody should be worrying about this being an issue in the next generation of consoles or using it as justification why they shouldn't have a lot more RAM than today's consoles.
 
Do consoles support virtual memory?
The PS4 has (or had at one point) 512MB of "paged" RAM, so maybe that could be a yes, but it was all a bit vague.

edit: re-reading your question, you're asking whether they do support it, rather than whether they theoretically could.
 
Well yes, do they support it? Do games at the moment have an absolute memory limit that they must never exceed, or are they free to use a swap file?
Could a dev, if they so wished, release a game that needed 10GB of memory to run (for example)?
 
What do you mean by 'needs 10 GB to run'? You can have 10 GB of assets and resources in a level and stream them in/out as needed - no need for PC-style virtual memory, because the console devs can control this explicitly.
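Roughly what "control this explicitly" means - a toy sketch of a fixed-budget residency manager, with all names and sizes hypothetical:

Code:
// Toy sketch of console-style explicit streaming: the game owns a fixed budget
// and decides what to evict before loading something new, so it never relies
// on OS-level swapping.
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

class StreamingPool {
public:
    explicit StreamingPool(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Request an asset for the upcoming frames; evict least-recently-used
    // assets until the new one fits inside the budget.
    void request(const std::string& assetId, std::size_t sizeBytes) {
        if (resident_.count(assetId)) { touch(assetId); return; }
        while (used_ + sizeBytes > budget_ && !lru_.empty())
            evictOldest();
        // issue the async read from optical disc / HDD here
        resident_[assetId] = sizeBytes;
        lru_.push_front(assetId);
        used_ += sizeBytes;
    }

private:
    void touch(const std::string& id) { lru_.remove(id); lru_.push_front(id); }

    void evictOldest() {
        const std::string victim = lru_.back();
        used_ -= resident_.at(victim);
        resident_.erase(victim);
        lru_.pop_back();
    }

    std::size_t budget_;
    std::size_t used_ = 0;
    std::unordered_map<std::string, std::size_t> resident_;
    std::list<std::string> lru_;   // front = most recently used
};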
 
Even the concept of a swap file makes my skin crawl, just thinking of the days when my PC had 32MB of RAM and it kept dumping data to HDD, completely halting any sort of operation for minutes, even hours, until I rebooted the bloody thing.
 
Well yes, do they support it? Do games at the moment have an absolute memory limit that they must never exceed, or are they free to use a swap file?
Could a dev, if they so wished, release a game that needed 10GB of memory to run (for example)?

The PS4 leaks had a flexmem allocation that was 1GB, which is set aside for the game but managed by the OS virtual memory system. It purportedly has 0.5 GB in RAM and 0.5 GB paged, so it permits some swapping for a fixed minority of the game's memory space.
Nothing has been mentioned about the Xbox One doing something like this, but both consoles are physically able to handle paging and their OS infrastructure is likely able to use it.
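In generic POSIX terms (purely an illustration of that resident/paged split, not the actual PS4 API), the arrangement would look something like:

Code:
// Generic POSIX illustration of a "flexible" allocation: half the region is
// pinned so it can never be paged out, half is left pageable for the OS to
// swap under pressure. Error handling omitted for brevity.
#include <sys/mman.h>
#include <cstddef>

struct FlexRegion {
    void* resident;   // stays in physical RAM
    void* pageable;   // the OS may page this half to storage
};

FlexRegion alloc_flex(std::size_t halfBytes) {
    FlexRegion r{};
    r.resident = mmap(nullptr, halfBytes, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mlock(r.resident, halfBytes);   // pin: never swapped out

    r.pageable = mmap(nullptr, halfBytes, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // no mlock here: the kernel is free to page this half out
    return r;
}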
 