Predict: Next gen console tech (9th iteration and 10th iteration edition) [2014 - 2017]

Aren't there already console games that bypass the traditional ROPs and use GPGPU instead? For those games, the ROPs are wasted APU space that could have been better used for CUs, texture units or bigger caches.

I think I read @sebbbi's post talking about it.
Media Molecule's Dreams is a PS4 exclusive and it leaves all 32 ROPs fully unused. If you don't render triangles, ROPs don't matter to you.
 
I'm so confused now.
Dreams doesn't draw triangles, so it doesn't need the ROPs to write them out. Instead it uses compute to take volumes (defined as signed distance fields) and draws lots of 'sprites' over the surface to represent it. Loose, indistinct 'sprites' (properly known as splats) give the more impressionistic results seen in Dreams screenshots. Very small, dense splats give surfaces distinct edges.
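As a rough illustration of the SDF idea (a minimal sketch assuming a sphere as the shape, written in CUDA; this is not Media Molecule's code), the field is just a function returning signed distance to the surface, and a splat renderer only needs sample points close to the zero isosurface:

```cuda
// Illustrative sketch only, not Media Molecule's code: a signed distance
// field (SDF) returns the distance to the nearest surface (negative
// inside the volume, positive outside, zero exactly on it).
#include <math.h>

// Distance from point p to a sphere of the given radius at the origin.
__device__ float sdSphere(float3 p, float radius)
{
    return sqrtf(p.x * p.x + p.y * p.y + p.z * p.z) - radius;
}

// A splat renderer only needs points near the zero isosurface: sample
// candidate positions and keep those within a small epsilon of it.
__device__ bool nearSurface(float3 p, float radius, float epsilon)
{
    return fabsf(sdSphere(p, radius)) < epsilon;
}
```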

There's only one game doing this and not using triangles, though. Everything else uses triangles and will do for a while, I'm sure. As discussed in the SDF discussion, it's a very interesting technique with incredible results, but it's not a cure-all, and Dreams is being made around the tech as much as the tech is being made around the vision of Dreams. We're still going to need to rasterise triangles this gen and the next.
 
Is that confirmed? The quad-channel DDR4 alone is 102.4 GB/s of bandwidth. That is 3x more than any PC chip with an integrated GPU. To use this bandwidth, AMD would need at least a 3x bigger GPU in that APU (taking into account delta compression and bigger GPU caches). That GPU would be comparable to next gen consoles. But... they also have an additional 512 GB/s of HBM bandwidth for the GPU. To fully utilize that, the integrated GPU part would need to be as big as Fiji.
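(A quick sanity check on that figure, assuming DDR4-3200: 4 channels × 8 bytes per transfer × 3.2 GT/s = 102.4 GB/s.)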

The source of this slide apparently really was AMD, but it wasn't a description of a real product already in development; it was an example configuration of a custom HPC APU they tried to sell to various parties building supercomputers. That configuration would definitely not come out in a console.

AFAIK, all the big customers are going with nVidia, so that monster APU will likely never get made.

(Interesting aside -- that slide was published before the fake leaks, contradicted them, and a lot of people, including me, thought that the slide was less credible than the fake leaks. That slide is still one of the very few sources of information about Zen that comes from AMD, including the cache organization.)
 

And in addition to this, they have crammed 32 MB of L3 cache, two fast memory controllers, and 16 Zen cores (that should be bigger than Steamroller cores). I am a bit sceptical.

But if this is our next gen console, I am positively surprised :)

As already mentioned, that APU is just AMD's example of what they could do. It's not a console APU. But we can see where AMD is headed, and we know AMD has the ability to customize whatever CPU and GPU technology they have for their customers and build semi-custom designs around what their industry customers want.
 
If they could make an APU with 8 Zen cores (16 threads), 16 MB of L3 cache, full HSA, a bigger graphics "magenta area" and 16 GB of HBM, I would be happy enough.
 
"Conservative rasterization" should be called "liberal rasterization."
I was confused by the name too, but I guess 'conserving' has two meanings and in this case triangle info is being conserved, not discarded (as opposed to the 'reduced' or 'limited' connotation of conservative).
 
I said previously that I expect full-on SSDs in the next gen consoles. I've changed my mind. I was thinking about this, and it makes more sense if they go for a 1TB SSHD with, say, 64GB of MLC NAND (or whatever the go-to cost/longevity option is at the time; I hear Samsung's 3D TLC V-NAND is pretty good in the 850 EVOs) plus regular HDD space, and when a game boots it transfers it to the SSD partition. It would make more sense cost-wise for both Sony and MS, and it would provide a big enough boost to make a difference. That is, if it's viable by 2020, which I expect it would be, but you never know.

Why would they do this? The IO performance of the hard drive is not really marketable. Disk space is, if anything. Which is why they are not going to launch with anything remotely like an SSD: they are just too expensive and smaller in capacity than anything comparably priced. Not going to happen.

What they will do is make the drive interchangeable like it is on the PS4, so people wanting quicker load times can buy an upgrade on their own. The key will be to have adequate disk space. Performance will not matter, as the games are not run directly off the hard drive anyway. They would rather go for more RAM than faster storage. Remember: they want people to buy lots of content, so prioritizing disk space (at a lower cost) actually makes sense.
 
The Xbox already has 8GB of flash storage for the OS/apps, to alleviate the stock HDD I assume. An evolution of this seems to be a small but higher-performance cache for games to buffer into, to reduce IO contention and speed up game loading times. It might also make going from installing to playing quicker off the disc, which I assume will still be there. Master that, and a cheap SKU may become possible, similar to the 360 Arcade but without the complexities for developers. With USB3 external storage the upgrade path is fairly cheap, and of course a branded internal disk in a caddy may well also be offered as an upgrade. Then I suppose we must assume there will be a 2TB or more base console as well.

tl;dr:
Price will be key. Given all games have to go to local storage whatever it is, could we see a return to the two-SKU model to try and drive the price down initially, with just small flash storage and USB storage compatibility? Storage has never been easier to upgrade post-purchase, so many of the initial issues are gone.
 
Tell that to XB1 Fallout 4 owners. :yep2:
 

Who cares what's marketable? Your average attach rate doesn't jibe well with employing supersized HDDs or SSDs as a core component of a console, especially when the option to upgrade the HDD is pretty much standard outside of Nintendo.

Employ a size that will accommodate most owners and allow core and hardcore owners to float to their own solution.

I would like to see MS expand, and Sony include, flash on their next consoles to facilitate data-to-RAM transfers at greater rates, which would lower load times while allowing more flexibility when expanding storage with third-party drives.
 
I would be very surprised if any chip company could produce that chip in 2016. Current console APUs were already close to the limits in chip size, and we are talking about a 4x+ increase in CUs (assuming it has a Fury X-sized GPU to match that BW). That 32 MB L3 cache would take more area than the Xbox One ESRAM, as caches require extra space for tags. One Zen CPU core should be at least 2x bigger than one Jaguar core, so with twice as many cores we are talking about at least 4x bigger area. I am just a bit sceptical whether this all is possible at 14nm, and what kind of cooling is required for this beast. TDP must be quite extreme :)
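(Rough numbers on the tag overhead: 32 MB of cache with 64-byte lines is 512K lines; at roughly 30 bits of tag and coherency state per line, that is about 2 MB of additional SRAM on top of the data arrays, before counting the tag lookup logic itself.)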

That APU would meet my next gen demands, if next gen was launched in late 2017. 4x faster GPU and 6x faster CPU (assuming 1.5x clocks and 2x IPC, counting HT) would not exactly reach the usual generational gains, but it is clear that we are not going to see 10x+ gains anymore, since we are power limited nowadays.
Conservative rasterization means that it can rasterize too many pixels, but it always includes every pixel the triangle actually covers. The "is this pixel inside the triangle" test is conservative.
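A hypothetical sketch of that test (plain edge functions in CUDA-flavoured C++; real GPUs do this in fixed-function hardware, and the names here are made up): shift each edge outward by the worst-case half-pixel offset, so a pixel passes if any part of it could touch the triangle.

```cuda
// Hypothetical sketch, not real GPU hardware: a conservative coverage
// test over edge functions. An ordinary rasterizer tests the pixel
// centre against each edge; the conservative version shifts each edge
// outward by the worst-case distance from pixel centre to pixel corner,
// so any pixel the triangle touches at all is reported as covered.
#include <math.h>

struct Edge { float a, b, c; };  // edge function: a*x + b*y + c >= 0 is inside

__host__ __device__ bool conservativeInside(const Edge e[3], float px, float py)
{
    for (int i = 0; i < 3; ++i) {
        // The max of the edge function over a 1x1 pixel centred at
        // (px, py) exceeds the centre value by 0.5 * (|a| + |b|).
        float offset = 0.5f * (fabsf(e[i].a) + fabsf(e[i].b));
        if (e[i].a * px + e[i].b * py + e[i].c + offset < 0.0f)
            return false;  // the whole pixel is outside this edge
    }
    return true;  // conservatively inside: may be a false positive
}
```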
 
If we can post dream tech for the next generation, then for me it would be whatever hardware can run the Reyes software stack, so games can finally look as good as, if not better than, the original Toy Story. 4K output at 120 Hz would also be pretty awesome : >

Well, I can dream I guess..
 

Would you say that large amounts of embedded memory are a thing of the past for consoles? It seems like no one (Intel aside) is willing to fab DRAM on any cutting-edge process, and SRAM is difficult to shrink. Large pools of fast external memory seem like the way forward for high-performance embedded systems.
 
If I understood properly, they are plotting pixels with a 64-bit atomic min. The depth is packed into the high bits, meaning the closest pixel remains. The point cloud is generated from the SDF representation. They use temporal AA to hide the noise.
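A minimal sketch of that packing trick, written as CUDA for illustration (Dreams runs its own PS4 compute code, and names like splatPoints are hypothetical): with depth in the high 32 bits, a 64-bit atomicMin keeps whichever splat is closest.

```cuda
#include <stdint.h>

// Sketch of the depth-packed atomic-min idea (CUDA for illustration;
// names are hypothetical). One 64-bit word per pixel: depth in the high
// 32 bits, a payload such as packed colour in the low 32 bits. atomicMin
// keeps the smallest packed value, i.e. the splat with the nearest depth.
// The framebuffer must be cleared to 0xFFFFFFFFFFFFFFFF each frame.
__global__ void splatPoints(const float3* points, const uint32_t* colours,
                            int numPoints, unsigned long long* framebuffer,
                            int width, int height)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPoints) return;

    // Assume points are already projected: x,y in pixel coords, z = depth.
    int x = (int)points[i].x;
    int y = (int)points[i].y;
    if (x < 0 || x >= width || y < 0 || y >= height) return;

    // For positive floats, the raw bit pattern orders the same way as the
    // float values, so the bits can be compared directly as integers.
    uint32_t depthBits = __float_as_uint(points[i].z);
    unsigned long long packed =
        ((unsigned long long)depthBits << 32) | colours[i];

    atomicMin(&framebuffer[y * width + x], packed);  // nearest splat wins
}
```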
Alternative rendering technologies will become more common in the future. GCN is an architecture very well suited to GPU compute. The "ancient" Radeon 7970 beats Kepler in many compute benchmarks and is better than Maxwell at async compute. If next gen is based on an AMD GPU, we will certainly see games that render everything with compute.

Actually, compute is already beating hardware rasterization (in triangle rendering) on GCN in some cases:
http://www.joshbarczak.com/blog/?p=1012

Even if compute were faster at shadow map rendering, it would still be better to rasterize the shadow maps with the hardware ROPs and use the compute units at the same time via async compute (for example, for lighting or post-processing). This would reduce the total frame time more.
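There is no direct CUDA equivalent of a console's graphics-plus-compute queues, but as a loose analogy only, the same overlap principle can be sketched with CUDA streams (toy kernels with hypothetical names):

```cuda
#include <cuda_runtime.h>

// Toy stand-ins for the two workloads discussed above.
__global__ void shadowWork(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = (float)i;        // pretend shadow-map pass
}
__global__ void lightingWork(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = out[i] * 0.5f;   // pretend lighting pass
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));

    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    // Independent work submitted to separate streams may run
    // concurrently, the way async compute fills idle CUs while the
    // fixed-function units rasterize.
    shadowWork<<<n / 256, 256, 0, s0>>>(a, n);
    lightingWork<<<n / 256, 256, 0, s1>>>(b, n);

    cudaStreamSynchronize(s0);
    cudaStreamSynchronize(s1);
    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```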
 

I haven't read much about the Zen APU specs, but could it be that they are going the same route as Intel with a self-booting many-core chip?

http://www.pcworld.com/article/3005...ore-supercomputing-chip-into-workstation.html

Don't say that compute can beat hardware rasterization or they're all gonna think you're crazy :yes:.
 
No one thinks that. Everything sebbbi has ever mentioned is publicly known; he has never broken an NDA and avoids questions where he'd be at risk of breaching one.
 