PS4 Pro Official Specifications (Codename NEO)

Status
Not open for further replies.


Here's a quote that caught my attention:

Mark Cerny said:
We doubled the GPU size by essentially placing it next to a mirrored image of itself, rather like the wings of a butterfly. That gives us an extremely clean way to support the 700 existing titles, because we can turn off half the GPU and just run something that's very close to the original GPU.

Like it was said above, I wonder if this is only the back-end or does it mean twice the ROPs too. Doubling the ROPs makes a lot of sense if a higher resolution is indeed the primary objective.

Together with the 2*FP16 rate (and if they're properly taken advantage of), then this could put the PS4 Pro head and shoulders above Polaris 10 in raw capabilities.

Then again, if the ROPs are directly connected to the 8*32bit memory channels, then disconnecting half the GPU for the compatibility mode would result in not being able to even access half the memory, so I guess they would need a ring bus to achieve such a thing.
 
Like it was said above, I wonder if this is only the back-end or does it mean twice the ROPs too. Doubling the ROPs makes a lot of sense if a higher resolution is indeed the primary objective.

Together with the 2*FP16 rate (and if they're properly taken advantage of), then this could put the PS4 Pro head and shoulders above Polaris 10 in raw capabilities.

The RX480 has 40% more FP32 throughput, 40% more raw geometry throughput (although I understand the Pro has some tessellation improvements), 40% more texturing throughput and almost double the memory bandwidth. I'm not sure why even with 64 ROPS the Pro would be considered head and shoulders above Polaris 10 merely for beating it by 44% in ROP throughput and the same in FP16. Unless you're expecting the bottleneck on a majority of future games to switch to ROPS and FP16. Which seems pretty unlikely.
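For reference, the percentages in this post can be reproduced with a rough sketch. The clocks and shader counts below are assumptions that the thread does not state (2304 shader lanes on both chips, 1266 MHz boost for the RX 480, 911 MHz for the PS4 Pro), and 64 ROPs on the Pro is only the hypothesis under discussion.

```python
# Assumed figures (not stated in the thread): 2304 shader lanes on both GPUs,
# RX 480 boost clock 1266 MHz, PS4 Pro clock 911 MHz.
# 64 ROPs on the Pro is the thread's hypothesis, not a confirmed spec.
RX480 = {"lanes": 2304, "rops": 32, "mhz": 1266}
PS4_PRO = {"lanes": 2304, "rops": 64, "mhz": 911}

def fp32_tflops(gpu):
    # 2 FLOPs per lane per clock (one fused multiply-add)
    return gpu["lanes"] * 2 * gpu["mhz"] * 1e6 / 1e12

def pixel_fill_gpix(gpu):
    # 1 pixel per ROP per clock
    return gpu["rops"] * gpu["mhz"] * 1e6 / 1e9

fp32_ratio = fp32_tflops(RX480) / fp32_tflops(PS4_PRO)
rop_ratio = pixel_fill_gpix(PS4_PRO) / pixel_fill_gpix(RX480)
# Assumes the RX 480 runs FP16 at its FP32 rate and the Pro at 2x its FP32 rate.
fp16_ratio = 2 * fp32_tflops(PS4_PRO) / fp32_tflops(RX480)

print(f"RX 480 FP32 advantage: {fp32_ratio - 1:.0%}")
print(f"Pro ROP advantage (if 64 ROPs): {rop_ratio - 1:.0%}")
print(f"Pro FP16 advantage: {fp16_ratio - 1:.0%}")
```

Under those assumptions the ratios come out at roughly 39% and 44%, which is in line with the "40%" and "44%" figures quoted above.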
 
The RX480 has 40% more FP32 throughput, 40% more raw geometry throughput (although I understand the Pro has some tessellation improvements), 40% more texturing throughput and almost double the memory bandwidth.
Hum? The RX 480 8GB has 256GB/s and the PS4 Pro has 224GB/s UMA. The Jaguars IIRC couldn't access more than 20GB/s and I doubt that has changed, mainly because those now have an extra 1GB DDR3 in 32 or 64bit (so another 6 or 12GB/s).
If anything, the RX 480 8GB has 25% more bandwidth.

I'm not sure why even with 64 ROPS the Pro would be considered head and shoulders above Polaris 10 merely for beating it by 44% in ROP throughput and the same in FP16. Unless you're expecting the bottleneck on a majority of future games to switch to ROPS and FP16. Which seems pretty unlikely.

If your objective is to boost up the resolution to as much as you can (1800p to 4K), then yes I think ROP count will make a pretty big difference. The R9 290/390 seem to beat the RX 480 at higher resolutions, despite losing in FP32 + texture throughput and geometry performance.
 
Hum? The RX 480 8GB has 256GB/s and the PS4 Pro has 224GB/s UMA. The Jaguars IIRC couldn't access more than 20GB/s and I doubt that has changed, mainly because those now have an extra 1GB DDR3 in 32 or 64bit (so another 6 or 12GB/s).
If anything, the RX 480 8GB has 25% more bandwidth.

Yeah, that's my bad; I accidentally used a 384-bit memory bus for the RX 480. 2x didn't sound right but hey, it was late. The actual figure once you remove the 20GB/s accessible by the CPU (which I think is fair to remove in full, since we're already aware of the additional memory contention issues the PS4 has on account of its UMA) comes in at 197.6GB/s for graphics, since the PS4 Pro's total bandwidth is 217.6GB/s. So the RX480 can be argued to have about 30% more bandwidth than the PS4P.

If your objective is to boost up the resolution to as much as you can (1800p to 4K), then yes I think ROP count will make a pretty big difference. The R9 290/390 seem to beat the RX 480 at higher resolutions, despite losing in FP32 + texture throughput and geometry performance.

The 290/390 also have a lot more memory bandwidth than the 480. And memory bandwidth is also a huge influencing factor on high resolution performance.

And of course that's assuming the PS4P even has 64 ROPS to begin with. Since the 480 is more than capable of holding 30fps at 3840x1080 (2x 1080p or just slightly more pixels than 1440p) at console settings with only 32 ROPs, there doesn't seem a lot of point in going with 64.
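The resolution comparison above is easy to check with a few lines. This is only a pixel-count sketch; the last comparison assumes that checkerboard rendering shades about half of the 4K samples each frame, which is a simplification and not something stated in this post.

```python
def pixels(width, height):
    """Number of pixels in a frame of the given size."""
    return width * height

full_hd = pixels(1920, 1080)    # 1080p
double_hd = pixels(3840, 1080)  # the "2x 1080p" target mentioned above
qhd = pixels(2560, 1440)        # 1440p
uhd = pixels(3840, 2160)        # native 4K

print(double_hd / full_hd)  # how many 1080p frames fit in 3840x1080
print(double_hd / qhd)      # 3840x1080 relative to 1440p
# Assumption: checkerboard 4K shades roughly half of the native 4K samples.
print((uhd // 2) == double_hd)
```

So 3840x1080 is exactly twice 1080p and 12.5% more pixels than 1440p, which matches the "just slightly more pixels than 1440p" remark.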
 
The actual figure once you remove the 20GB/s accessible by the CPU (which I think is fair to remove in full, since we're already aware of the additional memory contention issues the PS4 has on account of its UMA) comes in at 197.6GB/s for graphics, since the PS4 Pro's total bandwidth is 217.6GB/s. So the RX480 can be argued to have about 30% more bandwidth than the PS4P.
You're missing the extra 1GB of DDR3 in the Pro for apps (and some OS routines assuming they're smart enough about it), which should be at least 6GB/s (1600MT/s using 32bit). That's a bandwidth that will not be required to take from the GDDR5 pie, so if you're taking out the full 20GB/s limit from the Jaguars, then it'd be equally fair to put those 6GB/s back. So 217.6 - 20 + 6 = 203.6. 256/203.6 = 1.257.
I think we should stick to the 25% figure ;)
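The two competing figures (30% versus 25%) come from the same arithmetic with one term added. A minimal sketch, using only the numbers quoted in these posts (the constant names are made up for illustration):

```python
RX480_GBPS = 256.0       # RX 480 8GB, as quoted in the thread
PRO_GDDR5_GBPS = 217.6   # PS4 Pro GDDR5 total, as quoted in the thread
CPU_SHARE_GBPS = 20.0    # upper bound on what the Jaguar cores can pull
DDR3_GBPS = 6.0          # extra DDR3 pool, lower estimate from the post above

def advantage(rx480, pro):
    """Fractional bandwidth advantage of the RX 480 over the Pro."""
    return rx480 / pro - 1.0

without_ddr3 = advantage(RX480_GBPS, PRO_GDDR5_GBPS - CPU_SHARE_GBPS)
with_ddr3 = advantage(RX480_GBPS, PRO_GDDR5_GBPS - CPU_SHARE_GBPS + DDR3_GBPS)

print(f"CPU share removed:            {without_ddr3:.1%}")
print(f"CPU share removed, DDR3 back: {with_ddr3:.1%}")
```

The first case gives just under 30% and the second just under 26%, so the disagreement is entirely about whether the DDR3 pool relieves the GDDR5 during gameplay.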


The 290/390 also have a lot more memory bandwidth than the 480. And memory bandwidth is also a huge influencing factor on high resolution performance.
Raw memory bandwidth yes, effective memory bandwidth not that much. AMD claims 35% bandwidth savings in Polaris due to delta color compression. The 290 has 320GB/s and the 390 uses 6GT/s memory for 384 GB/s. The RX 480 is 256 GB/s, and 256*1.35 = 345.6, which puts them in a similar position.
Synthetic benchmarks actually support this:

[Images: 6m3ex8.gif, OVUKzc.png]

Source 1, 2
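The raw versus effective bandwidth comparison in this post can be sketched as follows. The bus widths and the 5 GT/s and 8 GT/s data rates are assumptions added here (the post only gives the resulting GB/s figures and the 390's 6 GT/s), and the 35% figure is AMD's claim as quoted above, treated as a best case.

```python
def peak_gbps(bus_bits, gigatransfers_per_s):
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * gigatransfers_per_s

# Assumed memory configurations (bus width in bits, data rate in GT/s).
r9_290 = peak_gbps(512, 5.0)
r9_390 = peak_gbps(512, 6.0)
rx_480 = peak_gbps(256, 8.0)

DCC_GAIN = 1.35  # best-case delta color compression benefit claimed for Polaris
rx_480_effective = rx_480 * DCC_GAIN

print(r9_290, r9_390, rx_480, round(rx_480_effective, 1))
```

With the full 35% applied, the RX 480 lands between the 290 and the 390, which is the "similar position" described above; any smaller real-world gain moves it back toward its raw 256 GB/s.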


And of course that's assuming the PS4P even has 64 ROPS to begin with.
Yes, that is the initial premise of the hypothesis being discussed. That and taking advantage of FP16 for pixel shaders (not all), as suggested by several developers.
 
You're missing the extra 1GB of DDR3 in the Pro for apps (and some OS routines assuming they're smart enough about it), which should be at least 6GB/s (1600MT/s using 32bit). That's a bandwidth that will not be required to take from the GDDR5 pie, so if you're taking out the full 20GB/s limit from the Jaguars, then it'd be equally fair to put those 6GB/s back. So 217.6 - 20 + 6 = 203.6. 256/203.6 = 1.257.
I think we should stick to the 25% figure ;)

That assumes the DDR3 is available for the CPU to access during gameplay, which from what I've read is not the case. That memory is used for storing open applications for quick access while a game is running, so there's no reason why the CPU would be accessing it (especially utilising its full bandwidth) while also running a game. And that's completely ignoring this old gem:

http://static1.gamespot.com/uploads/scale_super/823/8237367/2935986-5333225429-PS4-G.jpg

So I'd say the 30% figure is actually pretty generous. Either way though, whether it's 25%, 30% or more, the RX480 clearly has a healthy chunk of additional bandwidth over the PS4P on top of its other advantages, so I still don't see how the PS4P could ever be reasonably considered head and shoulders above it based on some likely rarely used FP16 advantage and the slim possibility of a ROP advantage.

Raw memory bandwidth yes, effective memory bandwidth not that much. AMD claims 35% bandwidth savings in Polaris due to delta color compression. The 290 has 320GB/s and the 390 uses 6GT/s memory for 384 GB/s. The RX 480 is 256 GB/s, and 256*1.35 = 345.6, which puts them in a similar position.
Synthetic benchmarks actually support this:

[Images: 6m3ex8.gif, OVUKzc.png]

Source 1, 2

35% is going to be a best-case scenario, which indeed those synthetic benchmarks do support. I.e. even with a single black texture the RX 480 can only just match the throughput of the R9 290X, despite having in theory more effective bandwidth using the 35% figure. Using the much more realistic random texture example, it's nowhere near the R9 290X's throughput. How much lower would it be in an actual game?

Also, it's worth bearing in mind that the PS4P isn't targeting a crazy high resolution. It's only 2x 1080p or a little more than 1440p. At 1440p the RX480 is about the same speed as the 390 which actually puts it a little faster than the 290:

https://www.techpowerup.com/reviews/MSI/GTX_1050_Gaming_X/27.html
 
Also, it's worth bearing in mind that the PS4P isn't targeting a crazy high resolution. It's only 2x 1080p or a little more than 1440p. At 1440p the RX480 is about the same speed as the 390 which actually puts it a little faster than the 290:

https://www.techpowerup.com/reviews/MSI/GTX_1050_Gaming_X/27.html
Considering the most important aspect of the Pro's rendering efficiency comes from new techniques enabled by the ID buffer, is it still applicable to assume 4K checkerboard is hitting the ROPs in a similar way as a 1440p render on older architectures?
 
“But at the same time I’m going back and changing our last Ratchet & Clank to also supporting HDR and 4K output and, so far, what I’ve found is because it wasn’t authored with 4K in mind the addition of HDR… has a lot more punch than the improved resolution.”

I DIE. R&C in HDR.
DEAD.
 
As the PS4 Pro apparently now supports SATA III, would a hybrid SSHD still give a noticeable boost in load times (and streaming in games?), or would a 7200 rpm drive be a better option?
Or does SATA III give any performance boost to the standard 5400 rpm HDD?
 
SATA3 wouldn't improve seek or sustained transfer on mechanical HDDs.
 
As the PS4 Pro apparently now supports SATA III, would a hybrid SSHD still give a noticeable boost in load times (and streaming in games?), or would a 7200 rpm drive be a better option?
Or does SATA III give any performance boost to the standard 5400 rpm HDD?
Again, and again and again ... I don't know why that myth is so strong ...
SATA 3 won't improve performance in any meaningful way. Even for an SSD, SATA 2 is more than enough. The interface only makes a difference for really high-performance SSDs, and then only in non-real-world benchmarks or if you copy gigabytes of data at once. Then it would make a little difference.
For everything else, SATA 2 is more than enough. You wouldn't even notice if the consoles had a SATA 1 interface, simply because the HDDs (at 5400 rpm) are not fast enough.
We also saw that the PS4 doesn't load that much faster if you put in an SSD. It is a bit better, yes, but there is another, not yet known, limiting factor for the consoles' drive performance.
It might just be a limitation of some chip that checks whether the data is valid and came from a secure place (copy protection).
The external drive on the Xbox One is faster simply because no OS is loaded from it (so it is game-exclusive) and because the drive itself is, most times, faster than the built-in HDD.
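The interface-versus-drive argument in this post can be illustrated with a small sketch. The link rates are the usual usable figures for each SATA generation; the 100 MB/s sustained rate for a 5400 rpm laptop drive is purely an illustrative assumption, not a measurement from the thread.

```python
# Usable link throughput per SATA generation in MB/s (after 8b/10b encoding).
SATA_LINK_MBPS = {"SATA 1": 150, "SATA 2": 300, "SATA 3": 600}

# Hypothetical sustained transfer rate of a 5400 rpm HDD (assumption).
HDD_MBPS = 100

def effective_mbps(link_mbps, drive_mbps):
    """Sequential throughput is capped by the slower of link and drive."""
    return min(link_mbps, drive_mbps)

results = {gen: effective_mbps(link, HDD_MBPS) for gen, link in SATA_LINK_MBPS.items()}
for gen, mbps in results.items():
    print(f"{gen}: {mbps} MB/s")
```

With a drive this slow the result is the same on every interface generation, which is the point being made about mechanical HDDs; seek time is not modelled here at all.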
 
So what exactly is the equivalent of the PS4P, all said and done? A slightly higher clocked 380X? Or will it be GTX 970 level?
 
No-one can answer that. Depends how it's used and how well the custom features work. It'll definitely punch above its weight regards what's on screen, but no-one can predict by how much.
 
So what exactly is the equivalent of PS4P all said and done? A slightly higher clocked 380 X? Or will it be 970 gtx level?
Would probably vary wildly between games, and be very dependent on the optimizations exclusively available on PS4P. These are not minor additions.

If, say, half the shaders can be switched to FP16, this is 1.5x faster than the raw TF suggests (if compute bound). It can also be up to 2x faster with their unique checkerboard rendering and whatever the ID buffer allows. It all breaks down if the game is memory bound, or CPU bound. Memory bandwidth looks like an important limitation.

So on which criteria is something "equivalent"?
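To put rough numbers on the FP16 estimate above, here is a simple Amdahl-style sketch. It assumes a purely compute-bound workload and that FP16 work runs at exactly twice the FP32 rate; how "half the shaders" is counted changes the answer, so both readings are shown.

```python
def fp16_speedup(fraction_fp16, fp16_rate=2.0):
    """Speedup over all-FP32 execution when `fraction_fp16` of the original
    FP32 shader time is converted to FP16 running `fp16_rate` times faster."""
    return 1.0 / ((1.0 - fraction_fp16) + fraction_fp16 / fp16_rate)

half_of_original_time = fp16_speedup(0.5)       # half of the FP32 time converted
two_thirds_of_original = fp16_speedup(2.0 / 3)  # equal FP16/FP32 time after conversion

print(round(half_of_original_time, 2))
print(round(two_thirds_of_original, 2))
```

If half of the original shader time moves to FP16, this model gives about 1.33x; the 1.5x figure corresponds to about two thirds of the original work being converted, so that FP16 and FP32 each take half of the new frame time.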
 
It can also be up to 2x faster with their unique checkerboard rendering and whatever the ID buffer allows.

That doesn't sound realistic. Sebbbi and others have already said that checkerboard rendering is possible on any DX10 level architecture, so the 2x performance increase that it allows (it's not really 2x since it's not producing as high a quality output as native resolution but we'll call it 2x for the sake of argument) is not tied to the availability of the ID buffer.

The interesting question is what does the ID buffer allow in terms of performance or quality over the already available techniques that don't require it. Speaking purely for checkerboard of course. I get that it may allow new techniques in future that haven't even been thought up yet.
 
At this point we have to introduce a subjective quality factor. If it looks the same on screen, or close enough that no-one can really tell the difference, then it's 'equivalent'. And of course other hardware being compared to could use the same/similar techniques and get improved performance. But just as MP3 and JPG use lossy methods that work within the limits of human perception, similar rendering techniques can produce the 'same' results by taking clever shortcuts.
 
That doesn't sound realistic. Sebbbi and others have already said that checkerboard rendering is possible on any DX10 level architecture, so the 2x performance increase that it allows (it's not really 2x since it's not producing as high a quality output as native resolution but we'll call it 2x for the sake of argument) is not tied to the availability of the ID buffer.

The interesting question is what does the ID buffer allow in terms of performance or quality over the already available techniques that don't require it. Speaking purely for checkerboard of course. I get that it may allow new techniques in future that haven't even been thought up yet.
Up to. I literally meant somewhere between 0 and 2. There might be some games which have zero artifacts and others with a huge amount of visual issues. We don't really know yet, the way some devs talked about it makes me think it will vary that much.

I could have misunderstood, but from Cerny's statements, the ID buffer allows a much better algorithm than was previously possible.

Subjectively, Eurogamer said it's indistinguishable from native 4K, 4 feet from a 60" TV. Months ago they built an "equivalent" PC and said it would never run anything close to 4K; they have since gone back on that and said the PS4P does, in person, produce a convincingly 4K image. That's enough to think the ID buffer is probably a significant advantage.
 
At this point we have to introduce a subjective quality factor. If it looks the same on screen, or close enough that no-one can really tell the difference, then it's 'equivalent'. And of course other hardware being compared to could use the same/similar techniques and get improved performance. But just as MP3 and JPG use lossy methods that work within the limits of human perception, similar rendering techniques can produce the 'same' results by taking clever shortcuts.

I don't disagree. If checkerboard really is as close to native 4K as has been claimed then rendering natively at 4K seems like a crazy thing to do.

Subjectively, Eurogamer said it's indistinguishable from native 4K, 4 feet from a 60" TV. Months ago they built an "equivalent" PC and said it would never run anything close to 4K; they have since gone back on that and said the PS4P does, in person, produce a convincingly 4K image. That's enough to think the ID buffer is probably a significant advantage.

But you're comparing apples with oranges there. Eurogamer's claim was based on native rendering. Clearly, that is beyond the PS4P's capability in cutting-edge games. But those same "equivalent" PCs that they built to represent the PS4P should also be entirely capable of producing a checkerboard-based 4K output if developers choose to implement it (which is another question entirely).

The real question is what quality or performance disadvantages would they suffer on account of not having the ID buffer available.
 
Digital Foundry said in their Skyrim Remastered article over the weekend that Skyrim on the PS4P will render natively in 4K.
 