PS4 Pro Official Specifications (Codename NEO)

Status
Not open for further replies.
On GCN 3 & 4 the only FP16 support in the shader cores is for storing FP16 data as FP16, rather than having to promote it to FP32 (i.e. it halves register pressure). GCN 3/4 don't have any fast FP16 math modes; FP16 operations are processed at the same speed as FP32 math.
Just based on your article on Anandtech, isn't that the same issue with GP104 cards as well (unless I read it wrong)? That would make GP100 the only one we know of so far with 2:1 FP16?
 
That's true for GCN3/4, but Eurogamer's article clearly claims the Pro's GPU is capable of doing FP16 operations at twice the speed:

As we understand it, with the new enhancements, it's possible to complete two 16-bit floating point operations in the time taken to complete one on the base PS4 hardware.

If true, what I wonder is why Sony didn't bother to claim 8.4 TFLOPs FP16, or why Cerny didn't mention it during the presentation. It would seem to me like such an important feature to brag about.
Of course, with the feature being apparently so distant from Liverpool, one has to wonder how much it will be used. I gather there's a large difference in development effort between rendering at a higher resolution + applying checkerboard upscale and choosing + modifying the shaders that can be done using FP16.
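
For reference, the arithmetic behind these headline figures can be sketched roughly as below. The 36 CUs, 64 lanes per CU and 911 MHz clock are the commonly cited Pro figures rather than anything stated in this thread, and `peak_tflops` is just an illustrative name. The FP16 number only holds if the double-rate claim is true.

```python
def peak_tflops(compute_units, clock_mhz, lanes_per_cu=64, ops_per_lane=2, fp16_rate=1):
    """Theoretical peak in TFLOPs; a fused multiply-add counts as 2 ops per lane per clock."""
    flops = compute_units * lanes_per_cu * ops_per_lane * clock_mhz * 1e6 * fp16_rate
    return flops / 1e12

# Assumed figures: 36 CUs at 911 MHz (not taken from this thread).
fp32 = peak_tflops(36, 911)
# Hypothetical: every FP16 op issues at twice the FP32 rate.
fp16 = peak_tflops(36, 911, fp16_rate=2)

print(f"FP32 peak: {fp32:.2f} TFLOPs")
print(f"FP16 peak (if double rate): {fp16:.2f} TFLOPs")
```

These are theoretical peaks only; they say nothing about how much shader work can actually tolerate FP16.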
 
If it's a Vega feature, maybe they must wait for AMD to announce Vega officially before they can talk about it.

Intelligent journalists will ask them if it's a custom Sony addition or from Vega; they can't just say "no comment".

... or maybe the dev misspoke and it's not even in the Pro. "as we understand it" is not a 100% convincing comment.
 
I'd like to see a direct quote from the developer on there. Misunderstandings are possible too.
 
How can one misunderstand something and write this instead?
it's possible to complete two 16-bit floating point operations in the time taken to complete one on the base PS4 hardware.
If so, what could have been lost in translation? "2 something instead of one on PS4". Seems pretty straightforward.
 
Packing two FP16 vs double rate.
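
To make that distinction concrete, here is a minimal sketch using only the Python standard library. `pack_half2`, `unpack_half2` and `packed_add` are hypothetical helper names, not any real GPU API. Packing means two FP16 values share one 32-bit register, which saves space on its own; double rate would additionally mean a single instruction operates on both halves at once, which `packed_add` only emulates.

```python
import struct

def pack_half2(a, b):
    """Store two FP16 values in one 32-bit word (a in the low half, b in the high half)."""
    lo, = struct.unpack("<H", struct.pack("<e", a))
    hi, = struct.unpack("<H", struct.pack("<e", b))
    return (hi << 16) | lo

def unpack_half2(word):
    """Recover the two FP16 values from a packed 32-bit word."""
    lo, = struct.unpack("<e", struct.pack("<H", word & 0xFFFF))
    hi, = struct.unpack("<e", struct.pack("<H", (word >> 16) & 0xFFFF))
    return lo, hi

def packed_add(w1, w2):
    """Emulate a double-rate op: one call yields two FP16 sums."""
    a0, a1 = unpack_half2(w1)
    b0, b1 = unpack_half2(w2)
    return pack_half2(a0 + b0, a1 + b1)

word = pack_half2(1.5, -2.0)
print(hex(word))
print(unpack_half2(packed_add(word, pack_half2(0.5, 1.0))))
```

Hardware that only supports the packing part gets the register savings but still issues one FP16 operation per lane per clock, the same as FP32.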

Given the discussion over the last page, surely, you'd rather have a direct quote than the interpretative understanding.

Either way, what's the harm in wanting the direct quote? :rolleyes:
 
"as we understand it" means the writer isn't quoting the dev directly, and might be using their own interpretation and speculation (or this would be the first dev to spill this, while all other devs so far respected the NDA until Cerny talks about it). There's been a huge amount of speculation recently about doubled FP16. A lot of people (including myself :oops:) completely misunderstood the supposed FP16 support from Polaris.
 
Oh.. so it's probably a developers-only thing without press, so even if he mentions it, that doesn't mean it'll come out to the public.

We'll have to wait until a decent tech journalist gets the opportunity to ask the question directly to someone who knows it for sure, then.
 
Yeah, but that part of the interview was about the usual "sorry we can't talk about it yet". NDAs have always implicitly meant that devs can only talk about what has already been said publicly, so anything Cerny says in that seminar will be public. Hence why he mentioned it; I suppose everyone will then be allowed to talk about this freely.
 
Obvious question of course:
- Mr Cerny.. can you confirm that the PS4 Pro GPU has a peak performance of 8.4 TFLOPs using FP16 ops?
- Mr Cerny... how does this relate to your competitor friends in Redmond, who quote 6 TFLOPs... does this mean we'll get 12 TFLOPs from them?
(assuming you guys use the same secret sauce... ?)

Stupid question... if I "massage" my shader code using a mixture of FP16 and FP32, can I get to 6 TFLOPs on a PS4 Pro for an optimized game? So in other words, no loss of IQ?

Even more stupid question... are we sure Phil Spencer quoted apples and not oranges when he talked about 6 TF ?
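
On the mixed FP16/FP32 question, a back-of-the-envelope estimate is possible if one assumes FP16 ops run at exactly twice the FP32 rate and that the shader is purely ALU-bound. Both are assumptions, as is the 4.2 TFLOPs FP32 peak, and the function names are made up for illustration. It is an Amdahl-style calculation: only the FP16 share of the work gets faster.

```python
def effective_tflops(fp32_peak, fp16_fraction, fp16_speedup=2.0):
    """Effective throughput when only a fraction of the ALU work runs at FP16 speed."""
    relative_time = (1.0 - fp16_fraction) + fp16_fraction / fp16_speedup
    return fp32_peak / relative_time

def fraction_needed(fp32_peak, target, fp16_speedup=2.0):
    """Share of ALU work that must be FP16 to reach `target` effective TFLOPs."""
    return (1.0 - fp32_peak / target) / (1.0 - 1.0 / fp16_speedup)

# Assumed FP32 peak of 4.2 TFLOPs; target of 6 taken from the question above.
needed = fraction_needed(4.2, 6.0)
print(f"FP16 share needed for 6 TFLOPs: {needed:.0%}")
print(f"Check: {effective_tflops(4.2, needed):.2f} TFLOPs")
```

Under those assumptions, roughly 60% of the ALU work would have to be FP16 to reach an effective 6 TFLOPs. Whether that much of a real shader can drop precision without hurting image quality is a separate question.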
 
Nothing like that. Devs will ask what the state of FP16 issue and execution is on PS4P and they'll get a decent answer about when it works, when it doesn't, BW concerns, etc., and no silly 8 'teraflops' marketing figure.
 
This isn't as revolutionary as a couple of you guys think. There was a time when all 3dfx accelerator PCI cards could only process FP16.

Here is a comparison of ALU processing and blending color at FP32 (left) vs FP16 (right).
[Image: PowerVRGPU_PowerVR_SGX_RK3168_RGBA888-vs-RGB565-1.jpg]


When you start using 16-bit precision in various aspects of your visuals you are limited to certain things, otherwise you'll end up with visuals reminiscent of early 2000s games. You probably don't want to do too much shading at 16-bit precision.
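
The precision limits being described can be seen with a small round-trip through FP16. This is a sketch using Python's `struct` module (the `"e"` format is IEEE half precision); `to_fp16` is a hypothetical helper name, and the sample values are arbitrary.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Between 4096 and 8192, neighbouring FP16 values are 4 apart,
# so large coordinates or accumulators lose whole units.
print(to_fp16(4097.0))

# Small increments near 1.0 vanish: FP16 has only 10 explicit mantissa bits.
print(to_fp16(1.0 + 2**-12) == 1.0)

# Everyday decimals pick up a visible rounding error.
print(abs(to_fp16(0.1) - 0.1))
```

That is fine for many colour and normal-vector operations but quickly becomes a problem for positions, depth, or long accumulation chains.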
 
Is there a compliant LogLuv/NAO32-type format that would have the same memory footprint as FP16 and could get decent results?
 