DemoCoder,
Short response: read my post again.
...
That isn't usually successful, so I'll explain in more detail:
demalion said:
The NV40 was 1.32 full precision, and 1.96 partial precision, I presume due to extracting normalization for its special functionality. Based on that assumption: for partial precision performance outside of the assumed normalization functionality, calculated by counting normalization as 1 extracted instruction instead of however many assembly instructions, indicates an IPC of about 1.27.
DemoCoder said:
Why would you extract free norm vs any of the other "special functions" like POW, LIT, SINCOS, etc?
Well...I targeted partial precision normalization specifically because...the NV40 is stated to have "free" partial precision normalization. It seems clear that discounting fp16 normalization as a factor is an attempt to isolate fp16 IPC aside from fp16 normalization. I labelled that attempted data, and included it...that's pretty much the story.
I'm confused as to why you are asking about POW, LIT, SINCOS when I explicitly stated I was isolating partial precision normalization.
Further explanation, if that isn't clear...
DemoCoder said:
If you just want to measure vector IPC, you also need to correct for special functions of ATI's scalar ALUs as well. That doesn't seem to be a fair comparison.
Hmm? I only tried to isolate fp16 normalization specifically, as I tried to make clear by...mentioning that straight out.
I don't see how the reason is mysterious at all, nor do I think the 1.96 and my explanation of what it seems to indicate were obscure enough for you to make the issue of this that you do.
Stating what I thought was obvious: 1.96 is a higher IPC than anything else listed, and my stated presumption was that it was due to the NV40's partial precision normalization showing benefit. I didn't find an IPC benefit surprising, or feel a need to point out that 1.96 was greater than various other numbers, as it seems obvious to me that the normalization functionality is a boon when it can be used and a higher IPC goes along with that idea.
I listed the IPC without adjustment first, prominently, and with an explanation of what I think primarily accounted for the jump for partial precision in comparison to full precision.
I then made sure to provide the other _pp figure only after an explanation of exactly what it discounted, to give some indication of partial precision IPC outside of this beneficial partial precision normalization feature for context (and annotated it accordingly: "for partial precision performance outside of the assumed normalization functionality...").
Did this really need to be explained again?
...
Perhaps if someone didn't read anything but "1.27", there might be some unfairness in what that person took away from the figures. As it stands, I only see another piece of data for analysis, listed after other data, with an explanation of what each piece of data was.
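To make the adjustment itself concrete, here is a minimal Python sketch of the arithmetic. The function names and the instruction, cycle, and normalize counts are made up purely for illustration (they are not the counts behind the 1.96 or 1.27 figures), and the default of 3 assembly instructions per normalization follows the figure quoted in this thread.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Plain instructions-per-clock: every assembly instruction counts as 1."""
    return instructions / cycles


def ipc_discounting_normalize(instructions: int, cycles: int,
                              normalizes: int,
                              instrs_per_normalize: int = 3) -> float:
    """IPC with each normalization counted as 1 instruction.

    Assumption: a normalization expands to `instrs_per_normalize`
    assembly instructions, so each one contributes
    (instrs_per_normalize - 1) extra instructions to the raw count.
    """
    adjusted = instructions - normalizes * (instrs_per_normalize - 1)
    return adjusted / cycles


# Hypothetical shader: 20 assembly instructions, 3 normalizations, 10 cycles.
raw = ipc(20, 10)
discounted = ipc_discounting_normalize(20, 10, normalizes=3)
print(f"raw IPC: {raw:.2f}, normalization discounted: {discounted:.2f}")
```

The cycle count stays the same in both cases; only the instruction count credited to normalization changes, which is why the discounted figure always comes out at or below the raw one.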
DemoCoder said:
Normalization takes 3 instructions. This lower IPC, it would raise it.
Yes, 1.96 is indeed higher than the other IPC numbers I mentioned. Also, 1.27 is fairly close to 1.32. Things about the NV40 could be proposed if it were the topic under discussion.
...
However, my purpose here was limited to a context for the R420 evolution from the R3xx.