Recent content by rikrak

Variable Rate Shading vs. Variable Rate Rasterization

Thanks, I have learned something today!
- rikrak
- Post #10
- Feb 8, 2021
- Forum: Rendering Technology and APIs
Variable Rate Shading vs. Variable Rate Rasterization

Thanks, this really clears things up. So in retrospect, VRS's main purpose is to improve the utilization of shader resources, as it allows you to adaptively and independently set the shading rate for individual portions of the render target. It does not affect rasterization or the layout of the...
- rikrak
- Post #8
- Feb 8, 2021
- Forum: Rendering Technology and APIs
Apple (PowerVR) TBDR GPU-architecture speculation thread

Ah, now I understand what you mean. Yes, Metal argument buffers are typed aggregate objects and require more precise type declaration of their components. Vulkan/DX12 use weaker type bindings, possibly to support a wider hardware range and more flexible descriptor table juggling. There is indeed...
- rikrak
- Post #169
- Jan 23, 2021
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

Forgive my confusion, but how can you create a texture with an "unknown" type? What does it even mean? Edit: I looked up DXGI_FORMAT_UNKNOWN. It appears to be just a badly chosen name for something like "choose a default format for me". It still chooses a concrete format according to a...
- rikrak
- Post #167
- Jan 21, 2021
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

Well, duh, if you want to bind (encode) a resource, you have to create the resource first. And to create a resource, you need to know its size and type. I don't understand why you would call it a limitation; it sounds to me like a logical way to design a binding API. Is there any API out there...
- rikrak
- Post #165
- Jan 21, 2021
- Forum: Architecture and Products
Variable Rate Shading vs. Variable Rate Rasterization

Could any of you knowledgeable people offer a more detailed explanation of the difference between Variable Rate Shading (as used in DX12 and Vulkan) and Variable Rate Rasterization (as used by Apple in Metal)? If I understand it correctly, VRS simply allows the fragment/pixel shader to run at a...
- rikrak
- Thread
- Jan 21, 2021
- Replies: 9
- Forum: Rendering Technology and APIs
Apple (PowerVR) TBDR GPU-architecture speculation thread

That's news to me. Where did you see that? Multisampled textures do have a different data type from regular textures, I would speculate because their layout is hardware-dependent. Same for depth textures. But that's about it?
- rikrak
- Post #162
- Jan 21, 2021
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

A complete amateur opinion here, but maybe they are referring to the fact that transformed triangle data must be collected prior to rasterization, so a large number of very small triangles will quickly fill up the buffers, causing premature flushes and additional memory operations? TBDR only really...
- rikrak
- Post #157
- Dec 31, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

I have the feeling that we already had this very conversation a few pages back? Tiling certainly adds per-primitive cost to the process, but this cost should be proportional to the number of tiles a primitive intersects. Small primitives should actually be the cheapest. Not that this...
- rikrak
- Post #152
- Dec 29, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

This has been confirmed by the Apple GPU driver team leader on Twitter. I have also run some benchmarks (look in the posts above) that show FMA throughput on M1 and A14 GPUs. I would guess mainly because it's a very intricate engineering puzzle. Getting the deferred rendering behavior in...
- rikrak
- Post #144
- Dec 28, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

Now I am very confused. If the GPU cores are basically identical, how has Apple managed to double the FP32 rate on M1 relative to A14? Is this an artificial limitation on A14?
- rikrak
- Post #138
- Dec 27, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

Depends on what you understand by "microarchitecture". They have an identical feature set, yes, but there should be little doubt that their ALUs are physically different (different FP32 compute throughput, different size on die). What's even more interesting is that Metal RT works across all...
- rikrak
- Post #136
- Dec 27, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

For full disclosure, I have no idea how these things can be implemented in hardware. It was pointed out (https://www.realworldtech.com/forum/?threadid=197759&curpostid=197993) that fusing two FP16 ALUs to perform an FP32 operation or splitting a single FP32 ALU to perform two FP16 operations per...
- rikrak
- Post #133
- Dec 11, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

Update on this. New benchmark results are up, including A14 results (iPhone 12): https://www.realworldtech.com/forum/?threadid=197759&curpostid=197985 To summarise: normalised per GPU core, A14 has exactly half the FP32 throughput of the M1, while their FP16 throughput is identical. My...
- rikrak
- Post #130
- Dec 9, 2020
- Forum: Architecture and Products
Apple (PowerVR) TBDR GPU-architecture speculation thread

I completely agree with you that there are sizable benefits to having fast FP16 operations even on desktop GPUs. This is definitely not something I am debating. I am just pointing out that M1 appears to have identical throughput for both FP32 and FP16 operations, which I hypothesize is due to...
- rikrak
- Post #128
- Dec 7, 2020
- Forum: Architecture and Products