Recent content by nAo

GTC 2024

Damn, I wish I was down in CA! Enjoy GTC :-)
- nAo
- Post #6
- Mar 18, 2024
- Forum: Graphics and Semiconductor Industry
AMD Execution Thread [2023]

Both TensorRT and TensorRT-LLM are open source: https://github.com/NVIDIA/TensorRT https://github.com/NVIDIA/TensorRT-LLM
- nAo
- Post #835
- Dec 16, 2023
- Forum: Graphics and Semiconductor Industry
AMD Execution Thread [2023]

How big is MI300?
- nAo
- Post #72
- May 10, 2023
- Forum: Graphics and Semiconductor Industry
AMD made a mistake letting NVIDIA drive Ray Tracing APIs

The idea of opening up the BVH formats is appealing on paper, especially for very simple HW implementations, but there is a huge hidden cost in doing so. Once it's out, like with an ISA, you have to support it forever, on all past, present and future HW. This makes even less sense for something...
- nAo
- Post #69
- Apr 19, 2023
- Forum: Rendering Technology and APIs
AMD made a mistake letting NVIDIA drive Ray Tracing APIs

How would exposing a ray-triangle intersection instruction help accelerate Nanite? It would affect a fraction of some code that already is a relatively small fraction of the total frame time. Doesn't sound like that big of a deal. Moreover, would it really accelerate anything when it would...
- nAo
- Post #68
- Apr 19, 2023
- Forum: Rendering Technology and APIs
GART: Games and Applications using RayTracing

DMMs are not based on this paper.
- nAo
- Post #1,145
- Dec 27, 2022
- Forum: Rendering Technology and APIs
AMD RX 7900XTX and RX 7900XT Reviews

The fact it can achieve very high clocks on irregular workloads (Blender?) where the CUs might be stalling a lot (i.e. not consuming as much power) could suggest that it is using more power than expected. OTOH on gaming workloads clocks are lower because it's much better utilized and it's more...
- nAo
- Post #250
- Dec 19, 2022
- Forum: Architecture and Products
AMD RX 7900XTX and RX 7900XT Reviews

It takes significant architectural effort to add 30% more cores and get 30% more performance. It ain't easy :)
- nAo
- Post #152
- Dec 13, 2022
- Forum: Architecture and Products
AMD RDNA3 Specifications Discussion Thread

Are there any numbers out there for the power overhead due to going with 6 chiplets for I/O?
- nAo
- Post #929
- Dec 13, 2022
- Forum: Architecture and Products
GPU Ray Tracing Performance Comparisons [2021-2022]

The SER API is public, you can download the SER SDK and use it right away: https://developer.nvidia.com/blog/improve-shader-performance-and-in-game-frame-rates-with-shader-execution-reordering/ SER in-depth whitepaper...
- nAo
- Post #2,127
- Dec 12, 2022
- Forum: Architecture and Products
RDNA 2 Ray Tracing

Not all implementations are the same :)
- nAo
- Post #63
- Nov 1, 2022
- Forum: Architecture and Products
RDNA 2 Ray Tracing

I wouldn't call a 16-entry stack a short stack. There are a few papers out there showing you can get great perf with fewer than 7-8 entries, for instance: https://www.embree.org/papers/2019-HPG-ShortStack.pdf
- nAo
- Post #61
- Nov 1, 2022
- Forum: Architecture and Products
AMD: RDNA 3 Speculation, Rumours and Discussion

You don't go very far with pure brute force RT as the computational and bandwidth costs would be insanely high :)
- nAo
- Post #2,170
- Nov 1, 2022
- Forum: Architecture and Products
AMD: RDNA 3 Speculation, Rumours and Discussion

No, they don't. Whether upscaling is used or not, the tensor cores also predict how to best fuse together the output of the optical flow generator with the optical flow/motion vectors coming from the application. It's well described here...
- nAo
- Post #2,169
- Nov 1, 2022
- Forum: Architecture and Products
Polygons, voxels, SDFs... what will our geometry be made of in the future?

I am not sure I follow you here, but intersecting a triangle or an AABB should not be any worse, latency-wise, than issuing a texture sampler instruction. AFAIR RDNA2 sends the ray and AABB/triangle data to the HW intersectors, so it's not like the latter have to fetch anything from memory...
- nAo
- Post #301
- Oct 25, 2022
- Forum: Rendering Technology and APIs