Next gen lighting technologies - voxelised, traced, and everything else *spawn*

I'm a bit worried after seeing these benchmarks: https://www.anandtech.com/show/13923/the-amd-radeon-vii-review/15
Prev gen seems overall faster in compute, except in Geekbench. A sign of compute stagnation?
How are you interpreting these benchmarks? Radeon VII is the same transistor count as RTX 2080 and performs better and worse given different workloads. Previous gen is typically slower, although I don't know about performance per transistor. That's actually an important thing to consider - it's not just raw performance but performance per dollar per W, with gaming workloads needing to be well balanced (certainly for consoles).
 
How are you interpreting these benchmarks?
At best you can make comparisons within the same vendor, if at all.
But neither this, nor transistor count, nor GFLOPS allows comparing NV vs. AMD.
The only way to do it is to do it yourself, and you have to be something of a game developer to optimize properly for both. I doubt the devs of these benchmarks care - I would not.

Radeon VII is the same transistor count as RTX 2080 and performs better and worse given different workloads.
AMD compute performance is very easy to predict and always comes out as expected: just take core count times frequency and you know (rough sketch below). (VII is different because its bandwidth is out of the norm.)
NV is a lottery. You never know if you get an overall increase, and if it does not become slower then I'm more than happy. This is only about compute. They constantly improve rasterization performance, but I can't say much about that (see game benchmarks).
Also I need to mention again that AMD compute performance is - in my experience, excluding the newest GPUs - better than NV's. Much better, which is what I don't see in the above benchmarks.
GTX 10xx did surprisingly well for me (Fury X only 1.66x faster than GTX 1070).
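To be concrete about that back-of-the-envelope estimate, here is a rough sketch (the clocks are ballpark boost clocks I picked just for illustration, and FMA is counted as two flops):

```cpp
// Peak FP32 throughput estimate: shader cores * clock (GHz) * 2 (FMA = 2 flops).
#include <cstdio>

static double PeakGflops(int shaderCores, double clockGhz)
{
    return shaderCores * clockGhz * 2.0;
}

int main()
{
    std::printf("Fury X  : ~%.0f GFLOPS\n", PeakGflops(4096, 1.05)); // ~8600
    std::printf("GTX 1070: ~%.0f GFLOPS\n", PeakGflops(1920, 1.68)); // ~6450
    return 0;
}
```

In my experience the ratio between two AMD cards in real compute workloads lands close to what this predicts; between NV cards (or across vendors) it tells you much less.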

with gaming workloads needing to be well balanced (certainly for consoles).
This is very difficult. Games mostly profit from rasterization performance, because they have not found a real use for compute yet. (They need a GDC talk just to tell them: "Hey, you can use compute to cull stuff! Awesome, isn't it?") But compute usage is constantly increasing.
So if I had to decide between building a console with an AMD APU or an NV chip with strong ARM cores, assuming a similar ratio of raster vs. compute, I would say: AMD's raster is not so great but it's good enough, while NV's compute is really slow, and that could prevent game quality from increasing over the years.

But of course I'm biased. While most people see AMD's offerings as boring, or even 'underwhelming', from my perspective they have had the performance lead since GCN came into existence, and they never struggled. (I work mainly on compute - of course I like them!)
Vega wasn't bad, although broken, and VII addresses the major bottleneck of our time, which is bandwidth. Maybe boring for marketing, but it's the fastest GPGPU you can get, I guess.
What they fail at is software and marketing. They have geometry stuff but do not expose it, while NV exposes mesh shaders on day one. Same for various extensions. They do not teach game devs what they could do with compute, while NV sends out an army with ray guns.
It's as if they did not care about selling their stuff.
Also they have no research, which is an area NV is really good at.

But in this case, maybe not having 10 years of raytracing experience is the best thing that could happen. Maybe they will leave it to the devs to a large degree, because that's where it belongs. :D

Radeon VII is the same transistor count as RTX 2080 and performs better and worse given different workloads.
Likely I got this question wrong.
The reason those benchmarks show confusing and inconsistent results is probably that they are very small tasks, so it's like rolling a die whether the GPU likes your code or not.
There are other benchmarks like LuxMark which are more serious, but even that is useless, because it does raytracing work without making it fast. Fast means not only more optimization work (which is where AMD benefits more - NV runs any crap pretty well), it also means exploiting the restrictions you can afford, but LuxMark cannot afford restrictions; it's an offline tool.
I don't know of any useful compute benchmark, so I know nothing about Turing compute performance.
 
Yes, I do not expect a classical raytracing implementation to be the kind of optimized code we use in games. Random memory access is a no-go, but NV is traditionally better at it. I'm also not surprised NV beats AMD in Radeon Rays, after seeing the source.
We can throw compute benchmarks against each other the whole day - useless and confusing results. It may well be that RTX ends up faster for me than VII, but I would be very surprised. Send me both and I'll tell you :D
 
... but I notice 20xx is twice as fast as 10xx in your benches! I like this :D

Edit: Notice it's NV CUDA vs. AMD OpenCL. Another reason you can't compare vendors here.
 
At best you can make comparisons within the same vendor, if at all.
But neither this, nor transistor count, nor GFLOPS allows comparing NV vs. AMD.
The only way to do it is to do it yourself, and you have to be something of a game developer to optimize properly for both. I doubt the devs of these benchmarks care - I would not.
By all means say that in your experience working with compute, AMD is better, but you can't cite a source and then ignore its data when making an argument. ;) Looking at these benchmarks, the ones you linked to, I see no data to suggest last-gen GPUs were faster at compute. It's unfair to NV to make such bold assertions, especially with your data showing the complete opposite - I'd classify those remarks as FUD.
 
By all means say that in your experience working with compute, AMD is better, but you can't cite a source and then ignore its data when making an argument. ;) Looking at these benchmarks, the ones you linked to, I see no data to suggest last-gen GPUs were faster at compute. It's unfair to NV to make such bold assertions, especially with your data showing the complete opposite - I'd classify those remarks as FUD.

I hear you, but I said nobody should draw conclusions from such benchmarks, also in the VII thread.
I admit I made contradictory assumptions about Turing compute performance: on one hand I have big hopes for the new parallel int/float execution, which should help me a lot (I assumed equal AMD vs. NV perf initially),
but then seeing Turing partially regress in some benchmarks made me worry.
That's just a personal feeling, and I should not have mentioned it, but David's comment 'Turing isn't lacking in compute at all' seems like a pure assumption too, or did he test it?

But there IS data showing the 1080 Ti is faster than the 2080 in the tests where last gen appears, except Geekbench as mentioned (I interpret that as an outlier). Link again: https://www.anandtech.com/show/13923/the-amd-radeon-vii-review/15
Please look again. Did you mix something up here?


Then, after all this, pharma posted:
Which looks a lot better, and I have acknowledged this. Believe me, I like to see this, because it makes my worries MUCH smaller. And I do have worries. RTX and tensor cores DO take away from everything else, you all know this.

At the same time I feel a need to defend myself, again, because I am the only guy here who seriously thinks AMD is faster, if only in compute, just as I'm the only guy here who seriously doubts the need for fixed-function RT hardware.


Regarding my personal preference for AMD compute performance over NV, I would not have mentioned it again if you had not asked for it:

Quote: "How are you interpreting these benchmarks?"

This is a question about my personal opinion, so accusing me of spreading FUD is not fair on your side.
Sounds like a bit of a misunderstanding - no problem in any case. But I can't help it that there are no proper compute benchmarks around.

What I do not understand here is this: why do you guys react so sensitively to someone saying directly and honestly that - and why - he prefers one vendor over another? Especially when many here take any chance to praise their chosen vendor and put the other down?
This here is like: go and find a bench where the good vendor beats the bad vendor, then present it as valid data, not FUD - valid data because it's on the internet, haha!
Seriously?
I can only repeat what I've learned myself over the recent 5 years: AMD is faster in compute, excluding the most recent GPUs which I don't know yet. I preferred NV before, but after testing GCN I changed my mind in one second. And it will stay this way until I see it the other way around. I'll let you know, promise.
Feel free to ignore this and trust benches more than me. Feel free to share your opposite experience. Feel free to confirm my findings. Whatever.
I also repeat what I've heard from other devs: NV is faster at rendering. And I believe this. It's just that I do not work on rendering, and I do not consider it a bottleneck, selfish as I am.

That's not bold, it's the personal opinion you asked for. And of course there is plenty of compute stuff that runs faster on NV - just not for me, not for a single one of dozens of shaders. I can't help it, and I won't lie just to feign the vendor neutrality that seems to be the gold standard here.

By the way, I think I'm more neutral than pretty much anybody else here!
...sigh
 
But there IS data showing the 1080 Ti is faster than the 2080 in the tests where last gen appears, except Geekbench as mentioned (I interpret that as an outlier). Link again: https://www.anandtech.com/show/13923/the-amd-radeon-vii-review/15
Please look again. Did you mix something up here?
1080 is faster than 2080 in only 1/8 of those charts that I'm seeing. Therefore, it's not faster overall. It's faster in some cases.

This is a question about my personal opinion, so accusing me of spreading FUD is not fair on your side.
Sounds like a bit of a misunderstanding - no problem in any case. But I can't help it that there are no proper compute benchmarks around.
The reason I say it sounds like FUD is because, had I not checked the links, I'd have taken your word for it and walked away thinking, "gosh, last-gen nVidia GPUs are better at compute than the new ones." It's fairly typical not to check every link and to trust them to be valid. Having checked your link, I see limited situations where the 1080 is faster, so overall your statement is plain wrong - the 1080 isn't faster at compute.

What I do not understand here is this: why do you guys react so sensitively to someone saying directly and honestly that - and why - he prefers one vendor over another? Especially when many here take any chance to praise their chosen vendor and put the other down?
I'm not reacting to anything of the sort. Vendor doesn't come into it, as I mentioned just a few posts above. My remark is about data and logical debate. Assertions that can be backed by facts should be, and if those facts don't show what the argument they are supposed to be supporting says, something's very wrong with the discussion. ;)

I can only repeat what I've learned myself over the recent 5 years: AMD is faster in compute, excluding the most recent GPUs which I don't know yet.
The 'FUD' here is that nVidia's last-gen GPUs are better at compute than their new GPUs. If true, no-one interested in compute performance should buy RTX and should instead buy 1080s. It's the kind of significant assertion that needs backing up.

By the way, I think I'm more neutral than pretty much anybody else here!
...sigh
I've been accused both of being biased in favour of AMD and of being biased in favour of nVidia in this one thread, so I think the neutrality crown goes to me because I seem to be on the wrong side of everyone. :p
 
It's still specifically about shooting rays through a specific kind of BVH and finding hits against polys.
In the world of different acceleration structures, scene representations, traversal types (etc.) we live in today, the flavor of classic ray tracing Nvidia chose to accelerate is too specific for modern graphics devs.
99.9% of games are triangle-based. It's the best choice for now.

I've never seen a 10x factor in practice, and that specific workload is not always what we need for games.
The goal is not to 'beat' the 2060 with a cheaper chip; the goal would have been to achieve similar quality and performance using algorithms tailored to the specific needs games have.
None of this is out of reach for consumers, or would have been. RT would have happened in any case, I'm sure of that - not the classical approach from the 80s, but something tailored to realtime requirements.
And I know exactly what this 'something' would look like for me - no bottleneck, no incoherent memory access - it would be blazingly fast.
The latter is the reason our perspectives on this are so different. I'm not sure if I could convince you, or if you would be happy with reflections that show only an approximation of the scene. For GI it makes no difference, for sure.

Look at the Danger Planet video I posted a few posts above and then look at Q2 RTX, and tell me which looks better, and whether there is a significant difference that justifies a GPU costing three times as much as 10-year-old consoles which did this in real time already. (I'm really interested in your opinions here.)

I won't keep repeating my opinion here - whether we want RTX or not depends on our wallets, not our opinions. It's pointless to repeat ourselves without new arguments. But we can still discuss the alternatives, which still make sense even in the presence of RTX!
I use RT for GI as well. The RT part takes 1-2 milliseconds. I don't need RTX to speed this up - most likely it would only slow it down. This is why I am not excited about the 'new era' here: I have been in this era for ten years. That's no arrogance on my part, but of course it affects my point of view.
If there are faster algorithms for GI than path tracing and denoising - and there are - then we can still use them. Most of them involve RT, some do not. (Danger Planet does not.)
What we want is to find the most effective combination of tech, now including RTX. RTX alone does not bring photorealism to games - we would have seen that already. It can help a lot, but we need to figure out its strength, which is accuracy, not performance IMO.

(Also I'm no hardware enthusiast, so forgive me any inaccuracies in the example numbers - exactly how big the speed-up is does not matter much to me.)
I looked at the Danger Planet tech (link). It's very limited compared to ray tracing. It doesn't even handle occlusion; it requires manual artist input to prevent light leaking. In any case, PICA PICA already uses surfels for GI and doesn't have noise issues.

The argument is that the short-term gains come with notable long-term losses in reduced R&D and dead-end/restricted algorithm developments.

I don't know whether that's true or not, but hopefully people can understand the argument and talk about it in terms of the choices, rather than just trying to make everything about how to get RT out there now. Hypothetically, if the choices were:

1) Get fixed function HW and realtime RT hybrid games out there now, and lose access to the more efficient Future Methods for 10 years, resulting in slower RT effects in games for 10 years
2) Have a slower introduction of RT technologies that are more flexible and get weaker gains now, but through more flexible solutions gain more significant gains through Future Methods, resulting in significantly faster solutions 5 years from now and going forwards

Which would people choose? And how about different timelines? Iroboto's argument that you need iterative advances to uncover new tech is a strong one, but we also have compute enabling exploration in an 'almost fast enough' way without the need for specific hardware, and that could lead to the best game solutions rather than following a 1970s image-construction concept.

My theory is that it's fine to put in these RT cores for professional imaging. I'm not sure they're ideal for gaming. It also doesn't matter one way or another - lives aren't in the balance. ;) Worst case, nVidia set a precedent for how acceleration is handled and we get 5 years of slower reflections in games. Big deal. For those working in the field though, I can see them feeling more animated about the choices and implementation.
False dichotomy. If anything, Microsoft and NVIDIA pushing RT is what will generate the research that will yield faster algorithms. Without this push there would be little interest and therefore little research into the topic.

Impossible: both worlds need their own acceleration structures - twice the work. The DXR BVH is blackboxed and only accessible by tracing DXR rays, which furthermore does not allow parallel algorithms.
It's not impossible. Remedy's Control uses both voxel lighting AND ray tracing.
 
1080 is faster than 2080 in only 1/8 of those charts that I'm seeing. Therefore, it's not faster overall. It's faster in some cases.
No! How do you get 1/8, if there are only 4 cases????

"gosh, last-gen nVidia GPUs are better at compute than the new ones."
But I never said that. I am afraid of it, but I don't know. Again, sorry for the noise - I see my fault here.

OK, I'll list what I see:

1st test:
1080ti: 15511
2080: 12776 (much less)

2nd:
2080: 81
1080: 73 (slightly worse)

3rd:
2080: 61
1080: 61 (tie)

Geekbench:
2080: 417000
1080: 229000 (only half? outlier? likely using tensors?)

That's my train of thought. Doesn't it make sense? Does it, or really not?
And I said in each post that this is crap and only allows assumptions.


The 'FUD' here is that nVidia's last-gen GPUs are better at compute than their new GPUs. If true, no-one interested in compute performance should buy RTX and should instead buy 1080s. It's the kind of significant assertion that needs backing up.

Which is pure personal assumption and fear of mine, not presented as fact - or at least not meant that way.
Kepler WAS worse than the 5xx series. People bought it anyway. I did too. 'Draws less power and still has more FPS' was my thought.
Even if RTX were slightly worse at compute, people would still want to buy it, because it could compensate with awesome RT and tensor power.
It would not be a bad product just because of inferior compute, considering those features.

I should just go and buy a 2060 - that would make life here a lot easier. Until then I cannot provide any 'facts', and even then you could not prove my claims.

I seem to be on the wrong side of everyone.
That's your job. Well done. I know how it feels ;)
 
It's not impossible. Remedy's Control uses both voxel lighting AND ray tracing.

I knew someone would say this. OK: voxel crap or SDF crap is NOT raytracing crap.
The suggestion was: implement your compute RT together with RTX RT, stop whining and enjoy full flexibility.
And this suggestion is pretty much crap, which is what I meant: it would require two BVH implementations, so it's not the best of both worlds, it is just both worlds side by side.
What is so hard to understand when I say that replicating fixed-function functionality is just stupid?

I looked at the Danger Planet tech (link). It's very limited compared to ray tracing. It doesn't even handle occlusion; it requires manual artist input to prevent light leaking. In any case, PICA PICA already uses surfels for GI and doesn't have noise issues.
It does handle occlusion, by negative light. That's the whole (brilliant) idea (rough sketch below).
An image like the Cornell box looks like a path-traced image of the same scene. You or I could not tell which is which.

The problem is that the negative light does not exactly match the effect real occlusion would have, and this causes leaks and very ugly color shifts in interiors. I don't think this can be solved. If it could, I could stop working, and NV could focus on the 16xx series.
Bunnell's suggestion of sectors to prevent the leaks is crap.
But for outdoors and some simple houses it would work and nobody would notice the color shifts. It would look much better than Metro or Q2.
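Here is a rough sketch of how I read the negative-light trick - my own names and a simplified form factor, not Bunnell's exact math: every surfel gathers bounce light from every other surfel via a disk-to-point form factor, and instead of tracing visibility rays, potential blockers subtract the light they themselves received in the previous pass, weighted by the same form factor:

```cpp
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Surfel
{
    Vec3  pos, normal;
    float area;
    float emitted;    // direct light this surfel reflects
    float received;   // indirect light gathered in the previous pass
};

// Simplified disk-to-point form factor between receiver and emitter.
static float FormFactor(const Surfel& r, const Surfel& e)
{
    Vec3  d     = sub(e.pos, r.pos);
    float dist2 = dot(d, d) + 1e-6f;
    float inv   = 1.0f / std::sqrt(dist2);
    float cosR  = std::fmax(0.0f,  dot(r.normal, d) * inv);
    float cosE  = std::fmax(0.0f, -dot(e.normal, d) * inv);
    return (e.area * cosR * cosE) / (3.14159f * dist2 + e.area);
}

// One gather pass: positive bounce light from every surfel, minus the light
// potential blockers received themselves ("negative light") instead of any
// real visibility test. Cheap, but only an approximation of occlusion.
static float Gather(const Surfel& receiver, const std::vector<Surfel>& surfels)
{
    float sum = 0.0f;
    for (const Surfel& s : surfels)
    {
        float f = FormFactor(receiver, s);
        sum += f * s.emitted;    // bounce light, no visibility check
        sum -= f * s.received;   // occlusion approximated as negative light
    }
    return std::fmax(0.0f, sum);
}
```

Run a couple of passes, feeding 'received' from the previous one. The gap between that subtraction and real occlusion is exactly where the leaks and color shifts come from.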

Thanks for paying attention to the alternatives!

False dichotomy. If anything, Microsoft and NVIDIA pushing RT is what will generate the research that will yield faster algorithms. Without this push there would be little interest and therefore little research into the topic.
You still don't get it: research on faster RT algorithms will shrink to 1%. Only NV and the other vendors will do it. No faster algorithms - only faster blackboxed hardware.
All it will spur is moving offline tech to games, in a way that barely makes sense from an efficiency perspective. But that's how progress works nowadays. It's no longer important to be fast. Photorealism isn't important either. Only selling GPUs is - progress just fast enough to dominate the competition.

99.9% of games are triangle-based. It's the best choice for now.
The best choice for primary visibility (an O(N^2) problem) is not necessarily the best choice for lighting (O(N^3)). And they do not need to be the same.
 
Impossible: both worlds need their own acceleration structures - twice the work. The DXR BVH is blackboxed and only accessible by tracing DXR rays, which furthermore does not allow parallel algorithms.

Why would it be impossible for, say, BFV or any other game to use the RT cores for reflections and compute for GI and/or shadows? Turing sure can do some compute, as the chip is quite the monster. I don't see it being inferior to a 5950, or 7870, or even Vega for that matter.

By the way, I think I'm more neutral than pretty much anybody else here!

I understand that NV's current solution isn't optimal, and that a more flexible, compute-based solution to RT is probably much better, but right now nobody else has come up with anything. Nvidia's fixed-function but fast RT cores won't stop AMD from releasing something better, if they can.

I should just go and buy a 2060

Get a 2080 Ti or even a Titan RTX if you can and want to; I'm sure it will suffice for your RT needs for now - compute, HW RT, and normal rasterization wise. I suspect it's even double the PS5's GPU power.

I don't think there's any reason for fear IMO; I'm sure Nvidia, AMD, and maybe even Intel are working on this. Nvidia just wanted to be first, and with compute it would have been too slow for modern games, so their 'easy' solution was the RT cores, as they probably also serve the pro markets.

I seriously doubt we would see RT in modern games like BFV without the RT cores though; things would be too slow and people would complain about that instead.

I've been accused both of being biased in favour of AMD and of being biased in favour of nVidia in this one thread, so I think the neutrality crown goes to me because I seem to be on the wrong side of everyone. :p

And of being Sony biased in another :p (doesn't that make you neutral in a sense?) It just feels like that sometimes, personally; you're handling it nicely though :)
 
Why would it be impossible for, say, BFV or any other game to use the RT cores for reflections and compute for GI and/or shadows? Turing sure can do some compute, as the chip is quite the monster. I don't see it being inferior to a 5950, or 7870, or even Vega for that matter.
This is exactly what I'm working on (and to be clear, I expect Turing compute to be more than fast enough). In this case I still have two BVH implementations, but it's worth it and I have no choice anyway.

My 'impossible' is not related to GI, it is only related to this thought: 'I already have a dynamic LOD representation of the scene in place. Because of this it is likely faster to trace at some distance than fully detailed triangles. So should I implement reflections using RTX, my sample hierarchy, or both?'
And there is no good solution possible. If I try to get the best of both, the relative cost of the RTX BVH build is much higher than for a game that uses only RTX (see the sketch below for why that build is a black box to me). But not using RTX is not an option, even if my stuff were faster. I have to use RTX in any case, and support for perfectly sharp reflections might be the only argument.
Replacing my own BVH and tracing with RTX is not an option either, because it would surely be slower.
Also, I already have reflection support for rough materials (which are the most common), so using RTX 'just for reflections' appears to have a very high relative cost for what it adds.

The only good option would be to use my stuff as a fallback for an RTX-based 'path tracer' (similar to my assumptions about Minecraft). Then RTX is fully justified, but I'm not sure it's fast enough for this in a detailed open-world game. Likely not, so I'll do exactly what you quoted. That's surely possible and will look good.
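For context on the 'two BVH implementations' point: on the DXR side the acceleration structure is an opaque blob the driver builds for you. A rough host-side sketch - the function name and parameters are mine, buffers are hypothetical, state objects, bindings and resource creation omitted:

```cpp
#include <d3d12.h>

// Rough sketch: build a bottom-level acceleration structure over a triangle
// vertex buffer. The blob written to 'resultVA' is opaque and vendor-specific:
// the app can size it and build it, but cannot read or reuse its nodes from
// its own compute shaders - the only access path is TraceRay() in RT shaders.
void BuildOpaqueBlas(ID3D12Device5* device,
                     ID3D12GraphicsCommandList4* cmd,
                     D3D12_GPU_VIRTUAL_ADDRESS vertexVA, UINT vertexCount,
                     D3D12_GPU_VIRTUAL_ADDRESS scratchVA,
                     D3D12_GPU_VIRTUAL_ADDRESS resultVA)
{
    D3D12_RAYTRACING_GEOMETRY_DESC geom = {};
    geom.Type = D3D12_RAYTRACING_GEOMETRY_TYPE_TRIANGLES;
    geom.Triangles.VertexBuffer.StartAddress  = vertexVA;
    geom.Triangles.VertexBuffer.StrideInBytes = 3 * sizeof(float);
    geom.Triangles.VertexFormat = DXGI_FORMAT_R32G32B32_FLOAT;
    geom.Triangles.VertexCount  = vertexCount;

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type  = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.Flags = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE;
    inputs.DescsLayout = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.NumDescs = 1;
    inputs.pGeometryDescs = &geom;

    // The API reports how big the opaque result will be, but nothing about
    // its internal node layout.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO info = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &info);

    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs = inputs;
    build.ScratchAccelerationStructureData = scratchVA;
    build.DestAccelerationStructureData    = resultVA;
    cmd->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}
```

So whatever BVH I keep for my own compute tracing lives completely separately from this one, and both have to be refitted or rebuilt per frame.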

I understand that NV's current solution isn't optimal, and that a more flexible, compute-based solution to RT is probably much better, but right now nobody else has come up with anything. Nvidia's fixed-function but fast RT cores won't stop AMD from releasing something better, if they can.

Many of you think there was no interest in or work on RT in games (or on alternatives that aim for the same results) before RTX appeared. I don't think that's true. But those who did work on this likely have similar problems now.
On one hand you criticize 'lazy' game development for not showing progress, on the other hand you reject our criticism of forced and restricted solutions which may contradict our designs. We do not criticize just to rant.
The reason you have not seen much progress recently is the ending life cycle of the current-gen consoles. Xbox One is the lowest common denominator. It makes no sense to make games just for PCs equipped with high-end GPUs.

AMD is forced to do something, quickly. It is unlikely to be better. RT was introduced in a surprising rush, without any discussion with game developers, AFAIK. The harm has been done and is impossible to fix now. My hopes for AMD are just tiny.

Get a 2080 Ti or even a Titan RTX if you can and want to; I'm sure it will suffice for your RT needs for now - compute, HW RT, and normal rasterization wise. I suspect it's even double the PS5's GPU power.
Of course I want the GPU that is closest to the next-gen consoles, and its features and performance will answer all my questions. Who knows - maybe MS comes up with something 2060-like, but surely no Titan :)

I seriously doubt we would see RT in modern games like BFV without the RT cores though; things would be too slow and people would complain about that instead.
No, surely not. Unlike SSAO, SSR is totally unacceptable, and next gen would have allowed this to be addressed in any case. I don't see any solution here other than RT.
Also, I work on a complete GI solution, not just reflections, and I targeted last-gen consoles. The hardware is not too slow, just the development is.
There was also a video here of a console racing game using path tracing. They are not working on this just for cut scenes and surely have more in mind. There's work going on behind closed doors.
Edit: My doors are open only because I'm independent. No NDAs. You don't know what all the others are doing.
 
What's the cost of building a BVH tree? What's the overhead of having to build two (independent lighting + RTX) versus having just one structure that's used by both?
 
What's the cost of building a BVH tree?
Hard to guess.
But the cost must be noticeable, remembering the optimization plans prior to the BFV update:

"Yasin Uludağ: One of the optimisations that is built into the BVHs are our use of “overlapped” compute - multiple compute shaders running in parallel. This is not the same thing as async compute or simultaneous compute. It just means you can run multiple compute shaders in parallel. However, there is an implicit barrier injected by the driver that prevents these shaders running in parallel when we record our command lists in parallel for BVH building. This will be fixed in the future and we can expect quite a bit of performance here since it removes sync points and wait-for-idles on the GPU.

We also plan on running BVH building using simultaneous compute during the G-Buffer generation phase, allowing ray tracing to start much earlier in the frame, and the G-Buffer pass. Nsight traces shows that this can be a big benefit. This will be done in the future."

If the cost were negligible, maybe they would not list this as the first planned optimization. (Next they mentioned SSR.)
However, I hope it is not that bad and is just the price to pay for having hardware acceleration at all.
One BVH for multiple purposes could only make sense on consoles, because vendors probably use different data structures.
Only vendor extensions could bring it to PC. (Another argument for per-vendor APIs instead of DX, GL, VK, Metal...)
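To make the 'overlapped compute' part of the quote concrete, here is a rough D3D12 sketch of the idea as I read it - the function name, PSOs and buffers are hypothetical, and the root signature and UAV bindings are assumed to be set already: two independent build dispatches recorded back to back with no UAV barrier between them are free to overlap on the GPU, and barriers are only needed once something reads the results.

```cpp
#include <d3d12.h>

// Record two independent BVH-build dispatches with no barrier between them,
// so the GPU may execute them in parallel ("overlapped" compute). This is not
// async compute - both live on the same queue and command list.
void RecordOverlappedBuilds(ID3D12GraphicsCommandList* cmd,
                            ID3D12PipelineState* buildPsoA,
                            ID3D12PipelineState* buildPsoB,
                            ID3D12Resource* bvhBufferA,
                            ID3D12Resource* bvhBufferB,
                            UINT groupsA, UINT groupsB)
{
    // Dispatch A writes only bvhBufferA, dispatch B writes only bvhBufferB,
    // so no UAV barrier is required between them.
    cmd->SetPipelineState(buildPsoA);
    cmd->Dispatch(groupsA, 1, 1);

    cmd->SetPipelineState(buildPsoB);
    cmd->Dispatch(groupsB, 1, 1);

    // Only before a later pass reads the results do we flush the UAV writes.
    D3D12_RESOURCE_BARRIER barriers[2] = {};
    barriers[0].Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
    barriers[0].UAV.pResource = bvhBufferA;
    barriers[1].Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
    barriers[1].UAV.pResource = bvhBufferB;
    cmd->ResourceBarrier(2, barriers);
}
```

In BFV's case the quote says the driver currently injects such a barrier anyway when command lists are recorded in parallel - that is the sync point they want removed.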
 