Nvidia Turing Product Reviews and Previews: (Super, TI, 2080, 2070, 2060, 1660, etc)

Didn't Jensen imply that the rest of the GPU would idle when tensor cores are active?
Probably a misunderstanding.
AFAIK, tensor program flow is controlled by the regular compute cores, but the heavy math operations are executed on the tensor units (same as with RT cores). Tensor units can't do any logic; they can only do math.
Also, the diagram above does not make much sense to me; the term 'int32 shading' on its own is a bit meaningless.

The better question is probably: when a warp issues tensor or RT commands, does it become available for other work while waiting?
If yes, then not only can DLSS or RT run alongside other async compute or rendering tasks, the warps working on RT / DLSS could also help with those tasks while waiting.
Even if we could see an analysis of a BFV / Metro frame where DLSS runs alone as in the diagram, that would not prove overlap is impossible. (Running such work async often hurts performance due to cache thrashing or other bottlenecks, and the devs decide against it.)

Edit: To be clear, DLSS cannot run 'for free'. Even if it could overlap completely, it would at least share bandwidth with other tasks.
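
To illustrate the point about program flow, here's a minimal WMMA sketch (mine, not from any shipping title; the kernel name and single 16x16x16 tile are purely illustrative, and it needs sm_70+). The warp's ordinary instruction stream sets up the fragments and issues the op; only the multiply-accumulate itself runs on the tensor units, so the surrounding control flow still occupies regular SM resources:

```cuda
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// A warp cooperatively computes one 16x16x16 tile. Loads, fills and control
// flow are regular CUDA-core work; only mma_sync executes on the tensor units.
__global__ void wmma_tile(const half *a, const half *b, float *c)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);           // ordinary ALU/register work
    wmma::load_matrix_sync(a_frag, a, 16);    // warp-wide loads into fragments
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(acc, a_frag, b_frag, acc); // the tensor-core instruction
    wmma::store_matrix_sync(c, acc, 16, wmma::mem_row_major);
}
```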
 
I'm wondering, though, which fp16 operations Turing can actually do at twice the single-precision rate: can they do more than mul/add/fma? Obviously for the tensor operations you don't need anything else, but elsewhere things like comparisons would be quite desirable.
Not a proof of the actual hardware implementation, but the PTX ISA has fp16x2 instructions for fma/add/sub/mul/neg and comparisons (which is mostly subtraction with NaN detection and a tiny amount of bit logic), starting from Jetson TX1 and Pascal.
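
For what it's worth, the half2 intrinsics in CUDA's cuda_fp16.h map onto those fp16x2 instructions. A minimal sketch (kernel name and data layout are mine, purely illustrative):

```cuda
#include <cuda_fp16.h>

// Packed fp16 ops: each intrinsic operates on two half values at once.
__global__ void half2_demo(const __half2 *x, const __half2 *y, __half2 *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    __half2 a = x[i];
    __half2 b = y[i];
    __half2 s    = __hadd2(a, b);    // packed add
    __half2 f    = __hfma2(a, b, s); // packed fused multiply-add
    __half2 mask = __hgt2(a, b);     // packed compare: 1.0 or 0.0 per lane
    out[i] = __hmul2(f, mask);       // packed multiply, compare result used as a mask
}
```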
 
NVIDIA Quadro RTX 4000 Review: Turing Powered Pro Graphics
February 27, 2019
The NVIDIA Quadro RTX 4000 is based on the Turing TU106 GPU, similar to the GeForce RTX 2060 and RTX 2070. Its GPU will boost up to 1,545MHz and it is paired to 8GB of GDDR6 over a 256-bit interface, with an effective data rate of 13Gbps. At that speed, the card offers up to 415GB/s of peak memory bandwidth.
https://hothardware.com/reviews/nvidia-quadro-rtx-4000-review?page=1
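
Just to sanity-check the quoted figure, a trivial host-side calculation (mine, not from the article):

```cuda
#include <cstdio>

// A 256-bit bus moves 32 bytes per transfer; GDDR6 at an effective 13 Gbps
// gives 32 B x 13e9 /s = 416 GB/s, roughly the ~415 GB/s quoted above.
int main()
{
    const double bus_bytes = 256.0 / 8.0;  // bytes moved per transfer
    const double data_rate = 13e9;         // effective transfers per second
    printf("peak bandwidth: %.0f GB/s\n", bus_bytes * data_rate / 1e9);
    return 0;
}
```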


I came here to post exactly this.
I might be needing a new laptop soon for SolidWorks, and I tend to use ray-traced images and videos for presentations.
If you are going to use SOLIDWORKS 2019, make sure your GPU supports the application you will be using.
Video cards designed for “gaming” or multi-media applications, such as NVIDIA GeForce or AMD Radeon cards, do NOT offer maximum performance or stability for SOLIDWORKS. Game/multi-media cards are optimized for a low number of polygons displayed on the screen, but at a high frame rate. CAD applications have the opposite requirement, where polygon count is high (the detail in your design model) but the image does not change rapidly so high frame rates are not as critical. Using a certified graphics card and driver combo will yield the most stable platform for running SOLIDWORKS.

Something to consider is a product called SOLIDWORKS Visualize that was introduced in SOLIDWORKS 2016. This product has been developed to solely use NVIDIA Technology. As such, Visualize will not use any non-NVIDIA GPUs and will revert to using only the CPU if no NVIDIA graphics card is installed. In this case, investing in higher end NVIDIA graphics cards will improve render times. This should be taken into consideration before purchasing a graphics card.

Also, new products on the 3DExperience platform may have different supported hardware. Products like SOLIDWORKS Conceptual Designer and SOLIDWORKS Industrial Designer have specific certified hardware, and NVIDIA hardware outweighs AMD hardware 4:1. This should be taken into account if you are considering these products.
https://www.javelin-tech.com/blog/2018/11/solidworks-2019-hardware-recommendations/
 
nervous twitch.gif
chainsaw.gif
hug.gif
shifty.gif


Thread-bans will be duly considered if the pooping continues.

kthnxbai


ninja.gif
 
Am I weird if that post cracked me up so much? :LOL:
 
No Slowing Down: How TITAN RTX Brings High-Quality Images to Gameplay Design
February 11, 2019
Käy Vriend is the co-owner of Icebreaker Interactive, an indie game studio based in Denmark. He’s working on Escalation 1985, an online first-person shooter game set during the Cold War. For Vriend, it’s important to have realistic details to enhance the player experience, and his own experience as a war veteran plays a big part in building out the game.
...
In Substance Designer, Vriend uses three bakers that leverage ray tracing: ambient occlusion for local shadowing, bent normal for illumination and reflection, and thickness for subsurface scattering and translucency.
When he receives a 3D model from an artist, Vriend bakes the maps and analyzes them in 3D. From Substance Designer, he bakes all materials to properly run on lower resolution meshes before using Substance Painter to paint on the low-poly mesh. Then he exports the material with the model to Unreal Engine, where all the assets are knitted together to produce beautiful visuals.
...
Without the TITAN RTX, Vriend said that baking one map with local shadowing took around 14 minutes. With RTX, it was done in 16 seconds. Now he’s able to crank up parameters and bake models at a higher quality, check if something is wrong and still have enough time to rebake if needed.
https://blogs.nvidia.com/blog/2019/02/11/titan-rtx-brings-high-quality-images-to-gameplay-design/
 
Which IMO is what RTX is really about and for. 16 seconds is ridiculously long for realtime applications, but for professional imaging and content creation, it's an insanely beneficial improvement.
 
Just checking to see if this forum (and others) is still alive, as the last post here was 8 days ago.

The GTX 1660 was released today and lots of reviews went live, but not even a peep here.
 
Well it's not exactly an exciting product. Not that it's bad in any way really, just... yawn.

I think the mobile versions might bring the largest step up.
The GTX 1650 might bring GTX 1060 performance to the GTX 1050/1050Ti bracket (sub-$900 laptops), and that's a very nice improvement.

Regardless, 12nm isn't doing any wonders here.
 
So yesterday nVidia launched the desktop GTX 1650, which is a TU117 with 14 SMs enabled, and the mobile GTX 1650, which is a TU117 with all 16 SMs enabled.
It uses a 128-bit bus with 4GB of GDDR5.
They're announcing it as a replacement for the GTX 950, and they're selling the desktop part for $150/150€.

Apparently nVidia decided to block all pre-release reviews by not sending review units to anyone and by not releasing a supporting driver until the cards were on the shelves.
In the meanwhile, reviews started popping up here and there with retail units/drivers, and the general sentiment is that the desktop 1650 is hard to recommend. It costs more than the RX 570 4GB, which goes for €130 and is substantially faster, and it costs the same as the RX 570 8GB, which is not only faster but also more future-proof. Of course the new card consumes a lot less, but an RX 570 only needs a 400W PSU anyway, so it hardly makes any difference. Maybe that's why nVidia tried to block the reviews.

OTOH, the mobile version might be a bit more interesting because the new chip is going into laptops that previously had the GTX1050/1050Ti, so those will be getting a "free" performance upgrade.

What puzzles me the most is that this new TU117 chip is 200mm^2. That's the exact same size as the GP106 in the GTX 1060, which performs significantly better.
So, similar to what we saw with TU116 vs. GP104+GDDR5 (GTX 1660 Ti vs. GTX 1070 Ti), we again see a Turing card with worse performance/area than its Pascal predecessor. This seems to be happening because they're trading the higher transistor count / die area of the Turing SMs for a lower SM count and fewer PHYs, and the end result is worse performance.
 
Honestly I think the pricing is just too high; it’s ridiculous that RX570 has a slightly larger die and a 256-bit memory bus but is actually cheaper right now.

I think TU117 is cheaper/better for NVIDIA than GP106 because 128-bit/4GB is going to be a lot cheaper than 192-bit/6GB but the pricing doesn’t really reflect that right now.

As for die size, I suspect we’re seeing a combination of multiple things:
1) Higher focus on perf/W than perf/mm2 at the architectural level, which makes a lot of sense for the high-end given thermal constraints and power cost in data centres, but isn’t so important in the low-end.
2) Forward-looking compute features, e.g. the new SIMT model, which I genuinely think is brilliant (but useless for existing workloads; see the sketch after this list).
3) Forward-looking graphics features, including mesh shaders and FP16 ALUs. It’s pretty obvious that FP16 is a perf/mm2 loss on existing games but hopefully a gain in future content.
4) Better memory compression resulting in higher area but lower total cost (cheaper memory) for a given level of performance. Unfortunately memory speeds/sizes/buses are quite coarse, so it’s impossible for every chip to hit the sweet spot or be comparable between generations.
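
On point 2, not from anything above, just a sketch of the kind of pattern the new SIMT model (independent thread scheduling, introduced with Volta and carried into Turing) makes safe; the kernel is illustrative, not production code:

```cuda
// With independent thread scheduling, warp threads that fail to take the lock
// can keep spinning while the current holder makes forward progress and
// releases it; under the old per-warp program counter this divergent loop
// could starve the lock holder and hang.
__device__ int lock = 0;

__global__ void serialized_increment(int *counter)
{
    bool done = false;
    while (!done) {
        if (atomicCAS(&lock, 0, 1) == 0) {  // try to acquire
            (*counter)++;                   // critical section
            atomicExch(&lock, 0);           // release
            done = true;
        }
    }
}
```

Useless for today's games, as said, but it opens up algorithms that simply couldn't be written safely on the old lockstep model.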

Anyway, whatever the reasons, the reality remains that Turing's perf/mm2 is disappointing. And their pricing is even more disappointing, but at least that gives them some room for manoeuvre against Navi. Hopefully AMD's perf/mm2 increases significantly with Navi, which would give them a chance to finally catch up...
 
Nvidia have built a huge brand and don't need to worry about competing heavily on pricing even at the low end. Consumers will buy them anyway even if it's a worse product.
 
Honestly I think the pricing is just too high; it’s ridiculous that RX570 has a slightly larger die and a 256-bit memory bus but is actually cheaper right now.
AMD is trying to burn off their inventory.
I think TU117 is cheaper/better for NVIDIA than GP106
GP106 also retailed for more.
Consumers will buy them anyway even if it's a worse product.
Plus laptops and OEM boxes.
 
Nvidia have built a huge brand and don't need to worry about competing heavily on pricing even at the low end. Consumers will buy them anyway even if it's a worse product.
Not this again... The 1650 is the same low end as the 1050 was. It targets OEM pre-builts, notebooks and also the "gaming" market in developing countries. Having no PCIe power connector and low power consumption is a big win in those specific segments. Look how popular the 1050 was...
 
Not this again... The 1650 is the same low end as the 1050 was. It targets OEM pre-builts, notebooks and also the "gaming" market in developing countries. Having no PCIe power connector and low power consumption is a big win in those specific segments. Look how popular the 1050 was...
Yep, I agree. Its power consumption is considerably better than the competition's, making it ideal for OEMs, laptops, etc. Note that the gaming versions built by MSI, Gigabyte, etc. being reviewed, which show performance just below the 570, are versions with power connectors rather than 75W-limited ones, so they're clocked higher.

I have a 1050 Ti without a power connector in my kids' gaming PC. It was great for the price.

I'm not sure why this changes my opinion, though. There are still boxed products at Best Buy etc. for these from major brands; it won't be limited to OEMs and China. I was merely commenting that Nvidia are in a position where they don't really have to compete on pricing.
 
I think TU117 is cheaper/better for NVIDIA than GP106 because 128-bit/4GB is going to be a lot cheaper than 192-bit/6GB but the pricing doesn’t really reflect that right now.
Considering the price they sell GP106-based products for, I really, really doubt that. An extra 64 bits' worth of traces on the PCB and 1-2 more 8 Gb memory chips (for the 3 & 6 GB models) can't be expensive enough to cover the difference. The chips themselves are the same size, and even the 3GB 1060 still retails for €20 more than the 1650 (in Finland anyway, comparing the cheapest 1060 3GB listing to the cheapest 1650 listing; 1650s seem to carry a slight price premium over the official MSRP, too).
 