Speculation and Rumors: Nvidia Blackwell ...

Where can I read more about Nvidia's effort to push AI users to workstation cards? I'm a bit skeptical about this, since they market their GeForce GPUs for LLM and image-generation use.

Also, 32GB isn't a lot for AI use. I bet you won't see the 5090 being recommended on r/localllama, for the same reason that no one recommends Nvidia workstation cards either: they're just way too expensive for the amount of VRAM they offer. A system with 2x 3090s will be a lot cheaper and offers 50% more VRAM. It can, for example, run Llama 3 70B at 4 bits per weight, whereas a single 5090 won't be able to.
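
Rough napkin math below (a Python sketch; the 5 GB overhead figure for KV cache and activations is just a guess, real numbers depend on context length and runtime):

Code:
# Rough VRAM estimate for Llama 3 70B quantized to 4 bits per weight.
params = 70e9                 # parameter count
bits_per_weight = 4           # 4-bit quantization
weights_gb = params * bits_per_weight / 8 / 1e9   # ~35 GB of weights alone

overhead_gb = 5               # assumed KV cache + activations + runtime overhead
total_gb = weights_gb + overhead_gb

print(f"weights ~{weights_gb:.0f} GB, total ~{total_gb:.0f} GB")
print("fits on 2x 3090 (48 GB):", total_gb <= 48)
print("fits on a single 32 GB card:", total_gb <= 32)

The weights alone come to roughly 35 GB, so a 48 GB pair of 3090s has headroom while a single 32 GB card doesn't, at least not at that quantization.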
It should be self-evident from the pricing and the fact that they gimp the drivers on gaming cards. They're obviously aware the x90 cards are used by creators and for AI, but they clearly prefer you to buy the cards with the higher markup. Why else would they have dropped NVLink?
 
It should be self-evident from the pricing and the fact that they gimp the drivers on gaming cards. They're obviously aware the x90 cards are used by creators and for AI, but they clearly prefer you to buy the cards with the higher markup. Why else would they have dropped NVLink?
Lol ... not sure where you get your facts, but there are no gimped drivers. In fact, you can install the Studio drivers if you want to; the only difference is the amount of QA involved for stability. Same games, same applications.
 
It should be self-evident from the pricing and the fact that they gimp the drivers on gaming cards. They're obviously aware the x90 cards are used by creators and for AI, but they clearly prefer you to buy the cards with the higher markup. Why else would they have dropped NVLink?
The only artificial driver-level limitation on a 4090 that I'm aware of is that P2P (GPU-to-GPU memory access over PCIe) isn't supported. This and NVLink only matter for multi-GPU setups, and even there, not having these features doesn't limit or slow down things like multi-GPU LLM inference.

The direct GPU-to-GPU linking is more helpful for AI training, however. So if there is a push to drive "AI users" towards workstation GPUs, it's aimed at professionals.

Also, doesn't the pricing make it pretty clear that Nvidia isn't pushing your average Stable Diffusion enjoyer towards a workstation card? I mean, what kind of a hobbyist AI user can afford them?

Getting back to Blackwell, I'll point out that the way AI models are advancing means that during its lifespan, the 32GB will be much more limiting to the 5090 than the 24GB is/has been to the 4090. The first Stable Diffusion model, released two years ago, was 2GB in size. The current state-of-the-art text-to-image model that can be run on a home computer, Flux.1, is 22GB.

I would be surprised if Nvidia finds it necessary to gimp the 5090's AI performance in any way beyond keeping the current multi-GPU limitations in effect (if even that); the 32GB memory capacity is gimped enough already.
 
The only artificial driver-level limitation on a 4090 that I'm aware of is that P2P (GPU-to-GPU memory access over PCIe) isn't supported. This and NVLink only matter for multi-GPU setups, and even there, not having these features doesn't limit or slow down things like multi-GPU LLM inference.

The direct GPU-to-GPU linking is more helpful for AI training, however. So if there is a push to drive "AI users" towards workstation GPUs, it's aimed at professionals.

Also, doesn't the pricing make it pretty clear that Nvidia isn't pushing your average Stable Diffusion enjoyer towards a workstation card? I mean, what kind of a hobbyist AI user can afford them?

Getting back to Blackwell, I'll point out that the way AI models are advancing means that during its lifespan, the 32GB will be much more limiting to the 5090 than the 24GB is/has been to the 4090. The first Stable Diffusion model, released two years ago, was 2GB in size. The current state-of-the-art text-to-image model that can be run on a home computer, Flux.1, is 22GB.

I would be surprised if Nvidia finds it necessary to gimp the 5090's AI performance in any way beyond keeping the current multi-GPU limitations in effect (if even that); the 32GB memory capacity is gimped enough already.
I don't think I was being clear about what I originally meant: just that NVIDIA would prefer professional users to buy the professional cards (because of course they do), and adding the extra bus width and RAM makes the pro cards less appealing from a price/performance perspective without helping gaming. The takeaway I get is that they're probably embracing the 5090's truly "prosumer" role, more so than with the 4090, and will price it accordingly. I just hope it's not completely insane (over $2,500). They must have calculated that cutting the bus wouldn't push enough people to the pro cards, so it probably made more sense to give it the full bus and charge an extra $500+ over a 448-bit configuration (which was rumored as their initial plan, with the 512-bit bus reserved for a possible Titan).
 
It should be self-evident from the pricing and the fact that they gimp the drivers on gaming cards. They're obviously aware the x90 cards are used by creators and for AI, but they clearly prefer you to buy the cards with the higher markup. Why else would they have dropped NVLink?
My understanding was that the NVIDIA drivers essentially disable P2P access for multi-GPU setups. I admit I haven't tried this at home, so I could be wrong!
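
If anyone wants to check on their own machine, here's a minimal sketch (assuming PyTorch is installed and there are at least two CUDA GPUs) that just asks the driver:

Code:
import torch

# Ask the driver whether GPU 0 can directly access GPU 1's memory (P2P over PCIe).
# Prints False on systems where peer access isn't exposed.
if torch.cuda.device_count() >= 2:
    print("P2P 0 -> 1:", torch.cuda.can_device_access_peer(0, 1))
else:
    print("Fewer than two CUDA devices detected.")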
 
The way I define "caring about gaming" from Nvidia's perspective is having executives who are still passionate about the advancement of game graphics. I'm not referring to selling a billion 4060s to the mass market. So yes, there's an element of catering to the enthusiast DIY crowd implicit in that.

Just sticking to this discussion, can we really say Nvidia is not advancing graphics? I would think it's hard to argue that they haven't been putting in the most effort in that respect. We can complain about the relevance of said technologies and the product stack from a consumer standpoint, as well as maybe the anticompetitive issues with some of their pushes, but I do think it's pretty clear they are putting in the effort and resources to advance graphics in the PC space.
 
The direct GPU-to-GPU linking is more helpful for AI training, however. So if there is a push to drive "AI users" towards workstation GPUs, it's aimed at professionals.

Also, doesn't the pricing make it pretty clear that Nvidia isn't pushing your average Stable Diffusion enjoyer towards a workstation card? I mean, what kind of a hobbyist AI user can afford them?

I think there's a discussion to be had here about the in-between; by that I mean consumers who aren't going to be doing full-on training of models but are looking to do fine-tuning, and how accessible that should be in the consumer space.

Take Stable Diffusion, for instance. At least my understanding is that pure inference with consumer GPUs has long been optimized enough to work well even on a low 8GB of VRAM, but fine-tuning is where you run into VRAM issues on Nvidia's current consumer stack in terms of cheap access.
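
For what it's worth, inference really is that light these days. A minimal sketch with the diffusers library (the model id and settings here are just illustrative); fp16 weights plus attention slicing are what keep peak usage within reach of an 8GB card:

Code:
import torch
from diffusers import StableDiffusionPipeline

# Load SD 1.5 in half precision and turn on attention slicing to cut peak VRAM.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.enable_attention_slicing()
pipe = pipe.to("cuda")

image = pipe("an astronaut riding a horse").images[0]
image.save("out.png")

Fine-tuning the same model is a different story, since on top of the weights you're also holding gradients and optimizer state, and that's where the VRAM wall shows up on consumer cards.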
 
Just sticking to this discussion, can we really say Nvidia is not advancing graphics? I would think it's hard to argue that they haven't been putting in the most effort in that respect. We can complain about the relevance of said technologies and the product stack from a consumer standpoint, as well as maybe the anticompetitive issues with some of their pushes, but I do think it's pretty clear they are putting in the effort and resources to advance graphics in the PC space.

They're absolutely putting in the effort and producing useful results too. However, there's a reason they show off their tech on 4090s and not 4060s. It's just more viable on high-end hardware. So it follows that they would want to promote their gaming tech on 5090s too. They couldn't do that with any credibility if they price it too high.
 
They couldn’t do that with any credibility if they price it too high.
In the same way Ferrari can't promote their car tech in cars that cost too much? I think you just create an air of an elite product if you price it out of the hands of the mainstream, and generally that just means good margins. There'll still be enough wealthy people able to buy it, use it, and show it off.
 
Further to the point, the tech that works fine on a 4090 and somehow doesn't (?) on a 4060, I will presume, is a performance limitation rather than a technical one. As such, NVIDIA brings in new tech which, I'll rephrase, "works best" on their highest-performance cards of the generation, and then this technology trickles down. What was feasible but only just usable on a 2080 Ti should be entirely feasible AND usable on a 4060 this gen. Perhaps, for example, ray tracing at 1080p at more than 30fps average? I'd have to go dig up some Cyberpunk benchmarks to really make the point, and honestly I can't be arsed at this moment.

It's not that NVIDIA is bifurcating technology at the very top end; insofar as I'm aware, the entire SKU stack for the same gen all supports the same silicon-level functionality. The only real differences between the SKUs of the same gen are nearly always total compute power, total bandwidth, and memory capacity. Am I just forgetting something stupid where the highest-end card literally picked up a technology which the lower-end cards of the same gen were physically incapable of supporting? It could be true; I just don't remember one right now.

And then, why couldn't "lower cards" of future generations support it just as well?
 
In the same way Ferrari can't promote their car tech in cars that cost too much? I think you just create an air of an elite product if you price it out of the hands of the mainstream, and generally that just means good margins. There'll still be enough wealthy people able to buy it, use it, and show it off.

Does Ferrari market their tech to regular bros?
 
Does Ferrari market their tech to regular bros?
Yes they do. Very specifically, the bros who have money.

And if you wait long enough, that boutique technology has a strong tendency to find its way down into lower-cost vehicles in the following years. You're not paying Ferrari for the actual BOM; you're paying for the namesake and everything that comes with it. In the end, the pure technology isn't that expensive...
 
Further to the point, the tech that works fine on a 4090 and somehow doesn't (?) on a 4060, I will presume, is a performance limitation rather than a technical one.

Yes, but the people buying the 4060 know that there are actually many people out there gaming on the 4090, so the tech is "real". If the tech is demoed on some kit that nobody is using, then it's just a dog and pony show. That's what I mean by credibility. You can't market stuff to gamers unless gamers can actually use it.

Imagine Nvidia posted videos of 4090s running Cyberpunk PT but 4090s cost $5K. Would the gaming community have given them props for that?

Yes they do. Very specifically, the bros who have money.

At what price point do “bros with money” stop being relevant to the gaming community? $10K, $50K?
 
...That’s what I mean by credibility. You can’t market stuff to gamers unless gamers can actually use it.
This is where we disagree.

They can absolutely market it, and the gamers can absolutely use it, so long as the gamers have the money. Just exactly the same as how Ferrari can market to both of us about a car that will do 0-60 in 1.9 seconds, which we absolutely CAN have, so long as we're willing to fork over the cash.

And yeah, if a 5090 can play Cyberpunk at 4K 120Hz HDR seamlessly and has a $5000 price tag? Welp, every gamer on the planet with $5000 to burn can play it that way. For the rest of y'all, you don't get 4K 120Hz HDR, you get something less. Sorry, but that's how all of life works.

What I feel like you're missing is the underlying price curve: if enough gamers are willing to pay $5000 to play CP2077 at 4K 120Hz HDR, then NVIDIA will continue to push this profitable angle. If there simply aren't enough people willing to pay that sort of money for such a card, NVIDIA will eventually figure out that the market can no longer support the price, and the price will have to come down, or else they'll be sitting on a lot of capital expenses without income to cover them.
 
Yes, but the people buying the 4060 know that there are actually many people out there gaming on the 4090, so the tech is "real". If the tech is demoed on some kit that nobody is using, then it's just a dog and pony show. That's what I mean by credibility. You can't market stuff to gamers unless gamers can actually use it.

Imagine Nvidia posted videos of 4090s running Cyberpunk PT but 4090s cost $5K. Would the gaming community have given them props for that?



At what price point do “bros with money” stop being relevant to the gaming community? $10K, $50K?
Ferrari actually does just what Nvidia does, just on a higher plane.

They build flagship hypercars that get all the envy of the car-enthusiast world even though most everybody knows they'll never be able to attain one, even with quite a successful life. This sets the tone for the rest of their lineup, which becomes more desirable as a result. Just as with Ferrari and cars, tech trickles down, so it's not even a completely irrational perspective.

Flagship effect. Nvidia definitely exploits this. And as Albuquerque points out, their flagships don't actually have any features that lower-end parts in the same lineup don't have. If you're talking about the lowest end, some may be too weak in practice to use these features at high framerates, but that's a different talking point. The tech is still there and often still usable in the right situation.

There doesn't need to be any massive market for these flagship GPUs. So long as Nvidia thinks they can sell enough of them to the right people at a given price, they're going to do that. Nvidia is not really in the business of caring what 'most' consumers want in terms of value anymore.
 
I feel like you're arguing about a scenario which doesn't exist; there are more than just the 4060 and the 4090 on the market.
With the 5090 I kind of expect them to hit a higher price than that of the 4090 (probably closer to $2000), but that doesn't mean that whatever they show working on a 5090 won't work on a 5080, or even a 5070.
The entry level never manages to run the newest features well enough, so arguing over how people would see the 5090's features on their 5060s is a bit weird. When was the last time a "low end" card managed to run all its features as well as the very top-end one? This is nothing new to the market.
 
By the way, I did want to point out I'm not a fan of $5000 video cards, any more than I'm a fan of $2000 video cards, or even $1000 video cards.

Like every consumer, I'd prefer everything were better, faster, AND cheaper. Sadly, that's also not the world any of us live in. I'm part of the problem of course, having purchased a 3080Ti, followed by now owning a 4090, and surely I'll be in the mix somewhere for a 5090. Funny thing is, the vast majority of my video card use these days is AI and Folding at Home, with a smattering of gaming a few hours per week.
 
I feel like you're arguing about a scenario which doesn't exist; there are more than just the 4060 and the 4090 on the market.
With the 5090 I kind of expect them to hit a higher price than that of the 4090 (probably closer to $2000), but that doesn't mean that whatever they show working on a 5090 won't work on a 5080, or even a 5070.
The entry level never manages to run the newest features well enough, so arguing over how people would see the 5090's features on their 5060s is a bit weird. When was the last time a "low end" card managed to run all its features as well as the very top-end one? This is nothing new to the market.

At launch the 4090 was only 30% faster and 33% more expensive than a 4080. The 4090 isn’t some unattainable thing.

The scenario being proposed earlier in the thread is for a 5090 to be in a much higher price tier. I'm simply saying that will not work for marketing to gamers.
 