NVIDIA discussion [2024]

I wonder what they could improve besides AV1/AV2 support, and maybe even better upscaling (despite already being best-in-class afaik)?

I don’t think NVIDIA has any incentive to improve gaming performance on the Shield because of GeForce NOW, but I did just realise GFN now supports AV1, which unfortunately won’t work on the current Shield.

There's an odd issue right now in that, to fully support all of GeForce NOW and have the optimal experience, you need a discrete RTX GPU.

GFN's new VRR support, for instance, requires an Nvidia RTX GPU, I believe.

Even just a full-bandwidth HDMI 2.1 option to leverage GFN at 4K120 leaves limited choices other than a discrete GPU.

Also, GFN has some limitations in terms of feature support; it doesn't support RTX HDR, for instance. You may need a new Shield device to implement that client side. RTX HDR would also be great just for a media playback device.

4K120 10-bit 4:4:4. Updated codecs (maybe even H.266). Updated Wi-Fi (maybe Wi-Fi 7). An "RTX" GPU with tensor cores for AI features (e.g. RTX HDR), both current and future.

The edge AI push in general may also mean it makes sense to have a Shield AI device. Nvidia is also already rumored to be releasing an SoC intended for Windows, either themselves and/or in partnership with Mediatek. It seems that could also be repurposed for something other than a Windows laptop.
 
Couldn't we expect to see a repeat of the current situation where the next Shield and Switch 2 share a chip, i.e. the mooted T239? These products are not competitors.

Using the Switch 2 chip would make sense, but I'm curious if the Shield line will continue. The current model hasn't had a software update in a year now. It's running a version of Android TV that's 2-3 major versions out of date. It's still great, but idk if they have quietly ended support or what.

I mention this somewhat above, but Nvidia is rumored to have an SoC (or their technology going into an SoC) for Windows on ARM laptops, possibly in partnership with Mediatek.

The related possibility is that going forward they don't themselves develop a new Shield device but just offer a software package in conjunction with their SoCs or devices with their deployed GPU IP. Those could be deployed in anything from a tablet to a laptop or a set top box (like the Shield) to a full desktop.
 
I have the tube Shield and it struggles with some content, particularly Dolby Vision and Paramount+ for some reason. VRR support would be nice too, but they’ve abandoned in-home streaming so I have limited use for it now.

You should be able to use it in a nearly functionally identical way with Moonlight connecting to Sunshine.

It's better in some ways; for example, Sunshine works around Windows turning on mouse acceleration by default on the server, unlike connecting to Nvidia GameStream (that was annoying for me, as you had to manually turn it off on every connection since I only play with kb/m).
 
I have the tube Shield and it struggles with some content, particularly Dolby Vision and Paramount+ for some reason. VRR support would be nice too, but they’ve abandoned in-home streaming so I have limited use for it now.
I'm sure there are areas where the old one could be improved. I just don't see Nvidia pursuing this market of all markets. A 3rd party device on the rumored SoC with MTK is way more probable IMO.
 
I mention this somewhat above, but Nvidia is rumored to have an SoC (or their technology going into an SoC) for Windows on ARM laptops, possibly in partnership with Mediatek.

The related possibility is that going forward they don't themselves develop a new Shield device but just offer a software package in conjunction with their SoCs or devices with their deployed GPU IP. Those could be deployed in anything from a tablet to a laptop or a set top box (like the Shield) to a full desktop.
I think an "NVIDIA-powered" Windows on Arm ecosystem could be pretty neat.
 
NVIDIA appears determined to corner every segment of the emerging AI-led paradigm shift in its quest to become the go-to retailer of full-stack AI solutions. As a verifiable demonstration of this underlying strategy, look no further than the GPU manufacturer's recent release of a dedicated, open-source Large Language Model (LLM), dubbed the NVLM-D 1.0. What's more, given the model's near-parity with comparable proprietary offerings such as the GPT-4o, investors must ask the question: is OpenAI's new $157 billion valuation justified?
...
Well, NVIDIA officially unveiled its NVLM-D 1.0 LLM yesterday. The model is based on 72 billion parameters and offers near-parity in performance when compared with not only Meta's open-source Llama 3-V model, which is based on 405 billion parameters, but also those that belong in the rarefied, black box class, such as OpenAI's GPT-4o.

So, the question then emerges: if NVIDIA's 70-billion-parameter model is able to effectively compete with much larger and more complex models such as the Llama 3-V and GPT-4o, is OpenAI's stratospheric valuation justified?

This question becomes all the more critical when one considers NVIDIA's already-established, mammoth user-base and a vibrant developer ecosystem that almost guarantees success.
 
Well, OpenAI is not just doing LLMs now. GPT-4o is a multimodal model, which understands not just text but also images and sounds/voices. Obviously people will be making better models with fewer parameters over time, as we learn more about how the models work and become better at selecting training material. Does that mean there's no room for models with many more parameters? I don't think so. I mean, it's not like public models such as Llama 3.2 are already suitable for every task. Even proprietary models such as the more recent o1 are not very good at some tasks, and people obviously want better models.

Of course, the question of valuations is always there, but I don't think the performance of public models (many of which are actually quite good, of course) is the main reason for that. A 72-billion-parameter model is out of reach for most ordinary people anyway (even the rumored 32 GB 5090 won't be able to run something like that).
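
Quick back-of-the-envelope math on that point (a rough sketch of the weights alone at common precisions, ignoring KV cache and activations):

```python
# Weights-only memory needed just to hold a 72B-parameter model at common precisions.
# Rough illustration only; real inference also needs KV cache and activations.
params = 72e9

for precision, bytes_per_param in [("FP16/BF16", 2.0), ("INT8", 1.0), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{precision:>9}: ~{gib:.0f} GiB of weights")

# FP16/BF16: ~134 GiB, INT8: ~67 GiB, 4-bit: ~34 GiB
# Even quantized to 4 bits, the weights alone would overflow a 32 GB card.
```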
 
Wccftech is probably only reporting about NVLM-D 1.0 because Nvidia. They are painting quite a distorted picture of how significant this particular model actually is.

In reality, NVLM-D 1.0 is just another late-fusion vision finetune, of which there are many. The lion's share of the work is creating the base LLM, which in this case is not Nvidia's work but Qwen2-72B, created by a Chinese company.

Creating a good vision finetune might not be easy, but it's something a smaller lab (a university research group) can do, whereas creating a state-of-the-art 70B-class LLM requires a huge lab and huge effort.
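
For anyone unfamiliar with the term, this is roughly what "late fusion" means in code. A toy sketch with made-up dimensions and stand-in modules (not NVLM-D's actual architecture or training code): image features get projected into the LLM's embedding space and concatenated with the text tokens, and the pretrained base LLM underneath does the heavy lifting.

```python
import torch
import torch.nn as nn

class LateFusionVLM(nn.Module):
    def __init__(self, vision_dim=256, llm_dim=512, vocab=1000):
        super().__init__()
        self.vision_encoder = nn.Linear(vision_dim, vision_dim)  # stand-in for a pretrained ViT
        self.projector = nn.Sequential(                          # the small part the finetune actually trains
            nn.Linear(vision_dim, llm_dim), nn.GELU(), nn.Linear(llm_dim, llm_dim)
        )
        self.text_embed = nn.Embedding(vocab, llm_dim)           # the LLM's own token embeddings
        self.llm = nn.TransformerEncoder(                        # stand-in for the pretrained base LLM
            nn.TransformerEncoderLayer(llm_dim, nhead=8, batch_first=True), num_layers=2
        )

    def forward(self, image_feats, text_ids):
        img_tokens = self.projector(self.vision_encoder(image_feats))   # (B, N_img, llm_dim)
        txt_tokens = self.text_embed(text_ids)                          # (B, N_txt, llm_dim)
        return self.llm(torch.cat([img_tokens, txt_tokens], dim=1))     # one fused sequence

model = LateFusionVLM()
out = model(torch.randn(1, 16, 256), torch.randint(0, 1000, (1, 8)))
print(out.shape)  # torch.Size([1, 24, 512])
```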

Some discussion about this topic on the LLM enthusiast subreddit r/localllama (the most upvoted comment basically reiterates my point):

PS. Not shitting on Nvidia and their R&D here, just correcting misinformation, as I see a lot of reporting in the tech press that is quite out of the loop.
 
Wccftech is probably only reporting about NVLM-D 1.0 because Nvidia. They are painting quite a distorted picture of how significant this particular model actually is.

In reality, NVLM-D 1.0 is just another late-fusion vision finetune, of which there are many. The lion's share of the work is creating the base LLM, which in this case is not Nvidia's work but Qwen2-72B, created by a Chinese company.

Creating a good vision finetune might not be easy, but it's something a smaller lab (a university research group) can do, whereas creating a state-of-the-art 70B-class LLM requires a huge lab and huge effort.

Some discussion about this topic on the LLM enthusiast subreddit r/localllama (the most upvoted comment basically reiterates my point):

PS. Not shitting on Nvidia and their R&D here, just correcting misinformation, as I see a lot of reporting in the tech press that is quite out of the loop.
Yep, same goes for AMD's new (small language) models, which are based on Meta's Llama.
 
I think we will likely see more LLM model result comparisons in the future. Looking at the NVLM-D 1.0 GitHub site, they provide some comparative results against other models. I can see a need for independent third-party LLM model analysis and results, similar to what we have with MLPerf GPU benchmarks, given that different-sized models may regurgitate the same result.
 
I think we will likely see more LLM model result comparisons in the future. Looking at the NVLM-D 1.0 GitHub site, they provide some comparative results against other models. I can see a need for independent third-party LLM model analysis and results, similar to what we have with MLPerf GPU benchmarks, given that different-sized models may regurgitate the same result.

This is a hard problem, of course. It's pretty similar to the GRE or other general exams. You can't have a fixed problem set, because people will train their models on those questions. So basically you need to use new exams. Using the same tests designed for humans is of course a good start, but right now it's not very easy to convert these exams into computer-readable form (although with the proliferation of multimodal models it might be easier now).
Maybe soon we'll see people benchmarking these models using various exams around the world to see how well they perform, and destroying teenagers' confidence at the same time ;)
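
Conceptually the exam-benchmark loop itself is trivial; the hard part is sourcing fresh questions and grading free-form answers. A toy sketch (the questions and ask_model() are hypothetical placeholders, not a real harness):

```python
# Toy exam-style benchmark: feed questions to a model and score exact-match answers.
# ask_model() is a placeholder for whatever model/API you actually want to test.
exam = [
    {"q": "What is 12 * 13?", "answer": "156"},
    {"q": "What is the capital of Japan?", "answer": "Tokyo"},
]

def ask_model(question: str) -> str:
    # placeholder: swap in a real model call here
    return "156" if "12 * 13" in question else "Tokyo"

correct = sum(ask_model(item["q"]).strip() == item["answer"] for item in exam)
print(f"score: {correct}/{len(exam)}")
```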
 
Yep, same goes for AMD's new (small language) models, which are based on Meta's Llama.
The super duper tiny AMD model is based on Meta's Llama 2 architecture, yes, but it was trained from scratch by AMD. So it's a different scenario, as Nvidia is using pre-existing weights trained by Qwen.

I can see a need for independent third-party LLM model analysis and results, similar to what we have with MLPerf GPU benchmarks, given that different-sized models may regurgitate the same result.
LMSYS arena is quite good. It has an Elo leaderboard (think Dota 2 or Starcraft ranking) based on blind human tests. The Nvidia model is not (yet) included, however.
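
For context, the Elo-style update behind such a leaderboard is roughly the sketch below (simplified; the arena's actual rating fit is more involved, and the K factor and starting ratings here are made up):

```python
# Simplified Elo update from one blind pairwise vote between two models.
def elo_update(r_a, r_b, a_wins, k=32):
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))   # predicted win probability for A
    score_a = 1.0 if a_wins else 0.0
    r_a += k * (score_a - expected_a)                   # A gains exactly what B loses
    r_b -= k * (score_a - expected_a)
    return r_a, r_b

# one human vote: model A beats the higher-rated model B
print(elo_update(1200, 1250, a_wins=True))  # A's rating goes up, B's drops by the same amount
```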


Select "Leaderboard" tab from top and "Arena (vision)" tab from a bit lower to see Vision model scores.

[Screenshot: Chatbot Arena (formerly LMSYS) leaderboard]
 
This week, giant professional services company Accenture announced an expanded partnership with Nvidia. It includes not only a new unit – the Accenture Nvidia Business Group – with 30,000 people who will use AI agents to help enterprises scale agentic AI deployments, but also the Accenture AI Refinery platform, which leverages Nvidia’s AI stack – Nvidia AI Foundry, AI Enterprise, and Omniverse – to make it easier for them to adopt it.

Accenture also leverages Nvidia’s NIM (Nvidia Inference Microservices) and NeMo offerings for better token efficiency and for fine-tuning and evaluation, said Justin Boitano, vice president of enterprise AI software products at Nvidia.

[Image: Accenture AI Refinery]


Accenture is making AI Refinery available on both public and private cloud platforms and integrating it with its other business groups. There will also be a network of engineering hubs to accelerate the use of agentic AI, and Accenture is embracing agentic AI internally, initially within its Eclipse Automation business, which it bought two years ago, and within its marketing unit, which will lead to as much as a 35 percent reduction in manual steps, 6 percent cost savings, and a 25 percent to 55 percent increase in speed to market.

Lan Guan, chief AI officer at Accenture, said during a briefing with journalists that generative AI demand drove $3 billion in bookings in its last fiscal year, and that agentic AI will help accelerate demand.
 
This is a repeated post, but I am reposting it here for relevance and better exposure.

According to the Sep 2024 results, the 4060 is now the most used GPU in the Steam Hardware Survey. I think this is the first time in a while that a 60-class card has climbed to the top of the charts within the same generation. It only happened back in the 970 generation as far as I can remember.
  1. RTX 4060 Desktop + Laptop – (4.58%+4.37%) = 8.95%
  2. RTX 3060 Desktop + Laptop – (5.86%+3.00%) = 8.86%
  3. RTX 4060 Ti Desktop - 3.66%
  4. GTX 1650 Desktop + Laptop - 3.64%
  5. RTX 3060 Ti Desktop – 3.57%
  6. RTX 3070 Desktop – 3.31%
  7. RTX 2060 Desktop + Laptop – 3.30%
  8. RTX 4070 Desktop – 2.91%

Interestingly, it seems the desktop 3060 has a higher percentage than the desktop 4060, which is logical considering the 3060 has existed for longer. However, the laptop 4060 has a higher percentage than the laptop 3060, which could mean it shipped in significantly higher volume than the laptop 3060.
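
To make the crossover explicit, here's the arithmetic from the list above in one place (figures are just the ones quoted in this post, not pulled live from the survey):

```python
# Recombining the desktop + laptop shares quoted above.
survey = {
    "RTX 4060": {"desktop": 4.58, "laptop": 4.37},
    "RTX 3060": {"desktop": 5.86, "laptop": 3.00},
}

for gpu, s in survey.items():
    total = s["desktop"] + s["laptop"]
    print(f"{gpu}: {s['desktop']}% desktop + {s['laptop']}% laptop = {total:.2f}% combined")
# RTX 4060: 8.95% vs RTX 3060: 8.86% -- the 4060 edges ahead purely on laptop volume.
```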
 