Nvidia shows signs in [2020]

Status
Not open for further replies.
Hyper-scale Infrastructure Services Accelerate
May 21, 2020

Accelerator share of instance types across Alibaba Cloud, AWS and Azure in March 2020 (Source: Liftr Insights)

In the processor market, reliability, availability and serviceability (RAS) has been one of the biggest impediments to Arm processor adoption. Accelerators are no different. Ensuring driver RAS at hyper-scale is a much different skill set than designing performant compilers. And it takes time to develop the skills and process control to demonstrate a history of stable behavior.

The result is Nvidia’s 86-percent share of instance types offered by the top four clouds. This share contrasts with a highly fragmented competitive field of FPGAs (Intel and Xilinx), GPUs (AMD legacy and very recently Radeon Instinct) and the clouds’ own in-house designs (today, that’s Google Cloud Tensor Processing Unit [TPU] and AWS Inferentia).

Here again, it is not enough to have performant compilers behind an accelerator’s developer tools. We assume every accelerator chip development team has access to reasonably good compiler developers and average developer tool designers.

Development tools must be usable by a large number of potential customers and must behave as developers expect them to.

Nvidia’s CUDA provides a flexible underpinning for tools developers to support a very wide variety of dev tools across Nvidia’s GPU product line. Nvidia’s share of the accelerator market increased slightly over the past year as overall accelerator-based deployments grew by almost 70 percent in the top four clouds.

Azure supports AMD’s Radeon Instinct MI25 in one type family (NVas v4) but only on Windows, and the type family’s fractional GPU-per-instance configurations are typical of virtual desktop environments. AMD has demonstrated solid support of actual enterprise desktop environments and its advanced GPU virtualization features make its GPUs competitive for virtual desktops.
https://www.eetimes.com/hyper-scale-infrastructure-services-accelerate/
 
OpenAI Presents GPT-3, a 175 Billion Parameters Language Model
May 29, 2020
OpenAI researchers today released a paper describing the development of GPT-3, a state-of-the-art language model made up of 175 billion parameters.

For comparison, the previous version, GPT-2, was made up of 1.5 billion parameters. Until now, the largest Transformer-based language model was the one released by Microsoft earlier this month, which is made up of 17 billion parameters.
...
Natural language processing tasks range from generating news articles, to language translation, to answering standardized test questions.

“The precise architectural parameters for each model are chosen based on computational efficiency and load-balancing in the layout of models across GPU’s,” the organization stated. “All models were trained on NVIDIA V100 GPUs on part of a high-bandwidth cluster provided by Microsoft.”
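
A rough sketch of why a model of this size has to be laid out across many GPUs. The assumptions here are not from the article: 2 bytes per parameter (half precision), 32 GB of memory per V100 (the card also shipped with 16 GB), and weights only, ignoring activations, gradients and optimizer state. The function name is made up for illustration.

```python
import math


def min_gpus_for_weights(n_params, bytes_per_param=2, gpu_mem_gb=32):
    """Return (weight size in GB, minimum GPU count) for holding the weights alone.

    All defaults are illustrative assumptions, not figures from the article.
    """
    weight_gb = n_params * bytes_per_param / 1e9
    return weight_gb, math.ceil(weight_gb / gpu_mem_gb)


# 175 billion parameters, as quoted for GPT-3.
weight_gb, gpus = min_gpus_for_weights(175e9)
print(f"{weight_gb:.0f} GB of weights, at least {gpus} GPUs")
```

Under those assumptions the weights alone come to roughly 350 GB, so they cannot fit on a single card, which is consistent with the quote about load-balancing the model layout across GPUs.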

OpenAI trains all of their AI models on the cuDNN-accelerated PyTorch deep learning framework.

Earlier this month Microsoft and OpenAI announced a new GPU-accelerated supercomputer built exclusively for the organization.

“The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server,” the companies stated in a blog post.
https://news.developer.nvidia.com/openai-presents-gpt-3-a-175-billion-parameters-language-model/
 
NVIDIA Shatters Big Data Analytics Benchmark
June 22, 2020
NVIDIA just beat the record for running the standard big data analytics benchmark, known as TPCx-BB, by nearly 20x.

Using the RAPIDS suite of open-source data science software libraries powered by 16 NVIDIA DGX A100 systems, NVIDIA ran the benchmark in just 14.5 minutes, versus the current leading result of 4.7 hours on a CPU system. The DGX A100 systems had a total of 128 NVIDIA A100 GPUs and used NVIDIA Mellanox networking.
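
As a quick sanity check on the "nearly 20x" figure, the two runtimes quoted above can be compared directly. This is only back-of-the-envelope arithmetic on the numbers in the article; the function name is made up for illustration.

```python
def speedup(baseline_hours: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    baseline_minutes = baseline_hours * 60.0
    return baseline_minutes / new_minutes


# Figures quoted in the article: 4.7 hours on a CPU system
# versus 14.5 minutes on 16 DGX A100 systems.
ratio = speedup(4.7, 14.5)
print(f"{ratio:.1f}x")  # about 19.4x, in line with "nearly 20x"
```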

Today, leading organizations use AI to gain insights. The TPCx-BB benchmark features queries that combine SQL with machine learning on structured data, with natural language processing and unstructured data, reflecting the diversity found in modern data analytics workflows.

These unofficial results point to a new standard, and the breakthroughs behind it are available through the NVIDIA software and hardware ecosystem.

TPCx-BB is a big data benchmark for enterprises representing real-world ETL (extract, transform, load) and machine learning workflows. The benchmark’s 30 queries include big data analytics use cases like inventory management, price analysis, sales analysis, recommendation systems, customer segmentation and sentiment analysis.

Despite steady improvements in distributed computing systems, such big data workloads are bottlenecked when running on CPUs. The RAPIDS results on DGX A100 showcase the breakthrough potential for TPCx-BB benchmarks powered by GPUs, a measurement historically run on CPU-only systems.

The TPCx-BB queries were implemented as a series of Python scripts utilizing the RAPIDS dataframe library, cuDF; the RAPIDS machine learning library, cuML; and CuPy, BlazingSQL and Dask as the primary libraries. Numba was used to implement custom logic in user-defined functions, with spaCy for Named Entity Recognition.

These results would not be possible without the RAPIDS and broader PyData ecosystem.
https://blogs.nvidia.com/blog/2020/06/22/big-data-analytics-tpcx-bb/
 
Nvidia Nabs #7 Spot on Top500 with Selene
June 22, 2020
Nvidia’s new internal AI supercomputer, Selene, joins the upper echelon of the 55th Top500’s ranks and breaks an energy-efficiency barrier. With 27.5 double-precision Linpack petaflops, Selene landed the number seven spot on the latest Top500 list released today as part of the ISC 2020 Digital proceedings.

Selene is the second most-performant industry system on the list, coming in one spot below Eni’s HPC5 machine, which was sixth with 35.5 HPL petaflops (and also uses Nvidia GPUs).
...
On the heels of Nvidia’s Ampere launch, Selene was constructed and up and running in less than a month, the company said.

Nvidia also runs internal workloads on three other machines that have made it into the Top500 ranking.

There’s the V100-based DGX Superpod machine, which came in 24th on the latest Top500 with 9.4 Linpack petaflops; the P100-based DGX Saturn-V, deployed in 2016, which is currently in 78th place with 3.3 petaflops; and Circe, another V100-based Superpod that’s grabbed the 91st rung with 3.1 Linpack petaflops.
https://www.hpcwire.com/2020/06/22/nvidia-nabs-7-spot-on-top500-with-selene-launches-a100-pcie-cards/

 
Visualizing 150 Terabytes of Data
June 22, 2020
The amount of data produced during this simulation, which took approximately one week to run on Summit, was a massive 150 TB, equivalent to approximately 25,000 4K movies. The existing solution was to create a frame-by-frame video rendering of the data, a process that took hours. This demonstration uses two NVIDIA technologies, IndeX and GPUDirect Storage, to allow researchers to fly through the massive dataset in real time, volumetrically, and even navigate through it while the simulation data continuously updates.
 
But does it play Crysis?

Mercedes-Benz and semiconductor giant Nvidia have inked a deal, joining forces to build "software-defined vehicles" across the German automaker's entire fleet. From the entry-level to the high-end, all of Benz's next-generation vehicles will be powered by standard Nvidia Drive AGX Orin technology, sensors and software starting in 2024.

Earlier this year, Nvidia launched its Nvidia Drive AGX Orin, a Herculean new vehicle architecture capable of powering everything from advanced driver-assistance systems to full, Level 5 autonomous driving from a system-on-a-chip (SoC). Mercedes-Benz isn't the first to adopt the platform -- that honor goes to Chinese EV startup Xpeng -- but it is certainly the largest yet.
Mercedes-Benz will standardize Nvidia's Orin platform and sensor suite for all of its next-generation vehicles.
The automaker will also license the complete Nvidia Drive Software stack and together Nvidia and Mercedes-Benz will jointly develop AI-powered SAE level 2 and 3 automated vehicle applications -- such as address-to-address automated driving -- as well as automated parking functions (up to level 4).
https://www.cnet.com/roadshow/news/mercedes-benz-nvidia-software-defined-partnership/
 
So who has money and skillz to go run some benchmarks? A100 is now available in Google Cloud.

“With our new A2 VM family, we are proud to be the first major cloud provider to market Nvidia A100 GPUs, just as we were with Nvidia’s T4 GPUs. We are excited to see what our customers will do with these new capabilities.”
Google Cloud users can get access to instances with up to 16 of these A100 GPUs, for a total of 640GB of GPU memory and 1.3TB of system memory.

https://techcrunch.com/2020/07/07/nvidias-ampere-gpus-come-to-google-cloud/

Edit: actual blog post from Google: https://cloud.google.com/blog/produ...e-cloud-a2-vm-family-based-on-nvidia-a100-gpu
 
https://finance.yahoo.com/news/nvidia-eclipses-intel-most-valuable-191419180.html

Nvidia’s stock is on a ridiculous run. The bigger they are the harder they fall....

Nvidia has overtaken Intel for the first time as the most valuable U.S. chipmaker.

In a semiconductor industry milestone, Nvidia's shares rose 2.3% in afternoon trading on Wednesday to a record $404, putting the graphic component maker's market capitalization at $248 billion, just above the $246 billion value of Intel, once the world's leading chipmaker.
 
FOX Sports Teleports Fans Into Major League Baseball Stadium Seats
July 23, 2020
Instead, viewers watching on FOX Sports will see “virtual fans” filling up the empty seats. Powering the virtual fans is a motion-capture, animation, and augmented reality (AR) technology developed by animators at New York-based Silver Spoon Animation.

The company relies on Unreal Engine to help power the virtual fans, with The Future Group’s GPU-accelerated Pixotope virtual production platform. The software is powered by NVIDIA RTX Servers consisting of NVIDIA Quadro RTX 6000 GPUs. Silverdraft supercomputing provided the RTX servers used in all phases of the project.

FOX plans to use the virtual setup starting this Saturday, July 25, during the Milwaukee at Chicago Cubs, San Francisco at Los Angeles Dodgers, and New York Yankees at Washington games.
https://news.developer.nvidia.com/f...ans-into-major-league-baseball-stadium-seats/
 
Soon Nvidia's servers will start to realize they don't need any real people at all... :runaway:
 
Bringing Tensor Cores to Standard Fortran
August 7, 2020
Tuned math libraries are an easy and dependable way to extract the ultimate performance from your HPC system. However, for long-lived applications or those that need to run on a variety of platforms, adapting library calls for each vendor or library version can be a maintenance nightmare.

A compiler that can automatically generate calls to tuned math libraries gives you the best of both worlds: easy portability and ultimate performance. In this post, I show how you can seamlessly accelerate many standard Fortran array intrinsics and language constructs on GPUs. The nvfortran compiler enables this acceleration automatically by mapping Fortran statements to the functions available in the NVIDIA cuTENSOR library, a first-of-its-kind, GPU-accelerated, tensor linear algebra library providing tensor contraction, reduction, and element-wise operations.
https://developer.nvidia.com/blog/bringing-tensor-cores-to-standard-fortran/
 
NVIDIA to Host Digital GTC in October Featuring Keynote from CEO Jensen Huang
August 18, 2020
NVIDIA today announced that it will be hosting its GPU Technology Conference, running Oct. 5-9, and featuring a recorded keynote address by CEO and founder Jensen Huang.

GTC will feature the latest innovations in AI, data science, graphics, high-performance and edge computing, networking, autonomous machines and VR for a broad range of industries and government services. Seven separate programming streams will run across North America, Europe, Israel, India, Taiwan, Japan and Korea — each with access to live demos, specialized content, local startups and sponsors.
...
GTC will include over 500 sessions, including live sessions and on-demand recordings. Live sessions will offer attendees the opportunity to ask questions and interact with experts in AI and other fields from a diverse lineup of companies and organizations. Many of the world’s leading technology organizations will be participating. Sponsors include AWS, Google Cloud, Microsoft, Oracle, Facebook, Dell Technologies, Hewlett Packard Enterprises, VMware, Cisco, Lenovo, ASUS, Booz Allen Hamilton, and IBM.

https://www.guru3d.com/news-story/nvidia-to-host-digital-gtc-in-october-featuring-keynote-from-ceo-jensen-huang.html
 
Key Points --

Overall GPU shipments increased 2.5% from last quarter: AMD's shipments increased by 8.4%, Intel's shipments decreased by 2.7%, and Nvidia’s shipments increased by 17.8%.

Typically, second-quarter shipments are down compared to the previous quarter. This quarter was up, as the pandemic has had unpredictable effects on the market.
 