> These people are hilarious. What exactly does fair competition in AI look like?

I think it means Minister Bruno gets to take your shit if he feels like it.
An undergraduate student used an Nvidia GeForce GTX 1070 and AI to decipher a word in one of the Herculaneum scrolls, winning a $40,000 prize (via Nvidia). Herculaneum was buried in ash by the eruption of Mount Vesuvius in AD 79, and the more than 1,800 carbonized Herculaneum scrolls are among the site's most famous artifacts. The scrolls have been notoriously hard to decipher, but machine learning might be the key.
...
Luke Farritor, an undergrad at the University of Nebraska-Lincoln and SpaceX intern, used his old GTX 1070 to train an AI model to detect "crackle patterns," which indicate where an ink character used to be (see the sketch below the article link). Eventually, his GTX 1070-trained AI was able to identify the Greek word πορφυρας (porphyras), which is either the adjective for purple or the noun for purple dye or purple clothes. Deciphering this single word earned Farritor a $40,000 prize.
Nvidia GPU Used To Decipher Ancient Greco-Roman Scroll
AI helped a student win a $40,000 prize. (www.tomshardware.com)
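To make the crackle-pattern approach above a bit more concrete, here's a minimal sketch of the general patch-classification framing, assuming small grayscale CT-scan patches labeled ink/no-ink. The InkDetector model, its layer sizes, and the 64x64 patch size are illustrative assumptions, not Farritor's actual code:

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a small CNN that classifies patches of the
# X-ray CT scan as "ink" vs "no ink", which is one way the crackle-pattern
# detection problem can be framed. Patch size and layer widths are guesses.
class InkDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 1),  # single logit: does this patch hold ink?
        )

    def forward(self, x):
        return self.head(self.features(x))

model = InkDetector()
patch = torch.randn(8, 1, 64, 64)   # batch of grayscale 64x64 CT patches
print(torch.sigmoid(model(patch)))  # per-patch ink probabilities
```

A model this small would train comfortably on a GTX 1070, which is part of what makes the story plausible.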
> There are likely tons of undiscovered applications for AI, but it's far too early to say it'll be more transformative than the internet. The internet enabled entirely new markets, professions and hobbies and had a tremendous impact on global society and culture. So far AI is just making existing things easier and/or faster.

AI will make labor-intensive tasks way cheaper; imagine AI creating animations for game characters with solid results that are indistinguishable from real mocap. Speaking of which, I'd love to see cloud AI being used to enhance game worlds in non-latency-sensitive applications.
“To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory. With NVIDIA H200, the industry’s leading end-to-end AI supercomputing platform just got faster to solve some of the world’s most important challenges.”
— said Ian Buck, vice president of hyperscale and HPC at NVIDIA.
> 512-bit never made any sense when they were already gonna get a massive bandwidth lift from GDDR7.

The lift won't be nearly as massive for Nvidia, since they are using G6X for the second generation now.
> 512-bit never made any sense when they were already gonna get a massive bandwidth lift from GDDR7. I genuinely don't know how kopite didn't think to give that one a little more skepticism from the get-go, especially if he didn't actually hear '512-bit' specifically in his information.

Also, the old 128MB L2 rumor is impossible with 384-bit: AD102 has 16MB of L2 per 64-bit memory controller, and with 128MB that would be 21.333MB per 64-bit MC, which obviously doesn't work.
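The divisibility argument is easy to sanity-check. A quick sketch (the loop and names are just for illustration):

```python
# Total L2 has to split evenly across the 64-bit memory controllers
# (a 384-bit bus means 6 MCs).
bus_width_bits = 384
mcs = bus_width_bits // 64  # 6 memory controllers

for total_l2_mb in (96, 128):
    per_mc = total_l2_mb / mcs
    print(f"{total_l2_mb}MB L2 over {mcs} MCs -> {per_mc:.3f}MB per MC")

# 96MB  -> 16.000MB per MC (matches AD102)
# 128MB -> 21.333MB per MC (not an even split, so the rumor can't be right)
```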
> The lift won't be nearly as massive for Nvidia, since they are using G6X for the second generation now.

If we're just talking flagship parts, a 4090 uses 21Gbps at stock, not 24. So even at the base 32Gbps of GDDR7, that's over a 50% increase in bandwidth. Not that a mere 33% would be 'fairly minor' in either case.
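For anyone who wants to check those percentages, here's the arithmetic, assuming a 384-bit bus as on the 4090:

```python
# Bandwidth (GB/s) = bus width in bytes * per-pin data rate (Gbps).
bus_bits = 384

def bandwidth_gb_s(data_rate_gbps):
    return bus_bits / 8 * data_rate_gbps

base = bandwidth_gb_s(21)      # 4090 stock G6X: 1008 GB/s
g6x_fast = bandwidth_gb_s(24)  # late G6X: 1152 GB/s
g7 = bandwidth_gb_s(32)        # base GDDR7: 1536 GB/s

print(f"21 -> 32 Gbps: +{(g7 / base - 1) * 100:.0f}%")      # ~+52%
print(f"24 -> 32 Gbps: +{(g7 / g6x_fast - 1) * 100:.0f}%")  # ~+33%
```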
Micron's roadmap shows early G7 launching at 32 Gbps, which is a fairly minor jump over late G6X's 24 Gbps.
Which also explains why Blackwell seemingly adds even more L2 in comparison to Lovelace.
The more interesting part on G7 is the addition of 1.5-step capacity (24Gb, i.e. 3GB, devices), which will allow new memory sizes on the same bus widths: 12GB on 128-bit, 18GB on 192-bit and 24GB on 256-bit.
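The capacity math falls out of one 3GB device per 32-bit channel; a quick illustrative sketch:

```python
# How 1.5-step (3GB / 24Gb) GDDR7 devices map onto common bus widths,
# assuming one device per 32-bit channel.
DEVICE_GB = 3  # 24Gb device

for bus_bits in (128, 192, 256):
    channels = bus_bits // 32
    print(f"{bus_bits}-bit bus: {channels} x {DEVICE_GB}GB = {channels * DEVICE_GB}GB")

# 128-bit -> 12GB, 192-bit -> 18GB, 256-bit -> 24GB
```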
> Also, the old 128MB L2 rumor is impossible with 384-bit: AD102 has 16MB of L2 per 64-bit memory controller, and with 128MB that would be 21.333MB per 64-bit MC, which obviously doesn't work.

Does the L2 have to be so linked with the memory controllers in terms of numbers? On AD103, it doesn't seem like they're laid out like they would be.
AD102 has 48 slices of 2MB for a total of 96MB of L2; for Blackwell, I guess they will just increase it to 2.5MB per slice for a total of 120MB of L2.
> Does the L2 have to be so linked with the memory controllers in terms of numbers? On AD103, it doesn't seem like they're laid out like they would be.

I think so. In this die shot of GA102, each 64-bit MC is linked to 8 slices of L2 cache. I think it's the same for Ada, but each slice was increased from 128KB to 2MB.
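Putting the slice numbers from the last few posts together (the Blackwell row is the 2.5MB-per-slice guess from above, not a confirmed spec):

```python
# Slices per 64-bit MC stays at 8 across these parts; only the
# per-slice size changes between generations.
SLICES_PER_MC = 8

configs = {
    "GA102 (384-bit)": (6, 0.125),   # 6 MCs, 128KB slices -> 6MB total
    "AD102 (384-bit)": (6, 2.0),     # 2MB slices -> 96MB total
    "Blackwell guess": (6, 2.5),     # hypothetical 2.5MB slices -> 120MB
}

for name, (mcs, slice_mb) in configs.items():
    slices = mcs * SLICES_PER_MC
    print(f"{name}: {slices} slices x {slice_mb}MB = {slices * slice_mb}MB L2")
```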
> Ah, I was thinking you were meaning the main GDDR6 memory controllers.

I think the word you're looking for is "PHY".
> This is a quite long talk by Bill Dally about the future of deep learning hardware, including some interesting tidbits on log numbers and sparse matrices.

It's a good talk, but it's always interesting to see how research is disassociated from day-to-day hardware at practically every semiconductor company! In the Q&A, Bill Dally says that NVIDIA GPUs process matrix multiplies in 4x4 chunks and go back to the register file after every chunk, which he says is "in the noise" and "maybe 10% energy". But that's definitely NOT the case on Hopper, where matrix multiplication still isn't systolic but works completely differently (an asynchronous instruction that can read directly from shared memory, etc.) with a minimum shape of 64x8x16 (so they've effectively solved that issue): https://docs.nvidia.com/cuda/parall...tml#asynchronous-warpgroup-level-matrix-shape
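To illustrate why the MMA shape matters for register-file traffic, here's a toy blocked matmul in NumPy. The blocked_matmul function, its tile parameters, and the "trips" counter are illustrative stand-ins for the hardware behavior being discussed, not how the GPU actually executes anything:

```python
import numpy as np

# Toy model: the accumulator tile shape stands in for the MMA shape
# (4x4 chunks vs Hopper's 64x8x16 M x N x K warpgroup shape). Smaller
# chunks mean more accumulator read-modify-writes for the same problem.
def blocked_matmul(A, B, tile_m, tile_n, tile_k):
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    trips = 0
    for i in range(0, M, tile_m):
        for j in range(0, N, tile_n):
            acc = np.zeros((tile_m, tile_n), dtype=A.dtype)  # the "registers"
            for k in range(0, K, tile_k):
                acc += A[i:i+tile_m, k:k+tile_k] @ B[k:k+tile_k, j:j+tile_n]
                trips += 1  # one accumulator round trip per chunk
            C[i:i+tile_m, j:j+tile_n] = acc
    return C, trips

A = np.random.rand(64, 64).astype(np.float32)
B = np.random.rand(64, 64).astype(np.float32)
_, small = blocked_matmul(A, B, 4, 4, 4)    # 4x4 chunks: 4096 trips
_, large = blocked_matmul(A, B, 64, 8, 16)  # 64x8x16 shape: 32 trips
print(small, large)
```

The two-orders-of-magnitude gap in accumulator trips is the intuition behind the energy question, even if the real per-trip cost is, as Dally says, "in the noise".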