Apple is an existential threat to the PC

If you think you can fix SLI in software and make it more efficient than Nvidia
Why would I think that? Here is what I said in the previous post in our discussion, which you quoted:
I believe if SLI is implemented well, the bandwidth requirements vastly diminish. Most of the implementation belongs in the engine, not drivers.
Do you think NVIDIA controls the engine? My assertion was that bad software was the driving force behind bad scaling and ridiculous requirements for interconnect.
Implying companies like Nvidia, Intel, and AMD aren't doing their engineering properly
As you can clearly see from what I posted before, that was not what I was implying. Bad software is the main driver behind bad scaling; the point stands, as yet unopposed.

Sort-middle parallel rendering is a hack to deal with bad software, and it requires massive interconnect bandwidth; alternate-frame "SLI" can make do with less but is even more hacky. When NVIDIA wants to actually do parallel rendering right, they make things like OptiX and Omniverse RTX, and I'm sure those scale well without M1 Ultra-level interconnect.
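For a sense of scale, here is a back-of-the-envelope sketch of the alternate-frame case; the two-GPU, 4K, 120 Hz, RGBA8 figures are my own illustrative assumptions, not numbers from this thread:

```cpp
// Back-of-the-envelope: inter-GPU traffic for alternate-frame rendering (AFR).
// Assumptions (mine, not from the post): two GPUs, 4K RGBA8 output, 120 Hz,
// and only finished frames cross the link to the GPU that owns the display.
#include <cstdio>

int main() {
    const double width = 3840.0, height = 2160.0;  // 4K output
    const double bytesPerPixel = 4.0;              // RGBA8 scanout surface
    const double fps = 120.0;                      // target present rate

    // With two-GPU AFR only every other frame is rendered remotely, but count
    // all of them here as an upper bound on link traffic.
    const double bytesPerFrame = width * height * bytesPerPixel;
    const double upperBoundGBps = bytesPerFrame * fps / 1e9;

    std::printf("Frame size: %.1f MB\n", bytesPerFrame / 1e6);
    std::printf("AFR link traffic (upper bound): %.2f GB/s\n", upperBoundGBps);
    // ~4 GB/s: comfortably PCIe/NVLink territory, nowhere near the
    // ~2.5 TB/s Apple quotes for the M1 Ultra's die-to-die interface.
    return 0;
}
```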
 
Why would I think that? Here is what I said in the previous post in our discussion, which you quoted
Well, you said (which you quote next) "I believe if SLI is implemented well...", which I can only infer means you think SLI is currently implemented badly. Otherwise what is the supposition?
 
It is implemented badly; it's implemented as well as NVIDIA can manage within the framework of other people's bad software that they are forced to work with. If they had a scene graph to work with in the driver, they could do better.

The supposition is that bad software is the more common cause of bad scaling of embarrassingly parallel problems than lack of interconnect bandwidth. Also that PCs can't really afford to throw money at interposers/die-stitching, because they don't really cater much any more to people who want to accelerate bad software a little faster than a single CPU can handle (i.e. multiprocessor workstations). The hyperscalers just write good software instead. Render farms just write good software instead. Clusters need more bandwidth for all the nodes in general, so even when they need more bandwidth it doesn't help them much to have it between just 2 reticle-busting CPUs.

Apple sells very high-margin products and can afford the money just for the halo.
 
It is implemented badly; it's implemented as well as NVIDIA can manage within the framework of other people's bad software that they are forced to work with. If they had a scene graph to work with in the driver, they could do better.

The supposition is that bad software is the more common cause of bad scaling of embarrassingly parallel problems than lack of interconnect bandwidth. Also that PCs can't really afford to throw money at interposers/die-stitching, because they don't really cater much any more to people who want to accelerate bad software a little faster than a single CPU can handle (i.e. multiprocessor workstations). The hyperscalers just write good software instead. Render farms just write good software instead. Clusters need more bandwidth for all the nodes in general, so even when they need more bandwidth it doesn't help them much to have it between just 2 reticle-busting CPUs.

Apple sells very high-margin products and can afford the money just for the halo.

You truly believe all that yourself?
 
You truly believe all that yourself?
Do I believe AMD uses an organic substrate, rather than a silicon interposer with wider buses, mostly to save cost and because it's good enough for their market? Sure.

Do I believe rasterization of first hits and shadow buffers can be parallelised across GPUs with minimal interconnect bandwidth? Sure (rough numbers sketched below).

Do I believe Apple mostly spends this money for the halo effect, and can still run a profit while doing so because of their outsized margins? Sure.

Pick one to dispute.
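To put the rough numbers promised above on the second claim (the resolution, frame rate, and buffer formats are my own illustrative assumptions, not figures from the thread):

```cpp
// Rough numbers for the split-frame claim: each GPU rasterizes first hits for
// its own half of the screen and renders its shadow buffers locally, so only
// the finished halves have to be exchanged for compositing.
// Assumptions (mine, for illustration): 4K at 60 fps, RGBA8 colour plus an
// optional 32-bit depth half for any cross-half effects.
#include <cstdio>

int main() {
    const double halfPixels = 3840.0 * 2160.0 / 2.0;
    const double fps = 60.0;

    const double colourBytes = halfPixels * 4.0;  // RGBA8 half-frame
    const double depthBytes  = halfPixels * 4.0;  // optional D32 half-frame

    const double perGpuGBps = (colourBytes + depthBytes) * fps / 1e9;
    std::printf("Per-GPU exchange: %.2f GB/s each way\n", perGpuGBps);
    // ~2 GB/s each way even with depth included; shadow maps never cross the
    // link if each GPU renders the ones its half of the screen samples.
    return 0;
}
```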
 
But I don't like Apple, and I don't like that this interposer/stitching of two reticle busters is presented as some huge innovation, when really it's just Apple throwing money around.

PS: well, I guess there could be one major innovation if they are stitching Cerebras-style. If they are stitching CPUs but can still dice them individually when one of them is faulty, that would be a nice trick.
 
It is implemented badly; it's implemented as well as NVIDIA can manage within the framework of other people's bad software that they are forced to work with. If they had a scene graph to work with in the driver, they could do better.
This is further nonsense detached from reality. You post like Nvidia doesn't already offer its own software platform (DGX OS), which is not immune to the realities of parallelisation. On Windows they have the option of writing their own lower-level kernel drivers beyond their graphics drivers, and of shipping their own APIs and extensions to APIs like Vulkan. The same hardware limitations impact all operating systems, all APIs and all engines.
 
Here's what an actual game can do with a little IHV aid and an API which explicitly exposes multiple GPUs; no TB/s needed.

Then again "SLI" scaling with demanding enough settings was never that bad to begin with (when it worked, but it's an atrocious driver hack of course, so it's fragile).
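Not any particular title's actual code, but for reference, a minimal sketch of what an API that explicitly exposes multiple GPUs looks like from the engine side, using Vulkan 1.1 device groups (my choice of API for illustration); the work split, and any cross-GPU copies, are entirely up to the application:

```cpp
// Minimal sketch of explicit multi-GPU discovery via Vulkan 1.1 device groups.
// Illustrative only: shows that the API hands the engine every GPU in a group
// and leaves the work split (and any cross-GPU copies) up to the application.
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main() {
    VkApplicationInfo app{VK_STRUCTURE_TYPE_APPLICATION_INFO};
    app.apiVersion = VK_API_VERSION_1_1;            // device groups need 1.1+

    VkInstanceCreateInfo ici{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    ici.pApplicationInfo = &app;

    VkInstance instance = VK_NULL_HANDLE;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) {
        std::printf("failed to create Vulkan instance\n");
        return 1;
    }

    uint32_t groupCount = 0;
    vkEnumeratePhysicalDeviceGroups(instance, &groupCount, nullptr);
    std::vector<VkPhysicalDeviceGroupProperties> groups(groupCount);
    for (auto &g : groups)
        g.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES;
    vkEnumeratePhysicalDeviceGroups(instance, &groupCount, groups.data());

    for (uint32_t i = 0; i < groupCount; ++i) {
        std::printf("group %u: %u physical device(s)\n",
                    i, groups[i].physicalDeviceCount);
        // A VkDeviceGroupDeviceCreateInfo chained into vkCreateDevice would
        // turn this whole group into one logical device; per-draw device
        // masks then let the engine send different work to each GPU.
    }

    vkDestroyInstance(instance, nullptr);
    return 0;
}
```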
 
I may have lost track of the discussion, but is all this SLI / CrossFire talk regarding the UltraFusion bridge and the M1 Ultra acting like a single chip from an OS perspective?
 
My main point was that all that bandwidth between two reticle busters isn't so much some amazing innovation as a product made possible by Apple's customer base. For other manufacturers it simply doesn't make sense (well, except Cerebras, I guess).
 
For the M1's use case it's less of a problem as well, since these chips are mainly used for non-gaming work; for rendering/exporting/raw compute it's a lesser issue. Even SLI was less of a problem in benchmarks; it was in actual games that it struggled.
 
Surely it’s a necessity due to constraints in chip sizes. I can’t imagine something near the reticle limit offering yields that would enable affordable prices. It’s a clever workaround that, while not pioneered by Apple, has surely been thrust into the mainstream by them nonetheless.

I could imagine the Apple Silicon Mac Pro would use two larger dies (at least larger than the current M1 Max) connected with the same technology.

For the M1's use case it's less of a problem as well, since these chips are mainly used for non-gaming work; for rendering/exporting/raw compute it's a lesser issue. Even SLI was less of a problem in benchmarks; it was in actual games that it struggled.

Yeah, it’s definitely used for general and creative computing. The lack of games on macOS is certainly a sore point, something Apple is trying to solve by making the underlying architecture the same between iOS and macOS. It’s very easy to build for both now.

Larian Studios is doing great work with Baldur’s Gate 3 but so far that is about it for premium AAA games.
 
Surely it’s a necessity due to constraints in chip sizes.
It would be a luxury to use a silicon interposer; an organic substrate is going to be cheaper if you don't mind the lower density of connections. Obviously Apple can afford luxury though.

Given their unique relationship with TSMC, they might be able to use extra masks to create connections between main reticles at relatively low cost though. You could envision a situation where you stitch together two Maxes across the reticle boundary, and if one of them is faulty, or you don't need as many Ultras, you dice them up.

That would be fancy. A variation on the theme set by Cerebras, but still fancy.

PS: the die pictures do suggest this is the case.
 
So “reviews” are up for the Mac Studio with people doing the same 3 benchmarks. It guts me that YouTubers have such a large reach …

Let the shit-talking commence.
 
https://www.theverge.com/22981815/apple-mac-studio-m1-ultra-max-review

Interesting review. I'm not a professional, so I don't fully understand the requirements video editors have here. But pretty neat. The reviewer admits he's not an editor, so he plunked it on editors' desks and just got their feedback.

Some tech things sound promising as quality-of-life improvements: the way they approached dual GPUs being seen as a single GPU. Even if scaling isn't a full 2x, the idea that you can get applications to use dual GPUs as though they were one is a win in itself.

The first thing I noticed was that every professional I gave this machine to was able to sit down and get cooking on this device basically immediately without any major problems. That was not at all the case with the Mac Pro, where people were constantly having to fix various software hiccups before they could really do their job.

The other thing I want to emphasize is that this computer is shockingly quiet for the power it offers. Even when we were doing elaborate things in Adobe After Effects and Blender, stuff that would have had the fans on any Intel desktop I’ve ever used absolutely roaring, the Studio was inaudible. I’d put my ear to the case, and while I could literally feel the fans vibrating beneath me, they were still silent. And the only time we ever felt it blowing hot air was during gaming, which we’ll get to later.

The emphasized part is a pretty important win. There's a lot of talk about getting power from technology, but being able to extract a good amount of power from your hardware with little headache is ideal.

Neat to read about the challenges video editors get hit by when trying to do their work. Different from data scientists and different from rendering developers. Neat insight into the professional GPU cards, which are different from professional DS cards, etc.
 
An AMD 5950X/3090 outperforms, or matches, the maxed-out Ultra at its own game (synthetic exporting/creation benchmarks) for less money, and you can upgrade. In the real world it's going to be even more interesting. A little surprising the M1 Ultra isn't totally crushing everything in the Cinebench synthetic benchmarks.
And as many are mentioning (in comments sections everywhere), the Ultra variants should be compared to a different class of hardware; you really aren't in the regular consumer CPU/GPU market anymore with a starting price of 4,000 US dollars. A 3970X or up (or the Zen 3 variants/Intel equivalents), for example. In performance terms, that is.
 
Interesting review. I'm not a professional, so I don't fully understand the requirements video editors have here. But pretty neat. The reviewer admits he's not an editor, so he plunked it on editors' desks and just got their feedback.
The video encoding hardware is solid on the existing M1 Pro and M1 Max. At home I have a work MacBook Pro 14" with an M1 Max, and I do a fair bit of video encoding for our outreach team.

This is mostly encoding endless variations of existing snippets of video presentations in 10-bit HEVC at 1080p, and the M1 Max encodes these at a pretty constant 280+ frames per second, which is around 9 seconds of video every 1 second of realtime, or a 30-minute presentation in a little over three minutes. All the while, the rest of the CPU and GPU is basically idle.
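For scale, the arithmetic behind that throughput (the 30 fps source rate is my assumption; the post doesn't state the frame rate of the material):

```cpp
// The arithmetic behind the encode throughput above. The 30 fps source rate
// is an assumption; the post doesn't state the frame rate of the material.
#include <cstdio>

int main() {
    const double encodeFps = 280.0;           // reported M1 Max HEVC encode rate
    const double sourceFps = 30.0;            // assumed frame rate of the source
    const double presentationMinutes = 30.0;

    const double realtimeMultiple = encodeFps / sourceFps;                     // ~9.3x
    const double encodeSeconds = presentationMinutes * 60.0 / realtimeMultiple;

    std::printf("%.1fx realtime: a %.0f-minute presentation in ~%.0f s (about %.1f min)\n",
                realtimeMultiple, presentationMinutes, encodeSeconds,
                encodeSeconds / 60.0);
    return 0;
}
```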
 
The video encoding hardware is solid on the existing M1 Pro and M1 Max. At home I have a work MacBook Pro 14" with an M1 Max, and I do a fair bit of video encoding for our outreach team.

This is mostly encoding endless variations of existing snippets of video presentations in 10-bit HEVC at 1080p, and the M1 Max encodes these at a pretty constant 280+ frames per second, which is around 9 seconds of video every 1 second of realtime, or a 30-minute presentation in a little over three minutes. All the while, the rest of the CPU and GPU is basically idle.
Sounds like it definitely saves a lot of time. Not sure if @Dictator wants to chime in here, but I recall the DF team all having 3090s for faster video editing/encoding for their work. The FPS analyzer is a plugin for Premiere(?) that analyzes the video in real time while they cut stuff up, I think.
 
Sounds like it definitely saves a lot of time. Not sure if @Dictator wants to chime in here, but I recall the DF team all having 3090s for faster video editing/encoding for their work. The FPS analyzer is a plugin for Premiere(?) that analyzes the video in real time while they cut stuff up, I think.

It would depend on what use cases DF has. Let's summon Alex for some input ;) They are very much into gaming lately, so that's something to bear in mind as well.
 