Software/CPU-based 3D Rendering

I don't even know what this means (and I hope it doesn't involve Tim Sweeney).

What's wrong with Tim Sweeney?
This thread certainly made me think of him, and his "get rid of the driver!" outcry (eventually wishing for a unified, heterogeneous and massively parallel architecture).
Draw calls and their system overhead are a problem on the PC when you want "more, more, more!", and CPU rendering is one way around it. But consoles just use the GPU by programming it closer to the metal.

As ERP says, this favors middleware writers, though with a pure CPU you could simply have libraries that recreate a way of working similar to DirectX/OpenGL and not have to license the big-name stuff.

But what is this "CPU" that is worth using? Let's say nvidia's Maxwell, a Haswell/Broadwell successor and a Steamroller successor all meet a loose definition of a heterogeneous CPU. They are CPU+GPU, but the GPU gets a lot more CPU-like (even Tesla K20 seems meant to run "software" rather than "shaders").

You have to write your program three times, each time catering to the architecture to get high performance (given real-time constraints and big workloads). For added fun, one has ARM and CUDA libraries, the others have x86-64, AMD has its "HSA" ecosystem, and I don't know what Intel will be doing.
 
And please don't call it a software abstraction layer. Implementing a graphics API on top of GPU hardware requires a lot of software layers. Likewise, implementing it on top of CPU hardware is just that, an implementation. You're not using the same software layers as the GPU implementation and then emulating a GPU. In fact for many graphics operations there can be a much shorter 'distance' between the application and the (CPU) hardware. Everybody in the industry agrees that shoehorning everything into using an API is a limitation that comes with overhead. Getting more direct access to the hardware, as is already common with a CPU, will open up a new era of possibilities.

I agree. If we go with all identical, "full CPU" cores with wide SIMD we also get the simplest architecture to use, along with replacing the current hoops with mere .dlls.

But if we go for a really pure CPU, what are we using? Maybe we can settle on a future Intel Knights Corner or a beefed-up Haswell variant, but then maybe only Intel is making it. AMD would have to license it (even if just the instruction sets) and throw away everything they have been doing. Also, Intel wants to sell the really good stuff to the professional/scientific market with a big fat mark-up, and anything for consumers will have to make do with a 95W TDP for the fastest part.

In the practical world we seem to be headed towards APUs and heterogeneous chips, and everyone is cooking their own sauce: Intel, AMD and nvidia are all on their own tracks. Will ARM + mobile GPU go away? Do we get new ARM APUs incompatible with nvidia's ARM APU, Chinese MIPS APUs, or other new architectures?
What happens this decade is uncertain, and that is disturbing; we've had it easy with 20 years of regular x86 CPUs from multiple sources dominating the home computer market.
 
By this point I thought this GPU vs. CPU talk was over. It dates to before this gen of consoles...
To me, and to pretty much everybody in the industry I've heard from recently, the future lies not in one paradigm or the other, but in a mix of the two.
Whatever solution ends up becoming the standard, whether highly parallel CPUs, heterogeneous computing, GPGPU, or whatnot, it is going to be something in between. And whatever that is, it won't become the standard overnight; it is a gradual process that has been happening on many fronts for many years now, and has been evolving at a good pace so far, with no signs of ending terribly soon.
 
Unlimited Detail, involving a certain much derided Bruce Dell.
SVO rendering wasn't invented by Bruce Dell. There have been many SVO renderers before it.

Unlimited Detail is very limited. They don't seem to have any real instancing, as all objects in their YouTube demos (trees, ground pieces, columns, rocks, elephants, etc.) are aligned to octree cells (a simple trick of SVO child pointers pointing to the same data). There seems to be no rotation or precise placement of the instanced objects. There's no animation either. Not a single moving object is present (not even simple rigid bodies instead of complex skinned models). There's plenty of grass and leaves, but all of it is completely static (it doesn't feel realistic at all). And if you look at the terrain from a distance, it's just a grid of prebaked cells (and frankly looks very much like Minecraft, but that's not a bad thing since Minecraft just topped MW3 in popularity on Xbox Live :) ).
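For the curious, here is a minimal sketch of that pointer-sharing trick; the node layout and names are assumptions, not Euclideon's actual format:

```cpp
// Minimal sketch of "instancing" via shared SVO subtrees: several parent cells
// store the same child index, so one prebuilt subtree (a tree, a rock, an
// elephant) is traversed at many octree positions. Because the reuse happens
// at cell granularity, instances can only appear cell-aligned: no per-instance
// rotation or sub-cell offset, exactly the limitation described above.
#include <cstdint>
#include <vector>

struct SvoNode {
    uint8_t  childMask;    // bit i set -> child i exists
    uint32_t firstChild;   // index of this node's children in the shared pool
    uint32_t color;        // packed RGBA (leaf color or LOD average)
};

struct Svo {
    std::vector<SvoNode> nodes;   // one shared node pool for the whole scene

    // Hang the same prebuilt subtree under two different parent cells.
    void instanceSubtree(uint32_t subtreeChildren, uint32_t parentA, uint32_t parentB) {
        nodes[parentA].firstChild = subtreeChildren;
        nodes[parentB].firstChild = subtreeChildren;
    }
};
```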

I am much more impressed by Atomontage, as it has dynamic objects, destructible environments, physics simulation and huge unique data sets (not repeatedly instanced octree child nodes).

http://www.atomontage.com/
http://www.youtube.com/watch?v=1sfWYUgxGBE

Gigavoxels (from 4 years ago) is still looking pretty good as well: http://www.youtube.com/watch?v=HScYuRhgEJw
 
I'm sorry to argue with you, sebbbi: you're a very intelligent person. Still, if you have followed Mr. Dell you will agree that he hardly did read anything about octrees. He discovered them, maybe in trying to cull/distribute points among the view frustum's subfrusta (corresponding to viewplane quadrants) faster (perhaps the primeval rendering scheme: a bucket sort where the frusta are the buckets).
There's no animation either. Not a single moving object is present (not even simple rigid bodies instead of complex skinned models).
In fact, already in its early stages, UD had animated characters (quite a feat, actually).
completely static (it doesn't feel realistic at all)
More detailed volumetric data + high frame rates completely floors the unjustly fashionable and untrue eye candy + destructible environment: kinetic depth effect.
 
I'm sorry to argue with you, sebbbi: you're a very intelligent person. Still, if you have followed Mr. Dell you will agree that he hardly did read anything about octrees. He discovered them, maybe in trying to cull/distribute points among the view frustum's subfrusta (corresponding to viewplane quadrants) faster (perhaps the primeval rendering scheme: a bucket sort where the frusta are the buckets).
Are you saying we should congratulate him because he didn't even do his homework and reinvented the wheel? Seriously?


In fact, already in its early stages, UD had animated characters (quite a feat, actually).
I believe there's a video online showing a few animated things in isolation. (Likely because it can't run fast in a landscape.)

More detailed volumetric data + high frame rates completely floors the unjustly fashionable and untrue eye candy + destructible environment: kinetic depth effect.
That's an opinion; other people will disagree. Destructible environments in Battlefield change the dynamics of the game a lot.

Anyway if he wants to be taken seriously he must get a patent, publish his algorithm and be done with all this nonsense.
 
Are you saying we should congratulate him because he didn't even do his homework and reinvented the wheel? Seriously?
Of course.
Let me therefore beg of thee not to trust to ye opinion of any man concerning these things, for so it is great odds but thou shalt be deceived. Much less oughtest thou to keep to rely upon the judgment of ye multitude, for so thou shalt certainly be deceived. (I. Newton)
I have often noticed that people who do not quite practice the trade professionally tend to supply more singular thoughts, more charming and more uncommon conceits, where one does not expect them... a person who was no geometer at all and who had something on geometry printed gave some occasion to my arithmetical quadrature, not to mention other examples. (G. W. Leibniz)
Beginning in the late '90s, the focus shifted from HSR (the noblest part of CG) to a multitude of texturing schemes (smoke and mirrors), and everybody complied with the SGI style of doing 3D. The result is what we have today: an ugly body (texture-mapped triangles) in pompous robes (effeminate eye candy). This is an example of the bad effects of GPUs.
Anyway if he wants to be taken seriously he must get a patent
http://www.ipaustralia.com.au/applicant/euclideon-pty-ltd/patents/AU2012903094/
 
Honestly I think you'd see more games that looked the same with CPU based rendering.

You'd reduce the pool of programmers capable of writing a good renderer even further (and it's not a deep pool now); most people would just defer to middleware. Those that didn't would probably write once and reuse many times.
I'm sorry but that is clearly false. Applications that don't use/need the GPU overall don't look/feel very alike, despite having much more low-level access to the CPU. There's really no reduction of the pool of programmers capable of creating something valuable. There's just more of a split between low-level, middle-level and high-level development. You don't have to know x86's encoding format to be part of some innovative application development. Only a handful of compiler programmers do.

Likewise, making the GPU more programmable has never led to games that look more alike. It's certainly true that it has caused a shift from everyone using the API directly, to more people using middleware, but there's more diversity among this middleware than among the legacy APIs.
I think you have to ask what it is you want to do with your CPU renderer that you can't do on a GPU?
It's futile to ask that. It reminds me of discussions on forums like these about what people would do with floating-point pixel processing, back in the GeForce 3 days. John Carmack had a few ideas but many considered that not to be worth the transistors. We now know that what Carmack had in mind was just the tip of the iceberg of what can actually be done with the technology. It's utterly unthinkable to have a GPU without floating-point pixel processing today. So don't worry too much about what the unification of CPU and GPU can result into specifically. There will be a revolutionary application for everyone.
 
I'm sorry to argue with you, sebbbi: you're a very intelligent person. Still, if you have followed Mr. Dell you will agree that he hardly did read anything about octrees. He discovered them, maybe in trying to cull/distribute points among the view frustum's subfrusta (corresponding to viewplane quadrants) faster (perhaps the primeval rendering scheme: a bucket sort where the frusta are the buckets).

In fact, already in its early stages, UD had animated characters (quite a feat, actually).

More detailed volumetric data + high frame rates completely floors the unjustly fashionable and untrue eye candy + destructible environment: kinetic depth effect.

If you want an artistic and static environment, Rage's megatexturing is already working; you can go license id Tech 5 and make a 100GB game if that's what you want (assuming id/Bethesda would be open to negotiation with you).

UD did not talk much about storage requirements. A tech demo with a dozen assets procedurally copy-pasted is fine, but how much data do you need for reasonable game content: 20GB? 100GB? 1TB?
 
Oh, I think it's a lot more than that, mainly that GPUs aren't just very wide...
Yes there's more involved than gather support. I just said that it's currently the main reason why there's still a large gap between CPUs and GPUs for graphics workloads. AVX2's gather support won't close that gap entirely, but it will make software rendering adequate for far more applications. And like I also said, AVX can be extended up to 1024-bit. There really is no reason the CPU can't become as "wide" as the GPU.
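To make the point concrete, this is roughly what gather buys a software renderer's texturing loop; a minimal sketch assuming AVX2 intrinsics, with a made-up single-channel texture layout:

```cpp
// Point-sample one float texel for 8 pixels at once. Before AVX2 the gather
// had to be emulated with 8 scalar loads plus inserts; with AVX2 it is a
// single _mm256_i32gather_ps instruction.
#include <immintrin.h>

__m256 sample8(const float* texture, int pitch, __m256i u, __m256i v)
{
    // index = v * pitch + u, computed for all 8 lanes in parallel
    __m256i idx = _mm256_add_epi32(
        _mm256_mullo_epi32(v, _mm256_set1_epi32(pitch)), u);

    // One vector gather replaces 8 dependent scalar loads.
    return _mm256_i32gather_ps(texture, idx, sizeof(float));
}
```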
...but also that they pipeline everything extremely effectively. There's no way in hell you're going to be able to hide memory latency effectively enough on a CPU to get near performance of a GPU...
CPUs are doing absolutely fine at dealing with memory latency with out-of-order execution, large caches, prefetching, and 2-way SMT. Especially for something as regular as graphics, it's a non-issue. In fact it's really the GPU you should be worried about. A modest amount of memory access irregularity can cause the GPU's cache hit ratio to drop to zero, causing it to be bottlenecked by bandwidth, register space, or work size. The CPU can deal with increasing complexity very elegantly.

If the latency hiding really isn't good enough, there's always the solution of executing ultra wide vector instructions on less wide execution units. That would also benefit power consumption.
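As a small illustration of one of those latency-hiding tools, here is a hedged sketch of explicit software prefetching in a span loop; the prefetch distance and the "shading" are placeholders, not taken from any particular renderer:

```cpp
// Pull data for future iterations into the cache while the out-of-order core
// keeps working on the current pixel; one of several CPU latency-hiding tools
// (alongside large caches, hardware prefetchers and SMT).
#include <xmmintrin.h>   // _mm_prefetch
#include <cstddef>

void shadeSpan(const float* texels, float* out, std::size_t count)
{
    const std::size_t lookahead = 64;   // assumed tuning value
    for (std::size_t i = 0; i < count; ++i) {
        if (i + lookahead < count)
            _mm_prefetch(reinterpret_cast<const char*>(texels + i + lookahead),
                         _MM_HINT_T0);
        out[i] = texels[i] * 0.5f;      // stand-in for real shading work
    }
}
```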
...even if latency wasn't a problem - well, you're still only doing one stage of work at a time on a CPU, while a GPU works on many stages at once. Texture address generation, geometry setup, clipping, texture filtering, shading and so on... All parallel. Serial, on the CPU.
By that reasoning GPU manufacturers should bring back separate vertex and pixel pipelines, because then more things would run in parallel, right?

The fact that things have to be done serially is not an issue in and of itself. Unified processing of the programmable graphics pipeline stages on the GPU has even enabled higher performance through better balancing, and a wide range of new possibilities. It is certainly true that additional CPU performance is needed to cover for fixed-function functionality, but I can tell you it's relatively minor, some of it can be assisted by a handful of new instructions, there are benefits to the additional programmability itself, and again there is still huge untapped potential for increasing the CPU's throughput.
I don't see how you could possibly get remotely similar performance at the same quality level.
I feel sorry for you.
I'm not into arguing semantics. :D
The point was that based on your wording you probably have a wrong idea about software rendering. It's not an emulation. It's an implementation. In fact it's really hardware rendering, since the CPU is hardware too. Unless there's no confusion about that, we should call it GPU rendering and CPU rendering instead.
 
I think it is reasonable to think that if today's CPU architecture were more efficient than GPUs, then AMD, ATI, game consoles, smartphones and the like would be using it instead.
 
Because despite what certain individuals tell you, avx2 will not make CPUs adequately equipped to handle complex 3D rendering.
I assume you're talking about me? I never actually said that AVX2 will make the CPU adequately equipped though. Haswell is a monumental step forward, by bringing several important pieces of programmable GPU technology into the CPU cores, but it's only the beginning.

I will make one bold claim about AVX2 though: it means the end of GPGPU in the consumer market.

It also seems a little misguided that you're talking about "complex" 3D rendering. The CPU can handle 3D rendering that is far more complex than the GPU can handle. Performance-wise the gap between the CPU and GPU was much larger when the rendering was still fixed-function. In relative terms the CPU is better at rendering Crysis 2 than it is at Max Payne 2. So the gap is closing, and more complex rendering is in favor of the CPU. Unreal Engine 4 will even spend the majority of time on general compute algorithms, rather than the traditional graphics pipeline.
 
That's if it amounts to anything
It doesn't necessarily have to amount to anything.

People are trying a lot of different things beyond mere polygons today. But they're limited by the GPU's inflexibility and by the CPU's throughput. Fixing either problem, or both, means a convergence between the two architectures, which eventually will lead to unification.

So don't base your conclusions about the hardware, on the success of one algorithm. There are countless possibilities and some will be more successful than others. They all point in one direction though.
 
That's ATI's marketing term for their Shader Model 1.4 hardware. I'm not sure if I should feel flattered or a little insulted. ;)

Oops, Mr. SwiftShader.
On a related note, AMD should bring back SmartShaders, especially HdRish, which was awesome.
 
But what is this "CPU" that is worth using? Let's say nvidia's Maxwell, a Haswell/Broadwell successor and a Steamroller successor all meet a loose definition of a heterogeneous CPU. They are CPU+GPU, but the GPU gets a lot more CPU-like (even Tesla K20 seems meant to run "software" rather than "shaders").

You have to write your program three times, each time catering to the architecture to get high performance (given real-time constraints and big workloads). For added fun, one has ARM and CUDA libraries, the others have x86-64, AMD has its "HSA" ecosystem, and I don't know what Intel will be doing.
No, you don't have to write your program three times. You just compile it for a different ISA, and let the compiler figure out the details of optimizing for the architecture. AMD is thinking of using HSA as a virtual homogeneous ISA, but that will fail miserably if they keep the CPU and GPU heterogeneous, since compilers can't deal with that and application developers don't want to deal with that. What is needed instead is cores that combine a legacy scalar CPU core and a very wide SIMD engine, which both feed off the same physically homogeneous instruction stream. Intel announced the VEX encoding to be extendable to 1024-bit from the start, and they seem to be pretty well en route with Haswell.
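As a rough illustration of that "write once, recompile per target" idea, a plain scalar kernel like the hypothetical one below gets auto-vectorized to whatever SIMD width the compile flags allow (SSE, AVX, or something wider in the future), with no source changes and no vendor intrinsics:

```cpp
// The vector width comes from the compiler target (-msse2, -mavx2, ...), not
// from the source: the same loop compiles to 128-bit, 256-bit or wider code.
#include <cstddef>

void madBuffer(float* dst, const float* a, const float* b, float c, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = a[i] * b[i] + c;   // trivially auto-vectorizable multiply-add
}
```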
 