Software Rendering Will Return!

If we could get software rendering going again, that might be just the solution we all need.

Which just shifts the problem to people still using 4-core CPUs when what they actually need is one with at least 12... Isn't that what will happen? There will always be a big gap between what the enthusiast has and what the general consumer who wants a complete PC for 500 or 600 euros has. If you take away the GPU, that just means the enthusiast has a couple of hundred euros more to spend on his CPU, while the average consumer will stick with what they had, because their PCs didn't have much spent on the GPU anyway.
 
This is what he had to say on the subject in May 2006:

Tim Sweeney said:
This is just the very beginning of a long-term trend in which GPUs will become more and more generalised and will eventually turn into CPU-like computing devices.

Obviously the problem that GPUs are optimized to solve will always be a very different problem to CPUs, but they will certainly become more and more general purpose, to the point where you can write a C program and compile it for your GPU in the future.

Taken from an nVnews video interview.

Seems he might have changed his tune a little?
 
I don't think he meant that the future is only GPGPU.
The fact is that alongside a GPGPU you still need a (very fast) CPU, which is both expensive and largely redundant.
CPUs will become more parallelized (like a GPGPU), so we won't need a video card anymore (except for video acquisition and TV).
If NVIDIA doesn't turn to the CPU market, it'll die.
 
The lines between software and hardware rendering seem awful blurry. Filtering jokes aside, is it really fair to call rendering on a Larrabee any less 'hardware' than rendering on a current GPU?
 
Hasn’t this been Sweeney’s dream for years? And doesn’t he always believe that the CPUs that would make it come true are only five years away?

The lines between software and hardware rendering seem awful blurry. Filtering jokes aside, is it really fair to call rendering on a Larrabee any less 'hardware' than rendering on a current GPU?

If the rumors that Larrabee only has additional filter units are true, current GPUs still use more “hardware” to get the job done. This may change as more and more elements are moved to the shaders.
 
And doesn’t he always believe that the CPUs that would make it come true are only five years away?
In the past five years CPUs have done poorly while GPUs scaled fantastically. But CPUs have recently gotten back on track with performance scaling, thanks to multi-core and wider execution units. Previously we only got a Pentium 4 that was 200 MHz faster every half a year or so. Now the number of cores doubles in less than two years, IPC increases with every major generation, and clocks are going up again as well. GPUs, on the other hand, are now maxed out in computational power and bump into bandwidth, latency and thermal limitations, slowing their pace a little.

By the way, Pixomatic and SwiftShader are perfectly capable of running a game like Unreal Tournament 2004, exactly four years old now. So in a way Tim Sweeney is right. The gap doesn't close in five years time, but it isn't widening either. And unless you only enjoy the latest shooters run on the most expensive hardware, the range of games that could be rendered in software is growing. World of Warcraft and The Sims 2 were the most popular games of 2007, and for 2008 it might be Spore. Not exactly cutting edge in terms of visualization. Also look at the popularity of the Wii and mobile platforms to realize exactly how much computing power you need to create games that sell... Now imagine CPUs with instructions to optimize texture sampling (e.g. gather) and you'll realize that Sweeney might be right not in the last five years but maybe in the next. He just expected things to shift more quickly.
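To make the gather point a bit more concrete, here's a minimal sketch of what such a texel fetch could look like in a software renderer. The AVX2-style intrinsics are used purely as a stand-in for the hypothetical instructions being discussed, and the texture handling is deliberately simplified (power-of-two size, nearest filtering, no mipmapping):

#include <immintrin.h>
#include <cstdint>

// Fetch one nearest-neighbour texel for each of 8 pixels in a single call.
// tex          : width x height texture of packed 32-bit RGBA8 texels
// u, v         : normalised texture coordinates in [0, 1) for 8 pixels
// width, height: assumed to be powers of two so wrapping is a simple mask
static inline __m256i gather_texels8(const uint32_t* tex,
                                     __m256 u, __m256 v,
                                     int width, int height)
{
    // Convert normalised coordinates to integer texel coordinates.
    __m256i xi = _mm256_cvttps_epi32(_mm256_mul_ps(u, _mm256_set1_ps((float)width)));
    __m256i yi = _mm256_cvttps_epi32(_mm256_mul_ps(v, _mm256_set1_ps((float)height)));

    // Wrap addressing (power-of-two sizes assumed).
    xi = _mm256_and_si256(xi, _mm256_set1_epi32(width - 1));
    yi = _mm256_and_si256(yi, _mm256_set1_epi32(height - 1));

    // Linear texel index = y * width + x, per lane.
    __m256i index = _mm256_add_epi32(_mm256_mullo_epi32(yi, _mm256_set1_epi32(width)), xi);

    // The interesting part: one gather fetches 8 scattered texels.
    // Without it, this becomes 8 scalar loads plus shuffles to rebuild the vector.
    return _mm256_i32gather_epi32(reinterpret_cast<const int*>(tex), index, 4);
}

Bilinear filtering would need four such gathers plus the weighting math, which is exactly where a dedicated instruction (or sampler-like hardware assist) would pay off.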

Just to be clear, I don't expect GPUs to disappear any time soon. They will just become as programmable as CPUs to produce movie-quality scenes in real-time. There will always be a gap with CPUs, but you won't need a 3D graphics card to enjoy some casual gaming (as you still do today).
If the rumors that Larrabee only has additional filter units are true, current GPUs still use more “hardware” to get the job done. This may change as more and more elements are moved to the shaders.
As has already been discussed extensively in other threads, there is little reason to assume that having "more hardware to get the job done" is automatically more efficient. And Larrabee won't only have sampler units but also 16-wide vector units, likely with an ISA very well suited to all sorts of graphics tasks.

The gaming industry is also clearly demanding absolute programmability, so GPUs will most definitely move more tasks to "the shaders". Looking at performance scaling from G70 to G80 (and their transistor counts), it does seem that increasing programmability hurts performance. Larrabee and CPUs can increase their efficiency at stream processing, but GPUs have to take some steps back (relatively). At the convergence point other architectural aspects will determine who's the fastest...
 
Now imagine CPUs with instructions to optimize texture sampling (e.g. gather) and you'll realize that Sweeney might be right not in the last five years but maybe in the next.
Unless consumers can (and want to) use these cores as the primary ones for general-purpose computation, including running the OS, then those are coprocessors - not CPUs. And yes, this does matter, and for a very fundamental reason: if you need two kinds of cores and one of them is only used for gaming and massively parallel workloads, your cost saving compared to a GPU in the low-end is a grand total of $0.00 and Epic's points are moot.

Either way, if NV/AMD really want to be able to compete against Larrabee effectively, what they'd ideally need is the capability to offload physics, sound, AI, etc. - basically make the GPU a 'Game Processing Unit'. If they don't do it, eventually Intel will and then I'm not sure what they can do. Larrabee doesn't do that yet though, so there definitely is an opportunity gap... If someone could make anything more than a dual-core CPU useless even for gaming and video encoding, they win (if they can get the software guys on their side too).
 
Just to be clear, I don't expect GPUs to disappear any time soon. They will just become as programmable as CPUs to produce movie-quality scenes in real-time. There will always be a gap with CPUs, but you won't need a 3D graphics card to enjoy some casual gaming (as you still do today).

You may be right from a technical point of view. But will the systems that the target group for such games buys have a strong enough CPU for this? It would not help much if the OEMs replaced their IGP motherboards with a CPU that can just “emulate” this IGP at the same performance level. Besides this, it is questionable whether such a “stronger CPU, no IGP” system could be built any cheaper.

At the end of the day, I currently only see CPUs that can produce 3D graphics fast enough ending up in systems that will have a strong GPU too.

As has already been discussed extensively in other threads, there is little reason to assume that having "more hardware to get the job done" is automatically more efficient. And Larrabee won't only have sampler units but also 16-wide vector units, likely with an ISA very well suited to all sorts of graphics tasks.

I haven’t predicted any performance differences. It was just a description of the current situation.

The gaming industry is also clearly demanding absolute programmability, so GPUs will most definitely move more tasks to "the shaders". Looking at performance scaling from G70 to G80 (and their transistor counts), it does seem that increasing programmability hurts performance. Larrabee and CPUs can increase their efficiency at stream processing, but GPUs have to take some steps back (relatively). At the convergence point other architectural aspects will determine who's the fastest...

We have also demanded much faster cores, and have learned again that “you don’t always get what you want”. Comparing G70 and G80 is IMHO not valid here. These two chips don’t follow the same base design, so we don’t know how fast an “SM3 G80” would have been.

Either way, if NV/AMD really want to be able to compete against Larrabee effectively, what they'd ideally need is the capability to offload physics, sound, AI, etc. - basically make the GPU a 'Game Processing Unit'. If they don't do it, eventually Intel will and then I'm not sure what they can do.

As long as NV, AMD and Intel don’t provide a common API for such tasks, I don’t expect many developers to jump on.
 
Unless consumers can (and want to) use these cores as the primary ones for general-purpose computation, including running the OS, then those are coprocessors - not CPUs.
I'm talking about all fully general-purpose CPU cores. Possibly simplified ones. Single-threaded performance can be kept at the same level as today. Applications that are not performance intensive don't need anything more and modern performance intensive applications can use the extra cores. Besides, single-threaded performance isn't going to increase much anyway, so you might as well invest the transistors in extra cores. Any developer waiting for a 5 GHz CPU any time soon to be able to run his application has been living under a rock.
Larrabee doesn't do that yet though, so there definitely is an opportunity gap...
Why would Larrabee not be capable of doing, well, everything?
 
As long as NV, AMD and Intel don’t provide a common API for such tasks, I don’t expect many developers to jump on.
There's nothing that prevents CUDA from being run on x86, although it's not very efficient; similarly, there's nothing that'd prevent this from running efficiently on x86 either. It'd just be much much slower for massively parallel workloads given the differences in architectures.
Nick said:
I'm talking about all fully general-purpose CPU cores. Possibly simplified ones. Single-threaded performance can be kept at the same level as today.
Then you are describing a fantasy chip and nothing from Intel's roadmap in the next several years. Larrabee isn't going to have anywhere near the per-clock single-threaded performance of Conroe or Nehalem...
Nick said:
Why would Larrabee not be capable of doing, well, everything?
Because most of its processing power lies in its SIMD unit. Good luck running quite a few kinds of workloads efficiently on that...
 
There's nothing that prevents CUDA from being run on x86, although it's not very efficient; similarly, there's nothing that'd prevent this from running efficiently on x86 either. It'd just be much much slower for massively parallel workloads given the differences in architectures.

I was thinking more of an API that everyone supports. I don’t expect AMD/ATI to support CUDA anytime soon.
 
You may be right from a technical point of view. But will the systems that the target group for such games buys have a strong enough CPU for this? It would not help much if the OEMs replaced their IGP motherboards with a CPU that can just “emulate” this IGP at the same performance level. Besides this, it is questionable whether such a “stronger CPU, no IGP” system could be built any cheaper.
March 12, 2008. The cheapest Dell laptop, the Inspiron 1525, comes with a Core 2 Duo 2 GHz. That's the same CPU as in my Precision M65 I bought a year ago for four times the money. Single-cores are dead and buried. Even Celerons are dual-core today.

So clearly Intel has no problem offering ever faster CPUs including in the lowest price range. And I don't see why people won't be willing to buy quad-cores and beyond when the prices drop to the same level. People have always wanted faster computers so there will always be a demand for faster CPUs.

So within a year or two we'll have cheap systems with no less than 100 GFLOPs of CPU power. On the other hand, IGPs have almost stagnated. Sure, faster ones appear every year, but the older and slower ones still find their way into systems with relatively powerful CPUs simply because they're cheaper and some people don't demand anything more.

So it makes a lot of sense to me to make the CPU a little more suited for graphics and get rid of the IGP. Every dollar matters in the low-end market. Half of these systems won't even be used for casual games.

And that's phase one. My crystal ball gets more blurry when looking beyond that. ;)
At the end of the day, I currently only see CPUs that can produce 3D graphics fast enough ending up in systems that will have a strong GPU too.
I have a desktop with a Q6600 (2.4 GHz), which is hardly two times faster than my laptop, even for optimally multi-threaded software. So the gap between the slowest and the fastest CPU is pretty small. This has always been the case. Nobody can live with a CPU ten times slower than his neighbor's, especially if they exchange software. GPUs on the other hand range from insanely fast to barely adequate, and graphics is flexible enough to run on both your GPU and your neighbor's. If this trend of low-end CPUs trailing not far behind high-end CPUs continues, they'll soon surpass those barely adequate IGPs. Especially with some additional instructions, there is no reason why this won't happen sooner or later.
 
Then you are describing a fantasy chip and nothing from Intel's roadmap in the next several years.
Nehalem will do fine for next year. And Sandy Bridge might very well include ISA extensions useful for software rendering.

Remember that only Conroe/Merom or better are sold today. Before 2012 you'll see nothing less than a 4 GHz quad-core Sandy Bridge in low-end systems. Yes, that's 128 GFLOPs. Yes, that's 2000 FLOPs per pixel at 1680x1050 at 36 FPS. No match for a discrete GPU, but definitely something for system designers to think about when comparing with the cheapest IGP.
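For what it's worth, here's the back-of-the-envelope math behind those numbers, assuming 8 single-precision FLOPs per core per clock (a 4-wide SSE add plus a 4-wide SSE multiply each cycle); that per-clock figure is an assumption, not a published spec:

#include <cstdio>

int main()
{
    // Assumed: 4 GHz, 4 cores, 8 single-precision FLOPs per core per clock.
    const double gflops = 4.0 * 4 * 8;                                // 128 GFLOPs

    // 1680x1050 pixels at 36 frames per second.
    const double pixels_per_second = 1680.0 * 1050.0 * 36.0;          // ~63.5 million
    const double flops_per_pixel = gflops * 1e9 / pixels_per_second;  // ~2016

    printf("%.0f GFLOPs, ~%.0f FLOPs per pixel\n", gflops, flops_per_pixel);
    return 0;
}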
Because most of its processing power lies in its SIMD unit. Good luck running quite a few kinds of workloads efficiently on that...
Compared to what? CUDA? You were talking about physics, sound, A.I., etc., which don't seem any harder to program for Larrabee than for any other stream processor. Also remember that many x86 tools can be used immediately for Larrabee to shorten the development time.
 
Nehalem will do fine for next year.
Nehalem is a ~265mm² chip on 45nm. Even after a straight shrink to 32nm it should be as big as Conroe was on 65nm, and with higher wafer costs (although it should clock much higher)... It could be in mainstream PCs with less cache in late 2010, but that depends a lot on your definition of mainstream...
Compared to what? CUDA?
Compared to nothing. This is what they should do, not what they have done.
You were talking about physics, sound, AI, etc., which don't seem any harder to program for Larrabee than for any other stream processor.
You won't be able to run quite a few parts of AI on a stream processor. Larrabee is better than a stream processor there, but not by that much; most of the processing grunt is still in the SIMD unit.

I was going to post this as news but still haven't - anyway, PowerVR's SGX turns out to be a 100% MIMD GPU according to their latest developer SDK. No branching granularity requirements; it just works at full performance (although it's still faster to use predication for small branches, for a variety of reasons). If you're really smart and you're not forced to use a retarded ISA like x86, it *is* possible to create a good MIMD GPU (as PowerVR has proven, much to my surprise). And the implication then is that many-core CPUs become completely redundant.
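For anyone unfamiliar with the predication point: on a SIMD machine a small per-pixel branch is typically turned into "compute both sides and select", whereas a MIMD design can genuinely take a different branch per pixel. A rough illustrative sketch in SSE-style C++ (my own example, not anything from PowerVR's SDK):

#include <immintrin.h>

// Per-pixel logic "out = (x > 0) ? a : b" for 4 pixels at once.
// SIMD predication: both a and b are computed for every lane, then a
// mask-driven blend picks the right one. A MIMD core could instead
// branch per pixel and skip the unused side entirely.
static inline __m128 select_if_positive(__m128 x, __m128 a, __m128 b)
{
    __m128 mask = _mm_cmpgt_ps(x, _mm_setzero_ps());  // lane-wise x > 0
    return _mm_blendv_ps(b, a, mask);                 // a where mask set, b elsewhere (SSE4.1)
}

For a tiny branch like this the blend is cheap; the real cost shows up when the two sides are long and divergent, which is exactly where per-pixel branching wins.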
 