If we could get software rendering going again, that might be just the solution we all need.
Tim Sweeney said: This is just the very beginning of a long-term trend in which GPUs will become more and more generalised and will eventually turn into CPU-like computing devices. Obviously, the problem that GPUs are optimized to solve will always be very different from the one CPUs solve, but they will certainly become more and more general purpose, to the point where in the future you can write a C program and compile it for your GPU.
I don't think he meant that the future is only GPGPU.
ooh a return to voxel engines
Oooh, a voxel engine you say? Wait a sec, lemme check back in the warehouse... Yes, one came in this morning. Here you go.
http://youtube.com/user/xenopusRTRT
The lines between software and hardware rendering seem awfully blurry. Filtering jokes aside, is it really fair to call rendering on a Larrabee any less 'hardware' than rendering on a current GPU?
In the past five years CPUs have scaled poorly while GPUs scaled fantastically. But CPUs have recently gotten back on track with performance scaling, thanks to multi-core and wider execution units. Previously we only got a Pentium 4 that was 200 MHz faster every half year or so. Now the number of cores doubles in less than two years, IPC increases with every major generation, and clocks are going up again as well. GPUs, on the other hand, are now maxed out in computational power and bump into bandwidth, latency and thermal limitations, slowing their pace a little.

And doesn't he always believe that the CPUs that will make it come true are only five years away?
"If the rumors are true that Larrabee only has additional filter units, then current GPUs still use more 'hardware' to get the job done. This may change as more and more elements are moved to the shaders."

As has already been discussed extensively in other threads, there is little reason to assume that having "more hardware to get the job done" is automatically more efficient. And Larrabee won't only have sampler units, but also 16-wide vector units with an ISA likely well suited to all sorts of graphics tasks.
Unless consumers can (and want to) use these cores as the primary ones for general-purpose computation, including running the OS, those are coprocessors, not CPUs. And yes, this matters, for a very fundamental reason: if you need two kinds of cores and one of them is only used for gaming and massively parallel workloads, your cost saving compared to a GPU at the low end is a grand total of $0.00, and Epic's points are moot.

Now imagine CPUs with instructions to optimize texture sampling (e.g. gather) and you'll realize that Sweeney might be right: not in the last five years, but maybe in the next.
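The gather idea can be sketched in plain C. This is an illustration only, not a real ISA definition: `gather4` and `bilerp` are hypothetical names, and the loop body stands in for what a hardware gather instruction would do in a single operation when bilinear texture filtering needs four texels at unrelated addresses.

```c
/* Illustrative sketch only: gather4 and bilerp are hypothetical
 * helpers, not real intrinsics. A hardware gather would perform
 * the four non-contiguous loads below in one instruction, which
 * is exactly the access pattern bilinear texture sampling needs. */
static void gather4(const float *texels, const int idx[4], float out[4])
{
    for (int i = 0; i < 4; ++i)      /* one instruction in hardware */
        out[i] = texels[idx[i]];
}

/* Bilinear blend of the four gathered texels with weights fx, fy. */
static float bilerp(const float t[4], float fx, float fy)
{
    float top = t[0] + (t[1] - t[0]) * fx;
    float bot = t[2] + (t[3] - t[2]) * fx;
    return top + (bot - top) * fy;
}
```

The point of the instruction is that the four loads in `gather4` serialize on today's CPUs, while a GPU's sampler does them (plus the blend) in fixed-function hardware.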
Just to be clear, I don't expect GPUs to disappear any time soon. They will just become as programmable as CPUs, to produce movie-quality scenes in real time. There will always be a gap with CPUs, but you won't need a 3D graphics card to enjoy some casual gaming (as you still do today).
The gaming industry is also clearly demanding absolute programmability, so GPUs will most definitely move more tasks to "the shaders". Looking at performance scaling from G70 to G80 (and their transistor counts), it does seem that increasing programmability hurts performance. Larrabee and CPUs can increase their efficiency at stream processing, but GPUs have to take some steps back (relatively). At the convergence point other architectural aspects will determine who's the fastest...
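To make the "16-wide vector unit" point concrete, here is a minimal sketch that assumes nothing about the real Larrabee ISA: a fragment "shader" run across 16 pixels at once in struct-of-arrays form, with a divergent branch turned into a per-lane predicated select, which is how wide masked SIMD machines typically handle control flow.

```c
#define WIDTH 16  /* lanes per vector, matching the rumored 16-wide unit */

/* Conceptual sketch only (not actual Larrabee code): sixteen pixels
 * shaded together in struct-of-arrays form. The branch on light > 0
 * becomes a per-lane select, as it would on a masked SIMD machine. */
typedef struct { float r[WIDTH], g[WIDTH], b[WIDTH]; } frag16;

static void shade(frag16 *f, const float light[WIDTH])
{
    for (int lane = 0; lane < WIDTH; ++lane) {
        /* predicated select instead of a branch */
        float l = light[lane] > 0.0f ? light[lane] : 0.0f;
        f->r[lane] *= l;
        f->g[lane] *= l;
        f->b[lane] *= l;
    }
}
```

On real hardware the whole lane loop would be a handful of vector instructions; the struct-of-arrays layout is what lets each field map onto one 16-wide register.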
Either way, if NV/AMD really want to be able to compete against Larrabee effectively, what they'd ideally need is the capability to offload physics, sound, AI, etc. - basically make the GPU a 'Game Processing Unit'. If they don't do it, eventually Intel will and then I'm not sure what they can do.
"Unless consumers can (and want to) use these cores as the primary ones for general-purpose computation, including running the OS, then those are coprocessors - not CPUs."

I'm talking about all fully general-purpose CPU cores. Possibly simplified ones. Single-threaded performance can be kept at the same level as today. Applications that are not performance-intensive don't need anything more, and modern performance-intensive applications can use the extra cores. Besides, single-threaded performance isn't going to increase much anyway, so you might as well invest the transistors in extra cores. Any developer waiting for a 5 GHz CPU any time soon to be able to run his application has been living under a rock.
Why would Larrabee not be capable of doing, well, everything?

Larrabee doesn't do that yet though, so there definitely is an opportunity gap...
There's nothing that prevents CUDA from being run on x86, although it's not very efficient; similarly, there's nothing that'd prevent this from running efficiently on x86 either. It'd just be much, much slower for massively parallel workloads given the differences in architectures.

As long as NV, AMD and Intel don't provide a common API for such tasks, I don't expect many developers to jump on.
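The "CUDA on x86" point is easy to see with a sketch. Assuming nothing about NVIDIA's actual toolchain, a data-parallel kernel is just a function of a thread index: a GPU launches one instance per index in parallel, while a CPU can serialize the same index space in a loop with identical semantics, only much more slowly. `saxpy_kernel`, `thread_id` and `launch` are hypothetical names.

```c
/* Sketch of why a CUDA-style kernel can run on x86: the kernel is
 * just a function of a thread index. A GPU would run one instance
 * per index in parallel; a CPU can loop over the index space
 * serially with identical results, just far more slowly. */
typedef struct { int x; } thread_id;

static void saxpy_kernel(thread_id tid, int n, float a,
                         const float *x, float *y)
{
    if (tid.x < n)                       /* bounds guard, as in real kernels */
        y[tid.x] = a * x[tid.x] + y[tid.x];
}

/* CPU-side "launch": serialize the grid of threads. */
static void launch(int grid, int n, float a, const float *x, float *y)
{
    for (int i = 0; i < grid; ++i) {
        thread_id t = { i };
        saxpy_kernel(t, n, a, x, y);
    }
}
```

This is also why a common API matters: the kernel body is portable, but each vendor currently wraps it in its own launch machinery.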
Nick said: I'm talking about all fully general-purpose CPU cores. Possibly simplified ones. Single-threaded performance can be kept at the same level as today.

Then you are describing a fantasy chip and nothing from Intel's roadmap in the next several years. Larrabee isn't going to have anywhere near the per-clock single-threaded performance of Conroe or Nehalem...
Nick said: Why would Larrabee not be capable of doing, well, everything?

Because most of its processing power lies in its SIMD unit. Good luck running quite a few kinds of workloads efficiently on that...
"You may be right from a technical point of view. But will the systems that the target group for such games buys have a strong enough CPU for this? It would not help much if the OEMs replaced their IGP motherboards with a CPU that can merely 'emulate' the IGP at the same performance level. Besides, it is questionable whether such a 'stronger CPU, no IGP' system could be built more cheaply."

March 12, 2008: the cheapest Dell laptop, the Inspiron 1525, comes with a 2 GHz Core 2 Duo. That's the same CPU as in the Precision M65 I bought a year ago for four times the money. Single-cores are dead and buried. Even Celerons are dual-core today.
"At the end of the day, I currently see the strong CPUs that can produce 3D graphics fast enough only in systems that will have a strong GPU too."

I have a desktop with a Q6600 (2.4 GHz), which is hardly two times faster than my laptop, even for the most optimally multi-threaded software. So the gap between the slowest and the fastest CPU is pretty small. This has always been the case: nobody can live with a CPU ten times slower than his neighbor's, especially if they exchange software. GPUs, on the other hand, range from insanely fast to barely adequate, and graphics is flexible enough to run on both your GPU and your neighbor's. If this trend of low-end CPUs trailing not far behind high-end CPUs continues, they'll soon surpass those barely adequate IGPs. Especially with some additional instructions, there is no reason why this won't happen sooner or later.
"Then you are describing a fantasy chip and nothing from Intel's roadmap in the next several years."

Nehalem will do fine for next year. And Sandy Bridge might very well include ISA extensions useful for software rendering.
"Because most of its processing power lies in its SIMD unit. Good luck running quite a few kinds of workloads efficiently on that..."

Compared to what? CUDA? You were talking about physics, sound, A.I., etc., which doesn't seem harder to program for Larrabee than for any other stream processor. Also remember that many x86 tools can be used immediately for Larrabee to shorten the development time.
"Nehalem will do fine for next year."

Nehalem is a ~265mm² chip on 45nm. Even after a straight shrink to 32nm it should be as big as Conroe was on 65nm, and with higher wafer costs (although it should clock much higher)... It could be in mainstream PCs with less cache in late 2010, but that depends a lot on your definition of mainstream...
"Compared to what? CUDA?"

Compared to nothing. This is what they should do, not what they have done.
Nick said: You were talking about physics, sound, A.I., etc., which doesn't seem harder to program for Larrabee than any other stream processor.

You won't be able to run quite a few parts of AI on a stream processor. Larrabee is better than a stream processor there, but not by that much; most of the processing grunt is still in the SIMD unit.
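The point about branchy AI code can be made concrete with a toy cost model (names and numbers are purely illustrative): when the 16 lanes of one vector disagree on a branch, a masked SIMD machine executes both sides of it, so the cost is the sum of the two path lengths rather than just the path each lane actually takes.

```c
/* Toy model (illustrative only) of branch divergence on a 16-wide
 * masked SIMD unit: if any lane needs a path, the whole vector
 * spends that path's full length, so a divergent branch pays for
 * both sides. Branchy AI code hits this pattern constantly. */
static int divergent_cost(const int take[16], int len_then, int len_else)
{
    int any_then = 0, any_else = 0;
    for (int i = 0; i < 16; ++i) {
        if (take[i]) any_then = 1;
        else         any_else = 1;
    }
    /* each needed path costs its full length for the whole vector */
    return (any_then ? len_then : 0) + (any_else ? len_else : 0);
}
```

With a uniform branch the vector pays for one path; with even a single dissenting lane it pays for both, which is why decision-heavy code favors scalar cores over the SIMD unit.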