By the way, Pixomatic and SwiftShader are perfectly capable of running a game like Unreal Tournament 2004, exactly four years old now.
Where's the achievement in that exactly, considering the awful output/resolution?
> That's what I said. But you do have to use SIMD to get rid of redundant control logic and optimize performance / area. The only reason not to use at least 4-way SIMD would be if you don't aim the chip primarily at graphics.

Unless your engineers are smart enough to make the control logic really simple and efficient... You'll always have *some* overhead, but an IEEE-compliant FP32 unit is hardly cheap, so the control costs should be considered relative to that too.
> Where's the achievement in that exactly, considering the awful output/resolution?

What awful output? For Pixomatic, make sure you set Scale2X=False and FilterQuality3D=3. It's perfectly playable at 800x600 on my 2.4 GHz Core 2. And that's a 2004 single-threaded software renderer that makes very little use of SSE and was developed by two people.
> I play Crysis on my laptop's Quadro FX 350M (G72). It isn't great, but it's perfectly playable. Now, even if that GPU were suited for A.I., the dual-core CPU would still be a much better fit.

Depending on the type of AI and what methods it uses to evaluate its environment, this can be very true or very wrong.
> In the 32 nm era, when quad-core makes it into the mainstream, there will still be a majority of systems equipped with low-end graphics cards. But the CPU has four powerful cores at your disposal...

I might as well rewrite your phrase and replace "graphics" with "CPU".
> Excellent. But that core isn't going to be significantly faster at anything other than graphics. It will have SIMD units very much like the other cores, and texture samplers. And unless texture samplers can vastly accelerate A.I. you'd better let the other cores handle it.

It will depend on the AI and the methods used, which should be determined by the developers, not the limitations of the hardware. Or is your harping on how developers need more freedom only applicable to algorithms that favor the very limiting design assumptions that general-purpose CPUs rely upon?
> Also, heterogeneous cores again make it more difficult for developers. What's Intel going to do, and what is AMD going to do? Will all CPUs have an IGP, or just the mobile or low-end ones?

Wait for integration and the probable abstraction layers that Peakstream, Intel, AMD, Nvidia, and everybody else are pursuing. It might be a pretty well-established paradigm around the time you expect octo-cores to be introduced to the mainstream.
> Yes, it costs cycles, but it's nonsense that this is why some games have poor A.I. Do you honestly think that extra cycles alone are going to fix that? It has a lot more to do with programmer skills, budget, time, and creativity than just raw cycles.

A lot of it is cycles. AI can either have a time dependency in a dynamic environment or a large problem space to evaluate in a more strategic game.
> Unless your engineers are smart enough to make the control logic really simple and efficient... You'll always have *some* overhead, but an IEEE-compliant FP32 unit is hardly cheap, so the control costs should be considered relative to that too.

How about the cost in allowing the more flexible register access? Connecting the multiplicity of units to allow an equivalent amount of reads and writes, even from local storage or the register file, must have an impact.
That's a very good point. I didn't realize this before, although it's obvious now. Thanks for pointing it out!

EDIT: Also, I presume this was obvious, but if you've got this program:

{ if(...) result = func1(input); else result = func2(input); tex(result); }

then, if there are no texture instructions in the functions, you can skip them completely for individual pixels for which the test fails. And even if there are texture instructions in there, it doesn't matter as long as the texture coordinates were calculated outside any conditionally executed instruction.
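For what it's worth, the skip is easy to sketch. Here's a toy Python model of it (a plain list stands in for a SIMD register, and func1/func2 are the hypothetical shader functions from the snippet above; this is an illustration, not anyone's actual renderer code):

```python
# Toy model of dynamic branching over a 4-pixel quad in a software
# renderer: each conditionally executed block runs only for the pixels
# whose mask bit is set, and is skipped entirely when no pixel needs it.
def run_quad(cond_mask, func1, func2, inputs):
    results = [None] * len(inputs)
    if any(cond_mask):               # whole 'if' side skipped when mask is all-false
        for i, taken in enumerate(cond_mask):
            if taken:
                results[i] = func1(inputs[i])
    if not all(cond_mask):           # whole 'else' side skipped when mask is all-true
        for i, taken in enumerate(cond_mask):
            if not taken:
                results[i] = func2(inputs[i])
    return results
```

And as noted above, if func1 contains texture instructions the skip is only safe when the texture coordinates were computed outside the conditional block.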
> How about the cost in allowing the more flexible register access? Connecting the multiplicity of units to allow an equivalent amount of reads and writes, even from local storage or the register file, must have an impact.

Yeah, you need extra decoders, extra constant buffer read ports, likely some synchronization logic, and you need to route a lot of extra wires. But for a mobile chip, ALU usage might just have a bigger impact on performance / watt.
> At 32 nm, chipset designers are going to be killing themselves to put something on their chipsets. The northbridge or northbridge/southbridge will have its capacity for size reduction capped by the need for the many hundreds of pins needed to interface with the system. The manufacturers are going to put something in that space.

What northbridge? Nehalem will have integrated memory controllers, and some variants will have IGP cores. I'm sure some manufacturers are not happy about this evolution, but eventually it will all get integrated (e.g. with four CPU cores or more, a software RAID controller is fine). They can keep coming up with new stuff, but once it gets to the point where it's too small to fit on a separate chip, it will get integrated into something else or become software.
> To top it off, the returns for quad core are not double that of the returns for transitioning to dual core from single core.

True, but it all depends on the workload. GPUs are doing great with a lot more cores, so I'm not sure what your point is. One way or another you're going to add additional processing units that work concurrently. And in my opinion, additional CPU cores are not the worst choice at all.
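That "not double the returns" observation is just Amdahl's law in action, and it really does depend on the workload. A quick illustrative sketch (the 10% serial fraction is an assumption picked for the example, not a measured number):

```python
# Amdahl's law: overall speedup on n cores when a fraction s of the
# work is serial and the rest parallelizes perfectly.
def amdahl_speedup(n, s):
    return 1.0 / (s + (1.0 - s) / n)

# Each doubling of cores buys a smaller factor than the one before:
for cores in (1, 2, 4, 8):
    print(cores, round(amdahl_speedup(cores, 0.10), 2))
# 1 1.0
# 2 1.82
# 4 3.08
# 8 4.71
```

Single to dual gains 1.82x, dual to quad only 1.69x, quad to octo 1.53x. So whether eight cores pay off comes down to how small the serial fraction of the workload can be made.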
> The transition to eight general cores will be for the foreseeable future a highly dubious prospect for the mainstream. That means your core scaling argument is good for about 4 years.

When looking at current software, yes, it can be hard to imagine any purpose for an octa-core. But in 4 years a lot can change. Like I said before, some game developers are already certain to be able to make good use of quad-cores. In 4 years they'll have more tools to work with and likely some hardware-assisted locking/scheduling. In fact, Nehalem will re-introduce Hyper-Threading. So clearly Intel is confident that eight threads are manageable in the not too distant future.
> It will depend on the AI and the methods used, which should be determined by the developers, not the limitations of the hardware...

What the heck are you saying? Anyone calling himself a software engineer is going to take hardware limitations as an equally serious factor in determining the right approach.
> ...or is your harping on how developers need more freedom only applicable to algorithms that favor the very limiting design assumptions that general-purpose CPUs rely upon?

What limitations? Except for rasterization, GPUs are a lot more limiting. For any random application, developers are first going to look at running it on the CPU, simply because it can run anything imaginable. Only if it's clearly going to run faster on even an IGP will they consider deviating from the default. This isn't going to change in the next few years, not with multi-core steadily evolving and IGPs sticking around.
> Wait for integration and the probable abstraction layers that Peakstream, Intel, AMD, Nvidia, and everybody else are pursuing.

These attempts are more successful for multi-core CPUs than for GPUs. Even with GPUs getting more programmable every generation, you still have to deal with slow IGPs and crappy drivers.
> It might be a pretty well-established paradigm around the time you expect octo-cores to be introduced to the mainstream.

Exactly. But again, the only reliably evolving baseline is the CPU, so they're not just going to stop investing in that. Hence octa-cores are coming our way sooner or later, and software rendering will compete with IGPs from the bottom up.
> A lot of it is cycles. AI can either have a time dependency in a dynamic environment or a large problem space to evaluate in a more strategic game. There is a bare minimum of complexity that cannot be avoided with heuristics, and an AI that heavily relies on assumptions or hard-coded values that save cycles tends to be increasingly fragile or ineffective.

Sure, a minimum of complexity is unavoidable. But modern CPUs offer billions of cycles per second to turn your agent into something that doesn't run circles. If I see a game having good A.I. and a nearly identical game having terrible A.I., I'm not going to conclude that I have to upgrade my hardware... Badly optimized software is a reality, and throwing more cycles at it is not the right answer.
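The tension being argued here (a per-frame time budget versus a large problem space) can be made concrete with a toy "anytime" evaluation loop. The candidate set and score function below are invented purely for illustration; no real engine is being quoted:

```python
import time

# Toy 'anytime' AI evaluation: examine as many candidate actions as a
# fixed per-frame budget allows, then act on the best one seen so far.
def evaluate_with_budget(candidates, score, budget_s):
    deadline = time.perf_counter() + budget_s
    best, best_score, examined = None, float("-inf"), 0
    for c in candidates:
        if time.perf_counter() >= deadline:
            break                    # out of cycles for this frame
        s = score(c)
        examined += 1
        if s > best_score:
            best, best_score = c, s
    return best, examined
```

With a generous budget the whole space is searched and the true optimum comes out; with a tight one the agent still acts, just on less information. That's the sense in which extra cycles help, and also why a badly tuned search wastes them.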
> While it's an achievement compared to former CPUs and/or software renderers, it's hardly any kind of achievement even compared to a recent IGP.

Granted, a recent IGP is still a better option in most cases. But software rendering is, slowly but steadily, catching up. Games that previously required serious hardware now run smoothly in software. And modern casual games that use the same level of 3D technology are already a perfect match for software rendering. And the range of applications for which software rendering is adequate is only getting bigger every year. Interestingly, as ALU/TEX ratios go up, CPUs have relatively less trouble processing pixels. Mark my words: well within five years we'll be able to play Crysis smoothly on the CPU. Ironically though, some might not think of it as an achievement any more by then...

You really have to look at it as a proof-of-concept. It's just one step in the direction of making software rendering viable again. In this light I think it's a major achievement that we can smoothly run games that were originally intended to run only on hardware, just a few years after their release.

What I also consider an achievement is that while a recent IGP still costs a few bucks, a software renderer that comes with an application costs essentially nothing (it pays for itself by reducing support calls). I'm sure at least a few people are happy they can run certain casual games without upgrading their outdated hardware.

> Been there done that...

Been where, done what exactly?

> ...and no, the result is Voodoo3 material at best.

Last time I checked, Voodoo3 did not support DirectX 9 at all.

One gigantic benefit of software rendering is that you can upgrade it. People who are still stuck with a Radeon 8500, GeForce 4 MX or an 855GM can actually run DirectX 9 applications in software. Your application is also going to look exactly the same on any machine. For casual games and medical applications that's already a very serious argument.
The reason IGPs exist is because there's a demand for very cheap chips that can handle the minimal 3D demands. For the same reason software rendering has a very good chance of returning even if it's no match against an IGP any time soon. It's not going to fulfill everyone's expectations, but in my eyes for those playing only casual games it already has returned...
I certainly think you might be right in the moderately long term there, but only for desktops and servers. In the laptop market (and even more so for UMPCs/MIDs), the higher power efficiency of IGPs will likely keep them highly relevant, unless we're talking about Larrabee-like GPU-oriented cores being on every CPU. Either way, what happens there depends a lot on what a few executives decide is best, and that's rarely very predictable, sadly.
> What northbridge? Nehalem will have integrated memory controllers, and some variants will have IGP cores. I'm sure some manufacturers are not happy about this evolution, but eventually it will all get integrated (e.g. with four CPU cores or more, a software RAID controller is fine). They can keep coming up with new stuff, but once it gets to the point where it's too small to fit on a separate chip, it will get integrated into something else or become software.

I purposefully noted northbridge or combination northbridge/southbridge.
> True, but it all depends on the workload. GPUs are doing great with a lot more cores, so I'm not sure what your point is.

Depends on what you consider a core. The way they are set up now, with the exception of the multi-GPU setups, they are not truly multicore.
> One way or another you're going to add additional processing units that work concurrently. And in my opinion, additional CPU cores are not the worst choice at all.

The question is how much, and this is where diminishing returns come in.
> When looking at current software, yes, it can be hard to imagine any purpose for an octa-core. But in 4 years a lot can change.

The increasing influence of the laptop market is going to severely impact the prevalence of octo-core, general-purpose-only chips.
> Like I said before, some game developers are already certain to be able to make good use of quad-cores.

Those that make occasionally decent use of quad cores already make significantly better use of GPUs to do far more than the quad cores could dream of handling. I don't see how that changes anything.
> In 4 years they'll have more tools to work with and likely some hardware-assisted locking/scheduling. In fact, Nehalem will re-introduce Hyper-Threading. So clearly Intel is confident that eight threads are manageable in the not too distant future.

Look at the more mainstream versions of Nehalem and see what Intel finds to have more bang for the buck.
> It might take a decade or so to mature, but I'm confident that we won't run into a dead end beyond quad-core.

It's too bad the market's not going to wait decades for this.
> What the heck are you saying? Anyone calling himself a software engineer is going to take hardware limitations as an equally serious factor in determining the right approach.

You are advocating the removal of specialized hardware. You are taking away a world of choice.
> Only if it's clearly going to run faster on even an IGP will they consider deviating from the default. This isn't going to change in the next few years, not with multi-core steadily evolving and IGPs sticking around.

Here's where we'll probably have to differ.
> These attempts are more successful for multi-core CPUs than for GPUs. Even with GPUs getting more programmable every generation, you still have to deal with slow IGPs and crappy drivers.

I don't see how. There's little if any point for abstraction or dynamic recompilation to target a core if all the cores are the same general-purpose core.
> Exactly. But again, the only reliably evolving baseline is the CPU, so they're not just going to stop investing in that. Hence octa-cores are coming our way sooner or later, and software rendering will compete with IGPs from the bottom up.

This future will be delayed, or possibly cancelled, in 2H 2009 by both AMD and Intel.
> Sure, a minimum of complexity is unavoidable. But modern CPUs offer billions of cycles per second to turn your agent into something that doesn't run circles.

It's more fascinating to watch the generalized hardware waste most of those billions of cycles while pumping out hundreds of watts for no return.
> If I see a game having good A.I. and a nearly identical game having terrible A.I., I'm not going to conclude that I have to upgrade my hardware... Badly optimized software is a reality, and throwing more cycles at it is not the right answer.

You'd rather have those octo-cores hammering away at software rendering that could have been handled adequately by a GPU 4 years prior using a fraction of the transistors, and you're lecturing on optimality?
The MS reference rasteriser.
> Seen it time and time again.

Please specify.
> I truly wish UT2k4 would even deserve the title of a D3D9.0 title. It's in its vast majority a DX7 T&L-optimized game, and in a few sparse spots it might have a couple of DX8.0 shaders. With V3 material I meant resolution and filtering quality as examples.

I only mentioned Unreal Tournament 2004 because that's a game that now, four years later, runs perfectly in software. That's performance-wise. Feature-wise, we're already four years past that, and we're certainly not limited to DirectX 7/8 games.
> I'd love to see such a SW renderer for Oblivion, especially on the 4 MX or the 855GM.

You don't even need a 3D-capable graphics card.
> I certainly think you might be right in the moderately long term there, but only for desktops and servers. In the laptop market (and even more so for UMPCs/MIDs), the higher power efficiency of IGPs will likely keep them highly relevant unless we're talking about Larrabee-like GPU-oriented cores being on every CPU.

True, power efficiency is very important in the mobile world. But there are plenty of arguments why I believe it's only going to take one generation longer:
> Either way, what happens there depends a lot on what a few executives decide is best, and that's rarely very predictable, sadly.

Yeah, it's about a lot more than just feasibility. But once it's technically sound, what the executives decide can work either against or in favor of software rendering. So it's no more unpredictable than for anything else.
> Regarding southbridges, you seem not to be realizing that the cost there is also related to analogue and I/O...

I wasn't implying the removal of southbridges any time soon. I was only saying that things like the digital side of RAID controllers, for which software drivers already exist, are only going to get pushed to the (multi-core) CPU side more and more. We're obviously always going to need analogue and I/O in hardware.
> And I just thought I'd point out that obviously most of us on this forum don't take software rendering very seriously because of our "heritage", but I'm personally glad that you keep defending it so vigorously and try to dispel some myths from time to time! It's certainly nice to be able to have debates on the subject with someone who has so much first-hand experience!

You're welcome. I might be barking up the wrong tree most of the time, but it's interesting to see how hardware rendering is so vigorously defended against harmless software rendering. It only makes me more certain of where software rendering technology is/should be heading... So thank you!