The End of The GPU Roadmap

By the way, that's exactly what Carmack plans to look into, but according to his QuakeCon talk it's still in the theoretical stage...
 
And that's the point of the whole discussion. You're probably right that workloads haven't hit the point where emulation is more efficient. Yet! Less than ten years from now, that situation will have changed drastically. Games will consist of maybe 50% graphics calculations that behave nicely, and 50% exotic algorithms that require highly generic computing cores.
Will they? Considering that the fastest selling console uses a glorified DX7-level chip I wouldn't be quite sure of it. With current engines and techniques the art production pipeline is the largest money sink in many games and I'm unsure how far it can be taken without making game development totally unprofitable. Will we use even more advanced rendering techniques? Probably, but Nintendo's success shows that the game market might simply not move in that direction as games are made mainly for profit and not for technical merits. Thus we might end up with more or less the same technology as today in a much smaller and lower power envelope - for example - which does not favor pure-software implementations as they will always be less power efficient than dedicated hardware.
I think the word "generally" is spot on. The problem isn't new software, it's legacy software. If we'd had CPUs with 32 cores by now, we'd have developers writing software with 32 threads. But they would have to sacrifice single-threaded performance to create such a CPU today. That's just not going to happen while there's still a majority of single-threaded software. But it doesn't mean the evolution toward more cores has stopped either. In ten years time the situation will be totally different. All software will be multi-threaded and scalable.
While I'd really love this to be true, it just won't happen. Performance-critical software will be multi-threaded, as it mostly is already today. Some software which could use the improvement will also become multi-threaded, but the rest of the software industry will move along just as it ignored SIMD extensions, profile-based optimization, GPU acceleration and all the other 'potential' performance improvements which required more effort during development. Heck, 95% of software would likely go much faster if performance had been taken into account during its development in the first place. I'd love to have only heavily optimized software on my machine, but it just won't happen: functionality and stability come first, and unless performance becomes a serious bottleneck it is usually ignored. Quoting a somewhat recent post on the GCC development mailing list:

"I won't live long enough to see GCC multi-threaded."
 
Will they? Considering that the fastest selling console uses a glorified DX7-level chip I wouldn't be quite sure of it. With current engines and techniques the art production pipeline is the largest money sink in many games and I'm unsure how far it can be taken without making game development totally unprofitable. Will we use even more advanced rendering techniques? Probably, but Nintendo's success shows that the game market might simply not move in that direction as games are made mainly for profit and not for technical merits. Thus we might end up with more or less the same technology as today in a much smaller and lower power envelope - for example - which does not favor pure-software implementations as they will always be less power efficient than dedicated hardware.
Ah, consoles. They'll always be designed for a specific market, and if dedicated hardware rendering is best for the casual gaming consoles then that's what they'll use. But once PC games really start to benefit from architectures like Larrabee and inventive new games appear, the console market will follow. Or at least part of it. Nobody needs Mario to be raytraced. Or do we? In ten years that might still cost too much. But what about twenty years? Twenty years ago it was inconceivable to render Mario in 3D. So in twenty years from now will we still have a DX7-class chip, measuring a tenth of a square millimeter by then? Highly unlikely.

There's a general trend toward using generic components to cut costs. The volume you need to sell to justify designing a dedicated chip keeps growing. And since you already have a CPU anyway, you might as well just take a slightly more powerful one from a manufacturer that produces them in high enough volume to sell them cheap. In many ways using a CPU is ridiculously cheap compared to designing hardware from scratch. Just ask the OGP guys. Oh, wait, they haven't woken up yet to the fact that this looks a lot better than this. But I guess that's just me. Either way, the point is, why design a chip that measures a tenth of a square millimeter when you can have a cheap software renderer on the CPU? It might even happen sooner than you think.

Intel is already projecting CPUs to be manufactured at 4 nm by 2022. That's massive computing power even for a chip the size of a baby's fingernail. When I bought my first dual-core CPU, it cost me more than 400 € and it produced a lot of heat. Today I can buy a faster one for less than 40 € that stays cool and renders Half-Life 2 at NTSC resolution real nicely. So try imagining what a small 4 nm chip can do, and it easily becomes apparent that when you're designing a casual gaming console there's just no need to add a custom GPU.

Soon any console will have to be capable of recognizing user gestures and movement by processing data from a pair of cameras. Is the rigid architecture of today's GPUs the best way to go forward? I really doubt it. And this is only the beginning of the inventive things consoles will do with extra computational power. There are a lot of things still left to explore. The only thing that is certain is that the consoles of the future will have a lot more to do than just graphics, and if you want to avoid bottlenecks and cut costs, a highly generic CPU makes sense.
While I'd really love this to be true, it just won't happen. Performance-critical software will be multi-threaded, as it mostly is already today. Some software which could use the improvement will also become multi-threaded, but the rest of the software industry will move along just as it ignored SIMD extensions, profile-based optimization, GPU acceleration and all the other 'potential' performance improvements which required more effort during development. Heck, 95% of software would likely go much faster if performance had been taken into account during its development in the first place. I'd love to have only heavily optimized software on my machine, but it just won't happen: functionality and stability come first, and unless performance becomes a serious bottleneck it is usually ignored.
You're confusing application development with software development.

Back when Intel called its SIMD instruction set Internet Streaming SIMD Extension (iSSE) instead of just SSE, people laughed about how it could possibly accelerate the internet. Nowadays we have websites with sound, video, interactive media, games, and full-blown 3D. So while application software is increasingly being developed in an inefficient high-level language, it makes use of highly optimized plugins, libraries, frameworks and tools that do make use of SIMD and multi-core and whatnot. Why does Intel invest in AVX, which will start with 256-bit wide vectors but has the potential of scaling to 1024-bit vectors? Because it matters.
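To make that concrete, here's the kind of kernel such a library hides behind an ordinary function call. This is just an illustrative, untested sketch using the 256-bit AVX intrinsics; the function name and loop structure are mine, not taken from any particular library:

#include <immintrin.h>
#include <cstddef>

// Hypothetical library routine: dst[i] += src[i] * factor, 8 floats at a time.
// The application developer calls it like any other C++ function and never
// touches SIMD directly.
void scale_add(float* dst, const float* src, float factor, std::size_t count)
{
    __m256 f = _mm256_set1_ps(factor);              // broadcast factor to 8 lanes
    std::size_t i = 0;
    for (; i + 8 <= count; i += 8)                  // one 256-bit register = 8 floats
    {
        __m256 s = _mm256_loadu_ps(src + i);
        __m256 d = _mm256_loadu_ps(dst + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(d, _mm256_mul_ps(s, f)));
    }
    for (; i < count; ++i)                          // scalar tail for the remainder
        dst[i] += src[i] * factor;
}

The application on top of it can stay in whatever high-level language it likes; the SIMD lives in the component underneath.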
"I won't live long enough to see GCC multi-threaded."
Well, if my grandfather said such a thing it might unfortunately be true. There's a lot more resistance against multi-core from the older generation than from the younger generation, though. If you grew up with a ZX81 as your first computer you're likely to look at it very differently than someone who got a quad-core for his 14th birthday and who started writing his first "Hello World". The roadmaps already point toward at least 8-core CPUs with 256-bit vectors in just a couple of years' time. It's silly to think it will be left unused. Anyone who leaves it unused will be stomped by the competition that does know how to use it. Even a simple text editor will have reliable speech recognition.

Maybe GCC won't be part of it but there's a revolution bound to happen with programming languages and compilers. Once they add scatter/gather support to AVX any loop with independent iterations can be automatically parallelized. And by using languages that embrace concurrency any software can make use of multiple cores. Think about SystemC. Although currently it's an abstraction on top of C++, it describes the software as a system of components that each run concurrently. It's similar to designing hardware, where you also have tons of components operating concurrently. It may sound alien and daunting right now, but in ten years from now the students leaving university will have all the skills to develop software that makes good use of dozens of cores.
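For those who haven't seen it, here's a tiny, untested sketch of what I mean: two SystemC modules, each running as its own concurrent process, talking through a signal. It's purely illustrative; a real design would of course be far bigger:

#include <systemc.h>

SC_MODULE(Producer)
{
    sc_out<int> out;
    SC_CTOR(Producer) { SC_THREAD(run); }           // 'run' becomes a concurrent process
    void run()
    {
        for (int i = 0; i < 4; ++i)
        {
            out.write(i);                           // publish a value on the signal
            wait(10, SC_NS);                        // yield to the scheduler
        }
        sc_stop();
    }
};

SC_MODULE(Consumer)
{
    sc_in<int> in;
    SC_CTOR(Consumer)
    {
        SC_METHOD(run);
        sensitive << in;                            // triggered whenever the signal changes
        dont_initialize();
    }
    void run() { std::cout << "got " << in.read() << std::endl; }
};

int sc_main(int, char*[])
{
    sc_signal<int> sig;
    Producer p("producer");
    Consumer c("consumer");
    p.out(sig);
    c.in(sig);
    sc_start();                                     // the kernel schedules all processes
    return 0;
}

Every module just describes what it does; the scheduler worries about running them concurrently. That's the mindset shift I'm talking about.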
 
a) I must admit, the idea of 4 nm-long FET channels boggles my mind.

b) Nice post above, Nick, as usual.

c) Yeah, concurrency ain't an unsolvable problem. The way forward is similar to what was done 50 years ago. Programs got too big and complex to write in assembly, so people gave up some of that power and started writing in high-level languages. Sure, the languages provide a constrained model of the machine, but that is a small price to pay in the grand scheme of things. Similar stuff will be done for parallel processing. The laissez-faire of pthreads will be given up in favor of a constrained architecture, which is a small cost in the overall scheme of things (see the sketch after this list for the kind of thing I mean).

d) Parallel programming (certain models of it at least) isn't hard. In particular, massively parallel programming is much easier than the intermediate (2-64 threads) range. I think the reason is that when there are that many threads, you need a mathematical model of parallelism in your software. How many of the CUDA programmers are computer scientists or coding geeks? Many of them are scientists, who are physicists, geologists, astronomers first, and programmers second. How many of them had any experience of parallel computing before CUDA came along?
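To show what I mean by a constrained model in (c): instead of hand-rolling pthreads, the programmer just states that a loop has independent iterations and hands it to a small fork-join helper. A hypothetical, untested C++ sketch (parallel_for and its interface are my own invention, not any particular library):

#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Minimal fork-join helper: split [0, n) across the hardware threads and run
// body(i) for every index. The caller promises the iterations are independent.
void parallel_for(std::size_t n, const std::function<void(std::size_t)>& body)
{
    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w)
    {
        pool.emplace_back([&body, n, w, workers]
        {
            for (std::size_t i = w; i < n; i += workers)   // strided partition
                body(i);
        });
    }
    for (std::thread& t : pool)
        t.join();                                          // implicit barrier at the end
}

int main()
{
    std::vector<float> data(1 << 20, 1.0f);
    parallel_for(data.size(), [&](std::size_t i) { data[i] *= 2.0f; });
}

You give up the freedom to schedule threads by hand, and in exchange the same code scales from 2 to 64 cores without changes. That's the trade-off assembly programmers made for compilers, all over again.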
 
Ah, consoles. They'll always be designed for a specific market, and if dedicated hardware rendering is best for the casual gaming consoles then that's what they'll use. But once PC games really start to benefit from architectures like Larrabee and inventive new games appear, the console market will follow.
Why? Look at most of the games that get released on the PC today: they are console ports. Why would this trend revert?
Or at least part of it. Nobody needs Mario to be raytraced. Or do we?
Well, at least we know that *today* we don't need it to use complex pixel-shaders to sell well.
In ten years that might still cost too much. But what about twenty years? Twenty years ago it was inconceivable to render Mario in 3D. So in twenty years from now will we still have a DX7-class chip, measuring a tenth of a square millimeter by then? Highly unlikely.
I haven't got the slightest idea of what will happen in the computer market in twenty years nor is this discussion about it.
There's a general trend toward using generic components to cut costs. The volume you need to sell to justify designing a dedicated chip keeps growing. And since you already have a CPU anyway, you might as well just take a slightly more powerful one from a manufacturer that produces them in high enough volume to sell them cheap.
That's exactly why I mentioned the Wii. It uses standard off-the-shelf components which have single-digit costs and consume very little power. And those components are also very simple from a design POV, more than an order of magnitude simpler than any of the current x86 designs if we want to talk about fixed costs.
In many ways using a CPU is ridiculously cheap compared to designing hardware from scratch. Just ask the OGP guys. Oh, wait, they haven't woken up yet to the fact that this looks a lot better than this. But I guess that's just me.
Comparing to a volunteer-based effort isn't very fair, and hardware development costs aren't anywhere near as high as you place them. Low-power 3D IP, both fixed-function and programmable, is available from multiple vendors, has drastically higher performance/W than a software-based approach, and is already shipping in millions of handhelds. If it were as costly as you depict, this wouldn't have happened.
Either way, the point is, why design a chip that measures a tenth of a square millimeter when you can have a cheap software renderer on the CPU?
Because the dedicated chip actually has higher performance/$ and performance/W than a pure software solution.
Intel graphics hardware isn't exactly top of the crop, is it? Their processors, on the other hand, surely are...
Intel is already projecting CPUs to be manufactured at 4 nm by 2022.
They were also projecting a 10 GHz P4 release, for that matter. As I said above, I take long-term predictions in our market with a grain of salt.
That's massive computing power even for a chip the size of a baby's fingernail.
Yes and as you might have guessed the process improvements apply to CPUs just as they apply to more or less dedicated hardware.
When I bought my first dual-core CPU, it cost me more than 400 € and it produced a lot of heat. Today I can buy a faster one for less than 40 € that stays cool and renders Half-Life 2 at NTSC resolution real nicely.
And in the same time frame graphics cards made much more progress, both in performance and in features, as far as graphics goes.
So try imagining what a small 4 nm chip can do, and it easily becomes apparent that when you're designing a casual gaming console there's just no need to add a custom GPU.
Of course there is: at the same level of performance, dedicated/specialized hardware will always be lower power (and lower cost) than general-purpose hardware.
Soon any console will have to be capable of recognizing user gestures and movement by processing data from a pair of cameras.
That's quite a bold prediction.
Is the rigid architecture of today's GPUs the best way to go forward? I really doubt it.
I wouldn't call current GPUs rigid, and besides, they are getting more flexible by the day. In any case, offering more or less dedicated hardware for certain tasks makes perfect sense, and that's exactly why there's a market for it.
And this is only the beginning of the inventive things consoles will do with extra computational power.
I doubt it considering that the most innovative thing that has been done in the console market is stuffing accelerometers inside the controllers of a machine which has significantly _less_ computational power than its competitors.
There are a lot of things still left to explore. The only thing that is certain is that the consoles of the future will have a lot more to do than just graphics, and if you want to avoid bottlenecks and cut costs, a highly generic CPU makes sense.
This I honestly don't know; there are certainly a lot of things to explore, from a technical point of view at least. From a commercial one I have no idea, and since it's the market that drives what gets done, I cannot conclude that we'll really see something truly 'innovative' (whatever that means).
You're confusing application development with software development.

Back when Intel called its SIMD instruction set Internet Streaming SIMD Extension (iSSE) instead of just SSE, people laughed about how it could possibly accelerate the internet. Nowadays we have websites with sound, video, interactive media, games, and full-blown 3D. So while application software is increasingly being developed in an inefficient high-level language, it makes use of highly optimized plugins, libraries, frameworks and tools that do make use of SIMD and multi-core and whatnot.
That looks like a very nice tool but nearly 100% of the 'multimedia' stuff you find on the internet is based on Flash which is:

- single-threaded
- doesn't use any kind of SIMD acceleration (or at least didn't the last time I disassembled their Linux plug-in)
- sucks big time from a performance POV anyway, and could use a lot of straightforward optimization before going into SIMD/threading

Is it going away? I'd really hope so but I fear not.
Why does Intel invest in AVX, which will start with 256-bit wide vectors but has the potential of scaling to 1024-bit vectors? Because it matters.
Yes, but only to software that actually _needs_ the performance in the first place and that's exactly what I have written in my post. The rest of the software industry will move along as it already did with MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, etc...
Well, if my grandfather said such a thing it might unfortunately be true. There's a lot more resistance against multi-core from the older generation than from the younger generation, though. If you grew up with a ZX81 as your first computer you're likely to look at it very differently than someone who got a quad-core for his 14th birthday and who started writing his first "Hello World". The roadmaps already point toward at least 8-core CPUs with 256-bit vectors in just a couple of years' time. It's silly to think it will be left unused. Anyone who leaves it unused will be stomped by the competition that does know how to use it.
While development trends may change you are ignoring the fact that in many sectors the market leaders are those who wrote the most craptastic possible software. Flash is a monument to this unfortunate state and I don't see any serious competitors stealing thunder from it yet.
Even a simple text editor will have reliable speech recognition.
I wouldn't count on it, and besides, from a performance POV it can already be done on today's processors.
Maybe GCC won't be part of it but there's a revolution bound to happen with programming languages and compilers. Once they add scatter/gather support to AVX any loop with independent iterations can be automatically parallelized.
It depends very much on the code and the language you use. With C/C++, unless you very carefully placed your const and restrict qualifiers, it might just not work, because the compiler is completely unable to figure out whether all those accesses alias or not. Depending on the language, there's more to automatic loop vectorization than having scatter/gather instructions.
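To illustrate the aliasing point with a trivial, hypothetical example: in the first function the compiler has to assume dst and src may overlap, so it can't safely vectorize the loop; the __restrict-qualified version (the extension GCC and most other compilers accept in C++) promises they don't:

// May alias: a store to dst[i] could change src[i+1], so the compiler has to
// stay scalar or emit runtime overlap checks before vectorizing.
void add_may_alias(float* dst, const float* src, int n)
{
    for (int i = 0; i < n; ++i)
        dst[i] += src[i];
}

// No alias: __restrict guarantees dst and src never overlap, so every
// iteration is provably independent and the loop can be vectorized freely.
void add_no_alias(float* __restrict dst, const float* __restrict src, int n)
{
    for (int i = 0; i < n; ++i)
        dst[i] += src[i];
}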
And by using languages that embrace concurrency any software can make use of multiple cores.
That's certainly true, but we're not there yet, and we won't get there very soon IMHO. And even when we get there (we've got the appropriate language, compiler and libraries), you still have to convince people to actually go there and use them.
Think about SystemC. Although currently it's an abstraction on top of C++, it describes the software as a system of components that each run concurrently.
While your point is valid, I'd have pointed to another language instead; having seen SystemC in action for hardware simulation, I can tell you it's not something you want to use for performance-critical code unless you are very patient.
It's similar to designing hardware, where you also have tons of components operating concurrently. It may sound alien and daunting right now, but in ten years from now the students leaving university will have all the skills to develop software that makes good use of dozens of cores.
I certainly hope so but highly doubt it.
 
Wait till Web3D becomes the next Web 2.0. Adobe will be throwing every optimization you can find at Flash so that it becomes as good a 3D platform for the web as possible.

And speaking of Flash, ever heard of this?

If you are saying that we don't need performance, then it's 2009's equivalent of "640K ought to be enough for everybody".
 
Look at most of the games that get released on the PC today: they are console ports. Why would this trend revert?
Because we won't see a new console before 2012.

Back in 2006, PC game development was a total mess. There were a lot of people with Shader Model 1.x cards, Shader Model 2.0 cards, Shader Model 3.0 cards, and Shader Model 4.0 was ready for launch. To top it off, each card had its own set of capabilities, and there were a gazillion driver versions. So you had the choice between creating a game that targeted older specifications but looked antique on the day of release, writing a nice-looking game that nobody could run, or spending two or three times the development budget to make it run okayish on both older and newer hardware. Suffice it to say that graphics hardware underwent revolutionary but turbulent changes, which was hell for software developers.

So when consoles like the PS3 appeared it was heaven. They had one fixed specification, and compared to the average PC of that time they were mind-bogglingly fast. So everyone and his brother embraced these platforms. Since it takes about three years to develop a game, we're still seeing a lot of games today appear on consoles first and then get ported to PC.

However, time is standing still for the PS3, while the PC moved on. Nowadays the PC landscape is a lot less turbulent and the hardware keeps getting much more powerful. There won't be a new console for another three years or so, so by then PCs will have reclaimed the position of the dominant innovative gaming platform.

Will history repeat itself when the new consoles arrive? Likely not. Microsoft has understood the importance of having hardware conform to a minimum profile to provide a stable platform for developing games.
Well, at least we know that *today* we don't need it to use complex pixel-shaders to sell well.
That's true for casual gaming consoles like the Wii, but absolutely not for PS3 or XBox 360. And either way today's situation is actually totally irrelevant. Back when Mario was still rendered in 2D it was also quite true that you didn't need 3D to sell well. There was simply no competition offering affordable 3D rendering. But times have changed. So it's silly to think that things won't evolve further for the casual gaming consoles.
That's exactly why I mentioned the Wii. It uses standard off-the-shelf components which have single-digit costs and consume very little power. And those components are also very simple from a design POV, more than an order of magnitude simpler than any of the current x86 designs if we want to talk about fixed costs.
Oh, absolutely! Today. You can keep talking about that all you like and be 100% right. But that's not what this thread is about.

The cost of designing custom hardware is going up, and that's reflected in the off-the-shelf components as well. So for instance instead of having a complex chip for sound processing, it already makes a lot of sense to do the computations on the CPU and have a tiny DAC for sound output. In theory that's less efficient, but because you're not constantly using all the advanced features you make better use of the silicon and you have no additional cost. The same thing is likely to happen with graphics and other workloads. Generic chips are cheaper than specialized ones. And an important bonus is that you can use the available processing power any way you like. If you have, say, six specialized chips, you're kinda forced to use the architecture the way it was meant to be used - at design time.
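As a trivial, made-up illustration of what "sound on the CPU" amounts to: mixing a handful of source channels into one output buffer is just a tight loop that a general-purpose core handles with ease, with only a dumb DAC needed at the very end:

#include <algorithm>
#include <cstddef>
#include <vector>

// Mix several mono source buffers into one output buffer with per-source gain,
// then clamp to [-1, 1]. This is the kind of work that used to justify a
// dedicated sound chip and now runs comfortably on the CPU.
void mix(const std::vector<std::vector<float>>& sources,
         const std::vector<float>& gains,
         std::vector<float>& out)
{
    std::fill(out.begin(), out.end(), 0.0f);
    for (std::size_t s = 0; s < sources.size(); ++s)
        for (std::size_t i = 0; i < out.size() && i < sources[s].size(); ++i)
            out[i] += gains[s] * sources[s][i];
    for (float& sample : out)
        sample = std::max(-1.0f, std::min(1.0f, sample));  // simple hard clip
}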

Think back to non-unified vertex and pixel processing. If you didn't use enough vertices you were not getting full utilization and the geometry looked angular. If you used too many, it became a serious bottleneck and the framerate would drop. So apart from artwork and gameplay every game was identical; everyone strove for the same balance between vertex and pixel processing workload. It severely limited the developer's creativity by offering only one right way to use the hardware. The same thing is currently still true about graphics and other workloads. You're either CPU limited or GPU limited. Always. If you have an awesome idea that will require a lot of CPU time but not a lot of GPU time, your game will be slow and leave silicon unused. So it would be useful to somehow unify the two and use the available performance where it's needed the most.

Like I said before though, there's a tipping point for everything. Today it's still more efficient to have a specialized chip for graphics and just live with the bottlenecks and limitations. But because of technological progress, workload variation and design cost the tipping point is moving. Sound processing has all but completely moved to the CPU these days...

You may argue that the sound processing workload has remained constant, so as dedicated chips got smaller and the CPU got more powerful it made sense to make the shift. But actually sound processing underwent major improvements in sample rate, resolution, output channels, filtering, effects, source channels, etc., and still the shift happened. So there is no reason it won't happen for graphics. We're already seeing systems with more GFLOPS available on the CPU than on the IGP. What people consider adequate graphics is evolving slower than CPU performance.
Comparing to a volunteer-based effort isn't very fair, and hardware development costs aren't anywhere near as high as you place them.
Why would it not be fair? Their goal is to offer graphics free from bugs in closed drivers. If it's such a great idea to achieve that by designing open hardware, then why, after five years, are they still nowhere near offering a better solution than software rendering, let alone closed hardware rendering? Heck, swShader was a volunteer-based effort that five years ago achieved way more than what they have today. And while they're making progress, it's a race they're unlikely to ever win. Why? Because the cost of developing a custom chip from scratch is prohibitively high. Not just for them but for everyone. That's why the industry went from custom design to off-the-shelf in the first place. And the next step is moving toward generic processors and implementing what you need in software.

So it isn't very surprising that lately there have been some posts on the OGP mailing list suggesting to use a CPU or DSP, instead of an FPGA or ASIC...
Low-power 3D IP, both fixed-function and programmable, is available from multiple vendors, has drastically higher performance/W than a software-based approach, and is already shipping in millions of handhelds. If it were as costly as you depict, this wouldn't have happened.
Absolutely. But once again this is more about the future than the present. Handheld architectures have been evolving towards less chips that are more generic, and will continue to do so, to cut costs and extend capabilities. Think about the iPhone. It is capable of running all the applications in the vast App Store, thanks to a relatively powerful CPU. And its fixed-function GPU just got replaced by a programmable one.
Intel graphics hardware isn't exactly top of the crop, is it? Their processors, on the other hand, surely are...
So? Clearly it doesn't matter that much for Intel's sales. Graphics is becoming just another little task that a system not primarily meant for gaming only has to run adequately, like sound, in order to sell. So once CPUs get powerful enough to take over that task, and we're clearly getting there, it's a waste to have a second chip dedicated to graphics.
They were also projecting a 10 GHz P4 release, for that matter. As I said above, I take long-term predictions in our market with a grain of salt.
But CPUs still got faster. Exactly how we get there is far less relevant. In fact, the number of cores has become a new parameter that allows the design to be optimized better. We could easily have had 10 GHz Pentium 4s by now. But a 3 GHz Core i7 is much more powerful.
Yes and as you might have guessed the process improvements apply to CPUs just as they apply to more or less dedicated hardware.
Which hasn't kept sound processing a market dominated by dedicated hardware...

While in theory it helps all hardware equally, there are other forces at play that dictate a move towards integration, unification, generic processing and software solutions.
And in the same time frame graphics cards made much more progress, both in performance and in features, as far as graphics goes.
The high-end market achieved this "progress" by throwing more silicon at it and burning more Watts. That's not going to happen for Wii or other low-end systems. Like I said, we're starting to see ever more systems where the CPU delivers more GFLOPS than the IGP. There's only a handful of reasons left why the IGP wins at graphics.
I wouldn't call current GPUs rigid, and besides, they are getting more flexible by the day. In any case, offering more or less dedicated hardware for certain tasks makes perfect sense, and that's exactly why there's a market for it.
And as they become more flexible they get more CPU-like. So it gradually makes less sense to use dedicated hardware for it. And a CPU with gather support (which is a generic operation) will be even better equipped to run graphics. The convergence is undeniable. So only two things can happen: either the convergence stops at some point, or they get close enough to no longer require a GPU.

As long as graphics evolves, the workload gets more generic. The TEX:ALU ratio hasn't stopped dropping while other forms of memory access increase, so at some point it no longer makes sense to have dedicated texture sampling. This may happen ten or twenty years from now, I don't know, but the benefits of dedicated hardware are only slowing it down, not stopping it.
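To show what gather buys software graphics: fetching eight texels for eight pixels is nothing more than eight loads from computed addresses. AVX as it stands today doesn't offer that yet, so take this as a hypothetical sketch of what the inner fetch could look like once a gather instruction exists; I'm borrowing the _mm256_i32gather_ps form purely as a stand-in:

#include <immintrin.h>

// Hypothetical: fetch 8 texels at once, given 8 precomputed texel indices.
// With a gather instruction this is a single operation per batch of pixels.
__m256 fetch_texels(const float* texture, __m256i texel_indices)
{
    return _mm256_i32gather_ps(texture, texel_indices, 4);  // scale = 4 bytes per float
}

// The scalar equivalent a software renderer has to run today, one texel at a time.
void fetch_texels_scalar(const float* texture, const int* texel_indices, float* out)
{
    for (int lane = 0; lane < 8; ++lane)
        out[lane] = texture[texel_indices[lane]];
}

Filtering, wrapping and format conversion are just more ALU work on top, which is exactly what these wide vector units are good at.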
I doubt it considering that the most innovative thing that has been done in the console market is stuffing accelerometers inside the controllers of a machine which has significantly _less_ computational power than its competitors.
Again you're not looking at a long enough timeframe. Things have changed dramatically since the time when Mario was a sprite. I obviously don't know exactly what innovation will appear in the next few decades, but I do know that it will be dramatically different from today and generic processing helps spur the creativity of software developers.
That looks like a very nice tool but nearly 100% of the 'multimedia' stuff you find on the internet is based on Flash which is:

- single-threaded
- doesn't use any kind of SIMD acceleration (or at least didn't the last time I disassembled their Linux plug-in)
- sucks big time from a performance POV anyway, and could use a lot of straightforward optimization before going into SIMD/threading

Is it going away? I'd really hope so but I fear not.
What were you looking at specifically? ActionScript itself is compiled to intermediate code which then gets interpreted or JIT-compiled. So that's obviously not using SIMD or multi-core. However, if you include for instance a video it can use a highly optimized video codec.
Yes, but only to software that actually _needs_ the performance in the first place and that's exactly what I have written in my post. The rest of the software industry will move along as it already did with MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, etc...
Again, you have to look beyond the application-level software. More often than not it uses libraries and frameworks that do use SIMD to some extent. So while a developer may think he doesn't need to bother with SIMD, it might be a vital part of the components he uses.

And even if you look specifically at applications that truly don't use SIMD at any level, that fraction is actually not relevant for the system architecture. The fact that some software doesn't make use of it doesn't make it unnecessary. Intel invests in AVX for the software that does make use of it, at various levels.
While development trends may change you are ignoring the fact that in many sectors the market leaders are those who wrote the most craptastic possible software. Flash is a monument to this unfortunate state and I don't see any serious competitors stealing thunder from it yet.
You've got to be kidding. Flash is getting some serious competition from Microsoft Silverlight, and Google is working hard to make absolutely anything and everything run in a browser. Sure, Flash still has the biggest market share (which it earned through innovation), but if it doesn't keep up with technological advancements it will lose ground very quickly.
I wouldn't count on it and besides from a performance POV it can already be done on today's processors.
Please. Speech recognition on today's consumer systems is terrible. And it's not because of a lack of algorithms but primarily because of a lack of computing power. It's no coincidence that the word error rate of HMM-based speech recognition is improving at the rate of Moore's law. The best software available today is barely running in real-time on heavy workstations, and still doesn't perform well with speaker-independent conversational voice. But progress is steady so it's only a matter of time before it becomes viable on consumer systems. If putting accelerometers into controllers is innovative because it increases the interaction with the machine, then voice recognition will unleash a revolution.
It depends very much on the code and the language you use, with C/C++ unless you very carefully placed your const and restrict qualifiers it might just not work because the compiler is completely unable to figure out if all those accesses alias or not. Depending on the language there's more to automatic loop vectorization than having scatter/gather instructions.
That's only a minor bump in the road. First and foremost we need scatter/gather instructions before software developers can even use them at all!

Anyway, C++, just like any other language, isn't static. We've seen many revisions and there are more to come. There have been proposals to adopt Fortran aliasing rules and to use an 'unrestrict' keyword for explicitly allowing pointer aliasing where practical. Compiler switches can ensure backward compatibility, and warnings can guide developers to write things the way they intended. C++ and pointers are for developers who understand the complexities anyway. And note that a lot of CRT functions have undefined behavior for overlapping memory, so those wouldn't be broken anyway.

So it's not like we don't have any solutions to the problem. A somewhat similar thing happened when Hyper-Threading appeared. Suddenly a fraction of software deadlocked, and they blamed Intel for it. But nowadays everyone's aware of the pitfalls and has learned to write well-behaved code. So programmers can adapt to hardware changes, and scatter/gather is no different.
That's certainly true but we're not there yet and we won't get there very soon IMHO. And even when we get there (we've got the appropriate language, and compiler and libraries) you still have to convince people to go there and use them.
I can settle for "not very soon". It's absolutely not something that's going to happen overnight. There's a gigantic amount of legacy code to rewrite. But just like for instance the move to object-oriented programming has been slow, that never meant it wouldn't succeed.
While your point is valid I'd have pointed to another language instead, having seen SystemC in action for hardware simulation I can tell you that's not something you want to use for performance critical codes unless you are very patient.
That's exactly why I mentioned it's currently an abstraction on top of C++. It merely gives us a peek at the ideas that could be useful to create real languages that natively support a high degree of concurrency.
 