Originally Posted by Nick
Compilation of amp-restricted functions can fail for many reasons:
- There's no support for char or short types, and some bool limitations apply as well.
- There's no support for pointers to pointers.
- There's no support for pointers in compound types.
- There's no support for casting between integers and pointers.
- There's no support for bitfields.
- There's no support for variable argument functions.
- There's no support for virtual functions, function pointers, or recursion.
- There's no support for exceptions.
- There's no support for goto.
The list goes on, and there are also device-specific limitations on data size and such.
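For anyone who hasn't bumped into these yet, here's a minimal sketch (assuming Visual Studio's amp.h) of how quickly they surface; uncommenting either of the first two lines in the kernel makes compilation fail:

    #include <amp.h>
    using namespace concurrency;

    void scale(array_view<float, 1> data) {
        parallel_for_each(data.extent, [=](index<1> idx) restrict(amp) {
            // char c = 0;       // error: char is not supported in amp-restricted code
            // float** pp;       // error: pointer to pointer is not supported
            data[idx] *= 2.0f;   // fine: float arithmetic through an array_view
        });
    }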
We are not talking about the same thing here. I said compilation wouldn't fail if the loop wasn't vectorizable.
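To make that concrete, here's a sketch of a loop with a loop-carried dependence. A straightforward vectorizer gives up on it, but the build still succeeds; the compiler just emits scalar code:

    // Running sum: out[i] depends on the value accumulated at i-1, so a
    // straightforward auto-vectorizer bails out. This still compiles fine;
    // the compiler simply falls back to ordinary scalar code.
    void running_sum(const float* in, float* out, int n) {
        float acc = 0.0f;
        for (int i = 0; i < n; ++i) {
            acc += in[i];    // loop-carried dependence through acc
            out[i] = acc;
        }
    }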
Originally Posted by Nick
Just because it's complex doesn't mean it's impossible. Writing compilers has always been hard. But you should have a look at the amazing achievements of the LLVM developers (and take a peek at the Polly project). I'd rather let those experts deal with the device limitations as much as possible than have it reflected in the language.
They know there are a lot of limits to what they can achieve, and what they actually achieve will ultimately fall short of even that. Their amazing achievements notwithstanding.
By refusing to change the language/programming model to match the evolution of hardware, we are back to automagical parallelization of generic C. There is no reason to believe that success in this regard will be any greater than what has been achieved so far, no matter who works on it.
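For reference, the sweet spot of polyhedral tools like Polly is the "static control part": a loop nest whose bounds and subscripts are all affine expressions of the loop counters, as in the sketch below. Pointer chasing, data-dependent control flow, or possible aliasing push code outside that model:

    // A textbook static control part (SCoP): every bound and subscript is
    // affine in the loop counters, so a polyhedral framework can analyze,
    // tile, and parallelize the nest (assuming A, B, and C don't alias).
    void matmul(int n, const float* A, const float* B, float* C) {
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                for (int k = 0; k < n; ++k)
                    C[i * n + j] += A[i * n + k] * B[k * n + j];
    }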
Originally Posted by Nick
Minor revisions also cause fragmentation. We have three versions of OpenCL, plus a bunch of extensions. There are six versions of CUDA, and no doubt Kepler will bring a seventh. And there's already a versioning system in place for C++ AMP, with the mention that "it is likely that C++ AMP will evolve over time, and that the set of features that are allowed inside amp-restricted functions will grow". And HSA's unification of the x86-64 address space will also lift numerous limitations.
This fragmentation really isn't helping the adoption of general purpose throughput computing. And an ecosystem in which code can be exchanged (commercial or otherwise) is close to non-existent. I can only see this change for the better when the language has minimal restrictions (preferably none at all) and abstracts the device capabilities. Vendor lock-in isn't going to work anyway and it's all evolving back to generic languages.
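As an illustration of what that fragmentation costs in practice, here's a sketch (assuming the Khronos <CL/cl.h> headers) of guarding a single OpenCL 1.1 feature at both compile time and run time; multiply this by every feature and every API revision:

    #include <CL/cl.h>
    #include <cstdio>

    // CL_DEVICE_VERSION reports a string like "OpenCL 1.1 <vendor info>".
    bool supports_opencl_1_1(cl_device_id dev) {
        char version[64] = {0};
        clGetDeviceInfo(dev, CL_DEVICE_VERSION, sizeof(version), version, nullptr);
        int major = 0, minor = 0;
        std::sscanf(version, "OpenCL %d.%d", &major, &minor);
        return major > 1 || (major == 1 && minor >= 1);
    }

    void make_view(cl_mem buffer, cl_device_id dev) {
    #ifdef CL_VERSION_1_1                    // headers new enough?
        if (supports_opencl_1_1(dev)) {      // device/driver new enough?
            cl_buffer_region region = {0, 256};
            cl_int err;
            // clCreateSubBuffer only exists from OpenCL 1.1 onward
            // (error handling and clReleaseMemObject omitted for brevity).
            clCreateSubBuffer(buffer, CL_MEM_READ_ONLY,
                              CL_BUFFER_CREATE_TYPE_REGION, &region, &err);
            return;
        }
    #endif
        // Everything older needs a hand-rolled fallback path.
    }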
We will have to agree to disagree again.
JVM and CLR have seen far more revisions than vendor-neutral GPU compute has seen so far. I simply don't see how these assertions hold up against established facts.
Originally Posted by Nick
Meh. It's C++; a lot of things are done explicitly. And these features are not even relevant to vector processing. Also note that general purpose throughput computing doesn't have to be limited to C/C++. Auto-vectorizing compilers can have many front-ends. Yes, that's fragmentation too, but at least it's driven by language features and not by evolving device limitations/capabilities.
A language that refuses to add features needed by developers who do not care about vector processing is not a very general-purpose language. Personal preferences aside, vector processing isn't everything.