It doesn't screw up the GPU, it enters a protection state - in this case it's not a particularly end-user-friendly protection (which is why we improved it), but it is a protection state.

Not always - *I* shouldn't be able to write code that screws up my GPU either... that's not acceptable.
http://www.cpu-world.com/Glossary/T/Thermal_Design_Power_(TDP).html

There's a difference between things like "turbo mode", which detects thermal conditions and modifies clock rates and such on the fly, and an application being able to bring down the machine - which should never be possible. The issue here is that if there is *any* way, *ever*, to make it fail, then it's broken, no matter how rarely you expect to see that case. I don't recall any (working) CPUs that would fail in hardware under some workload, but please correct me if I'm wrong.
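To make the "turbo mode" side of that distinction concrete, here's a minimal sketch (in C) of the kind of governor loop being described: sample the die temperature, clock up when there's headroom, clock down when nearing the limit. Every name, threshold, and the stubbed sensor here are hypothetical, not any vendor's actual interface.

    /* Hypothetical turbo-style DVFS governor sketch. Detects the
     * thermal state and adjusts the clock on the fly; all names and
     * thresholds are illustrative only. */
    #include <stdio.h>

    #define BASE_MHZ         1500
    #define MAX_BOOST_MHZ    2000
    #define STEP_MHZ           50
    #define BOOST_BELOW_C      70   /* headroom available: clock up  */
    #define THROTTLE_ABOVE_C   85   /* nearing the limit: clock down */

    /* Stub for illustration; real firmware reads a die sensor. */
    static int read_die_temp_c(void) { return 92; }

    static int current_mhz = BASE_MHZ;

    /* Called periodically by firmware or the driver. */
    static void governor_tick(void)
    {
        int t = read_die_temp_c();
        if (t < BOOST_BELOW_C && current_mhz < MAX_BOOST_MHZ)
            current_mhz += STEP_MHZ;   /* spend thermal headroom */
        else if (t > THROTTLE_ABOVE_C && current_mhz > BASE_MHZ)
            current_mhz -= STEP_MHZ;   /* give headroom back */
        printf("temp=%dC clock=%dMHz\n", t, current_mhz);
    }

    int main(void)
    {
        for (int i = 0; i < 5; i++)
            governor_tick();   /* hot stub: steps the clock down each tick */
        return 0;
    }

Note that a loop like this is best-effort optimization, not a safety guarantee - which is exactly the point of the complaint: the safety property has to hold even when a governor like this misjudges the workload.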
This is the normal state of things, in that you don't design to the maximum TDP, because that would be overdesigning for the corner cases. Generally you have catch-alls for the corner cases instead.
And things have improved.

No one is complaining about clever power savings and clock-rate modification to keep the chip running at peak efficiency. The problem is that you can't have the chip optimized for the 99% case and failing catastrophically in the 1%. If you can 100% reliably detect that 1% and down-clock (or whatever else you need to do to make it not die), then that's completely fine, but I reiterate: software should never be able to bring down the hardware.
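Here's the fail-safe half of that argument as a sketch: a last-resort trip, sitting above any governor or application, that forces a protection state (clamp and halt) instead of letting the part cook. Again, the threshold, the stubbed sensor, and the function names are all hypothetical.

    /* Hypothetical last-resort thermal trip. Whatever the workload
     * does, crossing the critical threshold lands in a recoverable
     * protection state, never in hardware damage. Illustrative only. */
    #include <stdio.h>
    #include <stdbool.h>

    #define TRIP_TEMP_C    105   /* absolute never-exceed point */
    #define SAFE_CLOCK_MHZ 300   /* minimal "limp home" clock   */

    static bool in_protection_state = false;

    /* Stub for illustration; hardware would read a die sensor. */
    static int read_die_temp_c(void) { return 107; }

    /* Runs at higher priority than any governor or application, so
     * no software workload can override it. */
    static void thermal_trip_check(void)
    {
        if (!in_protection_state && read_die_temp_c() >= TRIP_TEMP_C) {
            in_protection_state = true;  /* recoverable stop, not damage */
            printf("tripped: clamped to %dMHz, work halted\n",
                   SAFE_CLOCK_MHZ);
        }
    }

    int main(void)
    {
        thermal_trip_check();   /* stub reads 107C >= 105C: trips */
        return 0;
    }

The design choice the thread keeps circling is that this trip belongs in hardware (or in firmware that software can't bypass), so that "optimized for the 99% case" and "can't die in the 1%" are both satisfied.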