I'm far from impressed. Have you really read them? Take the examples where a CPU beats FPGAs on some subcategory of BLAS performance, or the one where a massive 16-FPGA setup just edged out a single GT200 GPU on FFTs (and that 16-FPGA setup drew more power, by the way, so performance/power wasn't better either; it only was for a single small FPGA that delivered an order of magnitude lower performance to start with). And FFTs are a task GPUs are not really built for. If what you are doing all day long is FFTs, any ASIC built for the job will stomp them both anyway.
And at least one of the articles basically mentions most of the reasons I have given already. For instance, it points out that FPGAs are usually far too small to hold all parts of a program (if it has any complexity). That means you either have to reprogram them for a function call, or you have to invest far more total die space in multiple FPGAs or in special multi-context FPGAs that allow faster partial reprogramming. Either way, all of this severely adds to the overhead of general-purpose computing in practice. You really don't want to wait even a few milliseconds for the FPGA to be reprogrammed so it can execute a function call during the runtime of some application or game. That's half an eternity for a microprocessor.
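To put that "half an eternity" into rough numbers, here is a minimal back-of-the-envelope sketch in C; the clock rate, IPC and reconfiguration time are illustrative assumptions of mine, not measurements from any of the papers:

/* How much work a CPU core gets done while an FPGA is being reprogrammed.
   All numbers are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    double cpu_clock_hz    = 3.0e9;  /* assumed 3 GHz core */
    double ipc             = 2.0;    /* assumed sustained instructions per cycle */
    double reconfig_time_s = 5.0e-3; /* assumed 5 ms for a full reconfiguration */

    double instructions = cpu_clock_hz * ipc * reconfig_time_s;
    printf("Instructions one core could retire during one reconfiguration: %.0f million\n",
           instructions / 1e6);
    return 0;
}

Even with these fairly conservative assumptions that is tens of millions of instructions thrown away per function-call-sized reconfiguration.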
Being research papers, they have to keep a neutral tone and be scientific about it. So of course you will find some weighing back and forth and a lot of self-criticism, sometimes seemingly contrived, just to "be scientific", and not a lot of gushing and swooning over the possible implications.
Also, the FPGAs used in that research, especially the older work, were clearly not designed with performance computing in mind, but rather for prototyping, glue logic, industrial use and so on. That makes the results all the more impressive.
So I still haven't changed my opinion: an FPGA is only helpful if you run a well-defined algorithm for an extended period of time, and you still need the flexibility to run different ones if you want (with high latency when switching between them) or shy away from the steep startup costs of an ASIC (which beats the FPGA on both power consumption and performance and is also significantly cheaper at larger volumes). So while it may make sense as an accelerator chip in some HPC nodes for some niche applications, it's far from a general solution. Even though to some extent the same is true for GPU accelerators, they lend themselves to this a bit more easily because they fit common programming paradigms better and are actually fairly straightforward vector architectures with (quite) a few performance pitfalls. Developers are far more used to this, and there are far fewer problems with compilers (compiling general C code down to an FPGA configuration is WAY more complicated and still largely unsolved).
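To illustrate what "fits common programming paradigms" means in practice, here is a minimal sketch; the loop is a generic SAXPY of my own, not taken from any of the papers. A data-parallel loop like this maps almost one-to-one onto GPU threads (or onto CPU vector units via auto-vectorization) with today's compilers, whereas turning the very same loop into an FPGA configuration needs a high-level-synthesis flow that has to work out pipelining, unrolling and memory interfaces on its own:

#include <stddef.h>

/* Generic SAXPY: each iteration is independent, so on a GPU this body
   becomes a kernel with one thread per element, and a C compiler can
   auto-vectorize it for SIMD units. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}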
FPGAs really are in many ways the natural continuation of the idea of microprogramming. And again you point to current tools and compilers. It goes without saying that a major part of building an FPGA computer, if not the biggest part, would lie in developing the tools and languages. And compilers? I was talking about interpreted languages, extreme late binding and dynamic type checking. That's one of the major draws: you could use these previously somewhat slow and underdeveloped, but very powerful, "paradigms" to vastly speed up the development process.
ASICs and specialized hardware in general will always have a place because of their speed and efficiency. An FPGA architecture would probably also include a certain amount of specialized circuitry on a mixed die, for example for GPU capabilities, as is already being done today with FPGAs in realms other than graphics.
That it was too expensive and wasteful to use general-purpose computing hardware for industrial process control or in consumer goods when ASICs were faster, cheaper and more reliable. Or, with regard to general-purpose computing, that microprocessors would never be fast enough or good enough to really be considered for it.
Without pushing it too far, one could say that today's ISAs are really those of hyper-beefed-up microcontrollers. That's what they started out as.
The whole field kind of got rebooted around 1980 when micros really caught on, and mostly not for the better. A lot of good ideas were forgotten, diluted or distorted beyond recognition. Things were balkanized and run mostly by talented amateurs and hacks who lacked the deeper understanding and wisdom of the older generation, and who weren't willing to learn and be humble about it.
And that is pretty much where we are today. In an extrapolated version of that reality. With optimized versions of things that are really not very good at all and don't scale well.
Heed Donald Knuth's words: premature optimization is the root of all evil.
I fail to see that it will materialize anytime soon, or even at all. Choosing an FPGA design as you proposed for a game console like the Wii U would have been beyond crazy. It wouldn't have worked commercially at all. They could have sold a few hundred or thousand units to some research projects, and that's it. And there would have been hardly any games for it, because it would have been an almost unworkable system.
Except they don't.
Except you don't, for a broad range of realistic general-purpose scenarios.
Tools mature, okay. But how do the costs come down faster than what the whole industry already gets from process shrinks (with ever-diminishing returns)? The overhead in transistor count and speed is basically a constant factor.
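As a rough sketch of why that constant factor matters (the gap figures below are illustrative assumptions of mine, in the ballpark of published FPGA-versus-ASIC comparisons, not data from the linked articles): a shrink helps FPGA and ASIC alike, so the relative overhead survives every node.

#include <stdio.h>

int main(void)
{
    double area_gap  = 20.0; /* assumed FPGA-vs-ASIC area overhead (illustrative) */
    double speed_gap = 3.0;  /* assumed FPGA-vs-ASIC speed deficit (illustrative) */
    double shrink    = 0.5;  /* assumed area scaling from one full node shrink */

    printf("FPGA area after one shrink, relative to today's ASIC: %.1fx\n",
           area_gap * shrink);
    printf("Speed gap, untouched by the shrink: %.1fx\n", speed_gap);
    return 0;
}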
As FPGAs become more and more popular, economies of scale will make them far, far cheaper than they are today. What's more, they are usually not made on the newest node. With just those two things in mind, it seems self-evident that FPGAs are in for some major gains in price/performance and performance per watt.
Doing the next console with an FPGA CPU would have been daunting. But if the project had been started in time and everything had run smoothly, Nintendo would have had the major advantage of an architecture they owned wholly, and of being able to do development in record time (late binding would make it possible to upend the tea table very late in the development process if things for some reason didn't pan out).
That would have been worth a thousand times the gimped tablet they have to find uses for now.
Here is a slightly old but still interesting video I missed linking to the last time around:
http://www.youtube.com/watch?v=ckFUXWKMymU