I had to stop by and see what people made of these supposed specs... they sound kind of odd.
I can get behind the plausibility of what's rumored from nvidia: low clocks and low voltage to maximize the benefits of 20nm, perhaps with an extra unit for yields beyond what seems logical (just like GK110). It makes sense that *if* a truly efficient 64-ROP part were coming on 20nm (where 25 SMM would be, for all intents and purposes, similar to 4000sp from amd, though in reality I'd think 24 would be more efficient), it would ship at the minimum-voltage (.9v) / yield (-20% clock, i.e. similar to 28nm clocks) spec, but with room to breathe (230w, with headroom up to 300w, would do just that). I can also get behind the possibility of a 256-bit bus (4GB) because of die size and aiming toward 4x the resolution of the console standards (inevitably 1440p vs 720p, but probably touted as 4k, 1440p, and 1080p products).
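For what it's worth, the "4x the console standard" framing checks out on pixel count alone (a trivial sketch, assuming 2560x1440 versus the consoles' 1280x720):

```python
# Pixel-count ratio behind the "4x console resolution" idea:
# 2560x1440 carries exactly four times the pixels of 1280x720.
def pixels(width, height):
    return width * height

ratio = pixels(2560, 1440) / pixels(1280, 720)
print(ratio)  # 4.0
```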
Not saying it's the correct spec, but it seems plausible within the parameters and past things nvidia has done.
But this... is pretty wonky. I really question how a 512-bit bus would work on 20nm. On 16nm, or rather 20nm finfet, perhaps (power savings but little in the way of added transistor density), but planar 20nm, which is the opposite? We're talking very small transistors (1.9x quoted scaling?), 30% power savings at the same voltage, and/or somewhere between 20% and 25% clock scaling at a similar voltage, iirc. Now take into consideration that we're talking .95v versus .9v, and also likely stricter tolerance toward the top (1.2v+).
Big dies seem unlikely imho.
High clockspeed? Maybe, but in theory not optimally. It only works if it fits the many factors amd has weighed in the past (a die just big enough to support a memory controller that can feed a certain number of units at a high clock, on a given process, within the tdp that product is aimed at in the market) and everything meshes out perfectly. That is what would have to be true of a 1536sp part (within 150w?), and it seems a dangerous game to play against nvidia perhaps shooting for the equivalent of 960 and 1920.
Ideally, I'd figure they would want whatever is closest to the optimal shader rate for each ROP set (somewhere around 896-960 for 16 ROPs, 1792-1920 for 32, roughly 2816 for 48, and 3840 for 64), but of course there are other factors (the power consumption of lower-voltage but denser memory, ideal performance per die size/tdp). You also have to consider that each 16-ROP set carries an equal number of CUs, so 48 ROPs would land somewhere around 2688/2880/3072, for instance, but ideally probably 2880. To reiterate, nothing is ever as simple as it seems, and I think Pitcairn is a prime example of that. But would amd go the same route for both mid and upper-end gpus that they went with Tahiti, especially after the public mauling they received when nvidia came in under their power draw and stayed (at least initially) under 225w? I understand that keeping good yields (in terms of either good units or lower clocks) on a new process is probably part of it, and that extra instruction power (or even texture processing power, given the unit structure) over the competition may justify the means... but I just wonder.
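The shader counts above fall out of a quick back-of-envelope sketch, assuming GCN's 64 shaders per CU and an equal CU count attached to each 16-ROP set (14/15/16 CUs per set are my illustrative choices, not anything confirmed):

```python
# Candidate shader counts per ROP configuration, assuming GCN-style
# 64 shaders per CU and the same number of CUs behind every 16-ROP set.
SP_PER_CU = 64

def candidate_sp_counts(rops, cus_per_set_options=(14, 15, 16)):
    """Shader counts where every 16-ROP set gets an equal CU count."""
    sets = rops // 16
    return [sets * cus * SP_PER_CU for cus in cus_per_set_options]

for rops in (16, 32, 48, 64):
    print(rops, candidate_sp_counts(rops))
# 16 -> [896, 960, 1024]; 32 -> [1792, 1920, 2048]
# 48 -> [2688, 2880, 3072]; 64 -> [3584, 3840, 4096]
```

Which is exactly why 48 ROPs points at 2688/2880/3072 rather than an arbitrary count in between.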
I think nvidia's maxwell is pretty black and white: scaling from 6 SMM to 12, etc., probably even down to 3 SMM / 8 ROPs for SOCs, which (I hate to say) would seemingly be more efficient than something like Kaveri, especially if they kept the cache unscaled to make up for the limited bandwidth available on such platforms. The performance of the 750 Ti gives away how much bandwidth help the cache provides, and I think ideally they're shooting for 6 SMM (768sp + 192sfu, or roughly similar to 960sp) at a very attainable clockspeed with low-power memory within 75w... perhaps scaled up until the point where more logic (even if redundant) makes more sense than chasing yields on less and trying to push the clock.
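The 768sp + 192sfu figure is just SMM math, assuming the GM107 (750 Ti) layout of 128 shaders and 32 SFUs per SMM, with the "960sp-equivalent" number counting SFUs as if they were ordinary shaders:

```python
# Maxwell unit math, assuming GM107's 128 shaders + 32 SFUs per SMM.
SP_PER_SMM = 128
SFU_PER_SMM = 32

def unit_counts(smm):
    """Return (shaders, SFUs, rough 'sp-equivalent') for an SMM count."""
    sp = smm * SP_PER_SMM
    sfu = smm * SFU_PER_SMM
    return sp, sfu, sp + sfu

print(unit_counts(6))   # (768, 192, 960)
print(unit_counts(3))   # the rumored SOC-class config
```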
I'm not even going to pretend I know the intricate differences and weights of die size, memory controller speed/size, cache, ram density, vdd and vram, unit counts, yield considerations, and/or how they affect end-product decisions... especially compared to engineers up close and personal with the new tech, who have for the most part done a great job of maximizing performance per die size and tdp in the past while trimming unnecessary excess, typically to our benefit in lower prices. But compared to what nvidia may seemingly do, this could potentially be a little messy... if it's true.