If the claims here in the handheld forum are correct, the SGX543MP4+ (@200MHz) in the PS Vita should be around 35mm^2 at Samsung 45nm.
No idea yet about die area and/or power consumption for Rogue. However, a 500MHz frequency is quite modest for a hypothetical high-end console design at 28nm. 500MHz for GPU blocks will be commonplace in 28nm/2012 smartphone/tablet designs. Tegra3, despite being manufactured at 40nm TSMC, should already have its GPU clocked at 500MHz in the tablet variant, and I'd be very surprised if upcoming SoCs with single or MP SGX544s aren't clocked at 500MHz or higher.
Reverse speculative math based on the ST Ericsson Novathor A9600: they're claiming >210 GFLOPs, >5 GTexels fill-rate (without overdraw) and >350M Tris. Assuming it's a quad-core design, one possible scenario would be 8 Vec5 ALUs and 2 TMUs per core, clocked at 667MHz.
8 * 10 FLOPs * 0.667GHz = 53.36 GFLOPs
2 * 667MHz = 1.33 GTexels
53.36 * 4 cores = 213.44 GFLOPs
1.33 * 4 cores = 5.32 GTexels
...and that's still high-end smartphone/tablet ballpark for the A9600. If true, you'd need roughly 37 cores to reach the 2 TFLOPs mark at 667MHz, or alternatively clock them at 1GHz and get away with 25 cores.
***edit: if you wanted to translate the VecX ALUs into the typical SP (stream processor) parlance of the desktop GPU world, 8 Vec5 ALUs would be 40 SPs per core, so 25 cores = 1000 SPs and 37 cores = 1480 SPs. I'm fairly sure I'll turn out to be wrong in the end, but that's what speculative (reverse) math is good for.
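For anyone who wants to play with the numbers, here is a minimal Python sketch of the reverse math above. It assumes that the 10 FLOPs per Vec5 ALU mean 5 lanes doing one multiply-add (2 FLOPs) each per clock; the function names are made up for illustration, and the core configuration is the speculative one from the post, not a confirmed spec.

```python
# Speculative per-core configuration from the post (not a confirmed spec).
ALUS_PER_CORE = 8       # Vec5 ALUs per core
LANES_PER_ALU = 5       # Vec5 = 5 lanes
FLOPS_PER_ALU = 10      # assumption: 5 lanes * 2 FLOPs (multiply-add) per clock
TMUS_PER_CORE = 2


def core_gflops(alus, flops_per_alu, clock_ghz):
    """Peak GFLOPs of one core at the given clock."""
    return alus * flops_per_alu * clock_ghz


def core_gtexels(tmus, clock_ghz):
    """Peak texel rate of one core, one texel per TMU per clock."""
    return tmus * clock_ghz


def cores_for_target(target_gflops, per_core_gflops):
    """Unrounded number of cores needed to hit a GFLOPs target."""
    return target_gflops / per_core_gflops


def stream_processors(cores, alus, lanes):
    """Desktop-style SP count: every ALU lane counts as one SP."""
    return cores * alus * lanes


if __name__ == "__main__":
    per_core = core_gflops(ALUS_PER_CORE, FLOPS_PER_ALU, 0.667)
    print("per core GFLOPs:", per_core)
    print("quad core GFLOPs:", 4 * per_core)
    print("quad core GTexels:", 4 * core_gtexels(TMUS_PER_CORE, 0.667))
    print("cores for 2 TFLOPs @ 667MHz:", cores_for_target(2000, per_core))
    print("SPs for 25 cores:", stream_processors(25, ALUS_PER_CORE, LANES_PER_ALU))
```

Changing the clock or the ALU count per core shows quickly how sensitive the core count for 2 TFLOPs is to either guess.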
Your numbers sound right to me, but I was speculating based on information from John, who works at IMG if I'm not mistaken. If we have 2 times the power of an SGX554 (72 FLOPs/cycle per core), i.e. 144 FLOPs/cycle per core, then perhaps the A9600's 210 GFLOPs come from a two-core "SGX600" at 733MHz, since that size is more appropriate for a cell phone, or as an extreme hypothesis from 4 cores at 366MHz.
If Rogue has the same size as an SGX543, about 8mm^2 per core at 65nm (speculating that an SGX543 would be less than 3.5mm^2 at 28nm), with four times its power (better efficiency in transistor density etc.), then maybe 32 Series 6 cores are possible in 260mm^2 at 28nm, each processing 144 FLOPs per cycle at 500MHz, which would reach 2.3 TFLOPs in the most optimistic of scenarios.
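The alternative scenarios in this post can be checked the same way. This is a rough Python sketch that takes the 144 FLOPs/cycle per core figure at face value (it is speculation from the post, not a published number); `peak_gflops` is a hypothetical helper name.

```python
# Speculative figure from the post: 2x an SGX554 core (72 FLOPs/cycle).
FLOPS_PER_CYCLE = 2 * 72  # = 144, assumption, not a confirmed spec


def peak_gflops(cores, flops_per_cycle, clock_mhz):
    """Peak GFLOPs = cores * FLOPs per cycle per core * clock in GHz."""
    return cores * flops_per_cycle * clock_mhz / 1000.0


if __name__ == "__main__":
    # (label, core count, clock in MHz) for each scenario in the post
    scenarios = [
        ("A9600, 2 cores @ 733MHz", 2, 733),
        ("A9600, 4 cores @ 366MHz", 4, 366),
        ("console, 32 cores @ 500MHz", 32, 500),
    ]
    for label, cores, mhz in scenarios:
        print(label, "->", peak_gflops(cores, FLOPS_PER_CYCLE, mhz), "GFLOPs")
```

Both A9600 variants land just above the claimed 210 GFLOPs, and the 32-core case comes out at about 2.3 TFLOPs, so the figures in the post are at least internally consistent.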