The wording may mean that FLOPS "scale" to somewhere around 1 TFLOPS on the 4870X2.
The Terascale article only involved the 4850/4870 SKUs.
So most probably it is on one GPU. And if the die really is 250mm^2, that's quite an accomplishment regardless of how much real power they can squeeze out before being CPU-limited (GPGPU) or limited by other factors.
Quasar, remove the scaling optimization part (aka plan RV670: Steroids), and it's still plausible. I'll admit I'm probably just a passerby in all of this, but quite a few of the values aren't as straightforward as expected.
I really don't want to go back to the RV635->RV670 discussion all over again, but from that (perhaps skewed) perspective 60mm^2 is more than enough to include the units: 40 ALUs (x5), 8 TMUs, 12 ROPs, and a double-width memory interface. Register sizes are vague (they should have been made bigger given a 2x+ ALU power increase, compared to the current 670->770 scaling), so I'll let that go.
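For a rough sense of what those 40 (x5) units could mean in throughput terms, here is a back-of-envelope sketch. It assumes every ALU lane retires one multiply-add (counted as 2 FLOPs) per clock, and the 750 MHz clock is a placeholder I made up, not a known spec. The function and parameter names are mine, purely for illustration.

```python
def peak_gflops(vec_units, lanes_per_unit=5, clock_mhz=750.0,
                flops_per_lane_per_clock=2):
    """Theoretical peak throughput in GFLOPS.

    vec_units                -- number of ALU blocks (the "40" in "40 ALUs (x5)")
    lanes_per_unit           -- lanes per block (the "x5")
    clock_mhz                -- ASSUMED shader clock, placeholder only
    flops_per_lane_per_clock -- ASSUMED 2 (one multiply-add per lane per clock)
    """
    flops_per_clock = vec_units * lanes_per_unit * flops_per_lane_per_clock
    return flops_per_clock * clock_mhz / 1000.0  # MHz * FLOPs/clock -> GFLOPS


# Contribution of the 40 (x5) units alone at the placeholder clock.
print(f"{peak_gflops(40):.0f} GFLOPS")
```

This is peak math only. It says nothing about what is reachable once the chip is CPU-limited or held back by TMUs, ROPs, or bandwidth.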
How did the component sizes fare with R520/580/670 in terms of TMU/ROPs and the associated baggage, respectively? That should get us out of this sea of potential scaling variables.