Actually, 2x supersampling on the DC is strictly 1280x480.
Edit: someone was able to grab the framebuffer from two supersampled DC games. It's 1280x480. Massive, but I guess the DC doesn't do vertical SS.
You technically can do vertical AA (I've done full 2x2 SSAA), but it's not really worth it. You have to render the top and bottom halves of the screen separately, resubmitting the geometry for each half.
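To make the two-pass idea concrete, here is a rough sketch of just the coordinate math. It assumes screen-space vertices in a 640x480 space and a virtual 1280x960 image split into two 1280x480 passes. `pass_transform` and `visible_in_pass` are made-up helpers for illustration, not PVR or KallistiOS calls, and the sketch says nothing about how the two halves end up scaled down into the final framebuffer.

```python
# Hypothetical sketch: map 640x480 screen-space vertices into each of two
# 1280x480 render passes that together cover a virtual 1280x960 image.
PASS_W, PASS_H = 1280, 480   # largest size the TA handles, per the post
SCALE = 2                    # 2x2 supersampling


def pass_transform(x, y, pass_index):
    """Return the vertex position inside pass 0 (top half) or pass 1 (bottom half)."""
    vx = x * SCALE                        # virtual x in 0..1280
    vy = y * SCALE                        # virtual y in 0..960
    return vx, vy - pass_index * PASS_H   # shift the bottom half up into view


def visible_in_pass(y_min, y_max, pass_index):
    """True if a primitive spanning y_min..y_max (640x480 space) touches this pass."""
    top = pass_index * PASS_H / SCALE     # 0 or 240 in screen space
    bottom = top + PASS_H / SCALE         # 240 or 480 in screen space
    return y_max > top and y_min < bottom
```

Anything that straddles the middle of the screen has to go through both passes, which is where the extra geometry cost comes from. Skipping primitives that only touch one half is an optimization assumed here, not something described above.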
Another way to render above 1280x480, which I haven't tried, would be to generate the polygon lists for the PVR on the CPU. The data structures seem to support up to 2048x2048; it's just that the tile accelerator, the part that receives rendering commands from the CPU and breaks them up into tiles, only supports up to 1280x480. I doubt this approach would have much of an advantage over double submitting, outside of very low poly scenes or specialized cases.
I've tried to think of ways to trick the PVR into double rendering a 640x480 scene so that you get vertical AA out of it, but it doesn't seem possible. It probably would have been simple to add a hardware feature that adjusts how the TA operates and lets it process tiles as pseudo 64x32, 32x64, or 64x64, but that feature doesn't exist.
2xSSAA does take some extra RAM, in the range of 100-200KB depending on the scene, but nowhere near as much as doubling the framebuffer size. Each tile needs a list of the polygons it contains, and with more tiles you need more lists, which take up space.
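Some back-of-the-envelope numbers for the tile lists, as a sketch only. It assumes 32x32 pixel tiles (implied by the "pseudo 64x32" remark), a 16-bit framebuffer, and 1KB = 1024 bytes. The per-tile figure is simply the 100-200KB range divided by the number of extra tiles, not a measured value.

```python
TILE = 32  # assumed tile size in pixels


def tile_count(width, height, tile=TILE):
    """Number of tiles needed to cover a width x height render area."""
    across = (width + tile - 1) // tile   # round up for partial tiles
    down = (height + tile - 1) // tile
    return across * down


def framebuffer_bytes(width, height, bytes_per_pixel=2):
    """Size of one framebuffer, assuming 16-bit pixels by default."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    base = tile_count(640, 480)
    ssaa = tile_count(1280, 480)
    extra_tiles = ssaa - base
    print("tiles at 640x480: ", base)
    print("tiles at 1280x480:", ssaa)
    print("640x480 framebuffer in KB:", framebuffer_bytes(640, 480) / 1024)
    # Implied average list overhead per extra tile for the quoted range.
    for extra_kb in (100, 200):
        print(extra_kb, "KB extra ->", round(extra_kb * 1024 / extra_tiles), "bytes per extra tile")
```

The point is just that the tile count doubles while the extra list storage stays well under the size of a second 640x480 framebuffer.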
The performance cost of AA varies depending on the scene, but what I've seen in game-like situations is a cost of something like 40-60%. Long horizontal spans can be rendered more efficiently than short ones, and increasing the horizontal resolution makes the spans longer, so with the extra efficiency it's less than a 2x cost. Fillrate cost almost doubles, but polygon setup cost doesn't (it does go up some, but less than double), so some of the extra fillrate load gets hidden under existing stalls elsewhere. Large transparent quads would be closer to costing double: since they're low poly and less minified, the fillrate cost can't be hidden.
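A toy model of why the cost lands below 2x. This is not how the PVR actually schedules work; it just assumes that, for each chunk of the screen, setup and fill overlap so the slower of the two decides the time. The scale factors (1.25x setup, 1.8x fill) and the per-region numbers are invented for illustration.

```python
def frame_cost(regions, setup_scale=1.0, fill_scale=1.0):
    """Sum per-region cost, assuming setup and fill overlap (slower one wins)."""
    return sum(max(setup * setup_scale, fill * fill_scale)
               for setup, fill in regions)


def ssaa_ratio(regions, setup_scale=1.25, fill_scale=1.8):
    """Cost with 2x horizontal SSAA relative to the cost without it."""
    return frame_cost(regions, setup_scale, fill_scale) / frame_cost(regions)


# Hypothetical (setup, fill) costs per screen region, arbitrary units.
game_like = [(4.0, 2.0), (3.0, 3.0), (2.0, 4.0)]   # mix of setup- and fill-bound
big_quads = [(0.5, 8.0)]                           # low poly, purely fill-bound

if __name__ == "__main__":
    print("game-like scene ratio:", ssaa_ratio(game_like))
    print("large quad ratio:     ", ssaa_ratio(big_quads))
```

In this model the setup-bound regions absorb part of the extra fill for free, while the fill-bound quad pays the full fill increase, which matches the behaviour described above in spirit if not in exact numbers.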
From what I've seen, SSAA with mipmaps almost always performs better than no SSAA without mipmaps (assuming you can see into the distance with some texture minification, and aren't face-first into a bilinear upscaled wall). So games without mipmaps, like Shenmue or Sega GT, would see a GPU performance increase if you managed to fit in mipmaps somehow and enabled AA.