How do you mean totally free, as in no performance hit whatsoever?
I am not sure how that is possible, as Direct3D MSAA involves rendering the frame and the Z-buffer at a higher resolution.
I'd recommend reading the numerous PowerVR articles and reviews here to understand what is occurring on a tiler.
However, to all intents and purposes the on-chip tile is acting as the frame buffer. Because of this, and because the Z reads are done on-chip for each tile, the extra Z reads are virtually free anyway (KYRO can do 32 Z checks per clock). When the tile is rendered, the subsamples are combined to make the AA'ed pixels, which are passed to the external frame buffer in the 'low' resolution format. So the only costs are the extra on-chip Z reads, which can be 'free', and an increase in the number of 'tiles' the geometry is stored under (which can be offset by an increase in tile size, but that requires more transistors).
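To make the idea concrete, here is a rough Python sketch of the per-tile flow described above: each tile is rendered at subsample resolution into a small buffer standing in for the on-chip tile, the subsamples are averaged, and only the 'low' resolution pixels are written out. The tile size, the 2x2 subsample grid, and names such as `render_tile`, `render_frame` and `shade` are all assumptions made up for illustration; this is not how KYRO is actually implemented, and the Z checks are left out entirely.

```python
# Hypothetical parameters, chosen only for illustration.
TILE_W, TILE_H = 8, 8   # tile size in output pixels (assumed)
SS = 2                  # 2x2 subsamples per output pixel (assumed)


def render_tile(tx, ty, shade, stats):
    """Render one tile at subsample resolution, then resolve it to output pixels."""
    w, h = TILE_W * SS, TILE_H * SS
    # This small buffer stands in for the on-chip tile; it never leaves the "chip".
    tile = [[0.0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            # Subsample centre, expressed in output-pixel coordinates.
            x = tx * TILE_W + (sx + 0.5) / SS
            y = ty * TILE_H + (sy + 0.5) / SS
            tile[sy][sx] = shade(x, y)
            stats["onchip_samples"] += 1

    # Combine the subsamples so only low-resolution pixels go to external memory.
    resolved = []
    for py in range(TILE_H):
        row = []
        for px in range(TILE_W):
            total = sum(
                tile[py * SS + j][px * SS + i]
                for j in range(SS)
                for i in range(SS)
            )
            row.append(total / (SS * SS))
            stats["external_writes"] += 1
        resolved.append(row)
    return resolved


def render_frame(width, height, shade):
    """Render a whole frame tile by tile; returns the frame and simple counters."""
    assert width % TILE_W == 0 and height % TILE_H == 0
    stats = {"onchip_samples": 0, "external_writes": 0}
    frame = [[0.0] * width for _ in range(height)]
    for ty in range(height // TILE_H):
        for tx in range(width // TILE_W):
            resolved = render_tile(tx, ty, shade, stats)
            for py, row in enumerate(resolved):
                y0 = ty * TILE_H + py
                frame[y0][tx * TILE_W:(tx + 1) * TILE_W] = row
    return frame, stats


if __name__ == "__main__":
    # A vertical edge at x = 4.5 that cuts through the middle of one pixel column.
    edge = lambda x, y: 1.0 if x < 4.5 else 0.0
    frame, stats = render_frame(16, 8, edge)
    print(frame[0][3:6], stats)
```

In this toy model the number of samples shaded "on chip" is four times the number of pixels written to the external frame buffer, which is the point being made: the external buffer stays at the 'low' resolution even though the tile was rendered with subsamples.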