Oh, hey. That old thing. I'm actually working towards an interactive normal map demo that people can run on real hardware and experiment with. There will be a couple of different models and shading methods to pick from, and you'll be able to rotate the model and light source to see the results.
With the PVR, the only real overhead to normal maps is on the CPU: transforming the light vector into texture space and converting it to polar coordinates. For a static light source, there is no difference in CPU load, GPU polygon throughput, or fillrate, as far as I can tell. This is compared to a regular detail texture; the PVR only has one TMU, so multitexturing requires multiple passes, but the same number of passes (two) is required for diffuse + detail or diffuse + normal. The only difference I can think of is that a standard detail texture would be much more texture-compression friendly than a normal map.