I googled a bit about real-time Ethernet systems.
This paper cites 1ms processing time for each TCP/IP frame (for a common controller).
The GT5 Prologue use case could do much better. The network is closed, with only 4 machines and most likely no packet collisions, so all data could be shared via UDP or similar. They could also optimize all the transfers for low lag instead of high streaming throughput. They could always choose a proper, fast router too, and not rely on the cheapest crap. So I don't see that 4ms as a problem, especially as all the bandwidth-consuming data goes to HDMI, not the network.
Edit: If ping is to be believed, I see 1ms round-trip time for 32kB packets and 2ms round-trip time for 64kB packets. And this is in a huge network not optimized for 4 PlayStations.
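
If anyone wants to check UDP round-trip times themselves instead of trusting ping, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names (`udp_echo_server`, `measure_udp_rtt`) are made up, it echoes over loopback (127.0.0.1) rather than a real LAN, and the default 1024-byte payload is arbitrary. It says nothing about how the actual GT5 Prologue setup works.

```python
import socket
import threading
import time


def udp_echo_server(sock, stop_event):
    """Echo every datagram back to its sender until stop_event is set."""
    sock.settimeout(0.1)  # wake up regularly so we can notice stop_event
    while not stop_event.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_udp_rtt(payload_size=1024, rounds=20):
    """Return a list of UDP round-trip times in milliseconds over loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    stop_event = threading.Event()
    thread = threading.Thread(target=udp_echo_server, args=(server, stop_event))
    thread.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)  # UDP gives no delivery guarantee, so do not wait forever
    payload = bytes(payload_size)
    rtts_ms = []
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            client.sendto(payload, server.getsockname())
            reply, _addr = client.recvfrom(65535)
            rtts_ms.append((time.perf_counter() - start) * 1000.0)
            if reply != payload:
                raise RuntimeError("echo did not match what was sent")
    finally:
        stop_event.set()
        thread.join()
        client.close()
        server.close()
    return rtts_ms


if __name__ == "__main__":
    rtts = measure_udp_rtt()
    print(f"min {min(rtts):.3f} ms, max {max(rtts):.3f} ms over {len(rtts)} rounds")
```

To get numbers comparable to the ping figures above, you would run the echo part on one machine and the client on another, pointing the client at the other machine's address. Loopback numbers only show the software overhead, not the wire or the router.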