16k * 16k * 8k * 4 bytes is ~8 TB.

As for having multiple MTs, one MT at 16k x 16k x 8k x 32 bits is 2 GB in paged memory, and I'm not so confident that the next round of consoles will have more than this, so this one MT will need to hold all of the data for (in this example) one game environment (this may also need to include the other RTs from other parts of the pipeline; I also need to define what a game environment is!).
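A quick way to check the size arithmetic for the 16k x 16k x 8k, 32-bit case. This is only a sketch: it assumes "k" means 1024, uncompressed texels, and no mip chain, and `texture_bytes` is a made-up helper name, not part of any API.

```python
def texture_bytes(width, height, depth, bytes_per_texel):
    """Raw size of an uncompressed texture with no mip levels."""
    return width * height * depth * bytes_per_texel

K = 1024  # assumption: "16k" means 16 * 1024 texels

# Full 3D volume: 16k x 16k x 8k at 4 bytes (32 bits) per texel.
total = texture_bytes(16 * K, 16 * K, 8 * K, 4)
tib = total / 2**40

# A single 16k x 16k 2D slice at the same texel size.
slice_2d = texture_bytes(16 * K, 16 * K, 1, 4)
gib = slice_2d / 2**30

print(f"3D volume: {tib:.1f} TiB")
print(f"one 2D slice: {gib:.1f} GiB")
```

Under those assumptions a 2 GB budget holds about two full-resolution 16k x 16k slices, not 8k of them, so the volume would have to be paged or compressed very aggressively.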
I'm not sure why you would use a 3D texture with megatexturing, unless you are using it for irradiance volumes.
As for using multiple MTs, there are an interesting post and a paper that discuss the possibility:
http://sandervanrossen.blogspot.com/2011/02/infinite-virtual-textures.html#disqus_thread
http://www.cg.tuwien.ac.at/research/publications/2010/Mayer-2010-VT/
You might want to consider fully dynamic lighting, as you wouldn't be constrained by unique texturing if you do that, or by very long texture baking times.

As for having separate channels for colour and light, it should allow for a more vivid experience, with the colour of an object being separate from the light colour, which in turn should help with the post-processing part of the pipeline.
One could use the MT as a nice lighting/shadow cache to get frame-independent lighting (bake the information into different channels during 'textureload' and change it when needed).
I.e. directional occlusion, indirect light, shadow masks, etc. (I also do not see why these should be in full resolution, especially if they have directional information to mix with normal maps.)
For displacement I would suggest testing vector displacement mapping for various reasons (sharp edges and the ability to curve back on itself).

I think you're right about normal storage; normal mapping would be a fallback path, with tessellation being the first choice (supported by another channel for a decal-based damage system). I need to draft the layout of the MT and see how things will fit memory-wise.
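On the vector displacement suggestion, here is a rough sketch of how it differs from ordinary height displacement. It assumes the displacement vector is stored in tangent space as (tangent, bitangent, normal) components, which is only one possible convention, and the function names are made up for illustration.

```python
def displace_scalar(position, normal, height):
    """Height-map displacement: the point can only move along the normal."""
    return tuple(p + height * n for p, n in zip(position, normal))

def displace_vector(position, tangent, bitangent, normal, offset):
    """Vector displacement: offset is a (t, b, n) vector in tangent space."""
    t, b, n = offset
    return tuple(
        p + t * tx + b * bx + n * nx
        for p, tx, bx, nx in zip(position, tangent, bitangent, normal)
    )

# A point on a flat surface with z pointing up.
position = (1.0, 0.0, 0.0)
tangent = (1.0, 0.0, 0.0)
bitangent = (0.0, 1.0, 0.0)
normal = (0.0, 0.0, 1.0)

# Scalar displacement raises the point straight up.
scalar_result = displace_scalar(position, normal, 0.5)

# Vector displacement can also shift it sideways, which is what
# allows overhangs and surfaces that curve back over themselves.
vector_result = displace_vector(position, tangent, bitangent, normal,
                                (-0.25, 0.0, 0.5))

print(scalar_result)
print(vector_result)
```

The cost is three stored components per texel instead of one, which matters when drafting the MT channel layout.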