I don't think sprites would be the way to achieve this level of quality. Not in motion anyway. You're looking at complex lighting models, subsurface scattering, various "fog" volume exercises with irradiance transfer, camera effects, heat blur, etc. You don't want to stick your perfect directional highlight on a sprite only to watch it fall apart as the objects move but your lights don't follow.
This is all work for a 3D graphics engine, just with gameplay and camera restricted to their respective planes.
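To make the sprite problem concrete, here's a rough sketch in plain Python. It assumes the simplest possible lighting model (Lambert diffuse, nothing like the full models mentioned above), and every name in it (`lambert`, `rotate_y`, `shade_after_turn`, `LIGHT_DIR`) is made up for illustration. The "baked" value stands in for a highlight painted into a sprite once; the "live" value is what a 3D engine would recompute each frame from the surface normal.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambert(normal, light_dir):
    """Diffuse intensity: max(N . L, 0) on unit vectors."""
    n = normalize(normal)
    l = normalize(light_dir)
    return max(0.0, sum(a * b for a, b in zip(n, l)))

def rotate_y(v, angle):
    """Rotate a vector around the Y axis by `angle` radians."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

# Assumption: one fixed light, given as the direction from surface to light.
LIGHT_DIR = (0.0, 0.0, 1.0)
# Assumption: the surface starts out facing the light head-on.
START_NORMAL = (0.0, 0.0, 1.0)

# Sprite approach: lighting is evaluated once and painted into the image.
BAKED = lambert(START_NORMAL, LIGHT_DIR)

def shade_after_turn(angle):
    """Return (baked, live) intensity after the object turns by `angle`."""
    live = lambert(rotate_y(START_NORMAL, angle), LIGHT_DIR)
    return BAKED, live

if __name__ == "__main__":
    for degrees in (0, 30, 60, 90):
        baked, live = shade_after_turn(math.radians(degrees))
        print(f"{degrees:3d} deg  baked={baked:.2f}  live={live:.2f}")
```

The baked number never changes while the live one drops off as the surface turns away from the light, which is the "falls apart in motion" effect. A real engine would do this per pixel in a shader with far richer models, but the dependency on the current normal is the same.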
How about voxels with pixel shaders and a nice lighting engine? Since the camera is fixed, a voxel engine could actually work.
There was a voxel fighting game over a decade ago whose name I can't remember now, but scaling that tech up with modern processing power could get good results.