Okay, so this discussion would probably require someone with a programming background to get into the details, as I'm more of an end user. I'll try to bring up a few possible issues though.
Sampling is key. AA for geometry and shading, better motion blur (if the studio is willing to pay for it in rendering time), high-quality texture filtering, shadow map samples - they all add up to a cleaner look.
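To make the sampling point concrete, here's a toy Python sketch - the scene function is completely made up, and real renderers place their samples more cleverly than pure random jitter, but the core idea of averaging many samples per pixel is the same:

```python
import random

def scene(x, y):
    # Hypothetical stand-in for the renderer: the color of the scene at
    # continuous image coordinates (x, y). A hard edge at x = 0.5 makes
    # the aliasing obvious.
    return 1.0 if x < 0.5 else 0.0

def render_pixel(px, py, samples):
    # Average `samples` jittered samples inside pixel (px, py).
    # samples=1 is the cheap real-time case; offline renderers routinely
    # take 16+ per pixel (and more once motion blur enters the picture).
    total = 0.0
    for _ in range(samples):
        total += scene(px + random.random(), py + random.random())
    return total / samples

print(render_pixel(0, 0, 1))    # 0.0 or 1.0: a hard, aliased edge
print(render_pixel(0, 0, 64))   # ~0.5: the edge coverage gets resolved
```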
Shader quality and precision probably differ as well. An offline renderer usually offers a more sophisticated Phong shading model than the hardwired stuff in GPUs, and then there's the more precise Blinn shader, Ward anisotropic stuff, layered shaders and so on.
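For reference, the Phong and Blinn specular terms look roughly like this - these are the textbook formulas, with made-up vectors just to show that the two models produce differently shaped highlights:

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_specular(n, l, v, shininess):
    # Classic Phong: reflect the light about the normal, compare with view
    r = tuple(2 * dot(n, l) * nc - lc for nc, lc in zip(n, l))
    return max(0.0, dot(r, v)) ** shininess

def blinn_specular(n, l, v, shininess):
    # Blinn: use the half vector between the light and view directions
    h = normalize(tuple(lc + vc for lc, vc in zip(l, v)))
    return max(0.0, dot(n, h)) ** shininess

n = (0.0, 1.0, 0.0)                 # surface normal
l = normalize((0.3, 1.0, 0.0))      # direction to the light
v = normalize((-0.5, 1.0, 0.0))     # direction to the viewer
print(phong_specular(n, l, v, 32))  # ~0.62
print(blinn_specular(n, l, v, 32))  # ~0.89 -- a wider, softer lobe
```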
True, there are advanced shaders for hardware as well, but AFAIK they often only calculate the lighting at the vertices, and the hardware interpolates those results across the triangle for the actual pixels. Also, shading is calculated for a pixel only once, and it gets source data like normals from textures which might not have a 1:1 texel-to-pixel ratio. 24-bit normal maps, especially when compressed, won't look good either.
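Here's a rough sketch of why per-vertex lighting loses highlights. The vectors are made up for illustration, but the math is the standard diffuse-plus-sharp-specular kind:

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_at(n, l, h, exponent=64):
    # Diffuse plus a sharp Blinn-style specular term for one normal
    return max(0.0, dot(n, l)) + max(0.0, dot(n, h)) ** exponent

# Two vertex normals straddling a highlight that peaks between them
n0 = normalize((-0.3, 1.0, 0.0))
n1 = normalize((0.3, 1.0, 0.0))
l = h = (0.0, 1.0, 0.0)   # light and half vector pointing straight up

t = 0.5  # a pixel halfway along the edge
# Per-vertex path: light the vertices, interpolate the finished colors
gouraud = (1 - t) * light_at(n0, l, h) + t * light_at(n1, l, h)
# Per-pixel path: interpolate the normal, renormalize, then light it
n_mid = normalize(tuple((1 - t) * a + t * b for a, b in zip(n0, n1)))
per_pixel = light_at(n_mid, l, h)
print(gouraud)    # ~1.02 -- the highlight is all but lost
print(per_pixel)  # ~2.00 -- per-pixel shading catches the peak
```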
Geometry. A rendered version of a CG character is usually tessellated to a few hundred thousand polygons even in the basic renderers (like if you run it through the built-in 3ds max scanline, Maya or LW renderer). You usually get very close to a 1:1 pixel-to-polygon ratio. More geometry means better shading, which is especially evident when you take a look at PRMan images - there's always a better than 1:1 ratio between micropolygons and pixels.
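Some back-of-envelope numbers, assuming a 640x480 frame (a plausible resolution for these cinematics; the shading rate below is also just an illustrative value, not a quote from any production):

```python
width, height = 640, 480
pixels = width * height
print(pixels)  # 307,200 -- so "a few hundred thousand polygons" really is ~1:1

# A REYES-style renderer like PRMan dices surfaces into micropolygons.
# With a shading rate of 0.25 (pixel area per micropolygon), you get
# several micropolygons per pixel:
shading_rate = 0.25
print(pixels / shading_rate)  # ~1.2 million micropolygons, better than 1:1
```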
If you render low-poly geometry in an offline renderer, it'll have shading interpolation artifacts as well, and the result will lose some of its quality. I guess it has to do with the difference between a normal interpolated across a flat polygon from its neighbouring vertices and the 'true' normal the smooth surface would have at that point. Dense geometry simply responds better to the lights... and games still have low-res geometry, so the lighting data fed to the pixel shader probably isn't good enough when working with normal mapping.
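A quick sketch of the problem: the sharper the specular term, the faster it dies off between neighbouring vertex normals, and interpolation can only blend linearly across that gap. The segment counts here are made-up examples (think of rings around a cylinder):

```python
import math

def specular(angle_deg, exponent=64):
    # How bright a cos^n highlight is at a given angle off its peak
    return max(0.0, math.cos(math.radians(angle_deg))) ** exponent

for segments in (8, 32, 128):
    gap = 360.0 / segments  # angle between adjacent vertex normals
    # Intensity at a vertex facing the half vector dead-on vs. its
    # neighbour one edge away -- everything in between is just a lerp:
    print(f"{segments:4d} segments, {gap:6.2f} deg gap: "
          f"{specular(0.0):.3f} -> {specular(gap):.6f}")
```

With 8 segments the highlight drops to practically zero one vertex over, so a sharp highlight turns into a faceted streak; at 128 segments the neighbouring value is still ~0.93 and the interpolation holds up.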
This all goes for both diffuse shading and specular highlights, by the way.
Texture resolution is obvious and has been mentioned already. Mip mapping isn't as aggressive most of the time either, so objects keep their sharp details even when they're further away from the camera.
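The standard mip selection is just a log2 of the texel footprint per pixel, plus an artist-controlled bias - a sketch, with made-up numbers:

```python
import math

def mip_level(texels_per_pixel, lod_bias=0.0):
    # log2 of how many texels fall under one pixel, plus a bias.
    # A negative bias keeps a sharper mip level selected for longer.
    return max(0.0, math.log2(texels_per_pixel) + lod_bias)

# An object far enough away that 8 texels fall under each pixel:
print(mip_level(8.0))        # level 3.0 -- fairly blurry
print(mip_level(8.0, -1.0))  # level 2.0 -- sharper, at some shimmer risk
```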
Also, there's a lot of compositing done on CG stuff - fine-tuning colors, contrast, blur/sharpness, and tweaking everything until it looks good enough. It's pretty similar to how you'd enhance an image in Photoshop. Some of this should appear in nextgen console engines, by the way...
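In spirit, it's a stack of simple per-pixel passes like this - a minimal sketch of a lift/gain/gamma grade, with arbitrary example values:

```python
def grade(pixel, lift=0.0, gain=1.0, gamma=1.0):
    # One color-correction pass per channel: lift the blacks,
    # scale the whites, then bend the midtones with a gamma curve
    return tuple(
        min(1.0, max(0.0, c * gain + lift)) ** (1.0 / gamma)
        for c in pixel
    )

# Stack a few passes, the way a compositor massages a rendered frame:
pixel = (0.40, 0.35, 0.30)
pixel = grade(pixel, gain=1.15)   # brighten overall
pixel = grade(pixel, gamma=1.2)   # lift the midtones a touch
print(pixel)
```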
It's also worth noting that CGI assets might have more time put into them simply because you don't need that many sets and characters. Warcraft 3 had about a dozen 'hero' resolution CGI characters, but at least 5-6 times as many ingame characters/creatures. You simply cannot spend as much time on the ingame stuff because of the sheer amount of work.
Also, CG quality can range from just above ingame levels to almost movie VFX quality stuff. Blizzard, Blur, Square and a few other studios are on the verge of producing animated features (Blur is in preproduction most likely, and we all know about 'Spirits Within' and 'Advent Children' I guess).