Digital Foundry Article Technical Discussion Archive [2011]

KZ3 has 70 minutes of videos occupying 32 GB. That works out to roughly 32 x 1000 / 70 = 457 MB of compressed storage per minute of video on average.
The videos are stored twice, once in 2D, once in stereoscopic 3D, so you're dealing with 140 minutes of video.
This direct grab is littered with compression artifacts. Let's count those blocks.

[image: 1-0421ixa5.jpg]
 
I suddenly know why they've picked Bink! This way, the videos blend in perfectly with the low-res framebuffer effects. :devilish:
 
Maybe less CPU intensive while doing background streaming or level loading.

Hmm, I wonder if that's true; maybe Bink's CPU load is just far lower than that of the other options. Is it also possible that Bink just works better at really low bitrates compared to H.264 and VC-1? Since their videos are meant to hide loading, keeping the video bitrate low is important. I've read that H.264 doesn't work as well at low bitrates as, say, VC-1, and my own video encoding work pretty much confirms that I can get better-quality video at low bitrates with VC-1 than with H.264. Of course Sony would never use VC-1, but maybe that's why they chose Bink over H.264, since it better suits their target bitrate?
 
How does the 3D effect work on the 360 considering that its HDMI is of an older version?

Yup. From what I can tell they didn't change the render resolution in the beta, so it's 1152x720 packed into 960x1080.
http://forum.beyond3d.com/showpost.php?p=1525909&postcount=1406

One of the 3D modes that displays can detect is a packed buffer: they pack the individual views into the left and right halves of a 1280x720 or 1920x1080 (in Crysis 2's case) buffer. Of course, this means you'll be losing a lot of information in the former due to the downscale and re-upscale, assuming you're starting with full (or close-to-full) frame renders (2D+depth or the fake parallax that the Crysis 2 beta employed). Were they to render both views at half-res in the first place, it wouldn't be an issue (proper 3D parallax).
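To make the trade-off concrete, here's a minimal sketch (Python; the buffer sizes are taken from the post above and the arithmetic is only illustrative) of how much horizontal information each eye keeps in a side-by-side packed buffer:

```python
# Per-eye share of a side-by-side packed buffer vs. a full-width 2D render
# (buffer sizes from the discussion; purely illustrative arithmetic).
for buf_w, buf_h in [(1280, 720), (1920, 1080)]:
    eye_w, eye_h = buf_w // 2, buf_h
    print(f"{buf_w}x{buf_h} buffer -> {eye_w}x{eye_h} per eye "
          f"({eye_w / buf_w:.0%} of horizontal samples kept)")
```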

It's rather curious that the 360 hardware simply doesn't output 1280x1470, which is well within the HDMI cable's bandwidth. Grandmaster actually covers this a bit here:
http://www.eurogamer.net/articles/digitalfoundry-xbox360-3d-ready-article

Digital Foundry said:
all output from the GPU is routed through the HANA video processor - the chip that converts the framebuffer into HDMI, component, and legacy standard def outputs. Connect up the 360 to a DVI monitor and you can see that HANA is a pretty useful piece of kit: it's able to support just about any single-link DVI resolution - even relatively obscure ones such as 1440x900. However, notable by its absence is 1920x1200, the de facto standard top-end resolution for single-link DVI, and utilised by a large amount of 24" LCDs, and from several of our developer sources, we've learned there's still no support for it in the current revision of the upcoming Kinect dash,
...
Quite why it is missing is a bit of a puzzle: its omission suggests that HANA has set limitations to vertical frequency, which may preclude the 1280x1470 HDMI 1.4 set-up used by the PlayStation 3. And even if it can output the resolution, there's no guarantee that HANA would be able to offer HDMI 1.4 handshakes to the 3DTV
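For what it's worth, a back-of-the-envelope check (counting active pixels only, assuming a 60 Hz refresh and ignoring blanking intervals) suggests the 1280x1470 frame-packed mode actually pushes fewer active pixels per second than plain 1080p60, so raw bandwidth shouldn't be the limiting factor:

```python
# Approximate active pixel rate (ignoring blanking intervals) for the
# HDMI 1.4 frame-packed 720p mode (1280x1470) vs. ordinary 1080p, at 60 Hz.
def active_mpixels_per_s(width, height, refresh_hz=60):
    return width * height * refresh_hz / 1e6

print(f"1280x1470 @ 60 Hz: ~{active_mpixels_per_s(1280, 1470):.0f} Mpixels/s")
print(f"1920x1080 @ 60 Hz: ~{active_mpixels_per_s(1920, 1080):.0f} Mpixels/s")
```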
 
Hmm, I wonder if that's true; maybe Bink's CPU load is just far lower than that of the other options. Is it also possible that Bink just works better at really low bitrates compared to H.264 and VC-1? Since their videos are meant to hide loading, keeping the video bitrate low is important. I've read that H.264 doesn't work as well at low bitrates as, say, VC-1, and my own video encoding work pretty much confirms that I can get better-quality video at low bitrates with VC-1 than with H.264. Of course Sony would never use VC-1, but maybe that's why they chose Bink over H.264, since it better suits their target bitrate?

It's the other way around. They had disc space to burn, so using Bink, which requires a higher bitrate than more resource-hungry algorithms to get a decent look, wasn't an issue. The more resource-hungry the algorithm, the better it tends to look at lower bitrates: the video is compressed more, so it is a more intensive task to decompress.
 
http://forum.beyond3d.com/showpost.php?p=1525909&postcount=1406

One of the 3D modes that displays can detect is a packed buffer: they pack the individual views into the left and right halves of a 1280x720 or 1920x1080 (in Crysis 2's case). Of course, this means you'll be losing a lot of information in the former due to the downscale and re-upscale, assuming you're starting with full (or close-to-full) frame renders (2D+depth or the fake parallax that the Crysis 2 beta employed). Were they to render at half-res in the first place, it wouldn't be an issue (proper 3D parallax).

It's rather curious that the 360 hardware simply doesn't output 1280x1470, which is well within the HDMI cable's bandwidth. Grandmaster actually covers this a bit here:
http://www.eurogamer.net/articles/digitalfoundry-xbox360-3d-ready-article
Oh OK, thanks for the information. You've provided all I need to know :)
 
You must research this a bit more, my friend

See it this way: if you put on glasses that show one picture to your left eye and the same picture to your right eye, your brain won't interpret it as double the visual information.

Actually, it does. It does not interpret what it sees as double the number of objects, but it perceives twice the qualitative information about the object (actually more, through more accurate estimation of unrendered or in-between detail). This has been shown many times in tests where subjects can perceive vastly more information about a scene in 3D than in 2D, and in less time. Understanding depth and object placement relative to one another at a glance is just one of many benefits of the stereoscopic vision we have evolved.

If you show completely different pictures to each eye, though, your brain will start mixing the information. This is something some scientists have done with simple forms of video games to cure stereoblindness, but one image misses huge chunks of information that the other has, and vice versa.

You are right that the brain mixes information. That is one reason why stereo 3D (where, unlike 2D, each eye sees a different image) can be lower-res but give much more information.

This is possible because the brain sees both sets of information and tries to fill in the gaps, so to speak. This is the value of true stereoscopic 3D vs. fake image-warping 3D. In true stereoscopic 3D each eye sees different color values across objects. The brain aligns the geometry and uses the differing color information, due to subtle differences in lighting angle, shadow darkness, etc., to create a more detailed understanding than is possible with a simple 2D image.

The only way I can see this working the way you suggest (double the visual information) is if there is some smart implementation where the pixels are placed in a checkerboard pattern, where image A has pixels where image B doesn't and vice versa, and even then I am not sure how this would work in producing a 3D image.

Pixels have color (or brightness) values. The brain aligns the geometry and extrapolates qualitative information based on things such as subtle differences in the color values at each point of an object, and it fills in missing or in-between details based on the two values that result from each eye's different angle relative to a given position on the object and the light source. It cannot do this in 2D. In truth, stereoscopic 3D gives vastly more information, not just twice as much as 2D.
 
Exactly how are you reconstructing the texture information... Your imagination doesn't really count if the information isn't there to begin with. Just look at the 3D vs 2D screenshot and you're simply missing a ton of resolution from the lower res mipmaps. No amount of imaginary interpolation is going to give you back the information. That's just silly. You're certainly not going to interpolate the specular map that is entirely missing on the wall in the 3D shot.
 
It looks worse because you're looking with one eye so you only see half the information. It's like listening to a stereo track with one ear and saying the recording is bad.
...

Taking your stereo soundtrack example, the comparison you suggested is:

8-bit mono sound vs. 8-bit stereo sound (assuming the frequency is the same).

The right way to do this comparison is:

8-bit mono sound vs. 4-bit stereo sound (assuming the frequency is the same).
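For what it's worth, the raw PCM data-rate arithmetic behind that comparison does come out even; a minimal sketch (assuming PCM and an arbitrary 44.1 kHz sample rate, chosen purely for illustration):

```python
# Raw PCM data rate = bits per sample x channels x sample rate.
def pcm_kbit_s(bits_per_sample, channels, sample_rate=44_100):
    return bits_per_sample * channels * sample_rate / 1000

print(f"8-bit mono:   {pcm_kbit_s(8, 1):.1f} kbit/s")
print(f"4-bit stereo: {pcm_kbit_s(4, 2):.1f} kbit/s")  # same total bit budget
```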

K, I should have used 16-bit vs. 8-bit, but I was trying to make a point. :)
 
Taking your stereo soundtrack example, the comparison you suggested is:

8-bit mono sound vs. 8-bit stereo sound (assuming the frequency is the same).

The right way to do this comparison is:

8-bit mono sound vs. 4-bit stereo sound (assuming the frequency is the same).

K, I should have used 16-bit vs. 8-bit, but I was trying to make a point. :)


Actually, this is incorrect. Bit depth determines dynamic range, and that is not analogous to screen resolution, because 3D and 2D video have the same dynamic range. What is analogous to screen resolution is the sampling rate: the sampling rate determines the frequency information, and that is something like resolution.
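To make that distinction concrete, a small sketch (the ~6 dB-per-bit and Nyquist figures below are standard signal-processing approximations, not something taken from this thread):

```python
# Bit depth sets dynamic range (~6.02 dB per bit); sample rate sets the
# highest representable frequency (Nyquist limit = sample_rate / 2).
def dynamic_range_db(bits):
    return 6.02 * bits

def nyquist_hz(sample_rate):
    return sample_rate / 2

print(f"16-bit audio: ~{dynamic_range_db(16):.0f} dB of dynamic range")
print(f"44.1 kHz sampling: frequencies up to {nyquist_hz(44_100):.0f} Hz")
```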

When you listen to stereo audio at the same total bitrate as mono audio, each ear gets different information, and the brain combines them to create a total perception that is much more than the sum of its parts.

In the same way, when you watch stereoscopic 3D video, each eye gets a unique image that the brain combines into a total perception that is much more than the sum of its parts.

This is because stereo information taps into our brain's ability to infer additional information from what is given, in a way that mono cannot.
 
Interlacing

The way he explains it would be accurate if the game used interlaced frames, but I don't think that's the case.

Actually, the brain automatically interlaces the different images from each eye by aligning the geometry in a mental map and then combining the color/brightness information from each image. The farther away the object, the smaller this effect is (because distant objects show less difference between the left and right images), but for nearby objects it is a very great benefit. This is why we evolved stereo vision and stereo hearing.

Our brain does this every day with our actual eyes looking at real objects. Our eyes are not interlaced. Same with our ears.
 
Actually, the brain automatically interlaces the different images from each eye by aligning the geometry in a mental map and then combining the color/brightness information from each image. The farther away the object, the smaller this effect is (because distant objects show less difference between the left and right images), but for nearby objects it is a very great benefit. This is why we evolved stereo vision and stereo hearing.

Our brain does this every day with our actual eyes looking at real objects. Our eyes are not interlaced. Same with our ears.
It's not really interlacing. In this case, since both frames are progressive, you're still going to see blurry IQ and double-sized jaggies, just at double the frequency of 2D.
 
The videos are stored twice, once in 2D, once in stereoscopic 3D, so you're dealing with 140 minutes of video.

Yes… I mentioned we need to take care of the 3D videos in my previous post. There is still a factor of 4-5 between the KZ3 compressed video storage and C2's intro segment. Assuming the 3D videos take double the storage, the KZ3 videos should still have a better bitrate on average (yes, the original question was "I hope KZ3 has better bitrate").
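A rough sketch of the implied figures (assuming 32 GB total, the 70 minutes of content stored twice as noted above, and decimal GB/MB; these numbers are only ballpark):

```python
# Back-of-the-envelope storage and average bitrate for the KZ3 videos,
# assuming 32 GB total and 70 minutes stored twice (2D + stereoscopic 3D).
TOTAL_GB = 32
MINUTES_TOTAL = 70 * 2

mb_per_minute = TOTAL_GB * 1000 / MINUTES_TOTAL  # ~229 MB per minute of video
avg_mbit_s = mb_per_minute * 8 / 60              # ~30.5 Mbit/s average

print(f"~{mb_per_minute:.0f} MB/min, ~{avg_mbit_s:.1f} Mbit/s average")
```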

This direct grab is littered with compression artifacts. Let's count those blocks.

[image: 1-0421ixa5.jpg]

It could be worse if they used a lower bitrate.


EDIT:
Hmm, I wonder if that's true; maybe Bink's CPU load is just far lower than that of the other options. Is it also possible that Bink just works better at really low bitrates compared to H.264 and VC-1? Since their videos are meant to hide loading, keeping the video bitrate low is important. I've read that H.264 doesn't work as well at low bitrates as, say, VC-1, and my own video encoding work pretty much confirms that I can get better-quality video at low bitrates with VC-1 than with H.264. Of course Sony would never use VC-1, but maybe that's why they chose Bink over H.264, since it better suits their target bitrate?

H.264 and VC-1 have multiple profiles, some of which are designed for low-bitrate environments. If you use the right profiles, they should be comparable. I thought VC-1 was no longer actively pushed by MS because implementers have to pay MPEG-LA patent fees (VC-1 violated MPEG-4 patents). Given a choice, Sony may still go for H.264 because their implementations are probably more widely used and optimized.

EDIT 2:
It's the other way around. They had disc space to burn, so using Bink, which requires a higher bitrate than more resource-hungry algorithms to get a decent look, wasn't an issue. The more resource-hungry the algorithm, the better it tends to look at lower bitrates: the video is compressed more, so it is a more intensive task to decompress.

It would be nice if we knew what the bitrate is.
 
No. I suggest researching audio and visual perception, and neuroscience.

It's not really interlacing. In this case, since both frames are progressive, you're still going to see blurry IQ and double-sized jaggies, just at double the frequency of 2D.

No. The brain takes two full progressive frames and combines the different left and right information for each point of geometry. This is not difficult to understand if you stop thinking that the brain operates the way you render images. It will help you get a gist of how the brain interlaces the different images if you think of a lenticular 3D system with two full progressive-scan inputs (reminder: actual brain processing is much more sophisticated).

Your brain does this every day with everything in the real world that you look at and hear. Yes, your brain is much smarter than you think.
 
No. The brain takes two full progressive frames and combines the different left and right information for each point of geometry. This is not difficult to understand if you stop thinking that the brain operates the way you render images. It will help you get a gist of how the brain interlaces the different images if you think of a lenticular 3D system with two full progressive-scan inputs (reminder: actual brain processing is much more sophisticated).

Your brain does this every day with everything in the real world that you look at and hear. Yes, your brain is much smarter than you think.

Your mistake is thinking that our brain perceives 3D displays the same way it perceives the natural world. It doesn't. Otherwise we wouldn't have to simulate effects like motion blur.

The brain will interpret the data as 3D, but as blurry 3D ;)
 
It looks like you're arguing different points. The brain should not be able to fill in missing details from lower-resolution textures, but it can perceive more info in 3D art given the same resolution, DPI, etc.
 
It looks like you're arguing different points. The brain should not be able to fill in missing details from lower-resolution textures, but it can perceive more info in 3D art given the same resolution, DPI, etc.
I concur. The information the brain can interpolate/extrapolate is a different type of information from image resolution and texture resolution. It can't necessarily magically ignore big chunky pixels or chunky texels, although a left and a right sample at the same image coordinate will result in a perception different from a single sample. Very little of what we see is actually 'seen': the image your brain presents to your conscious mind is almost entirely an evaluative construct, with a titchy splodge of real-world light captured at the point of focus.

A 3D image will be great for filling in general information about the scene and providing an understanding of the world, but it cannot overcome the fundamental resolving power of the eye when it comes to making out pixelation, nor fill in missing information. In fact, brains tend to do the opposite, reducing the information from a scene to get to the important bits, so they cannot be expected to look at a texture sampled at two adjacent texels rendered in the same pixel space and infer the missing detail.
 