Kojima "Blu-Ray is too small for Metal Gear Solid 4."

Is this really that different from what happens today, though?

Games have these details at the moment, unless I'm going nuts. Try walking on a wooden floor, then move to carpet in any game made since the PS2 and you'll make a different noise.

Right? Or not?


Creation of a sound is one thing. If you drop a brick on a metal floor, it's going to make a clang. You could just record that sound and play it back in your game.

But in real life the sound your ears hear is not that original sound. The sound reverberates throughout the room, bouncing off different surfaces, and the way sound waves bounce off a wooden, carpeted, stone, or metal floor and walls is different. What hits your ears, and what your brain processes, is that reflected sound.

The sound your ears will hear in a big stone cave is going to be different from the same sound in a small carpeted room. A big empty warehouse is going to sound different from a warehouse full of crates. But those are simple examples. The more complex the scenario, the more processing power it's going to take.
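To make that concrete, here's a toy sketch (my own made-up numbers, nothing from any real engine) of how a reverb tail could depend on the surface material - reflective surfaces keep the echoes ringing, absorbent ones kill them:

Code:
import numpy as np

# Toy absorption table: fraction of energy a surface keeps per bounce.
# These values are illustrative guesses, not measured coefficients.
REFLECTIVITY = {"stone": 0.95, "metal": 0.97, "wood": 0.80, "carpet": 0.40}

def reverberate(dry, surface, bounces=8, delay=2205):
    """Crude echo model: each bounce is delayed and damped according
    to how reflective the surface is. Real reverb is far richer."""
    keep = REFLECTIVITY[surface]
    out = np.zeros(len(dry) + bounces * delay)
    gain = 1.0
    for b in range(bounces + 1):
        out[b * delay : b * delay + len(dry)] += gain * dry
        gain *= keep  # carpet kills the tail quickly, stone keeps ringing
    return out

clang = np.random.randn(4410)        # stand-in for a brick-on-metal sample
cave = reverberate(clang, "stone")   # long, ringing tail
den = reverberate(clang, "carpet")   # dies almost immediately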
 
Is this really that different from what happens today, though?
Games have these details at the moment, unless I'm going nuts. Try walking on a wooden floor, then move to carpet in any game made since the PS2 and you'll make a different noise.
That's created by having two different samples: one for steps on floors, and another for steps on carpet. What happens if you want 6 different types of shoe on 4 different types of surface? You need 24 different samples. And you'll want some variation in those samples too, so maybe 75+ samples are needed. Alternatively, you take a basic sample such as shoe-on-wood-floor and process it to produce 5 alternative types of footwear, so you get a lot more variety (I'm sure you've heard the 3 different samples that make up most walk cycles!) with a lot less memory consumption. That's the difference procedural audio, i.e. (partial) synthesis, makes, and it's nice to hear someone talking about it in a next-gen game.
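If you want the memory argument in numbers, a quick back-of-envelope (the sample size is my own guess of roughly half a second of mono 16-bit 44.1kHz audio):

Code:
# ~0.5s of mono 16-bit 44.1kHz audio per footstep sample (a guess)
SAMPLE_BYTES = 22050 * 2

shoes, surfaces, variations = 6, 4, 3
brute_force = shoes * surfaces * variations  # 72 recorded samples
procedural = surfaces * variations           # 12 base samples; shoes derived at runtime

print(brute_force * SAMPLE_BYTES // 1024, "KB")  # ~3100 KB
print(procedural * SAMPLE_BYTES // 1024, "KB")   # ~516 KB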
 
What part are you referring to? I don't see anything particularly innovative there...

Maybe they haven't done stuff like that before, but things like environmental 5.1 effects and filtering depending on ground/floor/distance have been done on last-generation consoles.

No, that's not what I am talking about. If I understood Kojima correctly, what he is saying is that every part of the environment affects sound according to the density of objects, the material they are made of, their shape, etc. For example, take a room with a wooden table, concrete walls, a small square carpet under the table, and a marble floor. Imagine a bottle on the table. A character accidentally bumps into the table.

Think of the sound the glass bottle would make in real life. The bottle rolls on the wooden table, and you can tell from the sound that it's a glass bottle rolling on a wooden table. Then it falls onto the square carpet, makes the right sound, but doesn't break thanks to the carpet, and continues to roll.

It rolls until there is no carpet under it, so now it's on marble. The sound is different from both the carpet and the table. The bottle continues to roll until it hits the concrete wall.

What you described in previous gens were predetermined effects in certain areas, generated in the same fashion as the predetermined damage we've had for so many years.

Kojima wants to create an environment that has the ability to understand the objects in it (shape, density, hollowness, weight, material, etc.) and calculate the goings-on in real time, whether that's environmental destruction or sound.
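Something like this, in very rough code form (all the names and numbers here are mine, purely to illustrate the idea): every object carries material properties, and the contact sound is derived from the pair of materials instead of looked up from a canned glass-on-marble sample.

Code:
from dataclasses import dataclass
import random

@dataclass
class Material:
    hardness: float   # 0 = soft (carpet), 1 = hard (marble)
    resonance: float  # how long contact sounds ring

MATERIALS = {
    "glass":  Material(hardness=0.9, resonance=0.8),
    "wood":   Material(hardness=0.6, resonance=0.5),
    "carpet": Material(hardness=0.1, resonance=0.05),
    "marble": Material(hardness=1.0, resonance=0.7),
}

def rolling_sound_params(obj, surface, speed):
    """Derive contact-sound parameters from the two materials involved,
    instead of looking up a canned per-pair sample."""
    a, b = MATERIALS[obj], MATERIALS[surface]
    return {
        "volume": speed * a.hardness * b.hardness,  # glass on carpet ~ near silent
        "decay": (a.resonance + b.resonance) / 2,
        "pitch_jitter": random.uniform(0.97, 1.03),  # no two rolls identical
    }

print(rolling_sound_params("glass", "carpet", 0.5))
print(rolling_sound_params("glass", "marble", 0.5))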
 
Reading the description, I see nothing that hasn't already been done in games. Taking sound samples for objects, and producing the sound for different materials coming into contact in real time? That has been done - it's a much better solution than storing tire-grass, tire-granite, tire-metal, tire-whatever, etc. Taking into account how the environment will affect those sounds has also been done.

Overall, it sounds like what's already been done, "taken to the next level" (if that) as it were. Evolutionary, rather than revolutionary/innovative.
 
Reading the description, I see nothing that hasn't already been done in games. Taking sound samples for objects, and producing the sound for different materials coming into contact in real time? That has been done - it's a much better solution than storing tire-grass, tire-granite, tire-metal, tire-whatever, etc. Taking into account how the environment will affect those sounds has also been done.
If it has been done, it's very uncommon. And I'd like to see some real evidence that it has been done, too! Your examples above I have only ever heard of as samples, and it's things like that which give us feeble, unrealistic engine sounds.
 
No, that's not what I am talking about. If I understood Kojima correctly, what he is saying is that every part of the environment affects sound according to the density of objects, the material they are made of, their shape, etc. For example, take a room with a wooden table, concrete walls, a small square carpet under the table, and a marble floor. Imagine a bottle on the table. A character accidentally bumps into the table.

Think of the sound the glass bottle would make in real life. The bottle rolls on the wooden table, and you can tell from the sound that it's a glass bottle rolling on a wooden table. Then it falls onto the square carpet, makes the right sound, but doesn't break thanks to the carpet, and continues to roll.

It rolls until there is no carpet under it, so now it's on marble. The sound is different from both the carpet and the table. The bottle continues to roll until it hits the concrete wall.

What you described in previous gens were predetermined effects in certain areas, generated in the same fashion as the predetermined damage we've had for so many years.

Kojima wants to create an environment that has the ability to understand the objects in it (shape, density, hollowness, weight, material, etc.) and calculate the goings-on in real time, whether that's environmental destruction or sound.

I think this scenario of sound use in a game would be better handled using some well-formulated sound algorithms and the computational power of the Cell, versus needing an additional 25GB of space to accommodate massive amounts of sound samples.
 
I think this scenario of sound use in a game would be better handled using some well-formulated sound algorithms and the computational power of the Cell, versus needing an additional 25GB of space to accommodate massive amounts of sound samples.

Yeah. Do you think this is possibly how he is aiming to achieve it?
 
How would an algorithm be able to replicate the sound of glass rolling along marble or carpet, though? Working from a base sound effect, you still wouldn't be able to get a believable sound in most cases.

Professional sound engineers trying to create believable renditions of existing musical instruments with a synthesizer have struggled for years, and the difference is still noticeable. In fact the best solution I've heard of is to sample the actual instrument for each and every note it can play, and then link each of those samples to a corresponding key.

It's one thing to put reverb or delay onto a sound sample to get different environmental effects, or to change the EQ to make it sound like something is going on in another room (something which I don't recall being done in games at all, but bloody well should be); it's quite another to conjure up dozens of accurate interaction sounds from just one or two samples.

I would have thought it's a lot easier to go and record a bottle rolling along a wooden table and a bit of carpet than it is to create a piece of engine code that can replicate the sound of rolling on carpet from the rolling-on-wood sample, or synthesize the carpet sample completely. I may have misunderstood you, mind.
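As an aside, that EQ trick really is cheap: a simple one-pole lowpass filter gets you most of the "muffled through a wall" effect. A toy sketch (the cutoff values are made up):

Code:
import numpy as np

def occlude(samples, cutoff=0.05):
    """One-pole lowpass: muffles a sound as if heard through a wall.
    cutoff is in (0, 1]; smaller = thicker wall. Toy values."""
    out = np.empty_like(samples)
    acc = 0.0
    for i, x in enumerate(samples):
        acc += cutoff * (x - acc)  # simple RC-style smoothing
        out[i] = acc
    return out

gunshot = np.random.randn(44100).astype(np.float32)
next_room = occlude(gunshot, cutoff=0.02)  # dull thump instead of a crack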
 
Professional sound engineers trying to create believable renditions of existing musical instruments with a synthesizer have struggled for years, and the difference is still noticeable. In fact the best solution I've heard of is to sample the actual instrument for each and every note it can play, and then link each of those samples to a corresponding key.
Audio is pressure changes working on physical principles, and can thus be modelled mathematically. There are synthesizers that use a process called Physical Wave Modelling (or variations thereof) which do just this, and they can produce very versatile virtual instruments... but... firstly the mathematical models aren't fully there yet, and secondly neither is the processing power. At the moment, or at least as of the last I heard, which was a while back, you had some quite good flutes and percussion. Pianos and strings, with their complex harmonics, weren't as successful. The actual algorithms seem ideally suited to SPEs, so Cell could be a great platform for research into wave modelling.
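To give an idea of how little code the simplest of these takes: Karplus-Strong (a classic textbook algorithm, my pick of an example) models a plucked string as a noise burst fed through a delay line and an averaging filter.

Code:
import numpy as np

def pluck_string(freq=220.0, rate=44100, seconds=1.0, damping=0.996):
    """Karplus-Strong: a delay line plus an averaging filter models a
    vibrating string. One of the simplest physical-modelling synths."""
    period = int(rate / freq)
    buf = np.random.uniform(-1, 1, period)  # the 'pluck' is a noise burst
    out = np.empty(int(rate * seconds))
    for i in range(len(out)):
        out[i] = buf[i % period]
        nxt = buf[(i + 1) % period]
        buf[i % period] = damping * 0.5 * (buf[i % period] + nxt)
    return out

note = pluck_string(440.0)  # a passable plucked-string A4 from pure maths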

As for engineers using samples...
I would have thought it's a lot easier to go and record a bottle rolling along a wooden table and a bit of carpet than it is to create a piece of engine code that can replicate the sound of rolling on carpet from the rolling-on-wood sample, or synthesize the carpet sample completely. I may have misunderstood you, mind.
...yes, it would be! However, they're not constrained by 512MB of total system RAM ;) Acoustic modelling obviously isn't as convincing as real sounds yet, so massive libraries of samples will be the choice of the professional, as are sample-based synthesizers. However, software that's a mix of the two looks to be the ideal current solution until physical modelling is perfected, e.g. take a couple of violin samples and morph between them with other acoustic modelling etc. to create an algorithmic hybrid sample.

That is what I'd expect of a computer game. You can't fit a sample into memory for every situation. To date, that means games recycle the same samples over and over again. Hopefully in the not-too-distant future, most samples will go through a synthesis step to 'mix them up a bit.' Footsteps could go through subtle pitch and EQ changes per step, blending a few base samples to create footsteps that don't sound like 3 samples being played in random order. The sound of a bottle falling off a wall won't be identical for all 200 bottles. You won't have a guy in sneakers walking with the same footsteps as a military grunt in combat boots. Your car engine sound will go through some processing to make it sound beefier, or to let you hear what the sports muffler does, etc.
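A sketch of that 'mix them up a bit' step in toy code (all the parameter ranges are simply invented by me):

Code:
import random
import numpy as np

def vary_footstep(base_samples):
    """Blend two of a handful of base samples, then randomise pitch
    and level a little, so no two steps are byte-identical."""
    a, b = random.sample(base_samples, 2)
    n = min(len(a), len(b))
    mix = 0.7 * a[:n] + 0.3 * b[:n]
    pitch = random.uniform(0.92, 1.08)             # about +/- 1 semitone
    idx = np.arange(0, n - 1, pitch)
    resampled = np.interp(idx, np.arange(n), mix)  # crude resample = pitch shift
    return resampled * random.uniform(0.85, 1.0)   # slight level variation

steps = [np.random.randn(4000) for _ in range(3)]  # stand-ins for 3 recordings
next_step = vary_footstep(steps)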

There's lots of scope to apply maths to audio, rather than just playing a limited library of audio clips with a bit of room-wide reverb chucked in.
 
How would an algorithm be able to replicate the sound of glass rolling along marble or carpet, though? Working from a base sound effect, you still wouldn't be able to get a believable sound in most cases.

Professional sound engineers trying to create believable renditions of existing musical instruments with a synthesizer have struggled for years, and the difference is still noticeable. In fact the best solution I've heard of is to sample the actual instrument for each and every note it can play, and then link each of those samples to a corresponding key.

It's one thing to put reverb or delay onto a sound sample to get different environmental effects, or to change the EQ to make it sound like something is going on in another room (something which I don't recall being done in games at all, but bloody well should be); it's quite another to conjure up dozens of accurate interaction sounds from just one or two samples.

I would have thought it's a lot easier to go and record a bottle rolling along a wooden table and a bit of carpet than it is to create a piece of engine code that can replicate the sound of rolling on carpet from the rolling-on-wood sample, or synthesize the carpet sample completely. I may have misunderstood you, mind.

IMO, your brain doesn't care that much when it is preoccupied with dealing with all the data relayed from your eyes and to your hands.

Listening to music and playing games are fundamentally two different experiences in terms of sensory stimulation. When listening to music, the ears are the primary source of stimulation, you are more apt to pick up subtle differences, and the act itself is generally passive. When gaming, your eyes act as the primary source of stimulation, and sound differences become less relevant.

Sound used as a device in games to determine direction and distance would be very useful. But the difference between a glass rolling on marble and one rolling on cement will be dismissed by your mind, as it has very little to do with the overall objective of the typical gaming experience.
 
IMO, your brain doesn't care that much when it is preoccupied with dealing with all the data relayed from your eyes and to your hands.

Listening to music and playing games are fundamentally two different experiences in terms of sensory stimulation. When listening to music, the ears are the primary source of stimulation, you are more apt to pick up subtle differences, and the act itself is generally passive. When gaming, your eyes act as the primary source of stimulation, and sound differences become less relevant.

Sound used as a device in games to determine direction and distance would be very useful. But the difference between a glass rolling on marble and one rolling on cement will be dismissed by your mind, as it has very little to do with the overall objective of the typical gaming experience.

Agreed - for most games you are 100% correct. However, in slower-paced games sound can be a crucial element, especially where as a player you might not have the ability to "see" things that are important to the mission/level. In a game like this, where it is less shoot/bang/blow-em-up and more patience/stealth, I think investing in sound could be a worthwhile priority if done right. Uncompressed audio, however, is a bit of a waste, and designing the game so that those without surround sound could fully enjoy it while also designing it around "sound" would be a difficult task. Could this be the first "surround sound req'd" game? :cool:
 
You never know :p

Sound has been a much-overlooked medium of information for the most part. As ChefO says, in slower-paced games such as stealth games, where you often hear your enemy before you see him, accurate depiction of where something is happening is pretty crucial. Similarly, if you knock something over while moving around, a loud enough noise will alert a guard.
As a result you are often more inclined to keep your ears peeled (as it were), and as such you become more aware of sound quality. While I agree that the goal of perfect sound representation is not as essential as conveying the information in the first place, approaching that goal could still lend more to the immersion of the gaming experience than spending more time taking the graphics up a small notch.

I appreciate the RAM limitations, but I would have thought there would be some cunning way around them - if a possible source of a certain noise is out of earshot, it would not need loading until it was, surely? I have no technical knowledge of programming, so I don't have a clue as to what needs loading and when.

Is there (or would there be any point in) such a thing as a sound equivalent of LOD? By this I mean that any noise source that is only just in earshot (with no equivalent source closer by) could theoretically have a lower-quality sample loaded, which could be replaced by a higher-quality version as it comes closer. Is that complete crap due to the complexity of making an engine know how to do such a thing?

Similarly, since I've popped the lid open on my box of madness, I imagine there is a small section of the RAM that would be used as temporary memory for sound to be loaded from the disc? If you can't see a sound source, couldn't it be loaded off the disc as it makes a noise, rather than being preloaded in case that noise event happens? It doesn't matter so much if you hear the sound a split second after the event occurred (unless it's gunfire or a guard alert noise), since if you didn't see it, how would you know?
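Something like this is roughly what I'm imagining, if that helps (treat it as pseudo-code from a non-programmer; every name and threshold here is invented):

Code:
def choose_sample_lod(distance, lods):
    """Pick a sample quality tier by listener distance, the same way
    meshes swap to lower-poly versions. Thresholds are made up."""
    if distance < 10.0:
        return lods["high"]  # full-rate sample, already in RAM
    elif distance < 40.0:
        return lods["low"]   # downsampled version, a fraction of the memory
    return None              # out of earshot: leave it on the disc entirely

lods = {"high": "dog_bark_44k.wav", "low": "dog_bark_11k.wav"}
print(choose_sample_lod(25.0, lods))  # -> dog_bark_11k.wav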

Educate me! :oops:
 
Yeah. Do you think this is possibly how he is aiming to achieve it?

Not possible right now. That will take a couple of extra years of research and processing power.

Procedural sounds that sound like the real thing are not even close.

Edit - oh, well, like the guys above me explained.
 
IMO, your brain doesn't care that much when it is preoccupied with dealing with all the data relayed from your eyes and to your hands.

Actually, that is completely untrue. Your brain pays much more attention when audio and visual stimulation coincide.

Also, beyond a certain sound quality, your reactions to sounds start becoming very primitive. For instance, if a glass falls to pieces you will always react to that in real life. But if you hear a recording that picks up the high-pitched tones well enough (say 24-bit at 96kHz), then at the right volume you will have exactly the same reaction. Certain sounds will make your heart beat faster, and you probably know about women starting to leak milk when they have young babies and hear a baby cry.

One of the most important advances of next-gen gaming is that we now have enough power to do real-time processing of sounds, and enough memory to keep the quality of the sounds high at the same time. If you think about real-time processing, then we aren't just talking about recreating the echoes of a certain environment, but also about spatial positioning in surround sound and, in the case above, about the interaction between individual objects.

You should probably think of this as similar to the physics that make a piece of wood break realistically. Depending on how it breaks, and where the pieces fall, you'll hear different sounds. That kind of calculation will then result in different sounds being played.
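As a very rough sketch of how that coupling could work (the sample names and thresholds here are invented, purely for illustration), the physics engine's contact callback would pick the sound and its volume:

Code:
def impact_sound(material_a, material_b, impulse):
    """Map a physics collision to a sound choice and loudness.
    Sample names and thresholds are invented for illustration."""
    pair = tuple(sorted((material_a, material_b)))
    bank = {
        ("stone", "wood"): "wood_knock",
        ("wood", "wood"): "wood_clatter",
    }
    sample = bank.get(pair, "generic_thud")
    if impulse > 50.0 and "wood" in pair:
        sample = "wood_crack"  # hit hard enough to break the plank
    return sample, min(1.0, impulse / 100.0)

# e.g. a physics engine would call this from its contact callback:
print(impact_sound("wood", "stone", impulse=72.0))  # ('wood_crack', 0.72)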

These sounds, by the way, will for quite a while still be created from a combination of samples, synthesis, and processing, because that's exactly how they are done now everywhere else (movies, music).

This generation there will also be enough space to store sufficient amounts of different high-quality samples (certainly on BluRay ;) ).

Sound is much more important than many people think. My girlfriend, if she watches a horror movie, hates it when we have the surround on, because it makes everything much scarier. She also hates it when there's a phone in the movie that's the same as ours, or when a door opens in the rear speakers and it makes you think it's our own door. Very cute: once, when there were kittens making sounds on the TV, our female cat went straight to the center speaker to try and take care of them.
 
Not possible right now. That will take a couple of extra years of research and processing power.

Procedural sounds that sound like the real thing are not even close.

Edit - oh, well, like the guys above me explained.

What about a good approximation?

Oh well. Perhaps if that doesn't work they will add tons of real (recorded) sound samples for each object and put them on the disc.
 
Actually, that is completely untrue. Your brain pays much more attention when audio and visual stimulation coincide.

Also, beyond a certain sound quality, your reactions to sounds start becoming very primitive. For instance, if a glass falls to pieces you will always react to that in real life. But if you hear a recording that picks up the high-pitched tones well enough (say 24-bit at 96kHz), then at the right volume you will have exactly the same reaction. Certain sounds will make your heart beat faster, and you probably know about women starting to leak milk when they have young babies and hear a baby cry.

One of the most important advances of next-gen gaming is that we now have enough power to do real-time processing of sounds, and enough memory to keep the quality of the sounds high at the same time. If you think about real-time processing, then we aren't just talking about recreating the echoes of a certain environment, but also about spatial positioning in surround sound and, in the case above, about the interaction between individual objects.

You should probably think of this as similar to the physics that make a piece of wood break realistically. Depending on how it breaks, and where the pieces fall, you'll hear different sounds. That kind of calculation will then result in different sounds being played.

These sounds, by the way, will for quite a while still be created from a combination of samples, synthesis, and processing, because that's exactly how they are done now everywhere else (movies, music).

This generation there will also be enough space to store sufficient amounts of different high-quality samples (certainly on BluRay ;) ).

Sound is much more important than many people think. My girlfriend, if she watches a horror movie, hates it when we have the surround on, because it makes everything much scarier. She also hates it when there's a phone in the movie that's the same as ours, or when a door opens in the rear speakers and it makes you think it's our own door. Very cute: once, when there were kittens making sounds on the TV, our female cat went straight to the center speaker to try and take care of them.

I am not disputing that sound is important. But "real" versus "realistic" is not a big enough issue in video games to need an extra 25GB of space for sound samples.
 
Probably not enough to necessitate a 50GB disc - certainly not at this point, at least. I was under the impression that Kojima was misquoted, though, and that he is most likely using a 25GB disc.
 
There's a lot of tech about these days to simulate real sounds. By the sounds of it (pun intended), existing consoles are using the same tech the music industry was using 20 years ago.

These days, highly realistic sounds can be created using virtual synthesis, physical modelling, and other similar techniques. Another one (not used much) is re-synthesis: this recreates samples in real time, and you should be able to tweak the parameters to make the sounds slightly different each time.

You can also physically simulate a room using audio ray tracing if you want the echoes to be correct.

A lot of this stuff is based on FFTs or convolution, both things Cell is very, very good at.
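Convolution reverb in particular is almost nothing but big FFTs. A minimal sketch (the "impulse response" here is just decaying noise standing in for a real measured one):

Code:
import numpy as np

def convolution_reverb(dry, impulse_response):
    """Convolve a dry sound with a room's impulse response via FFT.
    This is exactly the kind of workload that suits Cell's SPEs."""
    n = len(dry) + len(impulse_response) - 1
    size = 1 << (n - 1).bit_length()  # next power of two
    wet = np.fft.irfft(np.fft.rfft(dry, size) * np.fft.rfft(impulse_response, size))
    return wet[:n]

dry = np.random.randn(44100)  # stand-in for a recorded clap
ir = np.exp(-np.linspace(0, 8, 22050)) * np.random.randn(22050)  # fake cave IR
wet = convolution_reverb(dry, ir)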
 
I am not disputing that sound is important. But "real" versus "realistic" is not a big enough issue in video games to need an extra 25GB of space for sound samples.

I've just pointed out to you that most people are not aware of when a sound passes their emotional response threshold until it passes it. In 1993, I hooked up my first affordable Dolby surround system with a friend's laserdisc player, and played a super high-quality, THX-certified version of Terminator 2. There is this scene where you see a futuristic battlefield full of metal and skeletons. The camera settles on a skull, you just hear the wind, and then suddenly a Terminator crushes the skull under its foot. That sound is of such incredible quality, and is so well timed and combined with the visuals, that you feel it in every fiber of your body (no, the speakers only had 60W peak output ;) ). We tested it on a few friends we asked over, and one of them literally jumped in his seat - after six viewings I was watching the guys rather than the screen. :D

It is great. I used to record movies on tape and then listen to them back on my walkman when I was a kid. You should try it sometime; it's very cool.

Anyway, I'm rambling. What is important is that on the PS2, the space available both on the DVD and in the PS2's memory was very insufficient for decent-quality samples. This is going to make a big difference. Most games now, also on the PSP, have decent in-game music (because they can stream it), but they are still on a budget when it comes to samples, and the machines don't have the capacity to do a lot of real-time surround processing. The Xbox did a little better in that area, because it at least had limited hardware support for it.

There are tonnes of examples of limitations right now. The in-game music in Final Fantasy still uses relatively cheap synthesis for a lot of its background tracks, just because they need to save space, memory, and processing power. Also, not all text can be spoken in Final Fantasy XII, simply because there's no space for it (though maybe the budget for the voice cast was limited too).

Racing games with lots of cars, like GT4 (or to a lesser extent PGR and Forza), still have to rely for the most part (or nearly entirely, in GT4's case) on synthesized car sounds, because they don't have any memory left for decent samples.

For many things right now, sound is harder to model than 3D graphics, and it is, and will be, taking up relatively more space as time goes on.
 