Predict: The Next Generation Console Tech

Status
Not open for further replies.
Regarding the memory, it was said that 12GB was for the dev kit, with 8GB being the final amount.

That certainly makes for interesting DRAM configuration possibilities (GDDR5+DDR3 or all DDR3) when you think about it, as they would need to maintain bus size between the two configurations (I think), i.e. we could try to narrow it down.

(GDDR5 mixed in would make it rather bizarre or highly unlikely)
 
That certainly makes for interesting DRAM configuration possibilities (GDDR5+DDR3/4 or all DDR3/4) when you think about it, as they would need to maintain bus size between the two configurations (I think), i.e. we could try to narrow it down.

(GDDR5 mixed in would make it rather bizarre or highly unlikely)

I know I've speculated on both DDRx/GDDR5 split (like a PC) and all DDRx w/eDRAM. And with your other post reminding me of those costs, it makes me wonder if it's actually better/cheaper to do the latter as the former with a 6/2 split would be approx. $80-$95 going strictly by those numbers.
 
Well, when I mention the configuration possibilities, I mean the actual number of chips, because there are implications for the bus size, as the latter needs to be identical between retail and devkit for obvious reasons, i.e. the processors.


Some not-enough-sleep stuff:
e.g. GDDR5 DRAMs have x16/x32 I/O configs, DDR3 can have x4/x8/x16, and it seems that 8Gbit DRAMs are only x4/x8 at the moment (Micron).


So for an all-DDR3 config (for example):

8GB = 16x4Gbit chips configured as x16 -> 16x16 = 256-bit bus, 16 chips
12GB = 16x4Gbit, x8 config + 8x4Gbit, x16 -> 16x8 + 8x16 = 256-bit bus, 24 chips

Another config might be:

8GB = 16x4Gbit, x8 -> 128-bit bus, 16 chips
12GB = 8x4Gbit, x8 + 8x8Gbit, x8 -> 128-bit bus, 16 chips

Anyways, whether or not they're likely for manufacturing/assembly purposes is the next thing. :p

Things get really messy with GDDR5 mixed in.
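Those chip-count/bus-width pairings are easy to sanity-check with a few lines of arithmetic (a sketch; the densities and I/O widths are just the ones from the configs above):

```python
# Sanity-check DRAM configs: each group is (chips, density in Gbit, I/O width in bits).
def summarize(groups):
    capacity_gb = sum(n * gbit for n, gbit, _ in groups) / 8  # Gbit -> GB
    bus_bits = sum(n * width for n, _, width in groups)
    chips = sum(n for n, _, _ in groups)
    return capacity_gb, bus_bits, chips

print(summarize([(16, 4, 16)]))              # 8GB retail:  (8.0, 256, 16)
print(summarize([(16, 4, 8), (8, 4, 16)]))   # 12GB devkit: (12.0, 256, 24)
print(summarize([(16, 4, 8)]))               # 8GB alt:     (8.0, 128, 16)
print(summarize([(8, 4, 8), (8, 8, 8)]))     # 12GB alt:    (12.0, 128, 16)
```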
 
I just can't believe they would go with separate pools after all the praise the 360's unified pool got, even though that would limit the possible max amount of memory to something quite a bit lower.
 
Do you have a link to these posts & are they recent?

& the way I read the older spec sheet, the specs weren't 10X the PS3; it had a CPU that was 10X the PPU in the PS3 & a GPU 10X the RSX, leaving out the SPEs, but it said that the final specs would be 10X the PS3, as if there is going to be something more to make up for the SPEs.

http://www.elotrolado.net/hilo_hilo...consola-de-nintendo_1603838_s8560#p1729055019

First post:

"I would only say right now I'm (shock icon)"
 
From a developer standpoint, the most desirable configuration is a single unified pool of memory (and tons of it). Nothing beats that.
If there are special high-performance "scratch pad" memories, those are desirable as well, but it's best if they work as automatic caches.
 
Well, when I mention the configuration possibilities, I mean the actual number of chips, because there are implications for the bus size, as the latter needs to be identical between retail and devkit for obvious reasons, i.e. the processors.

e.g. GDDR5 DRAMs have x16/x32 I/O configs, DDR3 can have x4/x8/x16, and it seems that 8Gbit DRAMs are only x4/x8 at the moment (Micron).

Right. I was just looking at the cost aspect.
 
I know I've speculated on both DDRx/GDDR5 split (like a PC) and all DDRx w/eDRAM. And with your other post reminding me of those costs, it makes me wonder if it's actually better/cheaper to do the latter as the former with a 6/2 split would be approx. $80-$95 going strictly by those numbers.

If you are going with a large main-memory pool of either g-spec DDR3 or DDR4, it makes little sense to go with a GDDR5 block for the frame buffer. It makes more sense to go with a moderately sized eDRAM (32-64 MB) or an on-die DRAM with a wide I/O interface on the order of 256-512 MB. Both g-spec DDR3 and DDR4 provide enough bandwidth for bulk texturing, which means you really only need a block of high-speed memory for front/back buffers and intermediate buffers/textures.
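To put rough numbers on why a 32-64 MB fast-memory block can be enough: here is a back-of-envelope sizing sketch for 1080p render targets (my own illustrative figures, not from the post above):

```python
# Back-of-envelope render-target sizing at 1080p (illustrative numbers only).
def buffer_mb(width, height, bytes_per_pixel):
    return width * height * bytes_per_pixel / (1024 * 1024)

color = buffer_mb(1920, 1080, 4)   # 32-bit color target, ~7.9 MB
depth = buffer_mb(1920, 1080, 4)   # 32-bit depth/stencil, ~7.9 MB
fp16  = buffer_mb(1920, 1080, 8)   # FP16 RGBA intermediate target, ~15.8 MB

# front + back + depth + two intermediate FP16 targets
total = 2 * color + depth + 2 * fp16
print(round(total, 1))  # ~55.4 MB -> fits inside a 64 MB eDRAM budget
```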
 
Well, when I mention the configuration possibilities, I mean the actual number of chips, because there are implications for the bus size, as the latter needs to be identical between retail and devkit for obvious reasons, i.e. the processors.

e.g. GDDR5 DRAMs have x16/x32 I/O configs, DDR3 can have x4/x8/x16, and it seems that 8Gbit DRAMs are only x4/x8 at the moment (Micron).


So for an all-DDR3 config (for example):

8GB = 16x4Gbit chips configured as x16 -> 16x16 = 256-bit bus, 16 chips
12GB = 16x4Gbit, x8 config + 8x4Gbit, x16 -> 16x8 + 8x16 = 256-bit bus, 24 chips

Another config might be:

8GB = 16x4Gbit, x8 -> 128-bit bus, 16 chips
12GB = 8x4Gbit, x8 + 8x8Gbit, x8 -> 128-bit bus, 16 chips

Anyways, whether or not they're likely for manufacturing/assembly purposes is the next thing. :p

Things get really messy with GDDR5 mixed in.

Or, given that dev kits at this stage tend to be little more than PCs, it is likely they just put in 2x4GB DIMMs and 2x2GB DIMMs. At one point the Xbox 360 dev kits were just Mac Pros with ATI graphics cards.
 
True. :) No idea what they'll decide for devkits this time. My concern was mainly if they wanted to repeat what they did for the final 360 dev kits vs retail (re-use same mobo/chassis).
 
If you are going with a large main-memory pool of either g-spec DDR3 or DDR4, it makes little sense to go with a GDDR5 block for the frame buffer. It makes more sense to go with a moderately sized eDRAM (32-64 MB) or an on-die DRAM with a wide I/O interface on the order of 256-512 MB. Both g-spec DDR3 and DDR4 provide enough bandwidth for bulk texturing, which means you really only need a block of high-speed memory for front/back buffers and intermediate buffers/textures.

Cool. One of the reasons I started posting here was to gain a better understanding of things like this.
 
Just a question: supposing that next gen not only improves graphics, but also introduces complex physics.

Doesn't game physics benefit from lots of memory? Independent of the access rate of the memory?

Because you need to, say, save the current and former state of your physical system, plus the mathematical operators (e.g. matrices) used to perform the physics computations?
 
Just a question: supposing that next gen not only improves graphics, but also introduces complex physics.

Doesn't game physics benefit from lots of memory? Independent of the access rate of the memory?

Because you need to, say, save the current and former state of your physical system, plus the mathematical operators (e.g. matrices) used to perform the physics computations?

Even as a layman, that doesn't make a lot of sense to me.

Each frame has a critical path, that is, all the stuff that must be completed in a time-sensitive manner and that is a dependency for the next bit of the critical path. *Real physics* falls into that bucket. Therefore logic dictates that you need to be able to access this physics data quickly, otherwise you're going to be wasting valuable compute time waiting for memory access.

If your data set can stay/fit in a cache then you're set; if you're doing particles and all that other crap that doesn't have any other time-critical dependencies then you're probably going to be fine as well.


But I would think access latency would be very important. I wonder how good predictors are at complex physics... :?:
 
itsmydamnation said:
Even as a layman, that doesn't make a lot of sense to me.

Each frame has a critical path, that is, all the stuff that must be completed in a time-sensitive manner and that is a dependency for the next bit of the critical path. *Real physics* falls into that bucket. Therefore logic dictates that you need to be able to access this physics data quickly, otherwise you're going to be wasting valuable compute time waiting for memory access.

If your data set can stay/fit in a cache then you're set; if you're doing particles and all that other crap that doesn't have any other time-critical dependencies then you're probably going to be fine as well.

But I would think access latency would be very important. I wonder how good predictors are at complex physics... :?:

E.g. for integrating an ordinary differential equation in time (typically needed in physics), the efficient methods need the whole state of at least the former time step... more efficient methods may even need intermediate states...

I am just asking; I am not familiar with how advanced the mathematical methods used for game physics are this gen. But I certainly hope and expect that game physics evolves next gen, which obviously should lead to more complex math/physics models and a higher demand on the amount of memory.
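As a concrete illustration of the "need the former state" point (a minimal sketch, not any particular engine's code), a position-Verlet integrator keeps the previous position resident in memory alongside the current one for every body:

```python
# Position Verlet: x_next = 2*x - x_prev + a(x)*dt^2.
# Both the current and the former state must stay in memory the whole time.
def verlet_step(x, x_prev, accel, dt):
    x_next = 2 * x - x_prev + accel(x) * dt ** 2
    return x_next, x  # new (current, previous) pair

gravity = lambda x: -9.81          # constant downward acceleration
x, x_prev, dt = 100.0, 100.0, 0.1  # start at rest at height 100
for _ in range(10):
    x, x_prev = verlet_step(x, x_prev, gravity, dt)
print(x)  # the body has fallen a few meters after 1 second
```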
 
I am just asking; I am not familiar with how advanced the mathematical methods used for game physics are this gen. But I certainly hope and expect that game physics evolves next gen, which obviously should lead to more complex math/physics models and a higher demand on the amount of memory.

The prospect of a next-gen LBP makes me drool, that's for sure. If they can get one out at launch, that'd be quite the coup.
 
Doesn't game physics benefit from lots of memory? Independent of the access rate of the memory?

Because you need to, say, save the current and former state of your physical system, plus the mathematical operators (e.g. matrices) used to perform the physics computations?
The larger the number of simultaneously active objects you have, the more memory accesses you need to update and simulate them. Memory accessing requires bandwidth. A faster memory subsystem (more bandwidth, better caches and predictors) allows games to have more physics-based dynamic interaction active at the same time.

More memory allows you to have more static objects (or sleeping physics objects) and bigger environments with stored physics state. Static/sleeping objects are not accessed every frame, so they do not always consume bandwidth. You basically only need to access a static object's physics state (collision mesh for example) when some other object hits it (or is near enough to cause a potential hit).

For example, Skyrim's world is mostly static (most objects are static), but the game world is large and it remembers its state. For a scenario like this, a huge amount of slow memory might be preferable. But for an FPS game with lots of active physics (bridges falling apart around you, buildings collapsing, hundreds of shells and grenades flying around), more memory bandwidth is of course better for physics than more memory. It allows the game to have more dynamic objects flying around (without slowing the frame rate down).
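A toy version of that active/sleeping split (hypothetical names, just to make the bandwidth argument concrete): sleeping objects keep their state resident in memory but are skipped by the per-frame update, so only active objects cost bandwidth each frame.

```python
# Toy physics world: 10,000 bodies, but only the moving ones are touched per frame.
class Body:
    def __init__(self, pos, vel=0.0):
        self.pos, self.vel = pos, vel
        self.sleeping = abs(vel) < 1e-3  # at-rest bodies start asleep

def step(bodies, dt):
    touched = 0
    for b in bodies:
        if b.sleeping:
            continue  # state stays resident (memory cost) but no update (no bandwidth cost)
        b.pos += b.vel * dt
        touched += 1
    return touched

world = [Body(float(i), vel=1.0 if i % 100 == 0 else 0.0) for i in range(10_000)]
print(step(world, 1 / 60))  # 100 -> only 1% of the world consumed update bandwidth
```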
 
Just a quick note to say that when it comes to memory, you can now consider our systems to be the same as old mainframes using tapes.
Although everything is faster, the difference in speed between RAM and CPU cache makes algorithms designed for those systems relevant again.
 
bgassassin: I got a message for you to send back to those giving you some vague platform overviews. Maybe this will help get the ball rolling and some much needed clarification ;)

Regarding the 1 TFLOPS number for Durango, could they specify whether that is single or double precision? I tried to explain to a messenger that there isn't a huge need for DP on a console, but according to him MS is quite happy with their DP performance. AMD is already packing this sort of performance into 365mm^2 GPUs in 2012, so while such architectural choices seem necessary, depending on what was cut from the GPU to put it into a console (e.g. ROPs from 32 down to 16 or even 8 at 1GHz, a reduction in TMUs, etc.), I guess it is possible, but I don't trust him. But he swears it is true.

Oh, and something about insisting there is a new memory type with performance similar to eDRAM, but he didn't know the tech stuff (he really is a newb); he was told it is a newer technology not on the market yet, with some manufacturing and reliability issues. What came to mind was this, but the messenger didn't know.

So the big questions would be: what are the SP and DP Durango GPU metrics, and besides the 8GB of memory, what other memory architectures are in place on Durango?

Wait... So you also have a messenger?

1.1-1.5 sounded weak, but manageable. Exactly 1 TFLOPS sounds like a joke (but then again, 1 TFLOP DP sounds like one too).
 
Just a quick note to say that when it comes to memory, you can now consider our systems to be the same as old mainframes using tapes.
Although everything is faster, the difference in speed between RAM and CPU cache makes algorithms designed for those systems relevant again.
Couldn't have said it better myself :)

Here's a good classic white paper (made by Sony R&D in 2009):
http://harmful.cat-v.org/software/O...ls_of_Object_Oriented_Programming_GCAP_09.pdf

Slides 17 and 18 are especially notable. RAM latency in cycles is now 400x higher than in 1980 (a comparison between the PS3 and probably the first x86 PCs). The same is true for memory bandwidth relative to CPU ALU performance, and the gap is widening all the time. Memory performance is now the most important thing when you are designing efficient algorithms, both for CPU and GPU (both are memory-starved now, and both will be even more so in the future).
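The concrete layout change that talk argues for is array-of-structs vs struct-of-arrays; here is a minimal sketch of the idea (my own example, not taken from the slides):

```python
# AoS vs SoA. Updating one field of an AoS streams every object's unrelated
# fields through the cache; SoA keeps each field contiguous so cache lines
# are fully used by the hot loop.
from array import array

# AoS: list of heterogeneous objects (pos, vel, name interleaved in memory)
aos = [{"pos": float(i), "vel": 1.0, "name": f"obj{i}"} for i in range(4)]

# SoA: one tightly packed array per field
pos = array("f", (float(i) for i in range(4)))
vel = array("f", (1.0,) * 4)

dt = 0.5
for i in range(len(pos)):  # hot loop touches only the pos and vel bytes
    pos[i] += vel[i] * dt

print(list(pos))  # [0.5, 1.5, 2.5, 3.5]
```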
 