Predict: The Next Generation Console Tech

Status
Not open for further replies.
Can a 256bit bus be a realistic expectation, with a future 20nm shrink still big enough for it?

Here are some quick and silly calculations: a smallish 256-bit GPU, G94b, has a die size of 180 mm².
20nm would give 1.96 times the density of 28nm. So, a shrinkable Xbox APU would have a die size of about 350 mm² at 28nm.
That's a very rough and naive estimate: you may be able to "afford" a smaller 20nm chip, and 20nm may not be that dense, but can we put some "floor" on the die size for an APU this way?
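
The estimate above can be written out as a small sketch. It assumes ideal scaling, where area shrinks with the square of the node name, which real processes rarely achieve. The function names are made up for illustration, and 180 mm² is simply the G94b figure quoted in the post.

```python
def density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal density gain from a shrink, assuming area scales with the node squared."""
    return (old_nm / new_nm) ** 2


def die_size_before_shrink(floor_mm2: float, old_nm: float, new_nm: float) -> float:
    """Die size at the old node that would shrink down to floor_mm2 at the new node."""
    return floor_mm2 * density_gain(old_nm, new_nm)


# 180 mm^2 is used as a rough floor for a chip that still fits a 256-bit bus.
gain = density_gain(28, 20)
floor_28nm = die_size_before_shrink(180, 28, 20)
print(f"density gain: {gain:.2f}x, 28nm die size: {floor_28nm:.0f} mm^2")
```

With these inputs the 28nm die comes out at roughly 350 mm², matching the figure in the post.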
 
I remember someone mentioning a 384-bit bus.
With 8GB of memory that doesn't seem plausible.

Though if they went with 6GB, the 3GB reserved for the OS would actually be 1GB, which would make more sense.

The 384-bit rumor was from the #1 AMD China guy. It's the GPU bus width (well, he also said the GPU is in the APU), not 256-bit GDDR5, 128-bit DDR3, etc.
 
What do you think about this bike bkilian? A worthy upgrade to my older model or should I change for better tires? ;)
293wmbq.jpg
That's a nice bike. Dunno about the puncture resistance though, are you sure that one is right? :)
 
Can a 256bit bus be a realistic expectation, with a future 20nm shrink still big enough for it?

Here are some quick and silly calculations: a smallish 256-bit GPU, G94b, has a die size of 180 mm².
20nm would give 1.96 times the density of 28nm. So, a shrinkable Xbox APU would have a die size of about 350 mm² at 28nm.
That's a very rough and naive estimate: you may be able to "afford" a smaller 20nm chip, and 20nm may not be that dense, but can we put some "floor" on the die size for an APU this way?
I was thinking they could use a low-cost fan-out interposer; the size of the interposer would allow a much wider bus on a smaller chip. They can start with a large chip and use a fan-out for future shrinks. At the projected $1 per 100 mm², it would be a pretty good solution.

I also read something about "fine pitch" organic laminate packaging, which could put more I/O on a smaller chip and wouldn't need anything special, no TSVs or anything...
 
Durango specs are as follows (most likely, almost certainly):

8-core 1.6 GHz Jaguar CPU.
8 GB DDR3 RAM.
1+ teraflop GPU with some sort of alleged unknown special customizations that may help it perform better than the raw teraflops indicate.
2-3GB of RAM likely reserved for the operating system.
ESRAM on the GPU (unknown amount).

The above is likely a highly accurate and current summary. Take it or leave it, believe it or don't, it's real.
From Aegis in response:

Everything I've heard from reliable sources makes your series of guesses, particularly about total teraflops, impossible. Save for the DDR3 guess, which I'm worried might be true. Which would be incredibly stupid.
 
Why doesn't it make sense? I'm sorry, I really don't know much about memory sizes/bandwidth, or in this case probability/cost. Do you have an easy "math" example?

The memory chips connect directly to the memory bus. Modern memory tech (everything but DDR3) only connects one chip to each controller. So, if your chips are 32-bit, and your bus is 384 bit, it means that you have to use exactly 12 chips.

Chips are sized as powers of two, so there is no way you can get 8GB from that. Closest choices would be 6GB or 12GB.

Mind you, I think the idea of a 384-bit bus is pretty much laughable. Either they have stacking, and get a really wide bus, or they use a traditional system and they really need to keep the bus size and chip amounts down for cost reasons.
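
The chip-count argument can be sketched in a few lines. This assumes one 32-bit chip per channel, as described in the post, and a hypothetical list of power-of-two chip densities in gigabits. The function names and the density list are illustrative, not taken from any datasheet.

```python
def chip_count(bus_bits: int, chip_bits: int = 32) -> int:
    """Number of chips needed when each chip fills one chip_bits-wide channel."""
    if bus_bits % chip_bits != 0:
        raise ValueError("bus width must be a multiple of the chip width")
    return bus_bits // chip_bits


def possible_capacities_gb(bus_bits: int, chip_bits: int = 32,
                           chip_gbits=(1, 2, 4, 8)) -> list:
    """Total capacities in GB reachable with power-of-two chip densities (in Gbit)."""
    chips = chip_count(bus_bits, chip_bits)
    return [chips * gbit / 8 for gbit in chip_gbits]


print(chip_count(384))              # chips on a 384-bit bus
print(possible_capacities_gb(384))  # capacities reachable with 12 chips
print(possible_capacities_gb(256))  # capacities reachable with 8 chips
```

Under these assumptions a 384-bit bus lands on 6GB or 12GB but never 8GB, while a 256-bit bus can reach 8GB with eight chips.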
 
From Aegis in response:

Everything I've heard from reliable sources makes your series of guesses, particularly about total teraflops, impossible. Save for the DDR3 guess, which I'm worried might be true. Which would be incredibly stupid.

Why is he worried about DDR3?
 
The memory chips connect directly to the memory bus. Modern memory tech (everything but DDR3) only connects one chip to each controller. So, if your chips are 32-bit, and your bus is 384 bit, it means that you have to use exactly 12 chips.

Chips are sized as powers of two, so there is no way you can get 8GB from that. Closest choices would be 6GB or 12GB.

Mind you, I think the idea of a 384-bit bus is pretty much laughable. Either they have stacking, and get a really wide bus, or they use a traditional system and they really need to keep the bus size and chip amounts down for cost reasons.

Thanks for the information. So wouldn't that speak in favour of the 8GB DDR3 RAM rumour?
 
Why is he worried about DDR3?

Not sure...quoting for a new page everyone

Durango specs are as follows (most likely, almost certainly):

8-core 1.6 GHz Jaguar CPU.
8 GB DDR3 RAM.
1+ teraflop GPU with some sort of alleged unknown special customizations that may help it perform better than the raw teraflops indicate.
2-3GB of RAM likely reserved for the operating system.
ESRAM on the GPU (unknown amount).

The above is likely a highly accurate and current summary. Take it or leave it, believe it or don't, it's real.
From Aegis in response:

Everything I've heard from reliable sources makes your series of guesses, particularly about total teraflops, impossible. Save for the DDR3 guess, which I'm worried might be true. Which would be incredibly stupid.
 
Not sure...quoting for a new page everyone


From Aegis in response:

Everything I've heard from reliable sources makes your series of guesses, particularly about total teraflops, impossible. Save for the DDR3 guess, which I'm worried might be true. Which would be incredibly stupid.

Wait a moment, is he saying that 1+ TFLOPS is "impossible"? WTF? Impossible like "it is too low" or impossible like "it is too high"?
 
Why is he worried about DDR3?

Because even on a 256bit bus it would have fairly limited bandwidth.
Having said that, if it has 8GB of memory, which seems to be a consistent rumor, I'd pretty much guarantee it's DDR3, which implies the embedded memory will act as an additional "fast" memory pool.
Without knowing the details of the fast pool it's hard to say if the limited main memory bandwidth is an issue.
IMO the number of texture reads per pixel has gone through the roof over the last few years, and having a large slow pool and a small fast pool is probably more of an issue today, if textures are read from it, than it was when the 360 shipped.
Some of that could possibly be alleviated with larger caches. IME it's really hard to predict how a system will behave until you run code on it.

The FLOP figures everyone has attached to performance (I guess we can blame Tim Sweeney for that) are only relevant if the bulk of shaders are ALU limited, and I don't believe that's the case; for that to happen you have to have enough register space and cache to be able to hide memory reads. I would bet a fair amount of ALU resources are wasted on modern GPUs. I'd be willing to bet in the 40% range over the course of a frame.

The memory configuration is likely to be as big an indicator of performance as anything else.
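
To put a rough number on "fairly limited bandwidth", here is a minimal sketch of the usual peak-bandwidth arithmetic: bytes per transfer times transfers per second. The transfer rates are illustrative assumptions (2133 MT/s for DDR3 and 5500 MT/s for GDDR5), not rumored specs, and the function name is made up.

```python
def peak_bandwidth_gbs(bus_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s for a given bus width and transfer rate."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


# Assumed transfer rates, chosen only to compare the two memory types.
ddr3 = peak_bandwidth_gbs(256, 2133)
gddr5 = peak_bandwidth_gbs(256, 5500)
print(f"256-bit DDR3:  {ddr3:.1f} GB/s")
print(f"256-bit GDDR5: {gddr5:.1f} GB/s")
```

On the same 256-bit bus, the assumed GDDR5 rate gives about 2.6 times the peak bandwidth of the assumed DDR3 rate, which is the gap a fast embedded pool would be expected to cover.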
 
Did anyone else see or read about the AMD CES press conference? They showed off a tablet powered by a Temash APU (2-4 Jaguar cores, unknown number of GCN CUs) running Dirt Showdown at 1080p. The SoC was supposedly less than 5 watts TDP. I found it very impressive.
http://www.youtube.com/watch?v=FruxOZ9Nfp0

That, and from what I've been reading about Jaguar, it actually makes me hope the Jaguar core rumors are true; save the silicon and power budget for the GPU. If that's an example of what a 5W SoC can do, I can't wait to see what they do in a 170-200W system.
 
The FLOP figures everyone has attached to performance (I guess we can blame Tim Sweeney for that) are only relevant if the bulk of shaders are ALU limited, and I don't believe that's the case; for that to happen you have to have enough register space and cache to be able to hide memory reads. I would bet a fair amount of ALU resources are wasted on modern GPUs. I'd be willing to bet in the 40% range over the course of a frame.

When evaluating a specific workload, it is true that peak FLOP figures don't give a good indication of performance.
The more general assumption is that a design would keep other resources relatively proportional to the ALU capabilities.

To take the inverse of the last sentence, the assumption would be that modern GPU designs have allocated resources such that they are ALU-limited a little less than half the time. Knowing what the ALU resources are and some guesses at what it takes to feed them can give an indicator of what goes into the rest.
 
When evaluating a specific workload, it is true that peak FLOP figures don't give a good indication of performance.
The more general assumption is that a design would keep other resources relatively proportional to the ALU capabilities.

To take the inverse of the last sentence, the assumption would be that modern GPU designs have allocated resources such that they are ALU-limited a little less than half the time. Knowing what the ALU resources are and some guesses at what it takes to feed them can give an indicator of what goes into the rest.

The problem is that modern GPUs, in the PC space at least, are optimized to run last year's games, and they all have similar memory configurations. It's still a bad way to compare even in that space IMO, but it's difficult to separate all of the factors.

I just think that when we're talking about consoles, FLOPS (assuming they are in the same ballpark) are not going to be a good indicator of system performance; there will be a much bigger difference, IMO, in memory configuration and in how well utilized those ALUs will be.
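
One common way to illustrate this point is a roofline-style estimate, which is not something the posts themselves use: attainable throughput is capped by either the ALUs or by memory bandwidth times the work done per byte fetched. Every number below is hypothetical, picked only to show two systems with the same peak FLOPS and different memory setups.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float,
                      flops_per_byte: float) -> float:
    """Roofline-style bound: the lower of the ALU limit and the memory limit."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Two hypothetical consoles with identical peak FLOPS but different bandwidth.
slow_memory = attainable_gflops(1200, 68, 4)
fast_memory = attainable_gflops(1200, 176, 4)
print(slow_memory, fast_memory)

# Only a shader doing a lot of math per byte read becomes ALU limited.
print(attainable_gflops(1200, 176, 10))
```

In this toy model, a shader doing 4 FLOPs per byte is bandwidth limited on both systems, so the one with faster memory gets far more out of the same ALUs.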
 
I just think that when we're talking about consoles, FLOPS (assuming they are in the same ballpark) are not going to be a good indicator of system performance; there will be a much bigger difference, IMO, in memory configuration and in how well utilized those ALUs will be.

Can this be applied to Wii U too?
 