Predict: The Next Generation Console Tech

AMD already has nice proven well-oiled working architecture that has been refined for last 8 years. Why would they try to do something radical?

Because MS asked them to design something different?

Or put another way, MS gave them specs that could not be met with a conventional design.
 
AMD already has nice proven well-oiled working architecture that has been refined for last 8 years. Why would they try to do something radical?

Exactly. A unified memory space and low-latency communication between the supposed Jaguar and GCN cores are some of the things we are likely to see, but I don't expect anything that radical. Also, GCN with DirectCompute/C++ AMP/OpenCL is no Cell-like programming nightmare. I doubt the (as yet unproven) programmability advantages of a Larrabee-like design would be worth the trouble, especially for game development.
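As a rough illustration of how conventional that programming model already is, here is a minimal C++ AMP sketch of a GPU compute dispatch; the function and variable names are invented for the example, and it assumes MSVC's amp.h:

```cpp
#include <amp.h>
#include <vector>
using namespace concurrency;

// Hypothetical example: scale a buffer on the GPU with C++ AMP.
// Nothing Cell-like here: no DMA lists, no manual local-store management.
void scale_on_gpu(std::vector<float>& data, float k)
{
    array_view<float, 1> av(static_cast<int>(data.size()), data);
    parallel_for_each(av.extent, [=](index<1> idx) restrict(amp)
    {
        av[idx] *= k;              // one GPU thread per element
    });
    av.synchronize();              // copy results back to the host vector
}
```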
 
Larrabee was a manycore throughput x86 design that had some TMU blocks.
Nothing in the rumors, even the outlandish ones, has gone in that direction.
Reading Andrew Lauritzen's recent posts, and even Sebbbi's, makes me a bit sad about that. I would have wished for consoles to go ahead of the technology curve and possibly get rid of compatibility issues down the road.
Larrabee may not have struck the right balance between throughput and single-threaded performance.
Maybe starting with a clean, generic ISA would have been better, I don't know.
Intel decided for Haswell to go with two FMA units instead of one wider one; I guess they did not want to break compatibility with AVX, but there could be more to it than that.
You explained to me how Intel did not double everything (with regard to the SIMD ADD vs. MUL units) but could down the road, and I wonder if they came to the conclusion that, in a multi-threaded environment with OoO execution, it is easier to feed two narrower SIMD units than one bigger one.

The point is that despite Intel's heavy investment in Larrabee, it failed to be competitive except in a market at odds with its original strengths. They may have spent a lot of money on hardware and software. I can't see any of the console manufacturers going through such pain (and heavy spending), whether or not the concept is sound.

We are going to get a glorified PC (or not that glorified, depending on one's POV and the final specs), which is not bad but nothing exciting to look forward to.
 
I don't think there's any way AMD have had a top-secret, completely out-there architecture under development. I expect 'customised' is probably more like RSX was a 'customised' nVidia GPU than being a Graphics Synthesizer or Flipper type true custom part. At best it'll be Xenos like in being a mix of hardwares, taking maybe current tech and adding a couple of forward thinking ideas that'll feature in future GPUs.
 
I don't think there's any way AMD have had a top-secret, completely out-there architecture under development. I expect 'customised' is probably more like RSX was a 'customised' nVidia GPU than being a Graphics Synthesizer or Flipper type true custom part. At best it'll be Xenos like in being a mix of hardwares, taking maybe current tech and adding a couple of forward thinking ideas that'll feature in future GPUs.

Like what? Please elaborate.
 
Because MS asked them to design something different?

Or put another way, MS gave them specs that could not be met with a conventional design.

I don't think there's any way AMD have had a top-secret, completely out-there architecture under development. I expect 'customised' is probably more like RSX was a 'customised' nVidia GPU than being a Graphics Synthesizer or Flipper type true custom part. At best it'll be Xenos like in being a mix of hardwares, taking maybe current tech and adding a couple of forward thinking ideas that'll feature in future GPUs.
Well, they have been in contact for long enough; I would not be too surprised if MSFT asked for something BC-related.
 
Being a Xenos-like part would not be such a bad thing. We know even that had tessellation support long before it became a huge deal with DX11... I wonder what they could fit in there.
 
Awaits bklian to debunk the new rumor, there's just something fishy about those 8s.
I'd lay the 8 on its side and call it the XBox "Infinity". Or maybe it's the "Loop" Charlie has been hinting about. Anyways, I can't debunk, since I don't admit the existence of the product, nor do I admit AMD has anything to do with it, and even if it did, I do not know the relationship between AMD's lineup and this fabled product other than codenames.
This would be a logical move from the original, as they worked out how much reservation they _actually_ needed.
 
Like what? Please elaborate.

GCN at the ISA level is already capable of a lot of things not exposed very well with current PC APIs.

The hardware already has a lot of the groundwork necessary to operate in the same virtual memory space as the CPUs. GCN at least promises the same capability to generate its own work, much like what Nvidia's latest Tesla architecture implements.
At least from the general standpoint of using the ISA as documented to write arbitrary data to memory in a format that could be added to a queue, there's nothing about the hardware that would prevent some kind of work-generation scheme with what exists right now, if the API and driver model allowed.
Here, it's more the case that the hardware is more flexible than is currently exposed, either due to the software or complications like a non-coherent expansion bus.

Preemptible shader execution would be something more forward-looking that doesn't appear to already exist in hardware. That would be an interesting thing to find, but may be too far out (in the future) even for a custom console chip.
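For what it's worth, here is a hedged sketch of what that kind of self-enqueue could look like once CPU and GPU share one virtual address space: either side appends a work descriptor to a shared ring buffer that a command processor (or another kernel) consumes later. The packet layout and names are purely illustrative, not AMD's actual format:

```cpp
#include <atomic>
#include <cstdint>

// Invented work descriptor, for illustration only (not a real packet format).
struct DispatchPacket {
    uint32_t kernel_id;                  // which shader program to launch
    uint32_t grid_x, grid_y, grid_z;     // work dimensions
    uint64_t kernarg_address;            // pointer to the kernel's arguments
};

// Ring buffer living in memory visible to both CPU and GPU.
struct WorkQueue {
    static const uint32_t kSlots = 256;
    DispatchPacket packets[kSlots];
    std::atomic<uint32_t> write_index;
};

// With a unified virtual address space, appending work is just an atomic
// bump of the write index plus an ordinary store; whoever drains the queue
// turns each packet into a dispatch.
void enqueue(WorkQueue& q, const DispatchPacket& p)
{
    uint32_t slot = q.write_index.fetch_add(1) % WorkQueue::kSlots;
    q.packets[slot] = p;
}
```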
 
Like what? Please elaborate.

Smaller wavefronts, larger wavefronts, a different TMU/ALU balance, smarter ROPs, additional instructions in the architecture, a different mechanism to read data and constants for the vertex shaders... it's hard to speculate.
It depends on how they project GPUs will develop going forward vs. where they are now. PC GPUs aren't very forward-looking; they are optimized to run existing titles. On a console you can discard some of that if you have a clear viewpoint on where things are headed.
 
Reading Andrew Lauritzen's recent posts, and even Sebbbi's, makes me a bit sad about that. I would have wished for consoles to go ahead of the technology curve and possibly get rid of compatibility issues down the road.
Larrabee may not have struck the right balance between throughput and single-threaded performance.
Maybe starting with a clean, generic ISA would have been better, I don't know.
The console makers would need to get together and agree to go just as far ahead of the curve as each other, or find someone willing to pay the money to implement the changes.
The direction the money is going seems to favor being able to make use of what we already have or will soon have, which leverages existing investments and expertise.

ISA concerns are generally secondary to things like implementation and design, unless something is truly difficult to implement well, like x87 floating point instructions.

Intel decided for Haswell to go with two FMA units instead of one wider one; I guess they did not want to break compatibility with AVX, but there could be more to it than that.
A single SIMD unit would have made Haswell worse at existing workloads. The inflexibility of a single vector pipeline could conceivably make it worse overall for most loads that get by with shorter vectors.
 
Smaller wavefronts, larger wavefronts, a different TMU/ALU balance, smarter ROPs, additional instructions in the architecture, a different mechanism to read data and constants for the vertex shaders... it's hard to speculate.
It depends on how they project GPUs will develop going forward vs. where they are now. PC GPUs aren't very forward-looking; they are optimized to run existing titles. On a console you can discard some of that if you have a clear viewpoint on where things are headed.


Thanks, I read an interesting paper on smart ROPs recently. I'll try looking for the link when I get a chance.

I would consider Xenos a forward-looking design; that approach appears to have worked well for MS.

Where would you focus to get the best bang for the buck?
 
GCN at the ISA level is already capable of a lot of things not exposed very well with current PC APIs.

The hardware already has a lot of the groundwork necessary to operate in the same virtual memory space as the CPUs. GCN at least promises the same capability to generate its own work, much like what Nvidia's latest Tesla architecture implements.
At least from the general standpoint of using the ISA as documented to write arbitrary data to memory in a format that could be added to a queue, there's nothing about the hardware that would prevent some kind of work-generation scheme with what exists right now, if the API and driver model allowed.
Here, it's more the case that the hardware is more flexible than is currently exposed, either due to the software or complications like a non-coherent expansion bus.

Preemptible shader execution would be something more forward-looking that doesn't appear to already exist in hardware. That would be an interesting thing to find, but may be too far out (in the future) even for a custom console chip.

Yes, building off of GCN would be quite logical.

There's also the theory that the Nextbox is late/delayed and that GCN was its forward-looking design. If that's the case, then what would you do with an extra year of developing GCN?
 
Preemptible shader execution would be something more forward-looking that doesn't appear to already exist in hardware. That would be an interesting thing to find, but may be too far out (in the future) even for a custom console chip.

What is the point of preemptible shader execution? Overall bandwidth is not increased, it seems.
 
Mixing long- and short-running compute jobs springs to mind.

Thanks! Is this analogous to instruction re-ordering in a CPU scheduler? I would imagine the constraints are different, in that the CPU is concerned with data dependencies whereas the GPU is mostly concerned with temporal constraints?
 
A single SIMD unit would have made Haswell worse at existing workloads. The inflexibility of a single vector pipeline could conceivably make it worse overall for most loads that get by with shorter vectors.

Exactly. Two vector pipelines are more flexible and can also be fed from two different threads with HT. The drawbacks of this choice are (1) more stress on the front-end to feed both the FMA units and (2) the necessity to increase the number of execution ports. Increasing the number of execution ports was probably necessary anyway, in order to feed the vector integer multiply and the vector integer add units.
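As a rough example of why two independent pipes help, a loop unrolled into two independent FMA chains (plain AVX/FMA intrinsics; the function is invented for illustration) gives the scheduler something to issue to both units every cycle, which one long dependent chain through a single wider unit would not:

```cpp
#include <immintrin.h>

// Hypothetical example: out[i] = a * x[i] + y[i], unrolled by two so that
// each iteration carries two independent 256-bit FMAs. With two FMA pipes
// the scheduler can issue both per cycle; a single dependent chain could
// not keep both busy.
void axpy(const float* x, const float* y, float* out, float a, int n)
{
    __m256 va = _mm256_set1_ps(a);
    int i = 0;
    for (; i + 16 <= n; i += 16) {
        __m256 r0 = _mm256_fmadd_ps(va, _mm256_loadu_ps(x + i),
                                        _mm256_loadu_ps(y + i));
        __m256 r1 = _mm256_fmadd_ps(va, _mm256_loadu_ps(x + i + 8),
                                        _mm256_loadu_ps(y + i + 8));
        _mm256_storeu_ps(out + i,     r0);
        _mm256_storeu_ps(out + i + 8, r1);
    }
    for (; i < n; ++i)                 // scalar tail
        out[i] = a * x[i] + y[i];
}
```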
 