Predict: The Next Generation Console Tech

I'm not convinced that a ~1 TFLOPS system will be able to achieve this; it's becoming increasingly clear that Microsoft no longer cares about gaming.

FWIW I don't think it comes down to FLOPS at all in anything but a "you need more than you have now" way.
The design of the memory subsystem is much more important IMO, coupled with the number of ROPs.
On PC cards today the ALUs are grossly underutilized for a lot of the work, particularly processing vertices.
Any single number people latch onto is always going to be a gross oversimplification.
 
This is the difference between VLIW5 (Cypress) and VLIW4 (Cayman).
http://www.realworldtech.com/cayman/6/

From all the materials I've read (and the RealWorldTech link above), R300/../R580/Xenos are different from both: they have two execution ports (versus 4 in VLIW4), the first attached to a 4x32-bit vector execution unit and the second attached to a scalar 32-bit execution unit.

I've found information from this very board that shows that AMD have used a VLIW design since R300.

R3xx-R4xx was 3+1 (co-issue and dual issue) and R500 was 4+1, so they were, but more limited.

It also states in the thread that R520 is VLIW 4+1 for the VS and 3+1 for the PS.

However, because Xenos is a unified shader architecture, it can't have a VS/PS split like R520's; and since Xenos being VLIW 4+1 has already been discussed in this thread, I would say that is likely true.

Link to the post : http://forum.beyond3d.com/showpost.php?p=1517013&postcount=7
 
8 Jaguar cores
8 GB RAM with fast bandwidth
<16 CU GPU, 1-1.5 teraflops
DSPs
etc.

100 W TDP

I totally called it a few months ago. :p
 
I've found information from this very board that shows that AMD have used a VLIW design since R300.

You're right that they are all VLIW architectures. However, as I said above, they are 3 different designs.
R300 VS/Xenos: 2 execution ports -> 4-wide vector + scalar
Cypress: 5 execution ports -> 5 x scalar
Cayman: 4 execution ports -> 4 x scalar

So saying that they went "back" is not correct.


That's exactly what I'm saying and it is very different from both Cypress and Cayman.
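To make the slot-count difference concrete, here is a toy utilization model in Python (my own illustration, not something from the RealWorldTech article or AMD docs): assume the compiler can co-issue some number of independent scalar ops each cycle, and see how full each layout's issue bundle gets.

def slot_utilization(independent_ops, slots):
    # Fraction of issue slots filled this cycle if the compiler can only
    # co-issue `independent_ops` independent scalar operations.
    return min(independent_ops, slots) / slots

for name, slots in [("Cypress VLIW5", 5), ("Cayman VLIW4", 4)]:
    for ilp in (3, 4, 5):
        print(name, ilp, "independent ops ->", round(slot_utilization(ilp, slots), 2))
# With typical shader ILP of around 3-4 scalar ops, Cypress's fifth slot often
# sits idle, which is one way to look at the VLIW5 -> VLIW4 change. The
# Xenos-style vec4+scalar layout is different again: it needs the work to
# actually be a 4-wide vector plus a scalar, not just any independent ops.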
 
lol, I saw the #1 AMD China guy say "Microsoft won't care about this rubbish-level GPU" (someone copied our ~1 TFLOP talk to TGFC and asked him)
 
Wow, this is why I try to say nothing, it's amazing what people will infer from almost no data.

Think about things logically. What are the important features required in a console, from the perspective of the manufacturer?

1. Enough power that users can see a clear reason to upgrade.
2. Low manufacturing cost.
3. Path to lower manufacturing costs, to enable profits and lower prices.
4. Design that meets all regions' regulations.

People keep asking why a company would back off from the 200 mm²+ designs of the last generation. Simple: reasons #2 and #3. Process shrinks are becoming more expensive and taking longer to arrive. For that reason you have to start with a smaller upfront cost.

And note that #1 is not "as much power as we can fit", it's a much lower requirement.
Anyways, off to lunch.

You are right; the problem for MS is that Xbox users have come to expect things like Samaritan in-game as the starting point of the next generation, with the quality of games improving wave after wave.
 
8 Jaguar cores
8 GB RAM with fast bandwidth
<16 CU GPU, 1-1.5 teraflops
DSPs
etc.

100 W TDP

I totally called it a few months ago. :p

That should fit in a ~200mm^2 SoC and would be quite disappointing overall, but the efficiency of a single SoC might still be interesting from a GPGPU perspective. The GPU component would be in the same ballpark as a Radeon HD 7770, but I certainly hope that it has more than ~70 GB/s of bandwidth. I still think that 20 CUs are probably doable in a single chip, considering the savings in motherboard layout and cooling.
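As a rough sanity check on those figures, here is a quick Python sketch assuming GCN-style CUs (64 lanes each, one FMA = 2 FLOPs per lane per clock); the CU counts and clocks below are purely illustrative guesses, not anything leaked.

def gcn_gflops(cus, clock_ghz, lanes_per_cu=64):
    # Peak single-precision GFLOPS: lanes * 2 FLOPs (FMA) per clock.
    return cus * lanes_per_cu * 2 * clock_ghz

for cus, clock in [(10, 1.0), (12, 0.8), (14, 0.8), (16, 0.8)]:
    print(cus, "CU @", clock, "GHz ->", gcn_gflops(cus, clock), "GFLOPS")
# 10 CU @ 1.0 GHz matches the HD 7770 (~1280 GFLOPS); 12-16 CU at ~800 MHz
# land in the 1.2-1.6 TFLOPS ballpark quoted above.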
 
With DDR memory??

With 3D-stacked memory, yes. They'll get high enough bandwidth with DDR4.
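For reference, peak bandwidth is just bus width times data rate. A minimal Python sketch, with the bus widths and speed grades below chosen purely for illustration rather than as a prediction:

def peak_bandwidth_gb_s(bus_width_bits, data_rate_mt_s):
    # bytes per transfer * transfers per second, expressed in GB/s
    return bus_width_bits / 8 * data_rate_mt_s / 1000

print(peak_bandwidth_gb_s(128, 2133))  # 128-bit @ 2133 MT/s  -> ~34 GB/s
print(peak_bandwidth_gb_s(256, 2133))  # 256-bit @ 2133 MT/s  -> ~68 GB/s
print(peak_bandwidth_gb_s(256, 3200))  # 256-bit @ 3200 MT/s  -> ~102 GB/s
# A wide interface or stacking is what gets DDR-class memory anywhere near
# the ~70+ GB/s of a GDDR5 card like the HD 7770.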

You are right; the problem for MS is that Xbox users have come to expect things like Samaritan in-game as the starting point of the next generation, with the quality of games improving wave after wave.

They can still get Samaritan on a weaker box. 2.5 TF was the recommendation at 1080p. With dynamic or lower-than-1080p resolution it's doable. If bandwidth increases by a lot there'd be fewer sacrifices too. Most of us here would still be disappointed though, no doubt :)
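To put a number on the resolution argument: if cost scales roughly with pixel count (a big simplification that ignores fixed per-frame work), the 2.5 TF figure at 1080p drops quickly at lower render resolutions. A quick Python sketch:

def scaled_tflops(base_tflops, base_res, target_res):
    # Scale the FLOPS budget by the ratio of pixels rendered per frame.
    return base_tflops * (target_res[0] * target_res[1]) / (base_res[0] * base_res[1])

print(scaled_tflops(2.5, (1920, 1080), (1600, 900)))  # ~1.74 TF
print(scaled_tflops(2.5, (1920, 1080), (1280, 720)))  # ~1.11 TF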
 
Previously....

I admitted I was wrong on Xenos not being VLIW, no point in quoting the previous posts. :smile:

VLIW4 is more efficient than VLIW5, hence why AMD moved back to VLIW4 with the 6000 series.

I say no more..... They may have different layouts and performance, but they're all VLIW, including Xenos, which is what the whole thing was started over ;)

This is what I was pointing out: they did not "move back" to VLIW4; it's a completely different layout from Xenos. Hence, I don't see how a 60% improvement from Xenos to VLIW4 would not be possible, as stated in your initial post.

Even comparing a 5770 to a 7770 at the same clocks (both cards have the same bandwidth, fill and texel rates), the performance, while faster on the 7770, shows an improvement in shader architecture of around 40% on average, which means that AMD would have had to make an improvement of ~60% between Xenos's VLIW 4+1 and the VLIW5-powered AMD 5000 series.

By the way, how can you say that the improvement is 40% in shading performance when the 7770 might be bottlenecked by bandwidth or texel fill rates? Are you referring to a specific benchmark?
 
That should fit in a ~200mm^2 SoC and would be quite disappointing overall, but the efficiency of a single SoC might still be interesting from a GPGPU perspective.

Can you still shrink a ~200 mm² chip? Wouldn't you run into pad limitations after you shrunk it? As for GPGPU applications, yeah I agree. I think that HSA will be a part, maybe even a big part, of next gen consoles.

As for memory, I sure hope for something like 8 GB. It really bugged me that Halo: Combat Evolved for the original Xbox has such huge maps, and Halo 3 came pretty close as well, but Halo 4 has much smaller maps.
 
By the way, how can you say that the improvement is 40% in shading performance when the 7770 might be bottlenecked by bandwidth or texel fill rates? Are you referring to a specific benchmark?

I'm referring to a whole host of game benchmarks, and not just synthetic benchmarks :smile:
 
I'm referring to a whole host of game benchmarks, and not just synthetic benchmarks :smile:

Then it's a 40% improvement in overall performance, not shading performance. :smile: To really isolate the shading performance improvement we'd have to be sure the bottleneck isn't elsewhere. Even compute benchmarks might be memory bound on the HD 7770, but not on the slower HD 5770.
 
Then it's a 40% improvement in overall performance, not shading performance. :smile: To really isolate the shading performance improvement we'd have to be sure the bottleneck isn't elsewhere. Even compute benchmarks might be memory bound on the HD 7770, but not on the slower HD 5770.

They both have the same fill and texel rates, as well as the same amount of bandwidth.

However, the 5770 has a higher GFLOPS rate and more shaders, so any victory for the 7770 would surely be related to the GCN architecture?
 
They both have the same fill and texel rates, as well as the same amount of bandwidth.

However, the 5770 has a higher GFLOPS rate and more shaders, so any victory for the 7770 would surely be related to the GCN architecture?

Yes, but you can't say it's all about the shaders. The same theoretical maximum fill and texel rates don't mean the same practical fill and texel rates. Those practical differences are of course due to architectural differences. I'm assuming you already know that, as most people here are probably more experienced when it comes to these concepts than me.
 
They can still get Samaritan on a weaker box. 2.5 TF was the recommendation at 1080p. With dynamic or lower-than-1080p resolution it's doable. If bandwidth increases by a lot there'd be fewer sacrifices too. Most of us here would still be disappointed though, no doubt :)

Yes, I know. 1080p should be the standard for a true next gen.
 
If what is being said now (stuck with shitty HD 7750 notebook-level GPUs for ten years) is true, I will trade consoles for a PC or Steambox...
I think mobile phones and tablets will make consoles history 6-7 years down the road. If you think about it, you really can't sell a big, noisy, hot box at 600 dollars, but you can make phones that will visually get close enough and sell a shit ton of them.

Basically, if a phone is $800, people will say "Well, it's more expensive, but I can pretty much do whatever I want on it, so it pays off." I guess that's why MS is looking for a subsidized strategy and quick payoffs. They know phones and tablets are right on their heels, and they know they can't go for another 8-year-long generation where they only start actually profiting in the last 3 years.
 
However, the 5770 has a higher GFLOPS rate and more shaders, so any victory for the 7770 would surely be related to the GCN architecture?

Yes, but you cannot determine the speedup in shading performance unless you're sure that the bottleneck isn't elsewhere. It is at least 40%, but it is likely to be more, by Amdahl's law.

Let's work through an example. Suppose a game is texel-rate bound 30% of the time and compute bound 70% of the time. If the HD 7770 is 40% faster than the HD 5770, assuming no efficiency gain in the texture units*, how much faster are shader computations on the HD 7770?

speedup = (execution time on HD 5770) / (execution time on HD 7770) = 1.4

Since the texel-rate bound portion sees no improvement, and with the HD 5770's execution time normalized to 1, the execution time on the new architecture is
execution time on HD 7770 = 0.3 + 0.7 * x
where x is the ratio of new to old compute time.

Hence the compute speedup is obtained as follows:
1.4 = 1 / (0.3 + 0.7 * x)
x ≈ 0.59
So the compute-bound portion takes about 0.59 times as long, which means the shader computations are roughly 1/0.59 ≈ 1.7 times as fast, i.e. about a 70% improvement rather than 40%.

* This is a fairly strong assumption, though.
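If it helps, here is the same calculation as a small Python sketch; the 30%/70% split, the 1.4x overall speedup, and the "no texture-unit gains" assumption are just the illustrative numbers from above, not measured data.

def compute_speedup(overall_speedup, compute_fraction):
    # Old frame time normalized to 1.0; the non-compute portion is assumed
    # to take exactly as long on the new GPU as on the old one.
    bound_time = 1.0 - compute_fraction      # e.g. texel-rate bound time
    new_total = 1.0 / overall_speedup        # new frame time implied by the speedup
    new_compute = new_total - bound_time     # time left for the shading work
    return compute_fraction / new_compute    # old compute time / new compute time

print(compute_speedup(overall_speedup=1.4, compute_fraction=0.7))
# -> ~1.69, i.e. the shader computations would be roughly 69-70% faster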
 