No, only two 256-bit-wide FMA units per core would give this kind of peak FLOPS. That's not gonna happen. If they had a 256-bit-wide AVX2 unit per core then it'd go a long way. The Durango CPU would sport 410 GFLOPs.
Yeah, that's what I was thinking. Going from the 128 bit units in Jaguar to 256 bit would only double the 102 GFlop number we have now.
If it's 3-operand FMA, why do you need an extra read port, more register space, or more L/S bandwidth?
Yes, it would be double if they went with AVX. AVX2, first to market with Haswell, would give us the 409.6 GFLOPs.
Maybe that's why Bkillian said that Durango would do things that today's monster PCs couldn't (nobody has AVX2 in their PC yet).
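For anyone who wants to check the arithmetic behind the 102 / 204.8 / 409.6 figures, here is a rough sketch. The thread never states the core count or clock, so the 8 cores at 1.6 GHz below are assumptions, picked because they reproduce the 102 GFLOPs number quoted above. The function names are made up for illustration.

```python
# Rough peak single-precision FLOPS estimate.
# ASSUMPTIONS (not stated in the thread): 8 cores, 1.6 GHz.
CORES = 8
CLOCK_GHZ = 1.6

def flops_per_cycle(simd_bits, pipes, flops_per_lane):
    """FLOPs per core per cycle: 32-bit lanes x number of pipes x ops per lane."""
    lanes = simd_bits // 32
    return lanes * pipes * flops_per_lane

def peak_gflops(per_cycle, cores=CORES, clock_ghz=CLOCK_GHZ):
    """Peak GFLOPS = cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * per_cycle

# 128-bit units, one MUL pipe + one ADD pipe, 1 FLOP per lane each.
jaguar_128 = peak_gflops(flops_per_cycle(128, pipes=2, flops_per_lane=1))
# Same two pipes widened to 256 bits.
wide_256 = peak_gflops(flops_per_cycle(256, pipes=2, flops_per_lane=1))
# Two 256-bit FMA pipes, an FMA counting as 2 FLOPs per lane.
dual_fma_256 = peak_gflops(flops_per_cycle(256, pipes=2, flops_per_lane=2))

print(round(jaguar_128, 1), round(wide_256, 1), round(dual_fma_256, 1))
```

Under those assumptions, widening alone only doubles the number; it takes two 256-bit FMA pipes per core to reach the ~410 figure.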
Isn't that doubling inherent to the Haswell implementation of AVX2 (ie, it just has twice as many FMA units)? Or is double the FMAs implied for AVX2 support? In any case, I can't imagine AMD basically quadrupling the size of the FPU on Jaguar for Durango to match a design even Intel hasn't shipped yet.
Vector ALUs in Jaguar are already double-pumped (one 256-bit op per base clock), so just doubling the unit count should suffice.
The physical SIMD units are 128 bits wide internally. Executing a 256-bit AVX2 instruction takes two cycles, the same way a 128-bit SSE2 instruction took two cycles on Bobcat with its 64-bit SIMD units.
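A toy model of that point, as a sketch only: it assumes an op wider than the physical unit is simply split into unit-sized passes, one pass per cycle, and ignores everything else about the pipeline. The function name is hypothetical.

```python
def issue_cycles(op_bits, unit_bits):
    """Cycles to push one vector op through a narrower unit (ceiling division)."""
    return -(-op_bits // unit_bits)

print(issue_cycles(256, 128))  # 256-bit op on 128-bit units
print(issue_cycles(128, 64))   # 128-bit SSE2 op on Bobcat's 64-bit units
print(issue_cycles(256, 256))  # full-width unit
```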
Bobcat and Jaguar are designed to be low power general purpose CPUs, where you don't see a huge demand for floating point performance. One can argue it makes sense to use narrower data paths and execution units to lower idle power consumption.
Durango is an entirely different design point. Games can use lots of floating point resources, and while power is a concern, it isn't a mobile platform. Adding FMA and full-width SIMD units seems like low-hanging fruit to me, although significant changes would need to be made to the scheduler and register file to support the extra source operand.
Cheers
No, look at the hotchips presentation from AMD about Jaguar.
One Jaguar FP unit can do 8 SP MULs and 8 SP ADDs per cycle, or 1 DP MUL and 2 DP ADDs per cycle.
You should have another look!
If Durango brings AVX2 it will be the best console CPU ever. Add ESRAM (if it's low latency) for talking to the GPU and you get GPGPU heaven. Then the comments after last year's Durango Summit about it being a supercomputer would make sense. Having Xeons in the devkits would also make a bit more sense, since the ESRAM approximates the Xeon's giant L3 cache.
But isn't AVX2 an Intel (Haswell) exclusive instruction set?
Exclusive in terms of being the only announced architecture that supports it? Yes. But any extension Intel adds to the x86 instruction set, AMD is free to implement, and likewise Intel is free to implement anything AMD adds. That's how the license works.
Ok, thanks, it is like AMD can use something "like" AVX2, but not AVX2.
Are there any future AMD CPUs/APUs with "AVX2" support?
To use it, AMD would have to provide support for AVX2 instructions exactly as Intel has created the specification, else it's not AVX2. How they implement the silicon is up to them, though.
AMD has not announced any CPUs with AVX2 support. They only just started supporting AVX in H2 2011. (Intel was Q1)