Simon Pilgrim of Sony Computer Entertainment Europe, after some commits on the CPU side, committed something for the GPU:
https://reviews.llvm.org/rL347326
GFX9-NEXT
Hmm... is Navi supposed to be the 9th?
GFX9 is Vega.
Seems more like a naming convention change for the AMDGPU unit tests.
CIVI-DAG became CIVI-NEXT.
GFX9-DAG became GFX9-NEXT.
Added HAWAII-NEXT.
Added FIJI-NEXT.
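For anyone unfamiliar with the suffixes: in LLVM's FileCheck tool the part before the dash (CIVI, GFX9, HAWAII, FIJI) is a check prefix chosen by the test, while -NEXT and -DAG are directives. -NEXT requires a match on the line right after the previous match, and -DAG lets a group of patterns match in any order. Here is a toy Python model of that difference. It is not FileCheck's real implementation: it only does substring matching on whole lines, and the function names and the sample assembly lines are made up for illustration.

```python
def check_next(lines, start, patterns):
    """Model of PREFIX-NEXT: each pattern must match the line
    immediately after the previous match (starting after `start`)."""
    pos = start
    for pat in patterns:
        pos += 1
        if pos >= len(lines) or pat not in lines[pos]:
            return False
    return True


def check_dag(lines, start, patterns):
    """Model of PREFIX-DAG: patterns may match in any order,
    each on its own line somewhere after `start`."""
    used = set()
    for pat in patterns:
        hit = next(
            (i for i in range(start + 1, len(lines))
             if i not in used and pat in lines[i]),
            None,
        )
        if hit is None:
            return False
        used.add(hit)
    return True


# Hypothetical compiler output; index 0 plays the role of a label line.
output = [
    "test_fn:",
    "v_add_f32 v0, v0, v1",
    "v_mul_f32 v2, v0, v0",
    "s_endpgm",
]

in_order = ["v_add_f32", "v_mul_f32"]
swapped = ["v_mul_f32", "v_add_f32"]

next_in_order = check_next(output, 0, in_order)
next_swapped = check_next(output, 0, swapped)
dag_swapped = check_dag(output, 0, swapped)
print(next_in_order, next_swapped, dag_swapped)
```

So moving a test from -DAG to -NEXT makes it stricter about instruction order, which would fit tests whose expected output is pinned down line by line.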
What is CIVI?
lol I was going to ask this next.
Is the implication here that this could be a lead that Vega could be part of PS5?
Someone's joke of CM? Once upon a time at my work, someone responsible for setting up new accounts misread "ADAM" as "A D A I V I". Since then we've always forced Adam to use ADAIVI.
Maybe Navi
If a C becomes an N.
A webpage that has the flops/clock of various CPUs - worth a look if you are interested:
https://stackoverflow.com/questions...le-for-sandy-bridge-and-haswell-sse2-avx-avx2
AMD Bulldozer/Piledriver/Steamroller/Excavator, per module (two cores):
- 8 DP FLOPs/cycle: 4-wide FMA
- 16 SP FLOPs/cycle: 8-wide FMA
AMD Ryzen:
- 8 DP FLOPs/cycle: 4-wide FMA
- 16 SP FLOPs/cycle: 8-wide FMA
AMD Jaguar:
- 3 DP FLOPs/cycle: 4-wide AVX addition every other cycle + 4-wide AVX multiplication in four cycles
- 8 SP FLOPs/cycle: 8-wide AVX addition every other cycle + 8-wide AVX multiplication every other cycle
So those numbers for Ryzen and Jaguar are per core, and the Bulldozer ones are per module. Does Jaguar really have a performance penalty for DP? Don't Jaguar and Bulldozer both have the same FMAC, only Bulldozer has one per module and Jaguar has one per core (or 4 per 4-core module)?
So where does that put an 8-core Jaguar? An 8-core Jaguar must be very efficient to be able to outperform an FX-8350 @ 4 GHz with its 2 GHz?
According to those numbers, an 8-core Jaguar would do 24 DP FLOPs per cycle or 64 SP FLOPs per cycle, while an 8-core Bulldozer would be at 32 DP FLOPs or 64 SP FLOPs per cycle. I was under the impression that each Jaguar core used the same FPU as each Bulldozer module, but apparently not. Still, Bulldozer has a pretty long pipeline, which can hold back IPC.
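The per-cycle totals can be checked with a few lines of Python. This is just the multiplication spelled out, using the per-core and per-module figures quoted earlier in the thread. The dictionary layout and the function name are my own, and an "8-core" Bulldozer is assumed to mean 4 modules of two cores each.

```python
# Peak FLOPs per cycle per unit, as quoted from the Stack Overflow list.
PER_UNIT = {
    "Bulldozer module": {"dp": 8, "sp": 16},  # one module = two cores
    "Jaguar core": {"dp": 3, "sp": 8},
}


def chip_flops_per_cycle(unit, count):
    """Scale the per-unit peak figures up to a whole chip."""
    return {precision: flops * count
            for precision, flops in PER_UNIT[unit].items()}


jaguar_8c = chip_flops_per_cycle("Jaguar core", 8)          # 8 cores
bulldozer_8c = chip_flops_per_cycle("Bulldozer module", 4)  # 4 modules

print("Jaguar 8-core:   ", jaguar_8c)
print("Bulldozer 8-core:", bulldozer_8c)
```

These are theoretical peaks per clock cycle only, so they say nothing yet about clock speed or how much of the peak real code reaches.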
For multiplication, double precision takes an extra iteration on Jaguar that blocks additional multiplications. The extra gap in cycles reduces throughput further than the half rate you would expect from going to double precision.
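To put numbers on that, here is a small sketch that rebuilds the Jaguar figures from the issue rates in the quoted list (vector width divided by cycles per operation). The helper name is made up, and the issue rates are taken from the list as quoted, not from AMD documentation.

```python
from fractions import Fraction


def flops_per_cycle(width, cycles_per_op):
    """Average FLOPs per cycle for one unit: `width` lanes
    completing one operation every `cycles_per_op` cycles."""
    return Fraction(width, cycles_per_op)


# Single precision: 8-wide add and 8-wide multiply, each every other cycle.
sp_add = flops_per_cycle(8, 2)
sp_mul = flops_per_cycle(8, 2)

# Double precision: 4-wide add every other cycle, 4-wide multiply in four cycles.
dp_add = flops_per_cycle(4, 2)
dp_mul = flops_per_cycle(4, 4)

sp_total = sp_add + sp_mul
dp_total = dp_add + dp_mul

print("SP total:", sp_total, "DP total:", dp_total)
print("DP/SP add ratio:", dp_add / sp_add)
print("DP/SP mul ratio:", dp_mul / sp_mul)
```

The adder drops to half rate in DP, as you would expect from halving the lane count, but the multiplier drops to a quarter, which is where the 3 DP versus 8 SP FLOPs/cycle gap comes from.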
The architectures are different. For one, Jaguar doesn't have an FMAC, and it lacks a fair number of extensions supported by the Bulldozer line. Bulldozer also puts a higher priority on double precision, while Jaguar saved hardware by reducing throughput for that data type.
The Bulldozer module would presumably not be running at the same clock as the Jaguar one, and would likely target something close to twice the clock speed while having only half as many modules as Jaguar has cores.
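Folding the clock in, a rough peak comparison looks like this. It is a sketch using the thread's round numbers (4 GHz for the FX-8350, 2 GHz for an 8-core Jaguar) as assumptions. The function name is hypothetical, and these are theoretical peaks, not measured performance.

```python
def peak_gflops(units, flops_per_cycle_per_unit, clock_ghz):
    """Theoretical peak = units x FLOPs/cycle per unit x clock in GHz."""
    return units * flops_per_cycle_per_unit * clock_ghz


# FX-8350: 4 modules, per-module figures from the quoted list, 4 GHz assumed.
fx_sp = peak_gflops(4, 16, 4.0)
fx_dp = peak_gflops(4, 8, 4.0)

# 8-core Jaguar: per-core figures from the quoted list, 2 GHz assumed.
jag_sp = peak_gflops(8, 8, 2.0)
jag_dp = peak_gflops(8, 3, 2.0)

print(f"FX-8350 peak: {fx_sp} SP / {fx_dp} DP GFLOPS")
print(f"Jaguar  peak: {jag_sp} SP / {jag_dp} DP GFLOPS")
print(f"SP ratio (FX / Jaguar): {fx_sp / jag_sp}")
```

Per cycle the two chips tie on SP, so under these assumptions the whole SP gap comes from the clock.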
Ahhh... I'd read that Jaguar's FPUs were double or two-way 128-bit FPUs, which I equated with the FMAC in Bulldozer. So is Jaguar 128-bit per pipe with a performance penalty to combine 2 pipes into DP, and Bulldozer 2x128-bit per module (2 cores) with no performance penalty for DP?
@3dilettante
So an FX-8350 at its stock speed is faster than the 8-core Jaguar found in consoles?
How does that compare clock for clock?
For the consoles, the DP case isn't all that important, however.
Is this the case for games in general? I had an old A10 laptop and a Phenom II 940, and for most games the A10 was GPU-bound, but older games like Quake 1 at lower resolutions ran better on it, IIRC.