AMD: Southern Islands (7*** series) Speculation/ Rumour Thread

Raqia · Jun 15, 2011

Interesting. Wonder what this means for the APU that comes after Trinity. Who needs an FPU when you can make the GPU circuitry do double duty?

Alexko · Jun 15, 2011

trinibwoy said:
Ah yes just saw that. So no more clauses. They seem to be embracing a lot of things nVidia has been preaching for years. Guess it'll come down to who has the best implementation.

Yes, it does look a lot more like Fermi than Cayman did. There's still one major difference, though: no SIMT, just a classic Scalar + SIMD.

Alexko · Jun 15, 2011

Raqia said:
Interesting. Wonder what this means for the APU that comes after Trinity. Who needs an FPU when you can make the GPU circuitry do double duty?

The latency should still be way, way too high for many workloads.

trinibwoy · Jun 15, 2011

Alexko said:
Yes, it does look a lot more like Fermi than Cayman did. There's still one major difference, though: no SIMT, just a classic Scalar + SIMD.

From a software perspective how do we know each D isn't a T? Each SIMD lane could very well be dedicated to a pixel/vertex/control-point/work-item. Otherwise how are they going to get SIMD instructions out of the average compute program?

rpg.314 · Jun 15, 2011

Any body has any idea where these slides might be?

EDIT: A pdf of entire deck would be nice.

rpg.314 · Jun 15, 2011

It's a huge change.

pcper said:
no roadmaps, specific products, or feature rollout time lines

All these should come out with 7xxx series. But I wouldn't be surprised if all of these didn't.

rpg.314 · Jun 15, 2011

Raqia said:
Who needs an FPU when you can make the GPU circuitry do double duty?

Anyone who does not want to invest a dime in his existing codes.

rpg.314 · Jun 15, 2011

So, if I understand correctly, AMD has reduced the branching granularity from 64 to 16. Right?

entity279 · Jun 15, 2011

Charlie was saying about 3-4 weeks ago that no new gpu is expected until winter 2012 (from both AMD & nV).

LE: This related to 8000 morphing into 7000 series speculation.

liolio · Jun 15, 2011

hardware.fr has the (all) slides in a more readable format:
http://www.hardware.fr/news/11648/afds-architecture-futurs-gpus-amd.html

Tridam · Jun 15, 2011

rpg.314 said:
So, if I understand correctly, AMD has reduced the branching granularity from 64 to 16. Right?

No. Instead of 16 vec5 over 4 cycles they will do 4x 16 scalar over 4 cycles.

rpg.314 · Jun 15, 2011

Tridam said:
No. Instead of 16 vec5 over 4 cycles they will do 4x 16 scalar over 4 cycles.

So,

a) 4 different wavefronts from one workgroup issue to a CU.

b) 4 different wavefronts from 4 different workgroups issue to a CU.

What is it?

Gipsel · Jun 15, 2011

trinibwoy said:
From a software perspective how do we know each D isn't a T? Each SIMD lane could very well be dedicated to a pixel/vertex/control-point/work-item. Otherwise how are they going to get SIMD instructions out of the average compute program?

It's easy. As each "thread" in a GPU has no own control flow, it is really only a D, not a T in SIMD (lane masking dosn't help that fundamental issue). There is no difference (and never was) between AMD an nvidia in this respect. Everything else are just stupid marketing terms.

Gipsel · Jun 15, 2011

rpg.314 said:
So,

a) 4 different wavefronts from one workgroup issue to a CU.

b) 4 different wavefronts from 4 different workgroups issue to a CU.

What is it?

Edit: To account for edit.

How many work groups can coexist on one CU depends on the resource allocation (as today). If you don't use a lot of local memory for instance, several workgroups can potentially run in parallel.

rpg.314 · Jun 15, 2011

I should have phrased it better.

a) 4 different wavefronts from one workgroup issue to 4 simd's in a CU.

b) 4 different wavefronts from 4 different workgroups issue to 4 simd's in a CU.

What is it?

I think it should be (a), else it would break dx11 spec.

GZ007 · Jun 15, 2011

Iam wondering if the TMU-s are still in CU-s :?:

Not a single slide mentioned them.

rpg.314 · Jun 15, 2011

It was a preview of compute arch. TMU's weren't mentioned in Fermi's compute briefing either.

fellix · Jun 15, 2011

I guess they will follow NV's direction in GF110 and (re)introduce 16-bit TMU quad per CU.

trinibwoy · Jun 15, 2011

Gipsel said:
It's easy. As each "thread" in a GPU has no own control flow, it is really only a D, not a T in SIMD (lane masking dosn't help that fundamental issue). There is no difference (and never was) between AMD an nvidia in this respect. Everything else are just stupid marketing terms.

Yes that's exactly what I'm saying. nVidia's "T" is just an independent data set. So no different from SIMD. I was responding to Alexko's comment that SIMT != SIMD. The only real difference is that developers don't have to explicitly code SIMD instructions compared to typical cases, SSE/AVX etc.

trinibwoy · Jun 15, 2011

rpg.314 said:
So,

a) 4 different wavefronts from one workgroup issue to a CU.

b) 4 different wavefronts from 4 different workgroups issue to a CU.

What is it?

Both are probably supported. I believe multiple blocks can be running in parallel on an SM so why not multiple workgroups per CU? Each workgroup probably runs on a single CU though for the usual reasons.

AMD: Southern Islands (7*** series) Speculation/ Rumour Thread

Raqia

Alexko

Alexko

trinibwoy

Meh

rpg.314

rpg.314

rpg.314

rpg.314

entity279

liolio

Aquoiboniste

Tridam

rpg.314

Gipsel

Gipsel

rpg.314

GZ007

rpg.314

fellix

trinibwoy

Meh

trinibwoy

Meh

Similar threads