Old 03-Jul-2011, 04:07   #26
mczak
Senior Member
 
Join Date: Oct 2002
Posts: 2,691

Quote:
Originally Posted by ToTTenTranz View Post
Desktop versions actually have decently-clocked DDR3 chips.

I've also heard that in some cases it's only a 16-bit bus, but I'm pretty sure the 780G in my Ferrari One is using a 32-bit Sideport with 384MB. Access to UMA is blocked through the BIOS, though.
OK, you made me curious, and here's what I found: sideport on RS7xx chipsets is always 16-bit, DDR2 or DDR3. It looks like earlier boards tended to use DDR2-1066 and later ones DDR3-1333 (but that's just a rough guideline). Still, the bandwidth is pathetic in any case.
And I've got very, very serious doubts about your sideport size. 384MB might be the total allocated framebuffer memory (UMA plus sideport). If you've got DDR2 sideport, then the sideport size is likely 64 or 128MB (one 512Mb or 1Gb chip); if it's DDR3, it's most likely 128MB (one 1Gb chip).
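
For a sense of scale, the figures above can be turned into rough numbers. This is a minimal Python sketch, assuming theoretical peak bandwidth is simply bus width times transfer rate; the function names are made up for illustration, and the 128-bit DDR3-1333 line is only a hypothetical comparison point, not something from this thread.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


def chip_capacity_mb(density_megabits: int) -> float:
    """Capacity in megabytes of one DRAM chip, given its density in megabits."""
    return density_megabits / 8


# 16-bit sideport at the two speeds mentioned above
print(f"DDR2-1066, 16-bit: {peak_bandwidth_gbs(16, 1066):.2f} GB/s")
print(f"DDR3-1333, 16-bit: {peak_bandwidth_gbs(16, 1333):.2f} GB/s")

# Hypothetical comparison: a 128-bit (dual-channel) DDR3-1333 system memory bus
print(f"DDR3-1333, 128-bit: {peak_bandwidth_gbs(128, 1333):.2f} GB/s")

# Single-chip sideport sizes: a 512Mb chip and a 1Gb (1024Mb) chip
print(f"512Mb chip: {chip_capacity_mb(512):.0f} MB")
print(f"1Gb chip: {chip_capacity_mb(1024):.0f} MB")
```

Under these assumptions the sideport tops out between roughly 2 and 3 GB/s, and a single chip gives 64 or 128MB, which is why 384MB of pure sideport looks unlikely.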
Old 03-Jul-2011, 09:44   #27
AnarchX
Senior Member
 
Join Date: Apr 2007
Posts: 1,488

Quote:
Originally Posted by ToTTenTranz View Post
Is it that much more?
Bloomfield (3-channel) has 200 more "pins" than Lynnfield (2-channel), and Lynnfield actually has 40M transistors more because of integrated PCI-Express and DMA.
Quote:
Originally Posted by mczak View Post
It might not be that much more but it's still a budget cpu, after all. There is significantly more room for such things on the high end.
What speaks against two sockets: a budget one with 2x 64-bit and a high-end one with 3x 64-bit, which could also be used by the 10-core BD Komodo?
The die could support 3x 64-bit, just as the first K8 CPU supported 2x 64-bit while only 64-bit was used on S.754.
Old 03-Jul-2011, 10:30   #28
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,965

Quote:
Originally Posted by Kaotik View Post
Since it's now "the past", was 6800-series meant to be VLIW4, but 32-40nm case forced it to be VLIW5?
I'd rather guess that 6900 was meant to be 6800 for quite some time...
__________________
English is not my native tongue. Before flaming, please consider the possibility that I did not mean to say what you might have read from my posts.
Work| Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration!
Old 03-Jul-2011, 10:42   #29
Kaotik
Drunk Member
 
Join Date: Apr 2003
Posts: 5,380

Quote:
Originally Posted by CarstenS View Post
I'd rather guess that 6900 was meant to be 6800 for quite some time...
Yeah, that's what I'm thinking too, as in, Barts wasn't supposed to be 6800.
__________________
I'm nothing but a shattered soul...
Been ravaged by the chaotic beauty...
Ruined by the unreal temptations...
I was betrayed by my own beliefs...
Old 03-Jul-2011, 14:07   #30
Dave Baumann
Gamerscore Wh...
 
Join Date: Jan 2002
Posts: 13,521

Relatively speaking, the brand is something that happens exceedingly late in a lifecycle. Engineers deal in codenames, not brands.
__________________
Radeon is Gaming
Tweet Tweet!
Old 03-Jul-2011, 16:33   #31
Kaotik
Drunk Member
 
Join Date: Apr 2003
Posts: 5,380

Quote:
Originally Posted by Dave Baumann View Post
Relatively speaking, the brand is something that happens exceedingly late in a lifecycle. Engineers deal in codenames, not brands.
And switching to these codenames instead of clearly numbered ones makes it difficult for us to follow which was supposed to be what.
Old 03-Jul-2011, 23:50   #32
Alexko
Senior Member
 
Join Date: Aug 2009
Posts: 2,849

Quote:
Originally Posted by Kaotik View Post
And switching to these codenames instead of clearly numbered ones makes it difficult for us to follow which was supposed to be what.
That's pretty much the point!

Well it's supposed to be difficult for NVIDIA, but it incidentally ends up being difficult for us as well.
__________________
"Well, you mentioned Disneyland, I thought of this porn site, and then bam! A blue Hulk." —The Creature
My (currently dormant) blog: Teχlog
Old 04-Jul-2011, 19:18   #33
Arun
Unknown.
 
Join Date: Aug 2002
Location: UK
Posts: 4,914

It's not because it's the same VLIW4 shader core that it must necessarily be the same ALU-TEX ratio. Shaders with a higher ALU-TEX ratio require relatively less bandwidth per instruction and Llano is badly bandwidth limited already. If I had to guess, I'd go for 10 SIMDs (640 SPs) but with one Quad-TMU shared between two SIMDs resulting in the same number of TMUs as Llano (20 TMUs). With higher clocks you'd still have slightly higher TMU and ROP throughput but the die area saving should be worth it. I'd also expect a similar ALU ratio on the first GCN-based GPUs.

I could be wrong, but I don't expect DDR3-2133 to ever be truly mainstream, and it will be hard to find low-voltage DDR3-1866. DRAM price is still a significant part of the BOM, so it'd be counterproductive to force OEMs to pay even more for it.
__________________
Focusing on non-graphics projects in 2013 (but I still love triangles)
"[...]; the kind of variation which ensues depending in most cases in a far higher degree on the nature or constitution of the being, than on the nature of the changed conditions."
Old 04-Jul-2011, 23:04   #34
mczak
Senior Member
 
Join Date: Oct 2002
Posts: 2,691

Quote:
Originally Posted by Arun View Post
It's not because it's the same VLIW4 shader core that it must necessarily be the same ALU-TEX ratio. Shaders with a higher ALU-TEX ratio require relatively less bandwidth per instruction and Llano is badly bandwidth limited already. If I had to guess, I'd go for 10 SIMDs (640 SPs) but with one Quad-TMU shared between two SIMDs resulting in the same number of TMUs as Llano (20 TMUs). With higher clocks you'd still have slightly higher TMU and ROP throughput but the die area saving should be worth it. I'd also expect a similar ALU ratio on the first GCN-based GPUs.
Not that it wouldn't make sense, but I just don't see any such changes happening when there's already a brand new architecture.
Plus, 10 SIMDs with 5 quad-TMUs probably isn't smaller than 8 SIMDs with 8 quad-TMUs anyway. And since it's mostly bandwidth limited, and even ROP limited, it probably doesn't really matter for performance either way, though you're right that in theory 10 SIMDs (but half the TMUs) might be a tiny bit faster.
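
To put numbers on the two layouts being compared, here is a small Python sketch. It assumes 64 SPs per SIMD (from the 640 SPs / 10 SIMDs figure quoted above) and 4 TMUs per quad-TMU; the function name is illustrative only.

```python
SPS_PER_SIMD = 64   # 640 SPs / 10 SIMDs, per the figures quoted above
TMUS_PER_QUAD = 4   # assumption: a quad-TMU counts as 4 TMUs


def alu_tex_ratio(simds: int, quad_tmus: int) -> tuple:
    """Return (total SPs, total TMUs, SPs per TMU) for a given layout."""
    sps = simds * SPS_PER_SIMD
    tmus = quad_tmus * TMUS_PER_QUAD
    return sps, tmus, sps / tmus


# 10 SIMDs with one quad-TMU shared between every two SIMDs
print(alu_tex_ratio(10, 5))
# 8 SIMDs with one quad-TMU each
print(alu_tex_ratio(8, 8))
```

With these assumptions the shared layout comes out at 640 SPs to 20 TMUs (32:1) and the one-per-SIMD layout at 512 SPs to 32 TMUs (16:1). Whether the first is actually smaller on die is a separate question that this arithmetic cannot answer.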

Quote:
I could be wrong, but I don't expect DDR3-2133 to ever be truly mainstream, and it will be hard to find low-voltage DDR3-1866. DRAM price is still a significant part of the BOM, so it'd be counterproductive to force OEMs to pay even more for it.
I don't know if DDR3-2133 will ever be mainstream, though if DDR4 is really only coming in 2014 (barely), it could happen. But not in the Trinity timeframe.
Unfortunately, I bet OEMs will indeed save pennies on memory. Just look at HD 5570 / HD 6570 or similar cards to get an idea: most of them not only don't use GDDR5 (which probably does add significant cost) but actually go for DDR3-667 instead of DDR3-800 (which is usually what the reference cards call for). I don't know how many pennies that saves (can you count them on one hand?), but there is NO WAY the relative performance deficit is worth it... But I guess people buy that stuff... So you can only hope the OEMs will at least use DDR3-1600 for Trinity...
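
As a rough illustration of the DDR3-667 versus DDR3-800 gap, here is a Python sketch. It assumes both figures are memory clocks in MHz on the same bus width, so peak bandwidth scales linearly with the clock; actual game performance would not necessarily drop by the same fraction. The function name is hypothetical.

```python
def relative_bandwidth_deficit(actual_mhz: float, reference_mhz: float) -> float:
    """Fraction of peak bandwidth lost versus the reference memory clock."""
    return 1.0 - actual_mhz / reference_mhz


deficit = relative_bandwidth_deficit(667, 800)
print(f"Bandwidth lost versus reference: {deficit:.1%}")
```

On a card that is already bandwidth limited, a cut of about one sixth in peak bandwidth is a lot to give up for a small saving on memory.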
Old 05-Jul-2011, 13:38   #35
hkultala
Member
 
Join Date: May 2002
Location: Herwood, Tampere, Finland
Posts: 271

Quote:
Originally Posted by Arun View Post
It's not because it's the same VLIW4 shader core that it must necessarily be the same ALU-TEX ratio. Shaders with a higher ALU-TEX ratio require relatively less bandwidth per instruction and Llano is badly bandwidth limited already. If I had to guess, I'd go for 10 SIMDs (640 SPs) but with one Quad-TMU shared between two SIMDs resulting in the same number of TMUs as Llano (20 TMUs).
You cannot just "share" those TMUs. There is one "4-way" TMU in every 16-way SIMT processor, and the internal buses, command structure, etc. won't allow "sharing" it.

The biggest reason why the R700 series was so much more efficient (performance/die size) than the R600 series was putting the TMUs inside the shader processors.

In some low-end integrated models the SIMT processors are 8-way, so those have a different ALU-TMU ratio. The only reasonable way of increasing the ALU-TMU ratio would be a change to 32-way SIMT processors.
But that's not going to happen. It would mean increasing the wavefront size etc., which would result in many other changes.

Last edited by hkultala; 05-Jul-2011 at 13:46.
Old 05-Jul-2011, 14:01   #36
hkultala
Member
 
Join Date: May 2002
Location: Herwood, Tampere, Finland
Posts: 271

Quote:
Originally Posted by Arun View Post
It's not because it's the same VLIW4 shader core that it must necessarily be the same ALU-TEX ratio. Shaders with a higher ALU-TEX ratio require relatively less bandwidth per instruction and Llano is badly bandwidth limited already.
TMUs don't require any bandwidth; it's the code using them that does.
If the code is bandwidth limited, having more TMUs won't make it run any slower.

And there are always those moments when a chip that's "usually" bandwidth limited is not bandwidth limited.

And as you cannot separate those TMUs from the shader processors (without a big change in architecture), it's much more reasonable to just keep those extra TMUs, even when they will be bandwidth-starved most of the time.

And I see no reason to go to 10 shader processors. That would just be overkill and a waste of die size.
Most of the time they would not have anything reasonable to do, because they would be waiting for textures (which would be coming slowly from memory) or waiting for pixels to be drawn.
Old 14-Jul-2011, 03:24   #37
3dcgi
Senior Member
 
Join Date: Feb 2002
Posts: 2,210

Quote:
Originally Posted by hkultala View Post
The biggest reason why the R700 series was so much more efficient (performance/die size) than the R600 series was putting the TMUs inside the shader processors.
The 700 series was more efficient because multiple blocks were rewritten from scratch and others were heavily optimized.
Old 14-Jul-2011, 10:11   #38
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,965

Quote:
Originally Posted by hkultala View Post
And as you cannot separate those TMUs from the shader processors (without a big change in architecture), it's much more reasonable to just keep those extra TMUs, even when they will be bandwidth-starved most of the time.
HD 5450 was the last example I know of where AMD had 80 ALU lanes coupled to 8 TMUs instead of four; RV730 also used the same 1:10 ratio. So, despite this going in the opposite direction, scaling of the ALU-TEX ratio doesn't seem completely absurd.
Old 14-Jul-2011, 13:36   #39
mczak
Senior Member
 
Join Date: Oct 2002
Posts: 2,691

Quote:
Originally Posted by CarstenS View Post
HD 5450 was the last example I know of where AMD had 80 ALU lanes coupled to 8 TMUs instead of four; RV730 also used the same 1:10 ratio. So, despite this going in the opposite direction, scaling of the ALU-TEX ratio doesn't seem completely absurd.
But this changes the number of elements the chip is working on. The "normal" chips have simd width 16 and run an instruction for 4 clocks for granularity 64. Now granted you could probably increase that to 128 but I'm not sure it makes a lot of sense.
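
The granularity arithmetic here is just SIMD width times the number of clocks an instruction runs for. A minimal Python sketch, assuming the 4-clock figure above holds for every width (the 8-way and 32-way cases are the ones raised elsewhere in this thread; the function name is illustrative):

```python
CLOCKS_PER_INSTRUCTION = 4  # each instruction runs for 4 clocks, per the post above


def wavefront_size(simd_width: int) -> int:
    """Number of elements processed together for a given SIMD width."""
    return simd_width * CLOCKS_PER_INSTRUCTION


for width in (8, 16, 32):
    print(f"{width}-way SIMD -> granularity {wavefront_size(width)}")
```

So the 8-way parts work on 32 elements at a time, the normal 16-way parts on 64, and a hypothetical 32-way design would need 128.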
Old 14-Jul-2011, 14:27   #40
hkultala
Member
 
Join Date: May 2002
Location: Herwood, Tampere, Finland
Posts: 271

Quote:
Originally Posted by CarstenS View Post
HD 5450 was the last example I know of where AMD had 80 ALU lanes coupled to 8 TMUs instead of four; RV730 also used the same 1:10 ratio. So, despite this going in the opposite direction, scaling of the ALU-TEX ratio doesn't seem completely absurd.
No, it had an 8-way SIMT processor ("40 ALU lanes") coupled to a 4-way TMU.

And it had two of these processors.

But changing the ALU-TMU ratio in the other direction would mean widening the SIMT width from 16-way to 32-way. That would increase the wavefront size from 64 to 128, and that would mean quite big changes to many things.

You can easily halve the wavefront size, but not double it.
Old 14-Jul-2011, 14:54   #41
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,965

I know that HD 5450 was two times 40/4; this was just an example showing that AMD has experimented with different SIMD/T widths in the past.
Old 28-Jul-2011, 06:04   #42
Erinyes
Member
 
Join Date: Mar 2010
Posts: 372

Not sure where to post this, but I was just thinking about this today. Would Krishna/Wichita also be VLIW4 like Trinity? It would make more sense than being VLIW5, wouldn't it? My earlier speculation was that it would be a Caicos + 2-4 Bobcat cores, like they did for Brazos (Cedar + 2 Bobcat cores). But since they were already developing a VLIW4 architecture for Trinity, it would make sense to carry it over to Krishna/Wichita as well. Maybe Krishna could have 128 VLIW4 SPs, 8 TMUs and 4 ROPs.
Old 31-Aug-2011, 00:14   #43
Jaaanosik
Member
 
Join Date: May 2008
Posts: 143

Quote:
Originally Posted by Erinyes View Post
Not sure where to post this, but I was just thinking about this today. Would Krishna/Wichita also be VLIW4 like Trinity? It would make more sense than being VLIW5, wouldn't it? My earlier speculation was that it would be a Caicos + 2-4 Bobcat cores, like they did for Brazos (Cedar + 2 Bobcat cores). But since they were already developing a VLIW4 architecture for Trinity, it would make sense to carry it over to Krishna/Wichita as well. Maybe Krishna could have 128 VLIW4 SPs, 8 TMUs and 4 ROPs.
Hey, news about Trinity from semiaccurate.

Quote:
Last is the most interesting, a 50% increase in Gigaflops. The cores, going from stars to dragon, probably keeps the core count the same, but Llano gets the overwhelming majority of its flops from the GPU side. The difference in CPU contributed flops is probably a rounding error for this calculation.

That leaves the GPU. If you notice, the GPU is listed as HD7000, aka Graphics Core Next (GCN), aka Southern Islands. That means going from VLIW5 to scalar + VLIW4, whatever the code word for that is. In any case, going from 80 ‘old’ clusters (400 shaders) to 120 ‘new’ (480) clusters is where the majority of the 50% comes from. Throw in an updated memory controller, tighter integration between the sides, and you have not only more speed, but much more exploitable speed. S|A
Old 31-Aug-2011, 00:31   #44
3dilettante
Regular
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 5,401

"scalar + VLIW4" ?

That's wrong regardless of whether he's talking about GCN or VLIW4.
This does not mesh with any rumors or review sites I've seen discussing the GPU for Trinity, some of which stated AMD told them it was VLIW4.
__________________
Dreaming of a .065 micron etch-a-sketch.
Old 31-Aug-2011, 00:38   #45
itsmydamnation
Member
 
Join Date: Apr 2007
Location: Australia
Posts: 824

I wonder if we will get a die shot

Quote:
This does not mesh with any rumors or review sites I've seen discussing the GPU for Trinity, some of which stated AMD told them it was VLIW4.
What do we believe, the slide (could be fake) or the rumors? :P
Old 31-Aug-2011, 00:51   #46
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,237

Quote:
Originally Posted by 3dilettante View Post
"scalar + VLIW4" ?

That's wrong regardless of whether he's talking about GCN or VLIW4.
This does not mesh with any rumors or review sites I've seen discussing the GPU for Trinity, some of which stated AMD told them it was VLIW4.
GCN != scalar + VLIW4.
120 new shaders is not divisible by 64.
GCN barely taped out when they showed a laptop.

The only useful bit is that Trinity will launch right next to IB, possibly even earlier.
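
The divisibility point above can be checked directly. A short Python sketch, assuming a GCN compute unit is 64 ALU lanes wide (which is what the "divisible by 64" remark implies); the function name is made up for illustration:

```python
GCN_CU_LANES = 64  # assumption: one GCN compute unit has 64 ALU lanes


def whole_compute_units(shader_count: int):
    """Return the CU count if the shaders fill whole 64-lane CUs, else None."""
    cus, remainder = divmod(shader_count, GCN_CU_LANES)
    return cus if remainder == 0 else None


for count in (120, 480, 512):
    print(count, "->", whole_compute_units(count))
```

Neither the 120 "clusters" nor the 480 shaders from the article fit into whole 64-lane units, while a count like 512 would.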
Old 31-Aug-2011, 01:27   #47
RedVi
Member
 
Join Date: Sep 2010
Location: Australia
Posts: 266

Being HD7000 series in no way implies it must be GCN. Low-end Southern Islands parts will likely be VLIW4. That it is "7000" series was pretty much a given from a marketing standpoint; whether it uses an all-new architecture or not is entirely separate from that fact.
Old 31-Aug-2011, 01:41   #48
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,237

Good point. I think all that slide means is that the GPU in Trinity will be marketed as 7xxx class, just like Llano has a 6xxx GPU even though it is an Evergreen derivative.
Old 31-Aug-2011, 01:51   #49
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,237

Bingo

http://www.anandtech.com/show/4705/a...-7000-products
Old 31-Aug-2011, 13:14   #50
TKK
Member
 
Join Date: Jan 2010
Posts: 146

Quote:
Originally Posted by rpg.314 View Post
Good point. I think all that slide means is that the GPU in Trinity will be marketed as 7xxx class, just like Llano has a 6xxx GPU even though it is an Evergreen derivative.
You could even say 6xxx encompasses four architectures, or at least architecture revisions:

- Evergreen (6750/6770, renamed 57x0)
- Evergreen + UVD3 (Llano)
- Northern Islands VLIW5 / Enhanced Evergreen (Barts, Turks, Caicos)
- 'True' Northern Islands - VLIW4 (Cayman)


So yeah, I agree that HD 7000 for Trinity is just for marketing; APUs will always lag behind discrete GPUs by at least one generation.

Tags
amd, fusion, intel, ivy bridge, trinity
