Only Cayman (6950) is VLIW4. Trinity is indeed VLIW4 as well according to reports/rumors. Maybe one day we'll actually have some use for VLIW4 (GPGPU). But then AMD did just show us how they want to leave it behind too.
http://www.pcper.com/news/Graphics-Cards/AMD-Fusion-Developer-Summit-2011-Live-Blog
PCPer said:
7:55 Trinity has a "6850" kind of thing....interesting....
7:55 I think that slipped!
7:56 But then he stated Trinity would be "VLIW4" so Cayman-based... interesting.
That was a mix-up. He thought that the 6800 was based on VLIW4. He meant to say that the architecture is based on the 6900 series.
OK, you made me curious, and this is what I found: sideport on RS7xx chipsets is always 16-bit, DDR2 or DDR3. It looks like earlier boards tended to use DDR2-1066 and later ones DDR3-1333 (but that's just a rough guideline). Still, the bandwidth is pathetic in any case.
Desktop versions actually have decently-clocked DDR3 chips.
I've also heard that in some cases it's only a 16-bit bus, but I'm pretty sure the 780G in my Ferrari One is using a 32-bit Sideport with 384MB. The access to UMA is blocked through the BIOS, though.
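To put rough numbers on "pathetic", here is a quick back-of-the-envelope sketch of theoretical peak bandwidth for the sideport configurations mentioned above. The function name is made up for illustration, the transfer rates are the nominal ones from the DDR speed grades, and real-world throughput will be lower.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits: width of the memory bus in bits
    transfer_rate_mts: nominal transfer rate in MT/s (e.g. 1333 for DDR3-1333)
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# Sideport configurations discussed in the thread.
configs = [
    ("16-bit DDR2-1066", 16, 1066),
    ("16-bit DDR3-1333", 16, 1333),
    ("32-bit DDR3-1333", 32, 1333),
]

for label, width, rate in configs:
    print(f"{label}: {peak_bandwidth_gbs(width, rate):.2f} GB/s")
```

Even the 32-bit case only doubles the 16-bit figure at the same speed grade, so the bus width matters as much as the memory type here.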
Is it that much more?
Bloomfield (3-channel) has 200 more "pins" than Lynnfield (2-channel), and Lynnfield actually has 40M transistors more because of integrated PCI-Express and DMA.
It might not be that much more but it's still a budget cpu, after all. There is significantly more room for such things on the high end.
Since it's now "the past": was the 6800 series meant to be VLIW4, but the 32nm/40nm situation forced it to be VLIW5?
I'd rather guess that 6900 was meant to be 6800 for quite some time...
Relatively speaking, the brand is something that happens exceedingly late in a lifecycle. Engineers deal in codenames, not brands.
And switching to these codenames instead of clearly numbered ones makes it difficult for us to follow which was supposed to be what.
Not that it wouldn't make sense, but I just don't see any such changes when there's already a brand new architecture.
It's not because it's the same VLIW4 shader core that it must necessarily be the same ALU-TEX ratio. Shaders with a higher ALU-TEX ratio require relatively less bandwidth per instruction and Llano is badly bandwidth limited already. If I had to guess, I'd go for 10 SIMDs (640 SPs) but with one Quad-TMU shared between two SIMDs resulting in the same number of TMUs as Llano (20 TMUs). With higher clocks you'd still have slightly higher TMU and ROP throughput but the die area saving should be worth it. I'd also expect a similar ALU ratio on the first GCN-based GPUs.
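For anyone who wants the guess spelled out as arithmetic, here is a small sketch comparing ALU-TEX ratios. The Trinity numbers are just the guess from the post above, not a confirmed spec. Llano's 5 SIMDs of 80 SPs is an assumption that isn't stated in this thread (only the 20 TMUs are), and the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuConfig:
    name: str
    simds: int          # number of SIMD engines
    sps_per_simd: int   # stream processors (ALU lanes) per SIMD
    tmus: int           # texture mapping units

    @property
    def sps(self) -> int:
        return self.simds * self.sps_per_simd

    @property
    def alus_per_tmu(self) -> float:
        return self.sps / self.tmus


# Assumed Llano layout: 5 VLIW5 SIMDs x 80 SPs, 20 TMUs.
llano = GpuConfig("Llano (VLIW5)", simds=5, sps_per_simd=80, tmus=20)
# The guess: 10 VLIW4 SIMDs x 64 SPs, one quad-TMU per two SIMDs = 20 TMUs.
guess = GpuConfig("Trinity guess (VLIW4)", simds=10, sps_per_simd=64, tmus=20)

for gpu in (llano, guess):
    print(f"{gpu.name}: {gpu.sps} SPs, {gpu.tmus} TMUs, "
          f"{gpu.alus_per_tmu:.0f} ALU lanes per TMU")
```

Under these assumptions the guess raises the ALU lanes per TMU from 20 to 32 while keeping the TMU count unchanged.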
I don't know if DDR3-2133 will ever be mainstream, though if DDR4 is really only coming in (barely) 2014 it could happen. But not in the Trinity timeframe.
I could be wrong but I don't expect DDR3-2133 to ever be truly mainstream and it will be hard to find low-voltage DDR3-1866. DRAM price is still a significant part of the BOM so it'd be counterproductive to force OEMs to pay even more for it.
The 700 series was more efficient because multiple blocks were rewritten from scratch and others were heavily optimized.
The biggest reason why the R700 series was so much more efficient (performance/die size) than the R600 series was putting the TMUs inside the shader processors.
And as you cannot separate those TMUs from the shader processors (without a big change in architecture), it's much more reasonable to just keep those extra TMUs even when they will be bandwidth-starved most of the time.
But this changes the number of elements the chip is working on. The "normal" chips have SIMD width 16 and run an instruction for 4 clocks, for a granularity of 64. Now granted you could probably increase that to 128 but I'm not sure it makes a lot of sense.
HD 5450 was the last example I know of where AMD had 80 ALU lanes coupled to 8 TMUs instead of four; in RV730 they also used the same 1:10 ratio. So, despite this going in the opposite direction, scaling of the ALU-TEX ratio seems not completely absurd.
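The granularity and ratio figures above are simple products and quotients. A minimal sketch, assuming that "increase that to 128" means treating two 16-wide SIMDs as one 32-wide unit (that reading is my assumption, and the function names are hypothetical):

```python
def granularity(simd_width: int, clocks_per_instruction: int = 4) -> int:
    """Number of elements one instruction covers: lanes times clocks."""
    return simd_width * clocks_per_instruction


def alu_lanes_per_tmu(alu_lanes: int, tmus: int) -> float:
    """ALU lanes served by each TMU."""
    return alu_lanes / tmus


print(granularity(16))            # "normal" chips: 16 wide, 4 clocks
print(granularity(32))            # two SIMDs paired behind one quad-TMU
print(alu_lanes_per_tmu(80, 8))   # HD 5450: 80 ALU lanes, 8 TMUs
```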