Hello again, after a 7-day vacation! Beyond3D needs some more "excitement" from me..
Just kidding (I have a twisted sense of humor)!!!!!!!!!! But I'll go ahead and ruffle things a bit just for the heck of it.
Hopefully, 7 days is enough for Bo_Fox to examine the new information presented within the thread instead of resorting to tit-for-tat, knee-jerk, highly agitating communication.
Thank you for your patience,
AlS
Beyond3D suddenly came to life for a short while with this thread. Too bad it couldn't continue.. I know, you're welcome! BTW, I usually did not initiate the disrespectful, denigrating insults; I was only defending myself with a shield that automatically does a recoiling knee-jerk. Apoppin, who used to be a moderator for a long time at a much bigger forum (Anandtech), said that the posting etiquette of some posters here in this thread was much worse than mine, and that it's an AMD-biased forum (hence ruffling your feathers).. but the AMD bias does not seem to be that bad here (unlike [H] and TPU, where the fanboyism comes out in an all-out infantile way).
One guy (itsmydamnation) was asking: "so how exactly does Barts work then if the shader compiler sends transcendentals to the T unit?" I could very well ask the same thing, "so how exactly does Cayman work then if the shader compiler sends transcendentals to the T unit?", without offering anything concrete - there's no real insight or evidence being presented.
(Sarcastically, I was just proving a point that I could have also claimed Cayman to be VLIW5-based, with the exact same words they all have been using throughout the entire thread, without ever giving some real evidence - anything concrete at all, for goodness' sake).
Lightman said:
Barts has among many other improvements:
- new front end, boosting utilization of shader core
- improved tessellation
- compared to 5830 fully functional 32 ROPs
On top of that game code rarely is limited by shader performance!
There are few purely shader limited tests, Perlin Noise from 3DMark is one of them. Just look at the results from this page and think about them for a second:
http://techreport.com/articles.x/20126/7
Barts performs exactly as you would expect from VLIW5 there.
Look at more complex tests to see how Barts is in line with Cypress on Particle, Cloth, and other tests where Cayman is showing good progress
Lightman's contention falls flat on all counts:
Improved tessellation does nothing to explain the overall performance discrepancy in games that do not use tessellation.
HD 6790 (Barts) has only 16 fully functional ROPs, just like HD 5830. Lightman's very poorly thought-out comments do nothing to explain why the HD 5830, with 33% greater shader -AND- texturing power than the HD 6790, performs no more than 2-3% better than it.
Game code is usually limited by shader performance, which was the case with HD 3870 vs HD 4870 back when games were less shader-heavy!
Regarding Perlin Noise performance (in the same Techreport link given by Lightman), Barts XT actually performs in line with HD 6950 as well as HD 5870, while the 2GB performs differently along with HD 6950, proving nothing.
As for GPU Particles, HD 6870 performs nearly identically to HD 5870 despite having FAR less FLOPS capacity. (OUCH, Lightman, did you really look at it yourself and think about them for a second?)..
As for GPU Cloth, HD 6870 performs much, much more in line with its Cayman cousins rather than with HD 5870, shader-wise.
Even Shader Toy shows HD 6870 to perform much more in line with HD 6950:
Barts XT: 180 / 2016 GFLOPS = 0.0893
HD 5870: 206 / 2720 GFLOPS = 0.0757
HD 6950: 201 / 2253 GFLOPS = 0.0892
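The ratios above are simply the Shader Toy score divided by the theoretical peak GFLOPS. Here is a minimal sketch to recompute them, using only the figures quoted above; the names `cards` and `score_per_gflops` are made up for illustration:

```python
# Shader Toy score and theoretical peak GFLOPS, as quoted in the post above.
cards = {
    "Barts XT": (180, 2016),
    "HD 5870": (206, 2720),
    "HD 6950": (201, 2253),
}


def score_per_gflops(score, gflops):
    """Benchmark score normalized by theoretical shader throughput."""
    return score / gflops


ratios = {name: score_per_gflops(s, g) for name, (s, g) in cards.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

Normalized this way, Barts XT comes out at roughly 0.089 per GFLOPS, next to the HD 6950 rather than the HD 5870.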
3dcgi said:
I don't know what's happening in your example, but a VLIW5 SIMD has more shader power than a VLIW4 SIMD so it could perform faster if the shader's co-issue well and have transcendental instructions. The advantage of VLIW4 is smaller area and much of the time the extra unit doesn't provide an advantage.
Yet Barts XT performs as if the "extra unit" of VLIW5 (if it really had VLIW5) were always providing an advantage (rather than "much of the time not").
CarstenS said:
Short excerpt, so that there's at least something posted on the internet.
A roughly even mixture of a lengthy shader not doing anything useful with MUL, MADD, MIN, MAX and SQRT (an AMD program from HD2900 launch basically)
HD 5870: 1,206 GI/s. (Giga-Instructions per second)
HD 6870: 893 GI/s.
HD 6970: 877 GI/s.
HD 7970: 1,101 GI/s.
Thank you!!!!! If HD 7970 were VLIW5-based, it should have had 39% more GI/s (or GFLOPS), then minus 20% due to the missing VLIW5 unit (1,224 GI/s if the performance were in line with the FLOPS capacity, which is more than a 10% discrepancy - especially given that Tahiti has so many other things going for it like bandwidth, L2, etc.). Other factors could be in play here, so I cannot rule that out yet.
Thank you, CarstenS, for the first "something posted on the internet" that actually leans toward Barts being VLIW5-based. It could be interpreted as Carsten saying: "Since you B3D guys posted nothing yet on the internet, I'm the first one to take the positive initiative."
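To see how CarstenS's numbers line up against paper specs, here is a rough sketch that divides each GI/s result by theoretical peak GFLOPS. Assumptions: the "1.206" and "1.101" figures are read as 1206 and 1101 GI/s (dot as thousands separator), the 2720 and 2016 GFLOPS values are the ones used earlier in this post, and the 2703 (HD 6970) and 3789 (HD 7970) GFLOPS values are the commonly quoted single-precision peaks, which do not come from CarstenS's post.

```python
# (measured GI/s, theoretical peak GFLOPS) per card.
# GI/s values are from CarstenS's excerpt, read with "." as a thousands separator.
# GFLOPS for HD 6970 and HD 7970 are assumed paper specs, not from the thread.
results = {
    "HD 5870": (1206, 2720),
    "HD 6870": (893, 2016),
    "HD 6970": (877, 2703),  # assumed peak GFLOPS
    "HD 7970": (1101, 3789),  # assumed peak GFLOPS
}

# Instruction throughput achieved per theoretical GFLOPS.
gi_per_gflops = {name: gi / gflops for name, (gi, gflops) in results.items()}

for name, value in gi_per_gflops.items():
    print(f"{name}: {value:.3f} GI/s per GFLOPS")
```

Under those assumptions the HD 5870 and HD 6870 land at almost exactly the same GI/s per GFLOPS, while the HD 6970 and HD 7970 sit noticeably lower, which fits the reading that this particular test leans toward Barts behaving like Cypress.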
Carsten gets the love:
Mintmaster said:
Bo_Fox said:
Why does Barts XT perform so well in games against Cayman specs-wise if it's VLIW5 rather than VLIW4?
It doesn't. In some games one architecture is more efficient, and in others vice versa (which, BTW, is very clear evidence that Barts and Cayman have different architectures). Look at Crysis and Stalker, where the 6950 is 23-30% and 29-40% faster, respectively, than the 6870.
Looking at Crysis and Stalker does not give "very clear evidence" sufficient to draw such an absolute conclusion. The 6950 has 39% greater texturing power, 19% more bandwidth, etc. than the 6870, so it is to be expected that in some games it really shines through. You do present a strong case, though, since both games are heavily shader-bound. However, the Cypress cards still do not pull ahead of the 6870 enough for things to make absolute sense just yet.
The reason:
Later drivers show the 6870 to be within 15-20% of the 6950, rather than up to 40% as was indicated with the early benchmarks (at the time of launch), using the same settings:
http://www.anandtech.com/show/5153/nvidias-geforce-gtx-560-ti-w448-cores-gtx570-on-a-budget/4
It is stated there, though: "On cards with 1GB of VRAM or less it can be overly taxing, but with more than 1GB of VRAM the bottleneck shifts to rendering."
Thanks for trying to make a good find, though. :good:
Mintmaster said:
Bo_Fox said:
Why does Barts XT absolutely destroy HD 5850 and HD 5870, specs-wise, by a ridiculous margin?
In what world does that happen? Or are you normalizing performance to shader count?
You're making the same mistake that many people do: Shader performance is just one part of overall performance, and often less than half of a game's rendering time is limited by shaders. This is very clear when you compare the 9600GT to the 9800GT. Both are 256-bit, 16 ROP cards with equal bandwidth and similar clocks. However, the 9600GT has only 64 SPs to the 8800GT/9800GT's 112, yet the former is almost as fast as the latter in games. That's why the 7950 gets crushed by the 7970 in compute benchmarks, but only lags a bit in most games. By your logic, then, the 7950 and 9600GT are more efficient than the 7970 and 9800GT, and must have a better architecture.
The 9600GT is an excellent example, thank you very much. :smile: But you must not forget that the 9600GT has higher clocks, etc... far from "similar" as you put it. The 8800GT was already somewhat more bottlenecked by the bandwidth and ROPs (which still gave an 8800GTX about 20% advantage overall). The 8800GT has 25% greater overall gaming performance than the 9600GT, so it's not "almost as fast". By the way, I do notice the 7770 Cape Verde being astoundingly efficient, at 94 Voodoopower, compared against HD 7970 with roughly 3x the specs on paper, but only 220 Voodoopower.
It's just that Barts XT has something rather magical in it.
Perhaps Barts XT really has 1280 VLIW5 shaders, or what is it EXACTLY about Barts' improved front-end that makes it perform amazingly well given the specs?
Mintmaster said:
Bo_Fox said:
Why does HD 6790 perform about the same as HD 5830 if the latter has 33% more shader and texturing power, with other specs being roughly the same - if BOTH are VLIW5?
The 5830 has always been an underperformer, taking a bigger hit vs the 5850 than the 6790 takes vs the 6850, despite similar handicaps. It's an outlier, so that comparison is meaningless.
Hardly, since HD 5830 actually has a whopping 33% more FLOPS capability and 33% more texturing power than the 6790. Both handicaps are pretty similar, given that both have the same cache structure, and the same VLIW5 architecture as is claimed. The 5830 actually has just as many shaders and TMUs as the highest-end Barts! To say that it's an outlier after considering this is just as meaningless in that context.
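For anyone who wants to check the 33% figure, here is a small sketch that derives peak FLOPS and texel rate from paper specs. The shader counts, TMU counts, and clocks below are the commonly listed reference specs and are assumptions on my part (they are not quoted anywhere in this thread); the helper names are made up for illustration.

```python
# Assumed reference specs: (stream processors, TMUs, core clock in MHz).
specs = {
    "HD 5830": (1120, 56, 800),
    "HD 6790": (800, 40, 840),
}


def peak_gflops(sps, mhz):
    """Peak single-precision GFLOPS: one multiply-add (2 FLOPs) per SP per clock."""
    return sps * 2 * mhz / 1000


def texel_rate(tmus, mhz):
    """Peak texture fill rate in GTexels/s: one texel per TMU per clock."""
    return tmus * mhz / 1000


gflops = {name: peak_gflops(sps, mhz) for name, (sps, _, mhz) in specs.items()}
texels = {name: texel_rate(tmus, mhz) for name, (_, tmus, mhz) in specs.items()}

flops_ratio = gflops["HD 5830"] / gflops["HD 6790"]
texel_ratio = texels["HD 5830"] / texels["HD 6790"]

print(f"FLOPS advantage of HD 5830: {(flops_ratio - 1) * 100:.0f}%")
print(f"Texturing advantage of HD 5830: {(texel_ratio - 1) * 100:.0f}%")
```

With those assumed specs, both ratios work out to about 1.33, i.e. the 33% advantage on paper discussed above.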