Haswell vs Kaveri

Discussion in 'Architecture and Products' started by AnarchX, Feb 8, 2012.

  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Ray tracing workloads tend to be irregular (especially for non-primary rays), so they are likely to perform better on architectures that use a narrow SIMD width. This is just a hypothesis, but AMD's 64-wide vectors/wavefronts might be partially responsible for the observed performance in Luxmark.
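
    A toy model of that hypothesis, as a minimal sketch: it is not a measurement and says nothing about how Luxmark works internally. It assumes each ray needs a random number of traversal steps and that a SIMD group keeps all its lanes occupied until its slowest lane finishes. The step range (1 to 40) and the widths 8, 16 and 64 are illustrative assumptions.

```python
import random

def simd_utilization(trip_counts, width):
    """Fraction of lane-cycles doing useful work when lanes are packed
    into groups of `width` and each group runs until its slowest lane ends."""
    useful = sum(trip_counts)
    issued = 0
    for start in range(0, len(trip_counts), width):
        group = trip_counts[start:start + width]
        # Every lane in the group is held for as long as the longest lane.
        issued += width * max(group)
    return useful / issued

rng = random.Random(0)
# Hypothetical workload: each secondary ray needs 1..40 traversal steps.
rays = [rng.randint(1, 40) for _ in range(64 * 256)]

for width in (8, 16, 64):
    print(f"SIMD{width}: {simd_utilization(rays, width):.2f} lane utilization")
```

    With contiguous grouping, a 64-wide group can never be better utilized than the eight 8-wide groups it contains, so utilization can only fall as the width grows. How much it falls on real hardware depends on how coherent the rays are.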
     
  2. Psycho

    Regular

    Joined:
    Jun 7, 2008
    Messages:
    745
    Likes Received:
    39
    Location:
    Copenhagen
    Naah.. 7870 is 37% faster than 6970, even with 6% less peak flops and 14% less bandwidth..
    [IMG]

    edit: the gap seems to have widened with newer drivers:
    [IMG][IMG]
     
    #482 Psycho, Jun 3, 2013
    Last edited by a moderator: Jun 3, 2013
  3. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    That's probably part of the reason why Intel does so well, but the other part is that AMD's APUs underperform equivalent discrete parts by a lot in this test. The A10-5800K should be at 1/4 of the 6970's performance, not 1/9th, and GCN makes a decent difference as well. It's a low-bandwidth test. It's possible that the APUs have an architectural omission causing this, but I'm thinking that this is a driver or throttling issue.

    Still, fantastic showing by Intel. Does anyone know what the CPU alone gets in that test?
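
    For reference, the "1/4" figure can be roughly reproduced from peak shader throughput. This is a sketch under assumptions: the ALU counts and clocks below come from public spec sheets, not from this thread, and each ALU is assumed to retire one multiply-add (2 flops) per clock.

```python
def peak_gflops(alus, clock_mhz):
    # Assumes one multiply-add (2 flops) per ALU per clock.
    return 2 * alus * clock_mhz / 1000.0

# Assumed figures from public spec sheets, not from this thread.
apu = peak_gflops(alus=384, clock_mhz=800)      # A10-5800K (HD 7660D)
hd6970 = peak_gflops(alus=1536, clock_mhz=880)  # Radeon HD 6970

print(f"A10-5800K: {apu:.0f} GFLOPS")
print(f"HD 6970:   {hd6970:.0f} GFLOPS")
print(f"ratio:     1/{hd6970 / apu:.1f}")
```

    Peak flops ignore bandwidth, caching and throttling, so this only shows where the expectation comes from, not why the APU falls short of it.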
     
  4. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,489
    Likes Received:
    400
    Location:
    Varna, Bulgaria
    The lack of global caching on all the pre-GCN architectures from AMD is also responsible for the poor RT/PT performance. Memory access in those algorithms is also highly irregular, even with acceleration structures, and this is one of the weakest points for any massively parallel machine.

    Remember, when Fermi hit the market, we were all blown away by its performance in SmallPT right here, in B3D. ;)
     
  5. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Good point.
     
  6. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Some more numbers are here:
    http://techreport.com/review/24879/intel-core-i7-4770k-and-4950hq-haswell-processors-reviewed/13

    Interestingly, the OCL CPU ICD apparently does not yet support FMA (which casts some doubt on AVX2 support in general). Also interesting to see the eDRAM having a positive effect on a pure CPU workload - enough to bump it past the higher clocked desktop chip.

    I'm also curious how the load balancing happens for the "CPU + GPU" modes on the IGPs... that's the case where significant TDP/thermal constraints are going to kick in for the i7-4950HQ. It still seems to do better overall than purely using the IGP, but I'd love to see what frequencies get chosen for the CPU/GPU/ring under such a load.

    The whole TR review is worth a read - interesting stuff, and happy to see them actually test some of the new stuff in Haswell a bit (AVX2, FMA) rather than the other reviews so far that run some mostly single threaded code, scoff at the mere 10% or so IPC gains and call it disappointing... I feel like the memo on parallelism went out many years ago now ;)
     
  7. jimbo75

    Veteran

    Joined:
    Jan 17, 2010
    Messages:
    1,211
    Likes Received:
    0
    Interesting to see how close the HD 4600 is to Trinity and the Iris Pro 5200 in Tech Report's review. Hexus got very different numbers, with Trinity thrashing the HD 4600 to the tune of ~35%, so we're still looking at pretty wildly different results depending on the game.

    http://hexus.net/tech/reviews/cpu/56005-intel-core-i7-4770k-22nm-haswell/?page=13

    AMD really needs to get the finger out though as clearly Richland isn't going to open up much more of a gap and a GT3 i3 could be a bit too close for comfort.
     
  8. Wynix

    Veteran Regular

    Joined:
    Feb 23, 2013
    Messages:
    1,052
    Likes Received:
    57
    TBH I don't even see the point of Richland; it just seems like wasted time/effort for little gain.

    I hope AMD jumps on the 20nm train fast. How long it took them to get their CPUs/APUs onto 28nm is a joke.
     
  9. Kaarlisk

    Regular Newcomer Subscriber

    Joined:
    Mar 22, 2010
    Messages:
    293
    Likes Received:
    49
    Still, it becomes clear that the caching that already exists is really great – an enormous L4 cache only gives relatively low gains. On the other hand, as I understood it, somebody already mentioned the option of incorporating a smaller eDRAM cache in the CPU die directly, where it would have less latency and maybe more bandwidth.

    Considering how much IPC Ivy already has, I'd call those 10% gains awesome, especially since they did not come at a disproportionate cost. I mean, look at ARM – quad-core A5 SoCs are a tier below dual-core A7 SoCs, probably because the die cost of more cores is cheaper than the die cost of cores with higher IPC; the same story as with A7/A15.
    And clearly, parallelism has not been figured out yet. Which is actually why Intel can afford devoting more and more die space exclusively to the GPU. If there was a consumer application for 6 to 10 cores, Intel would have to think much harder about segmentation – i.e. there would be a market for consumer desktop chips with more cores, not higher performing GPUs.

    Also, it does not seem that Intel really cares about parallelism for consumers. I'm one of those people who would gladly buy an i7-4xxxR chip (BGA with GT3e on desktop), except that TSX is disabled on this chip – and if TSX does become very useful for consumer applications, it will be unavailable to me.
     
  10. leoneazzurro

    Regular

    Joined:
    Nov 3, 2005
    Messages:
    518
    Likes Received:
    25
    Location:
    Rome, Italy
    The point of Richland is to have something more competitive until Kaveri arrives; if that is due in Q4, it will miss the "back to school" season and maybe Christmas. And if Intel goes up 5-10% in performance due to IPC gains and AMD goes up 5-10% due to frequency gains at the same TDP (I'm referring to the CPU part only), then the gap will remain the same. It's true that on the GPU side the gap will be much reduced, if it exists at all. I'm referring to the HD 4600 here; Iris Pro is too expensive for a comparison with Trinity/Richland. And price is again on AMD's side: comparing the newer flagship from Intel with the mainstream APU is interesting from a theoretical point of view, but it is pointless from a market perspective, as they are not competing against each other.
    Also, I'm not convinced that a respin (which Richland seems to be) is really a big effort for AMD.

    @jimbo75: TR also tests the platforms with DDR3-2133, and I think everyone knows GPUs of the Trinity/Richland/HD 4600 class are held back by bandwidth, too...
    Also, AT tested the HD 4600 with DDR3-2400 and Trinity with DDR3-2133, which is not a completely fair comparison, but anyway...
     
    #490 leoneazzurro, Jun 4, 2013
    Last edited by a moderator: Jun 4, 2013
  11. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    9,966
    Likes Received:
    4,561
    Why not look at price points before sentencing Richland to an early death?

    The most expensive Richland is priced at $150, so it's a competitor to the i3 line.
    The Haswell line starts at $190.
    Richland is not competing with Haswell. And since there's no desktop i3 Haswell, Richland will be competing with the Ivy Bridge i3 models.
    And yes, Richland's iGPU is quite a bit faster than the i3's HD 2500.

    Reviewers who get samples for free tend to forget how important the price point is and like to sing victory to a product that costs twice as much as its nearest competitor.

    AMD is simply not competing against the Haswell line with APUs at the moment, period.
     
  12. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Agreed, it's really only a big win if you have a working set in the "dozens of MBs" sort of size range of course. That's quite common for graphics, but less common for CPU workloads; that said, some of that is because they have been optimized for current L3$ sizes :)
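
    A back-of-the-envelope sketch of why the win is concentrated in that size range. It assumes uniformly random accesses over the working set, so the expected hit rate is simply the fraction of the working set that fits in the cache. The 8 MB and 128 MB capacities stand in for an L3-sized and an eDRAM-sized cache, and the working-set sizes are illustrative; real access patterns have locality this model ignores.

```python
def hit_rate(working_set_mb, cache_mb):
    # Uniformly random accesses: the chance a line is resident equals
    # the fraction of the working set that fits in the cache.
    return min(1.0, cache_mb / working_set_mb)

L3_LIKE_MB = 8       # illustrative L3-sized capacity (assumption)
EDRAM_LIKE_MB = 128  # illustrative eDRAM-sized capacity (assumption)

for working_set_mb in (4, 32, 512):
    small = hit_rate(working_set_mb, L3_LIKE_MB)
    large = hit_rate(working_set_mb, EDRAM_LIKE_MB)
    print(f"{working_set_mb:>4} MB working set: "
          f"{small:.2f} -> {large:.2f} hit rate")
```

    In this model the small working set already fits either way, the very large one thrashes both, and only the middle case moves from mostly missing to mostly hitting.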

    Yep totally agreed, that was the point I was trying to make :) People keep unrealistically expecting their legacy stuff that uses 1-4 threads to keep getting faster when that is clearly not going to happen indefinitely... frankly I consider *any* IPC improvements at this stage to be minor miracles.

    Yeah, the TSX segmentation seems like a poor idea to me, especially with it not being supported on the K-series parts. I don't get that at all. Granted, it is more important with more cores, but still.

    To consumers, sure, but the point is that in reality they simply set the price depending on how competitive it is in practice. You don't think they *want* to charge $300 for it? You think they are choosing to not have a high-end competitive part? Obviously not, especially with all of the noise they have been making about APUs.

    So sure, they're not going to commit suicide by pricing their stuff above higher performing parts, but the retail prices are sort of incidental in an architectural discussion.
     
  13. Kaarlisk

    Regular Newcomer Subscriber

    Joined:
    Mar 22, 2010
    Messages:
    293
    Likes Received:
    49
    Good point :D it is always interesting to find places where this kind of situation exists

    Segmenting the K-series parts like this is, IMHO, fine. If you really need VT-d (and, it seems, TSX) right now, you're probably a server/HPC guy, and Intel probably wants server/HPC guys to buy more chips instead of overclocking.
     
  14. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    VT-d would be useful to a very minor segment of the population that would like to do gaming in a VM (and that segment could become a tiny bit less minor).

    It ought to be an "enterprise" feature needing VMware Expensive Edition (tm) or an IBM mainframe, but it is technically available for free with Xen, much like some people run FreeBSD or FreeNAS with ZFS at home.
     
  15. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,298
    Likes Received:
    396
    Location:
    Australia
    Lack of VT-d is the reason I bought an 8350 instead of a 3770K. ESXi is free; anyone with a home "file server" should be running ESXi/Hyper-V etc. It just makes things easy. Sure, it's a small part of the market, but it also tends to be the market that puts together the BOMs when enterprise and government go out to buy stuff as well :lol:.
     
  16. borden

    Newcomer

    Joined:
    Apr 24, 2013
    Messages:
    18
    Likes Received:
    0
  17. A1xLLcqAgt0qc2RyMz0y

    Regular

    Joined:
    Feb 6, 2010
    Messages:
    987
    Likes Received:
    278
    Why did you avoid the 3770 (non-K version)?
    Since no-one wants instability in servers you can rule out over-clocking as a reason.

    It has all features enabled including VT-d.
    http://ark.intel.com/products/65719
     
  18. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    O/c with an unlocked multiplier and a CPU that is always able to do 1GHz over its default clock isn't really o/c. Such rigs usually have an aftermarket cooler and case fans, are memtested and tested for stability and heat at prolonged 100% CPU use, and we don't make a Serious Business lose one million dollars if one crashes.

    Not sure why you would want to run a file server from an o/ced 8350 though. (but making your main rig the NAS is something to do if you wouldn't ever need the NAS to be on while main rig is down)
     
  19. Lightman

    Veteran Subscriber

    Joined:
    Jun 9, 2008
    Messages:
    1,804
    Likes Received:
    475
    Location:
    Torquay, UK
    #499 Lightman, Jun 5, 2013
    Last edited by a moderator: Jun 6, 2013
  20. itsmydamnation

    Veteran Regular

    Joined:
    Apr 29, 2007
    Messages:
    1,298
    Likes Received:
    396
    Location:
    Australia
    Because with the workloads I run, a 3770 has a performance deficit.
    Logical fallacy. My 8350 is overclocked to 4600 MHz and undervolted to 1.275 volts. It's rock-solid stable.
    I pass through my RAID card to a FreeNAS VM to run ZFS.
    I run file, print, domain, firewall (Sidewinder), SSL VPN (F5 BIG-IP), an IPTV server with real-time transcoding to H.264, and a DLNA server (Mezzmo), again with real-time transcoding. I also run very large network simulations/development using things like dynamips/virtual ASR/qemu etc.

    It spends a very large part of the day at 100% utilisation.

    I know that.

    The price difference also allowed for an additional SSD, so I could mirror my guest OSes' drive.
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.