Why does Intel lose in Gaming

Discussion in 'PC Hardware, Software and Displays' started by Kaotik, Dec 9, 2005.

  1. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,185
    Likes Received:
    1,841
    Location:
    Finland
    We all know AMD dominates the gaming market at the moment when it comes to performance - but exactly why is this?
    Part of the reason surely lies in the long pipeline of the P4, which takes a hit when there are a lot of branches - when the program flow is more or less linear, P4s rock.

    However, this might not be the only reason - take a look at this ExtremeTech article:
    http://www.extremetech.com/article2/0,1697,1895945,00.asp

    It suggests that by choosing slightly different options when compiling a game, P4s especially would see big gains (they mention 5-10%, 10-15%, and even 20%), while possibly not harming AMD performance at all (it's apparently a bit unclear whether this would affect AMD CPUs at all, or only certain models).
    The main point is that the games (BF2 in their tests) are apparently compiled with Pentium Pro / Pentium II / Pentium III optimizations, instead of taking real advantage of SSE/SSE2 etc., which would only need one switch "turned" by the developer.
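    The "one switch" in question is most likely a compiler target flag; the actual BF2 build settings are not public, so this is a hypothetical sketch. A plain scalar loop like the one below gets P6-era x87 code under the default settings of compilers of the time, but SSE2 code with MSVC's /arch:SSE2 or GCC's -msse2 -mfpmath=sse, with no source change at all:

```c
#include <stddef.h>

/* A typical game-style hot loop: linearly blending two float arrays.
   Compiled with default (Pentium Pro-era) settings this becomes x87
   code; recompiled with /arch:SSE2 (MSVC) or -msse2 -mfpmath=sse (GCC),
   the identical source is emitted as SSE2 instructions instead. */
void blend(float *dst, const float *a, const float *b,
           float t, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + t * (b[i] - a[i]);
}
```

    The catch, discussed further down the thread, is that a binary built this way will not run on CPUs without SSE2 at all.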
     
  2. Cartoon Corpse

    Veteran

    Joined:
    Apr 22, 2004
    Messages:
    2,216
    Likes Received:
    22
    that's pretty odd, if it's true. conspiracy? something to get/keep AMD afloat in the market for competition's sake?
     
  3. Himself

    Regular

    Joined:
    Sep 29, 2002
    Messages:
    381
    Likes Received:
    2
    Riddick has several builds for plain x86, SSE and SSE2, as well as AMD64, so some games do do this.

    I'd say it's because the P4 was really designed around RAMBUS and without it, it's not working in an ideal environment. It hasn't upgraded its bus since forever, and doesn't have an on-die memory controller. Games are mostly about moving data to the video card, same reason the Quake games ran so much better on the Celerons vs the K6.

    Also could be that at the resolutions people play at, the cpu isn't the determining factor anymore so developers don't need to super tweak cpu performance.
     
  4. Blacklash

    Newcomer

    Joined:
    Feb 26, 2004
    Messages:
    219
    Likes Received:
    3
    Try this article at the same site. It doesn't explain precisely why AMD is better, but it does demonstrate why you would want one over an Intel, particularly for gaming. In addition to being faster overall they are smoother, i.e. they spend less time crashing to 0 FPS and in the rough/low unplayable FPS ranges.

    http://www.extremetech.com/article2/0,1697,1854918,00.asp
     
  5. arjan de lumens

    Veteran

    Joined:
    Feb 10, 2002
    Messages:
    1,274
    Likes Received:
    50
    Location:
    gjethus, Norway
    AFAICS, after hardware T&L, deep command queues etc started appearing, the processor effort involved in pushing data to the GPU has dropped by a rather large amount and is not by itself very important these days. The difference between Intel and AMD performance in games is AFAIK mainly due to branch mispredict penalties, memory latency and perhaps use of dynamic dispatch; in particular, AI and physics code tend to have poor branch/memory access predictability, which hurts P4 much more than A64.

    The P4 also has some odd, more technical performance issues; e.g. if two data elements are spaced exactly 64 Kbytes (or a multiple) apart, the P4 cannot hold both in L1 cache at the same time, causing you to randomly lose 20-30% performance every once in a while. IIRC, this particular issue has caused Nvidia in particular a lot of headaches; for every driver change they make, they run an extensive set of performance regression tests, and this issue basically ensured that no matter what part of the driver they changed, some other, unrelated part would go 20% slower.
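    The collision happens because, at one stage of the P4 pipeline, the L1 data cache distinguishes lines only by the low 16 address bits, so two addresses an exact multiple of 64 KB apart look identical to it. A minimal sketch of the condition (the function name is illustrative, and the check itself is portable even though the penalty is P4-specific):

```c
#include <stdint.h>

/* Distance at which two addresses alias in the Pentium 4's L1D:
   the cache's alias check only looks at address bits below 2^16. */
enum { P4_ALIAS_STRIDE = 64 * 1024 };

/* Returns 1 if alternating accesses to a and b would evict each
   other from the P4's L1 cache despite the cache having room --
   the source of the "random" 20-30% swings described above. */
int addresses_alias_64k(const void *a, const void *b)
{
    uintptr_t d = (uintptr_t)b - (uintptr_t)a;
    return d % P4_ALIAS_STRIDE == 0;
}
```

    This is also why an unrelated code or data layout change elsewhere in a binary can shift two hot buffers into (or out of) a 64 KB-multiple spacing and swing benchmark results.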
     
  6. Sxotty

    Veteran

    Joined:
    Dec 11, 2002
    Messages:
    4,894
    Likes Received:
    344
    Location:
    PA USA
    A64 supports SSE2 as well so why would the P4 get a boost and not the A64?
     
  7. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,185
    Likes Received:
    1,841
    Location:
    Finland
    How's the SSE2 speed on A64?
     
  8. karlotta

    karlotta pifft
    Veteran

    Joined:
    Jun 7, 2003
    Messages:
    1,292
    Likes Received:
    10
    Location:
    oregon
    Not really at high res. If you run a CPU-only bench then the FX-57 is 20 frames faster! Huge! BUT, when you run at 1600x1200 it's a wash... the Athlon 64 still outperforms, just not by very much.
     
  9. wireframe

    Veteran

    Joined:
    Jul 14, 2004
    Messages:
    1,347
    Likes Received:
    33
    The Athlon 64, and its predecessor, were designed to handle floating point math very well by having multiple FPUs. The FPU pipeline is not as straightforward as the integer pipeline or SSE (Streaming SIMD). What the K7 and K8 actually do well is mask the intermediate steps, giving you high floating point throughput. However, once you switch to SSE you have a simpler processing pipeline, and all that logic the K7/K8 has is no longer needed. That is to say, SSE, or streaming SIMD operation, can use reduced logic with a more straightforward pipeline. It becomes about clock rates again, because the NetBurst and K8 SSE units/pipelines are roughly the same. So the K8 gains a little over FPU, but NetBurst gains a lot, especially because it doesn't like FPU operations in the first place.

    Another factor is the Athlon 64 memory controller. This controller's proximity to the core/logic makes it excellent for highly granular data. However, it doesn't do a whole lot for streaming memory operations.

    Of course both, or perhaps I should say a new architecture, could have multiple parallel SSE units, but these would require a lot of bandwidth to feed. Another solution would be to have separate clock domains. By using streaming operations you are, in many ways, paying for reduced complexity with bandwidth.
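    To make the x87-vs-SSE contrast above concrete: x87 pushes one 80-bit value at a time through a register stack, while a single SSE instruction operates on four packed 32-bit floats. A small sketch using the standard SSE intrinsics (available on both P4 and K8):

```c
#include <xmmintrin.h>  /* SSE intrinsics header */

/* Adds four pairs of floats with one SSE addition (addps). The same
   work in x87 takes four serialized fadd operations through the
   register stack -- the stack-juggling logic the K7/K8 is so good at
   hiding is simply not needed on this path. */
void add4(float out[4], const float a[4], const float b[4])
{
    __m128 va = _mm_loadu_ps(a);   /* load 4 floats, unaligned OK */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
```

    Since both architectures retire this kind of code through broadly similar SSE pipelines, the comparison does largely come down to clock speed and how fast memory can feed the units.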
     
  10. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    Because NV and ATI are just now figuring out that HT can help performance if you multi-thread the drivers?

    P.S. This was mostly a quip, btw, so AMD-lovers don't pin my ears back too badly --it did blow my mind tho to see ATI going "Oh, yeah, huh --works with HT too."
     
  11. YeuEmMaiMai

    Regular

    Joined:
    Sep 11, 2002
    Messages:
    579
    Likes Received:
    4
    I would say the fact that the AMD has a shorter pipeline has a lot to do with it.
     
  12. Sxotty

    Veteran

    Joined:
    Dec 11, 2002
    Messages:
    4,894
    Likes Received:
    344
    Location:
    PA USA
    Yeah, but ideally your CPU would already have been maxed out with AI, physics and other stuff. I think this is more of a stopgap measure personally, but we will see; maybe with quad cores etc. coming out it will always be useful, as game developers perhaps won't be up to using all the CPUs, but then perhaps not. (Maybe I just don't like wasting system resources to check whether there are excess resources available ;) not to mention the whole thermal throttling bit if you use certain CPUs at max load.)
     
  13. Tahir2

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,978
    Likes Received:
    86
    Location:
    Earth
    Superior FPU power, larger L1 cache, shorter pipelines, onboard memory controller, DDR1 latencies inherently lower.

    It is only recently (relatively) that AMD's processors have shown their advantage against the Intel based processors. This is not only down to AMD but the direction Intel has taken with its Prescott core.
     
  14. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    The problem with that is that it will crash on CPUs that don't support SSE/SSE2. Compiling for P2/P3 is probably done because those CPUs need the extra performance more than the top CPUs do. Still though, it's not that hard to provide several different executables for a range of different CPUs.
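    An alternative to shipping several executables is a single binary that detects the CPU once at startup and dispatches through a function pointer. A sketch using GCC/Clang's `__builtin_cpu_supports` (the update functions are illustrative stand-ins; in a real game the SSE2 version would live in a separate translation unit compiled with -msse2):

```c
#include <stdio.h>

/* Illustrative stand-ins for the generic-x86 and SSE2-compiled
   builds of the same hot function. */
static void update_generic(void) { puts("generic x86 path"); }
static void update_sse2(void)    { puts("SSE2 path"); }

typedef void (*update_fn)(void);

/* Pick the implementation once at startup, then call through the
   pointer everywhere: no per-call branching, and no crash on CPUs
   that lack SSE2, because the SSE2 code is never reached there. */
update_fn select_update(void)
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return update_sse2;
    return update_generic;
}
```

    This is essentially what the Riddick example earlier in the thread does at the granularity of whole executables rather than individual functions.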
     
  15. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Not bad, but not excellent either. You'll get better performance with 3DNow, and P4 will generally run SSE/SSE2/SSE3 code faster than A64.
     
  16. wireframe

    Veteran

    Joined:
    Jul 14, 2004
    Messages:
    1,347
    Likes Received:
    33
    Are you sure about this? I don't think this is right at all. I think the Athlon 64 actually does a bit better in SSE2 per clock than a Pentium 4.


    ...Which goes back to my argument above that once SSE code is used the IPC difference between K8 and NetBurst becomes lower, and it becomes a battle of clocks and memory bandwidth.
     
  17. Skrying

    Skrying S K R Y I N G
    Veteran

    Joined:
    Jul 8, 2005
    Messages:
    4,815
    Likes Received:
    61
    So the problem is that they could compile the games for a number of instruction sets, each one giving a CPU a gain. But instead of doing this they just compile the game for a lower-end CPU and expect the higher-end ones to bear it out, which AMDs do better than Intels.
     
  18. suryad

    Veteran

    Joined:
    Aug 20, 2004
    Messages:
    2,479
    Likes Received:
    16
    Really? I wonder why they compile the games like that...lowest common denominator?
     
  19. Skrying

    Skrying S K R Y I N G
    Veteran

    Joined:
    Jul 8, 2005
    Messages:
    4,815
    Likes Received:
    61
    To sell to a wider user base. The people with top-of-the-line CPUs don't need the extra attention, so they compile for the group that does.
     
  20. Tahir2

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    2,978
    Likes Received:
    86
    Location:
    Earth
    We are not talking about per clock, and even then it is debatable whether the A64 is faster per clock at SSE2 code than the Pentium 4 architecture.

    Per clock is also irrelevant, the Pentium 4 has more clocks than the A64 as it is.

    I don't remember where but there was an article that measured the SSE2 speed of the A64 vs the P4 and the A64 was behind in all instances.

    This could be similar to the way the A64 is faster at x86-64 than the Pentium CPUs are.
     