Bring back high performance single core CPUs already!

Discussion in 'PC Hardware, Software and Displays' started by Frontino, Apr 10, 2012.

  1. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    Yes, you are correct. I chose to drag Skyrim in here as a game that had previously demonstrated scaling beyond 4 cores, and someone rightfully asked if that held true after all the recent patching.

    I felt it necessary to properly answer the question, and the answer was generally "no", scaling did NOT hold true after the newer patches, at least when playing using graphics settings that the ultra-enthusiast is probably going to use. I guess you could say that I was doing the proper due diligence to either support or refute my claim, and it kinda went 50/50 for me :D

    Negatives: six-core scaling appears to be zero (or perhaps even slightly negative?). Meh.
    Positives: it still needs a minimum of two cores to be playable, preferably four.

    If I can figure out how to get vsync turned off this weekend, I'll re-run all these tests without the FPS cap. I think we may still find some additional data lurking under there...
     
  2. swaaye

    swaaye Entirely Suboptimal
    Legend

    Joined:
    Mar 15, 2003
    Messages:
    7,846
    Location:
    WI, USA
    I find Skyrim to run ok on a dual core. I know some kids who've put hundreds of hours into it on old low-clocked Core 2 Duo CPUs with a 3850 and a 4670. Granted they play at 1360x768 / 1280x1024 and they settle for medium detail. Also, the fixed compiler settings made nice gains for old CPUs.

    The game engine seems to scale similarly to Oblivion with core count (i.e., it essentially maxes out with 2 cores). I'm sure you wouldn't want to play on a middling Athlon 64 X2 or a Pentium D. The graphics are probably the most demanding aspect compared to Oblivion, and you really want at least a 4850 / GTX 260. An 8800GT tears up Oblivion - not so with Skyrim.
     
  3. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Location:
    PT, EU
    Skyrim has an internal 64 Hz limitation. This limit, coupled with the usual 59/60 Hz refresh rate, is the reason for the odd and occasional studdering (not stuttering) where the game appears to skip frames while displaying "60fps" on your favourite fps counter.
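
    A rough way to see why a 64 Hz internal rate clashes with a 60 Hz display is to count how many internal ticks land inside each refresh interval. This is only a sketch under assumed behaviour (ideal 64 Hz ticks, an ideal 60 Hz refresh, both starting in phase); how Skyrim actually paces its frames is not described in this thread, and the function name is made up for illustration.

```python
import math
from fractions import Fraction


def ticks_per_refresh(tick_hz=64, refresh_hz=60, refreshes=60):
    """Count internal ticks that complete during each display refresh interval.

    Assumes ideal, jitter-free clocks that start in phase (illustration only).
    """
    counts = []
    for i in range(refreshes):
        # Interval i spans (i / refresh_hz, (i + 1) / refresh_hz] seconds.
        # Ticks fire at k / tick_hz, so count the integers k inside
        # (start * tick_hz, end * tick_hz]; Fractions keep this exact.
        start = Fraction(i * tick_hz, refresh_hz)
        end = Fraction((i + 1) * tick_hz, refresh_hz)
        counts.append(math.floor(end) - math.floor(start))
    return counts


if __name__ == "__main__":
    counts = ticks_per_refresh()
    print("ticks in one second:", sum(counts))
    print("refreshes that swallow two ticks:", counts.count(2))
```

    With the defaults, 4 of the 60 refreshes each second contain two internal ticks, so one tick's worth of motion is never shown on its own. That would look like a skipped frame even though the counter still reads 60.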

    There are some options for getting around it but most of them have the side effect of throwing the physics out of whack (even more than normal for TES).

    Anyway, great machine and thanks for your contribution. It also matches my anecdotal evidence: makes use of multiple cores, doesn't use up all my 8 threads. Do you play at uGrids=7? I play at 9 and have a mere i7 2700k + 8gb + R6970 @ 1080p max details + ini tweaks. You ought to be able to raise it to 11 (started crashing for me). Can you indulge me a little bit more and try setting your shadow buffers to 8192? It's playable for me and it noticeably increases the quality since I'm using "real-time" shadow updates (0 delay, 0 interval) but it means an average fps of 30 instead of 60.
     
  4. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    12,747
    I had that problem with Alpha Prime when trying to test out steamy's theory
     
  5. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    It is my observation that there is still performance left on the table if you're only using a dual core. I've severely bottlenecked my rig on the GPU by my excessive use of SSAA and high res. If I flip to MSAA, there is a measurable change between 2c / 2t and 2c / 4t or similarly 4c / 4t. But this is not to say that you couldn't play (and enjoy) Skyrim on a truly dual core rig, especially if you're at lower settings.

    Ah, that makes more sense now. I hadn't done any research on it yet to discover why....

    Yeah, the game settings that I used are my normal play settings, to include the SSAA as there's a lot of shader aliasing in this game and it drives me nuts. I think I tried ugrids=9 a while back, but I encountered performance issues on my old Q9450 + 5850 rig. Makes sense to go back, so I'll check it out.

    Yeah, I can do that. Maybe I'll run 1024, 2048, 4096, 8192 buffers through a few core options. Need to find a place with shadows all over the place, but that's not hard. Do you recall the settings for 'real-time' shadow updates? If not, I'm sure a small bit of time on Das Google will get me straight.
     
    #45 Albuquerque, Apr 13, 2012
    Last edited by a moderator: Apr 13, 2012
  6. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    12,580
    Very interesting and thanks for running that. So basically for an enthusiast who overclocks, looking at your 4.5 GHz numbers, all you need is a dual core without HT to just about max out performance in game, at least with that OC'd video card. For non-overclocking situations, a 2 core with 4 threads seems optimal.

    With 2 cores and 4 threads there's no difference. Moving to 4 cores with 4 threads gives you ~1.7% more perf. Moving to 4 cores with 8 threads gives you the max perf bump, at ~4.2%.

    That last setting is a bit weird as higher core/thread counts revert back to the 4 core 4 thread speed. I wonder if the CPU is throttling at those higher core counts due to increased heat generation from more cores being active.

    But it basically reinforces what Carsten was saying about him not really needing to move up from his dual core CPU, assuming he can reach high enough clockspeeds.

    Regards,
    SB
     
  7. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    I want to stress again, we're hitting a GPU bottleneck around 47FPS with my SSAA usage. There is still performance on the table after 2c/2t if you're not choking your card to this degree.

    Also, the CPU temperature peaked at 56°C during all this exercise, and "turbo" is effectively null on the K and X series CPUs when you overclock them. It operates at 4.5 GHz under *any* load, although it will still idle at 1.5 GHz if you're not doing anything.
     
  8. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    12,580
    On my K series CPU, turbo still works fine. I think support for it is entirely up to the motherboard vendor. On my motherboard when I set overclock speeds I set speeds for 1 core active, 2 core active, etc... So I can have very high single core speeds without being limited to the max OC for 4 cores active.

    But anyway, back to that. Yes. That's true, so obviously it's not throttling on the CPU. But it still hits the limit where an enthusiast's GPU is going to limit any potential benefits of more than 2 cores.

    And yes, as I said before there are games that are an exception to that. And as sebbbi mentioned, it's quite likely a byproduct of the majority of AAA PC games being ports from console.

    So when the next generation of consoles hits, hopefully we'll also see better use of multiple cores in the PC space.

    Also, that isn't to say that more than 2 cores aren't useful in non-gaming situations. :) You don't have to preach to me about multi-core. I was using dual CPUs in the desktop space all the way back when the Celeron 300A could not only overclock from 300 to 450 MHz reliably, but also worked with certain server motherboards sporting dual sockets. :) I've been a convert ever since.

    Regards,
    SB
     
  9. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    Oh yeah, I guess I forgot about that pain in the ass ;) The Intel DX79Si does allow for that kind of tweaking for six cores, but doing that per-core for a six core rig sucks. I just flipped the bit to allow one turbo multiplier for everything, and set it to 36. There is also a 'non-turbo' multiplier which is still set to 32, and then the processor will still continue to downclock below that point if you aren't using it.

    Actually, by capping the "All Turbo" multiplier to 24, it limits the entire CPU to 3 GHz even though the "non turbo" multiplier is still set at 32. I didn't expect it to work that way...
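
    The multiplier numbers above can be checked with simple arithmetic. A minimal sketch, assuming a 125 MHz base clock: that value is not stated in the post, it is inferred from a multiplier of 36 giving 4.5 GHz and 24 giving 3 GHz. The function name is hypothetical.

```python
def clock_ghz(multiplier, bclk_mhz=125.0):
    """Core clock in GHz for a given multiplier and base clock (BCLK).

    The 125 MHz default is an assumption inferred from the thread,
    not a value the poster reported.
    """
    return multiplier * bclk_mhz / 1000.0


if __name__ == "__main__":
    for label, multiplier in [("turbo 36", 36), ("turbo capped at 24", 24)]:
        print(label, "->", clock_ghz(multiplier), "GHz")
```

    Under that assumption, 36 gives 4.5 GHz and 24 gives 3.0 GHz, matching the clocks quoted in the thread.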
     
  10. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Location:
    PT, EU
    The worst perf hit with 8k buffers happened in the Dwemer ruins of Markarth. But you can see the difference immediately with shadows in a town.

    4K:
    [IMG]

    8K:
    [IMG]

    To have the sun shadows update continuously, you need to edit these lines in your Skyrim.ini (not SkyrimPrefs.ini):

    Code:
    fSunShadowUpdateTime=0.000 
    fSunUpdateThreshold=0.000
    It's a little weird at first.
     
  11. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,406
    Location:
    Guess...
    Cheers sebbbi, that's the kind of post I come here for: great insight into the relative performance of those CPUs.
     
  12. imaxx

    Newcomer

    Joined:
    Mar 9, 2012
    Messages:
    131
    Location:
    cracks
    @Davros: Netburst was a very interesting CPU architecture. Not a good one, but very interesting. Pushed to the limits, at a 4 GHz base clock, it was running internally at an amazing 8 GHz frequency(!!). Problem is, the delta gained with a higher base clock (something like 33% if I remember well) was lost due to the compromises (32-stage pipeline) required to reach such a clock. AMD has done the same with BD to raise its clock, bringing its pipeline to more or less the same length as the original Netburst architecture (around 20-23). A risky choice, considering the precedent, at least (but BD sucks hard because of the shared decoder, anyway).

    @Albuquerque: you missed my point. In order to discuss on the same basis using a complex toy like Skyrim, you should be able to:
    * isolate the multitasking parts in Skyrim (usually sound, AI, script engine).
    * isolate the memory/cache subsystem impact.
    * analyse the % of the multithreaded work (i.e. the single-thread part of the rendering engine + the time spent multithreading, i.e. the parallel octree descents for occlusion + the sync/issue time spent for threads).

    I was referring to the boundaries you hit when trying to maximize the performance of an application. If you want to measure them, you can just write a simple INT app that uses a #pragma parallel for in order to issue a % of its work to threads (and inlines prefetchx's!). There you can see such data 'in clean'; Skyrim (or Windows bootup!!) is just too complex for that, unless you can comply with the points above...

    The chart I attached implies that the benefits obtained in a multithreaded application decrease more than linearly with the core count due to a number of factors, which is probably why Intel didn't come out with a 12+HT core CPU for the consumer market.
    So, once the benefits of adding more cores scale down to minimal values, any IPC increase can affect the average system speed more than adding another core.

    In a sense, it can apply to GPUs too: speeding up the clock can give better performance than adding more cores, if the time wasted scheduling/issuing the additional work to the added cores eats too much of their advantage.
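
    The diminishing-returns argument above can be illustrated with Amdahl's law, a standard textbook model. It is not necessarily what the attached chart plotted, and it ignores the scheduling/sync overhead mentioned above, so real scaling would be worse. A minimal sketch; the 80% parallel fraction and the 10% IPC gain are made-up inputs, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only part of the work is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def marginal_gain(parallel_fraction, cores):
    """Relative gain from adding one more core to an existing core count."""
    before = amdahl_speedup(parallel_fraction, cores)
    after = amdahl_speedup(parallel_fraction, cores + 1)
    return after / before - 1.0


def cores_where_ipc_wins(parallel_fraction, ipc_gain, max_cores=1024):
    """Smallest core count at which one extra core helps less than ipc_gain.

    ipc_gain is treated as a uniform speedup of the whole program,
    which is itself a simplification.
    """
    for cores in range(1, max_cores):
        if marginal_gain(parallel_fraction, cores) < ipc_gain:
            return cores
    return None


if __name__ == "__main__":
    for n in (1, 2, 4, 6, 12):
        print(n, "cores ->", round(amdahl_speedup(0.8, n), 2), "x")
    print("10% IPC beats one more core from:", cores_where_ipc_wins(0.8, 0.10))
```

    With 80% of the work parallel, going from 5 to 6 cores adds about 8%, so in this toy model a 10% IPC gain already beats the sixth core.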
     
  13. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    12,747
    Ahh, now I understand where you're getting confused: the ALUs were double pumped, so at 4 GHz they were running at 8 GHz effective, not actual; they were still clocked at 4 GHz.
    Like DDR 200 doesn't actually run at 200 MHz, it's 100 MHz, but because it deals with 2 lots of data per cycle it's the equivalent of SDR running twice as fast.
     
  14. imaxx

    Newcomer

    Joined:
    Mar 9, 2012
    Messages:
    131
    Location:
    cracks
    No, you are wrong, sorry.
    Dual-pump DDR is just a way of transferring more data in the same wave, and has nothing to do with it.
    The Netburst ALU was running at double frequency - let me quote you the IA arch manual on my desk:
    "Netburst... Arithmetic Logic Units (ALUs) run at twice the processor frequency", Vol1, 2-7.
     
  15. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    12,747
    ok you win :D
     
  16. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    sebbbi brought up a point that I hadn't considered -- low power (i.e., low speed) processors that try to 'make up for it' by having more cores; do they succeed? My next batch of testing now has a 1.5 GHz speed to try and test that out. I also liked Richard's 8192 shadowmap resolution, but I couldn't get uGridsToLoad=9 to be stable... So I went for uGridsToLoad=11 :D Don't ask me why the higher one worked and the lower one didn't...

    I also turned off SSAA (so I'm only using 4xMSAA + FXAA now) for this group of tests, to leave a bit more room for the CPU to show us what's going on. Besides, the hardest-core enthusiasts would probably trade off my love for SSAA and go back to MSAA to get their framerate into the 60's.

    Here are the pertinent changes to Skyrim.ini:
    Code:
    [General]
    uExterior Cell Buffer=144
    uGridsToLoad=11
    iPreloadSizeLimit=126877696
    
    [Display]
    iShadowMapResolutionPrimary=8192
    fSunShadowUpdateTime=0.000 
    fSunUpdateThreshold=0.000
    And here are the pertinent changes to SkyrimPrefs.ini:
    Code:
    [Display]
    iShadowMapResolutionSecondary=8192
    iShadowMapResolutionPrimary=8192

    I also added a new 'cave' location, actually a chunk of the ruins under Markarth. It's almost purely fillrate limited, as it's just an active shadow cast against an otherwise static backdrop. I put this in here to see if the CPU could bottleneck even something as 'simple' as this scene...
    [IMG]

    And here are the results:
    Code:
    c/t     GHz     City    Cave
    ----------------------------
    6/12    1.5     29.5    59.1
            3.0     59.1    59.1
            4.5     59.1    59.1

    6/6     1.5     29.5    59.1
            3.0     58.1    59.1
            4.5     59.1    59.1

    4/8     1.5     29.5    59.1
            3.0     58.1    59.1
            4.5     59.1    59.1

    4/4     1.5     29.5    59.1
            3.0     58.1    59.1
            4.5     59.1    59.1

    2/4     1.5     23.3    59.1
            3.0     50.1    59.1
            4.5     59.1    59.1

    2/2     1.5     16.5    59.1
            3.0     40.5    59.1
            4.5     59.1    59.1

    1/2     1.5     15.5    43.3
            3.0     32.0    59.1
            4.5     52.2    59.1

    1/1     1.5     10.5    32.5
            3.0     24.5    58.1
            4.5     34.4    58.1
    Look at the 1.5 GHz data! sebbbi is on to something, I believe :) The "Cave" scene shows that even a mostly fillrate limited scene still needs two physical cores, and so does the "City" scene (same one from my first test but now with the enhanced uGrids and shadows), although four threads is best if you're not going to overclock.
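
    To make the scaling in the table easier to read, here is a small sketch that turns the City column into speedups relative to the 1/1 configuration. The numbers are copied from the table above; the helper names are made up, and treating 59.1 as the frame cap is an assumption based on it being the highest value that appears.

```python
# City-scene FPS from the results table, keyed by cores/threads and then GHz.
CITY_FPS = {
    "6/12": {1.5: 29.5, 3.0: 59.1, 4.5: 59.1},
    "6/6": {1.5: 29.5, 3.0: 58.1, 4.5: 59.1},
    "4/8": {1.5: 29.5, 3.0: 58.1, 4.5: 59.1},
    "4/4": {1.5: 29.5, 3.0: 58.1, 4.5: 59.1},
    "2/4": {1.5: 23.3, 3.0: 50.1, 4.5: 59.1},
    "2/2": {1.5: 16.5, 3.0: 40.5, 4.5: 59.1},
    "1/2": {1.5: 15.5, 3.0: 32.0, 4.5: 52.2},
    "1/1": {1.5: 10.5, 3.0: 24.5, 4.5: 34.4},
}

# Assumption: the highest figure in the table is the vsync/frame cap.
FPS_CAP = 59.1


def scaling_vs_baseline(clock_ghz, baseline="1/1"):
    """Speedup of every configuration over the baseline at one clock speed."""
    base = CITY_FPS[baseline][clock_ghz]
    return {cfg: round(fps[clock_ghz] / base, 2) for cfg, fps in CITY_FPS.items()}


def capped_configs(clock_ghz):
    """Configurations that already sit at the frame cap at this clock speed."""
    return [cfg for cfg, fps in CITY_FPS.items() if fps[clock_ghz] >= FPS_CAP]


if __name__ == "__main__":
    for clock in (1.5, 3.0, 4.5):
        print(clock, "GHz:", scaling_vs_baseline(clock))
        print("  at cap:", capped_configs(clock))
```

    At 1.5 GHz the speedup over 1/1 flattens out at 4/4. Note that 29.5 is almost exactly half of 59.1, so that plateau may be vsync dropping to half rate rather than a pure CPU limit; that is a guess, and only re-running without the cap would settle it.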
     
  17. Richard

    Richard Mord's imaginary friend
    Veteran

    Joined:
    Jan 22, 2004
    Messages:
    3,508
    Location:
    PT, EU
    Very nice! Either my 6970's 2GB or its 256-bit bus (or both) is the bottleneck for 8K buffers. Thanks for the test, and thanks for some hard numbers on core/thread versus clock scaling.
     
  18. Mendel

    Mendel Mr. Upgrade
    Veteran

    Joined:
    Nov 28, 2003
    Messages:
    1,342
    Location:
    Finland
    Get Core i7 2600k. Overclock to 5GHz, should be easy with any decent cooling, then disable cores you don't need. Problem solved ;)
     
  19. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,744
    Location:
    Guess ;)
    You don't even need the i7-2600k; your best bang for the buck is more likely the i5-2500k. Use the extra $80 to buy more video card, or one of the Corsair H80 watercooler setups on sale. Lots of clock without lots of noise :)

    The 3930k will do 5 GHz with some VRM cooling, but there's zero reason for me to run it that fast. At the highest settings, I run out of GPU before I run out of CPU.
     
  20. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    9,105
    Location:
    La-la land
    Can't be the video RAM; if the GPU were redrawing basically its entire on-board memory space each frame, there wouldn't be enough bandwidth to maintain even a semi-decent framerate.

    Besides, maxing out 2GB is quite hard. A framebuffer at 2560*1440 and 8x MSAA "only" eats 112.5MB, so there's loads of room left.
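
    The 112.5MB figure can be reproduced with a short calculation. A minimal sketch, assuming 4 bytes per sample (32-bit colour) and counting only the multisampled colour buffer; depth/stencil and resolve targets would add to it. The function name is made up, and the 8192 shadow map line assumes a 32-bit format, which is a guess and not something stated in the thread.

```python
def buffer_mib(width, height, samples=1, bytes_per_sample=4):
    """Size of a render target in MiB (1 MiB = 1024 * 1024 bytes).

    bytes_per_sample=4 assumes a 32-bit format; that is an assumption.
    """
    return width * height * samples * bytes_per_sample / (1024 * 1024)


if __name__ == "__main__":
    print("2560x1440, 8x MSAA:", buffer_mib(2560, 1440, samples=8), "MiB")
    # Hypothetical: one 8192x8192 shadow map at 4 bytes per texel.
    print("8192x8192 shadow map:", buffer_mib(8192, 8192), "MiB")
```

    Under those assumptions the framebuffer comes to 112.5 MiB and a single 8192 shadow map to 256 MiB, so both fit in 2GB with room to spare.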
     
