Old 14-Apr-2012, 00:42   #51
pjbliverpool
B3D Scallywag
 
Join Date: May 2005
Location: Guess...
Posts: 5,585
Default

Quote:
Originally Posted by sebbbi View Post
Unfortunately most games are mainly designed for consoles and do not properly scale up on PC. It's very easy to draw wrong conclusions by using modern multicore PC CPUs to run game code that is designed for ancient (7 year old) in-order console CPUs.

A 1.6 GHz 17 W Sandy Bridge is (considerably) more powerful than an old 6 thread in-order PPC CPU at 3.2 GHz. Basically, if you are running a direct port designed originally for consoles, a high end Sandy Bridge could execute all six threads sequentially every frame using just a single core and still hit the required frame rate. And that's why you don't see any scaling when you add more cores, even if the game is programmed to utilize up to six of them.

If you want to properly test the multithreaded scaling of games, you should get an entry level CPU with lots of cores/threads. For example a 4 thread Atom or a lowest clocked 6 core Phenom. Or even better, downclock an 8 core Bulldozer to less than 1 GHz. The scaling will be much better, and you will see huge gains by enabling extra cores.
Cheers sebbbi, that's the kind of post I come here for - great insight into the relative performance of those CPUs.
__________________
PowerVR PCX1 -> Voodoo Banshee -> GeForce2 MX200 -> GeForce2 Ti -> GeForce4 Ti 4200 -> 9800Pro -> 8800GTS -> Radeon HD 4890 -> GeForce GTX 670 DCUII TOP

8086 8 MHz -> Pentium 90 -> K6-2 233 MHz -> Athlon 'Thunderbird' 1 GHz -> AthlonXP 2400+ 2 GHz -> Core2 Duo E6600 2.4 GHz -> Core i5 2500K 3.3 GHz
pjbliverpool is online now   Reply With Quote
Old 14-Apr-2012, 09:57   #52
imaxx
Junior Member
 
Join Date: Mar 2012
Location: cracks
Posts: 94
Default

@Davros: Netburst was a very interesting CPU architecture. Not a good one, but very interesting. Pushed to the limits at a 4 GHz base clock, it was running internally at an amazing 8 GHz (!!). The problem is, the delta gained from the higher base clock (something like 33%, if I remember well) was lost to the compromises (a 32-stage pipeline) required to reach such clocks. AMD has done the same with BD to raise its clock, bringing its pipeline to more or less the same length as the original Netburst architecture (around 20-23 stages). A risky choice, considering the precedent, at least (but BD sucks hard because of the shared decoder, anyway).

@Albuquerque: you missed my point. In order to discuss on the same basis using a complex toy like Skyrim, you would need to:
* isolate the multithreaded parts of Skyrim (usually sound, AI, script engine);
* isolate the memory/cache subsystem impact;
* analyse the percentage of the work that is multithreaded (i.e. the single-thread part of the rendering engine + the time spent multithreading, e.g. the parallel octree descents for occlusion, + the sync/issue time spent on threads).

I was referring to the limits you hit when trying to maximize the performance of an application. If you want to measure them, you can just write a simple integer app that uses a #pragma omp parallel for to issue a percentage of its work to threads (and inline some prefetches!). There you can see that data in isolation - Skyrim (or Windows bootup!!) is just too complex for it, unless you can satisfy the points above...
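
For illustration, a minimal sketch of that kind of toy benchmark (assuming OpenMP; the workload size and the 20% serial fraction are arbitrary choices, purely for demonstration):
Code:
#include <omp.h>
#include <stdio.h>

/* Toy integer workload: a fixed serial portion plus a parallel loop.
   Vary OMP_NUM_THREADS and watch how the run time scales. */
int main(void)
{
    const long N = 200000000L;
    volatile long serial_sum = 0;
    long parallel_sum = 0;
    double t0 = omp_get_wtime();

    /* roughly 20% of the work kept single-threaded on purpose */
    for (long i = 0; i < N / 5; ++i)
        serial_sum += i & 0xFF;

    /* the rest is split across however many threads are available */
    #pragma omp parallel for reduction(+:parallel_sum)
    for (long i = 0; i < N; ++i)
        parallel_sum += i & 0xFF;

    double t1 = omp_get_wtime();
    printf("threads=%d  time=%.3f s  (sums %ld %ld)\n",
           omp_get_max_threads(), t1 - t0, (long)serial_sum, parallel_sum);
    return 0;
}
Compile with gcc -fopenmp and run it with OMP_NUM_THREADS=1,2,4,... to see the scaling curve directly.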

The chart I attached implies that the benefits of adding cores to a multithreaded application fall off more than linearly with the core count, due to a number of factors - which is probably why Intel hasn't come out with a 12-core + HT CPU for the consumer market.
So, once the benefit of adding further cores drops to minimal values, any IPC increase can, on average, improve system speed more than adding another core would.

In a sense, the same can apply to GPUs: raising the clock can give better performance than adding more cores, if the time spent scheduling/issuing the additional work to the added cores eats up too much of their advantage.
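
As a rough illustration of why the gains fall off, here's a textbook Amdahl's law calculation (not my chart - the 10% serial fraction is just an assumed figure):
Code:
#include <stdio.h>

/* Amdahl's law: speedup(n) = 1 / (s + (1 - s) / n),
   where s is the serial fraction of the work (assumed 10% here). */
int main(void)
{
    const double s = 0.10;
    const int cores[] = { 1, 2, 4, 6, 8, 12 };
    for (int i = 0; i < 6; ++i) {
        int n = cores[i];
        printf("%2d cores: %.2fx speedup\n", n, 1.0 / (s + (1.0 - s) / n));
    }
    return 0;
}
With a 10% serial fraction, doubling from 6 to 12 cores only moves the speedup from 4.0x to about 5.7x, so at some point a modest IPC or clock bump beats another pair of cores.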
imaxx is offline   Reply With Quote
Old 14-Apr-2012, 11:05   #53
Davros
Darlek ******
 
Join Date: Jun 2004
Posts: 10,806
Default

Quote:
Originally Posted by imaxx View Post
@Davros: Netburst was a very interesting CPU architecture. Not a good one, but very interesting. Pushed to the limits at a 4 GHz base clock, it was running internally at an amazing 8 GHz (!!).
Ahh, now I understand where you're getting confused: the ALUs were double pumped, so at 4 GHz they were running at 8 GHz effective, not actual - they were still clocked at 4 GHz.
It's like DDR-200: it doesn't actually run at 200 MHz, it runs at 100 MHz, but because it deals with two lots of data per cycle it's the equivalent of SDR running twice as fast.
__________________
Guardian of the Bodacious Three Terabytes of Gaming Goodness™
Davros is offline   Reply With Quote
Old 14-Apr-2012, 12:02   #54
imaxx
Junior Member
 
Join Date: Mar 2012
Location: cracks
Posts: 94
Default

Quote:
Originally Posted by Davros View Post
Ahh, now I understand where you're getting confused: the ALUs were double pumped, so at 4 GHz they were running at 8 GHz effective, not actual - they were still clocked at 4 GHz.
It's like DDR-200: it doesn't actually run at 200 MHz, it runs at 100 MHz, but because it deals with two lots of data per cycle it's the equivalent of SDR running twice as fast.
No, you are wrong, sorry.
Dual-pumped DDR is just a way of transferring more data within the same clock cycle, and has nothing to do with it.
The Netburst ALU was running at double frequency - let me quote the IA architecture manual on my desk:
"Netburst... Arithmetic Logic Units (ALUs) run at twice the processor frequency", Vol. 1, 2-7.
imaxx is offline   Reply With Quote
Old 14-Apr-2012, 13:47   #55
Davros
Darlek ******
 
Join Date: Jun 2004
Posts: 10,806
Default

OK, you win.
Davros is offline   Reply With Quote
Old 14-Apr-2012, 19:03   #56
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

sebbbi brought up a point that I hadn't considered - low power (i.e. low speed) processors that try to 'make up for it' by having more cores; do they succeed? My next batch of testing now includes a 1.5 GHz speed to test that out. I also liked Richard's 8192 shadow map resolution, but I couldn't get uGridsToLoad=9 to be stable... so I went for uGridsToLoad=11. Don't ask me why the higher one worked and the lower one didn't...

I also turned off SSAA (so I'm only using 4x MSAA + FXAA now) for this group of tests, to leave a bit more room for the CPU to show us what's going on. Besides, the hardest-core enthusiasts would probably trade away my love of SSAA and go back to MSAA to get their framerate into the 60s.

Here are the pertinent changes to Skyrim.ini:
Code:
[General]
uExterior Cell Buffer=144
uGridsToLoad=11
iPreloadSizeLimit=126877696

[Display]
iShadowMapResolutionPrimary=8192
fSunShadowUpdateTime=0.000 
fSunUpdateThreshold=0.000

And here are the pertinent changes to SkyrimPrefs.ini:
Code:
[Display]
iShadowMapResolutionSecondary=8192
iShadowMapResolutionPrimary=8192


I also added a new 'cave' location, actually a chunk of the ruins under Markarth. It's almost purely fillrate limited, as it's just an active shadow cast against an otherwise static backdrop. I put this in here to see if the CPU could bottleneck even something as 'simple' as this scene...


And here are the results:
Code:
c/t	GHz	City	Cave
----------------------------
6/12	1.5	29.5	59.1
	3.0	59.1	59.1
	4.5	59.1	59.1
	
6/6	1.5	29.5	59.1
	3.0	58.1	59.1
	4.5	59.1	59.1

4/8	1.5	29.5	59.1
	3.0	58.1	59.1
	4.5	59.1	59.1

4/4	1.5	29.5	59.1
	3.0	58.1	59.1
	4.5	59.1	59.1

2/4	1.5	23.3	59.1
	3.0	50.1	59.1
	4.5	59.1	59.1

2/2	1.5	16.5	59.1
	3.0	40.5	59.1	
	4.5	59.1	59.1

1/2	1.5	15.5	43.3
	3.0	32.0	59.1
	4.5	52.2	59.1

1/1	1.5	10.5	32.5
	3.0	24.5	58.1
	4.5	34.4	58.1
Look at the 1.5 GHz data! sebbbi is on to something, I believe. The "cave" scene shows that even a mostly fillrate limited scene still needs two physical cores, and so does the "City" scene (the same one from my first test, but now with the enhanced uGrids and shadows), although four threads are best if you're not going to overclock.
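
For anyone who wants to approximate this kind of core-count test without toggling cores in the BIOS (I'm not claiming this is how the table above was produced), here's a minimal Win32 sketch that pins the current process to the first N logical processors; note that whether logical processors 0 and 1 are two real cores or HT siblings of one core depends on the machine:
Code:
#include <windows.h>
#include <stdio.h>

/* Hypothetical helper: restrict the current process to the first n
   logical processors, then run the workload to be measured. */
static int limit_to_n_logical_cpus(int n)
{
    DWORD_PTR mask = (n >= (int)(8 * sizeof(DWORD_PTR)))
                         ? ~(DWORD_PTR)0
                         : (((DWORD_PTR)1 << n) - 1);
    return SetProcessAffinityMask(GetCurrentProcess(), mask) != 0;
}

int main(void)
{
    if (!limit_to_n_logical_cpus(2))
        fprintf(stderr, "SetProcessAffinityMask failed: %lu\n", GetLastError());
    /* ... launch or time the benchmark here ... */
    return 0;
}
The same thing can be done from the command line with "start /affinity <hexmask> <program>" if you'd rather not write code.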
__________________
"...twisting my words"
Quote:
Originally Posted by _xxx_ 1/25 View Post
Get some supplies <...> Within the next couple of months, you'll need it.
Quote:
Originally Posted by _xxx_ 6/9 View Post
And riots are about to begin too.
Quote:
Originally Posted by _xxx_8/5 View Post
food shortages and huge price jumps I predicted recently are becoming very real now.
Quote:
Originally Posted by _xxx_ View Post
If it turns out I was wrong, I'll admit being stupid
Albuquerque is offline   Reply With Quote
Old 16-Apr-2012, 19:22   #57
Richard
Mord's imaginary friend
 
Join Date: Jan 2004
Location: PT, EU
Posts: 3,513
Default

Very nice! Either my 6970's 2 GB or its 256-bit bus (or both) is the bottleneck for 8K buffers. Thanks for the test, and thanks for some hard numbers on core/thread versus clock scaling.
__________________
The optimist proclaims that we live in the best of all possible worlds, and the pessimist fears this is true. - James Branch Cabell
Richard is offline   Reply With Quote
Old 16-Apr-2012, 21:20   #58
Mendel
Mr. Upgrade
 
Join Date: Nov 2003
Location: Finland
Posts: 1,337
Default

Get a Core i7-2600K, overclock it to 5 GHz (should be easy with any decent cooling), then disable the cores you don't need. Problem solved.
Mendel is offline   Reply With Quote
Old 16-Apr-2012, 23:22   #59
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

Quote:
Originally Posted by Mendel View Post
Get a Core i7-2600K, overclock it to 5 GHz (should be easy with any decent cooling), then disable the cores you don't need. Problem solved.
You don't even need the i7-2600K; your best bang for the buck is more likely the i5-2500K. Use the extra $80 to buy more video card, or one of the Corsair H80 watercooler setups on sale. Lots of clock without lots of noise.

The 3930K will do 5 GHz with some VRM cooling, but there's zero reason for me to run it that fast. At the highest settings, I run out of GPU before I run out of CPU.
Albuquerque is offline   Reply With Quote
Old 17-Apr-2012, 00:01   #60
Grall
Invisible Member
 
Join Date: Apr 2002
Location: La-la land
Posts: 6,315
Default

Quote:
Originally Posted by Richard View Post
Very nice! Either my 6970's 2GB or 256bit bus (or both) is the bottleneck for 8K buffers.
It can't be the video RAM; if the GPU were redrawing basically its entire on-board memory space each frame, there wouldn't be enough bandwidth to maintain even a semi-decent framerate.

Besides, maxing out 2 GB is quite hard. A framebuffer at 2560x1440 with 8x MSAA "only" eats 112.5 MB, so there's loads of room left.
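
For reference, the arithmetic behind that figure (assuming a 32-bit colour buffer only; depth/stencil and any extra render targets would add to it):
Code:
#include <stdio.h>

/* 2560 x 1440 pixels, 4 bytes per pixel, 8 MSAA samples per pixel. */
int main(void)
{
    const double bytes = 2560.0 * 1440.0 * 4.0 * 8.0;
    printf("%.1f MB\n", bytes / (1024.0 * 1024.0)); /* prints 112.5 MB */
    return 0;
}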
__________________
"If I were a science teacher and a student said the Universe is 6000 years old, I would mark that answer as wrong (why? Because it is)."
-Phil Plait
Grall is offline   Reply With Quote
Old 17-Apr-2012, 00:12   #61
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

Yeah, I'm not sure what the bottleneck is on 8192 shadows, but I ran headlong into the same bottleneck on my Q9450 + 5850 config a few months ago. I made the (naive?) assumption that it was VRAM limited, but never did the proper research to prove it.

GPU-Z and MSI Afterburner both show >2500 MB of VRAM usage while I'm dorking around outdoors in Skyrim; typically less when I'm indoors somewhere. I haven't compared it after making Richard's shadow and uGrids changes; I'll check it out tonight and report back.
Albuquerque is offline   Reply With Quote
Old 17-Apr-2012, 00:26   #62
almighty
Naughty Boy!
 
Join Date: Dec 2006
Posts: 2,469
Default

Quote:
Originally Posted by Mendel View Post
Get a Core i7-2600K, overclock it to 5 GHz (should be easy with any decent cooling), then disable the cores you don't need. Problem solved.
Why have 5 GHz when you can have 5.5 GHz like me?
almighty is offline   Reply With Quote
Old 17-Apr-2012, 02:03   #63
HMBR
Member
 
Join Date: Mar 2009
Posts: 235
Default

Quote:
Originally Posted by Albuquerque View Post
I generally agree, but there are some odd outliers. Skyrim really loves six cores even at the highest details:

I suppose it might be caching, but really seems to love threads. Bizarre.
X2 3.3 GHz = 44 FPS
X4 3.7 GHz = 54 FPS
X6 3.3 GHz (up to 3.7 GHz with turbo) = 49 FPS

The game doesn't seem to care about more than maybe 3 cores.
I also think you shouldn't compare the i7 3xxx to the 2600K, because there is simply too much difference: a lot more L3 cache, a lot more memory bandwidth, and in this game maybe a higher turbo clock too.

My experience with Skyrim was divided between 3 CPUs.
First, an E5400 (2 MB L2, 2 cores) at 3.7 GHz.
Before the first patches the framerate was already OK at my settings (above 20 in the most intensive places, but normally above 40), BUT the game had some terrible stuttering and freezes which, as far as I know, only happened on dual core CPUs. There was an unofficial fix that worked (from enbdev.com), but it was later solved by the patches.
I also tested some underclocking: at 2.4 GHz the game was still playable, with more than 20 fps in Riften. And here is the funny thing: with a higher level of detail (everything at ultra, which is too much for my VGA) the experience was still smooth at this clock, with a constant framerate, while at 3.7 GHz it was awful, with a lot of variation, the framerate jumping up and down all the time.
Anyway, the 1.4 patch made the game a lot lighter; I started seeing the framerate at 50 fps or more most of the time, with the lows going into the 30s...
I swapped this CPU for a 65 nm Core 2 Quad at 2.85 GHz, and performance decreased a little but stayed close enough. It was clear from looking at Task Manager that this game uses very little more than 2 cores...

Going to an i3 2100, the framerate definitely improved. I guess the architecture improvements really can make up for the "missing" cores easily in this case (gaming) - a lot faster memory/IO subsystem, I guess...

But there are games that are far more successful at using "more threads" than Skyrim:
comparing the E5400 at 3.75 GHz to the C2Q at 2.85 GHz, I saw a huge advantage for the C2Q in some games like The Witcher 2 and GTA 4 - things jumping from 20 to 30 FPS, basically (but the C2Q also had more L2 cache).

I think a dual core Sandy Bridge with HT at 4.5 GHz would make more cores mostly useless for gaming right now... and I also think this is the reason why Intel only unlocks overclocking on its more expensive parts... so yes, the OP has a point, I think... most users don't really need 4/6 cores, but could use 2 stronger cores.
HMBR is offline   Reply With Quote
Old 17-Apr-2012, 04:11   #64
I.S.T.
Senior Member
 
Join Date: Feb 2004
Posts: 2,534
Default

Quote:
Originally Posted by HMBR View Post
<snip>
Read the rest of his data; you'll see that HT just isn't quite enough.
I.S.T. is offline   Reply With Quote
Old 17-Apr-2012, 04:31   #65
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

Quote:
Originally Posted by HMBR View Post
<snip>
Yeah, uh, try reading the rest of my posts. Let me help with one of my cliff notes from an earlier post (long after the one you quoted...):
Quote:
Originally Posted by Albuquerque View Post
Yes, you are correct. I chose to drag Skyrim in here as a game that had previously demonstrated scaling beyond 4 cores, and someone rightfully asked if that held true after all the recent patching.

I felt it necessary to properly answer the question, and the answer was generally "no": scaling did NOT hold true after the newer patches, at least when playing at graphics settings that the ultra-enthusiast is probably going to use. I guess you could say I was doing the proper due diligence to either support or refute my claim, and it kinda went 50/50 for me.

Negatives: six-core scaling appears to be zero (or perhaps even slightly negative?) Meh.
Positives: it still needs a minimum of two cores to be playable, preferably four.
I've done a bit of homework for the world to see, and HT isn't much help. You want real cores, not HT... The i5-2500K seems to be your absolute best "bang for the buck" in terms of gaming performance, which really isn't news to anyone... Also, at very low speeds (i.e. the low-power processors found in laptops) there is a measurable performance benefit to having four physical cores in Skyrim. The benches indicated a jump from 10 fps -> 30 fps going from a single core to a quad core (hyperthreading helped at lower core counts, but maximum performance was found at 4c/4t rather than 2c/4t).
Albuquerque is offline   Reply With Quote
Old 17-Apr-2012, 05:34   #66
itsmydamnation
Member
 
Join Date: Apr 2007
Location: Australia
Posts: 796
Default

Quote:
Originally Posted by Albuquerque View Post
Yeah, uh, try reading the rest of my posts. Let me help with one of my cliff notes from an earlier post (long after the one you quoted...):


I've done a bit of homework for the world to see, and HT isn't much help. You want real cores, not HT... The i5-2500K seems to be your absolute best "bang for the buck" in terms of gaming performance, which really isn't news to anyone... Also, at very low speeds (i.e. the low-power processors found in laptops) there is a measurable performance benefit to having four physical cores in Skyrim. The benches indicated a jump from 10 fps -> 30 fps going from a single core to a quad core (hyperthreading helped at lower core counts, but maximum performance was found at 4c/4t rather than 2c/4t).
So what you're really trying to say is that a 17 watt Trinity is going to be awesome
itsmydamnation is offline   Reply With Quote
Old 17-Apr-2012, 06:55   #67
Ninjaprime
Member
 
Join Date: Jun 2008
Posts: 337
Default

I'm pretty sure the CPU market took the turn towards multicore the way it did for a reason - a reason smarter and more informed people than me decided on. Still, one has to wonder how a hypothetical single core Sandy Bridge pushed to 5 GHz+, with 4 threads, a 512-bit Larrabee-style vector unit, and a massive 32-64 MB L2 cache would perform, had the single core approach stuck around. It might be able to keep up with or maybe even beat a dual core SB of today, though 4/6 cores would probably crush it. Someone make it happen!
Ninjaprime is offline   Reply With Quote
Old 17-Apr-2012, 07:25   #68
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

Quote:
Originally Posted by itsmydamnation View Post
So what you're really trying to say is that a 17 watt Trinity is going to be awesome
No comment

Quote:
Originally Posted by Ninjaprime View Post
Still, one has to wonder how a hypothetical single core Sandy Bridge pushed to 5 GHz+, with 4 threads, a 512-bit Larrabee-style vector unit, and a massive 32-64 MB L2 cache would perform, had the single core approach stuck around.
Meh. Four threads from a single core sounds foolish and unlikely to be useful; a fat vector unit makes a lot of assumptions about how game code would get written; and an epic L2 cache doesn't actually seem of much use, given prior history and the 'usefulness' of parts that sport the same clockspeed and architecture but a fatter cache (i.e. very little difference).

Case in point: my six core, twelve thread 3930K sports 50% more cache, 50% more cores, 50% more threads, 100% more main memory bandwidth, and 200% more PCI-E lanes, and yet it basically equals or loses to a 2600K when talking strictly about games. Of course, when you throw in something that is "compute" related (H.264 encoding, raytracing, blah-de-blah), the SB-E platform brings out the big guns and lays waste to the 2600K.

A single core with all that jazz? Not seeing it.
Albuquerque is offline   Reply With Quote
Old 17-Apr-2012, 08:06   #69
hkultala
Member
 
Join Date: May 2002
Location: Herwood, Tampere, Finland
Posts: 271
Default

Quote:
Originally Posted by imaxx View Post
@Davros: Netburst was a very interesting CPU architecture. Not a good one, but very interesting. Pushed to the limits at a 4 GHz base clock, it was running internally at an amazing 8 GHz (!!). The problem is, the delta gained from the higher base clock (something like 33%, if I remember well) was lost to the compromises (a 32-stage pipeline) required to reach such clocks. AMD has done the same with BD to raise its clock, bringing its pipeline to more or less the same length as the original Netburst architecture (around 20-23 stages). A risky choice, considering the precedent, at least (but BD sucks hard because of the shared decoder, anyway).
AFAIK Willamette/Northwood had 28 stages, and AFAIK Bulldozer has only about 20 stages (the precise number hasn't been stated).
And Prescott had 42 stages.

So Bulldozer's pipeline length is only "halfway from P6/K7 to Willamette/Northwood" and "a third of the way from P6/K7 to Prescott".

And the long pipeline was not the biggest/only problem with Willamette/Northwood's IPC; slow shifts, slow multiplications and the small L1D cache were bigger IPC limiters.

On Prescott, with many of these fixed or improved and an even longer pipeline, the pipeline length really was the biggest IPC reducer.

And the pipeline length of Bulldozer is quite close to the pipeline length of Power7, the world's fastest microprocessor.
hkultala is offline   Reply With Quote
Old 17-Apr-2012, 08:35   #70
hoho
Senior Member
 
Join Date: Aug 2007
Location: Estonia
Posts: 1,218
Default

Quote:
Originally Posted by hkultala View Post
AFAIK Willamette/Northwood had 28 stages, and AFAIK Bulldozer has only about 20 stages (the precise number hasn't been stated).
And Prescott had 42 stages.
No, pre-Prescott Netbursts had a 20 stage pipeline, and Prescott and up had 31 stages.
Quote:
Originally Posted by hkultala View Post
Power7, the world's fastest microprocessor.
In what workloads?
hoho is offline   Reply With Quote
Old 17-Apr-2012, 09:29   #71
hkultala
Member
 
Join Date: May 2002
Location: Herwood, Tampere, Finland
Posts: 271
Default

Quote:
Originally Posted by hoho View Post
No, pre-Prescott Netbursts had a 20 stage pipeline, and Prescott and up had 31 stages.
20 stages AFTER the trace cache.

8 stages before the trace cache.

28 stages total.

And Prescott had... 31 + 11?
hkultala is offline   Reply With Quote
Old 17-Apr-2012, 11:23   #72
imaxx
Junior Member
 
Join Date: Mar 2012
Location: cracks
Posts: 94
Default

Quote:
Originally Posted by hkultala View Post
20 stages AFTER the trace cache.
8 stages before the trace cache.
Counting the TC into the P4's stage count is a bit like counting L1I latency - it doesn't sound very fair.
NetBurst did indeed pay a high price for that clock speed, but it was fun: a nice example I remember is that it needed an extra µop for INC vs ADD in order to mask out flags, or the trace cache space-'borrowing' alchemy.

I'm not saying BD is slower because it has the same number of stages as the P4, but it does look like a trend reversal for x86. I'm sure AMD made sure the 10-15% clock advantage more than covers the issues it brings (well, the same could have been said for Intel, so...). BD looks a bit like K10 to me - a caged toy of immense firepower with a tiny entrance (K10 had a tiny exit, too).

Power7... if x86 had a fixed instruction size, maybe with VLIW possibilities, more registers, an optimized instruction set, etc... but you can see where Itanium ended up - at AMD64. Compatibility wins at large scale.
imaxx is offline   Reply With Quote
Old 17-Apr-2012, 11:48   #73
sebbbi
Senior Member
 
Join Date: Nov 2007
Posts: 1,222
Default

Quote:
Originally Posted by Albuquerque View Post
Look at the 1.5 GHz data! sebbbi is on to something, I believe.
Yes... a 4.5 GHz single core (with HT) already reaches 52 fps (very near the 60 fps cap), and a dual core (without HT) reaches the max. Adding more cores or threads does not help at all, since two beefy high clocked Sandy Bridge cores can already execute all the game threads sequentially in the allotted 16 ms time slot (60 fps target).

At 1.5 GHz, however, you see good scaling: 1 core = 10.5 fps, 1 core with HT = 15.5 fps, 2 cores = 16.5 fps, 2 cores with HT = 23.3 fps, and 4 cores = 29.5 fps. You can also see that the extra hardware threads provided by HT give very good gains at low core counts (1 core + HT = 94% of 2 cores, 2 cores + HT = 79% of 4 cores).

It seems that Skyrim is designed to run at 30 fps on lower end processors (and consoles). The 29.5 fps is exactly half of the 59.1 fps cap seen in the high end results. Maybe the game detects the CPU clock speed and halves the cap if a low end CPU is detected (a very good idea, since a constant frame rate is always better than a fluctuating one). That's why there are no additional gains when going beyond 4 cores. You could try slightly increasing the CPU clock and see when the game switches to 60 fps mode; 2 GHz would be a good round number to try, for example... The 3 GHz results scale up slightly from 4 -> 6 cores, so there's likely extra scaling to be discovered at lower clocks (as long as the frame cap isn't lowered to 30 fps).
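
Just to show where those HT percentages come from (a quick sanity check against the 1.5 GHz City numbers above, nothing more):
Code:
#include <stdio.h>

/* City fps at 1.5 GHz, taken from the results table above. */
int main(void)
{
    const double c1 = 10.5, c1ht = 15.5, c2 = 16.5, c2ht = 23.3, c4 = 29.5;
    printf("1c+HT vs 2c real cores: %.0f%%\n", 100.0 * c1ht / c2);        /* ~94% */
    printf("2c+HT vs 4c real cores: %.0f%%\n", 100.0 * c2ht / c4);        /* ~79% */
    printf("HT gain on a single core: +%.0f%%\n", 100.0 * (c1ht / c1 - 1.0)); /* ~48% */
    return 0;
}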
sebbbi is online now   Reply With Quote
Old 17-Apr-2012, 12:58   #74
Simon F
Tea maker
 
Join Date: Feb 2002
Location: In the Island of Sodor, where the steam trains lie
Posts: 4,425
Default

Quote:
Originally Posted by Albuquerque View Post
I've done a bit of homework for the world to see, and HT isn't much help. You want real cores, not HT...
What's wrong with real cores AND hyperthreading?

I'm sure my application would benefit from it.
__________________
"Your work is both good and original. Unfortunately the part that is good is not original and the part that is original is not good." -(attributed to) Samuel Johnson

"I invented the term Object-Oriented, and I can tell you I did not have C++ in mind." Alan Kay
Simon F is offline   Reply With Quote
Old 17-Apr-2012, 16:06   #75
Albuquerque
Red-headed step child
 
Join Date: Jun 2004
Location: Guess ;)
Posts: 3,266
Default

Quote:
Originally Posted by Simon F View Post
What's wrong with real cores AND hyperthreading?

I'm sure my application would benefit from it.
If I can choose between 2c/4t and 4c/4t, then the obvious winner is the latter (all other things being equal). If you can get both, then more power to you!

My overclocked 3930K obviously has no problem crushing this game and anything else I throw at it, but I leave HT turned on regardless. It certainly helps when transcoding all the videos of my daughter...
Albuquerque is offline   Reply With Quote
