I Can Hazwell?

See, you're talking about two different things here. Load is not clockspeed. Load is CPU time, and that means one thing and one thing only: the percentage of time the OS has something scheduled to run, i.e., 100% minus idle time. A low load may (by OS or hardware) initiate a shift in clockspeed, but those are still separate things.
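Just to pin down what I mean by load, here's a toy sketch (made-up numbers, not any OS's actual accounting): load is how much of a sampling window had work scheduled, regardless of what multiplier the core ran at while doing it.

[CODE]
# Minimal sketch: CPU load as the fraction of a sampling window spent
# running work, independent of whatever clock speed was in effect.
# The numbers are made up for illustration.

def load_percent(busy_time_ms: float, interval_ms: float) -> float:
    """Load = time the OS had something scheduled / total interval."""
    return 100.0 * busy_time_ms / interval_ms

# 200 ms of runnable work in a 1000 ms window -> 20% load,
# whether the core ran at 800 MHz or 5 GHz while doing that work.
print(load_percent(200.0, 1000.0))  # 20.0
[/CODE]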

An idle Intel CPU won't operate at the lowest multiplier; it will operate in an idle C-state with the core parts clock-gated off. It might operate at the lowest multiplier while not idle, if the needed CPU time is low enough to warrant it. But these are still different things it's doing at different times.

This screenshot is an overclocked system at 5GHz, where one would presume that DVFS (including both turbo and down-clocking to lower states) is disabled. If the clockspeed is actually lower than this at the time then the screenshot is wrong and we can stop talking about its implications.
 
Again, I've been talking about voltage (and clockspeed) as a function of load. Today's CPUs do vary their multiplier with load. I'm not sure what else you're trying to describe; this technology already exists and is in use.

Let me state it again: intermediate multipliers happen today for varying amounts of load. If load were only a momentary picture of either zero or 100% utilization, these intermediate multipliers would not get used outside of cases where the multiplier was hard-coded.
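Here's a rough sketch of what I mean by intermediate multipliers tracking load. The multiplier steps, thresholds, and base clock are invented for illustration; this is not Intel's actual P-state algorithm.

[CODE]
# Hypothetical sketch of an ondemand-style policy picking an intermediate
# multiplier from a measured load figure. Steps and thresholds are invented.

BCLK_MHZ = 100
MULTIPLIERS = [16, 22, 28, 34, 40]  # e.g. 1.6 GHz up to 4.0 GHz

def pick_multiplier(load_percent: float) -> int:
    """Scale the multiplier with load instead of jumping min <-> max."""
    step = 100 / len(MULTIPLIERS)
    index = min(int(load_percent / step), len(MULTIPLIERS) - 1)
    return MULTIPLIERS[index]

for load in (5, 30, 55, 80, 100):
    mult = pick_multiplier(load)
    print(f"{load:3d}% load -> x{mult} = {mult * BCLK_MHZ} MHz")
[/CODE]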

Yes, I get C-states; that's what YOU are talking about, not me. My supposition was based on applying voltage regulation per load state the same way multiplier regulation per load state has been available for years.
 
I'm having a really hard time connecting what you're saying now with the speculation you originally made.

I say that with the suggestion that the new on-package voltage regulation may be intelligent enough to modify voltage per load, as well as per clockspeed.
So originally you talked about modifying voltage per load as well as per clockspeed, but now you seem to be referring to load AS the dynamically varying clock speed. Are you not talking about varying the voltage while the clock speed is at 5GHz? Is that not what the picture would be representing?

You already said it yourself: voltage is regulated dynamically along with clock speed (which can be regulated based on demand, i.e., load), so I don't know why what you're saying now is supposed to be referring to new technology.
 
Ok, so you are now aware that clockspeed AND voltage vary with load? Because I wasn't getting that from your prior statements. You seemed to suggest that there were two possible loads: 0% or 100%. The word load that I bolded was yours, yet you've now indicated that you're aware that load isn't purely zero or 100, but rather a calculation over time.

Now that I know we're both on that same level, my speculation was this: voltage can still vary based on load, even if clockspeed doesn't. Case in point: to prove most overclocks stable, you put the CPU under continuous load. If errors or instability result, you raise the voltage.

Given constant clockspeed, varying the voltage can result in different capabilities. It isn't so much the voltage per se; it's the voltage drop across the rails under load that is likely the stability culprit. If you can maintain a minimum necessary voltage, the CPU should be fine. By upping the voltage, you are in effect maintaining some meaningful minimum necessary voltage to keep it stable.

Under load (most execution units across all cores firing), more correction (i.e., voltage) is necessary to hold that minimum necessary voltage. Under low load (i.e., sporadic, quick hits that aren't enough to droop the entire package), a lower voltage would still be acceptable at the high speeds.
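A back-of-envelope way to picture it, using a simple IR-drop model with numbers I've made up (my own toy model, not Intel's actual regulator behavior):

[CODE]
# Toy IR-drop model: the voltage seen by the core sags below the supply
# setpoint in proportion to the current being drawn. To keep the core
# above some minimum stable voltage at a fixed clock, the setpoint (or
# the correction applied by the regulator) has to rise with load.
# All numbers are invented for illustration.

V_MIN_STABLE = 1.20   # volts needed to close timing at 5 GHz (assumed)
R_EFFECTIVE  = 0.002  # ohms of effective supply-path resistance (assumed)

def required_setpoint(load_current_amps: float) -> float:
    """Setpoint = minimum stable voltage + expected droop at this current."""
    droop = load_current_amps * R_EFFECTIVE
    return V_MIN_STABLE + droop

print(required_setpoint(10.0))   # light, sporadic load  -> ~1.22 V
print(required_setpoint(100.0))  # all cores, AVX firing -> ~1.40 V
[/CODE]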

If this new voltage regulation technology is as Bad-Freakin-Ass (TM) as we could hope, it may be feasible that Intel can better measure and maintain a minimum-necessary voltage across the package.
 
I'll even go so far as to mention this: Intel is quite aware of this droop behavior, because there was an engineering blog that mentioned how Intel "warms up" the AVX units before any actual data goes through them. They do it to ride out the incoming voltage drop that the huge math units will undeniably cause on their way "up".

If they've already thought that far ahead, it makes me even more amenable to the possibility that Intel is able to vary voltage per load, agnostic of clock.

Edit: More on the warmup here: http://www.agner.org/optimize/blog/read.php?i=142 Do a find for the word "warm".
Edit2: Better, here's D Kanter on RWT explaining the voltage issue and how it's nothing new: http://www.realworldtech.com/forum/?threadid=115980
 
I thought it was pretty clear from the start that I was aware that clock speeds and voltages are varied dynamically with demand, since I've mentioned it several times in my posts.

What I was saying wasn't that there are two possible loads, just that this is what it looks like instantaneously, and that the voltage has to be supplied to meet the CPU's demand on an instantaneous level. If you lower the clock speed as well then that's a totally different story. The screenshot doesn't show a lowered clock speed, so I don't know how that can be inferred, and if that really is the case then having a low voltage with an arbitrarily low clockspeed is a non-story. I thought the whole idea was speculating how it could support this clockspeed with this voltage.

You can lower clockspeed without lowering voltage, but you can't go the other way around: you need some minimum voltage to drive a given clockspeed.

The reason why the OS/CPU can lower the clockspeed when the load is low is because it has determined that there's enough slack that a higher load is acceptable (when you lower the clock you raise the load because you still need the same number of clock cycles but have fewer cycles/second). It'll do the same thing in reverse - when the load is really high (close enough to 100%, usually means something isn't yielding) for long enough it'll raise the clock speed to see if the software can benefit from it.
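To put made-up numbers on that relationship (same workload, different clocks):

[CODE]
# The same amount of work looks like a different load at different clocks:
# load = cycles the workload needs per second / cycles the CPU provides
# per second. Numbers are made up for illustration.

def load_at(freq_ghz: float, work_gcycles_per_sec: float) -> float:
    return 100.0 * work_gcycles_per_sec / freq_ghz

work = 1.5  # workload that needs ~1.5 billion cycles every second
for f in (3.5, 2.0, 1.5):
    print(f"{f} GHz -> {load_at(f, work):.0f}% load")
# 3.5 GHz -> 43% load, 2.0 GHz -> 75% load, 1.5 GHz -> 100% load
[/CODE]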

None of that makes sense with respect to voltage alone, because load isn't a function of voltage, and you need the voltage to drive the timing requirements to make the thing work at some clock speed. Being able to lower voltage is a secondary effect of being able to lower clock speed. Even if there were conditions that made a lower voltage acceptable at the same clock speed because of what the instruction mix looks like, trying to vary voltage this way based only on past behavior would be very dangerous, because you'd only know to correct it after the CPU starts changing behavior again. Short-changing clock speed isn't a big deal, but short-changing voltage is catastrophic. And besides that, I doubt the variation in voltage needed based on instruction mix at the same clock speed is that huge; most of the core (fetch, decode, schedulers, caches) is doing the same sort of stuff regardless of instruction mix.
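Here's a sketch of that dependency chain. The V/f pairs and frequency steps are invented for illustration; real tables are fused per part.

[CODE]
# Voltage follows the chosen clock, not the load: each frequency has a
# minimum voltage that closes timing, a governor picks a frequency from
# load, and the voltage is looked up from that. V/f pairs are invented.

VF_TABLE = [  # (GHz, minimum volts) - illustrative values only
    (1.6, 0.80),
    (2.4, 0.95),
    (3.4, 1.10),
    (3.9, 1.25),
]

def voltage_for(freq_ghz: float) -> float:
    """Return the lowest tabled voltage that supports freq_ghz."""
    for f, v in VF_TABLE:
        if freq_ghz <= f:
            return v
    raise ValueError("frequency above the supported range")

print(voltage_for(2.0))  # 0.95 -> set by the clock choice, not by load
[/CODE]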

I don't think the AVX warm-up stuff has anything to do with avoiding a voltage drop; it sounds more like it's just making sure you really want it before wasting power having it on. You often have to run benches for a while to get up to peak clock speed too, same sort of thing.
 
Yeah, I was talking about the affordable range of, say, $500-$800. Heck, I'd take a 6 core at this point.
You can get yourself a 16 core AMD Opteron (with 16 MB cache) for less than $600: http://www.newegg.com/Product/Product.aspx?Item=N82E16819113308&Tpk=6366 HE. Granted, it's not a good fit for a gaming PC, but if you want to game, nothing beats a high clocked quad core Intel CPU. These CPUs are designed for serious multithreaded workloads.

Also, an 8 core (16 thread) Xeon is $1100 (http://www.newegg.com/Product/Product.aspx?Item=N82E16819117275). That's $300 more than your price range, but it also gets you a very nice 20 MB cache and ECC memory support. It's a pretty good deal if you are processing large data sets (and need both the extra cores and the extra cache).

I doubt Intel is going to introduce an 8 core consumer Haswell model. Consumer models have integrated GPUs, and the GPU inside Haswell is going to be much larger compared to Ivy Bridge. The high end GT3 GPU could take almost half of the chip. 8 CPU cores + GT3 would be a huge chip. I am sure they could sell the (similar sized) 12 core Xeon server chips at much higher profit margins.
 
You can get yourself a 16 core AMD Opteron (with 16 MB cache) for less than $600: http://www.newegg.com/Product/Product.aspx?Item=N82E16819113308&Tpk=6366%20HE. Granted, it's not a good fit for a gaming PC, but if you want to game, nothing beats a high clocked quad core Intel CPU. These CPUs are designed for serious multithreaded workloads.

Although it wasn't said I think there was an implication that the single threaded performance would need to be as high and the aggregate performance higher than the current cheaper/lower core CPUs. This particular Opteron would be way behind an FX-8350 on both counts; AFAIK it can only turbo up to 3.1GHz w/less than 8 cores and 8x4GHz will easily beat 16x1.8GHz as well. The perf/W should be much better but that kind of goes beyond just having serious multithreading requirements.

The case should work for SB-Es or 6/8-core Xeons though.
 
You can get yourself a 16 core AMD Opteron (with 16 MB cache) for less than $600: http://www.newegg.com/Product/Product.aspx?Item=N82E16819113308&Tpk=6366%20HE. Granted, it's not a good fit for a gaming PC, but if you want to game, nothing beats a high clocked quad core Intel CPU. These CPUs are designed for serious multithreaded workloads.

Also, an 8 core (16 thread) Xeon is $1100 (http://www.newegg.com/Product/Product.aspx?Item=N82E16819117275). That's $300 more than your price range, but it also gets you a very nice 20 MB cache and ECC memory support. It's a pretty good deal if you are processing large data sets (and need both the extra cores and the extra cache).

I doubt Intel is going to introduce an 8 core consumer Haswell model. Consumer models have integrated GPUs, and the GPU inside Haswell is going to be much larger compared to Ivy Bridge. The high end GT3 GPU could take almost half of the chip. 8 CPU cores + GT3 would be a huge chip. I am sure they could sell the (similar sized) 12 core Xeon server chips at much higher profit margins.

Next gen consoles utilising a lot of low-latency GPGPU may push PC CPU workloads beyond what a quad Haswell is capable of matching. So perhaps there will be a genuine need for 8 or more core CPUs moving forward?
 
there simply not being any need or demand for 8+ core desktop CPUs

Please, tell me more about it, because I don't get it :cry:
This is like saying there is no need or demand for progress.
Simply give me that 8 core processor and I will find what to run on it. Don't worry about me.

What are you going to run on them anyway? There are no applications used by everyday people and companies that require that much oomph, and precious few situations where you really even need four cores.

I think companies will start to optimise their stuff for processors with 8 cores and beyond once those become mainstream.

Of course, a demanding game like Crysis 3 (just one example, I bet there are dozens) would have tremendous potential to scale well as core counts go up.
 
Simply give me that 8 core processor and I will find what to run on it. Don't worry about me.
...Like what exactly, are you going to take up offline graphics rendering in your spare time? :) Seriously, there aren't many tasks that need 8 cores (or heck, even more) outside of graphics, some specific areas of professional applications and distributed computing.

I think companies will start to optimise their stuff for processors with 8 cores and beyond once those become mainstream.
Almost all apps used by regular people spend 99.999%+ of their time simply waiting for user input... Optimizing for 8 cores would be pointless; you basically can't optimize something like that even for two cores. :) An office app, just to take a simple example, spends a staggering amount of time waiting for user input, and in the case of databases, sometimes also disk I/O if you have a large set of data. Where would the 8-core benefit come from?

It's not that developers are necessarily lazy or anything, because seriously: what could you do on 8 cores that would bring a noticeable benefit to the user? You'd need monstrous spreadsheets, or super high resolution Photoshop canvases and so on, to notice any difference whatsoever.
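One way to put rough numbers on that is Amdahl's law with assumed parallel fractions (not measurements of any real app):

[CODE]
# Amdahl's law: if only a fraction p of the work can run in parallel,
# the best-case speedup on n cores is 1 / ((1 - p) + p / n).
# The parallel fractions below are assumptions, not measurements.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for p in (0.5, 0.9):
    print(f"p={p}: 4 cores -> {amdahl_speedup(p, 4):.2f}x, "
          f"8 cores -> {amdahl_speedup(p, 8):.2f}x")
# Even at p=0.9, going from 4 to 8 cores only gets you from ~3.1x to ~4.7x.
[/CODE]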

Of course, a demanding game like Crysis 3 (just one example, I bet there are dozens) would have tremendous potential to scale well as core counts go up.
Not sure what C3 would do with 8 cores, or how many cores it even utilizes right now, to be frank. It's not as if a game simply discovers new tasks to hand off to more cores once they're installed in the system. You'd also have to design the content, hand in hand with the game engine, to take advantage of such computing resources.
 
Next gen consoles utilising a lot of low-latency GPGPU may push PC CPU workloads beyond what a quad Haswell is capable of matching. So perhaps there will be a genuine need for 8 or more core CPUs moving forward?
Haswell ought to be perfectly capable of low-latency GPGPU too. Also, don't forget that for executing such workloads on the CPU it is already quite a bit faster than Ivy Bridge, thanks to AVX2/FMA.
 
Haswell ought to be perfectly capable of low-latency GPGPU too. Also, don't forget that for executing such workloads on the CPU it is already quite a bit faster than Ivy Bridge, thanks to AVX2/FMA.

If the GPGPU is kept on die, yes, but in combination with a discrete GPU, I understand the general consensus is that things working that way is unlikely.

I'd be interested to understand how well Haswell compares to a full HSA architecture though in terms of flexibility.
 
If the GPGPU is kept on die, yes, but in combination with a discrete GPU, I understand the general consensus is that things working that way is unlikely.

I'd be interested to understand how well Haswell compares to a full HSA architecture though in terms of flexibility.
I think the idea of using the IGP for some GPGPU calculations even if there's a discrete GPU present would be quite interesting. No idea though if anyone has some plans to do something like that (certainly the new consoles have a lot more GPGPU capability but I have no idea how much they'll get used for non-graphics stuff).
Though it has to be said that the IGP GFLOPS are in the same league as the CPU GFLOPS in the case of a Haswell 4-core / GT2, so you can't really expect a many-fold increase in any case.
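For reference, here's the back-of-envelope peak-FLOPS arithmetic behind that "same league" remark. The clocks and unit counts are my assumptions (single precision, FMA counted as two FLOPs):

[CODE]
# Back-of-envelope peak single-precision GFLOPS for a Haswell 4-core + GT2.
# Clock speeds and unit counts below are assumptions for illustration.

cpu_cores, fma_ports, avx_width, flops_per_fma = 4, 2, 8, 2
cpu_ghz = 3.5                      # assumed CPU clock
cpu_gflops = cpu_cores * fma_ports * avx_width * flops_per_fma * cpu_ghz

gt2_eus, sp_lanes_per_eu = 20, 8   # assumed GT2 configuration
igp_ghz = 1.2                      # assumed IGP clock
igp_gflops = gt2_eus * sp_lanes_per_eu * flops_per_fma * igp_ghz

print(cpu_gflops, igp_gflops)      # ~448 vs ~384 -> same ballpark
[/CODE]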
 
So do we have to wait for Broadwell to get 8 physical cores?

You are not going to get more cores or cache in consumer products any time soon, Moore's law notwithstanding.

There's no software to drive them. Pretty much the only thing that is going to grow is the GPU or other fixed-function hardware.
 
I don't want to steal the thread away, but I currently have an FX-8150 and an i7-2600. Both of them are way too slow for the things I actually want to do.

The software is there, and more and more will come as the hardware actually releases.

It's nice that the performance of the GPU inside these chips went up, but that is hardly enough. I also don't want to be buying server parts to get the speed that I want.


These companies complain that the market is contracting, but it seems to be all their fault, since they want to sell the same old things with barely any improvement and are starving the mid to high end. I don't see anyone who bought a decent chip in the last 3-4 years needing any of these new chips. They just don't offer a lot of performance over what was already there.
 
I don't want to steal the thread away, but I currently have an FX-8150 and an i7-2600. Both of them are way too slow for the things I actually want to do.
Like what?

The software is there, and more and more will come as the hardware actually releases.
Which software? I'm going to guess that whatever it is, it's not a consumer product.

It's nice that the performance of the GPU inside these chips went up, but that is hardly enough. I also don't want to be buying server parts to get the speed that I want.


These companies complain that the market is contracting, but it seems to be all their fault, since they want to sell the same old things with barely any improvement and are starving the mid to high end. I don't see anyone who bought a decent chip in the last 3-4 years needing any of these new chips. They just don't offer a lot of performance over what was already there.
They don't offer more performance because the consumer market doesn't need it.
 