What happened to SLDRAM?

Saem · Aug 30, 2002

Deep buffers, prefetching, caching, speculative data loading and really high clock rates will help you overcome most of the latency problems. But to overcome them you'll need bandwidth, because you move data before you need it, not after; ideally it arrives right when you need it.
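A rough way to see the point about moving data early is a toy timing model. This is only a sketch under stated assumptions: the function names and every number are made up for illustration, the prefetcher is assumed to have unlimited depth, and transfers are assumed to run back to back on the memory bus.

```python
def time_without_prefetch(n_lines, compute_ns, latency_ns, transfer_ns):
    """Each line is requested only when it is needed, so the CPU waits for
    the full latency plus the transfer time before it can work on it."""
    return n_lines * (compute_ns + latency_ns + transfer_ns)


def time_with_prefetch(n_lines, compute_ns, latency_ns, transfer_ns):
    """All requests are issued up front (assumed unlimited prefetch depth).
    Line i arrives after the initial latency plus i + 1 back-to-back
    transfers; the CPU starts on it once it has arrived and the previous
    line is finished."""
    finish = 0
    for i in range(n_lines):
        arrival = latency_ns + (i + 1) * transfer_ns
        finish = max(arrival, finish) + compute_ns
    return finish


if __name__ == "__main__":
    # Hypothetical numbers: 1000 lines, 10 ns of work per line.
    for latency in (100, 50):
        for transfer in (5, 20):
            print(
                f"latency={latency} ns, transfer={transfer} ns/line:",
                time_without_prefetch(1000, 10, latency, transfer),
                "ns without prefetch,",
                time_with_prefetch(1000, 10, latency, transfer),
                "ns with prefetch",
            )
```

In this model the prefetched case pays the latency only once, so total time is set by whichever is slower, the compute or the transfer rate. That is the sense in which hiding latency turns into a bandwidth requirement.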

Entropy · Aug 30, 2002

Saem said:
Deep buffers, prefetching, caching, speculative data loading and really high clock rates will help you overcome most of the latency problems. But to overcome them you'll need bandwidth, because you move data before you need it, not after; ideally it arrives right when you need it.

"ideally"

Entropy

Saem · Aug 30, 2002

Fine, fine, "IDEALLY."

Nevertheless, we have to make trade-offs in different places to get the desired performance. Of course, I'm a big believer in the right tool for the job, and if it were up to me, I'd make multiple types of MPUs as functional units within the main MPU and have different memory types to handle different situations. But that's getting expensive.

Entropy · Aug 31, 2002

Saem said:
Fine, fine, "IDEALLY."

Nevertheless, we have to make trade-offs in different places to get the desired performance. Of course, I'm a big believer in the right tool for the job, and if it were up to me, I'd make multiple types of MPUs as functional units within the main MPU and have different memory types to handle different situations. But that's getting expensive.

Yup. I've been fortunate in that scientific computing can afford and justify more specialized hardware; most of what is new in PC space has been used for a long time in more expensive computers. Applications differ though, and microprocessors have had the edge in straight scalar, non-bandwidth-dependent performance for some time. As I'm sure you are aware, all the features you bring up have been used for a long time in scientific/engineering computers, with much larger caches, particularly relative to the speed gap between core and main memory. The situation with PC CPUs is much, much worse in terms of the CPUs outpacing the memory systems that feed them.

So I'd consider latency to be quite important along with bandwidth.

It will be interesting to compare the Hammer with its 333MHz bus Athlon siblings, as it doesn't have all that much to gain in IPC over the Athlon according to AMD's presentations, and nominal bandwidth will be identical. Most of the gain we see will be from the improved main memory latency, so it will give us some data to chew on about the relative importance of different parameters in the memory hierarchy, over a range of applications.

To me, this is known as "fun".
To others, the above sentence might justifiably be called "perverse".

Entropy

Saem · Aug 31, 2002

I think it's fun as well.

BTW, there should be significant IPC gains in the way of the FPU on the Hammer, since that unit is horribly imbalanced when you take into consideration the pipe that feeds the processor. An Athlonesque FPU on the P4 would be disgustingly strong. And I think the reason Intel really got rid of the second FPU pipe is because it would have spanked IA64 processors so badly it's not funny.

elimc · Sep 4, 2002

To sign up for MPF in San Jose, go here
https://www.mdronline.com/corpreg/e...&npr=eventreg_frm_update.asp&mode=MPF

To see the schedule, go here
http://www.mdronline.com/mpf/mpf-brochure.pdf

Entropy

What I meant to say was that I really don't think I'll see anything interesting from Micron. Maybe it will do well in some specialized market segment or something.

The nForces are 4-layer and run fairly well. Same with the i845s and so on. At least that's my understanding. I'm not sure where you're getting 6 layers from, unless we're heading into workstation and low-end server territory.

The nForces have stability problems. There are a number of DDR SDRAM boards that are six layer, IIRC.

If latency weren't important, RDRAM would have been much faster than SDRAM from the beginning. Most benchmarks show that latency is really important. Latency means that the CPU is idle for as long as it takes to deliver the first data.

Which benchmarks? RIMM4200 can have quite a bit less latency than PC1066, but the gains are meager at best. The P4 masks latency fairly well. Unfortunately, it is hard to mask low BW. If the BW of RAM were anywhere close to the needs of the processor, latency would become more important.

The interface of RDRAM runs at 533 MHz. RDRAM is double data rate, like normal DDR SDRAM.
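For reference, the peak-bandwidth arithmetic behind a 533 MHz double-data-rate interface can be sketched as below. The 16-bit channel width is an assumption that is not stated in the thread (it is the usual figure for a single PC1066 RDRAM channel), and the function name is made up for illustration.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, width_bits):
    """Theoretical peak in MB/s: clock rate times transfers per clock
    times bytes moved per transfer. Real sustained bandwidth is lower."""
    return clock_mhz * transfers_per_clock * width_bits / 8


# One channel: 533 MHz clock, 2 transfers per clock (DDR),
# 16-bit wide (assumed width, not given in the thread).
single_channel = peak_bandwidth_mb_s(533, 2, 16)

# Two such channels side by side double the peak figure.
dual_channel = 2 * single_channel

print(f"single channel: {single_channel:.0f} MB/s")
print(f"dual channel:   {dual_channel:.0f} MB/s")
```

These are nominal peak figures only; they say nothing about latency, which is the other half of the argument here.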

BTW, there should be significant IPC gains in the way of the FPU on the Hammer, since that unit is horribly imbalanced when you take into consideration the pipe that feeds the processor. An Athlonesque FPU on the P4 would be disgustingly strong. And I think the reason Intel really got rid of the second FPU pipe is because it would have spanked IA64 processors so badly it's not funny.

Have you seen the SPEC scores for the Itanium? The FPU scores are incredible. Even with a second FPU, I would have a tough time seeing the P4 score higher in SPEC.

Saem · Sep 4, 2002

elimc,

You're kidding me, right? Have you seen the throughput on the FPU in the P4? The extra pipe would have made a fair bit of difference; I'm sure there are enough operations in the SPEC tests that would benefit from the P4 not lagging behind due to a non-pipelined operation. Too bad I can't profile the code, but I'm pretty sure it would have done some serious damage to Itanium. It's beaten it, but if the pipeline were there it would have been a thrashing. Itanium 2 is a slightly different story, the reasons being a very large cache (which would help on the tests with large data sets) and double the system bandwidth of the P4. Even then, the P4 wouldn't have been that far behind, with comparable performance. The Xeons with the extra cache (L3) might have been really powerful.

elimc · Sep 5, 2002

CFP2000:

http://www.spec.org/osg/cpu2000/results/res2002q3/

Athlon 2600+
Base: 655
Peak: 710

Itanium2 1GHz
Base: 1356
Peak: 1356

You would need dual P4 Xeons to beat the Itanium. The integer tests are much less impressive, however.

Saem · Sep 5, 2002

Why are you posting Athlon scores? I've already stated it's being held back by its system bandwidth.

As for the P4, I believe the 2.8 scores in the high 800s. Too lazy to check.

mboeller · Sep 5, 2002

elimc said:
CFP2000:

http://www.spec.org/osg/cpu2000/results/res2002q3/

Athlon 2600+
Base: 655
Peak: 710

Itanium2 1GHz
Base: 1356
Peak: 1356

You would need dual P4 Xeons to beat the Itanium. The integer tests are much less impressive, however.

Pentium 4 scores:

P4 2.8 GHz:

SPECint_2000 base: 1020 (using PC1066)
SPECfp_2000 base: 1041 (using PC1066)
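To put the floating-point numbers quoted in this thread side by side, here is a small sketch that computes the ratios. It uses only the base scores posted above; the dictionary and function names are just for illustration.

```python
# SPEC CFP2000 base scores exactly as quoted in this thread.
specfp_base = {
    "Athlon 2600+": 655,
    "P4 2.8 GHz (PC1066)": 1041,
    "Itanium2 1GHz": 1356,
}


def ratio(a, b):
    """How many times higher chip a scores than chip b."""
    return specfp_base[a] / specfp_base[b]


for chip in ("Athlon 2600+", "P4 2.8 GHz (PC1066)"):
    print(f"Itanium2 1GHz vs {chip}: {ratio('Itanium2 1GHz', chip):.2f}x")
```

On these numbers the Itanium 2 leads the single P4 2.8 GHz by roughly 1.3x in SPECfp base, compared with a bit over 2x against the Athlon.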

Saem · Sep 5, 2002

Crap, I didn't expect it to be that good. Heh.

elimc · Sep 5, 2002

Didn't see P4 scores where I was looking. Of course it is kind of silly to compare the two since their market segments won't overlap much.

MfA · Sep 5, 2002

Yeah, one is in an existing market and the other in a nonexistent one.
