A thought on X-Box 2 and Creative Labs.

Don't confuse double data rate with a doubling in clock frequency.

Umm...I'm not. ;) This summer/fall, there should be 400 MHz *clock speed* DDR RAM available. In other words, an "effective" 800 MHz.

So, as I said, clock speed will have QUADRUPLED in 5 years, assuming that 400 MHz DDR hits the streets this summer/fall. Doubling the data rate is in ADDITION to the raw MHz increase. Why wouldn't RAM CLOCK SPEED increase 2-3X more in the next 5 years?
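
Just to spell out the arithmetic, here's a quick Python sketch (the 100 MHz SDR baseline for 1997 is my assumption, based on the Riva 128 figures later in this thread):

[code]
# Back-of-the-envelope: RAM clock scaling 1997 -> 2002, per the argument above.
base_1997 = 100      # MHz, SDR SDRAM clock (assumed 1997 high-end baseline)
ddr_2002  = 400      # MHz, DDR clock projected for this summer/fall

print(ddr_2002 / base_1997)        # 4.0x raw clock in 5 years
print(ddr_2002 * 2 / base_1997)    # 8.0x effective data rate (DDR doubles it)
[/code]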

More and more efficient with memory bandwidth? Like the crossbar memory architecture? How much can that really be expanded?

How much did you think they could be "expanded" 5 years ago? The point is, pixel caches and memory controllers will continue to evolve to meet the changing characteristics of the pixel and vertex pipelines, as well as the memory architecture used.

Even then, look how complex the GF3's memory controller already is. Is it really practical?

Exactly how "complex" is it? Despite that complexity, they improved it for GeForce4. And it's practical enough to include a version of it on their MX boards. (The full version not being needed for a mere 2-pipe board.)

I don't understand why the PowerVR/IMGTEC camp is so hesitant to believe that the same memory technologies that have evolved "satisfactorily" over the past 5 years will continue to do so. I could understand that position 5 years ago, because there really was no relevant history to draw on. But now?

IMGTEC still haven't moved away from a synchronous memory controller; it's all they need. I wonder how much cheaper that controller is than the one on the GF3.

I wonder too. I've been wondering how much "cheaper and better" IMGTEC's stuff is for the past 5 years. I fear that I'll be wondering the same thing 5 years from now. Not particularly for lack of faith in the tech, but for lack of someone bringing a product to market to actually try to exploit it.

No, I'm saying that true 900 MHz SDRAM isn't possible for many years.

Again: 1997 + 5 years = 4X increase in RAM clock (to 400 MHz). Why not 2002 + 5 years = 2.25X increase in RAM clock?
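
(The projection spelled out, for what it's worth:)

[code]
# 400 MHz (2002) * 2.25 = the "true 900 MHz SDRAM" mentioned above, by 2007.
print(400 * 2.25)  # 900.0
[/code]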
 
I don't think they will try to go nearly that high for system memory (slotted); the FCC would not like it, for one :) For backplane use those clock speeds would be possible, but for the most part video card memory has had a cheap ride because it's just an extension of system memory. If that were to change, the price differential between commodity and video memory would grow even larger.

Marco

PS. just saw an <A HREF=http://www.eet.com/semi/news/OEG20020508S0022>interesting link</A> on flipcode which relates to this subject ...
 
Humus, agree about the quoting. And a spelling checker comes in handy sometimes. Eh he he. Look Mom, no typos this time :LOL:

There have been some references to CELL. I'll start a thread on "it" at Hardware Talk -- please spill.

[OT: Damn Russia and the referee -- everybody was looking forward to a "Finnkampen" this time...]
 
I'll try to stay out of the discussion here, just one comment.

If I see anyone refer to QDR(tm) again I might enter nam ng mode, and you wouldn't want that, would you. None of the people who spoke about QDR(tm) above know what it is, so please just don't talk about it. And if anybody has found a memory, or info about plans to develop a memory, that has the properties described above as "QDR", then please point to it, because I haven't.

[Edit]
And Humus and Gunhead, count me in.
 
Just when I thought I'd seen it all, BenSkywalker jumps in with a 25-quote post ... :rolleyes:
Must be some sort of record ...
 
Mr. Kutaragi still sees Cell as a computation node though.

Yes, he does, but from just that interview I can't really be sure what's on his mind.

But at the moment, I just see Cell as a really fast smart router. My speculation is that 'Cell' will use some fluid model, with really small packets. Maybe the size of an ATM cell.

On topology: that's something they need to work on to make it work.

How it is coupled with the rest of the computation processor is anyone's guess. But to me, I think it will be closely coupled for MPEG-on-demand types of things; for games, I doubt it. Even with, say, 30% slack capacity, it's pretty hard to make it work without degrading quality of service.
 
Joe DeFuria said:
High-end five years ago was Voodoo2 with three 64-bit 100MHz EDO DRAM channels... that's 2.4GB/sec.

Well, first of all, Voodoo2 was 4 years ago (February 1998). Second...to be exact, it was three 64-bit 90 MHz EDO DRAM channels. ;)

Third, if you are going to say that, we might as well double that to six 64-bit 90 MHz channels (~4.3 GB/sec), due to V2 SLI.

Most importantly though, I'll repeat with emphasis: ;)

At the time, the best single-chip solution, bandwidth-wise, was the Riva 128: 128-bit, 100 MHz SDRAM.

I certainly agree that multi-chip and multi-board configurations are one way to attack the problem. But I'm purposely limiting the comparison here to "bandwidth per single chip" to have a somewhat apples-to-apples comparison.
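
For reference, the bandwidth math behind those numbers, as a quick sketch (using the figures quoted in this thread; EDO/SDR moves one transfer per clock):

[code]
# Peak bandwidth (GB/s) = width (bytes) * clock (MHz) * channels / 1000
def bandwidth_gb_s(bus_bits, clock_mhz, channels=1):
    return bus_bits / 8 * clock_mhz * channels / 1000

voodoo2     = bandwidth_gb_s(64, 90, channels=3)  # ~2.16 GB/s, three channels
voodoo2_sli = 2 * voodoo2                         # ~4.32 GB/s, two boards in SLI
riva128     = bandwidth_gb_s(128, 100)            # 1.6 GB/s, single chip

print(voodoo2, voodoo2_sli, riva128)
[/code]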

Uh, my Monster3D II and Obsidian2 X-24 both use 100MHz EDO DRAM...
 
Cross-console distributed computing won't work in any console application, period (unless you want to call peer-to-peer multiplayer games distributed processing, but those are far too limited to need hardware support).
 
Basic said:
None of the people who spoke about QDR(tm) above know what it is, so please just don't talk about it.

I thought Quad Data Rate SDRAM was when the memory retrieves data bursts twice on both the rising and falling edges of the clock. I am assuming QDR(tm) describes something else? I am interested in what it actually is; if I am wrong, please share.

Thanks.
 
Uh, my Monster3D II and Obsidian2 X-24 both use 100MHz EDO DRAM...

Well, it might have RAM rated at 25 ns, but it's running at a default speed of 90 MHz. The Voodoo2 had a synchronous bus, with both the core and memory clocks at 90 MHz. (Pixel rate of 90 million dual-textured pixels/sec.)
 
It seems the art of winning an argument lies in confusing your opponent by putting in random quotes and answers until everyone completely loses track of what's been said ;)
 
LittlePenny said:
Basic said:
None of the people who spoke about QDR(tm) above know what it is, so please just don't talk about it.

I thought Quad Data Rate SDRAM was when the memory retrieves data bursts twice on both the rising and falling edges of the clock. I am assuming QDR(tm) describes something else? I am interested in what it actually is; if I am wrong, please share.

Thanks.

I don't think such an approach is possible at all; how would the chip know when to sample data the second time on each edge? The reason we have clocks at all is so we know when to feed or sample data, so we'd need another clock to handle the second data sampling. This is what's done in, for instance, the P4's quad-pumped bus: it has two clock signals, one 1/4 cycle ahead of the other. So it can sample when the first rises, when the second rises, when the first falls, and when the second falls.
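
Schematically, something like this (a toy sketch, not the real electrical details):

[code]
# Toy model of a quad-pumped bus: two clocks, the second 1/4 cycle behind.
# Sampling on every rising and falling edge of both gives 4 transfers/cycle.
clk_a_edges = [0.00, 0.50]   # rise, fall of clock A (fractions of one cycle)
clk_b_edges = [0.25, 0.75]   # clock B, offset by a quarter cycle

print(sorted(clk_a_edges + clk_b_edges))  # [0.0, 0.25, 0.5, 0.75]
[/code]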

I think QDR does something else though. I think (but can't back it up) that it works like DDR but allows simultaneous writes and reads, something I suppose would come in quite handy for graphics cards.
 
LittlePenny:
QDR is a trademark for an SRAM memory interface (not SDRAM). It's an interface with separate data and address buses for read and write, both of them running at double data rate. It's possible to access both buses simultaneously with different addresses. So it's not anything like people seem to think here. Read more at www.qdrsram.com

And there is no such thing as QDR SDRAM; trust me, I've searched.

The memory that fits best to "four bits of data per pin and clock cycle" is Kentron Technologies' QBM. But that's not a memory chip interface; it's a way to connect two DDR SDRAM chips with an external switch and double the data rate per pin that way. But don't expect this method to push the high-end data rate. It's more of a way to get rather high bandwidth out of cheap components than to get really high bandwidth. I don't expect to see QBM compete with high-end DDR, and certainly not with high-end DDRII. Read more at www.kentrontech.com

Which brings us to DDRII, which doesn't transmit four bits per pin and clock, even though some people think it does. The interface is pretty much the same as DDR, except that the minimum burst length is 4 instead of 2 (2 clock cycles instead of 1). There are some other fixes in the interface, but the big change is internal; the memory core runs at half the speed compared to DDR memory with the same data rate. These fixes make it easier to increase the clock frequency, but at a given clock, it's got the same bandwidth as DDR.
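
To put numbers on that last point (a sketch; the 200 MHz interface clock is just an example):

[code]
# DDR and DDR-II both move 2 bits per pin per interface clock.
# DDR-II's changes are internal: half-speed core, minimum burst of 4.
io_clock = 200  # MHz interface clock (example figure)

for name, core_clock, min_burst in [("DDR", io_clock, 2), ("DDR-II", io_clock / 2, 4)]:
    print(name, io_clock * 2, "Mbit/s per pin, core at", core_clock,
          "MHz, min burst", min_burst)
[/code]

Same 400 Mbit/s per pin either way; only the core speed and burst granularity differ.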
 
Don't you think chip designers will prefer DDR over DDR-II (at 400 MHz) because of the smaller burst length? Today nvidia has LMA II trying to get smaller granularity, and then BOOM comes a minimum burst length of 4.
 
Some more about the real QDR.

Since it has dedicated buses for read and write, it's only efficient if you know that read and write bandwidth is equal all the time; otherwise you'd be better off with just one wide bus. Remember that while this memory increases the bandwidth to each memory address, it doesn't increase the bandwidth per pin. There is one more benefit with this interface (again, if read and write bandwidth is equal): you don't pay the cost of "turning the bus around" when switching between reading and writing. But generally, no, this isn't anything for graphics cards.
Those who make QDR SRAM seem to agree; they keep talking about high-performance networking and communication. Those are applications where you might need to store data for some time, but you usually send the same amount of data as you receive.
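
A little model of why the read/write split matters (hypothetical traffic mixes, just to show the shape of the argument; turnaround costs ignored, which would only favor QDR further):

[code]
# QDR-style: half the pins dedicated to reads, half to writes.
# Shared bus: same total pins, serves any read/write mix up to full rate.
def qdr_utilization(read_fraction):
    return min(read_fraction, 0.5) + min(1 - read_fraction, 0.5)

for rf in (0.5, 0.7, 0.9):
    print(f"reads {rf:.0%}: QDR {qdr_utilization(rf):.0%} of peak, shared bus 100%")
[/code]

At a 50/50 mix QDR uses all its pins; at 90/10 it's down to 60% of peak, which is why it only makes sense when traffic is balanced.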

Now I must say that I just remembered a DRAM interface that actually sends two bits per pin every clock. Rambus has made a high-speed interface that builds on the ideas from DRDRAM, but has four logical levels (voltages) and can thus send two bits at once. It's not called QDR though, and I haven't heard about anyone using it, or if it even made it to production. It had even stricter rules about the connection than DRDRAM. One was that the traces could not be longer than 2". :eek:
 
pascal:
Yes, I believe DDR400 should be better than DDRII400 regarding performance. The biggest benefit of DDRII400 would be the price. Another benefit of DDRII is that it should get to higher clocks than DDR. So for graphics cards it wouldn't be a choice between the two interfaces at the same clock.

[Edit]
Clarification: DDR400 and DDRII400 refer to a data rate of 400 MHz (= 200 MHz clock). Those are not exactly exciting speeds for graphics cards nowadays. But I noticed now that you said 400 MHz, and probably meant clock frequency. That would then be DDR800 and DDRII800, but the same conclusion holds.

One more thing. Even though the burst length limitation is bad for graphics cards, it doesn't have to be for main memory. I think main memory is always read in 64-byte chunks (cache lines), so it wouldn't matter there.
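
The cache-line arithmetic behind that (a sketch; a 64-bit DIMM channel is my assumption):

[code]
cache_line_bytes = 64                  # typical CPU cache line
bus_bytes = 64 // 8                    # one 64-bit channel = 8 bytes/transfer

print(cache_line_bytes // bus_bytes)   # 8 transfers per line fill, >= burst of 4
[/code]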
 
Yeah, I am talking about clock frequency.

I would like to know what people will do, because the frequency is not twice as fast but the latency will be twice as long. A DDRII900 will probably be slower (from a system point of view) than a DDR800. I would like to know what nvidia will do with NV30 and LMA.
 
Joe DeFuria said:
Uh, my Monster3D II and Obsidian2 X-24 both use 100MHz EDO DRAM...

Well, it might have RAM rated at 25 ns, but it's running at a default speed of 90 MHz. The Voodoo2 had a synchronous bus, with both the core and memory clocks at 90 MHz. (Pixel rate of 90 million dual-textured pixels/sec.)

LOL, uh, no, Voodoo2 doesn't have a synchronous bus. The EDO DRAM is fixed-clock based on the BIOS IIRC. The modules all say 100MHz on them...

I also did extensive tests with my Monster3D II and found that my frame rates would increase with core MHz until about 103 MHz, where the returns diminished, probably because the fixed memory bandwidth was finally limiting performance.
 
One issue with DDR-I vs DDR-II memory: Write-to-read bus turnaround is very slow in DDR-I, something like 1 clock + one full CAS latency (~4 cycles) during which the memory bus just sits idle. The DDR-II protocol has a fix for this problem, reducing bus turnaround to only 1 clock either way, potentially increasing protocol efficiency.

And DDR-II would not have twice the latency of DDR-I either. For the most part, each of the memory latencies in DDR-II (RAS, CAS, precharge) will be about 1 clock longer than in DDR-I at the same clock speed, in order to align accesses to the half-speed core clock. In higher-speed DDR-I (>250 MHz), these latencies are already at something like 4-6 clocks each, so the latency hit would be something like 20-25%.

Also, with DDR-I it is difficult to achieve good protocol efficiency with bursts shorter than 4 elements, so the burst-length-4 limitation of DDR-II may not be detrimental to performance at all.
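
Rough numbers for both points, using the figures above (a sketch; all counts in interface clocks and approximate):

[code]
# Write-to-read turnaround: idle clocks on the bus.
ddr1_turnaround = 1 + 4          # 1 clock + ~full CAS latency (DDR-I)
ddr2_turnaround = 1              # DDR-II protocol fix
print("turnaround:", ddr1_turnaround, "vs", ddr2_turnaround, "clocks")

# Random-access latency: RAS + CAS + precharge, DDR-II ~1 clock longer each.
for step in (4, 5, 6):           # the 4-6 clocks/step quoted for fast DDR-I
    ddr1, ddr2 = 3 * step, 3 * (step + 1)
    print(f"{step} clocks/step: {ddr1} vs {ddr2} clocks, +{(ddr2 - ddr1) / ddr1:.0%}")
[/code]

Which lands at roughly +17-25% latency, in the same ballpark as the 20-25% above.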
 