NVIDIA's Project Denver (ARM-based CPU)

Eh? Big-endian order typically reverses bytes within a word for the "convenience" of making them appear the same to a human reader, but the memory order _does_ change as a result.
Order of what? And why does it change?
+00 11
+01 22
+02 33
+03 44

are placed in the register as-is (on a big-endian machine):
bit 0 ... 31
[11 22 33 44]

Imagine bitfield data. With BE I can process it in big chunks of any width, regardless of the source data width.

MEM (4 consecutive bytes)
10110101 01010100 10110111 11110000

BE:
10110101 01010100 10110111 11110000
10110101 01010100 10110111 11110000 << 1
-------------------------------------------------------------------
01101010 10101001 01101111 11100000

LE:
11110000 10110111 01010100 10110101
11110000 10110111 01010100 10110101 << 1
--------
FAIL

What's worse is that memory order changes depending on the word size used.
You can't pick up a byte from the same address as the word, but the order is still the same.
Why should you do it anyway?

Further, little endian allows you to optimise by using wider word shifts, if they're available, without worrying about the effect that word width has on memory order.
Look at the example above.

I don't actually care about bit ordering but I prefer BE =)
 
Because I use PowerPC bit numbering notation (in register): from MSB(0) to LSB(31/63)
I would intuitively number them the other way round since that nicely corresponds to the bit value 2^x.

And if you reverse this, your bitfield shift example works with LE, too.
 
1st Silicon with NVIDIA Project Denver is an 8-Core ARM, 256 CUDA Core APU?

During the same month (December 2011), NVIDIA plans to tape out the first silicon based on Project Denver, which combines an up-to-8-core custom NVIDIA-ARM 64-bit CPU with a GeForce 600-class GPU. The company had a lot of issues in the development of the CPU, and the general consensus is that NVIDIA is taking a conservative approach with a single 28nm PD CPU design and the 28nm Fermi-based design, i.e. the rumored Fermi refresh in the form of notebook and lower-end desktop GeForce 600 Series cards (remember "GeForce 300"?). The interesting bit that we heard is that Project Denver is geared towards "full PhysX support", whatever that might be.

According to another source close to the subject, the target for the GPU part of the silicon is "at least 256 CUDA cores", which would put the product on par with AMD's Trinity APU, which will pair a Bulldozer-enhanced CPU core with the "Northern Islands" VLIW4 architecture and will be the key APU for AMD in 2012. Compute power-wise, NVIDIA doesn't want to clock it to the heavens, but rather to squeeze out as much IPC (Instructions Per Clock) as possible. Still, it is realistic to expect 2.0-2.5GHz for the CPU and a similar clock for the GPU part, with the memory controller and the rest of the silicon working at a lower rate to keep everything well fed.

Unlike AMD's APU design, where the CPU and GPU parts connect to the memory controller at the speed of DDR3 memory, Project Denver is looking for more direct communication between the CPU and GPU cores, i.e. relying on the best a GPU design can offer: a high-bandwidth connection. NVIDIA is not taking the conventional route with an L1, L2 and L3 cache design; since GPUs have 1TB/s+ connections to their cache memory, a similar approach is rumored for the Denver core design as well. Just like in the GPUs, the memory controller takes a large portion of the die and connects the CUDA cores with the CPU cores, and the CPU will have priority access to the bandwidth it needs.

 
Why do people pay attention to BSN/Fudzilla/the dude who wants to suck AMD's cock every day... err, I mean, Semi-Accurate?

They are all trash.

Edit: Changed Charlie's description to be more accurate and less inflaming to AMD employees.
 
Crusoe-style CPU? I don't know how he's gonna navigate his way out of this one when it's proven incorrect :D

I'm guessing it will be, because from where I'm standing this looks like a very "courageous" architectural decision, and I would suspect nV would do something safer.
 
Argh, Charlie keeps referring to the Denver CPU itself as T50. That doesn't make any sense, and it's so stupid it makes me want to bang my head against the wall. Txx refers to SoCs - it's like saying the GF100 chip has 16 GF100 cores. And the rest of it is equally ridiculous. I could maybe believe that NV would do some fairly unusual things (e.g. speculative execution of both sides of a branch?), but Crusoe-style isn't one of them. I was nearly ready to take part of his Tegra 3/3.3/4 article seriously, but this once again proves his sources in that area are completely worthless.
 
Crusoe-style CPU? I don't know how he's gonna navigate his way out of this one when it's proven incorrect :D

Clearly, it will be a case of NVIDIA changing its plans just to spite him, the lone ranger fighting the good fight. Alternately, there's enough squirming room for that statement, just like the OMAGAD NV DOES SW TESS OMAGAD thing.
 
Why do people give that man attention? He just wants to sleep with AMD and Linus Torvalds.

All day and all night.
 