NVIDIA Maxwell Speculation Thread

The only descriptions I've seen of Larrabee's configuration were 32 512-bit vector registers per hardware thread, a separate 32 KB L1 data cache, and a 256 KB L2 below it.

I have not run across any description indicating the design had deviated from a standard CPU register file -> L1 -> L2 configuration.
 

The "registers" for ~500 workitems cannot be stored in a register file with only 32 registers in all. They are allocated from cache and made resident, and then the software runtime takes over, scheduling and switching workitems via load/store instructions.
 

32 registers do not hold all the data needed, but that doesn't change the fact that the vector registers are physically separate and separately addressed pools of SRAM.
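The capacity mismatch being argued here can be made concrete with some back-of-the-envelope arithmetic. The 32 512-bit registers and the ~500-workitem figure come from the posts above; the four-hardware-threads-per-core count is my assumption:

```python
# Architectural register file per hardware thread (figures from the posts above):
REG_BITS = 512
REGS_PER_HW_THREAD = 32
bytes_per_context = REGS_PER_HW_THREAD * REG_BITS // 8   # 2 KB of vector state

# Physical register file, assuming 4 hardware threads per core (my assumption):
HW_THREADS = 4
reg_file_bytes = HW_THREADS * bytes_per_context          # 8 KB of real registers

# Logical "register" state for ~500 in-flight workitems (figure from the post):
workitems = 500
resident_state_bytes = workitems * bytes_per_context     # ~1000 KB

print(reg_file_bytes // 1024, "KB of physical registers")
print(resident_state_bytes // 1024, "KB of logical workitem state")
```

At these (assumed) numbers the full contexts would even overflow the 256 KB L2, so in practice each workitem would carry a smaller live state; either way, the point stands that the contexts have to live in the cache hierarchy and get swapped in and out by the software runtime with ordinary load/store instructions.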
 
From SemiAccurate: "Nvidia’s Maxwell process choice."

Key points:
  • Likely release timeframe Q1 2014
  • Internal aim of an 80% increase in performance/W [I'm assuming over some Kepler]
  • The first Maxwell parts will be on 28 nm, not 20 nm
  • The rest of the article is for subscribers or can be bought for $50
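To put the rumored 80% figure in concrete terms (my arithmetic, not from the article), an 80% perf/W gain cuts both ways depending on whether you hold power or performance fixed:

```python
gain = 1.80  # rumored 80% improvement in performance per watt

# At the same board power, performance scales directly with perf/W:
perf_iso_power = gain        # 1.8x the performance

# At the same performance, power drops by the inverse:
power_iso_perf = 1 / gain    # ~0.56x the power

print(f"{perf_iso_power:.2f}x perf at iso-power")
print(f"{power_iso_perf:.2f}x power at iso-performance")
```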
 
That has to be some kind of idiotic joke.
With pay-per-click rates being what they are, at $50 he only needs a few idiots willing to pay for his 'analysis' to come out ahead. Kinda sad to miss out on the entertainment value, though...
 

Apparently, non-free articles become free after a month, so we'll just have to wait a bit. ;-)

(Which, incidentally, only makes the $50 fee even crazier.)
 
He sure puts a large value on his "insights" into the industry.
 
Regarding "non-free articles become free after a month": apparently not all of them will.

SemiAccurate subscription page said:
The Curious [free membership level] have access 30 days after initial publication, and may not have access to all premium content like analysis.
And this report is under "Analysis."

If this report is true, then I'm guessing the Maxwell lineup may be something like the G80, G92, and GT200.

"Maxwell Prime": 28 nm, Q1 2014, 400-450 mm^2, 384-bit bus, game performance ≥ GeForce version of Tesla K20/K20X.
"Maxwell Lite": 20 nm, Q4 2014 or whenever 20 nm realistically shows up, 250-300 mm^2, ≥256-bit bus, performance ≈ Maxwell Prime depending on bandwidth bottlenecks?
"Big Maxwell": 20 nm, 2015, 500-550 mm^2, 512-bit bus, focusing on compute.

I would also assume multiple lower-end chips on both process nodes, but given the info and rumors about the GK11x and GK20x lineups, I'm wondering if GK20x will coexist with 28 nm Maxwell rather than just being rebrands (so GK11x in 2013 and GK20x in 2014?). However, if the "80%" performance-per-watt increase applies to 28 nm Maxwell, I'd expect NVIDIA to move the other segments to Maxwell quickly (especially for laptops), unless that move would significantly delay 20 nm Maxwell. If the increase applies to 20 nm Maxwell, then I'd imagine they might take their time.
 