Nvidia GT300 core: Speculation

chavvdarrr · Sep 30, 2009

One day in the near future, a message appears on the screen when you start your brand new IBM PC compatible...

"no CPU detected, starting software emulation"

LOL

MfA · Sep 30, 2009

chavvdarrr said:
WTF does "native C, C++, Fortran etc" mean?!
Was NV able to make an ANSI C++ compiler for the new chip?

It's not really a big deal, hell there are C++ to C translators ... although I fail to see the point of Fortran apart from the warm and fuzzy feeling the name generates. It's not like even porting legacy code to it is an option; the level of algorithmic changes needed to suit a GPU makes a rewrite the only realistic option. Fortran doesn't seem to me to be a great language to write kernels in.

PS. http://www.pgroup.com/resources/cudafortran.htm

trinibwoy · Sep 30, 2009

Well the question is what features of C++ do they currently NOT support? The answer to that question would probably provide hints as to what else they've changed

Scali · Sep 30, 2009

trinibwoy said:
Well the question is what features of C++ do they currently NOT support? The answer to that question would probably provide hints as to what else they've changed

I think the biggest issue is that of using function pointers. A lot of the object-oriented features of C++ are implemented through the manipulation of function pointers. As far as I know, they could only branch with a fixed offset so far.
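
To make the function-pointer point concrete, here is a minimal sketch of how a virtual call can be lowered to a table of function pointers. It is an illustration only, not any real compiler's or Nvidia's implementation, and every `Raw*` and `k*VTable` name is hypothetical.

```cpp
#include <cassert>

// What the programmer writes: ordinary C++ virtual dispatch.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

// Roughly what that lowers to (hypothetical names): each object carries a
// pointer to a table of function pointers.
struct RawShape;
struct RawVTable { int (*sides)(const RawShape*); };
struct RawShape  { const RawVTable* vtable; };

static int raw_triangle_sides(const RawShape*) { return 3; }
static int raw_square_sides(const RawShape*)   { return 4; }

static const RawVTable kTriangleVTable = {&raw_triangle_sides};
static const RawVTable kSquareVTable   = {&raw_square_sides};

// The "virtual call" is an indirect branch: the target address is loaded
// from memory at run time instead of being a fixed offset in the code.
int call_sides(const RawShape& s) { return s.vtable->sides(&s); }
```

The call in `call_sides` is the part that hardware limited to fixed-offset branches cannot express directly, since the target is not known until the object is inspected at run time.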

MfA · Sep 30, 2009

Scali said:
I think the biggest issue is that of using function pointers.

Aren't those required for DX11?

rpg.314 · Sep 30, 2009

The biggest C++ change would be new and delete inside kernels.

muzux2 · Sep 30, 2009

I think GT300 packs an additional SM (streaming multiprocessor) per TPC, 4 up from 3 in GT200, and has 16 TPCs, up from 10 in GT200.

G80 had 8 TPCs with 2 SMs inside each TPC. Each SM contains 8 SPs, so there are 16 SPs in each TPC, for a total of 128 SPs.

8(TPC) * (8 * 2) = 128

GT200 had 10 TPCs with 3 SMs inside each TPC. Each SM contains 8 SPs, so there are 24 SPs in each TPC, for a total of 240 SPs.

10(TPC) * (8 * 3) = 240

Now, according to the BSN rumour, I think GT300 will have 16 TPCs with 4 SMs inside each TPC. With each SM containing 8 SPs, that gives 32 SPs in each TPC, for a total of 512 SPs.

16 (TPC) * (8 * 4) = 512
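
The three products above can be checked with a small sketch. The `GpuLayout` type and the constant names are made up for this example, and the GT300 row is only the guess from this post, not a confirmed layout.

```cpp
#include <cassert>

// Hypothetical helper type: the TPC / SM / SP hierarchy used in the post.
struct GpuLayout {
    int tpcs;         // texture processing clusters on the chip
    int sms_per_tpc;  // streaming multiprocessors per TPC
    int sps_per_sm;   // scalar processors per SM
};

// Total SP count is simply the product of the three levels.
constexpr int total_sps(GpuLayout g) {
    return g.tpcs * g.sms_per_tpc * g.sps_per_sm;
}

constexpr GpuLayout kG80{8, 2, 8};
constexpr GpuLayout kGT200{10, 3, 8};
constexpr GpuLayout kGT300Guess{16, 4, 8};  // speculation from the rumour

static_assert(total_sps(kG80) == 128, "G80 total");
static_assert(total_sps(kGT200) == 240, "GT200 total");
static_assert(total_sps(kGT300Guess) == 512, "rumoured GT300 total");
```

Note that the total alone does not pin down the layout: `GpuLayout{16, 1, 32}` also multiplies out to 512, so the rumoured SP count is compatible with more than one arrangement.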

GT300 might be GT200 on steroids... I may be wrong

trinibwoy · Sep 30, 2009

It's really shaping up like Nvidia built Larrabee while Intel was talking about building it. I'm itching to know what fixed function stuff they might have gotten rid of, or if there are any changes to the rendering pipeline.

muzux2 said:
GT300 might be GT200 on steroids... I may be wrong

According to Rys, you are

Scali · Sep 30, 2009

MfA said:
Aren't those required for DX11?

Didn't think so... but I'm talking about current hardware here.

Ailuros · Sep 30, 2009

MfA said:
It's not really a big deal, hell there are C++ to C translators ... although I fail to see the point of Fortran apart from the warm and fuzzy feeling the name generates. It's not like even porting legacy code to it is an option; the level of algorithmic changes needed to suit a GPU makes a rewrite the only realistic option. Fortran doesn't seem to me to be a great language to write kernels in.

PS. http://www.pgroup.com/resources/cudafortran.htm

It might be a big deal for the lazy ones out there *shrugs*

Scali · Sep 30, 2009

MfA said:
It's not really a big deal, hell there are C++ to C translators ...

I don't think those will work, since 'C for Cuda' isn't fully ANSI C. C++ can only be translated to C if it supports all features... like I said, function pointers are key to the object model.

Bouncing Zabaglione Bros. · Sep 30, 2009

trinibwoy said:
It's really shaping up like Nvidia built Larrabee while Intel was talking about building it. I'm itching to know what fixed function stuff they might have gotten rid of, or if there are any changes to the rendering pipeline.

If it's true, it looks like Nvidia went GPU->CPU while Intel are trying to do CPU->GPU, i.e. both are trying to solve the same problems from opposite starting points. It's certainly a very ambitious approach and a big step towards convergence of GPU/CPU.

I guess all those questions about Nvidia not having an x86 licence are kind of moot if you can talk to the new chip via a compiler the same way as you talk to any CPU.

Arty · Sep 30, 2009

Bouncing Zabaglione Bros. said:
Is this the first big leak? Sounds impressive on paper, though I do wonder if all the focus on GPGPU means that the gaming side of things will be taking a backseat.

Very typical of Theo, convenient how both hardware-Infos & bsn come out with this 'exclusive' 'breaking' news story AFTER Rys' hint

By the way, when is the webcast? (est)

Ailuros · Sep 30, 2009

Arty said:
Very typical of Theo, convenient how both hardware-Infos & bsn come out with this 'exclusive' 'breaking' news story AFTER Rys' hint

By the way, when is the webcast? (est)

There's a huge difference between an educated hw analysis and being first and 2nd at nothing.

KimB · Sep 30, 2009

MfA said:
It's not really a big deal, hell there are C++ to C translators ... although I fail to see the point of Fortran apart from the warm and fuzzy feeling the name generates. It's not like even porting legacy code to it is an option; the level of algorithmic changes needed to suit a GPU makes a rewrite the only realistic option. Fortran doesn't seem to me to be a great language to write kernels in.

PS. http://www.pgroup.com/resources/cudafortran.htm

Quite a lot of scientific work is still done in Fortran, and though it is possible to get Fortran and C/C++ to play together, it can be difficult to get the mixed code to compile and link properly. So having a native Fortran version of Cuda could be a boon for getting it adopted within the scientific community.
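
As a rough sketch of why mixing the two is awkward, here is what the C++ side of such a call tends to look like. The routine `scale_matrix_` is hypothetical and is written in C++ here only so the example builds on its own; the trailing underscore, by-reference arguments and column-major storage are conventions assumed from common Fortran compilers, not something every toolchain guarantees.

```cpp
#include <cassert>
#include <vector>

// Fortran stores matrices column by column, so element (i, j) of a
// rows-by-cols matrix lives at this flat offset (0-based here).
inline int col_major(int i, int j, int rows) { return i + j * rows; }

// Stand-in for a Fortran routine
//   subroutine scale_matrix(rows, cols, factor, a)
// Assumed conventions: lowercase name with trailing underscore, and every
// argument passed by reference, scalars included.
extern "C" void scale_matrix_(const int* rows, const int* cols,
                              const double* factor, double* a) {
    for (int j = 0; j < *cols; ++j)      // column index outermost
        for (int i = 0; i < *rows; ++i)
            a[col_major(i, j, *rows)] *= *factor;
}

// C++-friendly wrapper that hides the by-reference calling convention.
inline void scale_matrix(std::vector<double>& a, int rows, int cols,
                         double factor) {
    scale_matrix_(&rows, &cols, &factor, a.data());
}
```

Every one of those conventions has to match between the two compilers before the program even links, which is the kind of friction a native Fortran front end would remove.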

Scali · Sep 30, 2009

Arty said:
By the way, when is the webcast? (est)

Keynote is at 1 PM PT, which I assume is webcast live, see here for more info:
http://www.nvidia.com/object/gpu_technology_conference.html

trinibwoy · Sep 30, 2009

Arty said:
By the way, when is the webcast? (est)

4pm EST / 1pm PST

http://www.nvidia.com/object/gpu_technology_conference.html

3dilettante · Sep 30, 2009

trinibwoy said:
Looks like that's exactly what they're trying to do. Strange that there's no mention of any graphics specific bits so far. Not saying there aren't any but the focus seems to have veered sharply away from graphics.

The Rys blur-o-gram had a lot of the same colors in the area that the GT200 one had for shader and triangle setup. It looks like there's still some kind of texture block.
The compute portion appears to be heavily reworked, and the area that was the ROP section is still there, but I can't infer much from a gray (oddly dark gray...) smudge.

If the setup, texturing, and ROP specialized sections persist, the Fermi architecture would be the answer to the question "what if we made Larrabee without x86, and gave it ROPs and a rasterizer?"
The next question would be, "what if we built Larrabee with an inferior process", but I digress.

That's true, but the same could be said for G71->G80 which was an even bigger change. Though they are trying to do more stuff now which could have put a strain on resources.

The rumors seem to reflect that the birthing process for this new chip could have been smoother.

It's probably safe to assume that if they're serious about computing, performance of atomics would have been high on their todo list. Side question - are the existing caches on GPUs generally useful for non-texture data (not referring to the specialized caches like PTVC)?

I'm not sure.
They are pretty small, and they are structured to provide peak bandwidth for the common case of filtered texture fetches.
I'm not sure how much of their behavior changes if they are tasked with linearly addressed memory. If the data is structured to make the most of them, then their bandwidth can be used.
Their size and read-only nature make them less than generally useful.

MfA · Sep 30, 2009

Arty said:
Very typical of Theo, convenient how both hardware-Infos & bsn come out with this 'exclusive' 'breaking' news story AFTER Rys' hint

Wouldn't that be NVIDIA's hint? (If you're under NDA you tell what you are told you can tell.)

dnavas · Sep 30, 2009

Rys said:
All will be revealed later today anyway, not long to go now.

Wow, no wonder the parking lot was full late last night! I thought we had a couple of weeks to go. I wonder if they brought the demo forward for competitive reasons?

Also, I realize Rys says everything has changed, but:
1) 16kb per 8-wide set of SPs does actually work out
2) I like that there are four blue dots and four sets of SPs in there
3) I wonder what bits in the chip run the C++ code
4) If DP runs half-speed, they really have done some work in there.
5) Isn't it great that each SP can run an instruction per clock per thread? Why, all I have to do to increase performance is add more threads! Infinite TFlops!
