Closed Thread
Old 30-Sep-2009, 18:22   #2776
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

Quote:
Originally Posted by DegustatoR View Post
I fail to see any correlation between GT218 and DX11.
They both need a 40nm process and it's "safe" for IHVs to build an "easy" chip on a new process before attempting a behemoth.

Quote:
And it was late because of TSMC not NVIDIA. Which raises the question of who's to blame for its power characteristics also.
You think NVidia was entirely blameless?

Jawed
__________________
Can it play WoW?
Old 30-Sep-2009, 18:27   #2777
Mintmaster
Senior Member
 
Join Date: Mar 2002
Posts: 3,779

Quote:
Originally Posted by FUDie View Post
Except that increasing engine clock 9% alone was enough to gain 5%. Increasing memory clocks by 9% as well couldn't get you more than an additional 4%, so engine clock has more impact than memory clock.
That doesn't prove nAo wrong. I did analysis of these games with RV770, and it has even less dependence on BW for Crysis. Crysis is a bad game to evaluate this, too, as the timedemos/walkthroughs that most reviewers use definitely have some parts that are CPU limited.
Old 30-Sep-2009, 18:31   #2778
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

Quote:
Originally Posted by trinibwoy View Post
Oh, ok, though I'm not sure if the register file and/or shared memory runs at the hot clock. For one thing, results from the pipeline are written 16 at a time which implies some sort of buffering.
That's due to banking. RF and SM are both twice as wide as the MAD SIMD.

Jawed
__________________
Can it play WoW?
Old 30-Sep-2009, 18:32   #2779
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809

Quote:
Originally Posted by Jawed View Post
That's due to banking. RF and SM are both twice as wide as the MAD SIMD.

Jawed
Yeah I know, but I took the RF and SM clocks to be 600MHz (for GTX280). Not sure what's the right way to calculate it.
__________________
What the deuce!?
Old 30-Sep-2009, 18:33   #2780
FUDie
Member
 
Join Date: Sep 2002
Posts: 559

Quote:
Originally Posted by Mintmaster View Post
That doesn't prove nAo wrong. I did analysis of these games with RV770, and it has even less dependence on BW for Crysis. Crysis is a bad game to evaluate this, too, as the timedemos/walkthroughs that most reviewers use definitely have some parts that are CPU limited.
If you gain 8% from a 9% increase in engine and memory clocks, how can you claim it's CPU limited?

-FUDie
__________________
Ph.D. - Piled Higher and Deeper
Old 30-Sep-2009, 18:41   #2781
dnavas
Member
 
Join Date: Apr 2004
Posts: 326

Quote:
Originally Posted by Jawed View Post
That is interesting -- and I wonder which instructions dominate in the SFU as well. Might not be DIV at all. And to nao's point, not a lot of FMAs either. It would be instructive, I would think, to also know the breakdown of MUL vs. ADD. Clearly, of the programs they ran, the DIV:MUL ratio is less than the 1/2 I estimated (from FMA alone), but now I'm curious
Old 30-Sep-2009, 18:43   #2782
Dave Baumann
Gamerscore Wh...
 
Join Date: Jan 2002
Posts: 12,950

Quote:
Originally Posted by Mintmaster View Post
I did analysis of these games with RV770, and it has even less dependence on BW for Crysis. Crysis is a bad game to evaluate this, too, as the timedemos/walkthroughs that most reviewers use definitely have some parts that are CPU limited.
Curiously, the analysis that we did for RV790 clock settings said that Crysis (or Crysis Warhead - forget which) was one of the few titles that gained with more bandwidth on this arch.

Generally speaking though, internal testing on Cypress has indicated similar findings as the FS overclocking does - it benefits more from engine speed than, at least, I expected.
__________________
Expand. Accelerate. Dominate.
Tweet Tweet!
Old 30-Sep-2009, 18:58   #2783
Mintmaster
Senior Member
 
Join Date: Mar 2002
Posts: 3,779

Quote:
Originally Posted by FUDie View Post
If you gain 8% from a 9% increase in engine and memory clocks, how can you claim it's CPU limited?

-FUDie
I said parts are CPU (or PCI-e) limited, and your numbers are wrong, too. He overclocks the GPU by 9.4% and mem by 12.5%. He gets 7.2-8.5% gain, depending on resolution and settings. 100% GPU limited would have given over 10%.
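
For what it's worth, that arithmetic can be put into a rough Amdahl-style sketch. It assumes the frame time splits into a part that scales perfectly with clock and a part that doesn't scale at all, and it takes the 9.4% engine overclock as the clock gain for the scaling part (the memory overclock was larger, so this is the conservative choice). The name `scaling_fraction` is made up for illustration.

```python
def scaling_fraction(observed_gain, clock_gain):
    """Estimate the share f of frame time that scales with clock.

    Solves speedup = 1 / ((1 - f) + f / (1 + clock_gain)) for f.
    Both arguments are fractions, e.g. 0.094 for a 9.4% overclock.
    """
    speedup = 1.0 + observed_gain
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / (1.0 + clock_gain))


# Observed gains quoted above: 7.2% to 8.5% from a 9.4% engine overclock.
for gain in (0.072, 0.085):
    share = scaling_fraction(gain, 0.094)
    print(f"gain {gain:.1%}: about {share:.0%} of frame time scales with clock")
```

Under those assumptions roughly 78% to 91% of the frame time scales with the overclock, so some part of the run is limited by something other than the GPU.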

Anyway, that's all beside the point. nAo is right in saying RV870 is more BW limited than RV770. Look at this graph:
http://www.firingsquad.com/hardware/...mages/lpoc.gif
Old 30-Sep-2009, 18:58   #2784
SimBy
Member
 
Join Date: Jun 2008
Posts: 184

Quote:
Originally Posted by Scali View Post
Yea, today's keynote is a key point for me
Will they give the final specs? Will there be actual working silicon? Will they mention a launch date?

I feel like I'm in a black hole in terms of upgrading right now. nVidia doesn't have DX11 yet, and not sure when they'll have it, or how good it's going to be.... AMD doesn't have OpenCL and DirectCompute yet, and no clue about when they'll deliver that.
I'm only going to upgrade to a card that does DX11 AND OpenCL/DirectCompute, out of the box (my 8800GTS already does DX10 and OpenCL/DirectCompute, would be pointless to get a HD5850 now if I still need to use my 8800GTS for software development, which is my primary use for the card).
Will be interesting to see who's first to offer me what I want.
What do you mean by AMD doesn't have DirectCompute and OpenCL?!

Not only does HD 5000 series support both, it's actually faster by a fair amount compared to nV GTX series.

http://www.anandtech.com/video/showdoc.aspx?i=3643&p=8

That's the nV Ocean demo for DX Compute.
Old 30-Sep-2009, 19:03   #2785
CouldntResist
Member
 
Join Date: Aug 2004
Posts: 244

Quote:
Originally Posted by Scali View Post
I think the biggest issue is that of using function pointers. A lot of the object-oriented features of C++ are implemented through the manipulation of function pointers. As far as I know, they could only branch with a fixed offset so far.
This reminds me of a certain paper on compiler optimisation technology. The idea was to use a novel technique to implement "virtual" (in the C++ sense) method invocations in generated machine code.

The typical way of doing this is to use virtual method tables and indirect branches. In the paper, they didn't generate any indirect branches. Instead, at each call site a tiny binary search tree traversal was generated inline to find the right jump target among a set of precalculated candidates, all done with conditional branches. The goal was to better utilise the CPU's branch prediction resources, which allegedly were underutilised by the typical approach.

This optimisation technique relied on whole-program analysis (as opposed to the dumb linking of separately compiled fragments typical in the C/C++ world) to make these search trees really tiny, or even to eliminate the need for a search altogether (in as many as 90% of virtual method invocations in the tested programs).

If this technique were used, I think you could run fully OO code on a GPU, even now (ignoring SIMD, memory organisation etc. of course).
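
Roughly what the generated code at one call site might look like, as a minimal sketch. Python is used only for readability; the type ids, the four shape classes and the function names are hypothetical, and a real compiler would emit compare-and-branch instructions plus direct calls rather than Python.

```python
import math

# Hypothetical type ids that a whole-program compiler might assign
# after working out which classes can reach this call site.
CIRCLE, SQUARE, TRIANGLE, HEXAGON = range(4)


def area_circle(size):
    return math.pi * size * size


def area_square(size):
    return size * size


def area_triangle(size):
    # equilateral triangle with side length `size`
    return math.sqrt(3) / 4 * size * size


def area_hexagon(size):
    # regular hexagon with side length `size`
    return 3 * math.sqrt(3) / 2 * size * size


def area(type_id, size):
    """Stand-in for the code emitted inline at one virtual call site.

    There is no table lookup and no indirect call: the target is found
    by a two-level binary search over the sorted type ids, using only
    conditional branches, and every call names its target directly.
    """
    if type_id <= SQUARE:
        if type_id == CIRCLE:
            return area_circle(size)
        return area_square(size)
    if type_id == TRIANGLE:
        return area_triangle(size)
    return area_hexagon(size)


print(area(SQUARE, 3), area(CIRCLE, 1))
```

When the analysis finds a single possible receiver class, the whole tree collapses into one direct call, which is the "no search at all" case mentioned above.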
Old 30-Sep-2009, 19:04   #2786
Mintmaster
Senior Member
 
Join Date: Mar 2002
Posts: 3,779

Quote:
Originally Posted by Dave Baumann View Post
Generally speaking though, internal testing on Cypress has indicated similar findings as the FS overclocking does - it benefits more from engine speed than, at least, I expected.
I think most people expect BW to make more of a difference than it actually does. Take a look at my findings here:
http://forum.beyond3d.com/showthread.php?t=48761

In most games, the 4850 is BW limited for less than 30% of the frame time.
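
To put a number on that, here's a minimal sketch of an Amdahl-style estimate. It assumes (my simplification, not something taken from the linked thread) that the BW-limited share of the frame speeds up in proportion to memory clock and the rest doesn't change at all. `expected_gain` is just an illustrative name, and the 10% memory overclock is an example figure.

```python
def expected_gain(bw_limited_fraction, mem_clock_gain):
    """Overall frame-rate gain when only the BW-limited share speeds up.

    Both arguments are fractions, e.g. 0.30 for 30% of frame time
    and 0.10 for a 10% memory overclock.
    """
    new_time = (1.0 - bw_limited_fraction) + bw_limited_fraction / (1.0 + mem_clock_gain)
    return 1.0 / new_time - 1.0


# 30% of frame time BW limited, 10% memory overclock (illustrative numbers).
print(f"{expected_gain(0.30, 0.10):.1%}")
```

With those inputs the gain comes out just under 3%, which is why a memory overclock on its own tends to look underwhelming.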
Old 30-Sep-2009, 19:41   #2787
DegustatoR
Senior Member
 
Join Date: Mar 2002
Location: msk.ru/spb.ru
Posts: 1,311

Quote:
Originally Posted by Arty View Post
That's hardly an excuse, AMD didn't suffer as much so it comes down to NV's design.
So RV740 availability 6 months after it was announced means that AMD didn't suffer as much, eh? And its price parity with the 4850 surely means the same thing?

Quote:
Originally Posted by Jawed View Post
They both need a 40nm process and it's "safe" for IHVs to build an "easy" chip on a new process before attempting a behemoth.
It's not "safe", it's easier, less risky and it's done so that later you won't make the same mistakes with a bigger chip.

Quote:
Originally Posted by Jawed View Post
You think NVidia was entirely blameless?
I don't have enough info for any conclusions right now.
And it puzzles me when I see someone who apparently does.
It puzzles me even more to read about G300 delays while the initially planned launch timeframe hasn't even passed yet.
All we have right now is a delay of GT21x series for which TSMC is the one to blame. That's all. How it'll end up with GT21x power/price/performance and GF100 we'll eventually see.
Old 30-Sep-2009, 19:58   #2788
Scali
Naughty Boy!
 
Join Date: Nov 2003
Posts: 2,127

Quote:
Originally Posted by SimBy View Post
What do you mean by AMD doesn't have DirectCompute and OpenCL?!

Not only does HD 5000 series support both, it's actually faster by a fair amount compared to nV GTX series.

http://www.anandtech.com/video/showdoc.aspx?i=3643&p=8

That's the nV Ocean demo for DX Compute.
DirectCompute works, but only on the HD5800-series.
OpenCL isn't supported yet.
__________________
ZX81 -> C64 -> Hercules -> Plantronics CGA -> Paradise VGA -> Amiga ECS -> Amiga AGA -> Cirrus Logic 5428 VLB -> S3 Trio64 -> Matrox Mystique -> PCX2 -> Matrox G200 -> Matrox G450 -> GeForce2 GTS -> Kyro II -> Radeon 8500 -> Radeon 9600XT -> GeForce 7600GT -> GeForce 8800GTS -> HD5770
Old 30-Sep-2009, 20:04   #2789
Andrew Lauritzen
AndyTX
 
Join Date: May 2004
Location: British Columbia, Canada
Posts: 1,841

Quote:
Originally Posted by Scali View Post
DirectCompute works, but only on the HD5800-series.
By my testing, DirectCompute doesn't work at all on any NVIDIA parts yet (maybe it does on Win7?)... not even DirectCompute 4. AMD is clearly a step ahead here with DirectCompute 5 working even on Vista.
__________________
The content of this message is my personal opinion only.
Old 30-Sep-2009, 20:08   #2790
apoppin
Member
 
Join Date: Feb 2006
Location: Hi Desert SoCal
Posts: 255

Quote:
Originally Posted by Scali View Post
Yea, today's keynote is a key point for me
Will they give the final specs? Will there be actual working silicon? Will they mention a launch date?

I feel like I'm in a black hole in terms of upgrading right now. nVidia doesn't have DX11 yet, and not sure when they'll have it, or how good it's going to be.... AMD doesn't have OpenCL and DirectCompute yet, and no clue about when they'll deliver that.
I'm only going to upgrade to a card that does DX11 AND OpenCL/DirectCompute, out of the box (my 8800GTS already does DX10 and OpenCL/DirectCompute, would be pointless to get a HD5850 now if I still need to use my 8800GTS for software development, which is my primary use for the card).
Will be interesting to see who's first to offer me what I want.
the Jensen Keynote will be live on Nvidia.com at 1 PM; i am here at the GTC now
- they say about 1/3rd of it requires 3D glasses .. so you know it will be a lot of 3D

what i am interested in is the PRESS conference after the keynote; it is at 2:45 PM
- i expect a lot more to be revealed then

The Fairmont Hotel, San Jose is such a cool place for a technology conference; Nvidia has an entire floor for it .. and (best of all, good) food is free for the press
__________________
Cum odio sui coepit veritas. Simul atque apparuit, inimca est. --
Tertullian, Apologeticus(VII, 3)
Old 30-Sep-2009, 20:16   #2791
Jawed
Regular
 
Join Date: Oct 2004
Location: London
Posts: 9,863

Quote:
Originally Posted by Mintmaster View Post
I think most people expect BW to make more of a difference than it actually does. Take a look at my findings here:
http://forum.beyond3d.com/showthread.php?t=48761

In most games, the 4850 is BW limited for less than 30% of the frame time.
Well it turns out that 512MB is too little for a lot of games - the assessment of bandwidth limitation needs to be done with a 1GB card.

Jawed
__________________
Can it play WoW?
Old 30-Sep-2009, 20:20   #2792
nexus_alpha
Junior Member
 
Join Date: Nov 2006
Posts: 44

Specs from Bright side of news

3.0 billion transistors
40nm TSMC
384-bit memory interface
512 shader cores [renamed into CUDA Cores]
32 CUDA cores per Shader Cluster
1MB L1 cache memory [divided into 16KB Cache - Shared Memory]
768KB L2 unified cache memory
Up to 6GB GDDR5 memory
Half Speed IEEE 754 Double Precision

By comparison, ATI RV870:

20 SIMDs
16 KB L1 cache per SIMD = 320 KB texture cache
8 KB L1 cache per SIMD for computational work = 160 KB computational cache

32 KB local data share L1 cache per SIMD = 640 KB local data share cache

128 KB L2 cache per memory controller = 512 KB L2 cache

L1 cache speed: 1 terabyte per second

L2-L1 cache speed: 435 GB per second
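
A quick sanity check of those totals, as a sketch. It only multiplies out the figures quoted in this post; the four memory controllers and the 16 Shader Clusters are inferred from those figures (512/128 and 512/32), not stated by the source.

```python
# RV870 figures as quoted above, all in KB.
simds = 20
per_simd_kb = {
    "texture L1": 16,
    "compute L1": 8,
    "local data share": 32,
}
rv870_totals_kb = {name: simds * kb for name, kb in per_simd_kb.items()}

l2_per_controller_kb = 128
l2_total_kb = 512
memory_controllers = l2_total_kb // l2_per_controller_kb  # inferred, not quoted

# Fermi figures as quoted above.
cuda_cores = 512
cores_per_cluster = 32
clusters = cuda_cores // cores_per_cluster  # inferred, not quoted
l1_per_cluster_kb = 1024 // clusters  # 1 MB of L1 split evenly across clusters

for name, total in rv870_totals_kb.items():
    print(f"RV870 {name}: {total} KB")
print(f"RV870 memory controllers implied by the L2 figures: {memory_controllers}")
print(f"Fermi clusters: {clusters}, L1/shared per cluster: {l1_per_cluster_kb} KB")
```

The per-SIMD totals match the numbers in the list, and the Fermi L1 figure works out to 64 KB per cluster if the 1 MB is split evenly.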
Old 30-Sep-2009, 20:20   #2793
Scali
Naughty Boy!
 
Join Date: Nov 2003
Posts: 2,127

Quote:
Originally Posted by Andrew Lauritzen View Post
By my testing, DirectCompute doesn't work at all on any NVIDIA parts yet (maybe it does on Win7?)... not even DirectCompute 4. AMD is clearly a step ahead here with DirectCompute 5 working even on Vista.
DirectCompute 4 works out-of-the-box on Win7 with 190-release drivers.
It works on Vista as well, but by default it is disabled through a registry key.
The release notes of the GPU Computing SDK tell you how to enable it (that's where Anandtech got the Ocean demo from).
So looks like nVidia is ahead. They've had support on release drivers for a while, and I don't think we need to compare the installed base
__________________
ZX81 -> C64 -> Hercules -> Plantronics CGA -> Paradise VGA -> Amiga ECS -> Amiga AGA -> Cirrus Logic 5428 VLB -> S3 Trio64 -> Matrox Mystique -> PCX2 -> Matrox G200 -> Matrox G450 -> GeForce2 GTS -> Kyro II -> Radeon 8500 -> Radeon 9600XT -> GeForce 7600GT -> GeForce 8800GTS -> HD5770

Last edited by Scali; 30-Sep-2009 at 20:42.
Old 30-Sep-2009, 20:37   #2794
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809

Quote:
Originally Posted by Andrew Lauritzen View Post
By my testing, DirectCompute doesn't work at all on any NVIDIA parts yet (maybe it does on Win7?)... not even DirectCompute 4. AMD is clearly a step ahead here with DirectCompute 5 working even on Vista.
How did they write/show the DirectCompute wave demo in that case?
__________________
What the deuce!?
Old 30-Sep-2009, 20:39   #2795
3dilettante
Senior Member
 
Join Date: Sep 2003
Location: Well within 3d
Posts: 4,121

If the BSN specs are accurate, then the L2 size alone indicates that is one part of Larrabee's design that wasn't copied.
A scheme similar to Larrabee's (in particular the tiling) would need the capacity such an L2 affords, and the L2 given isn't much bigger than that of Cypress.
__________________
Dreaming of a .065 micron etch-a-sketch.
Old 30-Sep-2009, 20:50   #2796
Andrew Lauritzen
AndyTX
 
Join Date: May 2004
Location: British Columbia, Canada
Posts: 1,841

Quote:
Originally Posted by Scali View Post
It works on Vista as well, but by default it is disabled through a registry key.
Why on earth would they do that? The rest of "feature level 10" on DX11 interfaces seems to work fine... silly decision IMHO, but thanks for the pointer. I'll go look into that.

Quote:
Originally Posted by Scali View Post
So looks like nVidia is ahead. They've had support on release drivers for a while, and I don't think we need to compare the installed base
Huh? You're arguing that ComputeShader 4 support on G80+ HW with a registry key setting somehow puts them "ahead" of ATI's full ComputeShader 5 implementation that works "out of the box" on their latest hardware? From a developer's point of view, you and I have different definitions of "ahead"...

No point in arguing though, the key point is that I can write CS5 code right now on AMD parts, with no ETA on when I can do that on NVIDIA. This puts AMD as the obviously more useful piece of hardware at my disposal right now
__________________
The content of this message is my personal opinion only.
Old 30-Sep-2009, 20:51   #2797
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,767

GF100 vs. Cypress performance comparison
__________________
People are more violently opposed to fur than leather; because it's easier to harass rich ladies than motorcycle gangs.
Old 30-Sep-2009, 20:52   #2798
madyasiwi
Member
 
Join Date: Oct 2008
Posts: 107

http://www.nvidia.com/object/fermi_architecture.html
Old 30-Sep-2009, 20:55   #2799
fellix
Senior Member
 
Join Date: Dec 2004
Location: Varna, Bulgaria
Posts: 2,819

From this Russian site, one curious line:

There is no hardware tessellation unit; this functionality will be implemented in software.

__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic.
Microsoft: Russia -- Big and bloated.
Linux: EU -- Diverse and broke.

Tags
nvidia, speculation


All times are GMT +1. The time now is 09:15.

