Old 04-Dec-2003, 01:43   #1
g__day
Senior Member
 
Join Date: Jun 2002
Location: Sydney Australia
Posts: 575
Default What are the key tradeoffs ATi may be considering with R420?

Rather than speculate exactly what R420 may deliver - I just wondered what folk here see as the main competing priorities in how ATi will allocate its transistor budget for R420 - and the pros and cons of each choice.

I imagine there are many ways to decide where your transistor budget goes: more pipelines, deeper pipelines, fatter pipelines, and the number of stages and execution units in your pipelines are amongst the things you could trade off.

If PS 3.0 requires fp32 in the pixel pipeline, maybe fp32 in the pixel shaders will be given more priority from a marketing perspective - even though no games using these features might come out until R420 is heading towards obsolescence - as is usual for new generations of cards and the features they introduce.

What do folk believe is the priority list for allocation of additional transistors and how many transistors overall do we believe R420 may be - 140M, 150M, 180M?

I'd say more and faster pipelines are a sure thing, 12 * 1 maybe, and maybe more execution units per pipeline - and I am wondering if it will be R420 or R500 that delivers true fp32 in pixel shaders that can actually perform at decent levels.
g__day is offline   Reply With Quote
Old 04-Dec-2003, 02:10   #2
Tahir2
Itchy
 
Join Date: Feb 2002
Location: United Queendom
Posts: 2,859
Default

I don't think it will be a 12x1 architecture.
I believe DirectX9.0b requires *at least* FP24 support and so ATI will continue with that level of precision for the R420.
__________________
Time is an illusion. Lunchtime doubly so - Douglas Adams
Tahir2 is offline   Reply With Quote
Old 04-Dec-2003, 03:04   #3
Tim Murray
chaos dunk
 
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
Default

I always thought ps_3_0 required FP24 rather than FP32.

R420 is FP24. R500 is FP32.
Tim Murray is offline   Reply With Quote
Old 04-Dec-2003, 03:48   #4
Megadrive1988
Senior Member
 
Join Date: May 2002
Posts: 4,310
Default

as for pipeline configuration..........

A.) 16*1
B.) 12*1
C.) 8*2
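
For reference, a rough sketch of what those labels imply per clock, assuming the usual reading where the first number is pixel pipelines and the second is texture units per pipeline. The function name and the dictionary layout are made up for illustration, and clock speed is left out, so these are per-clock peak figures only.

```python
# Per-clock peak rates implied by a "pipes x TMUs" label.
# Assumption: first number = pixel pipelines, second = texture units per pipeline.
def per_clock_rates(pipes, tmus_per_pipe):
    return {
        "pixels_per_clock": pipes,
        "texels_per_clock": pipes * tmus_per_pipe,
    }

# The three configurations listed above.
configs = {"16x1": (16, 1), "12x1": (12, 1), "8x2": (8, 2)}

for label, (pipes, tmus) in configs.items():
    rates = per_clock_rates(pipes, tmus)
    print(label, rates["pixels_per_clock"], "pixels/clk,",
          rates["texels_per_clock"], "texels/clk")
```

On paper 16x1 and 8x2 share the same peak texel rate and differ only in pixel rate, which is why the choice between them says more about shader throughput than about texturing.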
Megadrive1988 is offline   Reply With Quote
Old 04-Dec-2003, 04:30   #5
Xmas
Off-season
 
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
Default

PS3.0 precision requirement is still FP24, despite contrary rumors.
Xmas is offline   Reply With Quote
Old 04-Dec-2003, 04:54   #6
jvd
Naughty Boy!
 
Join Date: Feb 2002
Location: new jersey
Posts: 12,731
Default

what no 12 x 3 ???
__________________
Freexbox 360 !!!
Free Psp!
jvd is offline   Reply With Quote
Old 04-Dec-2003, 05:07   #7
Tim Murray
chaos dunk
 
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
Default

It has eleven and three-fourths pipelines, just like the PS2. (Honestly, it's probably a 12x1 or 16x1. That currently depends on whom you ask at the moment--nobody really knows.)
Tim Murray is offline   Reply With Quote
Old 04-Dec-2003, 06:13   #8
Arun
Unknown.
 
Join Date: Aug 2002
Location: UK
Posts: 4,883
Default

Quote:
Originally Posted by The Baron
That currently depends on whom you ask at the moment--nobody really knows.)
If only we had a fricking transistor count figure! *sigh*
But ATI employees are most likely not only happier than NVIDIA's, they're also an awful lot more loyal. NVIDIA's technique against leaks is employee termination, while ATI's is misinformation and insisting to employees that leaks just damage the company and nothing else.

ATI is stopping leaks better by asking employees nicely than NVIDIA, who's threatening them with termination.

NVIDIA's Information Security department is a spy film ripoff at best, beginning their questions with the good ole: "Where were you at this hour, on this day?"
But I guess that doesn't mean they don't do a pretty good job. ATI's doing a better one though, and without all this useless BS.

Anyway, on topic: Even though the R420 should, in theory, support VS3.0/PS3.0 - its goal is to have astonishingly good VS2.0/PS2.0 speed, not simply "okay" 3.0 speed. So I suspect some features might be implemented in a rather simple way, but hey, that's always better than having to do it on the CPU


Uttar
Arun is offline   Reply With Quote
Old 05-Dec-2003, 01:26   #9
Hyp-X
Irregular
 
Join Date: Feb 2002
Posts: 1,170
Default

Well, I expect:
- roughly the same texturing speed (per clock)
- the same speed in VS but with full VS3.0 support.
- 2x the speed in PS arithmetic ops (per clock)
- full PS3.0 support but relatively low performance when dynamic branches are used
- FP24 in PS
- same AA and AF algorithm and texturing compromises
Hyp-X is offline   Reply With Quote
Old 05-Dec-2003, 01:38   #10
ClyssaN
Member
 
Join Date: Jan 2003
Location: Portugal
Posts: 131
Default

MuFu has disappeared...
__________________
ANIMATED ALGORITHMS IN JAVA
http://gdias.visaodigital.pt/index2.php
ClyssaN is offline   Reply With Quote
Old 05-Dec-2003, 02:55   #11
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,830
Default

So far ATI has shown IMO that their basic design philosophy with the R3xx family was to save as many transistors as possible. The real question is whether the same design philosophy has made its way into upcoming product families too or not. Somehow I recall an official statement that could suggest a significant change in said philosophy...
Ailuros is offline   Reply With Quote
Old 05-Dec-2003, 03:05   #12
g__day
Senior Member
 
Join Date: Jun 2002
Location: Sydney Australia
Posts: 575
Default

Hyp-X - 2 times the speed in Pixel Shading - are you thinking a 12 * 1 pipeline architecture with a 500MHz - 600MHz core speed on 0.13 micron to give you that twice the speed? Actually climbing from a 400MHz to a 535MHz core * 50% more well utilised pipes would get you around the right ball park, provided you could switch those chips fast enough given the capacitance of all those parallel circuits. That might seem reasonable to me.
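
To spell out that back-of-envelope sum: a minimal sketch, assuming peak pixel shader throughput scales linearly with both core clock and pipeline count, and that the work done per pipe per clock stays the same. The function name is made up for illustration; the 8 and 12 pipe counts are the 8 x 1 and 12 x 1 configurations discussed in this thread.

```python
# Relative peak pixel shader throughput ~ clock ratio * pipeline ratio.
# Assumption: per-pipe work per clock is unchanged and all pipes stay busy.
def relative_throughput(old_mhz, new_mhz, old_pipes, new_pipes):
    return (new_mhz / old_mhz) * (new_pipes / old_pipes)

# 400MHz -> 535MHz core, 8 -> 12 pipes (50% more).
speedup = relative_throughput(old_mhz=400, new_mhz=535, old_pipes=8, new_pipes=12)
print(f"{speedup:.2f}x")
```

That lands at roughly 2x, but only as a peak figure: it says nothing about memory bandwidth or how well the extra pipes are actually kept fed.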

How many transistors does each fp24 pixel shading pipeline roughly require? A move from 8 x 1 to 12 x 1 would take roughly how much of your transistor budget?
g__day is offline   Reply With Quote
Old 05-Dec-2003, 03:23   #13
Dave Baumann
Gamerscore Wh...
 
Join Date: Jan 2002
Posts: 12,989
Default

Quote:
Originally Posted by Ailuros
So far ATI has shown IMO that their basic design philosophy with the R3xx family was to save as many transistors as possible.
How so? Surely the design philosophy was "get the job done".

They did have a change in philosophy and that was actually the opposite of what you describe. Previously they had always been working to specific die size constraints; with R300 these constraints were removed (within reason).
__________________
Expand. Accelerate. Dominate.
Tweet Tweet!
Dave Baumann is offline   Reply With Quote
Old 05-Dec-2003, 04:29   #14
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,830
Default

I probably didn't use a very good description of what I actually meant.

Quote:
....We then iterate over all these. Sometimes features are cut, if they aren't "good enough" or if they cost too much (area wise). Or features can evolve and change, if we find better ways of doing things. It sounds more complicated than it is. It's really simply coming up with ideas, and then figuring out exactly how to make them work.
http://www.driverheaven.net/ericdemers/

A simple example would be FP24, while by far not the only one, especially considering that there wasn't much headroom left at 15nm.

Moving to 13nm there's a lot more headroom open to include whatever they had in mind for a next generation design, without facing restrictions to the same degree. Here the way I see it they can either choose a relatively conservative approach or try to hit the absolute maximum that would be possible for the specific manufacturing process. While it may sound like an oxymoron, as one could easily say that 13nm would be "outsourced" just as much as 15nm, if in a hypothetical case they truly have shot for a 200M design more or less I don't see a necessity for "middle of the road" solutions there (at the possible sacrifice of margins of course).

Of course it could be that I'm completely wrong; I wish I could remember who made a "sacrifice of margins" relevant comment in the first place, and where. All of course open to corrections as always
Ailuros is offline   Reply With Quote
Old 05-Dec-2003, 04:46   #15
demalion
Senior Member
 
Join Date: Feb 2002
Location: CT
Posts: 2,024
Default

Well, given the details of per pipeline calculation capabilities, it seems pretty feasible for ATI to explore transistor efficient ways of expanding it. I.e., more, carefully selected for common applicability, instructions supported for supplemental arithmetic ops in a pipe.

I still have an impression that a focus of functionality implementation will be with "interconnect" management focus (F-Buffer and, this name was dropped the other day, V-Buffer). However, I don't know how extensive this focus need be for their target compared to achieving computational power increase, nor how expensive it has to be in transistors. If it is relatively cheap...at least in terms of what is added/replaced overall for implementation...this would leave a lot of the room from process improvement for the pure computation concerns above. Actually, given the strong performance of the existing generation, this would also seem to point towards AF and AA improvements as very likely if a significant transistor count increase is allowed. Unless there is some sort of HyperZ suite improvement that seems important, but is transistor hungry?

Come to think of it, ATI seems to focus on improvements in all aspects concurrently for these steps, though an "either/or" for AA and AF could still satisfy a "Smoothvision 3.0" if either the budget is significantly lower than I expect (200 M at the outside, 180 as a guesspectation) or the performance target is higher than I expect (+50% seems like an upper limit for perception of clear improvement). Not sure about "HyperZ IV", though.

That's with PS/VS 3.0 not being markedly more transistor expensive than I expect tipping the picture downwards, and without any R300 size rabbits in the engineering hat tipping it upwards. Or, of course, both happening and "evening out" :P.
demalion is offline   Reply With Quote
Old 05-Dec-2003, 06:15   #16
akira888
Member
 
Join Date: Jul 2003
Location: Houston
Posts: 652
Default

Quote:
Originally Posted by Ailuros
...13nm..15nm...15nm...
If those processes come out next year we'll all be doing real time Toy Story in no time. :P
akira888 is offline   Reply With Quote
Old 05-Dec-2003, 10:34   #17
Unit01
Member
 
Join Date: Feb 2003
Posts: 190
Default

Quote:
Originally Posted by akira888
Quote:
Originally Posted by Ailuros
...13nm..15nm...15nm...
If those processes come out next year we'll all be doing real time Toy Story in no time. :P
My 3D Prophet 2 - GF 2GTS said it could produce toy story graphics
__________________
I'll think of something later
Unit01 is offline   Reply With Quote
Old 05-Dec-2003, 11:14   #18
Tim
Member
 
Join Date: Mar 2003
Location: Denmark
Posts: 867
Default

Quote:
Originally Posted by akira888
Quote:
Originally Posted by Ailuros
...13nm..15nm...15nm...
If those processes come out next year we'll all be doing real time Toy Story in no time. :P
10 billion transistors clocked at up to 50GHz sounds realistic at 15nm. I don't think real time Toy Story is any challenge whatsoever for these chips.
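
The transistor half of that figure follows from ideal area scaling, sketched below. The assumptions are loud ones: density is taken to grow with the square of the feature-size ratio, and the baseline of 100 million transistors at 150nm is an illustrative round number, not a figure from this thread. The function name is made up, and the sketch does not attempt the 50GHz clock estimate.

```python
# Ideal scaling: transistor density grows with the square of the feature-size ratio.
# Assumption: 100 million transistors at 150nm is an illustrative round baseline.
def scaled_transistors(baseline_count, old_nm, new_nm):
    return baseline_count * (old_nm / new_nm) ** 2

# Same die area, shrunk from 150nm to 15nm: a 10x linear shrink, 100x the density.
estimate = scaled_transistors(baseline_count=100e6, old_nm=150, new_nm=15)
print(f"{estimate / 1e9:.0f} billion")
```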
Tim is offline   Reply With Quote
Old 05-Dec-2003, 11:22   #19
Hyp-X
Irregular
 
Join Date: Feb 2002
Posts: 1,170
Default

Quote:
Originally Posted by g__day
Hyp-X - 2 times the speed in Pixel Shading - are you thinking a 12 * 1 pipeline architecture with a 500MHz - 600MHz core speed on 0.13 micron to give you that twice the speed? Actually climbing from a 400MHz to a 535MHz core * 50% more well utilised pipes would get you around the right ball park, provided you could switch those chips fast enough given the capacitance of all those parallel circuits. That might seem reasonable to me.

How many transistors does each fp24 pixel shading pipeline roughly require? A move from 8 x 1 to 12 x 1 would take roughly how much of your transistor budget?
I don't expect it to be 12x1.
More like 8x1 but with more PS-ops per pipe and/or higher utilization.
I don't think 12x1 is practical or necessary.

As for the number of texturing units it will depend how much extra stuff they want to put in for dynamic shaders.
If you're aware of the nv3x's texldd performance you know what I'm talking about.
Hyp-X is offline   Reply With Quote
Old 05-Dec-2003, 16:00   #20
Chalnoth
 
Join Date: May 2002
Location: New York, NY
Posts: 12,679
Default

Quote:
Originally Posted by Ailuros
A simple example would be FP24, while by far not the only one, especially considering that there wasn't much headroom left at 15nm.
Hehe, you mean 150nm?

Anyway, just as a side note, 10nm is about the physical limit of semiconductor technologies. Due to the physics, you just can't create features smaller, not without using radically different technology. And getting to 10nm will be an incredible challenge, as the behavior of semiconductors on that scale changes dramatically.
Chalnoth is offline   Reply With Quote
Old 05-Dec-2003, 18:47   #21
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,830
Default

Yes I did. And I'm not going to edit the former post. Too many zeros or dots, too much work.
Ailuros is offline   Reply With Quote
Old 05-Dec-2003, 23:10   #22
T2k
Senior Member
 
Join Date: Jun 2002
Location: The Slope & TriBeCa (NYC)
Posts: 2,004
Default

Quote:
Originally Posted by Ailuros
Of course it could be that I'm completely wrong; I wish I could remember who made a "sacrifice of margins" relevant comment in the first place, and where. All of course open to corrections as always
Well, I believe Dave did it on 1st of Nov:

(...)
ATI mentioned to Analysts that for the next cycle that in the high end, and to a lesser extent mainstream, they would be willing to sacrifice some margin in order to gain performance. The implication being that perhaps R420 will be a relatively large chip with numerous pipelines in order to gain that performance.


__________________
by T2k!

Athlon 64 FX-53 | Gigabyte GA-K8NSNXP-939 | Corsair TWINX1024-4000 | Corsair HydroCool200™ Xtreme Water Cooling | ATI RADEON™ X800 XT PE | HP A7217A 24" CRT | Canopus DVRex-RT + an Apple dual G5 1.8GHz
T2k is offline   Reply With Quote
Old 05-Dec-2003, 23:12   #23
T2k
Senior Member
 
Join Date: Jun 2002
Location: The Slope & TriBeCa (NYC)
Posts: 2,004
Default

Quote:
Originally Posted by Hyp-X
Quote:
Originally Posted by g__day
Hyp-X - 2 times the speed in Pixel Shading - are you thinking a 12 * 1 pipeline architecture with a 500MHz - 600MHz core speed on 0.13 micron to give you that twice the speed? Actually climbing from a 400MHz to a 535MHz core * 50% more well utilised pipes would get you around the right ball park, provided you could switch those chips fast enough given the capacitance of all those parallel circuits. That might seem reasonable to me.

How many transistors does each fp24 pixel shading pipeline roughly require? A move from 8 x 1 to 12 x 1 would take roughly how much of your transistor budget?
I don't expect it to be 12x1.
More like 8x1 but with more PS-ops per pipe and/or higher utilization.
I don't think 12x1 is practical or necessary.
Agreed, neither of that...
__________________
by T2k!

Athlon 64 FX-53 | Gigabyte GA-K8NSNXP-939 | Corsair TWINX1024-4000 | Corsair HydroCool200™ Xtreme Water Cooling | ATI RADEON™ X800 XT PE | HP A7217A 24" CRT | Canopus DVRex-RT + an Apple dual G5 1.8GHz
T2k is offline   Reply With Quote
Old 05-Dec-2003, 23:23   #24
nobie
Member
 
Join Date: Nov 2003
Location: Texas, USA
Posts: 353
Default

As far as transistor count goes, I speculate R420 won't have more transistors than NV40 (which is 175m according to Uttar). Maybe ~150m (which would be proportional to the difference between R350 and NV35). I think they'll have trouble getting much more performance than R350 without doing something drastically different in the design, which doesn't seem likely. I'm not trying to spell doom for them though, still have more faith in ATI than Nvidia...
nobie is offline   Reply With Quote
Old 06-Dec-2003, 03:47   #25
Ailuros
Epsilon plus three
 
Join Date: Feb 2002
Location: Chania
Posts: 7,830
Default

Ahhh thanks T2k.
Ailuros is offline   Reply With Quote
