#1 |
|
Senior Member
Join Date: Jun 2002
Location: Sydney Australia
Posts: 575
|
Rather than speculate exactly what R420 may deliver - I just wondered what folk here see as the main competing priorities on how they will allocate their transistor budgets for R420 - and the pros and cons of each choice.
I imagine there are many ways to decide where your transistor budget goes: more pipelines, deeper pipelines, fatter pipelines, and the number of stages and execution units in your pipelines are among the things you could trade off. If PS 3.0 requires fp32 in the pixel pipeline, maybe fp32 in the pixel shaders will be given more priority from a marketing perspective - even though no games using these features might come out until R420 is heading towards obsolescence - as is usual for new generations of cards and the features they introduce. What do folk believe is the priority list for allocation of additional transistors, and how many transistors overall do we believe R420 may have - 140M, 150M, 180M? I'd say more and faster pipelines are a sure thing (12 * 1, maybe), and maybe more execution units per pipeline - and I am wondering if it will be R420 or R500 that delivers true fp32 in pixel shaders that can actually perform at decent levels. |
|
|
|
|
|
#2 |
|
Itchy
Join Date: Feb 2002
Location: United Queendom
Posts: 2,859
|
I don't think it will be a 12x1 architecture.
I believe DirectX9.0b requires *at least* FP24 support and so ATI will continue with that level of precision for the R420.
__________________
Time is an illusion. Lunchtime doubly so - Douglas Adams |
|
|
|
|
|
#3 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
I always thought ps_3_0 required FP24 rather than FP32.
R420 is FP24. R500 is FP32. |
|
|
|
|
|
#4 |
|
Senior Member
Join Date: May 2002
Posts: 4,310
|
as for pipeline configuration..........
A.) 16*1 B.) 12*1 C.) 8*2 |
|
|
|
|
|
#5 |
|
Off-season
Join Date: Feb 2002
Location: On the pursuit of happiness
Posts: 3,019
|
PS3.0 precision requirement is still FP24, despite contrary rumors.
__________________
Binary prefixes for bits and bytes |
|
|
|
|
|
#6 |
|
Naughty Boy!
|
what no 12 x 3 ???
|
|
|
|
|
|
#7 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
It has eleven and three-fourths pipelines, just like the PS2. (Honestly, it's probably a 12x1 or 16x1. That depends on whom you ask at the moment--nobody really knows.)
|
|
|
|
|
|
#8 | |
|
Unknown.
Join Date: Aug 2002
Location: UK
Posts: 4,883
|
Quote:
But ATI employees are not only most likely happier than NVIDIA's, they're also an awful lot more loyal. NVIDIA's technique against leaks is employee termination, while ATI's is misinformation and insisting to employees that leaking just damages the company and nothing else. ATI is stopping leaks better by asking employees nicely than NVIDIA, who is threatening them with termination. NVIDIA's Information Security department is a spy film ripoff at best, beginning their questions with the good ole: "Where were you at this hour, on this day?" But I guess that doesn't mean they don't still do a pretty good job. ATI's doing a better one though, and without all this useless BS.
Anyway, on topic: Even though the R420 should, in theory, support VS3.0/PS3.0, its goal is to have astonishingly good VS2.0/PS2.0 speed, not simply "okay" 3.0 speed. So I suspect some features might be implemented in a rather simple way, but hey, that's always better than having to do it on the CPU.
Uttar |
|
|
|
|
|
|
#9 |
|
Irregular
Join Date: Feb 2002
Posts: 1,170
|
Well, I expect:
- roughly the same texturing speed (per clock)
- the same speed in VS but with full VS3.0 support
- 2x the speed in PS arithmetic ops (per clock)
- full PS3.0 support but relatively low performance when dynamic branches are used
- FP24 in PS
- same AA and AF algorithm and texturing compromises |
|
|
|
|
|
#10 |
|
Member
Join Date: Jan 2003
Location: Portugal
Posts: 131
|
MuFu has disappeared...
|
|
|
|
|
|
#11 |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,830
|
So far ATI has shown IMO that their basic design philosophy with the R3xx family was to save as many transistors as possible. The real question is whether the same design philosophy has made its way into upcoming product families too or not. Somehow I recall an official statement that could suggest a significant change in said philosophy...
|
|
|
|
|
|
#12 |
|
Senior Member
Join Date: Jun 2002
Location: Sydney Australia
Posts: 575
|
Hyp-X - 2 times the speed in Pixel Shading - are you thinking of a 12 * 1 pipeline architecture with a 500MHz - 600MHz core speed on 0.13 micron to give you that twice the speed? Actually, climbing from a 400MHz to a 535MHz core with 50% more well-utilised pipes would get you around the right ball park, provided you could switch those chips fast enough given the capacitance of all those parallel circuits. That might seem reasonable to me.
How many transistors does each fp24 pixel shading pipeline roughly require? A move from 8 x 1 to 12 x 1 would take roughly how much of your transistor budget? |
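The ball-park figure in the post above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes peak pixel shading throughput scales linearly with both core clock and pipeline count, which real hardware will not quite achieve, and the function name `relative_throughput` is made up for illustration. The 400MHz, 535MHz, 8-pipe and 12-pipe inputs are the ones mentioned in the post.

```python
def relative_throughput(base_mhz, new_mhz, base_pipes, new_pipes):
    """Peak throughput of a new part relative to a baseline.

    Assumption: throughput scales linearly with clock speed and with
    the number of (fully utilised) pipelines. Real chips fall short of this.
    """
    clock_gain = new_mhz / base_mhz      # e.g. 535 / 400
    pipe_gain = new_pipes / base_pipes   # e.g. 12 / 8, i.e. 50% more pipes
    return clock_gain * pipe_gain


# Figures from the post: 400MHz -> 535MHz, 8 x 1 -> 12 x 1.
speedup = relative_throughput(base_mhz=400, new_mhz=535, base_pipes=8, new_pipes=12)
print(f"Estimated speedup: {speedup:.2f}x")
```

Under those assumptions the clock gain (about 1.34x) times the pipeline gain (1.5x) comes out at roughly 2x, which matches the "twice the speed" target.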
|
|
|
|
|
#13 | |
|
Gamerscore Wh...
Join Date: Jan 2002
Posts: 12,989
|
Quote:
They did have a change in philosophy and that was actually the opposite that you describe. Previously they had always been working to specific die size constraints, with R300 these constraints were removed (within reason). |
|
|
|
|
|
|
#14 | |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,830
|
I probably didn't use a very good description of what I actually meant.
Quote:
A simple example would be FP24, while by far not the only one, especially considering that there wasn't much headroom left at 15nm. Moving to 13nm there's a lot more headroom open to include whatever they had in mind for a next generation design, without facing restrictions to the same degree. The way I see it, they can either choose a relatively conservative approach or try to hit the absolute maximum that would be possible for the specific manufacturing process. While it may sound like an oxymoron, as one could easily say that 13nm would be "outsourced" just as much as 15nm, if in a hypothetical case they truly have shot for a 200M design, more or less, I don't see a necessity for "middle of the road" solutions there (at the possible sacrifice of margins of course). Of course it could be that I'm completely wrong; I wish I could remember who made a "sacrifice of margins" relevant comment, and where, in the first place. All of course open to corrections as always |
|
|
|
|
|
|
#15 |
|
Senior Member
Join Date: Feb 2002
Location: CT
Posts: 2,024
|
Well, given the details of per pipeline calculation capabilities, it seems pretty feasible for ATI to explore transistor efficient ways of expanding it. I.e., more, carefully selected for common applicability, instructions supported for supplemental arithmetic ops in a pipe.
I still have an impression that a focus of functionality implementation will be on "interconnect" management (F-Buffer and, this name was dropped the other day, V-Buffer). However, I don't know how extensive this focus need be for their target compared to achieving computational power increase, nor how expensive it has to be in transistors. If it is relatively cheap...at least in terms of what is added/replaced overall for implementation...this would leave a lot of the room from process improvement for the pure computation concerns above. Actually, given the strong performance of the existing generation, this would also seem to point towards AF and AA improvements as very likely if a significant transistor count increase is allowed. Unless there is some sort of HyperZ suite improvement that seems important, but is transistor hungry? Come to think of it, ATI seems to focus on improvements in all aspects concurrently for these steps, though an "either/or" for AA and AF could still satisfy a "Smoothvision 3.0" if either the budget is significantly lower than I expect (200 M at the outside, 180 as a guesspectation) or the performance target is higher than I expect (+50% seems like an upper limit for perception of clear improvement). Not sure about "HyperZ IV", though. That's with PS/VS 3.0 not being markedly more transistor expensive than I expect tipping the picture downwards, and without any R300 size rabbits in the engineering hat tipping it upwards. Or, of course, both happening and "evening out" :P. |
|
|
|
|
|
#16 | |
|
Member
Join Date: Jul 2003
Location: Houston
Posts: 652
|
Quote:
|
|
|
|
|
|
|
#17 | ||
|
Member
Join Date: Feb 2003
Posts: 190
|
Quote:
__________________
I'll think of something later |
||
|
|
|
|
|
#18 | ||
|
Member
Join Date: Mar 2003
Location: Denmark
Posts: 867
|
Quote:
|
||
|
|
|
|
|
#19 | |
|
Irregular
Join Date: Feb 2002
Posts: 1,170
|
Quote:
More like 8x1 but with more PS-ops per pipe and/or higher utilization. I don't think 12x1 is practical or necessary. As for the number of texturing units, it will depend on how much extra stuff they want to put in for dynamic shaders. If you are aware of the nv3x's texldd performance you know what I'm talking about. |
|
|
|
|
|
|
#20 | |
|
Join Date: May 2002
Location: New York, NY
Posts: 12,679
|
Quote:
Anyway, just as a side note, 10nm is about the physical limit of semiconductor technologies. Due to the physics, you just can't create features smaller, not without using radically different technology. And getting to 10nm will be an incredible challenge, as the behavior of semiconductors on that scale changes dramatically.
__________________
April 20, 1979 - America must never forget. |
|
|
|
|
|
|
#21 |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,830
|
Yes I did. And I'm not going to edit the former post. Too many zeros or dots, too much work.
|
|
|
|
|
|
#22 | |
|
Senior Member
Join Date: Jun 2002
Location: The Slope & TriBeCa (NYC)
Posts: 2,004
|
Quote:
(...) ATI mentioned to Analysts that for the next cycle that in the high end, and to a lesser extent mainstream, they would be willing to sacrifice some margin in order to gain performance. The implication being that perhaps R420 will be a relatively large chip with numerous pipelines in order to gain that performance.
__________________
by T2k! Athlon 64 FX-53 | Gigabyte GA-K8NSNXP-939 | Corsair TWINX1024-4000 | Corsair HydroCool200™ Xtreme Water Cooling | ATI RADEON™ X800 XT PE | HP A7217A 24" CRT | Canopus DVRex-RT + an Apple dual G5 1.8GHz |
|
|
|
|
|
|
#23 | ||
|
Senior Member
Join Date: Jun 2002
Location: The Slope & TriBeCa (NYC)
Posts: 2,004
|
Quote:
__________________
by T2k! Athlon 64 FX-53 | Gigabyte GA-K8NSNXP-939 | Corsair TWINX1024-4000 | Corsair HydroCool200™ Xtreme Water Cooling | ATI RADEON™ X800 XT PE | HP A7217A 24" CRT | Canopus DVRex-RT + an Apple dual G5 1.8GHz |
||
|
|
|
|
|
#24 |
|
Member
Join Date: Nov 2003
Location: Texas, USA
Posts: 353
|
As far as transistor count goes, I speculate R420 won't have more transistors than NV40 (which is 175m according to Uttar). Maybe ~150m (which would be proportional to the difference between R350 and NV35). I think they'll have trouble getting much more performance than R350 without doing something drastically different in the design, which doesn't seem likely. I'm not trying to spell doom for them though, I still have more faith in ATI than Nvidia...
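The proportional guess above can be made explicit with a little arithmetic. This sketch uses only the two figures given in the post (175m for NV40, ~150m for R420) and works out what ATI-to-NVIDIA transistor ratio that guess implies; the function names are made up for illustration, and no actual R350 or NV35 transistor counts are assumed.

```python
def implied_ratio(ati_guess_m, nvidia_m):
    """Ratio of ATI to NVIDIA transistor counts implied by a pair of figures."""
    return ati_guess_m / nvidia_m


def scaled_estimate(nvidia_m, ratio):
    """Scale an NVIDIA transistor count by a previous-generation ATI/NVIDIA ratio."""
    return nvidia_m * ratio


# Figures from the post, in millions of transistors (speculative, not confirmed).
nv40_m = 175
r420_guess_m = 150

ratio = implied_ratio(r420_guess_m, nv40_m)
print(f"Implied R350/NV35 ratio: {ratio:.3f}")
print(f"Round trip: {scaled_estimate(nv40_m, ratio):.0f}m")
```

In other words, the ~150m guess holds if R350 has roughly 86% of NV35's transistor count; a different ratio between those two chips would move the estimate accordingly.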
|
|
|
|
|
|
#25 |
|
Epsilon plus three
Join Date: Feb 2002
Location: Chania
Posts: 7,830
|
Ahhh thanks T2k.
|
|
|
|