|
|
#1 |
|
Mostly Harmless
|
The original thread here having become unwieldy. . . .
The B3D Forum Conventional Wisdom Watch (Minority Reports noted): D3D10, 500M+ transistors, release sometime between September and the end of the CY, probably 80nm (though I've seen a minority report for 90nm), possibly GDDR3 with a > 256-bit bus rather than GDDR4, HDR+AA. More new goodness on the AA side too, details unclear. Non-unified ps/vs. Power-hungry beastie, almost certainly with improved cooling vs G71.

Taking requests to add to this list. Do we have a unit count we are willing to point at as the Conventional Wisdom at this point? Xbit reported 48ps. . .willing to go with that for the moment?

Depending on how lazy I and my brethren are, we might try to keep this OP updated with particularly interesting new tidbits as they come in downstream, as an experiment to see how it works. Updates will be noted with "Update:".

Please note that this post is just meant to reflect the speculation included herein (and in the previous thread, of course), rather than an official position of B3D, Inc!

Some relevant linkage along the way from the previous thread, for which only the authors are responsible for the accuracy thereof (i.e. don't bitch to me!):
http://www.xbitlabs.com/news/video/d...220100915.html
http://www.cooltechzone.com/Special_..._200604092276/
http://www.beyond3d.com/forum/showthread.php?t=30014
http://www.dailytech.com/article.aspx?newsid=2785
http://www.theinquirer.net/default.aspx?article=32385
http://www.beyond3d.com/forum/showth...737#post775737
http://www.theinquirer.net/default.aspx?article=32768
http://www.theinquirer.net/default.aspx?article=32856
http://www.digitimes.com/NewsShow/Ma...pages=A1&seq=2
http://gpu-fun.spaces.live.com/Perso...9&_c=links:119
http://www.beyond3d.com/forum/showpo...&postcount=361
http://www.theinquirer.net/default.aspx?article=33260
http://www.extremetech.com/article2/...1987258,00.asp
http://translate.google.com/translat...&hl=en&ie=UTF8
http://www.beyond3d.com/forum/showpo...&postcount=485
http://www.beyond3d.com/forum/showpo...&postcount=493
http://www.forbes.com/2006/08/18/nvi...rtner=yahootix
http://www.beyond3d.com/forum/showpo...&postcount=688

Update 9/12/2006: CW seems to be looking at a 600-700MHz core. http://www.theinquirer.net/default.aspx?article=34319
48ps confirmed here; follow the link and download "graphics track": http://www.beyond3d.com/forum/showthread.php?t=33605

Update 9/18/2006: VR-Zone takes their shot at immortality as either prophets or buffoons: http://www.vr-zone.com/?i=4007

Update 9/29/2006: Some interesting pics here, including 12 memory chips, indicating the likelihood of a 384-bit memory bus and 768MB framebuffer: http://www.beyond3d.com/forum/showpo...&postcount=620

Update 10/05/2006: DailyTech mostly confirms VR-Zone's specs: http://www.beyond3d.com/forum/showpo...&postcount=802

*Added the "u" to rumours for our Britannic overlord.
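For anyone wondering how 12 chips turns into a 384-bit bus and 768MB, here is a back-of-the-envelope sketch. It assumes each memory chip has a 32-bit interface and 512Mbit of capacity; neither figure comes from the pics, and the function names are made up for illustration.

```python
# Assumptions (not from the thread): each memory chip exposes a 32-bit
# interface and holds 512 Mbit. Both values are illustrative defaults.
BITS_PER_CHIP = 32
MBIT_PER_CHIP = 512


def bus_width_bits(chips, bits_per_chip=BITS_PER_CHIP):
    """Total bus width if every chip sits on its own slice of the bus."""
    return chips * bits_per_chip


def framebuffer_mb(chips, mbit_per_chip=MBIT_PER_CHIP):
    """Total capacity in megabytes (8 bits per byte)."""
    return chips * mbit_per_chip // 8


if __name__ == "__main__":
    for chips in (8, 12, 16):
        print(chips, "chips:", bus_width_bits(chips), "bit,", framebuffer_mb(chips), "MB")
```

Under those assumptions 12 chips lands on 384-bit and 768MB, while 8 and 16 chips give the more familiar 256-bit/512MB and 512-bit/1024MB combinations.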
__________________
"We'll thrash them --absolutely thrash them."--Richard Huddy on Larrabee "Our multi-decade old 3D graphics rendering architecture that's based on a rasterization approach is no longer scalable and suitable for the demands of the future." --Pat Gelsinger, Intel ". . .its taking us longer than we would have liked to get a [Crossfire game] profiling system out there" --Terry Makedon, ATI, July 2006 "Christ, this is Beyond3D; just get rid of any f**ker talking about patterned chihuahuas! Can the dog write GLSL? No. Then it can f**k off." --Da Boss |
|
|
|
|
#2 |
|
Senior Member
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
|
G80 is new and improved? Wow I didn't even see the old one.
|
|
|
|
|
#3 |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,810
|
Complicated thing, that English language, eh?
I have to say though - you know you're at a quality establishment when the rumours thread is so well structured.
__________________
What the deuce!? |
|
|
|
|
#4 |
|
Mostly Harmless
|
|
|
|
|
#5 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
I still think GDDR3 plus a 512-bit bus is way more likely than GDDR4 on a 256-bit bus. (Take what they did during the NV30 era and reverse it!)
Okay, I guess I should elaborate. Given that ATI is already using GDDR4 and at the moment, only Samsung is producing GDDR4 in quantity, it would be foolish to assume that supplies would be plentiful enough to risk the G80's performance on its availability. With R580+ showing a nice jump in performance due to increased bandwidth, I think we can assume pretty easily that the next-generation chips, with geometry shaders and a ridiculous amount of fillrate compared to G71/R580, need as much bandwidth as possible. So... hooray 512-bit bus. |
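To put rough numbers on the bus-width argument, here is a small sketch of peak bandwidth as bus width times per-pin data rate. The per-pin rates below are placeholders picked for illustration, not known G80 or R600 specs.

```python
def bandwidth_gb_s(bus_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: pins * per-pin data rate / 8 bits per byte."""
    return bus_bits * gbps_per_pin / 8


# Hypothetical per-pin data rates in Gbit/s, chosen only for illustration.
GDDR3_GBPS = 1.6
GDDR4_GBPS = 2.0

if __name__ == "__main__":
    configs = [
        ("GDDR4, 256-bit", 256, GDDR4_GBPS),
        ("GDDR3, 384-bit", 384, GDDR3_GBPS),
        ("GDDR3, 512-bit", 512, GDDR3_GBPS),
    ]
    for label, bits, rate in configs:
        print(f"{label}: {bandwidth_gb_s(bits, rate):.1f} GB/s")
```

With these placeholder rates, the wider GDDR3 bus comes out well ahead of GDDR4 on 256 bits, which is the shape of the argument above; the exact gap depends entirely on the clocks you plug in.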
|
|
|
|
#6 |
|
Mostly Harmless
|
Did you read the last two pages of the previous thread, sleepy-head?
|
|
|
|
#7 |
|
chaos dunk
Join Date: May 2003
Location: Mountain View, CA
Posts: 3,274
|
|
|
|
|
|
#8 |
|
Artist formely known as Vysez
Join Date: Mar 2004
Location: Paris, France
Posts: 3,899
|
The bus width of G80 is interesting, as much as the number of RAM chips present on the board...
__________________
- Power corrupts and absolute power is kinda neat. - If at first you don't succeed, put it out for beta test. --Internets |
|
|
|
|
#9 |
|
Nutella Nutellae
Join Date: Feb 2002
Location: San Francisco
Posts: 4,309
|
This is not a rumour, but since NVIDIA has a couple of patents about this, I suspect G80 might use its PS units to perform blending operations between incoming fragments and the frame buffer.
Basically, even though D3D10 does not expose this AFAIK, PS units would be able to issue a special instruction which fetches into some registers the colors of all the subsamples potentially covered by a fragment, and then blend them in the pixel shader. A 'smart' driver would be able to dynamically patch a shader every time we change blending modes. I'm not saying that's easy to implement in hw (there are obviously some serious coherency/processing order issues to solve first). It is also quite straightforward to expect more and more fixed function units to be slowly phagocytized by programmable units as we have more of them and more complex/more powerful/more accurate ALUs.
Marco
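A toy software model of the idea, purely as a sketch: the "pixel load" is just a list lookup, the driver "patch" is a wrapper that appends a blend routine to the user shader, and all the names here (`patch_shader`, `BLEND_MODES`, and so on) are invented for illustration. It ignores the ordering/coherency problem entirely and is not how any real driver or the patents work.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Color = Tuple[float, ...]  # RGBA, each channel in [0, 1]
BlendFn = Callable[[Color, Color], Color]


def src_over(src: Color, dst: Color) -> Color:
    """Classic alpha blend: src * a + dst * (1 - a), applied to all channels."""
    a = src[3]
    return tuple(s * a + d * (1.0 - a) for s, d in zip(src, dst))


def additive(src: Color, dst: Color) -> Color:
    """Additive blend, clamped to 1.0."""
    return tuple(min(1.0, s + d) for s, d in zip(src, dst))


# Hypothetical table a driver might consult when the blend state changes.
BLEND_MODES: Dict[str, BlendFn] = {"src_over": src_over, "additive": additive}


def patch_shader(user_shader: Callable[[], Color], blend_mode: str):
    """Return the user shader with a blend epilogue appended for blend_mode."""
    blend = BLEND_MODES[blend_mode]

    def patched(framebuffer: List[List[Color]], pixel: int,
                coverage: Sequence[bool]) -> None:
        src = user_shader()
        samples = framebuffer[pixel]  # stands in for the "pixel load"
        for i, covered in enumerate(coverage):
            if covered:  # only touch subsamples the fragment covers
                samples[i] = blend(src, samples[i])

    return patched


if __name__ == "__main__":
    fb = [[(0.0, 0.0, 0.0, 1.0)] * 4]  # one pixel, 4 subsamples
    shader = patch_shader(lambda: (1.0, 0.0, 0.0, 0.5), "src_over")
    shader(fb, 0, [True, False, True, False])
    print(fb[0])
```

Changing the blend mode means calling `patch_shader` again with a different key, which is the software analogue of re-patching the shader on a state change.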
__________________
[twitter] More samples, we need more samples! [Dean Calver] First they ignore you, then they laugh at you, then they fight you, then you win. [Mahatma Gandhi] The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way |
|
|
|
|
#10 |
|
Member
Join Date: Aug 2002
Posts: 306
|
I can't get one of Jen-Hsun Huang's little sneak peeks out of my head. Some months ago, he said something like: "With our next-generation graphics architecture, we want to further increase programming flexibility", and I'm still wondering what exactly he had in mind when he specifically mentioned "flexibility" while speaking a little about their future plans. Along with Jen-Hsun's saying that "they want to innovate where it makes sense, instead of innovating like crazy" (like they did with NV30), I keep asking myself what exactly would make sense here and in the future, WRT their first incarnation of a part that has to have enough potential to be worth at least another 2-3 years.
We've already seen some patents, but I'm still at a point where I can't really see what he may have meant by that. Maybe I'm reading a little too much into it, but if there are any ideas, don't hesitate to post them here. |
|
|
|
|
#11 | |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,810
|
Quote:
Actually, in a situation like this, would the PS need its own link to the memory controller, or would it go through the TMUs (whatever those might look like)?
Last edited by trinibwoy; 11-Sep-2006 at 16:27. |
|
|
|
|
|
#12 |
|
Member
Join Date: Aug 2002
Posts: 306
|
One of the questions that comes to mind is: how exactly will it work? Looking at current PCB designs, there is no place at all for 2 more RAM chips on one side (well, physically there is, but you would need to either increase the PCB length or put them on the back), because there is a limit to how close you can put them to each other (because of termination, etc.), and placing them further away could lead to some potential problems. There is a reason why 8 chips per side is the maximum right now. You'd need a fair amount of intelligent pathing when only 2 modules are placed further away.
|
|
|
|
|
#13 | |
|
Senior Member
Join Date: Jul 2004
Location: NY, NY
Posts: 2,680
|
Quote:
Hmm, interesting. I would think the same: a more programmable AA engine. Not sure, but if the PS went through the TMUs, wouldn't that lock the TMUs? I think they would need their own connection to the memory controller. |
|
|
|
|
|
#14 |
|
Naughty Boy!
Join Date: Aug 2004
Location: Stuttgart, Germany
Posts: 5,008
|
128 bits wide?
__________________
I have thought some of nature's journeymen had made men, and not made them well, they imitated humanity so abominably. |
|
|
|
|
#15 |
|
B3D Shockwave Rider
Join Date: Feb 2002
Posts: 1,813
|
My guess.
G80 is two improved G70 cores with geometry shaders added to the architecture, plus improved AA and HDR support along with other tweaks. The Sony PS3's RSX comprises just one of these cores. Two cores would require too much power and produce too much heat in a console form factor.
__________________
When God plays an online shooter he plays Shadowrun. He buys resurrection first round and selects Dwarf. www.shadowrunshow.com |
|
|
|
|
#16 | |
|
Member
Join Date: Aug 2003
Location: Derry, NH
Posts: 563
|
Quote:
G80 has been in development much too long to be as simple as two G70s slapped together. |
|
|
|
|
|
#17 | |
|
Member
Join Date: Aug 2003
Location: Derry, NH
Posts: 563
|
Quote:
A two-PCB design? Like the 7950, only with one board that is all RAM? Obviously expense is a huge issue with that, etc. |
|
|
|
|
|
#18 | |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,810
|
Quote:
|
|
|
|
|
#19 |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,810
|
You could certainly make a case for dedicated framebuffer space/bandwidth on a high-end card given today's resolutions and HDR/AA requirements. You wouldn't have the crossbar complexity that sireric described earlier, and it may even simplify accesses for the other clients like the TMUs. That's assuming that you can keep that dedicated bus saturated enough to justify its existence.
|
|
|
|
#20 |
|
Moderate Nuisance
Join Date: Feb 2002
Posts: 4,664
|
Nice summary, geo. Kind of horrible to think we can deflate 29 pages into close to 29 words.
Should we also add 600+MHz and maybe even accept a 384-bit bus, given trumphsiao's chirping in the penultimate page of the previous thread? He's been right before, IIRC.
The "4:1 concept architecture" is the most interesting part. Are we talking 48 PS "processors" : 16 ROPs in G80 (assuming it still has discrete ROPs)? Are we talking 64 PS ALUs : 16 ROPs in R600, assuming an extra PS ALU per "pipe" (though I'd expect this at a very high core clock)? (Or does G80 stick with 24 pixel shader "pipes/processors"--two DX9 PS ALUs each--but add two extra DX10 PS ALUs each? Nah, too NV30ish, if it's even possible.) I've also heard 16 VS/GS processors, too, though I forget where (possibly in one of the OP's links).
48 pixel shader "processors" at 600+MHz sounds power and transistor hungry to me, weakly corroborating other rumors and perhaps hinting at 96 PS ALUs. It also makes Brimstone's "two G70s" theory not incredibly far-fetched, also considering 16 VS/GS shaders. That's twice a G70 in G70 terms, but obviously NV's been modifying the heck out of everything, so obviously it's not that simple.
What does the rumored new AA engine signify, updating the ROPs or folding them into the PSs? Finally, nAo's talking about this, right? I've been out of the loop awhile, thus the more-than-usual silly questions.
Last edited by Pete; 11-Sep-2006 at 20:51. |
|
|
|
|
#21 |
|
Join Date: May 2002
Location: New York, NY
Posts: 12,679
|
I still don't think it makes much sense to have a dedicated bus. Yes, it is simpler, but GPUs have had unified buses for many years now. I doubt they'd take a step backwards like this.
After all, don't forget that it's not just the memory bandwidth that is being dedicated, but also the memory space. All individual areas of memory space are highly-variable in today's GPU designs.
__________________
April 20, 1979 - America must never forget. |
|
|
|
|
#22 | |
|
Regular
|
Quote:
Whether that unit is a decoupled pipeline that runs alongside the ALU pipeline, or is integrated as macros into the ALU pipeline, who knows... I expect the former initially. So the end result is one point of access to memory.
---
There's an interesting, minor corollary with streamout, in my view. Streamout writes data to memory that then needs to be read back (sometime soon!) for rendering to continue. Streamout is a geometry (vertex) specific technique. A lot of pixel shading techniques would benefit from writing a pixel value and then (sometime soon!) reading it for rendering to continue.
As it happens, in both cases "sometime soon!" is blocked: the dev is forced to flush things out and the whole thing is fairly clunky. That makes the parallelism of the GPU much easier to implement, but programmers apparently have been screaming that they want "immediate read after write" for donkey's years.
So, in my view, both streamout and ROP-output make natural targets for "more timely" writing/reading. Apart from what we might see in G80 (prolly only exposed in OGL 3.0? or as an NVidia extension in OGL?), I'm doubtful that this "fully programmable ROP" (and streamout?) will come any time soon, i.e. to DX.
I'm still unclear on the mechanics of read-after-write in a pixel shader. How restrictive would it end up, and would those restrictions nullify most of the benefit devs have been dreaming about?
Jawed |
|
|
|
|
|
#23 |
|
Nutella Nutellae
Join Date: Feb 2002
Location: San Francisco
Posts: 4,309
|
Maybe we shouldn't think about 12 memory chips around one GPU.. what about 6 mem chips x 2 GPUs?
The original patents I was referring to are these:
Pixel load instruction for a programmable graphics processor
Position conflict detection and avoidance in a programmable graphics processor
Position conflict detection and avoidance in a programmable graphics processor using tile coverage data
BTW, while I was checking those patents I found a new interesting one (LOL): what's the difference between a constant value held in a texture or in a constant register in the end? Well, the latter must reside closer to your 'heart', so here we go:
Shader cache using a coherency protocol
|
|
|
|
#24 |
|
Senior Member
|
It's clear that in a DX10 GPU there is little or no place left for fixed-function parts, so the ROPs must either go fully programmable or have their functions fall back to the fragment pipes, with all legacy blending/sampling ops emulated at the driver/API level (as happened with T'n'L).
I honestly bet on the second option, as it will save some complexity (in favour of extra VS/PS units) and will bring the memory interface "closer" to the fragment core, if it now has to deal with the burden of framebuffer ops in sampling/blending etc. The other question is support for virtual addressing in the GPU: will there be an extra (mini) AGU for each fragment pipe/quad, or will this function also be absorbed by the new "multipurpose" ALUs?
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic. Microsoft: Russia -- Big and bloated. Linux: EU -- Diverse and broke. |
|
|
|
|
#25 | |
|
Nutella Nutellae
Join Date: Feb 2002
Location: San Francisco
Posts: 4,309
|
Quote:
At the same time, I believe they will decouple TMUs from PS units, since they now have to be used heavily to serve multiple clients (VS/GS/PS). I also wonder if they are going to have a single big L2 (texture) cache which will serve all texturing requests from all possible clients, or whether they will have multiple dedicated L2s. It wouldn't be nice to have your pixel shader slow down because some mad vertex shader is thrashing your whole texture cache, lol
Marco
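The thrashing worry can be shown with a toy LRU model. This is only a sketch: the cache sizes, the access pattern, and the LRU policy are all arbitrary assumptions and say nothing about real texture cache design. A pixel shader that keeps reusing four lines shares the cache with a vertex shader that streams through fresh addresses.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache that only tracks which lines are present."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return True
        self.lines[address] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False


def pixel_hit_rate(ps_cache, vs_cache, rounds=100):
    """Interleave PS reuse with VS streaming; return the PS hit rate."""
    ps_hits = 0
    ps_total = 0
    vs_addr = 0
    for _ in range(rounds):
        for tex in range(4):  # PS working set: the same 4 lines every round
            ps_hits += ps_cache.access(("ps", tex))
            ps_total += 1
        for _ in range(8):  # VS touches 8 never-seen-before lines every round
            vs_cache.access(("vs", vs_addr))
            vs_addr += 1
    return ps_hits / ps_total


if __name__ == "__main__":
    shared = LRUCache(8)
    print("shared L2:   ", pixel_hit_rate(shared, shared))
    print("dedicated L2:", pixel_hit_rate(LRUCache(4), LRUCache(4)))
```

With the same total capacity in both cases (8 lines shared versus 4 + 4 dedicated), the streaming client evicts the pixel shader's lines every round in the shared setup, while the dedicated split keeps them resident after the first round. A real design could of course get a similar effect with smarter replacement or partitioning inside a single cache.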
|
|
|
|
|