#3301 | |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
Quote:
And they did the same with the GTX275 and GTX295. Last edited by Sontin; 10-Jan-2010 at 11:44. |
|
|
|
|
|
|
#3302 | |
|
Member
Join Date: Jun 2008
Location: Looking for a place to call home
Posts: 144
|
Quote:
L2 cache shared? No way, it's on chip. Dual core design GPU? Come on, no one still believes that... Fermi should be doing texture filtering through ALUs (but I'd ask Rys on this), thus I'm not that sure that Fermi has "128TFU"... |
|
|
|
|
|
|
#3303 | |
|
Member
Join Date: Dec 2009
Posts: 581
|
Quote:
|
|
|
|
|
|
|
#3304 | |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
Quote:
|
|
|
|
|
|
|
#3305 | ||
|
Member
Join Date: Jun 2008
Location: Looking for a place to call home
Posts: 144
|
Quote:
Quote:
|
||
|
|
|
|
|
#3306 |
|
Member
Join Date: Dec 2009
Posts: 581
|
|
|
|
|
|
|
#3307 |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
|
|
|
|
|
|
#3308 |
|
Member
Join Date: Apr 2004
Posts: 810
|
http://www.overclock.net/nvidia/6411...enchmarks.html
Mod edit: That sums up the post nicely, thanks. Copy&pasting someone else's content is a bit of a faux pas, and when it's lengthy noisy content, it's even more improper. Thanks. Last edited by jimmyjames123; 10-Jan-2010 at 13:02. |
|
|
|
|
|
#3309 | |
|
Member
Join Date: May 2007
Posts: 249
|
Quote:
Yeah, it could well be "dual core" by dedicating 1/3rd of the memory bus to inter-GPU communication, assuming the address bus is R/W, but L2 wouldn't be shared that way and that wouldn't fit with the other "specs". |
|
|
|
|
|
|
#3310 |
|
Member
Join Date: Jan 2010
Posts: 117
|
Maybe you need to look at this: http://www.freepatentsonline.com/7616206.pdf It describes an efficient private bus that uses a free MC to create a fast connection. There are also some interesting methods for tile-like interleaving of render targets, using cache to help hide the added latency. There are some newer patents about this link too, but I'm too lazy to search for them.
|
|
|
|
|
|
#3311 | |
|
Regular
|
Quote:
|
|
|
|
|
|
|
#3312 | |
|
Member
Join Date: May 2007
Posts: 249
|
Quote:
With this "NUMA-like" approach, inter-GPU bandwidth must be equal to local memory bandwidth to achieve optimal efficiency, and that would still imply quite a high latency. Even with 1/3 of the bus dedicated to inter-GPU comm, it would still be quite bad and that would give a composite 512-bit bus, which is not in line with the "specs" given. |
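To put rough numbers on the bus argument above, here is a minimal back-of-the-envelope sketch. It assumes a 384-bit bus per GPU (the post does not state a width; 384 is simply the value for which a 1/3 split yields the composite 512-bit figure) and a crude bottleneck model in which a fixed fraction of all traffic has to cross the inter-GPU link. The function names are hypothetical and latency is ignored entirely.

```python
from fractions import Fraction


def split_bus(per_gpu_bus_bits, link_fraction, gpus=2):
    """Split each GPU's bus into an inter-GPU link part and a local memory part.

    Returns (link_bits, local_bits, composite_local_bits).
    """
    link_bits = int(per_gpu_bus_bits * Fraction(link_fraction))
    local_bits = per_gpu_bus_bits - link_bits
    return link_bits, local_bits, local_bits * gpus


def sustainable_traffic(local_bw, link_bw, remote_fraction):
    """Largest total traffic one GPU can sustain in a simple bottleneck model.

    A share `remote_fraction` of the traffic crosses the link and the rest
    goes to local memory; whichever path saturates first sets the limit.
    Units are arbitrary (bus bits serve as a stand-in for bandwidth here).
    """
    limits = []
    if remote_fraction < 1:
        limits.append(local_bw / (1 - remote_fraction))
    if remote_fraction > 0:
        limits.append(link_bw / remote_fraction)
    return min(limits)


# Assumed 384-bit bus per GPU, one third of it given to the link.
link, local, composite = split_bus(384, Fraction(1, 3))

# With a 50/50 local/remote split the narrower link is the bottleneck;
# only a link as wide as the local bus removes it.
with_narrow_link = sustainable_traffic(local, link, 0.5)
with_equal_link = sustainable_traffic(local, local, 0.5)
```

Under these assumptions the split comes out as 128 bits of link and 256 bits of local memory per GPU, 512 bits composite. With half the traffic remote, the 128-bit link caps throughput at half of what an equal-width link would allow, which is the point about link bandwidth needing to match local bandwidth. A lower remote fraction relaxes that requirement.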
|
|
|
|
|
|
#3313 |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
Really, what's the problem? He is counting the two L2 caches together. Yeah, it's not right, but AMD did the same with Hemlock: http://forum.beyond3d.com/showpost.p...postcount=4698
|
|
|
|
|
|
#3314 |
|
Member
Join Date: Jun 2008
Posts: 335
|
CES is over in a few hours, and we know nothing really new. I guess that Rahja dude was full of shit?
|
|
|
|
|
|
#3315 | |
|
Member
Join Date: Jan 2010
Posts: 117
|
Quote:
|
|
|
|
|
|
|
#3316 |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,809
|
Not that it isn't pure fantasy but why would inter-GPU bandwidth need to be equal when the load on that path would be far lower than GPU<->Mem? I'm not sure what purpose it would serve anyway, didn't both AMD and Nvidia claim that their current proprietary links have sufficient bandwidth for their purposes?
__________________
What the deuce!? |
|
|
|
|
|
#3317 |
|
Member
Join Date: Apr 2004
Posts: 810
|
"Rahja" said that more info would be available after the "12th". According to Chris Ray, an NDA is expiring on that day, or at least changing to some extent, so that they will be allowed to pass new info on GF100. Chris said that we will get this information "very soon" (so presumably sometime within the next few days, or at least soon this month, based on his wording).
|
|
|
|
|
|
#3318 | |
|
Member
Join Date: Jun 2008
Posts: 335
|
Quote:
|
|
|
|
|
|
|
#3319 | |
|
Junior Member
Join Date: Dec 2009
Posts: 31
|
AnandTech finally says something on Fermi @ CES
Quote:
http://www.anandtech.com/tradeshows/...spx?i=3719&p=3 |
|
|
|
|
|
|
#3320 | |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
Quote:
|
|
|
|
|
|
|
#3321 | |
|
Senior Member
|
Quote:
2-way coherent caches using 64 bits of the mem bus looks to be a BIG improvement. Otherwise, seems rather sensible. Clocked a bit less than expected. If some SMs had been fused off, it could have been quite believable. Cons: A downclocked, castrated, cherrypicked GF100 hits 225W. So power is a big question here. Can't shake off the feeling that this thing was derived/made up by applying the 285->295 formula. |
|
|
|
|
|
|
#3322 |
|
Senior Member
Join Date: May 2005
Posts: 2,038
|
Yes, it's more or less a catalogue product now. In my country GTX285s are often more expensive and harder to find than the HD5870. The same goes for GTX275/HD5850. Only a few overpriced ~1.8-2GB models are available. GT200 is EOL and availability is really poor, so it's logical to expect that the GPU is no longer produced. nVidia has some reasons to keep it in pricelists - maybe it's better to offer a virtual competitor than nothing.
I'd expect that this situation will last until the launch of Fermi mainstream parts.
__________________
Sorry for my English. But I hope it's better than your Czech |
|
|
|
|
|
#3323 | |
|
Regular
|
Quote:
The problem with sidebusses and non-AFR parallel rendering is that load balancing is rather difficult ... the naive approach is simple round-robin, but then all framebuffer writes are 50/50 local/remote ... which is going to take a whole lot of bandwidth. Personally I would do things like this ...
- Vertex processing is divided round robin (vertex buffers are fully replicated)
- All write buffers are roughly tiled (say 64x64 or more) and checkerboard divided between the GPUs
- All transformed vertices get tiled and then written to buffers in the memory of the relevant GPUs for tessellation or rasterization (icky, but the writes would be done with special types of non-temporal load/stores ... if the vertices get consumed while not evicted from L2 they never have to be written to external memory)
- All read buffers (including former write buffers) are replicated on demand on a tile by tile basis, which is to say that if a tile from a buffer is accessed which is not stored locally, that tile gets replicated
(My thinking on as-needed replication is that it will be as efficient as or more efficient than NUMA with caching; for instance, with dynamic textures reused across multiple frames it is clearly superior, and certainly more efficient than doing full buffer replication all the time, since that introduces too much latency in between rendering steps.)
How much bandwidth would that consume? Hell if I know, would have to implement it in a simulator and run traces (lossless compression could probably cut the data for the tiled triangle writes by 66%, but because you are working with floating point numbers it's not cheap).
Last edited by MfA; 10-Jan-2010 at 13:44. |
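The checkerboard tiling and on-demand replication described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything from a real driver or from the patent linked earlier: the 64x64 tile size comes from the post, while `tile_of`, `owner` and `TileCache` are hypothetical names, there are two GPUs, and invalidating a replica when the owning GPU rewrites a tile is not modelled.

```python
TILE = 64  # tile edge in pixels, the "say 64x64" from the post


def tile_of(x, y, tile=TILE):
    """Map a pixel coordinate to its tile coordinate."""
    return (x // tile, y // tile)


def owner(tile_xy, num_gpus=2):
    """Checkerboard assignment: neighbouring tiles alternate between GPUs."""
    tx, ty = tile_xy
    return (tx + ty) % num_gpus


class TileCache:
    """Tracks which remote tiles one GPU has already replicated locally."""

    def __init__(self, gpu_id, num_gpus=2):
        self.gpu_id = gpu_id
        self.num_gpus = num_gpus
        self.replicated = set()
        self.transfers = 0  # tiles pulled across the inter-GPU link

    def read(self, x, y):
        """Classify a read as local, served from a replica, or freshly fetched."""
        t = tile_of(x, y)
        if owner(t, self.num_gpus) == self.gpu_id:
            return "local"
        if t in self.replicated:
            return "replica"
        # First touch of a remote tile: copy the whole tile once.
        self.replicated.add(t)
        self.transfers += 1
        return "fetched"


gpu0 = TileCache(gpu_id=0)
first = gpu0.read(10, 10)    # tile (0, 0), owned by GPU 0
second = gpu0.read(70, 10)   # tile (1, 0), owned by GPU 1, first touch
third = gpu0.read(100, 20)   # tile (1, 0) again, already replicated
```

In this toy model only the first touch of a remote tile crosses the link; later reads of the same tile are served locally, which is the argument for as-needed replication over replicating whole buffers. Actual bandwidth would still need the simulator traces the post mentions.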
|
|
|
|
|
|
#3324 | |
|
Member
Join Date: May 2007
Posts: 249
|
Quote:
These specs are one of the worst fakes ever; except perhaps for G80's hybrid water/air cooling, I don't recall anything worse. |
|
|
|
|
|
|
#3325 | |
|
Naughty Boy!
Join Date: Dec 2009
Posts: 399
|
Quote:
|
|
|
|
|