Old 09-Jul-2012, 17:47   #13351
ERP
Moderator
 
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,669
Default

Things do change, but it's not usually radical.

The most common case is that for whatever reason they can't manufacture the original design at the original specs, or a part provided by a vendor doesn't meet spec.
But things change for other reasons too, like reactions to a competitor: MS doubled the memory on the 360.
3DO probably made the biggest change I've ever seen when they added a second CPU to M2, but that never had a release date.
There are rumors that N64 had a fairly significant change (downgrade), but the only teams affected were the "dream team" members.

Things are a lot more stable now than they used to be. The original Saturn devkits were the size of a small fridge, were missing half the hardware, ran really hot, and tended to last about 5 minutes before dying, but that was back when all the console manufacturers did was ship hardware with badly translated register documentation.

It should also be noted that not all developers get initial devkits at the same time, and not all developers see all of the devkits that are produced. For example, there will probably be a small run of pre-release hardware used by the OS team, since you can't ship devkits without the OS.
ERP is offline  
Old 09-Jul-2012, 18:13   #13352
bkilian
Senior Member
 
Join Date: Apr 2006
Posts: 1,539
Default

Quote:
Originally Posted by Mianca View Post
That wikipedia number is misleading because it seems to be calculated in a weird way.

As far as my own math goes, RSX is a ~250 GFLOPS chip [(24*2*4-way pixel ALUs + 8*5-way vertex ALUs) @ 550MHz].

Also, consider that XENOS is rated @ 240 GFLOPS [48*5-way unified ALUs @ 500MHz] - and is generally considered faster than RSX.

Curiously enough, Pitcairn XT offers ~2,500 GFLOPS. So that should basically be the target spec (although, as I mentioned earlier, Pitcairn @ 1GHz is way too power hungry to make its way into a console). An optimized and underclocked Oland derivative should probably be capable of reaching those target numbers within a reasonable power budget, though.

They'd need roughly 24 CUs running @ 800MHz to end up with the target of 2,500 GFLOPS.
The Xbox GPU (according to a presentation the Xbox guys gave us when we were starting HD DVD development) can and does routinely achieve max throughput, and apparently RSX doesn't. So in real terms, the stated max numbers are misleading. I don't know if current gen GPUs are similarly misleading.
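For reference, the peak-number arithmetic in the quote above works out roughly like this (a minimal sketch; the ALU counts and clocks are the ones quoted, and 2 FLOPs per lane per clock for a MAD is the usual assumption):

Code:
# Peak GFLOPS = ALUs * lanes per ALU * 2 (MAD) * clock in GHz, per the figures quoted above
def peak_gflops(alus, lanes, clock_ghz, flops_per_lane=2):
    return alus * lanes * flops_per_lane * clock_ghz

rsx    = peak_gflops(24 * 2, 4, 0.55) + peak_gflops(8, 5, 0.55)  # pixel + vertex ALUs
xenos  = peak_gflops(48, 5, 0.5)                                  # unified ALUs
target = peak_gflops(24 * 64, 1, 0.8)                             # 24 CUs of 64 single-lane ALUs
print(rsx, xenos, target)  # ~255, 240, ~2458 GFLOPS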
bkilian is offline  
Old 09-Jul-2012, 18:19   #13353
archangelmorph
Senior Member
 
Join Date: Jun 2006
Location: London
Posts: 1,551
Default

Quote:
Originally Posted by bkilian View Post
The Xbox GPU (according to a presentation the Xbox guys gave us when we were starting HD DVD development) can and does routinely achieve max throughput, and apparently RSX doesn't. So in real terms, the stated max numbers are misleading. I don't know if current gen GPUs are similarly misleading.
Do you have a source/citation for that?
__________________
blog
twitter
archangelmorph is offline  
Old 09-Jul-2012, 18:37   #13354
liolio
French frog
 
Join Date: Jun 2005
Location: France
Posts: 4,973
Default

Quote:
Originally Posted by archangelmorph View Post
Do you have a source/citation for that?
Do we need a source for that? I mean, ain't that the whole point behind the shift to unified shader architectures?
Xenos, thanks to the eDRAM, may also suffer from fewer bottlenecks involving bandwidth?
__________________
Sebbbi about virtual texturing
The Law, by Frederic Bastiat
'The more corrupt the state, the more numerous the laws'.
- Tacitus
liolio is offline  
Old 09-Jul-2012, 18:40   #13355
kagemaru
Senior Member
 
Join Date: Aug 2010
Location: Ohio
Posts: 1,350
Default

Quote:
Originally Posted by Acert93 View Post
A number of developers have been open about stating how they would have preferred the eDRAM budget be dedicated to the GPU proper, and I think that would be the major conclusion if MS presented a Cape Verde + Xenos eDRAM setup.
I thought the biggest complaint regarding eDRAM in the 360 was how the frame buffer had to reside in the eDRAM without any way to bypass it. Isn't there some way to use eDRAM for bandwidth without forcing the frame buffer to sit on the eDRAM? Apologies if this is a dumb question but some of the dev complaints seem to indicate the eDRAM could have been implemented differently in the 360.

Quote:
Originally Posted by Mianca View Post
That wikipedia number is misleading because it seems to be calculated in a weird way.

As far as my own math goes, RSX is a ~250 GFLOPS chip [(24*2*4-way pixel ALUs + 8*5-way vertex ALUs) @ 550MHz].
Not a big deal, but RSX is 500MHz.

Quote:
Originally Posted by archangelmorph View Post
Do you have a source/citation for that?
I'm guessing he's estimating this from the efficiencies gained through a unified shader architecture versus a more discrete shader model, where you can't tailor your game's shader load to your GPU spec 100% of the time. Meaning at some point your GPU may be underutilized. I could be wrong, of course.
kagemaru is offline  
Old 09-Jul-2012, 19:08   #13356
ERP
Moderator
 
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,669
Default

Quote:
Originally Posted by kagemaru View Post
Meaning at some point your GPU may be underutilized. I could be wrong, of course.
On a non-unified GPU it's underutilized at all points: you're either pushing simply-shaded tris, in which case the pixel shaders are underutilized, or you're doing complex shading, in which case your vertex shaders are underutilized.
Real games do both at different points in a frame: drawing shadows you're not doing any pixel shading; when you're doing the pretty lighting model, pixel shading is dominant.

On 360 the eDRAM also makes a difference, since it means the frame buffer memory is never the bottleneck.

Having said that, I would be surprised if real games saw 100% utilization on a 360 for a significant portion of a frame; there are still other things that gate throughput and cause ALUs to sit idle: texture fetches, triangle setup, number of ROPs, etc.
ERP is offline  
Old 09-Jul-2012, 21:08   #13357
Shifty Geezer
uber-Troll!
 
Join Date: Dec 2004
Location: Under my bridge
Posts: 30,364
Default

Quote:
Originally Posted by antwan View Post
Great explanation!
I guess that comparing 2005 and 2012 GPU designs on the basis of mm2 is pretty useless then.
Depends what you're comparing. For cost purposes, taking 1 mm^2 of transistors to be about the same price no matter what year it's made (which might be an unrealistic assumption), a console with a given silicon area will cost about the same. Hence the idea that if 300 mm^2 was the cost-effective limit for a $300 console in year A, 300 mm^2 would also be the limit for a $300 console in year B. Other factors are at play, but it seems an okay ballpark reference to me.

Another reference point is Moore's law. If transistor density increases 2x every 18 months, then a 10x increase in power in the same chip area happens every 5-ish years, which is our expected console generation, and the OP's (and most of us') original expectation.
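The arithmetic behind that, as a minimal sketch (assuming the 18-month doubling period above):

Code:
# Density doubles every 18 months => roughly 10x in a 5-year console generation
months = 5 * 12
doublings = months / 18          # ~3.33 doublings
print(2 ** doublings)            # ~10.1x transistor density in the same area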
__________________
Shifty Geezer
...
Flashing Samsung mobile firmwares. Know anything about this? Then please advise me at -
http://forum.beyond3d.com/showthread.php?p=1862910
Shifty Geezer is offline  
Old 10-Jul-2012, 00:05   #13358
antwan
Naughty Boy!
 
Join Date: May 2012
Posts: 200
Default

Quote:
Originally Posted by Acert93 View Post
First: how do you know they have eDRAM? Do you actually know stuff or are you frequently tossing out predictions as facts? Just curious, because you cut off the discussion with these facts, and if they are facts, great, as there is no point chasing pointless rabbit trails. But if not...

Second: 205mm^2 (Xenos logic) versus 135mm^2 is a LOT. 52% more silicon real estate, to be exact, and based on area "sunk" into basic architecture (see above on these supposed scalings--you get more performance from additional space) that is going to be substantially faster (looking at the 40-60% range).

I guess the issue is you have chosen to select Cape Verde as a baseline, and then anything above that is impressive, tasty, etc. To review:

G71 @ 90nm: 196mm^2
RSX @ 90nm: 240-258mm^2 (depending on source)
Xenos @ 90nm: 180mm^2 (Mintmaster)
eDRAM @ 90nm: 70mm^2 (~105m transistors, of which about 80m eDRAM and 25m ROP logic -- so 205mm^2 logic)

Cape Verde @ 28nm: 123mm^2
Pitcairn @ 28nm: 212mm^2
Oland @ 28nm: 235mm^2
Mars @ 28nm: 135mm^2

Some comparisons of budget:

RSX: Cape Verde: -53%
RSX: Pitcairn: -18%
RSX: Oland: -9%
RSX: Mars: -48%

Xenos : Cape Verde: -40%
Xenos : Pitcairn: +3%
Xenos : Oland: +14%
Xenos : Mars: -35%

My premise has continued to be comparing to the relative real estate of last generation. Given that as a general ballpark baseline (for all the reasons I have cited in this thread), the GPUs you mention (Cape Verde, this supposed Mars) are major, substantial drop-offs from last generation in terms of space. They also fall far short of desktop discrete GPUs from 4-5 years *before* the launch of the new consoles.

Sure, if this was 2010 maybe Cape Verde would be a nice upgrade, but we are talking about 2013+. And there is no other way to put this: putting in a GPU that is 40-50% smaller than RSX and Xenos is making a HUGE cut in silicon real estate. And if we are to buy the major push in the GPU industry toward GPGPU, which is supposedly taking up more tasks (something like a Cell processor does), the GPU will be asked to do more with less total space.

There is nothing exciting about this. Heck, we will be seeing AMD/Intel SoCs in the next couple years with better than console performance.

Also, I don't see 28nm as an excuse. There is nothing new about node transitions, but there is always someone whining (in this case NV). 28nm, which had select parts rolling out in late 2011, is going to be very mature by late 2013. Layout, yield, power, and pricing are all going to be at a more mature point than 90nm was in 2005.

If MS/Sony are going with chips that are nearly 50% smaller, it has a lot to do with their visions for their platforms (read: general entertainment devices that are equally focused on core gaming, casual gaming, new peripheral gaming, media streaming, service provider, digital distribution, etc) instead of a core gaming experience first (which has tens of millions of customers out there) with those other features coming along as a robust package. I have no problem with that, but I also don't think it should be sugar-coated against low-end hardware like Cape Verde -- or to throw out "Kinect 2 will be so much more accurate and default hardware so it will work with core" while glossing over the basics, e.g. that it lacks proper input for core genres (like FPS, driving, sports); regardless of precision, Kinect is a total non-starter in an FPS or the like because you cannot move. The only solution is rails, which is a huge step back. That may appeal to casuals but, again, is a major demand for concessions from core gamers.

TL;DR: Cape Verde/Mars are a huge reduction in hardware from last gen. HUGE.
Sorry, but I don't quite get the logic:
You are comparing (estimated) die sizes of 90nm 2005 GPU architectures against 28nm 2012 ones. The 28nm fab process allows for a lot more transistors on the same die size. Not to mention the GPUs themselves have improved vastly (to (over)simplify it: even the same number of transistors would yield a lot more performance in the 2012 GPUs).
So I believe your conclusion based on the comparison of the die sizes is not correct.

A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).
__________________
"If we look at this objectively, then color is definitely scientifically better."
antwan is offline  
Old 10-Jul-2012, 00:46   #13359
Sonic
Senior Member
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 1,705
Default

Quote:
Originally Posted by antwan View Post
Sorry, but I don't quite get the logic:
You are comparing (estimated) die sizes of 90nm 2005 GPU architectures against 28nm 2012 ones. The 28nm fab process allows for a lot more transistors on the same die size. Not to mention the GPUs themselves have improved vastly (to (over)simplify it: even the same number of transistors would yield a lot more performance in the 2012 GPUs).
So I believe your conclusion based on the comparison of the die sizes is not correct.

A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).
5 times the performance isn't going to cut it, and I think that is part of Acert's point. If they can go bigger and still be profitable then they should do it, as that will help them in the long run. It would be entirely sad to see these companies release sub-par hardware, especially in the graphics department, and then get rocked when their biggest competitor comes in with overpriced hardware and still manages to sell millions in minutes on day one.




But for real, any new details about the PS4 APU? The current rumors are that the cores have changed from Bulldozer to Jaguar. Are Jaguar cores significantly less powerful than Bulldozer cores?
Sonic is offline  
Old 10-Jul-2012, 00:47   #13360
Sonic
Senior Member
 
Join Date: Feb 2002
Location: San Francisco, CA
Posts: 1,705
Default

And what's this patent application? http://appft1.uspto.gov/netacgi/nph-...do&RS=nintendo

Have we been over it before?
Sonic is offline  
Old 10-Jul-2012, 00:59   #13361
Kb-Smoker
Member
 
Join Date: Aug 2005
Posts: 614
Default

Quote:
Originally Posted by Sonic View Post



But for real, any new details about the PS4 APU? The current rumors are that the cores have changed from Bulldozer to Jaguar. Are Jaguar cores significantly less powerful than Bulldozer cores?
Not buying that rumor. Not a good source....
Kb-Smoker is offline  
Old 10-Jul-2012, 01:23   #13362
onQ
Senior Member
 
Join Date: Mar 2010
Posts: 1,050
Default

Quote:
Originally Posted by Sonic View Post
And what's this patent application? http://appft1.uspto.gov/netacgi/nph-...do&RS=nintendo

Have we been over it before?
That seems to be in line with my thoughts on what Sony or Microsoft will do next-gen: release a new console, then make upgrades like the iOS devices to keep the hardware fresh for years to come.






"Versions of a multimedia computer system architecture are described which satisfy quality of service (QoS) guarantees for multimedia applications such as game applications while allowing platform resources, hardware resources in particular, to scale up or down over time. Computing resources of the computer system are partitioned into a platform partition and an application partition, each including its own central processing unit (CPU) and, optionally, graphics processing unit (GPU). To enhance scalability of resources up or down, the platform partition includes one or more hardware resources which are only accessible by the multimedia application via a software interface. Additionally, outside the partitions may be other resources shared by the partitions or which provide general purpose computing resources."


This is clearly the next Xbox

Last edited by onQ; 10-Jul-2012 at 02:47.
onQ is offline  
Old 10-Jul-2012, 03:03   #13363
LightHeaven
Junior Member
 
Join Date: Jul 2005
Posts: 512
Default

Quote:
Originally Posted by archangelmorph View Post
Do you have a source/citation for that?
I remember reading back then that since RSX's pixel pipelines are coupled to the texture units, whenever one of them is reading a texture it can't process a single thing.

That, plus the non-unified shader architecture, could account for some substantial underutilization of the hardware.
LightHeaven is offline  
Old 10-Jul-2012, 03:24   #13364
N2O
Member
 
Join Date: Apr 2010
Posts: 280
Default

Quote:
Originally Posted by onQ View Post
This is clearly the next Xbox
Good luck with that if you're talking about the specs.
N2O is offline  
Old 10-Jul-2012, 03:55   #13365
Acert93
Artist formerly known as Acert93
 
Join Date: Dec 2004
Location: Seattle
Posts: 7,807
Default

Quote:
Originally Posted by antwan View Post
I notice quite a lot of "mm2" or transistor numbers being thrown around. Why is that?
"more is better"
Quote:
Originally Posted by antwan View Post
Acert93 was using it as some kind of... performance measurement (?) though.
I like the number of transistors a lot more for that kind of comparison (it still doesn't really relate to performance that much, if at all).

edit: maybe someone has a list of "fps per mm2", or "transistors per fps", for a number of GPUs; then you would see how much sense that makes (or doesn't make) when comparing architectures.
Quote:
Great explanation!
I guess that comparing 2005 and 2012 GPU designs on the basis of mm2 is pretty useless then.
Maybe you could quote me because I don't think you understood what I said or the meaning of my posts.

Transistors are the physical microscopic "parts" that compose the logic and memory in modern electronics. Transistors aren't a great gauge of performance because (a) people count transistors differently, (b) different architectures use transistors more or less efficiently, and (c) memory is more dense than logic, hence one design may have a large amount of cache but less logic, etc. There have even been examples where moving down a process (hence closer proximity), or taking time to refine a design, has resulted in an architecture reducing transistor count. Importantly, due to these reasons, my OP was only looking at Moore's Law as a guideline (~2x density every 18-24 months) and the scaling of transistors as a general guideline for what we could project n process nodes into the future, with the caveat of architectural issues (e.g. features often come at the expense of performance; yet more advanced features may make certain desirable techniques performant whereas the older architecture scaled up would not, etc).

mm^2 (area, e.g. 10mm x 10mm is a 100mm^2 chip) is not a direct comparison of performance either. A ~ 250mm^2 RSX on 90nm is going to have about 1/10th the transistors of a 250mm^2 GPU of a similar architecture on 28nm. Area also doesn't tell us about the architecture and what kind of frequencies that architecture allows.

What mm^2 does allow us to do, as long as we take market conditions into consideration, is get a barometer of cost. This is not a 100% correlation due to said market considerations e.g. costs change over time (namely nodes tend to get more expensive), production early on a process is more expensive than when it is mature, wafers get bigger (which can reduce chip costs in the long run), etc.

So whether we are on a 90nm process or a 28nm process, a 225mm^2 (15mm x 15mm) chip on a 300mm (12 inch) wafer (which are round) nets about 245 die. Assuming that your wafer cost is exactly the same in 2005 on 90nm as it is in 2013 on 28nm (bad assumption), you could get the same number of chips at the same cost. There are too many variables at play to get an exact cost change--e.g. we would have to know when the consoles would ship, for one. But by late 2013, 28nm will probably be more mature than 90nm was in 2005 at TSMC. But then again, iirc there was more competition in the fab space in 2005, and costs have been increasing over time. But the transition to 300mm wafers is pretty standard (I don't remember if all 90nm production was on 300mm or some on 200mm).
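For the wafer figure above, one common back-of-the-envelope dies-per-wafer estimate looks like this (a sketch only; the edge-loss term is an approximation and it ignores scribe lines, edge exclusion and yield, so it lands a bit above the ~245 quoted):

Code:
import math

def gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    # wafer area / die area, minus a correction for partial dies at the edge
    d = wafer_diameter_mm
    return math.pi * (d / 2) ** 2 / die_area_mm2 - math.pi * d / math.sqrt(2 * die_area_mm2)

print(gross_dies_per_wafer(300, 225))  # ~270 gross candidates for a 15mm x 15mm die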

What may also be missed in there is that costs have not always gone up. Dies have gotten bigger over time and IHVs have been able to fit more product onto a board and make profits. Another dynamic is how the GPUs and CPUs have swallowed up other onboard chips (e.g. Intel's FSB was on the northbridge iirc, and IGPs that used to be on the motherboard chipset are now APUs on the CPU die, etc).

I fully admit there are many, many factors and variables. My posts recently looking at die size (area, mm^2) did touch on the very fact that die size alone doesn't tell the whole story--e.g. the new "barrier" may have shifted from cost of the die to TDP. If that is the case, a larger chip aimed at a specific TDP (which would likely have lower voltage, frequency, and an architecture aimed at a set TDP) may provide more performance than a smaller, higher-clocked chip which will hit the TDP wall sooner. The other aspect is the bus size. If you want a 256-bit bus you will need to be a certain size. Just as importantly, you need to consider the limits at the next node: e.g. if your chip just fits a 256-bit bus on 28nm, then a shrink to 20nm may not help you as the bus is so large (they don't shrink fast at all). Hence so much talk of 128-bit buses. Of course it may be cheaper to target 28nm for a much longer product cycle, as 20nm may be expensive and integrating FinFETs may not be an option (or even available).

Anyways, I think you misunderstood my posts.

One of the takeaway points in general is that I think we will see Sony and MS reduce the silicon budget (but rest assured they will claim the chips cost them even more...) and those budgets will be shifted to things like Kinect 2 and media experiences (think of how Sony sold out for Blu-ray; this gen it will be selling out to the "Cloud"). I guarantee you that we will soon be inundated with a flood of "n more transistors!" and the like, and gaudy numbers like "6x faster!", when, I predict, we are going to see a large reduction in investment into silicon. Which is fine, just as long as it isn't obscured by useless expressions like "But it is 6x faster with 5x the transistors!"

Likewise, the performance level being discussed these days (e.g. Cape Verde) is 2009 GPU performance and doesn't even crack 30fps in Crysis 2 at 1080p. Which, again, is fine as long as it isn't trumpeted as some leap in performance. Hence my responses that these chip specs are quite old by today's standards, don't show any benefit of the extra-long generation, don't offer technology that delivers products far and away a generational leap over the current consoles, and would reflect a steep decline in silicon budgets. That last point is the one I have been mainly discussing. Steep cuts in chip size may be the best move for these companies, but my purpose is to look at the flip side of "Cape Verde is sooo much faster than Xenos!" and say, "Yes, but in terms of silicon footprint it is a massive reduction over RSX/Xenos, and for those angling at an n-times performance increase, Cape Verde class hardware offers little in terms of the performance for a generational leap graphically." And the benchmarks show that. Alas, this is more a back and forth of what people want/desire, expect, what is good strategy, what is possible, etc. If you take anything away from the previous posts, it is that 1TFLOPS GPUs or Cape Verde would be much smaller than Xenos/RSX, and there is no reason more could not be obtained from these GPUs with similar TDP limits if that, instead of die size, is the limiter.

The other point I was discussing was the claim that larger chips must have a very high TDP, when this isn't necessarily true. It seems frequency accelerates toward the TDP wall faster than die size on some designs and processes, so a larger, lower-clocked GPU will likely provide more performance per watt. Hence a larger die does not, out of hand, have to violate power constraints. In fact a larger chip within TDP budgets would indicate the designers didn't run away from the performance issues, toss their hands up and say, "Well, let's just invest the extra money on cloud services." Which, again, may be the best strategy, but it is not necessarily a technical barrier.

EDIT:

Quote:
Originally Posted by antwan View Post
A chip that is 50% smaller on the outside could have 5 times the performance (2012 vs 2005).
Ok, I know you don't understand my posts. Go back over them again please before telling me my comparison is incorrect and has faulty logic. I am obviously not denying that a GPU 1/2 the size on 28nm will be faster than Xenos. Between frequency increases, architecture, and sheer logic increases, it will be much faster.

Yet a 135mm^2 GPU is a huge reduction in physical real estate from, say, a 258mm^2 RSX and it represents a massive shift in console design and, frankly, purpose. If you are fine with a 135mm^2 GPU, that is fine but also neither here nor there in terms of my point.

What I would say is that looking at gaming benchmarks, Cape Verde class GPUs struggle with games on high quality at 1080p, which doesn't paint a rosy picture for traditional and progressive visual enhancements. It also raises specific architectural considerations, as I noted in those posts: why would someone argue for DDR4 with a lot of bandwidth AND eDRAM when Cape Verde obviously doesn't need much more bandwidth than what DDR4 would offer?

It actually seems that an APU with a Cape Verde class GPU and a large pool of DDR4 on a wide bus would be a very well-balanced system design--one large pool of memory (e.g. 8GB), CPU and GPU on a single large die which makes the wide DDR4 bus possible, and the close proximity of the GPU and CPU should cut down on some bandwidth needs.

If you wanted a cheap console that was able to balance a CPU, GPU, and memory well equipped to work together, this seems perfect to be quite honest. eDRAM doesn't seem necessary for this class of GPU and only complicates the design and makes it more expensive. Which was one of my points.
__________________
"In games I don't like, there is no such thing as "tradeoffs," only "downgrades" or "lazy devs" or "bugs" or "design failures." Neither do tradeoffs exist in games I'm a rabid fan of, and just shut up if you're going to point them out." -- fearsomepirate
Acert93 is offline  
Old 10-Jul-2012, 06:23   #13366
Squilliam
Beyond3d isn't defined yet
 
Join Date: Jan 2008
Location: New Zealand
Posts: 3,155
Default The reason for the *rumoured* 8GB RAM and heavy CPU performance in Durango

Perhaps they intend to use much more 'procedural content' in their next generation console. The reason why they have so much memory is so they can store content without having to recompute it all the time, and this is also the reason why the design seems to be more CPU-centric.

Consoles have two major throughput bottlenecks, the optical drive and the internet, as well as a storage problem if they want to create SKUs which don't have mechanical HDDs. Procedural content seems to solve both these problems by acting as a form of heavy compression, amplifying the quantity of 'real' data which can pass through. If an optical drive realistically tops out at 33MB/s and the internet at 20Mbps, then this is a major problem they need to solve in order to actually deliver and make use of stored content.

If they can increase the compression of data significantly then they can achieve a much higher throughput without having to resort to expensive technology such as flash in order to speed up data delivery. This makes online distribution significantly more feasible for a wider variety of customers, and it may mean they can get away with using solid-state storage instead of mechanical HDDs if it proves cost effective. They could use 64-128GB of flash memory (not an SSD) and offer easy expansion options for customers as well.
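As a sketch of the throughput argument (the 4:1 amplification ratio here is purely hypothetical, just to show the scale of the effect against the figures quoted above):

Code:
# Effective content delivery rate if procedural/compressed data expands N:1 on the console
optical_mb_s  = 33          # MB/s, optical drive figure from above
internet_mb_s = 20 / 8      # 20 Mbps is roughly 2.5 MB/s
amplification = 4           # hypothetical expansion ratio
print(optical_mb_s * amplification)   # 132 MB/s effective from disc
print(internet_mb_s * amplification)  # 10 MB/s effective from download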
__________________
It all makes sense now: Gay marriage legalized on the same day as marijuana makes perfect biblical sense.
Leviticus 20:13 "A man who lays with another man should be stoned". Our interpretation has been wrong all these years!
Squilliam is offline  
Old 10-Jul-2012, 07:13   #13367
Mianca
Member
 
Join Date: Aug 2010
Posts: 330
Default

Quote:
Originally Posted by onQ View Post
This is clearly the next Xbox
If they're really going for a scalable core system, they'll probably need a strongly customized GPU solution with some kind of advanced tile-based rendering routine, aka something like Imagination Technologies' PowerVR approach or a highly customized AMD design* that's at least as scalable.

They'd probably have to start with [(core system)*2] @ 28nm at launch to keep the next "jump" [(core system)*3 @ 22nm] manageable.

=========

*Looking a little closer, what if those 64 ALUs mentioned in the leaked presentation actually are 4D-ALUs?

I always found the way AMD introduced their VLIW4 design rather curious. The fact that they put so much extra effort into an allegedly "transitional" architecture (with GCN already around the corner) just didn't make a lot of sense – and using that new arch in only ONE dedicated chip (Cayman) didn't exactly make the move more reasonable from an economical point of view. Now, instead of going for a direct jump from VLIW5 to GCN, they use VLIW4 in Trinity.

So … what if there’s more to that VLIW4 architecture than meets the (retail) eye?

What I’m trying to say is that 6-8 smallish CPU cores (like ARM or Jaguar) + 64 VLIW4 ALUs + a XENOS-like daughter die with some eDRAM would basically make for a nicely balanced “core system”. Put two of those systems on an interposer, make sure the GPU parts can efficiently work together (evolved tile based rendering), add a lot of RAM and you’ve got the heart of a very nice, very scalable next-gen console.

On a side note, two GPU parts with 64 VLIW4 ALUs each would, interestingly, combine to offer just over 1 TFLOP of processing power @ 1GHz core clock …
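The side-note figure checks out with the usual peak-FLOPS arithmetic (a sketch; the ALU width, count and clock are the hypothetical ones from this post):

Code:
# 64 VLIW4 ALUs per GPU part, 2 FLOPs per lane per clock, 1 GHz core clock
per_part_gflops = 64 * 4 * 2 * 1.0     # 512 GFLOPS per core system
print(2 * per_part_gflops)             # ~1024 GFLOPS for the two-part launch config
print(3 * per_part_gflops)             # ~1536 GFLOPS for a later three-part revision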

Last edited by Mianca; 10-Jul-2012 at 08:45.
Mianca is offline  
Old 10-Jul-2012, 11:32   #13368
french toast
Senior Member
 
Join Date: Jan 2012
Location: Leicestershire - England
Posts: 1,634
Default

Whilst interesting... that would be a letdown.
I thought some time ago that we would be getting a VLIW4 design... with something like 32 ROPs, 64 texture units... and something crazy like 2000 shaders, clocked @ around 750MHz. That would make a very nice console GPU indeed.

Use a 128-bit bus to 4GB DDR4 and a unified pool of read/write eDRAM some 64MB in size... and we have a winner.

Last edited by french toast; 10-Jul-2012 at 11:50.
french toast is offline  
Old 10-Jul-2012, 12:09   #13369
Mianca
Member
 
Join Date: Aug 2010
Posts: 330
Default

It would be a letdown for a system that's supposed to survive a ten-year life cycle without major updates ... but it should be more than sufficient for a system that's actually designed to scale over time.

Assuming that they really aim to keep upscaling their hardware with future revisions (number of core systems, clock speeds), they don't need an overly powerful system at launch. What they do need is a system that's actually profitable at launch - and forward-compatible to run software that will soon be optimized to run on future, enhanced hardware.

The latter might also serve as a possible explanation for the huge amount of RAM that's been rumored: While the launch system won't have to be designed to run all future games and additional services in their highest fidelity, it has to have the memory reserves to be capable of at least somehow running them - possibly with some future services in the background. And that will - even in lower fidelity - require lots of RAM.

I'm still not sure whether, as a customer, I actually like the idea of hardware updates every few years. With many people continuously migrating to the "best" revision, it would certainly earn them a lot of additional money, though. Just like Apple with their continuous revisions of new iPhones and iPads, they could basically just keep milking a huge part of their user base with every new revision.

Last edited by Mianca; 10-Jul-2012 at 12:26.
Mianca is offline  
Old 10-Jul-2012, 13:04   #13370
N2O
Member
 
Join Date: Apr 2010
Posts: 280
Default

So onQ123, according to your NeoGAF post, do you still think the old next-gen Xbox doc's specs are still relevant right now?
Just asking
N2O is offline  
Old 10-Jul-2012, 13:59   #13371
onQ
Senior Member
 
Join Date: Mar 2010
Posts: 1,050
Default

Quote:
Originally Posted by N2O View Post
So onQ123, according to your NeoGAF post, do you still think the old next-gen Xbox doc's specs are still relevant right now?
Just asking
Leaked documents: 4X - 6X Xbox 360 for games

Info that we have heard about the next Xbox: 1 - 1.5 TFLOPS


Xbox 360 GPU: 240 GFLOPS

6X 240 GFLOPS = 1.44 TFLOPS
onQ is offline  
Old 10-Jul-2012, 14:19   #13372
N2O
Member
 
Join Date: Apr 2010
Posts: 280
Default

Both are outdated; the 6x target doesn't exist anymore.
Well, just wait for it.

Last edited by N2O; 10-Jul-2012 at 14:26.
N2O is offline  
Old 10-Jul-2012, 14:37   #13373
AlNets
Posts may self-destruct
 
Join Date: Feb 2004
Location: In a Mirror Darkly
Posts: 15,176
Default

Quote:
Originally Posted by Rangers View Post


immintrin.h, eh?

Curious.
__________________
"You keep using that word. I do not think it means what you think it means."
Never scale-up, never sub-render!
(╯□)╯︵ □ Flipquad
AlNets is offline  
Old 10-Jul-2012, 14:49   #13374
upnorthsox
Senior Member
 
Join Date: May 2008
Posts: 1,422
Default

Quote:
Originally Posted by french toast View Post
Whilst interesting...that would be let down.
I thought some time ago that we would be getting a vliw 4 design....with something like 32 rops 64 texture units....and something crazy like 2000 shaders.. clocked @ around 750mhz that would make a very nice console gpu indeed.

Use a 128bit bus to 4gb ddr 4 and a unified pool of read/write edram of some 64mb in size...and we have a winner.
128-bit DDR4 is going to get you about 40GB/s of bandwidth. Add to that, in a UMA configuration you're talking a probable effective bandwidth under 30GB/s. eDRAM just isn't going to make up that much. And this with the specter of discrete PC GPUs going to 500GB/s to a TB/s of bandwidth with stacked memory a year or two after you launch.

I'm not seeing a lot of win there.
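The ~40GB/s figure follows from the usual bus arithmetic (a sketch, assuming a DDR4 data rate of about 2.5 GT/s, which is roughly what gives the number quoted):

Code:
# Peak bandwidth = bus width in bytes * transfer rate
bus_bits = 128
transfers_per_s = 2.5e9                        # assumed DDR4 data rate
print(bus_bits / 8 * transfers_per_s / 1e9)    # ~40 GB/s peak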
upnorthsox is offline  
Old 10-Jul-2012, 15:03   #13375
McHuj
Member
 
Join Date: Jul 2005
Location: Austin, Tx
Posts: 749
Default

Quote:
Originally Posted by AlStrong View Post
immintrin.h, eh?

Curious.
Well, if there were any truth here, I guess you could say this would confirm an x86 architecture, no?

Quote:
Originally Posted by upnorthsox View Post
128-bit DDR4 is going to get you about 40GB/s of bandwidth. Add to that, in a UMA configuration you're talking a probable effective bandwidth under 30GB/s. eDRAM just isn't going to make up that much. And this with the specter of discrete PC GPUs going to 500GB/s to a TB/s of bandwidth with stacked memory a year or two after you launch.

I'm not seeing a lot of win there.
With DDR4, I'm guessing you'd have to go to a 256-bit bus, but wouldn't that basically mean you'll always be limited to 8 DDR chips? I don't think they make chips with 64-bit I/O.

I think they'll come up with a solution that isn't drastically bandwidth limited; I don't think MS would hamper themselves like that.
McHuj is offline  
