Wii U hardware discussion and investigation *rename

Transitioning to the most advanced node is a slow process because it's initially expensive, especially for larger designs. This couldn't be more true for eDRAM, given the way it is manufactured.

Well, Microsoft did it twice on the XB360, so it probably pays off.
 
As I understand it, the eDRAM of the Xbox 360 is located on a separate die (as is the 24MB of 1T-SRAM main RAM in the Wii), so the bandwidth from GPU <-> eDRAM is "only" 32GB/sec, while the bandwidth between the eDRAM logic and memory is 256GB/sec. If the eDRAM/1T-SRAM of the Wii U were located on the same die, would this mean they could use the full bandwidth of 256GB/sec (assuming the same frequency of 500 MHz)?

Quoting myself: Could someone please answer this? Thx!
 
The FinancialSuccess = f(eDRAM) relationship is both OT and mind-blowing. Please let it go and return to the topic at hand, thank you.

That wasn't the relationship. But as it's OT I'll have to let it go!

The eDRAM needs an I/O as well. It's not insignificant either.
I think he is referring to on-die eDRAM.

Yeah, I was referring to on-die eDRAM with the comment AIStrong quoted, but as I've jumped around a lot, talking about a number of consoles (including a lot of 360), that wasn't clear. The 360 seems to be a bit of an anomaly with its never-integrated daughter die, and it's fair to point out there are additional costs associated with it.

I can't see Nintendo going this route though. I hope they use an "APU" (to use AMD's term) and that it has a large pool of on-die eDRAM that can be used as developers wish. I think that's probably the ideal way to structure a console chip where you want to maximise performance per watt and performance per $ over the life of the system.

Paying for all your memory to have high bandwidth makes a lot of sense on a PC graphics card, but to me not so much on console where buffer sizes are probably smaller and more predictable. I think Nintendo probably see it this way too.
 
As I understand it, the eDRAM of the Xbox 360 is located on a separate die (as is the 24MB of 1T-SRAM main RAM in the Wii), so the bandwidth from GPU <-> eDRAM is "only" 32GB/sec, while the bandwidth between the eDRAM logic and memory is 256GB/sec. If the eDRAM/1T-SRAM of the Wii U were located on the same die, would this mean they could use the full bandwidth of 256GB/sec (assuming the same frequency of 500 MHz)?

Yeah, bandwidth to on-die eDRAM is unlimited for all intents and purposes.
 
Well, Microsoft did it twice on the XB360, so it probably pays off.

I'm just saying they were rather late, and so shouldn't be relied upon. The two shrinks MS employed were 80nm and 65nm. It took forever for 65nm to show up (Valhalla die size estimations imply 65nm). The 80nm shrink was more due to switching foundries (NEC to TSMC) than any big plan - it's also just a half-node.
 
I'm just saying they were rather late, and so shouldn't be relied upon. The two shrinks MS employed were 80nm and 65nm. It took forever for 65nm to show up (Valhalla die size estimations imply 65nm). The 80nm shrink was more due to switching foundries (NEC to TSMC) than any big plan - it's also just a half-node.

Didn't know about the first one. So it was 3 times in fact, because they are already @45nm, no?
 
I think the edram might still be 65 nm. The move from NEC 90nm to TSMC 80nm may have also been driven by the need to reduce thermal issues contributing to RROD. Did MS ever confirm 80nm or did they keep quiet about it?

It could be that they can't shrink the daughter die (containing the edram) any further because of I/O requirements and so there's no point in going further, or it could be there's no suitable process that's smaller or that's cost effective.
 
The combined die is 45nm litho, but as I said, the die size of the edram in the Valhalla revision implies 65nm (64mm^2 @ 80nm to 45mm^2). If 45mm^2 is 40nm litho, then they have some really shit scaling.
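
A quick ideal-scaling check on that, using only the 64mm^2 @ 80nm and ~45mm^2 figures already mentioned; this is pure lithographic area scaling, which real dies with pads and array overhead never quite hit, so treat it as an upper bound on the expected shrink:

```python
# Ideal area scaling of the 360 eDRAM daughter die across nodes.
# 64 mm^2 at 80 nm is the figure quoted above; everything else is ideal-case maths.

def ideal_shrink(area_mm2: float, old_nm: float, new_nm: float) -> float:
    return area_mm2 * (new_nm / old_nm) ** 2

print(ideal_shrink(64.0, 80, 65))  # ~42.3 mm^2 -- in line with the ~45 mm^2 seen on Valhalla
print(ideal_shrink(64.0, 80, 40))  # ~16.0 mm^2 -- if Valhalla's 45 mm^2 were 40 nm, scaling would be dire
```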

It could be that they can't shrink the daughter die (containing the edram) any further because of I/O requirements and so there's no point in going further, or it could be there's no suitable process that's smaller or that's cost effective.

eDRAM manufacturing is more complex. 40nm may also not have been ready or feasible at the time (early 2010). It certainly wouldn't have been inexpensive.

http://forum.beyond3d.com/showpost.php?p=1532902&postcount=5280

We're straying from the topic at hand though.
 
As I understand it, the eDRAM of the Xbox 360 is located on a separate die (as is the 24MB of 1T-SRAM main RAM in the Wii), so the bandwidth from GPU <-> eDRAM is "only" 32GB/sec, while the bandwidth between the eDRAM logic and memory is 256GB/sec. If the eDRAM/1T-SRAM of the Wii U were located on the same die, would this mean they could use the full bandwidth of 256GB/sec (assuming the same frequency of 500 MHz)?

The 256GB/s is the bandwidth between the ROPs and the eDRAM. The 32GB/s is just for the resolve step on the way to main memory.
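
For anyone wanting to sanity-check those numbers, here's a back-of-envelope sketch. It just inverts bandwidth = bus width x clock and assumes the 500 MHz figure from the question; the real GPU <-> daughter-die link may run at a different clock and width, so the second number is only an equivalent width:

```python
# Back-of-envelope: implied bus width for a given bandwidth and clock.
# Assumes 500 MHz as per the question above; these are equivalent widths, not a die shot.

def bus_width_bits(bandwidth_gb_s: float, clock_mhz: float) -> float:
    bytes_per_cycle = (bandwidth_gb_s * 1e9) / (clock_mhz * 1e6)
    return bytes_per_cycle * 8

print(bus_width_bits(256, 500))  # 4096.0 bits: ROP logic <-> eDRAM array on the daughter die
print(bus_width_bits(32, 500))   # 512.0 bits equivalent: GPU <-> daughter die (resolve path)
```

The 256GB/s figure only ever exists inside the daughter die, which is why moving the eDRAM on-die wouldn't change it; it just removes the narrower external hop.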
 
Yeah, bandwidth to on-die eDRAM is unlimited for all intents and purposes.
Hurm, well, on-chip buses can burn quite a bit of power too; that's probably a reason why Cell's on-chip bus runs at only half the core clock, for example.
 
eDRAM, whether on-die or on a daughter chip, needs a bus to address it like any memory pool. And that bus has a variable cost depending on the type of interface the engineers choose to go with.

And since someone talked about the memory requirement for the various frame buffers for the Xbox 360, I'll just point to Wavey's excellent article on C1 for the formula:
http://www.beyond3d.com/content/articles/4/5
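
For the lazy, the gist of that kind of formula is just resolution x (colour + Z/stencil bytes per sample) x MSAA samples. A rough sketch with my own illustrative numbers (assuming 32-bit colour and 32-bit Z/stencil; see the article for the full treatment):

```python
# Rough framebuffer-size estimate in the spirit of the C1 article linked above.
# Assumes 32-bit colour and 32-bit Z/stencil per sample.

def fb_size_mib(width: int, height: int, color_bytes: int, depth_bytes: int, msaa: int) -> float:
    return width * height * (color_bytes + depth_bytes) * msaa / 2**20

print(fb_size_mib(1280, 720, 4, 4, 1))  # ~7.0 MiB: 720p with no AA fits in the 360's 10MB eDRAM
print(fb_size_mib(1280, 720, 4, 4, 4))  # ~28.1 MiB: 720p with 4xMSAA, hence tiling on the 360
```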

While I'm at it, I'll also link to the PS3 hardware scaling capabilities article, since it has also been mentioned:
http://www.beyond3d.com/content/articles/16/

Oh, and thanks y'all for the positive feedback on the article!
 
30MB 1T-SRAM-Q @45nm (33.6mm²) would only be slightly bigger than the 3MB 1T-SRAM @90nm in the Wii (26.4mm²), so I wouldn't say 16MB is overly optimistic for CPU and GPU, quite the contrary from my perspective.
It's tiny, so it could be workable, but for some reason it doesn't square that well with earlier rumors about the system, even though I know devs don't seem to know the final specifications.

Say we're looking at four low-clocked OoO PowerPC cores sharing between 2 and 4 MB of L2, a 320SP custom R7xx GPU and 16 MB of on-board scratchpad memory: I feel like most reports and rumors would have been a bit more flattering for the system. My belief is that such a system would run most games @ 1080p with the same performance as the PS360 (or close to 1080p, as a lot of games are only close to 720p nowadays). Even @720p the advantage would be significant (not a generational jump for sure, but still): more shading power (and a more efficient architecture), higher quality texture filtering, and some neat gains versus what happened with the 360, as the scratchpad memory would be available to the GPU ALUs without having to write back to main RAM. Then there's the amount of RAM; 1 GB would be enough to make a significant difference too.
Taking all this into account, I believe the difference would be perceivable even at an early stage of development.
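
To put a number on the 720p vs 1080p part of that argument, here's the raw pixel-count gap; whether a 320SP R7xx closes it obviously depends on clocks and per-ALU efficiency, so this is only the extra raw work, and the 1152x640 figure is just a commonly cited sub-HD rendering resolution, not a rumoured Wii U target:

```python
# Raw pixel counts behind the 720p vs 1080p comparison.
# 1152x640 stands in for a typical sub-HD current-gen rendering resolution.

def pixels(w: int, h: int) -> int:
    return w * h

p720   = pixels(1280, 720)    # 921,600
p1080  = pixels(1920, 1080)   # 2,073,600
sub_hd = pixels(1152, 640)    # 737,280

print(p1080 / p720)    # 2.25x the pixels of native 720p
print(p1080 / sub_hd)  # ~2.81x the pixels of a typical sub-HD target
```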

Anyway, I kind of hope you're right, as my wife would happily welcome a Nintendo system in our place, and so would I, if the system gets rid of some of the most offensive flaws of this gen, even if it doesn't bring the whole thing to another level.
 
I just looked through the IBM press release again, and I'm now almost certain that we're looking at an SoC and the eDRAM won't be L3 cache:

The all-new, Power-based microprocessor will pack some of IBM's most advanced technology into an energy-saving silicon package that will power Nintendo's brand new entertainment experience for consumers worldwide. IBM's unique embedded DRAM, for example, is capable of feeding the multi-core processor large chunks of data to make for a smooth entertainment experience.
The way they describe the purpose of the eDRAM doesn't scream L3 cache to me. Were it cache, they'd probably have written just that. "Feeding data" sounds like a memory pool to me. And the press release calls it a "silicon package", which could very well mean an APU, in which case that memory pool would almost certainly be shared between CPU and GPU.
 
But they did explicitly write [...] capable of feeding the multi-core processor large chunks of data [...]. That is exactly what a cache does, no?
You are right about the "silicon package" passage though; that really sounds like an SoC.
 
But they did explicitly write [...] capable of feeding the multi-core processor large chunks of data [...]. That is exactly what a cache does, no?
You are right about the "silicon package" passage though; that really sounds like an SoC.
Sure, but I still think they'd say cache if it were cache. Looking through IBM's archives, they actually built chips where the eDRAM wasn't used as L3 cache (Cu-32).
 
Sure, but I still think they'd say cache if it were cache. Looking through IBM's archives, they actually built chips where the eDRAM wasn't used as L3 cache (Cu-32).
They are building the POWER A2/EN where it's used as L2, that's it.
 
But they did explicitly write [...] capable of feeding the multi-core processor large chunks of data [...]. That is exactly what a cache does, no?
You are right about the "silicon package" passage though; that really sounds like an SoC.

If IBM wanted to say SoC, don't you think they would have just said it?
Instead, maybe they are actually saying SoP.

System on a Package.

System-on-Package (SOP) technology based on silicon carriers has the potential to provide modular design flexibility and high-performance integration of heterogeneous chip technologies and to support robust chip manufacturing with high-yield/low-cost chips for a wide range of two- and three-dimensional product applications. Key technology enablers include silicon through-vias, high-density wiring, high-I/O chip interconnection, and supporting test and assembly technologies. The silicon through-vias are a key feature permitting efficient area array signal, power, and ground interconnection through these thinned silicon packages. High-density wiring and high-density chip I/O interconnection can enable tight integration of heterogeneous chip technologies which approximate the performance of an integrated system-on-chip with a "virtual chip" using the silicon package for integration. Silicon carrier fabrication leverages existing manufacturing capability and mid-UV lithography to provide very dense package wiring following CMOS back-end-of-line design rules. Further, the thermal expansion of the silicon carrier package matches the chip, which helps maintain reliability even as the high-density chip microbump interconnections scale to smaller size.

"IBM will continue to do research—research and engineering services are IBM's strengths—not low cost manufacturing," Petrov wrote. He said IBM has invested in silicon-on-insulator process technology, eDRAM, through-silicon-via technology (TSV), and 3D silicon
http://itmanagement.earthweb.com/datbus/article.php/3881086/Is-IBMs-Foundry-Business-Next-to-Go.htm

IBM will link chips together in a relatively new way that the company says will improve performance and cut power consumption.
The technology, called through-silicon vias, or TSV, involves connecting different components--processors and memory, for example--or different cores inside of two respective chips through thousands of tiny wires that will carry data back and forth. Now, chips mostly transfer data over channels called buses, embodied in wires, which can get overwhelmed. With TSV, far more data can be transferred per second in a less energy-intensive manner.

IBM is not the first company to talk about TSV (Intel is), but could be one of the first to commercially exploit them. IBM will deliver samples of communication chips with TSV to customers later this year and begin commercial production in 2008. TSV will reduce power consumption in silicon germanium chips, a favorite of IBM's, by around 40 percent. In these chips, microscopic holes will be drilled into the chip and filled with tungsten to create the TSVs.


TSVs additionally economize on motherboard space because chips are stacked into vertical towers. Several chip companies stack chips vertically in packages now, but these chips are generally wired together through buses, so they achieve the space advantages but not all of the bandwidth benefits. Typically, bus ports are on the side of chips. TSVs sprout from the comparatively spacious lower or upper surface of a chip and drill through the silicon.

Packaging chips together through TSV also potentially will change how chips get sold. Rather than buy processors and memory or different communications chips from different vendors, computer makers will buy complete packages of prewired chips. Thus, companies like IBM or Intel will potentially find themselves selling standard types of memory again because it will be prepackaged with its premier semiconductors.
http://news.cnet.com/IBM-connects-chips-for-better-bandwidth/2100-1006_3-6175355.html


Check out the 3D system roadmap

http://wenku.baidu.com/view/38da00d9d15abe23482f4d59.html

http://fuji.stanford.edu/events/spring01/slides/shermanSlides.pdf


Could this be the direction that Nintendo is taking with the WiiU?

Three-dimensional (3D) chip integration may provide a path to miniaturization, high bandwidth, low power, high performance and system scaling. Integration options can leverage stacked die and/or silicon packages depending on applications. The enabling technology elements include: (i) through-silicon-vias (TSV) with thinned silicon wafers, (ii) fine pitch wiring, (iii) fine pitch interconnection between stacked die, (iv) fine pitch test for known-good die, and (v) power delivery, distribution and thermal cooling technology. Applications may range from miniaturization of portable electronics like image sensors and cell phones to power efficient, high performance computing solutions such as servers and super computers.
http://ecadigitallibrary.com/pdf/58thECTC/s13p1p49.pdf
 