Wii U hardware discussion and investigation

Back on the fab thing ...

Is there actually any evidence that the GPU is manufactured on TSMC 40nm? Entropy's entire argument was based on Jim Sterling's word being fact, but we now know from Jim Sterling that it isn't a fact, so what else is there?

Wii U on Renesas 40/45nm with lower transistor density than AMD's "high density library" TSMC 40nm Radeon parts works rather well, just as it always has. If you want to use Occam's Razor, this is what it gets you IMO.

Another issue with TSMC 40nm would be that the Wii U SIMD block would have to be 15~20% higher density than Brazos to achieve 80 shaders per block:

[Image: Brazos vs Wii U SIMD block comparison (brazos_wiiu_simdfeuev.jpg)]


... Which would mean that either Brazos SIMD blocks are far more complex than Wii U SIMD blocks, and/or that Renesas and Nintendo are kicking AMD's ass at AMD's own game despite using TSMC (with whom AMD have many years of experience developing high density GPUs).

An interesting example of how density can change depending on tools and how layout is done (though this is talking about applying AMD GPU layout tools to their upcoming CPUs):

http://www.tomshardware.com/news/Steamroller-High_Density_Libraries-hot-chips-cpu-gpu,17218.html

It seems there can be a hell of a difference depending on who does the layout and how. I would not automatically assume that a Renesas designed and manufactured 40/45nm chip should be the same size as an AMD designed and TSMC manufactured 40nm chip.
 
How would you perform 4D transforms and whatnot with just 3 simple units? The T-unit doesn't execute those instructions AFAIK (that's why AMD's earlier designs were VLIW5 to begin with...)
Based on admittedly basic descriptions of the t-unit as an ALU with additional logic to handle transcendental and data type conversion operations, I thought that the t-unit was a superset of the simple ALU, or could be made one with minimal changes. Ideally the hardware would be there already and those instructions were just never routed to the t-unit when there were 4 simple ALUs available in VLIW5. Perhaps that's not the case.
 
For what it's worth I think Fourth Storm's reasoning is sound and I agree with him. He does a better job of explaining it than I have, too. The work on identifying the TMUs shows real commitment. :D

Based on the evidence for 8 TMUs (I'd originally assumed 16) it makes me think again that maybe textured fill rate, rather than ROP BW, might be involved in some of the high-overdraw alpha-texture performance hits we saw in multiplatform Wii U games (CoD etc).

Thanks, haha, you've made some admirable contributions yourself. It's been fun cracking this thing. Anyway, I still think bandwidth might play into the alpha issues somewhat if we're just comparing what's available to the ROPs. 256 GB/s vs probably 70.4 GB/s might have caused some of those stutters if devs were working off 360 code as a base. I want to say that the TMUs gained some efficiency between Xenos and R700 to somewhat mitigate the impact of having less on Latte but I'd have to look into it a bit more.
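For anyone wanting to check those two figures: the 256 GB/s is the 360's quoted internal eDRAM-to-ROP bandwidth on the daughter die, while the 70.4 GB/s guess for Latte falls out of assuming a 1024-bit eDRAM interface at the 550 MHz GPU clock; that bus width is speculation, not a confirmed spec. A quick Python sketch:

Code:
# Where the two ROP bandwidth figures come from.
# Xenos: quoted internal eDRAM<->ROP bandwidth on the daughter die.
# Latte: ASSUMED 1024-bit eDRAM bus at the 550 MHz GPU clock.
xenos_edram_bw = 256e9                  # bytes/s
latte_edram_bw = 550e6 * 1024 / 8       # 70.4e9 bytes/s
print(latte_edram_bw / 1e9)             # 70.4
print(xenos_edram_bw / latte_edram_bw)  # ~3.6x in Xenos' favour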

Back on the fab thing ...

Is there actually any evidence that the GPU is manufactured on TSMC 40nm? Entropy's entire argument was based on Jim Sterling's word being fact, but we now know from Jim Sterling that it isn't a fact, so what else is there?

Wii U on Renesas 40/45nm with lower transistor density than AMD's "high density library" TSMC 40nm Radeon parts works rather well, just as it always has. If you want to use Occam's Razor, this is what it gets you IMO.

Another issue with TSMC 40nm would be that the Wii U SIMD block would have to be 15~20% higher density than Brazos to achieve 80 shaders per block:

[Image: Brazos vs Wii U SIMD block comparison (brazos_wiiu_simdfeuev.jpg)]


... Which would mean that either Brazos SIMD blocks are far more complex than Wii U SIMD blocks, and/or that Renesas and Nintendo are kicking AMD's ass at AMD's own game despite using TSMC (with whom AMD have many years of experience developing high density GPUs).

An interesting example of how density can change depending on tools and how layout is done (though this is talking about applying AMD GPU layout tools to their upcoming CPUs):

http://www.tomshardware.com/news/Steamroller-High_Density_Libraries-hot-chips-cpu-gpu,17218.html

It seems there can be a hell of a difference depending on who does the layout and how. I would not automatically assume that a Renesas designed and manufactured 40/45nm chip should be the same size as an AMD designed and TSMC manufactured 40nm chip.

Jim Morrison, it was actually, but yes I agree. That article describes exactly what we've been talking about and that's only comparing two different techniques from one foundry.

I was looking into your post on the last page about Xenos actually being capable of "only" 216 GFLOPs. It would be nice for someone around here who's actually worked on the hardware to confirm, but seeing as it came straight from AMD's official presentation, I have no reason to doubt it. From what I gather, the scalar unit is capable of processing only 1 floating point operation per clock (MUL or ADD) whereas the vector units can do 2 with MAD. So 9 floating point ops per ALU * 48 ALUs * 500 MHz = 216 GFLOPs.
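Spelled out, the arithmetic behind that figure looks like this (just a sketch restating the numbers above):

Code:
# Xenos peak FLOPs: 48 ALUs, each pairing a 4-wide vector unit doing
# MADs (2 flops per lane) with a scalar unit doing 1 flop per clock.
alus = 48
flops_per_alu = 4 * 2 + 1                     # vec4 MAD + scalar = 9
clock_hz = 500e6
print(alus * flops_per_alu * clock_hz / 1e9)  # 216.0 GFLOPs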

The other thing I noticed in that difficult to comprehend Japanese translation was that Xenos can work on 64 threads total simultaneously whereas Latte should be able to work on 64 threads per SIMD. I would imagine that this would contribute to fewer stalls and greater efficiency on Latte's part. There is also that release which mentioned Nintendo licensing Green Hills MULTI. As I understand it, their software is pretty advanced and should aid in extracting as much ILP as possible out of game code. As VLIW is very reliant on ILP, I can see this also giving Latte a decent boost in efficiency.
 
I was looking into your post on the last page about Xenos actually being capable of "only" 216 GFLOPs. It would be nice for someone around here who's actually worked on the hardware to confirm, but seeing as it came straight from AMD's official presentation, I have no reason to doubt it. From what I gather, the scalar unit is capable of processing only 1 floating point operation per clock (MUL or ADD) whereas the vector units can do 2 with MAD. So 9 floating point ops per ALU * 48 ALUs * 500 MHz = 216 GFLOPs.

I think you are correct on this part.

The other thing I noticed in that difficult to comprehend Japanese translation was that Xenos can work on 64 threads total simultaneously whereas Latte should be able to work on 64 threads per SIMD. I would imagine that this would contribute to fewer stalls and greater efficiency on Latte's part.

No, Xenos works with threads of 32 vertices or 64 fragments, but a lot more are available to hide memory latency. Up to 2048 vertices and 4096 fragments can be in flight depending on the resources available (it depends on the number of registers needed by the shaders).
 
If you have read the GX documentation you can see how the CPU has direct access to the eFB, and you can even do texture interpolation using the CPU (this means access to the eTM).
 
If you have read the GX documentation you can see how the CPU has direct access to the eFB, and you can even do texture interpolation using the CPU (this means access to the eTM).

As far as I know it doesn't have direct access to the embedded framebuffer; you can access it through a GPU register but you can't address the memory directly. The texture memory is GPU managed and you have to program a strategy yourself.

It surprises me how this thread still continues.

Question: I'm sure a 160SP GPU might keep up with PS360, but we know the CPU and memory bandwidth are bad too. How does that affect the performance as a whole? Would a shitty '09 160SP GPU, shitty memory and shitty CPU be able to keep up just as well? I find that hard to believe.

Also, why is Nintendo forced to pick an existing SIMD core from AMD? I read Grail's and Function's arguments on the subject, but I think you cannot call the GPU's design a change in layout. It is a completely custom GPU, so why couldn't they have requested any changes for the SIMDs? Why would Nintendo even require a PC GPU's SIMD for a low end console?

I think I agree with Entropy here; AMD should be perfectly capable of changing designs (relocating SRAM registers, adding/removing SIMD resources) w/o high risk of failure.
 
Question: I'm sure a 160SP GPU might keep up with PS360, but we know the CPU and memory bandwidth are bad too. How does that affect the performance as a whole? Would a shitty '09 160SP GPU, shitty memory and shitty CPU be able to keep up just as well? I find that hard to believe.

How would 320 shaders help a system that was already crippled by BW (ignoring for a moment Fourth Storm's possible identification of 8 TMUs)?

I actually doubt the Wii U main RAM BW is as much of an issue as people think. These points have been gone over many times in this thread, but basically if your buffers, buffer sampling, and most of your texture reads are taking place in the edram then there should be a drastic reduction in the amount of main memory BW you need to achieve the same results. Even my dual core Athlon 64 with DDR 400 normally noticeably outperformed the 360, so I doubt something with twice as much main memory BW would struggle, given sufficient offloading of work to the edram.
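To put rough numbers on "twice as much" (both configurations here are assumptions: dual-channel DDR400 for the Athlon 64, and the commonly cited DDR3-1600 on a 64-bit bus for Wii U):

Code:
# Main memory bandwidth comparison; both configs are assumptions.
athlon_bw = 2 * 400e6 * 8   # dual-channel DDR400 -> 6.4 GB/s
wiiu_bw = 1600e6 * 8        # DDR3-1600, 64-bit bus -> 12.8 GB/s
print(wiiu_bw / athlon_bw)  # 2.0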

Also, why is Nintendo forced to pick an existing SIMD core from AMD? I read Grail's and Function's arguments on the subject, but I think you cannot call the GPU's design a change in layout. It is a completely custom GPU, so why couldn't they have requested any changes for the SIMDs? Why would Nintendo even require a PC GPU's SIMD for a low end console?

Customisation is slow and expensive and you only do it if there's a benefit. What is the point in Nintendo getting AMD to engineer a completely unique SIMD module with redesigned register banks?

I'm currently trying to find a comment about the three next gen consoles from an AMD engineer that I foolishly forgot to bookmark (the one where he talks about the degree of customisation work they did for the three consoles).

I think I agree with Entropy here; AMD should be perfectly capable of changing designs (relocating SRAM registers, adding/removing SIMD resources) w/o high risk of failure.

It's not the location of the SRAM that's the issue; it's the size and number of register banks.

Entropy's entire argument is founded on "TSMC 40 nm" being a fact (Chipworks comment == fact). We know from Chipworks (via Fourth Storm) that this wasn't a fact. We have many other indicators that this is a Renesas chip. We also know from AMD that even with the same fab and process, tools can make a potentially huge difference to how large a functionally identical unit is. There is no reason that a Renesas layout SIMD unit on a Renesas process (or even a TSMC process) should be the same size as an AMD SIMD unit on TSMC. And make no mistake: while this is AMD graphics IP, this is a Renesas designed chip.

I'm not surprised about this thread still going, but I am a little surprised that it keeps going round in circles. "Custom" and "TSMC 40 nm" seem to be providing infinite fuel for dismissing everything that we know about who's involved in the chip, what we can see on the chip (thanks to Chipworks and Fourth Storm), the console as a whole, and even the games.
 
I think 160SP at low speed doesn't use much power either (low speed meaning about 500MHz; the Radeon 6670 and the GPU in the A10-5700 run at 800MHz, for instance).

Well, 500MHz, that's pushing it. Let's take a single-slot, low-profile, passive Radeon 5450: it runs at 650MHz with 80SP. And it's more powerful at running games than one might think. A DDR3 variant of that card can probably run a fair amount of console ports at lowish res.

Question: I'm sure a 160SP GPU might keep up with PS360, but we know the CPU and memory bandwidth are bad too. How does that affect the performance as a whole? Would a shitty '09 160SP GPU, shitty memory and shitty CPU be able to keep up just as well? I find that hard to believe.

Well, PS360 themselves are really shitty, about 8600GT level, and their CPUs were garbage. It still worked, because:
- they're still a fair bit powerful
- optimization, lack of API/driver call overhead
- game content specifically developed for the feature sets and performance abilities, which is at least as important
 
How would 320 shaders help a system that was already crippled by BW (ignoring for a moment Fourth Storm's possible identification of 8 TMUs)?
Wouldn't that make performance even worse? With a single TMU you need two instructions to blend two textures instead of a single instruction.

I actually doubt the Wii U main RAM BW is as much of an issue as people think. These points have been gone over many times in this thread, but basically if your buffers, buffer sampling, and most of your texture reads are taking place in the edram then there should be a drastic reduction in the amount of main memory BW you need to achieve the same results. Even my dual core Athlon 64 with DDR 400 normally noticeably outperformed the 360, so I doubt something with twice as much main memory BW would struggle, given sufficient offloading of work to the edram.
Agreed, though I wouldn't assume that all current ports are optimized to use the eDRAM as required. The development increases cost and requires more sales... on a Nintendo system...

Customisation is slow and expensive and you only do it if there's a benefit. What is the point in Nintendo getting AMD to engineer a completely unique SIMD module with redesigned register banks?
The entire GPU seems to be custom (or is each section dissected by now?). What's the point in that? They could have gone with an off-the-shelf GPU and a Hollywood addon, which may be cheapest of all. Nintendo isn't known for doing that; I never saw PC hardware that had a 24 bit framebuffer or indirect texturing such as Flipper had. May be my own shortcoming but still...

I'm currently trying to find a comment about the three next gen consoles from an AMD engineer that I foolishly forgot to bookmark (the one where he talks about the degree of customisation work they did for the three consoles).
Did you find something good?

It's not the location of the SRAM that's the issue; it's the size and number of register banks.
Yes, I misquoted a bit. But Wuu SRAM banks seem to be bigger. Perhaps they didn't need concurrent access or so. Point is, does it really increase costs that much compared to designing a completely custom GPU?

Entropy's entire argument is founded on "TSMC 40 nm" being a fact (Chipworks comment == fact). We know from Chipworks (via Fourth Storm) that this wasn't a fact. We have many other indicators that this is a Renesas chip. We also know from AMD that even with the same fab and process, tools can make a potentially huge difference to how large a functionally identical unit is. There is no reason that a Renesas layout SIMD unit on a Renesas process (or even a TSMC process) should be the same size as an AMD SIMD unit on TSMC. And make no mistake: while this is AMD graphics IP, this is a Renesas designed chip.
I hear you. I didn't see many solid facts btw. For example, I'm really interested in how Marcan measured the clock speeds. Running it in Wii mode?

I'm not surprised about this thread still going, but I am a little surprised that it keeps going round in circles. "Custom" and "TSMC 40 nm" seem to be providing infinite fuel for dismissing everything that we know about who's involved in the chip, what we can see on the chip (thanks to Chipworks and Fourth Storm), the console as a whole, and even the games.
Indeed, some of the subjects passed by 5 times or so...

Well, only Nintendo can pull off a fart like this. So 160SP? I wouldn't be surprised at all if it was true. Proving it is a bit harder though; hope someone leaks a bench result soon.
 
No, Xenos works with threads of 32 vertices or 64 fragments, but a lot more are available to hide memory latency. Up to 2048 vertices and 4096 fragments can be in flight depending on the resources available (it depends on the number of registers needed by the shaders).

Thanks for the clarification. In looking into this topic, I did find something interesting in an article describing Cayman's architecture. I don't know if R700 is the same, but it noted that each SIMD in Cayman could have up to 8 work-groups in flight at once - each from a different kernel, and that each workgroup could be composed of 1 or 4 wavefronts. If that is true also of R700, then a 2 SIMD Latte could possibly have the same amount of fragments/threads in flight as Xenos at any given time (4096). Or am I looking at this the wrong way?
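The arithmetic behind that guess, for what it's worth; it assumes the Cayman figures carry over to R700 and that Latte really has 2 SIMDs:

Code:
# Threads in flight on a hypothetical 2-SIMD Latte, if the Cayman
# limits (8 work-groups per SIMD, up to 4 wavefronts each) apply to R700.
simds = 2
workgroups_per_simd = 8
wavefronts_per_workgroup = 4   # upper bound; could be as low as 1
threads_per_wavefront = 64
print(simds * workgroups_per_simd * wavefronts_per_workgroup
      * threads_per_wavefront)  # 4096, matching Xenos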

I'm currently trying to find a comment about the three next gen consoles from an AMD engineer that I foolishly forgot to bookmark (the one where he talks about the degree of customisation work they did for the three consoles).

I believe you're looking for this.

The entire GPU seems to be custom (or is each section dissected by now?). What's the point in that? They could have gone with an off-the-shelf GPU and a Hollywood addon, which may be cheapest of all. Nintendo isn't known for doing that; I never saw PC hardware that had a 24 bit framebuffer or indirect texturing such as Flipper had. May be my own shortcoming but still...

Customization can mean a lot of things. It's true that Latte is a highly custom chip, but not necessarily in the areas that many have speculated on. Compared to a standard Radeon, Renesas have not only adapted the AMD designs to their own fabrication process, they've also completely changed the memory subsystem, integrated the southbridge/ARM926/DSP, added a 60x bus to the CPU, etc. It also seems to me like they beefed up the constant cache and global data share. Then, of course, they sprinkled in some extra transistors for translating Hollywood instructions into Radeon code.

I don't think they've really messed with the shader cores and TMUs too much. The floor plan looks different than RV770, Llano, etc, and I think that's part of the reason why so many people assume that it's a weird design, but that needn't be so. Marcan described Latte as a pretty conventional Radeon with Hollywood BC being accomplished via a shim layer. That AMD rep in the interview I linked above also describes their role in Wii U as licensing the IP. If Nintendo wanted something beyond R700, they could have had it. There would be no need to futz around with register allocations, ALU functions, etc. I think the design they have meets their needs, and they figured simply having a multicore processor and unified shader model would be enough to satisfy 3rd parties looking to port their games.
 
I was thinking about the CPU interface the other night. I haven't seen too much speculation on it, but I think I've come up with a pretty decent guess at bandwidth. If we look at Gekko and Broadway, they had FSB bandwidths of 1.3 GB/s and 1.9 GB/s respectively. If we scale that to Espresso's clock speed of 1.24 GHz, we come to around 3.3 GB/s. But there are 3 cores, so multiplying by 3, we come to 9.9 GB/s.

Gekko and Broadway had an FSB to core clock ratio of 3:1, but the architecture supports ratios as low as 2:1 according to the user's manual I've read online (the link seems to be down now actually, but the PPC750CXe one is still up). If they're running the FSB at ~620MHz and utilizing a 128-bit bus from CPU to GPU (it is an MCM after all, and Marcan has noted that the 60x bus is more substantial than the one used in Wii), then we again get to about 9.9 GB/s.

I still find it odd that Nintendo didn't go with even multipliers this time between CPU and GPU (meanwhile Sony/MS have). Those PPC750 cores must really not have wanted to go any higher.
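For anyone checking the math, both routes land on roughly the same figure (all speculation built on Broadway's known 60x bus numbers):

Code:
# Route 1: scale Broadway's FSB bandwidth (243 MHz x 64-bit ~= 1.9 GB/s)
# to Espresso's ~1.24 GHz clock, then multiply by 3 cores.
broadway_fsb_bw = 243e6 * 8                  # ~1.94e9 bytes/s
scaled = broadway_fsb_bw * (1.24e9 / 729e6)  # ~3.3e9 bytes/s
print(3 * scaled / 1e9)                      # ~9.9 GB/s

# Route 2: ASSUMED 620 MHz FSB (2:1 ratio) on an assumed 128-bit bus.
print(620e6 * 128 / 8 / 1e9)                 # 9.92 GB/s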
 
The other thing I noticed in that difficult to comprehend Japanese translation was that Xenos can work on 64 threads total simultaneously whereas Latte should be able to work on 64 threads per SIMD. I would imagine that this would contribute to fewer stalls and greater efficiency on Latte's part. There is also that release which mentioned Nintendo licensing Green Hills MULTI. As I understand it, their software is pretty advanced and should aid in extracting as much ILP as possible out of game code. As VLIW is very reliant on ILP, I can see this also giving Latte a decent boost in efficiency.
The definition of a thread has changed over the years, which could lead to confusion. What was called a thread in Xenos is now called a wavefront, and wavefronts have up to 64 threads.

No, Xenos works with threads of 32 vertices or 64 fragments, but a lot more are available to hide memory latency. Up to 2048 vertices and 4096 fragments can be in flight depending on the resources available (it depends on the number of registers needed by the shaders).
You're close to correct. Xenos threads were composed of 64 vertices; there were just half as many of them available as pixel threads.
 

Thanks for the clarification. In looking into this topic, I did find something interesting in an article describing Cayman's architecture. I don't know if R700 is the same, but it noted that each SIMD in Cayman could have up to 8 work-groups in flight at once - each from a different kernel, and that each workgroup could be composed of 1 or 4 wavefronts. If that is true also of R700, then a 2 SIMD Latte could possibly have the same amount of fragments/threads in flight as Xenos at any given time (4096). Or am I looking at this the wrong way?



I believe you're looking for this.



Customization can mean a lot of things. It's true that Latte is a highly custom chip, but not necessarily in the areas that many have speculated on. Compared to a standard Radeon, Renesas have not only adapted the AMD designs to their own fabrication process, they've also completely changed the memory subsystem, integrated the southbridge/ARM926/DSP, added a 60x bus to the CPU, etc. It also seems to me like they beefed up the constant cache and global data share. Then, of course, they sprinkled in some extra transistors for translating Hollywood instructions into Radeon code.

I don't think they've really messed with the shader cores and TMUs too much. The floor plan looks different than RV770, Llano, etc, and I think that's part of the reason why so many people assume that it's a weird design, but that needn't be so. Marcan described Latte as a pretty conventional Radeon with Hollywood BC being accomplished via a shim layer. That AMD rep in the interview I linked above also describes their role in Wii U as licensing the IP. If Nintendo wanted something beyond R700, they could have had it. There would be no need to futz around with register allocations, ALU functions, etc. I think the design they have meets their needs, and they figured simply having a multicore processor and unified shader model would be enough to satisfy 3rd parties looking to port their games.


1.) Please do not suggest that Renesas has produced the GPU die. I know this is your opinion, but make it clear that this is just YOUR opinion.

The fact is, we only know that Renesas made the MCM (source: Iwata Asks).
So stay rational. And if you have some evidence, then show it to us.

Which semiconductor fabrication company made the Wii U GPU die (GlobalFoundries, TSMC, Renesas, ...) is at the moment unknown.


2.)
It is obvious that you are trying to interpret everything in a way that fits your theory. But this is a dangerous method, because we are searching for Nintendo's concepts, not yours. Even a probabilistic statement needs a solid basis, and too many people here have made probabilistic statements without any reasonable starting point --> so pure speculation = for what --> for the fun???

3.) If you state that "we" (who is "we"???) have understood most of the GPU floorplan by simply comparing different GPU dies, then that is exactly the kind of probabilistic statement made without any evidence that your starting point is reasonable. You imply that this is so simple. Only speculation and no evidence here.

Therefore, if you have no evidence, please do not sell it as such.
 
Customization can mean a lot of things. It's true that Latte is a highly custom chip, but not necessarily in the areas that many have speculated on. Compared to a standard Radeon, Renesas have not only adapted the AMD designs to their own fabrication process, they've also completely changed the memory subsystem, integrated the southbridge/ARM926/DSP, added a 60x bus to the CPU, etc. It also seems to me like they beefed up the constant cache and global data share. Then, of course, they sprinkled in some extra transistors for translating Hollywood instructions into Radeon code.

I don't think they've really messed with the shader cores and TMUs too much. The floor plan looks different than RV770, Llano, etc, and I think that's part of the reason why so many people assume that it's a weird design, but that needn't be so. Marcan described Latte as a pretty conventional Radeon with Hollywood BC being accomplished via a shim layer. That AMD rep in the interview I linked above also describes their role in Wii U as licensing the IP. If Nintendo wanted something beyond R700, they could have had it. There would be no need to futz around with register allocations, ALU functions, etc. I think the design they have meets their needs, and they figured simply having a multicore processor and unified shader model would be enough to satisfy 3rd parties looking to port their games.
Well, there is a lot of 'may', 'not necessarily', 'I think' in your explanation... Don't get me wrong btw, I really digested your (and Thraktor's) insights as if they were fries; very good reading and much effort from your side. I'm gonna read the link you posted, I don't have much time lately.

@EpyonXYZ, you may be right that statements on this board are taken as truth by some people, but please understand that Fourth promoted a 320 shader design as long as he could... But then Darth Function came along and seduced him to the dark side :)
 
You're close to correct. Xenos threads were composed of 64 vertices; there were just half as many of them available as pixel threads.

Yes, you're perfectly correct, I was confused by the number of threads in flight. So to correct my previous post: Xenos threads are composed of 64 elements (equally fragments or vertices) and you can have up to 32 vertex threads (32 x 64 = 2048 vertices, like I said previously) and up to 64 fragment threads (64 x 64 = 4096 fragments) in flight.
 
1.) Please do not suggest that Renesas has produced the GPU die. I know this is your opinion, but make it clear that this is just YOUR opinion.

The fact is, we only know that Renesas made the MCM (source: Iwata Asks).
So stay rational. And if you have some evidence, then show it to us.

Which semiconductor fabrication company made the Wii U GPU die (GlobalFoundries, TSMC, Renesas, ...) is at the moment unknown.


2.)
It is obvious that you are trying to interpret everything in a way that fits your theory. But this is a dangerous method, because we are searching for Nintendo's concepts, not yours. Even a probabilistic statement needs a solid basis, and too many people here have made probabilistic statements without any reasonable starting point --> so pure speculation = for what --> for the fun???

3.) If you state that "we" (who is "we"???) have understood most of the GPU floorplan by simply comparing different GPU dies, then that is exactly the kind of probabilistic statement made without any evidence that your starting point is reasonable. You imply that this is so simple. Only speculation and no evidence here.

Therefore, if you have no evidence, please do not sell it as such.

https://chipworks.secure.force.com/catalog/ProductDetails?sku=NIN-C10234F5&viewState=DetailView&cartID=&g=&parentCategory=&navigationStr=CatalogSearchInc&searchText=nintendo%20wiiu

Chipworks said:
Chipworks report below analyzes the eDRAM in the GPU component fabricated by Renesas.

Hope that settles it.

As for the rest of my post, perhaps there is some speculation in there, but it's based on an analysis of the hardware blocks. They look a lot like what we've seen in other chips, so it's a reasonable conclusion to draw that they are similar in design. Could they have messed around with the individual ALUs, while keeping the same register layout of RV770? I guess so, but there's absolutely no evidence that indicates that.

I never said "we" either. I said "many." If you want to include yourself in that group, that's fine. It's not meant to be an insult. If anything, I'm the one who's probably devoted too much time to staring at this thing. I actually do think it's fairly simple to recognize the TMU and L1 components that are common in Llano, Brazos, and Latte. Once one moves past the hangup that the TMUs must be in physical contact with the shader blocks (they're not in RV770 btw - they border the LDS blocks there), everything falls into place quite nicely.
 
Considering a power budget of ~35W, how much can we assume would be taken by the GPU? The CPU is speculated to have a very low power draw due to its small size/low clocks, and then you need to power the fan, the disc drive, motherboard, memory etc, not to mention sending a constant video stream to the tablet.

I assume that if you know how much power the GPU takes, you should be able to say what the possible GPU configurations are.
 
https://chipworks.secure.force.com/catalog/ProductDetails?sku=NIN-C10234F5&viewState=DetailView&cartID=&g=&parentCategory=&navigationStr=CatalogSearchInc&searchText=nintendo%20wiiu



Chipworks said:
Chipworks report below analyzes the eDRAM in the GPU component fabricated by Renesas.

Hope that settles it.

As for the rest of my post, perhaps there is some speculation in there, but it's based on an analysis of the hardware blocks. They look a lot like what we've seen in other chips, so it's a reasonable conclusion to draw that they are similar in design. Could they have messed around with the individual ALUs, while keeping the same register layout of RV770? I guess so, but there's absolutely no evidence that indicates that.

I never said "we" either. I said "many." If you want to include yourself in that group, that's fine. It's not meant to be an insult. If anything, I'm the one who's probably devoted too much time to staring at this thing. I actually do think it's fairly simple to recognize the TMU and L1 components that are common in Llano, Brazos, and Latte. Once one moves past the hangup that the TMUs must be in physical contact with the shader blocks (they're not in RV770 btw - they border the LDS blocks there), everything falls into place quite nicely.


1.) "fabricated by Renesas" does not necessarily mean lithographically produced, because the definition of fabrication also means invention. So there is also the possibility that they mean the edram is just the conceptual invention of Renesas (a IP).

1.1.) If it is lithographically produced by a fab of Renesas, then why should the person from Chipworks (Jim M ???) speculate that TSMC had lithographically produced it?
This is fishy!

Question:
Is Renesas even able to lithographically produce such a complex chip like a console GPU on a 40nm process? Do they have any experience with such a big chip? Would it make sense to let Renesas lithographically produce such a chip?
 
1.) "fabricated by Renesas" does not necessarily mean lithographically produced, because the definition of fabrication also means invention. So there is also the possibility that they mean the edram is just the conceptual invention of Renesas (a IP).

"fabricated by Renesas" does not even have to mean an invention of Renesas! It could just mean that it was fabricated next to (by) Renesas! </sarcasm>

What's next, are we going to have a discussion about what the definition of "is" is?

1.1.) If it is lithographically produced by a fab of Renesas, then why should the person from Chipworks (Jim M ???) speculate that TSMC had lithographically produced it?
This is fishy!

Because speculation isn't fact and in this case his speculation was wrong.

Question: Is Renesas even able to lithographically produce such a complex chip like a console GPU on a 40nm process? Do they have any experience with such a big chip? Would it make sense to let Renesas lithographically produce such a chip?

Yes. Yes. Yes.
 