Wii U hardware discussion and investigation *rename

We do know DDR3 can handle a lot of accesses in flight though..
Do we? What's your source for that? AFAIK DDR3 handles one outstanding read or write request at a time, in the order each command is received.
Dude, you're just annoying the shit out of me so I'm not gonna bother to reply to all of your drivel other than to say "typically" does not mean "cannot do more than". And DDR3 can still only SERVICE one request at a time. End of fucking story.

You believe that anything that can only service one request at a time can not have multiple requests in flight? That's just completely incorrect. The whole point of newer high speed buses with higher latencies is to push more data through a single set of wires (one request at a time) while keeping a pipeline of requests queued up. That's the only way to keep those wires filled with data as much as possible. Just because it processes the requests in-order and sequentially does not mean there aren't multiple commands outstanding at a time.

Perhaps that is too generic a response for you. Let's look at DDR3 specifically.

Reading data from a DDR3 DRAM is a two step process. First a command is sent with a row address and bank. This tells the DRAM to open up a particular page in that bank. Then you have to wait a number of cycles while the DRAM does the page open (tRCD in the DDR3 spec). Once the page is open, a command is sent with a column address and bank. This tells the DRAM to read the data at the provided column address on the previously opened page (row) and return it. Then you wait the CAS latency (CL in the DDR3 spec) before the DRAM puts the read data on the data bus and sends it back.

In DDR3-1600 the row to column delay (page open delay) is 11 cycles and the CAS latency is also 11 cycles. Since any read is a burst of 8, the completion of the read data takes an additional 4 cycles. If DDR3 couldn't have multiple accesses in flight then you would be limited to one 8-transfer burst of data every 26 cycles. The *only* way you can keep the data bus full is by issuing multiple commands to the DRAM while it is still processing the previous commands.
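As a quick sanity check of those numbers, here's a back-of-the-envelope sketch (just an illustration assuming the 11-11 timings and burst-of-8 quoted above):

```python
# Rough DDR3-1600 read timing if requests were fully serialized (no overlap).
tRCD = 11               # ACTIVATE (page open) to READ delay, in clock cycles
CL = 11                 # READ (CAS) command to first data, in clock cycles
burst_cycles = 8 // 2   # a burst of 8 transfers occupies 4 clock cycles (double data rate)

cycles_per_read = tRCD + CL + burst_cycles
print(cycles_per_read)                   # 26 cycles for one isolated read
print(burst_cycles / cycles_per_read)    # ~0.15 -> barely 15% data-bus utilization
```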

If you ignore the page open cost and only consider the CAS (column command) as the actual read then a DDR3 interface has to have 3 reads outstanding to cover that 11 cycles of delay. Keeping the read data bus fully occupied with data requires a constant stream of CAS commands every 4 cycles. Include the page open cost and you need at least 6 reads outstanding (in one form or another) to keep DDR3 going at peak rate.
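The same arithmetic in code form (again just a sketch using the numbers above):

```python
import math

CL = 11            # column (CAS) command to data, in cycles
tRCD = 11          # page open to column command, in cycles
burst_cycles = 4   # one burst of 8 occupies the data bus for 4 cycles

# CAS-only view: reads that must be in flight to hide the CAS latency
print(math.ceil(CL / burst_cycles))             # 3
# Including the page open cost: reads outstanding in one form or another
print(math.ceil((tRCD + CL) / burst_cycles))    # 6
```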

Now let's put those row/page commands back into the mix as well, since they are a required portion of any read. We know we have to keep issuing CAS commands every 4 cycles to keep the data bus filled, but that leaves 3 idle cycles on the command bus. Any good memory controller utilizes those cycles to send out the row commands to open up a page for future CAS commands to read from. When that happens you have the DRAM doing multiple things at the same time. While it is reading data for address A from bank 0, it is also opening row B of bank 1 and maybe even opening row C of bank 2. If you've scheduled things right, then by the time you're done with your reads for A you can immediately issue your reads for B because you've used the A reads to hide the page open cost for B. Then while the reads for B are being issued you also send out a close-page command for A (you're done with it at this point), and throw in a page open for address D of bank 3. Not only are there multiple reads outstanding, there are different aspects of those reads going on simultaneously!
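To make that interleaving concrete, here is a toy schedule generator. It is a sketch only: the bank count, the four bursts per open page, and the placement of the PRECHARGE are arbitrary choices, and real constraints like tRRD, tFAW, tRTP and refresh are ignored.

```python
# Toy DDR3 command-bus schedule for the pattern described above: keep a READ
# (CAS) on the bus every 4 cycles, and tuck ACTIVATE/PRECHARGE commands for
# other banks into the idle command slots in between.
tRCD = 11            # ACTIVATE -> READ delay, in cycles
CAS_GAP = 4          # one burst of 8 = 4 cycles on the data bus
READS_PER_PAGE = 4   # arbitrary: bursts taken from each open row
BANKS = 4

schedule = {}                  # cycle -> command
schedule[0] = "ACT  bank0"     # open bank 0's row at cycle 0
first_cas = {0: tRCD}          # its first READ can issue tRCD cycles later

for b in range(BANKS):
    start = first_cas[b]
    for i in range(READS_PER_PAGE):          # READs for this bank, one per burst
        schedule[start + i * CAS_GAP] = f"READ bank{b}"
    last = start + (READS_PER_PAGE - 1) * CAS_GAP
    if b + 1 < BANKS:
        # next bank's first READ follows straight after this bank's last burst,
        # so its ACTIVATE must land at least tRCD earlier, in a free command slot
        first_cas[b + 1] = last + CAS_GAP
        act = first_cas[b + 1] - tRCD
        while act in schedule:
            act -= 1
        schedule[act] = f"ACT  bank{b + 1}"
    # close this bank's page once its last READ has gone out
    pre = last + 1
    while pre in schedule:
        pre += 1
    schedule[pre] = f"PRE  bank{b}"

for cycle in sorted(schedule):
    print(f"cycle {cycle:3d}: {schedule[cycle]}")
```

Running it shows, for example, the ACTIVATE for bank 1 landing in a spare command slot while bank 0's reads are still streaming, which is exactly the overlap being described.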


When we get to DDR4 it becomes even more complicated. In DDR3 and earlier, the minimum CAS-to-CAS delay (tCCD) for same bank accesses has always been less than or equal to the burst length. You could keep the data bus full of read data by issuing a steady stream of column addresses that are all in the same row on a single bank. In DDR4 that is no longer true. As interface speed ratchets up, tCCD has increased to be bigger than the burst length. If you want to keep the data bus full of read data you now have to keep at least 2 banks open and interleave individual reads between the two banks. Combine that with the higher interface speed and longer latencies (in clock cycles) and you have even more reads outstanding in various forms.
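Rough numbers for that, as a sketch (the tCCD_L value of 6 is just an example and varies by speed bin; strictly speaking the interleaving in DDR4 is across bank groups):

```python
# DDR4 sketch: same-bank(-group) column commands are now spaced tCCD_L apart,
# which is larger than the 4 cycles a burst of 8 occupies on the data bus.
burst_cycles = 4
tCCD_L = 6     # example same-bank-group CAS-to-CAS spacing (speed-bin dependent)
tCCD_S = 4     # CAS-to-CAS spacing when alternating bank groups

# Streaming from a single open row leaves the data bus idle part of the time:
print(f"one bank: {burst_cycles / tCCD_L:.0%} of peak")         # ~67%
# Alternating reads between two open banks restores back-to-back bursts:
print("two banks:", "bus full" if tCCD_S <= burst_cycles else "still gaps")
```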
 
Thanks for the explanation. Would you care to explain us how much slower is DDR3 in comparison to GDDR3 working at the same speed? I mean, latencies on DDR3 must be HUGE and that is something that will hurt WiiU a lot in comparison with PS3 or 360.
 
CAS latency of GDDR3 700MHz can go as low as 10, so it might be 14.3ns against DDR3-1600's (CL=11) 13.75ns.
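For anyone following along, the arithmetic behind those two figures (absolute latency is just cycles divided by the command clock):

```python
# Absolute CAS latency in nanoseconds = CL (cycles) / command clock (Hz)
gddr3_ns = 10 / 700e6 * 1e9    # GDDR3 at 700 MHz, CL 10
ddr3_ns  = 11 / 800e6 * 1e9    # DDR3-1600 (800 MHz command clock), CL 11
print(f"GDDR3 700MHz CL10: {gddr3_ns:.2f} ns")   # ~14.29 ns
print(f"DDR3-1600  CL11:   {ddr3_ns:.2f} ns")    # 13.75 ns
```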
 
Thanks for the explanation. Would you care to explain us how much slower is DDR3 in comparison to GDDR3 working at the same speed? I mean, latencies on DDR3 must be HUGE and that is something that will hurt WiiU a lot in comparison with PS3 or 360.

When (lower end) PC VGAs started using DDR3 instead of just GDDR3, the difference was always considered irrelevant,

so I don't really think it would "hurt a lot". What would certainly hurt is the fact that you have a 64-bit memory bus!?

I just wonder, with DDR3 being so cheap, wouldn't it make sense to go for a 192- or 256-bit bus? Is the PCB/chip cost really so high? Not long ago (2008), reasonably priced (well under $200) PC VGAs like the 9600 had 256-bit 1800MHz GDDR3.
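For scale on the bus-width point, a quick peak-bandwidth sketch (peak numbers only, ignoring real-world efficiency; the 256-bit 1800MHz case matches the 9600-class card mentioned above):

```python
def peak_gb_per_s(bus_bits, transfers_per_s):
    """Peak memory bandwidth in GB/s for a given bus width and transfer rate."""
    return bus_bits / 8 * transfers_per_s / 1e9

print(peak_gb_per_s(64,  1600e6))   # 64-bit DDR3-1600            -> 12.8 GB/s
print(peak_gb_per_s(256, 1600e6))   # same DDR3 on a 256-bit bus  -> 51.2 GB/s
print(peak_gb_per_s(256, 1800e6))   # 256-bit 1800MHz GDDR3       -> 57.6 GB/s
```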
 
You believe that anything that can only service one request at a time can not have multiple requests in flight? That's just completely incorrect. The whole point of newer high speed buses with higher latencies is to push more data through a single set of wires (one request at a time) while keeping a pipeline of requests queued up. That's the only way to keep those wires filled with data as much as possible. Just because it processes the requests in-order and sequentially does not mean there aren't multiple commands outstanding at a time.

Nice write-up. Now, to complement that, there should be a small exposé regarding the circumstances where and why you don't actually achieve maximum bus utilization.
I shouldn't speak for Grall, but he did use the word "typically", which really makes all the difference.
 
I shouldn't speak for Grall, but he did use the word "typically", which really makes all the difference.

He said that DDR3 can't service more than one request at a time, "end of fucking story." That's a pretty strong statement.

The "typically" was where he said that PC benchmarks don't typically achieve greater than 50% peak bandwidth utilization, which is just as wrong as if he had said that they can't. PC benchmarks typically achieve much higher.

A CPU's memory loads will tend to be those that normally go through cache. In order to keep up with the potential bandwidth, the cache hierarchy/memory subsystem has to keep enough line loads in flight at each level up to the LLC. This means a combination of enough software or automatic prefetching issued by the CPU to say it'll need those lines in advance, or enough reordering within the CPU's execution window to encounter the loads in advance. Of course the memory controller itself has to be good at doing all the stuff BobbleHead mentioned.
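As a rough Little's-law sketch of what "enough in flight" means (the bandwidth and latency numbers here are illustrative assumptions, not measurements of any particular system):

```python
import math

# Little's law: outstanding work = bandwidth * latency.
peak_bw = 12.8e9    # bytes/s, e.g. a 64-bit DDR3-1600 interface
latency = 100e-9    # seconds, an illustrative ~100 ns load-to-use latency
line    = 64        # bytes per cache line

# Cache-line fills that must be outstanding to sustain peak bandwidth:
print(math.ceil(peak_bw * latency / line))   # -> 20
```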

On x86 processors it can be easy to achieve good bandwidth utilization via the string instructions, if they're well optimized for this. On other CPUs you may be able to easily achieve good utilization with more explicit engines that set up a big block load or store into cache. Consoles often have DMA engines and there can be ones that are tightly integrated with the CPU such that they can interface with at least some cache level. On consoles you're more likely to actually use features like this.
 
What if?

I am back. Did you miss me??? Yeah, I am sure. I will say what I am going to say until I am banned again, while others saying similar types of things were allowed to keep on (can we say bias???). By the way, it was a nice touch deleting the post of the person who at least entertained the possibility of what I said.

Now I shall continue.
Um um um.

What if the Wii U memory controller was more efficient than a normal controller?
What if the memory was overclocked?
What if the Wii U CPU was a hybrid of Power 7 and Power 8 technologies?
What if the CPU had an L4 cache? Okay, maybe that's too much.
Some updated info on Power technology. https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/Power%20Systems/page/Understanding%20Processor%20Utilization%20on%20POWER%20Systems%20-%20AIX
Some older basic info. http://www.itjungle.com/tfh/tfh082211-story01-fig01.jpg
What if the GPU was not close to any GPU mentioned so far, or maybe was a hybrid of one of these?
http://www.amd.com/us/products/notebook/apu/mainstream/Pages/mainstream.aspx#7
(Note the wattage and that all are DirectX 11 compatible)
What if the GPU was part of an HSA setup integrating ARM technology more?
http://developer.amd.com/resources/...hat-is-heterogeneous-system-architecture-hsa/
Here are some more links you may find interesting.
http://www.zeldainformer.com/news/c...ii-u-is-powerful-its-just-next-generation-pow
http://www.ign.com/blogs/natyconnor...nsense-of-flops-and-flippers-and-clock-speeds

One of the reasons I consider at least some of these things a possibility is timing. Big tech companies look for flagships to introduce new technology. If they can demonstrate efficiency with a well-known product, it can bring additional business their way. Another reason I consider the Wii U a bit more efficient than many think is this: I read an article about a Nintendo developer (I can't recall if he was a game developer or a Wii U system developer, but either way he would know). He was telling how Nintendo considered a 1+1 design, which I understood to mean basically a Wii and a Wii U in the same housing. If they did away with that, I think they did so because the single design must have achieved the balance of power and efficiency they were seeking. Not having to spend as much on backwards compatibility, and more on the power, would have been something they would seek. Since they didn't go that route, I think there must be some muscle in the tank.


Another thought on this is that some of the technologies implemented in the Wii U may not be commonly used yet. It's like someone trying to get an 8-track to play on an iPod and then throwing the iPod to the ground because it won't work. Another thing: what build of anything is close to that of the Wii U, with the GPGPU/MCM setup, the short pipeline, and the built-in DSP, to compare it to??? The Wii U was designed for all things to work together, so I doubt trying to compare this and that individually will do it true justice. One last thing: I believe I read that the streaming to the GamePad may be handled by a separate streaming process that doesn't involve the GPU. Well, that is all I have to say for right now, or until I am banned again.
 
A note: since I don't know how to edit a post, if it is possible I want to correct a few errors above. At the top I meant "than" instead of "that" after "efficient", and I also meant GPGPU/MCM instead of GPU/MCM near the bottom of my statement.
 
I am back. Did you miss me??? Yeah, I am sure. I will say what I am going to say until I am banned again, while others saying similar types of things were allowed to keep on (can we say bias???). By the way, it was a nice touch deleting the post of the person who at least entertained the possibility of what I said.

Not a great way to reintroduce yourself, to be honest.
 
What if the Wii U memory controller was more efficient than a normal controller?
What if it's not.

What if the memory was overclocked?
Why not just use higher clocked memory?

What if the Wii U CPU was a hybrid of Power 7 and Power 8 technologies?
No.

What if the CPU had an L4 cache? Okay, maybe that's too much.
Yeah, too much. Plus, a level three cache is normally a prerequisite for a level four cache.


You've holed the bottom of the barrel there.


Guy back in July thinking the WiiU used Power 7?

<snip stream of consciousness>

You'll be needing this thread:

http://forum.beyond3d.com/showthread.php?t=62800

It's probably good practice to get banned while posting in the correct thread.
 
What if it's not.


Why not just use higher clocked memory?


No.


Yeah, too much. Plus, a level three cache is normally a prerequisite for a level four cache.



You've holed the bottom of the barrel there.



Guy back in July thinking the WiiU used Power 7?



You'll be needing this thread:

http://forum.beyond3d.com/showthread.php?t=62800

It's probably good practice to get banned while posting in the correct thread.

As far as the right thread goes, I think the title should be changed. It seems people are not free to discuss hardware and investigate, or they do little investigating.

What did you think about the GPU links?

The guy in July seemed to be right on about many things.

As far as what type of CPU it is, how can you be so dismissive without examining it? To give an example, once I was attempting to upgrade from a dual-core to a quad-core processor and got bad info from tech support about compatibility. I ended up getting a quad and installing it, and turned my PC on only to find out it wouldn't boot because it was incompatible, even though it was the same type of socket. My point being, seemingly no difference can mean a world of difference. No matter how many courses, years of experience, or credentials, there is no substitute for first-hand experience. In most cases it may be good enough, but not all. I mean, has anyone x-rayed the CPU? Does anyone really know more than the nm, transistors, speed, and that it is an IBM Power processor?
 
People can say what they will about me, but often they lack the evidence to back up what they say while puffing up their chests like big shots. Anyway, does anyone have any comments on the GPU links I posted?
 
As far as the right thread goes, I think the title should be changed. It seems people are not free to discuss hardware and investigate, or they do little investigating.

The guy in July seemed to be right on about many things.
The guy in July said it was Power7. Another link you provided says the CPU isn't Power based.

Anyway, enough of this. I'm not going to have this level of noise on this board. A medical conference wouldn't waste its time entertaining some audience member wanting to talk about the healing powers of waving Mooku Grass over someone's head without any evidence in support, and we're not going to tolerate random hypothesising. We know the specs of Wii U because the hackers opened it up and reported them. If you cannot accept that, that's your prerogative, but if you continue trying to discuss it here you'll be permanently banned. You are lacking the basic understanding and are denying the value of that understanding everyone else has, which means you cannot contribute to intelligent discussion here, and you evidently don't want to learn anything either.
 
The guy in July said it was Power7. Another link you provided says the CPU isn't Power based.

As far as the guy in July goes, well, it was July. He didn't have the benefit of more current knowledge. Whether the CPU was technically a Power 7 or not is meaningless; it is a custom processor that is Power based. Regardless of that small detail, I think most of it was pretty right on.


You give little evidence to support your claims. Things I have said do not necessarily conflict with the hackers' report. You lack the intellect, but not the intelligence, to understand what I have to say. Others understand what I say, and many agree, or agree that many of the things I say are possible. If it makes you feel better to just dismiss what I said, be my guest. The problem I have with many of the things you and others have to say is that you are not speaking from first-hand experience. How many times in life have you looked at something from far away, only to see it differently when you get closer? I do not respect people as titles, I respect people as people. If you do not have evidence to support what you say, I will not support it, no matter what your title, years of experience or schooling. So where is your evidence to support what you say, instead of you just saying this and that? I didn't claim the things I said are definitive fact as you are. So show me conclusive evidence to support your claims, not speculative evidence. Conclusive; there is a difference. If you were in a court of law and you said to the judge, "Well, the defendant is just caca so I should win my case," the judge would say case dismissed. Why not man up instead of being juvenile? It's unbecoming. Again, does anyone have any thoughts on the GPU links I posted?


Anyway, enough of this. I'm not going to have this level of noise on this board. A medical conference wouldn't waste its time entertaining some audience member wanting to talk about the healing powers of waving Mooku Grass over someone's head without any evidence in support, and we're not going to tolerate random hypothesising. We know the specs of Wii U because the hackers opened it up and reported them. If you cannot accept that, that's your prerogative, but if you continue trying to discuss it here you'll be permanently banned. You are lacking the basic understanding and are denying the value of that understanding everyone else has, which means you cannot contribute to intelligent discussion here, and you evidently don't want to learn anything either.

Look up ^
 
Look up ^
Do you mean your comment about the GPU? That Wii U's GPU is a mobile APU with integrated x86 CPU? Or a DX11 HSA including an ARM processor that nobody's bothered to crow about?

How's about we just go with the developer comments that it's a tri-core Wii-like PPC CPU and a DX10.1 R700-series GPU with 1GB of RAM available, as described to Eurogamer and later corroborated independently? Why ignore all that evidence in the belief it could be some special architecture that no-one anywhere has talked about and for which there's no evidence at all?


Edit: This line of discussion was brought to an end by theizzzeee ignoring my response and pushing his completely ridiculous line of reasoning, leading to a permanent noise reduction measure being put in place.
 
Not sure if this is worth entertaining but okay.. (okay he's banned now, never mind :( Maybe I should remove this, I dunno..)

What if the Wii U memory controller was more efficient than a normal controller?

What's a normal controller? There are very efficient controllers out there, and I would expect very efficient of a modern console design. But you could have 100% efficiency and the bandwidth would still be poor.
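To put a rough number on that last point, here's a sketch assuming the 64-bit DDR3-1600 configuration discussed in this thread; the 360/PS3 figures are the commonly quoted peak numbers, shown for scale only:

```python
# Peak main-memory bandwidth at 100% bus efficiency, in GB/s
wiiu_ddr3  = 64  / 8 * 1600e6 / 1e9   # 64-bit DDR3-1600              -> 12.8
x360_gddr3 = 128 / 8 * 1400e6 / 1e9   # 128-bit GDDR3-1400 (Xbox 360) -> 22.4
ps3_xdr    = 25.6                     # commonly quoted PS3 XDR peak
print(wiiu_ddr3, x360_gddr3, ps3_xdr)
```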

What if the memory was overclocked?

There's no way Nintendo would do this. They need a guarantee that the memory is properly specified or else face unknown and very risky, potentially devastating failure rates. If they needed the higher clock speed they'd pay the small additional price for the faster RAM.

What if the Wii U CPU was a hybrid of Power 7 and Power 8 technologies?

What if everything we know about its die size, clock speed, and power consumption contradicts any possibility of being based on a modern high end POWER core at all?

IBM doesn't have magic alien technology. They pay for higher single threaded performance with disproportionately more die area and power consumption.

What if the GPU was not close to any GPU mentioned so far, or maybe was a hybrid of one of these?
http://www.amd.com/us/products/notebook/apu/mainstream/Pages/mainstream.aspx#7
(Note the wattage and that all are DirectX 11 compatible)

We know the GPU doesn't have an integrated CPU. How would that help it exactly? Less die area for graphics..

What if the GPU was part of an HSA setup integrating ARM technology more?

What "ARM technology" could it possibly be integrating, and how would that make the situation any better?


These people are almost as clueless as you are...

The only way anyone is grossly underestimating Wii U is if it's using magic technology that none of us could begin to understand.
 