AI agents wouldn't need to be aware of other AIs' decision-making (in fact, that would go beyond AI into precognition!). Each agent only needs to evaluate its own situation. That is serial processing. And I'm sure clever algorithms can build multi-dimensional datasets that capture the current state of play for the key objects, so each agent can be evaluated individually and quickly, with an overarching representation assembled afterwards.
But even if not, the Wii U's raw maths throughput is weak, and maths matters in AI: calculating distances, intercepts, collisions, mathematical weightings, and so on. Cell may not be the best at churning through finite state machines, but the Wii U has its own fair share of AI shortcomings, so it shouldn't be assumed that Espresso can handle every AI workload that the PS3 and 360 can.
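To put the "serial, per-agent" point in concrete terms, here's a minimal sketch (every struct and function name is invented for illustration, not taken from any actual engine): each agent reads the same read-only snapshot of the world and does its own distance/intercept arithmetic, which is exactly the kind of floating-point work being described.

```cpp
// Illustrative sketch only -- every struct and function name here is invented.
// Each agent reads the same read-only world snapshot and decides for itself;
// no agent needs to know what another agent is *about to* do.
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

struct WorldSnapshot {        // the "state of play" captured once per frame
    Vec2 playerPos;
    Vec2 playerVel;
};

struct Agent {
    Vec2  pos;
    float speed;
    bool  chasing;
};

// Typical per-agent maths: a distance check plus a crude intercept estimate.
void evaluateAgent(Agent& a, const WorldSnapshot& w) {
    float dx   = w.playerPos.x - a.pos.x;
    float dy   = w.playerPos.y - a.pos.y;
    float dist = std::sqrt(dx * dx + dy * dy);

    // Lead the target: aim where the player will be in roughly dist/speed seconds.
    float t = (a.speed > 0.0f) ? dist / a.speed : 0.0f;
    Vec2 aim{ w.playerPos.x + w.playerVel.x * t,
              w.playerPos.y + w.playerVel.y * t };
    (void)aim;                    // a real agent would feed this into its steering

    a.chasing = (dist < 50.0f);   // simple weighted decision standing in for
}                                 // a finite-state-machine transition

void updateAI(std::vector<Agent>& agents, const WorldSnapshot& w) {
    for (Agent& a : agents)       // plain serial loop, one agent at a time
        evaluateAgent(a, w);
}
```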
You assume that the Wii U's CPU can't do it, and you probably ignore the advantages it has...
Espresso is an out-of-order execution design; by comparison, Xenon and Cell are in-order.
"The key concept of OoOE processing is to allow the processor to avoid a class of stalls that occur when the data needed to perform an operation are unavailable. In the outline above, the OoOE processor avoids the stall that occurs in step (2) of the in-order processor when the instruction is not completely ready to be processed due to missing data.
OoOE processors fill these "slots" in time with other instructions that are ready, then re-order the results at the end to make it appear that the instructions were processed as normal. The way the instructions are ordered in the original computer code is known as program order, in the processor they are handled in data order, the order in which the data, operands, become available in the processor's registers. Fairly complex circuitry is needed to convert from one ordering to the other and maintain a logical ordering of the output; the processor itself runs the instructions in seemingly random order.
The benefit of OoOE processing grows as the instruction pipeline deepens and the speed difference between main memory(or cache memory) and the processor widens. On modern machines, the processor runs many times faster than the memory, so during the time an in-order processor spends waiting for data to arrive, it could have processed a large number of instructions."
http://en.wikipedia.org/wiki/Out-of-order_execution#Basic_concept
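To make the quoted point concrete, here's a toy example (names invented): the table lookup depends on a load that may miss in cache, so an in-order core like Xenon or the Cell PPE stalls on that dependency, while an out-of-order core like Espresso can keep the independent arithmetic moving in the meantime.

```cpp
// Toy example, names invented. 'lookupSum' depends on a load that can miss in
// cache; an in-order core stalls on that dependency, while an out-of-order core
// keeps executing the independent 'runningTotal' arithmetic in the meantime.
int process(const int* table, const int* indices, int n) {
    int runningTotal = 0;
    int lookupSum    = 0;
    for (int i = 0; i < n; ++i) {
        int idx = indices[i];        // load; may miss in cache
        lookupSum += table[idx];     // depends on the load above -> potential stall
        runningTotal += i * 3;       // independent work that OoOE can slot into
    }                                // the otherwise-wasted cycles
    return runningTotal + lookupSum;
}
```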
Espresso has 3MB of L2 cache compared to Xenon's 1MB and Cell's 512KB
All three CPUs have 64KB of L1 cache per core (32KB instruction / 32KB data)
Each Espresso core has its own L2 cache
Core 0: 512KB - Core 1: 2MB - Core 2: 512KB
Each L2 cache is 4-way set-associative, compared to 2-way on GameCube/Wii
Each L2 cache is 2-sectored
The L1 caches are 8-way set-associative
6 execution units per core, 18 execution units in total
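A rough back-of-envelope illustration of why those per-core L2 sizes matter (the sizes and the loop are invented for the example): a table of about 1.5MB fits entirely inside core 1's 2MB L2 on Espresso, so repeated passes over it stay cache-resident, whereas the same table is several times bigger than a ~341KB share of Xenon's shared 1MB L2 and would keep spilling back to main memory.

```cpp
// Back-of-envelope sketch with invented sizes: a ~1.5MB table fits entirely in
// Espresso core 1's 2MB L2, so repeated passes over it stay cache-resident;
// the same table is several times larger than a ~341KB share of Xenon's shared
// 1MB L2, so each pass would keep going back to main memory.
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kTableBytes = 1536 * 1024;   // ~1.5MB working set

std::vector<std::uint32_t> makeTable() {
    return std::vector<std::uint32_t>(kTableBytes / sizeof(std::uint32_t), 1u);
}

std::uint64_t sumTableRepeatedly(const std::vector<std::uint32_t>& table, int passes) {
    std::uint64_t total = 0;
    for (int p = 0; p < passes; ++p)
        for (std::uint32_t v : table)   // after the first pass this is L2-resident
            total += v;                 // on a 2MB cache, but not on a ~341KB share
    return total;
}
```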
Sectored Cache
"As Raf suggests, the SPG manuals volumes 1 and 2 are great resources for gleaning details such as these, and at the risk of repeating some of the useful information he passed along, here's an overview of memorycaching (as opposed to some of the other caching done in the processor). Cache types fall in a spectrum from Direct Mapped (where every line in memory has its own cache line) to Fully Associative (where every cache line can hold locally the contents of every memory line). N-Way Set-Associative caches lie somewhere in the middle: each cache line can hold the contents of some set of memory lines, which are evenly dispersed and interleaved through memory, and as noted in the text Adrien quoted, reduces the number of cache lines the processor must examine in order to determine whether there was a hit.. Cache line size may vary per architecture as can the number of ways in the caches (and whether the caches are Write Back or Write Through and other esoteric features). Pentium 4 and Intel Xeon processors have a sectored L2 (and L3 if present) cache, which for all practical purposes means that if adjacent sector prefetch is enabled, a request for one cache line of an associated pair of cache lines (Intel implementations all use 128-byte/ 2 cache-line sectors) will also generate a prefetch for the other cache line in the pair on the speculation that it will be needed eventually anyway. This is one of at least four kinds of hardware prefetch supported in current processors. There are a few specialized cases where application of software prefetch (in the form of an actual instruction in the stream) can hide some memory latency by starting the fetch early, but generally it is better to let the machine figure out when to prefetch, since optimal conditions vary from architecture to architecture."
https://software.intel.com/en-us/forums/topic/302355
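For reference, the "software prefetch (in the form of an actual instruction in the stream)" the quote mentions looks roughly like this on GCC/Clang via the __builtin_prefetch builtin. This is purely illustrative; as the quote says, the hardware prefetchers usually do a better job on their own for simple access patterns like this one.

```cpp
// GCC/Clang expose software prefetch as the __builtin_prefetch builtin.
void scaleArray(float* data, int n, float k) {
    for (int i = 0; i < n; ++i) {
        if (i + 16 < n)
            __builtin_prefetch(&data[i + 16]);  // hint: start fetching a line early
        data[i] *= k;
    }
}
```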
Xenon has a dynamic L2 cache shared between its cores
Evenly split, that works out to 341.3KB per core or 170.6KB per thread
The L2 cache is 8-way set-associative
The L1 instruction cache is 2-way set-associative
The L1 data cache is 4-way set-associative
5(?) execution units per core, 15(?) in total
Cell likely uses the same dynamic L2 cache arrangement as Xenon
Evenly split, that works out to 170.6KB per core or 85.3KB per thread
The L2 cache is ?-way set-associative
The L1 instruction cache is ?-way set-associative
The L1 data cache is ?-way set-associative
...continuing...
Espresso has a 4-6 stage pipeline, compared to the 32-40 stage pipelines of Xenon and Cell.
"It's hard to reduce power consumption in a deeply pipelined processor. Xbox 360 CPU had the longest pipeline in history, approaching 40 stages. Microsoft had to cut this down to 13~15 pipes to reduce power consumption in order for the SOC to fit into a Roku like box, meaning it is basically a new CPU sharing instruction set, not a die shrunk version of old CPU"
http://www.psu.com/forums/showthrea...i-and-Durango(MisterXmedia-Being-Vindicated-)
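Rough back-of-envelope on why the pipeline depth matters, using the stage counts claimed above (so treat the exact figures loosely): a branch misprediction costs roughly a pipeline refill, i.e. around 4-6 cycles on Espresso versus around 30-40 on Xenon/Cell. Branchy game code (AI state machines, scripting) that mispredicts, say, once every 50 instructions would lose about one cycle per ten instructions on the short pipeline, but around seven cycles per ten instructions on the long one, which eats a sizeable chunk of the clock-speed advantage.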
"I believe if you program only against one main CPU (like we do for pretty much most emus), you would find that the PS3/Xenon CPUs in practice are only about 20% faster than the Wii CPU.
I've ported the same code over to enough platforms by now to state this with confidence - the PS3 and 360 at 3.2GHz are only (at best - I would stress) 20% faster than the 729Mhz out-of-order Wii CPU without multithreading (and multithreading isn't a be-all end-all solution and isn't a 'one size fits all' magic wand either). That's pretty pathetic considering the vast differences in clock speed, the increase in L2/L1 cache and other things considered - even for in-order CPUs, they shouldn't be this abysmally slow and should be totally leaving the Wii in the dust by at least 50/70% difference - but they don't."
http://gbatemp.net/threads/retroarch-a-new-multi-system-emulator.333126/page-7#post-4365165
http://www.avsforum.com/forum/141-xbox-area/758390-xbox-360-vs-ps3-processor-comparison.html
http://forums.macrumors.com/showpost.php?p=1633076&postcount=3
The Wii U's CPU is an SMP (symmetric multiprocessing) design, which has its own minor advantages over other multiprocessing arrangements.
https://software.intel.com/en-us/bl...rence-between-multi-core-and-multi-processing
http://en.wikipedia.org/wiki/Symmetric_multiprocessing
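A minimal sketch of what SMP buys you in practice (standard C++ threads, the job names are invented): because the cores are symmetric and share one memory space, any job can be handed to any core and the scheduler balances them, instead of the code being written around one general-purpose core plus specialized satellite cores.

```cpp
// Minimal SMP sketch using standard C++ threads; the job names are invented.
// Symmetric cores sharing one memory space means any job can run on any core.
#include <thread>
#include <vector>

void aiWork()      { /* ... per-frame AI update ... */ }
void audioWork()   { /* ... audio mixing ... */ }
void physicsWork() { /* ... physics step ... */ }

void runFrameJobs() {
    std::vector<std::thread> workers;
    workers.emplace_back(aiWork);
    workers.emplace_back(audioWork);
    workers.emplace_back(physicsWork);
    for (std::thread& t : workers)
        t.join();                      // wait for every job before the frame ends
}
```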
It can handle it, and do it better; that is painfully obvious, despite you trying to convince everyone that it "apparently can't do it"... It's like saying a Core 2 Duo can't beat a Pentium D, which is two Pentium 4s duct-taped together.
You can play the SIMD and SPE card as a last resort if you want... that kind of work can be done on the Wii U's GPU.
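For what it's worth, the "SIMD/SPE card" mostly comes down to data-parallel maths of this shape (a plain axpy-style loop, names invented): the same multiply-add applied independently across a big array, which is the kind of work that can be vectorized or handed to a GPU compute job, whichever chip it ends up running on.

```cpp
// A plain axpy-style loop (names invented): the same multiply-add applied
// independently to every element. This is the shape of work that SIMD units,
// SPEs, or a GPU compute job all chew through well.
void scaleAndAccumulate(const float* x, float* y, int n, float a) {
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];   // each element is independent: easy to vectorize
}                           // or to hand off to GPU compute
```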
"Next you would think that the PS3 (just like the 360) would be able to segment the game control plus AI code into one core and the graphics rendering code into another core. However that is not possible! Since the total application code may be about 100 MB and the SPE only has 256KB of memory, only about 1/400 of the total code can fit in one SPE memory. Also since there isn't any branch prediction capabilities in an SPE, branching should be done as little as possible (although I believe that the complier can insert code to cause pre-fetches so there may not be a big issue with branching).
Therefore the developer has to find code that is less than 256KB (including needed data space) that will execute in parallel.
Even if code can be found that can be segmented, data between the PPE and the SPE has to be passed back and forth via DMA, which is very slow compared to passing a pointer to the data like on the 360.
If we assume that enough segmented code was found that could use all the 6 SPE cores assigned to the game application, now the developer would try to balance the power among the cores. Like the 360, some or all the cores may have a very low utilization. Adding more hardware threads is not possible since each core has only one hardware thread. Adding software threads probably will not work due to the memory constraint. So the only option is an overlay scheme where the PPE will transfer new code using DMA to the SPE when the last overlay finishes processing. This is very time consuming and code has to be found that does not overlap in the same time frame."
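To illustrate the streaming/overlay pattern the quote describes, here's a hypothetical double-buffering sketch. dmaStart/dmaWait/process and the chunk size are invented stand-ins, NOT the real Cell SDK API; on real hardware the DMA would be asynchronous, which is what lets the transfer of the next chunk overlap with work on the current one.

```cpp
#include <cstddef>
#include <cstring>

// Stand-ins, NOT the real Cell SDK: on hardware these would start and fence an
// asynchronous DMA; here they just copy synchronously so the sketch compiles.
void dmaStart(void* dst, const void* src, std::size_t bytes) { std::memcpy(dst, src, bytes); }
void dmaWait() {}                                      // wait for the pending DMA
void process(const float*, std::size_t) { /* ... SIMD work on the chunk ... */ }

constexpr std::size_t kChunkBytes  = 16 * 1024;        // small enough for a 256KB local store
constexpr std::size_t kChunkFloats = kChunkBytes / sizeof(float);

// Assumes totalFloats is a multiple of kChunkFloats, purely to keep the sketch short.
void streamProcess(const float* mainMemSrc, std::size_t totalFloats) {
    static float bufA[kChunkFloats], bufB[kChunkFloats];
    float* current = bufA;
    float* next    = bufB;

    dmaStart(current, mainMemSrc, kChunkBytes);        // prime the first chunk
    for (std::size_t off = 0; off < totalFloats; off += kChunkFloats) {
        dmaWait();                                     // current chunk has arrived
        if (off + kChunkFloats < totalFloats)          // start pulling the next chunk...
            dmaStart(next, mainMemSrc + off + kChunkFloats, kChunkBytes);
        process(current, kChunkFloats);                // ...while working on this one
        float* tmp = current; current = next; next = tmp;
    }
}
```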
The Wii U's GPU contains the Wii's GPU within it, and thus inherits its 1MB SRAM texture cache and 2.25MB framebuffer, which can likely serve a different role in Wii U mode; otherwise it would be a waste of silicon for Nintendo.