PS4 & XBone emulated?

Discussion in 'Architecture and Products' started by alexsok, Dec 10, 2013.

  1. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    18,762
    Likes Received:
    2,639
    Location:
    Maastricht, The Netherlands
    But the CPU and GPU components can work together very efficiently. This is a new paradigm to some extent, but similar to what happened with the more advanced titles last gen. The total system efficiency is what counts here, and the true bottlenecks have traditionally been manipulating large data streams.

    It's a bit blunt, but couldn't you say that as soon as your calculations, whatever they are, are bottlenecked on the CPU, they are likely to benefit from massive parallelisation? I'm sure you'll find exceptions, but maybe not as many?

    And especially in these times, where the CPU has been an unreliable, inefficiently used quantity on PC versus the GPU (whether through gpu driver overhead, comparatively high-latency two-way communication, separate memory pools, etc.), it seems to me that the resources are more likely to be used in the GPU context than the CPU context in the first place.

    I'd love to hear about a real-life bottleneck.
     
  2. Psycho

    Regular

    Joined:
    Jun 7, 2008
    Messages:
    746
    Likes Received:
    41
    Location:
    Copenhagen
    The point then was that their existing production-optimized X360 code easily ran faster on Jaguar per thread, so straight ports of components could be made without additional threading.
    However, it could have something to do with this: http://beyond3d.com/showpost.php?p=1601066&postcount=35 i.e. that the Xenon essentially behaves like a 1.6GHz in-order 6-core - which should also be quite a bit slower than the "6-core" XB1.

    Can't find the numbers now, but IIRC the 2GHz Kabini runs pretty much like a 2GHz Conroe quad.
     
  3. homerdog

    homerdog donator of the year
    Legend Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    6,294
    Likes Received:
    1,075
    Location:
    still camping with a mauler
    Some people here are making the mistake of thinking the X360 CPU was fast. It was trash - absolute utter garbage. The new console CPUs will run rings around it, just as much as typical desktop CPUs will run rings around them.
     
  4. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Bearing in mind that any PC games that may be ported to the new consoles would likely be current-gen console games at heart, I can't see them giving the new console CPUs much bother. The code may need to be re-written to take better advantage of multiple threads, but if it was possible to run it on Xenon/Cell it would certainly be possible to run it on the Jaguars. Enhancements of PC versions over current-gen console versions usually focus on the GPU side rather than the CPU.
     
  5. HMBR

    Regular

    Joined:
    Mar 24, 2009
    Messages:
    418
    Likes Received:
    106
    Well, the consoles run at 1.6 and 1.75GHz, and it should be close to a Conroe at a similar clock I guess, but that's hardly good enough for a PC CPU anymore, especially because Conroe ran at a much higher clock most of the time.

    Just for fun I compared my crippled Conroe (E2140) at the same clock as Kabini on single-threaded Cinebench 11.5 (perhaps not the best tool, but it's what I have) and...
    1.5GHz Kabini = 0.39 points
    1.5GHz E2140 = 0.40 points

    Keep in mind the E2140 was downclocked a little and running with less than half the memory speed.

    But yes, Jaguar single-core performance is way too low for a gaming PC... but it's being used on a closed platform, with 8 of these cores, so it's a different story.
     
  6. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    The general trend in game development is to profile integer, memory-intensive workloads and complex serial code on the CPU side and shift all/most of the parallel SIMD/FP parts to the GPU, incl. pure compute tasks. Address space unification will further emphasize this model.
     



  7. I'm not an expert, but your statements are completely at odds with every single testimony from game developers I've seen so far.
    Everyone seems to agree that the OoO functionality and much higher IPC of the Jaguar core can more than compensate for the difference in clock speeds.

    Just as an example, each Jaguar core has been estimated at almost 100M transistors. The entire Xenon (3 dual-threaded PowerPC cores) has 163M. The eight Jaguars alone take almost as many transistors as 5 entire Xenons. Even without taking into account that ~9 years of technological progress must have improved performance-per-transistor across all architectures, the fact that a single Jaguar core is much larger than the PowerPC cores in Xenon is a good indicator that IPC won't be anywhere near the same.

    It seems that all you're doing is making wild assumptions based on clock speeds and theoretical limits alone, which isn't realistic by any means.
     
    #47 Deleted member 13524, Dec 16, 2013
    Last edited by a moderator: Dec 16, 2013
  8. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    9,237
    Likes Received:
    4,260
    Location:
    Guess...
    Wouldn't that leave the SIMD units underutilized though? Especially on PCs, which are quite heavy in that regard these days? Or would you tend to keep more of the SIMD code on the CPU for PC development due to the latency of going over PCI-E?
     
  9. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    The old claim that each core is like two 1.6GHz cores is wrong. If you look at the Cell PPE datasheet (the XBox 360's CPU should be similar) you'll see that it has a few different thread-interleaving modes. IIRC they only affect the front end.

    But that doesn't change the fact that the PPE's typical average IPC was very poor. That's due to a lot of factors beyond just lacking OoOE: pretty narrow/limited multi-issue capabilities, two-cycle ALU latency, poor branch prediction with a high mispredict penalty, a big bubble even on predicted-taken branches, high L1/L2 cache and memory latencies, lack of hardware prefetching, a large cache line size, and no store-to-load forwarding plus a big penalty on load-hit-store.
     
  10. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    I see we've moved from the "better than last-gen" phase to the "can compensate" phase. That's not what we should be talking about when we mention next-gen; to truly have a tangible next-gen gaming experience, both the CPU and GPU fronts should be more advanced than before. Instead, what we have here is an underpowered CPU that is barely any better than last-gen, coupled with a powerful GPU, which means we will have better graphics but not necessarily better gameplay or experience; innovation will be quite limited.

    And I am not echoing my own voice here, just the words of other developers who expressed the same to Eurogamer's secret developer program, and I quote:


    In fact I recommend everyone to read this excellent article to get a better grasp on how things will develop in this console cycle.

    http://www.eurogamer.net/articles/d...ware-balance-actually-means-for-game-creators
     
  11. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    When the author says that the cores are "slower on paper", that refers to nothing more than the lower clock speed. In some sense that really would make them slower, even if they complete the same or more work in the same time frame.

    The limited amount of code that actually delivers decent throughput on the XBox 360's or PS3's PPE cores is generally well suited for GPGPU execution. Even more so for the PS3's SPE cores. For more general code a 1.6GHz Jaguar will easily match if not exceed the 3.2GHz PPE; the perf/MHz really is that bad. Couple that with the fact that there are over twice as many cores on XB1/PS4 than there were on XBox 360 and you get something far beyond just delivering similar CPU performance.

    Consider the Wii U. It has only 3 CPU cores without SMT. These cores have inferior perf/MHz on scalar code (and much worse on vector code, especially vector integer) and a lower clock speed than the PS4/XB1 Jaguars. They also have little leftover ALU resources to utilize for GPGPU. And yet it's usually able to achieve something close to parity with PS3 and XBox 360 games. PS4/XB1 should therefore be able to greatly exceed them in CPU-driven capability.
     
  12. Psycho

    Regular

    Joined:
    Jun 7, 2008
    Messages:
    746
    Likes Received:
    41
    Location:
    Copenhagen
    Nice to get some insight, Exophase; the thread was starting to sorely lack it :)
     
  13. No one moved any phase.
    My words were that each Jaguar core can more than compensate for the lower clock speed versus each PPE core. From all I know, 1 Jaguar core can do more than 1 PPE core.
    And then the other difference is that for games there are 6 Jaguar cores versus 2 PPE cores.



    Yes, you are echoing your own voice.
    To my knowledge, no developer ever claimed that the CPU is barely any better than last gen, because it's simply not true. That article says no such thing.

    It's like you're stuck in the Pentium 4 era, when most people thought more MHz = more performance.
     
  14. alexsok

    Regular

    Joined:
    Jul 12, 2002
    Messages:
    807
    Likes Received:
    2
    Location:
    Toronto, Canada
    I wonder what would have happened if the CPU was as beefed up as the GPU on next-gen. Would the consoles then cost $1000?
     
  15. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,213
    I think it is more than that; the only things the author listed as advantageous for the new consoles are OoOE and other minor architectural enhancements.
    Well, I guess actual games will be the judge of that. Right now it does seem like the majority of the next-gen flair in cross-platform games stems from the better GPU (i.e. the rendering side); we will see down the road whether the simulation side is catered for.

    I also noticed that most points in this discussion focused only on the X360 CPU, ignoring the PS3's Cell completely, which doesn't bode well for the new consoles, unless there are other bits I am not aware of.

    I really think the Wii U is a bad example; games there look and run pathetically. While some of them come close to PS3/X360, they usually lack AA/AF, scale back on textures, shadows and other features, run at a horrendous 20ish fps, etc.

    Many developers actually gave up making any games on the Wii U at all, due to its bad performance. If it was so close to the PS3/X360, that wouldn't have happened.

    For argument's sake, let's assume they are not really that bad compared to last-gen. Even so, when you have a GPU that is 10 times better than last-gen while the CPU is only 2 times better, then relative to the rest of the system the CPU is barely any faster than last-gen.

     
  16. pMax

    Regular

    Joined:
    May 14, 2013
    Messages:
    327
    Likes Received:
    22
    Location:
    out of the games
    Argh... you've beaten me in answering time... :oops:

    Ah, I'd take a Jaguar over the PPE any time. As would likely any PS3 developer, I guess...

    The PPE was thought to be no more than the scheduler for the SPEs, in the end.
    ...and some game companies didn't even use it for that, as it was too slow even for that :p

    Just a question - are CPUs today 10x faster, IPC clock for clock, than CPUs of 2006? As long as CPUs can keep up doing their job and offload to GPGPU, there's no need for anything more powerful.
     
  17. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    He's dumbing it down for the audience. Nowhere did he actually estimate what the difference in perf/MHz is like; he just said it's different. OoOE is not at all a dominating factor compared to all those other things I listed. You can find in-order CPUs that have much better perf/MHz than the PPE (Intel's Saltwell; ARM's Cortex-A53 will be an even better example), and they're even still dual-issue. You can't ignore all the serious performance problems these CPUs have when not dealing with very data-regular code.

    Yes, it takes time to develop these things. I remember people saying similar things when this last gen started.

    Because a) the XBox 360 usually did as well as or better than the PS3 - there aren't really showcase examples where PS3 games had this huge advantage in logic, largely because the SPEs were quite domain-specific/limited in the kind of work they could do - and b) like I said, the stuff the SPEs were good at is mostly stuff GPGPU can do now. In fact, much of what the SPEs were being used for wasn't just GPGPU-friendly code but actual graphics, a lot of it work that was even being done by the GPU on the XBox 360.

    I really think the Wii U is a bad example, games there look and run pathetically, while some of them come close to PS3/X360 they usually lack AA/AF , scale back on textures, shadows and other features, run with horrendous 20ish fps .. etc.

    Look and run pathetically? I don't know what you're reading; most games that DF analyzed look about the same (some with better textures or more post-processing) and run somewhere between the PS3 and XBox 360 versions. A few have particularly notable issues, but a couple of others run consistently better.

    GPUs have always gotten faster at a more dramatic rate than CPUs. They also hit diminishing returns with those improvements faster; at this point you need a lot more GPU power to really translate into a much nicer-looking game. And they're not really uniformly 10x faster in every way anyway.

    No, of course not, especially if you're starting with Conroe (which was a 2006 release) and not Prescott. And it doesn't change much if you look at peak single threaded performance and not just IPC. The consoles really did have exceptionally bad perf/MHz, but even then I doubt you'd hit 10x that.

    People groan about the console CPUs not being fast enough, but they couldn't really have done that much better. Intel was never an option. Maybe if they pushed a Piledriver core to the limit they could have gotten a little over 2x the single-threaded performance, with somewhat of a hit in threaded scaling. They couldn't have made some monster APU with 4 PD modules and kept 12-18 GCN CUs; that would have been > 500mm^2 for sure and probably a huge hit to yield. So they'd probably have needed to go with something with 2 modules, but at the clock speeds needed to get similar multithreaded performance it would have used a ton more power, the limits of which they're likely also pushing right now. This is also a crappy situation if you still want to give up an entire core or even two for the OS.

    Face reality, there's a reason both companies went with what they did.
     
    #57 Exophase, Dec 17, 2013
    Last edited by a moderator: Dec 17, 2013
  18. HMBR

    Regular

    Joined:
    Mar 24, 2009
    Messages:
    418
    Likes Received:
    106
    in terms of overall ST performance an i5 Haswell ($200) is probably something like 4x the PS4 CPU performance for current PC software right?
     
  19. TheChosenOne

    Newcomer

    Joined:
    May 9, 2006
    Messages:
    71
    Likes Received:
    9
    Location:
    Los Angeles
    Depends...

    Here's a post from sebbi for reference.

    http://forum.beyond3d.com/showthread.php?p=1759251#post1759251

    Clock for clock you could say a haswell core is just over 2x faster than a jaguar core, theoretically speaking. So a quad core haswell at double the frequency would be about 2.5x faster than the 8 core jaguars at 1.6GHz...?

    In terms of emulation, the raw performance is there when considering each part separately, but like others have said, HSA features and tight CPU/GPU integration could be tough to emulate.
     
  20. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Clearly the GPU part was the priority in both designs, as was a single-chip solution. That left too shallow an option list for the CPU choice, where Jaguar was virtually the only suitable candidate -- small, tightly integrated, modern/popular ISA support, and an OoO pipeline (valuable mostly for the boost to load/store ops).
     