XBox One Backwards Compatibility and Xbox One X Enhancements for X360 and OG (XO XOX BC)

Discussion in 'Console Technology' started by Lalaland, Jun 15, 2015.

  1. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    In that case, I'm going to guess it really is down to RAM caching with aggressive prefetch. While I don't think they could quite fit the entire game, they can get most of it, and could be spending most idle HDD time prefetching. XB1 does have a faster HDD than XB360, but it appears to be only moderately so.

    If an OS or library routine for decompression is being used in conjunction with high level emulation then it's possible. But I doubt this alone would make such a dramatic performance difference.
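
The RAM-caching idea above can be sketched in a few lines. This is purely illustrative, assuming a hypothetical `disk_read` callback and made-up sizes; it is not how Microsoft's emulator actually works:

```python
# Sketch of idle-time prefetching into a RAM cache (illustrative only;
# disk_read and CACHE_BLOCKS are hypothetical, not from any real emulator).

CACHE_BLOCKS = 1024            # pretend RAM budget, in 64 KiB blocks

class PrefetchCache:
    def __init__(self, disk_read):
        self.disk_read = disk_read   # callable: block index -> bytes
        self.cache = {}              # block index -> cached data
        self.next_guess = 0          # block to prefetch when the drive is idle

    def read(self, block):
        if block not in self.cache:
            self._fill(block)        # cache miss: go to "disk"
        self.next_guess = block + 1  # assume access stays sequential
        return self.cache[block]

    def on_idle(self):
        # While the game isn't asking for data, speculatively read ahead.
        if self.next_guess not in self.cache and len(self.cache) < CACHE_BLOCKS:
            self._fill(self.next_guess)
            self.next_guess += 1

    def _fill(self, block):
        if len(self.cache) >= CACHE_BLOCKS:
            self.cache.pop(next(iter(self.cache)))  # evict oldest insertion
        self.cache[block] = self.disk_read(block)
```

After a read of block n, each idle tick pulls in n+1, n+2, and so on, so a later sequential read hits RAM instead of the HDD.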
     
  2. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    It's straightforward if you don't care about performance. Optimizing the code to perform as well as possible on the target platform is a very deep problem.

    There are some problems with this comparison:

    1) You're looking at native code performance. You'll be lucky to average 50% of that with translated code, and that's assuming some good generous properties of the code and emulator. The binary translator simply doesn't have access to the same level of structure and information that the compiler would have and it pays for it, especially when the original code was for an arch with twice as many registers. It also pays for having to manage some branch targets that aren't known statically.
    2) That performance comparison was probably done using GCC, vs what production code would have used, which was IBM's compiler which got more mature in time.
    3) The common 2x1.6GHz claim is very misleading. First of all, you get full access to the CPU if only one thread is running. Second, it's like SMT: there's some contention, but it's nowhere near halving performance, which would defeat the point. Nonetheless, per-thread performance is nowhere close to what it is when only one thread is running, so there's definitely some reprieve for code that heavily multithreads the cores. BUT this assumes the code doesn't rely on a high degree of synchronization for performance or correctness, one that the emulator would probably not be able to provide while running the threads on separate cores (and if it has to constantly switch threads on the same core to get the same effect, performance will tank). That's kind of the thing here: with so few games supported right now, we don't know what kind of potential compatibility it has.

    But the other side of this is that games don't have to be using 100% CPU time on XB360 and many probably remain GPU limited or even frame time limited (especially the XBLA games) despite the CPU being so weak.

    Heavily optimized Altivec code will indeed be hard to deal with especially because it has so many registers. You're going to get inner loops that routinely blow the 16 XMM register budget. The emulator will probably have to heavily access registers in RAM to make up for this.
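
The register-budget problem can be made concrete with a toy allocator: 32 guest VMX registers competing for 16 host XMM slots, with the losers spilled to a context block in RAM and every use of a spilled register paying an extra memory access. Everything here (names, the greedy heuristic) is a made-up illustration, not how any real translator allocates:

```python
# Toy register allocator for a binary translator: 32 guest vector registers
# must fit in 16 host registers, so the least-used guest registers live in a
# context block in RAM and every use costs an extra load/store.

from collections import Counter

HOST_REGS = 16

def allocate(use_counts):
    """Map the most-used guest regs to host regs; the rest spill to memory."""
    ranked = [g for g, _ in Counter(use_counts).most_common()]
    in_reg = set(ranked[:HOST_REGS])
    mapping = {}
    for g in use_counts:
        mapping[g] = ("host", ranked.index(g)) if g in in_reg else ("mem", g)
    return mapping

def spill_cost(use_counts, mapping):
    """Extra memory operations paid for registers that didn't get a host slot."""
    return sum(n for g, n in use_counts.items() if mapping[g][0] == "mem")
```

With 24 live guest vector registers in an inner loop, 8 of them end up in memory no matter how clever the allocator is, which is the "heavily access registers in RAM" cost described above.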

    But there's been tons of presentation material on optimizing XB360 and PS3 CPU code that go far beyond just using well scheduled Altivec, so it's pretty safe to say that a lot of major games were heavily optimized throughout and would do a lot better than the vanilla C++ comparison you gave.
     
    BRiT and FarticusMaximus like this.
  3. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,569
    Likes Received:
    4,761
    Location:
    Well within 3d
    I did not mean to imply that translation or cached translation would lead to native performance, just that it would be needed to avoid the order of magnitude hit that prior discussions about emulating brought up. The set of games that might tolerate weaker performance that is still within a generous range of "good enough" should be broader.
     
  4. rapso

    Newcomer

    Joined:
    May 6, 2008
    Messages:
    215
    Likes Received:
    27
    Microsoft wrote the compiler that generated the X360 opcodes; they are probably the best placed to generate something AST-like from those opcodes and produce an optimized binary for a new target platform. That's what drivers do with shader bytecode (which nowadays is optimized for a totally different "imaginary" processor). That's also what .NET does with its bytecode, and what NVidia's "Denver" does. That's why I'd call it 'state of the art' in the high-end software compiler business.
    I agree that some information is missing, but at the same time, that information wouldn't be of much help, because the data is already laid out with that information in mind, e.g. alignment, padding of struct members, endianness, etc.

    If you're referring to the PPU vs x86 benchmarks, your assumptions are not correct.
    e.g. http://web.archive.org/web/20100531...kpatrol.ca/2006/11/playstation-3-performance/
    (that's not the comparison I recall, just a random minute of Google)

    You now have the exact same situation, Microsoft VC++ vs Microsoft VC++; I'd doubt the X360 compiler would be more advanced than the optimizers they use for the transcoders now.

    There are several things to consider:
    a) it's an in-order CPU. On an OoO core, both threads can try to fill the CPU's units while some memory fetches are stalling, whereas on an in-order design a fetch will completely stall that one pipeline. Running both pipelines gives better occupancy on the instruction side, but per SMT thread it causes more contention and friction, e.g. on the L1D side, which is what's actually critical for the stalls.
    b) if a game does not utilize 6 threads, it's likely there was no need to do so, thus I'd strongly assume it's not a critical code path. If it was critical, then it's likely spread across more cores for more throughput overall, but less throughput per core. That favors the real cores of the XBOne.

    I have no hard numbers to back my assumption, though.

    I agree. And yes, I'm wildly guessing here. I don't claim it's the way I say; it's just what my experience suggests is most likely. MS is a great [edit: not OS, I meant:] COMPILER company, and software-emulating opcodes at runtime, or even 1:1 translation, doesn't sound to me like it would perform as well as a transcoded binary.

    makes me curious to see some more recent games that made the XB360 sweat :)
    Is there something?

    Which again makes it more likely they went the complex way of parsing the opcodes into an AST and running the full VC backend for x86. It's not just register renaming and instruction translation; quite a bit of code would be done in a different way (e.g. some load, modify, store on the XB360 could end up as one instruction like "inc memory").
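
The load-modify-store example can be sketched as a peephole pass over toy instruction tuples (the opcode names are invented, not real PowerPC or x86):

```python
# Peephole sketch of the load-modify-store example: a translator that sees a
# guest sequence "load r, [addr]; addi r, 1; store r, [addr]" can emit a
# single host "inc [addr]". Toy instruction tuples, not real opcodes.

def peephole(ops):
    out, i = [], 0
    while i < len(ops):
        if (i + 2 < len(ops)
                and ops[i][0] == "load" and ops[i + 1][0] == "addi"
                and ops[i + 2][0] == "store"
                and ops[i][1] == ops[i + 1][1] == ops[i + 2][1]  # same register
                and ops[i][2] == ops[i + 2][2]                   # same address
                and ops[i + 1][2] == 1):                         # +1 immediate
            out.append(("inc_mem", ops[i][2]))  # collapse to one host op
            i += 3
        else:
            out.append(ops[i])
            i += 1
    return out
```

A real backend would do this on its internal IR with far more patterns, but the principle is the same: three guest operations become one host instruction.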

    I agree. That's what I wanted to point out as well. There will be some 10% of code, e.g. physics, AI, etc., that is really critical, heavily optimized, and used in time-critical cases. So in those moments (e.g. combat) not only is there way more pressure, but the code is also harder to translate. This might explain why in non-critical cases (e.g. maybe cutscenes) the emulated version might run way better, while (as some claim) in action moments the FPS drops to 10fps. (Again, my wild guess :) )

    Makes me wonder whether MS might have a farm of programmers profiling critical code bits and rewriting them for x86 (at least C/C++ code with SSE intrinsics), and whether we'll get patches further improving game performance by some 2x, 3x, 4x in those low-fps situations.
     
  5. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,788
    Likes Received:
    17,097
    Location:
    The North
    Please release OG Xbox back compat now :). Would love to play some Soul Calibur 2 again.
     
  6. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    This isn't anything like VM bytecodes which are designed with JIT in mind, and it's nothing like Denver which has many architectural decisions that lend towards being useful for translating ARM (lots of registers, transactional memory, very efficient asserts, hardware branch mapping). x86 is not designed at all to be an efficient target for PowerPC translation.

    The missing information I'm referring to is not low level stuff that you're talking about but variable and control flow graphs which allow the compiler to do better register allocation and other optimizations than they can working with unannotated machine code.

    I've written multiple reasonably high performance emulators employing binary translation so I do have some experience with this.

    So 2006 Visual Studio, which I guess you're assuming is the state of the art in XBox 360 compilation. I doubt that. I've heard multiple sources claim that IBM's compilers were the best, which would make sense.

    At any rate, where does this comparison say VC++ was used? For binaries running on Linux no less?

    Actually, despite being in-order the L1 and L2 caches in Xenon/Cell PPE are non-blocking and support hit-under-miss so multiple threads increases MLP to the caches.

    So you're saying if it doesn't peg six threads then it probably won't peg one thread? There's tons of software that's poorly threaded but has heavy single-threaded requirements. Games have gradually been becoming less so, but that was a hard lesson for developers over the lifetime of the XBox 360. Parallelizing code isn't always easy (and sometimes doesn't give enough benefit even when you do).

    Visual Studio's code quality is actually not that incredible these days (there's a reason people use ICC), but even if their binary translation was the best in the world there are still hard practical limits.

    Clearly they've done a very very good job just based on what we're seeing so far. I'm not saying their emulator sucks, just that I'm not so confident that it'll be able to handle whatever is thrown at it. We don't really know yet how hard games will push it.

    I'm sure there are games that push the system a lot harder than Mass Effect did, let alone Kameo, Perfect Dark Zero, or the variety of XBLA games currently available.

    No recompiling emulators out there use a C compiler as a back end for generating code (people talk about using LLVM for this now and again, but I'm not aware of anyone actually doing so). For one thing, code is generated dynamically (you can't generate all of it statically), which places a limit on how long translation can take, and these compilers aren't optimized for speed; they also perform many optimizations that are generally not applicable to code that was already compiled. And they're really not that well suited to optimizing what looks like machine code. You can get the useful peephole-style optimizations you're talking about without throwing an entire compiler backend at it.
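
The dynamic-generation constraint described above shows up in the dispatch loop most recompiling emulators share: translate a block the first time its guest PC is hit, then reuse the cached result. A minimal sketch, with `translate_block` standing in for the real (and far more complex) PowerPC-to-x86 step:

```python
# Dispatch loop of a toy dynamic recompiler: blocks are translated on first
# execution and cached by guest PC, so translation cost is paid only on cold
# code. translate_block is a stand-in for real code generation.

class Dynarec:
    def __init__(self, translate_block):
        self.translate_block = translate_block  # guest PC -> callable(state)
        self.cache = {}                         # guest PC -> translated block
        self.translations = 0

    def run(self, state, steps):
        for _ in range(steps):
            pc = state["pc"]
            block = self.cache.get(pc)
            if block is None:                   # cold block: translate once
                block = self.translate_block(pc)
                self.cache[pc] = block
                self.translations += 1
            block(state)                        # hot path: just call the cache
```

This is why translator speed still matters (a stall on every cold block is visible as hitching) but doesn't have to match a batch compiler's optimization effort.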

    Yeah, but there could be other speed traps that are less obvious. I don't know how XBox 360 games are written but if they're allowed to map hardware I/O into the address space and then proceed to hammer it this will be very slow because of how it'll have to trap to the hypervisor (I'm assuming that's how they're doing it). On the other hand, if this is all abstracted by OS syscalls to begin with it's not necessarily a problem.

    That very well could be the plan for some games if they haven't already done it. They already admit that they're packaging a unique binary of the emulator with every game so they can tweak it on a per-game basis, that could easily include custom HLE for specific games.

    Where it matters, it's probably going to be way more work than just a few functions, though. The 90/10 rule is very often overstated. At some point it has to make more sense to lean on some kind of library that uses parts of the emulator to help port the game, and to get the company to let them do a real port.
     
    BRiT likes this.
  7. turkey

    Veteran Newcomer

    Joined:
    Oct 21, 2014
    Messages:
    1,108
    Likes Received:
    883
    Location:
    London
    How much is reused, or dropped from memory and then reused quickly?
    They have the space to just leave data in memory so the next call is quicker: last used, first out once the allocated space is full, which would be a massive cache.

    More like hybrid SSD caching:
    basic, but probably quite effective.
     
  8. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    I don't know; in CPU caches the answer is "a lot", which is why LRU tends to be viewed as the best replacement policy. Although even there, there are enough exceptions that it can be worth switching to other replacement strategies adaptively: http://blog.stuffedcow.net/2013/01/ivb-cache-replacement/ But not necessarily MRU like you describe.

    Disk access patterns in games may be different. If an entire area or level is loaded there's a good chance you won't be loading it again for a while or ever. But if pieces of a level are streamed in that may not be the case. Disk accesses are so slow that it can make sense to invest some CPU time in coming up with a decent adaptive algorithm here too.
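
For reference, the LRU policy mentioned above is only a few lines. A minimal sketch of an LRU block cache with a placeholder backing read (capacity and names are made up):

```python
# Minimal LRU cache for disk blocks: the least-recently-used block is the
# eviction victim. Capacity and the backing read function are placeholders.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity, backing_read):
        self.capacity = capacity
        self.backing_read = backing_read
        self.data = OrderedDict()   # key order == recency order (oldest first)

    def read(self, key):
        if key in self.data:
            self.data.move_to_end(key)   # hit: mark most recently used
            return self.data[key]
        value = self.backing_read(key)   # miss: fetch from "disk"
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
        return value
```

An adaptive scheme like the one in the linked article would keep counters for competing policies and switch when LRU starts losing, which is exactly the "invest some CPU time" trade-off described above.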
     
  9. matthias

    Newcomer

    Joined:
    May 19, 2010
    Messages:
    38
    Likes Received:
    34
    Location:
    Germany
    I am not really sure, but I think Mass Effect and Nuts & Bolts have at least 2xAA.
     
  10. Cyan

    Cyan orange
    Legend Veteran

    Joined:
    Apr 24, 2007
    Messages:
    9,606
    Likes Received:
    3,304
    #170 Cyan, Jul 6, 2015
    Last edited: Jul 6, 2015
  11. FarticusMaximus

    Newcomer

    Joined:
    Jul 22, 2014
    Messages:
    229
    Likes Received:
    118
    Burnout Paradise was 60fps (but sadly not a Burnout game in all the ways that count, IMO).

    Burnout 3 (the ultimate, complete, unlikely-to-be-repeated king of Burnouts) may have been 60fps on the original Xbox, but I don't think it runs at 60fps on Xbox 360. Doesn't feel like it anyway.

    Burnout 3 and Flatout:Ultimate Carnage are the main reasons I'm still holding on to my 360.

    MS really need to get on with that original xbox emulator. I have my Kung-fu Chaos disc ready and waiting! (my favourite local multiplayer game ever)
     
    Cyan likes this.
  12. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,186
    Likes Received:
    4,575
    Is Flatout any good? It seems so, if you're keeping your 360 for it.
    Do you know if I would be better getting FlatOut 3: Chaos And Destruction instead?
    Soz for the o.t.

    edit: read some reviews; stay away from FlatOut 3: Chaos And Destruction...
     
    #172 Davros, Jul 7, 2015
    Last edited: Jul 7, 2015
  13. FarticusMaximus

    Newcomer

    Joined:
    Jul 22, 2014
    Messages:
    229
    Likes Received:
    118
    Yeah, I hear FlatOut: CnD was a stinker.

    The minigames in Ultimate Carnage were what had me hooked. In a couch multiplayer situation it was a riot. Pure ridiculous awesomeness.

    Stone skipping (think: on a beach, with flat pebbles skipping on the surf) with a driver ejecting from a jet car? Oh yes. Yes indeed.
     
    Cyan likes this.
  14. Cyan

    Cyan orange
    Legend Veteran

    Joined:
    Apr 24, 2007
    Messages:
    9,606
    Likes Received:
    3,304
  15. AzBat

    AzBat Agent of the Bat
    Legend Veteran

    Joined:
    Apr 1, 2002
    Messages:
    7,669
    Likes Received:
    4,675
    Location:
    Alma, AR
  16. xexuxjy

    Regular Newcomer

    Joined:
    Jul 23, 2005
    Messages:
    287
    Likes Received:
    182
    Location:
    London
    Shameless begging, but can anyone on the preview program throw me an invitation please?
     
  17. NRP

    NRP
    Veteran

    Joined:
    Aug 26, 2004
    Messages:
    2,712
    Likes Received:
    293
    Is Halo Reach working yet?
     
  18. Cyan

    Cyan orange
    Legend Veteran

    Joined:
    Apr 24, 2007
    Messages:
    9,606
    Likes Received:
    3,304
    My gamertag is Ruin of Palmira. Add me, and I shall send you an invitation.

    It doesn't, for the moment.

    Racing games pose a licensing challenge for XB1 backwards compatibility; Phil Spencer has explained why:

    http://mmgn.com/xboxone/news--xbox-boss-talks-backwards-compatibility
     
  19. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    19,465
    Likes Received:
    22,437
    If I read that correctly, the trouble with racing games is that they no longer have the rights to the songs, or a partnership with the publisher, so it may take longer to get permission.

    Depending on how the songs were packed initially, they might not be able to replace them. It would be nice to allow the users to select their own songs to be used, but that would take an external app to do the manipulation.
     
  20. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,569
    Likes Received:
    4,761
    Location:
    Well within 3d
    I would look at the logos as being the big one, since there are about a billion of them per mm2 in some cases. I haven't played racing games enough to know if it's the attempt to simulate real venues, or if there is some other reason why songs would affect that genre more.
     