Importance of assembler knowledge for console devs (and its education) *spawn

Discussion in 'Console Technology' started by Crossbar, Dec 13, 2011.

  1. Crossbar

    Veteran

    Joined:
    Feb 8, 2006
    Messages:
    1,821
    Likes Received:
    12
    I really don't get this. You are talking about assembly programmers, right? In what way are these assembly programmers so special? Ordering the instructions to avoid dependencies must be the objective of any assembly programmer who wants the CPU to fire on all (n-issue) cylinders, regardless of whether the CPU is in-order or OOO. As I see it, you can't really depend on the reordering pipeline if you want to optimise your code in assembly, or if you do, you had better know in detail how it works, how long it is, etc.
    Am I missing something?

    Yeah, I remember this post of yours. That sounds like a really shitty design, and are you telling us that the compilers still don't help in these situations by at least giving you a warning?

    This is certainly true. In-order execution puts more work on the compiler.

    To be fair, there should be a considerable amount of legacy code optimised for the in-order PPE core in the PS3 and 360 by now, considering they've been on the market for 5 years.

    True, but as a programmer I would also rather replace a 4-core CPU with a single-core CPU that is 4 times faster. Whatever CPU design is chosen for the next-gen consoles, it will likely depend heavily on what gives the best IPC per die area within a reasonable power envelope and at a decent frequency. It will be really interesting to see how things turn out this time.
     
  2. Barbarian

    Regular

    Joined:
    Jun 27, 2005
    Messages:
    289
    Likes Received:
    15
    Location:
    California, USA
    I wasn't talking about assembly programming necessarily, but mostly about engineers that understand the low level implications of their code - things like instructions, caches, data structures, concurrency and so on.
    You might not believe it, but sadly it is very hard to find people that have the knowledge and skills for that kind of work.

    The compiler, no. There are performance tools that can tell you, PIX, SN Tuner etc, but again you'll need the skills to read generated code and understand what the problem is.
    And as I mentioned, with the PPU design, often you can't do anything about it - from Load-Hit-Stores, to microcoded instructions, to branch stalls, to slow atomic operations and even slower cache misses.
    Most of these can be taken care of with an OOO design.

    Yes, but that same code will still run faster on an Out-of-Order chip.
     
  3. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Yes, most C++ (or god forbid script) programmers do not have intimate enough hardware knowledge to optimize their code around all the various LHS stall cases. I have been trying to write a simple guide covering the most common LHS cases. It's easy to tell programmers to avoid integer/float/vector casts (which cause an LHS because the value has to be transferred through the memory subsystem), or to avoid updating class member variables in tight loops ("this" is a pointer, so all member variables are accessed through a pointer and not kept in registers -> lots of LHS possibilities)... but it's harder to explain things like function prologue/epilogue LHS stalls to C++ game programmers who have no assembler or CPU architecture knowledge. I personally find caches easier, since good data structures (cache-line-aligned bucketed lists, for example) can be made to prefetch properly automatically when iterating through them. As long as the technology programmers provide all the data structures, the higher-level C++ game programmers do not have to interact so much with the low-level hardware details.
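
    A minimal sketch of the member-variable case, assuming a hypothetical `Accumulator` class that is not from any real codebase. It only shows the source-level pattern; whether a given compiler actually emits a store and reload per iteration depends on the compiler, the flags and the target, so check the generated code or a profiler before relying on it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class, used only to illustrate the pattern.
struct Accumulator {
    int total = 0;

    // Updates the member inside the loop. `total` is reached through `this`,
    // and the compiler may be unable to prove that it does not alias the
    // vector's storage, so it may store and reload `total` on every iteration.
    void addAllViaMember(const std::vector<int>& values) {
        for (std::size_t i = 0; i < values.size(); ++i) {
            total += values[i];
        }
    }

    // Copies the member into a local first. The local's address is never
    // taken, so the compiler is free to keep it in a register for the whole
    // loop and write the member back once at the end.
    void addAllViaLocal(const std::vector<int>& values) {
        int sum = total;
        for (std::size_t i = 0; i < values.size(); ++i) {
            sum += values[i];
        }
        total = sum;
    }
};
```

    Both functions compute the same result; the second one just gives the compiler an easier job.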

    There's only a very limited pool of programmers capable of writing efficient assembly code, and only a (small) subset of these are interested in game programming and/or have any knowledge of console CPUs. For example, here in Finland we only have four (if I count correctly) game development companies that have released PS3 games. All are pretty small compared to big international companies, so each has maybe 5 programmers capable of writing efficient PS3 assembly code. Universities don't teach (mandatory) assembly programming anymore, and most of the assembly programming that is taught at universities is targeted towards OS programming and embedded systems (microcontrollers). Most of the professionals who understand performance-critical assembly programming are self-taught.

    All the high level languages (Java, C#, etc) that they now use to teach programming in schools/universities instead of good old C/ASM are not making things better. The industry needs programmers capable of writing efficient assembly (esp. vector instructions).
     
    #3 sebbbi, Dec 13, 2011
    Last edited by a moderator: Dec 13, 2011
  4. Xenus

    Veteran

    Joined:
    Nov 2, 2004
    Messages:
    1,316
    Likes Received:
    6
    Location:
    Ohio
    They still teach a mandatory assembly class at the university I went to, or at least they did 3 years ago or so when I took it, though it was MIPS. Unfortunately, they don't start teaching multithreading hazards, the benefits of OOE, and caches until the master's-level courses. The only other exposure to multithreading in the mandatory classes was my OS class, and that only covered threading in general, spin locks, state changes, etc.
     
  5. Fafalada

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    2,773
    Likes Received:
    49
    The notion that any optimization problem in games is solved by throwing assembler and mad scientists at it is, frankly speaking, more than a little silly - even if we DID have armies of those types of guys out there.

    Any sufficiently large codebase* will inevitably lead to mostly unoptimized code in production critical paths (usually coupled with largely flat profiling graphs because most of it will be equally slow).


    *grown through "standard" production deadlines.
     
  6. Crossbar

    Veteran

    Joined:
    Feb 8, 2006
    Messages:
    1,821
    Likes Received:
    12
    Java is like Opium, it makes people dumb.
    The final exam of any programming education should include a part where the student has to port a randomly chosen sorting algorithm written in Java to a randomly chosen assembly language and make it work! They should complete it within 8 hours at a computer with an emulator.

    My post was stated as a question concerning what Barbarian meant by "1) Engineers that know how to properly optimize for In-Order cores are very expensive and hard to find." I thought assembly programmers for in-order and OOO cores were equally hard to find, and they probably are. He could probably have written that "programmers with low-level knowledge are hard to find and the PPE cores of the PS3 and 360 benefit greatly from them".

    If your profiling graphs are flat, good for you: you've probably picked the low-hanging fruit and optimised the critical loops, which sounds pretty normal to me. If you can avoid assembly, that is the preferred way, but in some cases it can really make a difference.

    I myself avoid assembly like the plague. You can usually get pretty far with loop unrolling, inline functions and common sense, but I do know how to read the list files and can predict them pretty well, though the compilers do surprise me at times. :wink:
     
    #6 Crossbar, Dec 14, 2011
    Last edited by a moderator: Dec 14, 2011
  7. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    782
    Likes Received:
    22
    Location:
    Wroclaw, Poland
    Why would I want everyone to write some random algo in assembly? Most of us who don't write console games, and even most of the people who do, don't have to be able to write assembly, especially since most compiler optimizations are going to outperform most of the asm we can write by hand. Sure, there are places where the compiler can't guess your intention and you're the only one who can force behavior that you know is safe but the compiler doesn't. What Barbarian most likely meant* was that you should be aware of what's going on underneath the C/C++ code you're looking at. Calling one method isn't the same as calling another if one of them is virtual. There are tons of little things like this that a good programmer should be aware of, and they have little to do with writing asm. Yeah, perhaps that's kinda related to reading and understanding asm, but it's more about knowing how a computer works. And that's not assembly, that's architecture.

    *just a guess though
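
    To make the virtual-call point concrete, here is a rough sketch with hypothetical `Shape`, `Triangle` and `Square` types invented for illustration. The comments describe what compilers typically do; actual code generation varies by compiler and optimization level.

```cpp
#include <vector>

// Hypothetical types, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

struct Triangle final : Shape {
    int sides() const override { return 3; }
};

struct Square final : Shape {
    int sides() const override { return 4; }
};

// The static type is known and the class is final, so the compiler can bind
// the call directly and will usually inline it down to a constant.
int triangleSides(const Triangle& t) {
    return t.sides();
}

// Only the base type is known here, so each call is typically an indirect
// call through the vtable: a pointer load plus a branch that is harder to
// predict and cannot be inlined without further analysis.
int totalSides(const std::vector<const Shape*>& shapes) {
    int total = 0;
    for (const Shape* s : shapes) {
        total += s->sides();
    }
    return total;
}
```

    The two call sites look identical in the source, which is exactly why it helps to know what sits underneath them.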
     
  8. Crossbar

    Veteran

    Joined:
    Feb 8, 2006
    Messages:
    1,821
    Likes Received:
    12
    My point is that if you had to write some assembly just once in your education, you might actually find it easier to read a list file and make wise decisions when trying to write efficient code if you run into performance problems later in life. Perhaps even better, you may attempt to write efficient code right from the start.

    Some students leaving school have their heads so far up in the clouds that they have little to no knowledge of what's really going on under the hood, and I think that is a bad thing per se, but that is just my opinion, feel free to disagree.
     
    #8 Crossbar, Dec 14, 2011
    Last edited by a moderator: Dec 14, 2011
  9. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,519
    Likes Received:
    852
    The problem is that you equate assembly programming to efficient programming for the general case. That is just wrong.

    There are only two reasons for using assembler. 1.) If the scheduler of the compiler doesn't do a proper job or 2.) if the CPU has some special functionality which isn't exposed in your high level language (SIMD add/mul/permute).

    In these cases it can make a big difference to use assembler. Reason 1. should be solved by compilers and micro architecture (OOO), Reason 2. should be solved by languages (proper vector support, OpenCL/CUDA).

    Time is scarce at university, so performance programming should focus on:
    1. Data structures.
    2. Picking the right programming language.
    3. Optimize for general hardware features. - Caches.

    I agree, but not for performance reasons. People not understanding why 32-bit integers have a limited range, or that you can't use floating point values to accurately express decimal fractions, is worse (having worked on financial applications, the latter is a lot worse).
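
    A small sketch of both pitfalls, assuming IEEE 754 doubles (which is what essentially all current desktop and console hardware uses). The function names are made up for this example.

```cpp
#include <cstdint>
#include <limits>

// 0.1, 0.2 and 0.3 have no exact binary representation, so the rounded sum
// of 0.1 and 0.2 is not the same double as the rounded literal 0.3.
bool tenthsAddExactly() {
    double a = 0.1;
    double b = 0.2;
    return a + b == 0.3;
}

// A common alternative for money: store whole cents in an integer, which is
// exact as long as the value stays within the integer's range.
std::int64_t addCents(std::int64_t a, std::int64_t b) {
    return a + b;
}

// Reports whether a + b would leave the 32-bit signed range, by doing the
// addition in 64 bits where it cannot overflow.
bool addOverflowsInt32(std::int32_t a, std::int32_t b) {
    const std::int64_t wide =
        static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
    return wide > std::numeric_limits<std::int32_t>::max() ||
           wide < std::numeric_limits<std::int32_t>::min();
}
```

    Here `tenthsAddExactly()` returns false, while `addCents(10, 20)` is exactly 30.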

    Cheers
     
  10. Crossbar

    Veteran

    Joined:
    Feb 8, 2006
    Messages:
    1,821
    Likes Received:
    12
    I was considering writing a piece about how problems can be optimised on so many levels. In retrospect I obviously should have, but thanks for the complementary information.
    I am not saying that assembly is the holy grail (I even wrote that I avoid it like the plague, can I be much clearer?), but understanding it helps a lot when you run into certain performance problems. If you understand assembly, the threshold for understanding cache prefetch instructions, memory alignment and such will certainly be lower as well. To me, understanding the implications of assembly-level code and CPU architecture go hand in hand.

    To get back on topic I can see one reason for keeping existing CPU-cores for the PS4 and nextBox and just increase the number of cores. It would make it possible to re-use the future die shrinks (the core logic part) of the existing cpus for the future die shrinks of the new cpus. Which could save Sony and MS some substantial money.
     
    #10 Crossbar, Dec 14, 2011
    Last edited by a moderator: Dec 14, 2011
  11. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,056
    Likes Received:
    1,020
    I'd change point 3 slightly, to say "Optimize for the memory hierarchy." Closely connected to point 1 of course.

    And somewhere you have to talk about algorithms. From what I have seen from my perch in scientific computation, I see a lot of work being spent on adapting to very restrictive architectures in high performance computing. Because the only way to get at big FLOPs is to conform to limitations in data sets, communication et cetera. Simply put, the tendency is to do tons of computational work on simplistic problem descriptions, because that's the only thing that lets itself map decently to the underlying parallel hardware. As soon as you try to take more of your knowledge of the problem into account, more heuristics, more conditionals, et cetera, in short as soon as you try to make more intelligent algorithms, your utilization of the underlying hardware drops off the proverbial cliff. (YMMV and all that.)

    There are balances to be struck obviously, but with time I've become less impressed by throwing FLOPS at simplistic problem descriptions, and believe more in human ingenuity and programming methods and architectures that support it. Algorithms shouldn't be taken for granted, they spring from human creativity and the educational system should make that clear.
     
  12. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,519
    Likes Received:
    852
    This I most certainly agree with.

    True.

    Mea culpa. Algorithms should be part of "Reason 1" above: data structures and algorithms. Choosing one often dictates the other.

    Isn't this a result of the ridiculous focus on Linpack for HPC ? Computers are measured by peak mega-bollocks, not by how effective they are at solving real problems.

    Cheers
     
  13. hoho

    Veteran

    Joined:
    Aug 21, 2007
    Messages:
    1,218
    Likes Received:
    0
    Location:
    Estonia
    I'd dare to say the biggest thing missing from what is taught is common sense. I've seen tons of people doing e.g. binary search over arrays of a couple of dozen elements and keeping them sorted, when a simple linear search would be faster even without considering the savings you get from not having to re-sort every time something changes.
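
    A hedged sketch of the two approaches, with function names invented for this example. It makes no performance claim on its own: the crossover point depends on element type, array size and hardware, so measure on the real target.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Linear scan over an unsorted array. Returns the index of `key`, or -1 if
// it is absent. No ordering has to be maintained when items are added.
int linearFind(const std::vector<int>& items, int key) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i] == key) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// The binary-search alternative needs the array kept sorted, so every
// insertion pays for finding the position and shifting later elements.
void sortedInsert(std::vector<int>& sorted, int value) {
    sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), value), value);
}

// Binary search; only valid if `sorted` really is in ascending order.
bool sortedContains(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

    For a couple of dozen ints the whole array fits in a cache line or two, which is why the plain scan tends to hold up so well.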
     
  14. Rodéric

    Rodéric a.k.a. Ingenu
    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,983
    Likes Received:
    846
    Location:
    Planet Earth.
    Common sense is a superpower, didn't you get the memo ?

    (I started writing a post on the 3 points, then assumed algorithms were implied in data structure (tightly coupled), and memory access patterns in memory hierarchy optimisation.)
     
  15. ban25

    Veteran

    Joined:
    Apr 7, 2002
    Messages:
    1,380
    Likes Received:
    6
    Location:
    San Francisco, CA
    It's not uncommon to look at the disassembly when debugging optimized builds; otherwise it can be very difficult to understand what's actually going on. If you can't at least read and understand small amounts of compiler-emitted asm, then your usefulness as an engineer is frankly limited.
     
  16. ban25

    Veteran

    Joined:
    Apr 7, 2002
    Messages:
    1,380
    Likes Received:
    6
    Location:
    San Francisco, CA
    Java is effectively becoming what VB was in the '90s. :( We tend to hire most graduates from specialized programs, like the ETC at CMU, where students get the majority of the skills they need to be productive in the game industry.
     
  17. TheWretched

    Regular

    Joined:
    Oct 7, 2008
    Messages:
    830
    Likes Received:
    23
    I am in the midst of my graduation, and if I hadn't done so by myself, I wouldn't have learned a whole lot besides Java. My professor actually demanded that we program in C++ for our thesis, though we aren't really forced to do it. But "disagreeing" with your professor isn't really what you want to do^^ I must say, learning C++ was worth it. Not just the different syntax compared to Java, but also a lot of the compiler nitpicks (I mainly use Linux/QtCreator at home and Windows/Visual Studio at Uni). Although I can read assembler, I can't really comprehend it, at least not the more complex stuff. I had some assembly programming classes in my first semester, though (MIPS32), but nothing major.

    My University did, however, completely change up the lower semesters to get more diversity into their studies. Now they learn one semester of Java, one of Haskell and one of C. After that, I don't know what they'll have to do, if anything.
     
  18. anexanhume

    Veteran Regular

    Joined:
    Dec 5, 2011
    Messages:
    1,493
    Likes Received:
    676
    Assembly is neat because it's kind of the opposite of OOP. Normally you can tell exactly what a specific instruction does, but it's difficult to tell what a group of instructions does, whereas with OOP it may be hard to know exactly how a certain statement executes, but you can get the gist of a block of code.
     
  19. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,519
    Likes Received:
    852
    True for the games industry. It is simply not a skill required for software development in general.

    Cheers
     
  20. ban25

    Veteran

    Joined:
    Apr 7, 2002
    Messages:
    1,380
    Likes Received:
    6
    Location:
    San Francisco, CA
    I suppose I'd have to ask, what's your definition of "software development in general?" I'm certain it is useful for engineers at Apple working on iOS devices, at Microsoft working on Office, at Google working on Search, at Oracle working on their DB engine, etc., etc.
     