MEME: Sony engineering better than everyone else *cleanup*

Discussion in 'Graphics and Semiconductor Industry' started by w0lfram, Jan 28, 2019.

  1. w0lfram

    Newcomer

    Joined:
    Aug 7, 2017
    Messages:
    149
    Likes Received:
    31

    Your entire post goes against AMD's business philosophy!

    AMD is moving their business model toward providing solutions to their customers using their own patented chiplet design (& heterogeneous design). (See China/SONY/MS/etc.) Dr. Su herself has talked about working closely with their customers, at some length (integrated), to be able to provide them with a custom design suiting their exact needs/wants/goals.


    sigh*

    Navi had an issue & had to be "re-taped" out. Yes that deep of an issue, but easily remedied as someone has been told. (Navi will be an Aug product, instead of April.)


    The *thread spawn* comes from me asking "What if... NAVI got additional updates learned from working with the likes of SONY/MS/etc., and they snuck a few of those "updates" into the newly taped-out uArch?"

    It (the thread spawn) was a very simple question, yet people went off on different tangents.
     
    #41 w0lfram, Jan 30, 2019
    Last edited by a moderator: Jan 31, 2019
    MBTP likes this.
  2. BRiT

    BRiT (╯°□°)╯
    Moderator Legend Alpha Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    12,154
    Likes Received:
    8,304
    Location:
    Cleveland
    You entirely misunderstood that post and the concept of what AMD customizations provide.

    Think of it more like a Burger King. Yes, you can customize your order your way, but there is no way to have them come up with a Steak and Lobster dinner. They simply don't have those as ingredients.
     
    Geeforcer likes this.
  3. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,095
    Likes Received:
    2,814
    Location:
    Well within 3d
    Tapeout is a late step in bringing a design to manufacture, where the masks are being developed from a database that has gone all the way through the design process and pre-production validation. That's after the chip is ideally as final as the designers can manage. In a lucky scenario, that initial revision is good enough to go after the first chips come back and are validated.

    If there are issues after tapeout, those are generally from the initial production samples. Tweaks and subsequent passage of test wafers are called respins.
    The changes at that point are intended to be minimal, since a significant investment in resources and time went into verifying the chip in its taped-out form, and the further the alterations reach from the later metal layers down to the fundamental transistor layer, the more weeks or months can be needed to start mass production.

    Adding design features like semi-custom architectural changes threatens to throw out substantial portions of the base design and negate much of the engineering and verification, since the silicon would no longer correspond to what was engineered and tested for months or potentially years.

    In this scenario, perhaps it could be argued that the uArch is dead, and to be replaced by another one. Perhaps if this new uArch was already in progress and could be used to replace the rejected one, there would be a lesser delay. Deciding this at the point of tapeout means being a few months from ramping and then setting the clock back to an intermediate point in a 3-4 year process.
    I'm not sure what would be compelling enough to go that far back on a design that didn't have a serious need to be reset further back than tapeout or just replaced for other reasons.
     
    MBTP, AlBran, iroboto and 1 other person like this.
  4. w0lfram

    Newcomer

    Joined:
    Aug 7, 2017
    Messages:
    149
    Likes Received:
    31
    Yes, that is why I mentioned "respin".

    And nothing you have suggested refutes what is now widely reported, or what I have said. The scenario I suggested didn't hint at Navi being dead/broken, etc... as you've suggested. It was delayed by 4 months until they can hand out samples from the respin (after the re-taping).

    Given that^ tidbit/fact..
    I had asked; What if.. AMD reworked other aspects of Navi's uArch (for this new re-spin), perhaps simple lessons learned from their closer collaboration with SONY (Microsoft/China/etc)...?

    It was a simple question really...


    Additionally, when asked about Navi's progress (delay?), Dr. Su mentioned that AMD was happy with where Navi was, with a few small issues along the way, but that they are choosing to do the "right thing" (unlike Polaris being rushed) so Navi is more complete for the consumer, or some such nonsense. She also mentioned in some fashion that AMD is excited about where Navi is technically, and that they were able to incorporate more into their design than had been slated. It somehow ties into their leapfrogging philosophy of teams.

    I did stay at a Holiday Inn last night..
     
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,095
    Likes Received:
    2,814
    Location:
    Well within 3d
    As I said, a respin is intended to have as small an impact as can be managed. It's about fixing minor faults or functional errors in existing units, not adding units or changing their design. The design and verification of the units already in the chip is essentially final by that point, and taking features or elements from a different design will discard the time invested in the prior version.

    Then there's significant pressure to not do what you are saying. Chips are almost set in stone a long time before final production. If the design doesn't have a serious problem requiring a significant reset, it's expensive and high-risk to do it for some marginal gain.

    That's in-line with a new stepping with some bug fixes, since the fab pipeline has up to several months of turnaround time and then weeks of physical testing. That doesn't leave time for new functionality or design changes to be inserted, and there is significant risk with unverified alterations being put in at the time where errors take the most time and money to fix.

    That would imply a new design already nearing completion, not restarting a number of steps earlier in the design process of an implementation. The time frame given doesn't give time to change the design or to go through the months of internal simulation and testing that led up to the first tape-out. Transplanting an element implemented in a Sony or Microsoft product into an architecture that didn't include it requires redoing a lot of work, and the 4 months for a respin happen after that work is done again.

    The process from specification to design to silicon takes multiple years. Changing the design rolls back something like the last third of that process.
    An example of GCN-based hardware is the PS4, which its lead designer said took 2 years to spec, 2 years to create custom designs, and 2 years to build the platform around it.
    Taking some years off because of the larger scope of a total SOC and platform, the work going into the chip could take around 4 years, and would have been substantially locked-in perhaps 2 years prior to release.
    That can readily allow for there being many months to over a year of work expended after the design featureset had been decided on, so design changes can revert things quite far in time.

    If there's something to be added, it's not clear what would be so pressing as to risk delaying a product for much longer than a respin rather than put out the current chip and have the next GPU include the features--assuming there are really game-changing features the console makers would not want exclusive to their chips.

    https://www.digitaltrends.com/gaming/meet-the-guy-who-engineered-the-playstation-4/
     
  6. w0lfram

    Newcomer

    Joined:
    Aug 7, 2017
    Messages:
    149
    Likes Received:
    31
    Again, I think everyone here understands that.

    I had asked whether, if so, and they were going back to re-tape Navi (no matter how small or insignificant that would entail), with leapfrogging teams and updated revisions already stacking up behind each other, AMD would, when it went back to "re-tape" Navi (for some minor error), incorporate the most up-to-date masking (a revision that AMD had already been working on...) into the tape-out and re-spin.

    As logic would imply, not only with the updated (fix) and tape out, Navi would also adopt a newer (more up to date) revision of the latest masking, or production process. Again, going back to Dr Su saying, that they do not want to make the same mistake they made with Polaris (at this stage in development) and rush it. Instead they are doing it right. If Navi was indeed going to be announced at CES and was delayed, then do you think that 4-month delay would include Navi's latest revisions (since the last tape out..?)



    What little changes that could be made within a short revision, will probably be done.

    But it sounds (to me), that You are suggesting (if true) AMD will be using the exact same revision, but are just going to fix (whatever they found in error) and tape out & respin it... without taking any opportunity to sneak newer revisions into that new tape out..?

    You touched on a newer revision briefly, then went on to explain what that process entails and why AMD would (at this stage) not change the core functions of Navi's architecture. (an argument nobody was making)

    However slight that revision might be, the opportunity to tape out as a further revision of itself isn't all that bad of a prospect. I am just wondering what (if anything) they could do, in addition to perhaps better thermals & efficiencies due to a newer, refined process..?
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,095
    Likes Received:
    2,814
    Location:
    Well within 3d
    Masks are unique to the specific chip they are generated for, and the most up to date set is the chip that is currently taped-out. Designs earlier in their process are further from being in physical form, and so would not have much value to the chip that is taped out and can have physical samples for real testing.

    The presented scenario is that there is a chip A being taped out with a chip B being further back in development. Then chip A is "re-taped" and takes elements from B.
    This hybrid design is neither A nor B, so their respective validation and simulation results cannot provide guarantees in the areas where they are combined.

    Three or so months isn't out of line with respins discussed in the past, and 7nm without EUV is noted to have even longer turnaround time due to the large increase in steps from heavy multi-patterning.
    EUV's reduction in multi-patterning is touted by the foundries as reducing cost and reducing lead times, if the EUV lithography tools can improve exposure times and power sources.
    If the decision to do a respin is made after the first chips come back from the fab, there's not much more time besides time-constrained fixes to the show-stopper issues and focused validation on those changes.
    Features that have too many issues could even be disabled in order to salvage the schedule.

    For a respin, the revision of the chip that came back from the fab has minor corrections made, and a new mask stepping is made and sent off.
    There are ways of reducing the manufacturing lead time and the risk of alterations, but those work by changing less and less of the chip.
    It's a bad time to be changing things just to change them, and there's no clear upside. The risks of severely impacting the design are high, and no features we've seen from the consoles provided a massive benefit. Even if there were an upside to the features, the right thing to do is generally to tape-out the current chip and have the next product incorporate the new features.

    No specific level of change was cited, but if this is somehow taking features from Sony or Microsoft, it's already sounding non-trivial. Since there's no sign of them being in production, they wouldn't have any physical feedback to give to a chip being taped-out--and even physical learning doesn't have a lot of carry-over between designs. That leaves feature changes with a physical, electrical, and logical footprint. Those effects would need to be validated.
    Even minor changes can have an impact if new transistors or layouts affect how the lithographic patterns interfere with each other, or how the physical differences that result from the choice of patterns in a region can affect mechanical and chemical effects of the fabrication process.
    One of the advantages touted by AMD for its chiplet strategy is that it avoids this: if you want to change one part of a monolithic SOC, the whole chip requires re-evaluating for complex side effects.
     
    MBTP, w0lfram, function and 5 others like this.
  8. w0lfram

    Newcomer

    Joined:
    Aug 7, 2017
    Messages:
    149
    Likes Received:
    31
    Thank you. That was insightful.

    But, as you have suggested, most of the cases presented were worst-case scenarios. And again, you didn't take into account AMD's leap-frogging teams, with validated designs of their own, etc. (just saying)

    (ie: What if regardless of Navi's original release, AMD was planning a 2nd revision within 6 months...)


    So basically:
    TSMC will have refined their process node a tad more for better efficiencies (over the 4 month delay). And AMD (in the meantime), who vowed "not to take the easy route" and to "do it the right way", will simply retape/respin the dies & correct only what needs to be done. (No updates, or revisions.)

    Seems like wasted time.
     
  9. BRiT

    BRiT (╯°□°)╯
    Moderator Legend Alpha Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    12,154
    Likes Received:
    8,304
    Location:
    Cleveland
    Why would they plan to spend millions on another revision within 6 months?
     
  10. entity279

    Veteran Regular Subscriber

    Joined:
    May 12, 2008
    Messages:
    1,220
    Likes Received:
    418
    Location:
    Romania
    Nope, that's just sane engineering (in any field). You'd need a working, stable baseline to improve upon first.
     
    MBTP and AlBran like this.
  11. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,095
    Likes Received:
    2,814
    Location:
    Well within 3d
    The time frame given earlier was 4 months, which is shorter than the 6 month claim given now.

    I would need to see the specific statements about leapfrogging teams.
    The relationship between two teams where one team's product is re-designed based on another project's features (when the other project is not in the process of initial sampling) sounds more complicated than leap-frogging, especially if they are piling up at tape-out despite historically taking longer than the initially claimed four months for "re-taping".

    Regardless, leap-frogging doesn't prevent the injection of a separate project's design elements from disrupting the first team's late-stage development timeline.


    Unless the Sony and Microsoft chips are in production, the chip that would be able to generate any learning from the new refinements would be the chip being "re-taped", which is still a term I haven't seen defined clearly.
    As a pipe-cleaner, Vega 20 may have provided some information for the design for manufacturing of future chips, since it taped out and went into production much earlier.

    In the development of very complex architectures and chips, the general process is that each stage involves making design and engineering choices based on projections of what will happen later, and each later stage builds sequentially on what came before. Each stage reaches a point where it must commit, and the full verdict of those decisions may not be known for a long time. When choices were made six months or a year ago and all the work since has been based on them, changing the foundation of those lapsed months requires new investments of time and engineering. There's not much borrowing of another design's time.

    For a rather ancient example of a chip revision rather than "re-taping", there's what AMD did with the Thoroughbred A and B cores in 2002.
    AMD modified the layout, added a metal layer, and added some transistors and decoupling capacitors when going from A to B. This allowed for higher clock speeds, but AMD did not want to differentiate the two revisions beyond that.
    https://www.anandtech.com/show/972/3

    This was over 16 years ago, so the complexity and lead times for doing this were far different, and the x86 development and refinement process showed more intensity than AMD's GPU cadence.
    Even so, Thoroughbred A was not "re-taped" or held up for B. Revision A was taped out, validated, and sold for months before Revision B started to come out. The "right thing" in that case was to finish the current chip and bring it to market, and let the next chip adopt what optimizations are available at its final stage of development.
     
  12. Clukos

    Clukos Bloodborne 2 when?
    Veteran Newcomer

    Joined:
    Jun 25, 2014
    Messages:
    4,452
    Likes Received:
    3,781
    This, pretty much. It's fair to assume that Sony's input will be valuable to AMD in developing Navi (or whatever ends up in PS5/Nextbox), but to assume that they are basically developing Navi is a bit far-fetched. I would expect some custom features (potentially something akin to mesh/primitive shaders) that won't make it to desktop Navi, but not anything groundbreaking.
     
  13. metacore

    Newcomer

    Joined:
    Sep 30, 2011
    Messages:
    105
    Likes Received:
    77
    Are we arguing about who is better at playing Lego with transistors, or ... that guys whose bread and butter is making the best code on limited resources for retail code, guys who coded in assembly on 30MHz/2MB, then 3 cores and five memory subsystems on PS2 and then the Cell, have absolutely nothing to input? Zero bad and good experience to share? On the other hand there is a paved road of failed features culminating with Vega.

    BTW PS2 > Turing <runs> ( that comments are about meshlets )


     
    MBTP likes this.
  14. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Interesting, that about the PS2; future hardware it was. One of my favorite pieces of hardware of all time.
    Do they mean it was too slow, but the architecture was more flexible?

    If we still haven't "beaten" the PS2, then why was the 2001 Xbox faster in about everything?
     
  15. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    1,135
    Likes Received:
    567
    Location:
    France
    I believe they talk about logical processing of those tasks and efficiency, flexibility, more than absolute raw power.
     
    vipa899 likes this.
  16. metacore

    Newcomer

    Joined:
    Sep 30, 2011
    Messages:
    105
    Likes Received:
    77
    Yeah, it's about flexibility; there is a further comment by Sebastian
     
    MBTP and vipa899 like this.
  17. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    OK, it could have been flexible, but if it can't do most effects at reasonable performance, those effects weren't of much use. Things like bump mapping, pixel shading, detail textures, etc. weren't all that common on the PS2. Some games were ambitious (SotC), but at 15fps it wasn't all that smooth.
    Can't really say the PS2 was better than (more) fixed-function variants like the OG Xbox, or perhaps the GC to some extent. I think the PS2 was a '90s design with software rendering in mind; that was common then, much more flexible but slower.

    Wasn't one of the unreleased or limited Voodoo 5 6000 cards a multipass GPU, or at least an idea of a PS2-like design?
     
  18. bgroovy

    Regular Newcomer

    Joined:
    Oct 15, 2014
    Messages:
    600
    Likes Received:
    450
    No one is suggesting that VUs would have stayed trapped at 300mhz had the concept been brought forward.
     
    vipa899 likes this.
  19. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Wonder what a further developed EE/GS would have looked like in the future. Some said it wasn't forward thinking.
     
  20. metacore

    Newcomer

    Joined:
    Sep 30, 2011
    Messages:
    105
    Likes Received:
    77
    Vipa, they are talking precisely about geometry setup/processing not texturing, shading and so on.
     
    vipa899 likes this.

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.