Tesla Dojo

Discussion in 'Graphics and Semiconductor Industry' started by Jawed, Aug 24, 2021.

  1. Jawed

    Jawed Legend

    I just love how gigacorps are building their own chips from scratch:



    The video is only about 20 minutes and has some nice nuance that you won't get from this article:

    Tesla's insane new Dojo D1 AI chip, a full transcript of its unveiling | TweakTown

    The article is also missing some slides, which is a shame. 5x GPU in off-chip bandwidth for system-wide networking is a non-trivial factor (timestamped video for the slide I'm referring to):




    So actual self-driving coming real soon now?
     
    entity279, iroboto and Lightman like this.
  2. Jawed

    Jawed Legend

    Gary's 10 minute explanation is also worth a watch and might be preferable for some people:

     
    Lightman likes this.
  3. orangpelupa

    orangpelupa Elite Bug Hunter Legend

    Is their presentation of Tesla Dojo "transparent", or are there misleading things?

    I mean, in their presentation on point clouds there were things (point clouds built around corners at intersections outside the camera view, a LIDAR rack in a window reflection) that indicate it was made with LIDAR (or at least used LIDAR as a base or something), but they didn't mention LIDAR at all.
     
  4. Jawed

    Jawed Legend

    I haven't watched anything other than those two videos.

    Tesla Is Testing Lidar Sensors, Which Elon Musk Has Criticized: Report (businessinsider.com)

    Overall, Elon Musk has been effectively lying about their autonomous driving progress for years now, so I ignore the detail.

    It seems clear, now, that they were orders of magnitude distant from what's truly required. Easily could be another 10 years for all I know.

    I can't help thinking that Tesla realised that the GPGPU roadmap was too slow by about a decade, so a few years ago they decided to build their own AI supercomputers.

    "General purpose GPU compute" looks like a dead end now. ExaPod's cabinet is the size of an aisle and the architecture is designed for aisle-level scaling, not blade-level as seen with GPUs. Maybe the next wave of GPUs will achieve cabinet-level scaling... NVidia is building cabinets that are 5x bigger for the same performance, so already far behind on bandwidth density.

    (Domestic robot thing sounds like a marketing fuck-up of gigantic proportions, literally making Tesla a laughing stock. I just scrolled past any articles or videos on that topic.)
     
    Kyyla and orangpelupa like this.
  5. nutball

    nutball Veteran Subscriber

    I'm expecting fully autonomous/self-driving cars around the time that we get the first commercially viable nuclear fusion power plants. I'm not expecting to live to see either.
     
    Kyyla, AlphaWolf and orangpelupa like this.
  6. nAo

    nAo Nutella Nutellae Veteran

    I am not so sure about the roadmap not moving fast enough. Their own supercomputer, which doesn't exist yet, is already slower than AI supercomputers from last year.
     
  7. Jawed

    Jawed Legend

    I was waiting for someone to suggest this. The hubris of incumbents is always entertaining. Tesla built this for a joke, obviously.

    Well they're already working on version 2.

    I wonder if they'll enter the cloud AI business with this, if it works.
     
  8. iroboto

    iroboto Daft Funk Legend Subscriber

    training != runtime
    Just to be clear, training takes exponentially longer than running; the hardware required to perform self-driving is fairly good as it is.

    But let's say your AI is about 98% effective. To gain each 0.1% more in accuracy requires significantly more training and research, feature extraction, datasets etc. You require significantly more processing power to extract that small portion forward. Each 0.1% is a massive deal when you're dealing with cars in the millions, so 99.99% safe is not really nearly as safe as 99.9999%, when you consider millions of tesla vehicles driving millions of kilometers.
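    A rough sketch of the fleet-scale arithmetic behind that point. It assumes "X% safe" means each kilometre driven is incident-free with probability X, independently of every other kilometre; the fleet size and distance per vehicle are hypothetical round numbers, not Tesla figures.

```python
def expected_incidents(safety_rate: float, fleet_size: int, km_per_vehicle: float) -> float:
    """Expected number of incidents across the fleet.

    Assumption (not from the thread): every kilometre fails
    independently with probability (1 - safety_rate).
    """
    failure_prob = 1.0 - safety_rate
    return failure_prob * fleet_size * km_per_vehicle


# Hypothetical fleet: one million cars, ten thousand km each.
FLEET_SIZE = 1_000_000
KM_PER_VEHICLE = 10_000

if __name__ == "__main__":
    for rate in (0.98, 0.9999, 0.999999):
        incidents = expected_incidents(rate, FLEET_SIZE, KM_PER_VEHICLE)
        print(f"{rate:.6f} safe -> about {incidents:,.0f} expected incidents")
```

    Under these assumptions, going from 99.99% to 99.9999% cuts the expected incident count by a factor of 100, which is why the last few decimal places matter so much at fleet scale.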

    As it stands, adding LIDAR into the AI is ideal, but cameras are likely to represent the majority of the work.

    There is a lot that goes into a self-driving AI: following distance, acceleration curves, braking curves, dealing with people cutting in, visual inaccuracies, blinding lights, rain, snow, dust, mud, etc. Being able to handle all those situations without making the car feel like it drives erratically requires some complex neural networks with a fairly extensive driving history (maybe 30 seconds backwards). As someone who was there for autopilot 1 to where it is today, they've come a long way in how well the car handles the road compared to how it did at the start.

    Full self driving will be here much sooner than you think, but drivers should be prepared to 'take over' from time to time. Maybe in 10+ years time, you can be relatively not there as a driver.

    Tesla continues to make its own hardware because, quite frankly, nvidia is expensive. They'll likely use this for SpaceX as well.
     
    Last edited: Aug 24, 2021
    Lightman likes this.
  9. nutball

    nutball Veteran Subscriber

    In which case it's not full self-drive. It's a slightly smarter cruise control.

    Full self-drive is like being in a taxi or chauffeur driven. You go from A to B regardless of where A and B are and what's in between them, without always having to be ready to leap in to the cockpit on the off-chance that something hasn't been modelled properly. Anything less than that is marketing bollocks.
     
    Kej, milk and Jawed like this.
  10. iroboto

    iroboto Daft Funk Legend Subscriber

    they could release that today. Doesn't mean it's as safe as they want it to be. There's a big difference between the technology being available, and being 99.9999% safe. I assure you that they have full self driving working already. Doesn't mean it can handle everything you throw at it.

    I don't think they ever promised it to be a chauffeur-driven experience. You are welcome to point me to where Elon says that, because I'm pretty sure he is on record saying that we are well over 20 years away from that level of self driving.

    Mind you that was in 2017 when I bought mine, I haven't seen what he's said lately. Frankly I don't trust it that much per se, but it's been useful for long road trips.


    edit: update. Ah, I see he is overstating their output:

    The ratio of driver interaction would need to be in the magnitude of 1 or 2 million miles per driver interaction to move into higher levels of automation. Tesla indicated that Elon is extrapolating on the rates of improvement when speaking about L5 capabilities. Tesla couldn’t say if the rate of improvement would make it to L5 by end of calendar year.


    yea, I don't see L5 automation being ready for end of 2021, it's unlikely to have occurred during the covid years. Curious to see where they land in 2024.
     
  11. nutball

    nutball Veteran Subscriber

    I'm not singling out Elon or Tesla here. My sole point is that the term "full self driving" needs to be used very carefully IMO lest it get devalued. I realise I'm pissing in the wind and that marketing spin will likely triumph as it usually does.

    If Musk is saying 20+ years for what I call full self-driving, then compensating for the Musk Reality Distortion Field gives ~40 years Standard Human Elapsed Time, which is roughly what I'm expecting. So he and I agree. Excellent. And anything up to then is marketing bollocks.
     
    milk likes this.
  12. iroboto

    iroboto Daft Funk Legend Subscriber

    it'll likely land much sooner. He made that comment in 2017 and revised it again in 2019. I think the amount of road data being captured by all these teslas is something he didn't factor in. 40 years is a long time, there will be several breakthroughs before then. FSD subscriptions come out soonish for USA, so they'll be able to FSD point A to B. But there will be times in non-ideal conditions that the AI will ask the driver to take over. Of this I'm certain.

    “Full Self-Driving capability is now available as a monthly subscription. Upgrade your Model Y ... for $199 (excluding taxes) to experience features like Navigate on Autopilot, Auto Lane Change, Auto Park, Summon and Traffic Light and Stop Sign Control. The currently enabled features require active driver supervision and do not make the vehicle autonomous.” City street driving (steering) is something the laws in a particular area will need to allow. That pretty much means anyone who buys into FSD today is a beta participant: as you cross boundary areas, the GPS will shut off certain features.

    oddly I don't get to subscribe for my version, I can only pay. But it's 5300 CAD. So much cheaper than the 10K everyone else has to pay.
     
    Last edited: Aug 24, 2021
    Lightman likes this.
  13. iroboto

    iroboto Daft Funk Legend Subscriber

    Right let me reiterate my position here on the subject.
    Tesla is not far from FSD because of a lack of computational power. The bottleneck is more likely the R&D and data sets required to make FSD happen at Level 5 automation.

    The reason why Tesla would move to make their own supercomputer as opposed to using an existing one comes down to cost. They likely extrapolated the run rates of their training costs many years down the road and determined that this would ultimately be cheaper and better for them.

    To provide you an example:
    https://dl.acm.org/doi/fullHtml/10.1145/3381831
    You can quickly see how this starts to ramp up in cost. There is a lot of processing power out there, but the electricity doesn't come cheap. If there is a way to get similar performance with less power usage, that would be the reason to invent your own silicon.

    You are constantly retraining and evaluating models and only when you are satisfied do you release it into production. If a team is training several models a day, you're burning through tons of cash.
     
  14. nAo

    nAo Nutella Nutellae Veteran

    What hubris? I simply made a factual statement.

    They’re clearly very serious about this effort and perhaps they will be very successful on other important metrics. Raw power doesn't seem to be the highlight, so far, and strictly speaking there is nothing wrong with that.

    Like every other company in this space. It’s a marathon..

    Developing the whole thing just for themselves sounds very expensive if in the long term they don’t plan to somehow externally productize their technology.
     
    orangpelupa and pharma like this.
  15. Jawed

    Jawed Legend

    I'm wondering if people have watched the videos I linked or read the transcript I linked, because I'm seeing lots of comments that contradict selected aspects.
    I don't think "exponentially" quite captures the difference :mrgreen:

    Tesla described lots of other problems. Networking, small-batch performance, physical size being some.

    Also, Tesla gets to write its own ISA. And when it spends billions on software engineering to target its ISA, it has full control over that software. It's not praying for NVidia to agree on what's important, and to implement that stuff whenever it suits. It's not paying for a drip-feed of incremental improvements...

    The problem is assuming that NVidia's roadmap is useful to Tesla.

    There's an asymmetry here: Tesla knows what NVidia tells it about its plans (going back a few years it would have looked like: Ampere, Hopper, Next Next, Next Next Next) and Tesla estimates the fitness of that roadmap for its own use.

    Clearly, Tesla decided that NVidia's roadmap was unsuitable.

    I see Dojo as being very much like Apple building M1 and the family of chips that follows it. The incumbent was basically clueless and now has a hollow-looking roadmap to attain parity with what Apple has already delivered. Intel's roadmap was utterly useless. In addition to that, Apple realised that vertical integration down to the transistor would result in vastly better products.

    Time will tell if NVidia's roadmap was also useless for Tesla, or Tesla dumps Dojo in a few years' time for more NVidia.
     
    iroboto, Lightman and xpea like this.
  16. pcchen

    pcchen Moderator Veteran Subscriber

    I don't know if what NVIDIA is building is good enough for Tesla or not, but it's obvious that Tesla is not the only one doing AI work right now. People who buy from NVIDIA definitely have their own requirements and (at least for those who can't afford to build their own chips) will tell NVIDIA what they want, in order to get what they want.
    Furthermore, NVIDIA is no longer in the situation where "gaming GPU" is required to subsidize "AI chip" development, as revenue from AI chip rivals gaming GPU already, so they can afford to design a completely different AI chip if that's the way to go.
     
    pharma, DegustatoR and iroboto like this.
  17. AzBat

    AzBat Agent of the Bat Legend

    Kinda surprised at the lack of discussion of the actual D1 chip and its scaling: a tile of 25 D1 processors, then 6 tiles to a tray, then 2 trays to a cabinet, then 10 cabinets to an ExaPod.
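    For reference, the hierarchy above multiplies out as follows. This is just the arithmetic on the counts quoted in this post; the constant and function names are made up for illustration.

```python
# Counts as quoted in the post above.
D1_PER_TILE = 25
TILES_PER_TRAY = 6
TRAYS_PER_CABINET = 2
CABINETS_PER_EXAPOD = 10


def chips_per_level() -> dict[str, int]:
    """Number of D1 chips contained in each level of the hierarchy."""
    tile = D1_PER_TILE
    tray = tile * TILES_PER_TRAY
    cabinet = tray * TRAYS_PER_CABINET
    exapod = cabinet * CABINETS_PER_EXAPOD
    return {"tile": tile, "tray": tray, "cabinet": cabinet, "exapod": exapod}


if __name__ == "__main__":
    for level, chips in chips_per_level().items():
        print(f"{level:>8}: {chips} D1 chips")
```

    That works out to 150 chips per tray, 300 per cabinet and 3,000 per ExaPod.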

    Tommy McClain
     
    Malo, Lightman and BRiT like this.
  18. AzBat

    AzBat Agent of the Bat Legend

    Still nothing? Maybe if it was painted blue or used an x-clamp. Bleh.

    Tommy McClain
     
  19. Jawed

    Jawed Legend

    Problem is there's really not much detailed information.

    Some videos that are mildly interesting:







    But there's lots of speculation in those videos.

    I think CleanTechnica is wrong to call the 25 chips in a slice "a wafer". I believe this is actually 25 known-good dies that are mounted onto a single wafer. Tesla's video is very brief on the subject of "wafer". CleanTechnica in my opinion is incorrect to assume that redundancy (a la Cerebras) across a single wafer is good enough for the architecture that Tesla is using.
     
    BRiT likes this.
  20. pharma

    pharma Veteran

    The Tesla Dojo Chip Is Impressive, But There Are Some Major Technical Issues – SemiAnalysis
     
    Sxotty and Jawed like this.