For years, Elon Musk has talked about Dojo, the AI supercomputer that would be the cornerstone of Tesla’s AI ambitions. It’s important enough to Musk that in July 2024, he said the company’s AI team would “double down” on Dojo in the lead-up to Tesla’s robotaxi reveal, which happened in October.
But what exactly is Dojo? And why is it so critical to Tesla’s long-term strategy?
In short: Dojo is Tesla’s custom-built supercomputer that’s designed to train its “Full Self-Driving” neural networks. Beefing up Dojo goes hand in hand with Tesla’s goal to reach full self-driving and bring a robotaxi to market. FSD, which is on hundreds of thousands of Tesla vehicles today, can perform some automated driving tasks but still requires a human to be attentive behind the wheel.
Tesla’s Cybercab reveal has come and gone, and now the company is gearing up to launch an autonomous ride-hail service using its own fleet of vehicles in Austin this June. Tesla also said during its 2024 fourth-quarter and full-year earnings call at the end of January that it plans to launch unsupervised FSD for U.S. customers in 2025.
Musk’s previous rhetoric has been that Dojo would be the key to achieving Tesla’s goal of full self-driving. Now that Tesla appears to be nearing that goal, Musk has been mum on Dojo.
Instead, ever since August 2024, talk has centered on Cortex, Tesla’s “giant new AI training supercluster being built at Tesla HQ in Austin to solve real-world AI.” Musk has also said it will have “massive storage for video training of FSD & Optimus.”
In Tesla’s Q4 shareholder deck, the company shared updates on Cortex, but nothing on Dojo.
Tesla has positioned itself to spend big on AI and Dojo, and now Cortex, to reach its goal of autonomy for both cars and humanoid robots. And Tesla’s future success really hinges on its ability to nail this down, given the increased competition in the EV market. So it’s worth taking a closer look at Dojo, Cortex, and where it all stands today.
Tesla’s Dojo backstory
Musk doesn’t want Tesla to be just an automaker, or even a purveyor of solar panels and energy storage systems. Instead, he wants Tesla to be an AI company, one that has cracked the code to self-driving cars by mimicking human perception.
Most other companies building autonomous vehicle technology rely on a combination of sensors to perceive the world (like lidar, radar and cameras) as well as high-definition maps to localize the vehicle. Tesla believes it can achieve fully autonomous driving by relying on cameras alone to capture visual data and then using advanced neural networks to process that data and make quick decisions about how the car should behave.
As Tesla’s former head of AI, Andrej Karpathy, said at the automaker’s first AI Day in 2021, the company is basically trying to build “a synthetic animal from the ground up.” (Musk had been teasing Dojo since 2019, but Tesla officially announced it at AI Day.)
Companies like Alphabet’s Waymo have commercialized Level 4 autonomous vehicles, which the SAE defines as a system that can drive itself without the need for human intervention under certain conditions, through a more traditional sensor and machine learning approach. Tesla has still yet to produce an autonomous system that doesn’t require a human behind the wheel.
About 1.8 million people have paid the hefty subscription price for Tesla’s FSD, which currently costs $8,000 and has been priced as high as $15,000. The pitch is that Dojo-trained AI software will eventually be pushed out to Tesla customers via over-the-air updates. The scale of FSD also means Tesla has been able to rake in millions of miles’ worth of video footage that it uses to train FSD. The idea is that the more data Tesla can collect, the closer the automaker can get to actually achieving full self-driving.
However, some industry experts say there might be a limit to the brute force approach of throwing more data at a model and expecting it to get smarter.
“First of all, there’s an economic constraint, and soon it will just get too expensive to do that,” Anand Raghunathan, Purdue University’s Silicon Valley professor of electrical and computer engineering, told TechCrunch. Further, he said, “Some people claim that we might actually run out of meaningful data to train the models on. More data doesn’t necessarily mean more information, so it depends on whether that data has information that is useful to create a better model, and if the training process is able to actually distill that information into a better model.”
Raghunathan said that despite these doubts, the trend of more data appears to be here for the short term at least. And more data means more compute power is needed to store and process it all to train Tesla’s AI models. That’s where Dojo, the supercomputer, comes in.
What’s a supercomputer?
Dojo is Tesla’s supercomputer system that’s designed to function as a training ground for AI, specifically FSD. The name is a nod to the space where martial arts are practiced.
A supercomputer is made up of thousands of smaller computers called nodes. Each of those nodes has its own CPU (central processing unit) and GPU (graphics processing unit). The former handles overall management of the node, and the latter does the complex stuff, like splitting tasks into multiple parts and working on them simultaneously. GPUs are essential for machine learning operations like those that power FSD training in simulation. They also power large language models, which is why the rise of generative AI has made Nvidia the most valuable company on the planet.
Even Tesla buys Nvidia GPUs to train its AI (more on that later).
Why does Tesla need a supercomputer?
Tesla’s vision-only approach is the main reason Tesla needs a supercomputer. The neural networks behind FSD are trained on vast amounts of driving data to recognize and classify objects around the vehicle and then make driving decisions. That means that when FSD is engaged, the neural nets have to collect and process visual data continuously at speeds that match the depth and velocity recognition capabilities of a human.
In other words, Tesla means to create a digital duplicate of the human visual cortex and brain function.
To get there, Tesla needs to store and process all the video data collected from its cars around the world and run millions of simulations to train its model on the data.
Tesla appears to rely on Nvidia to power its current Dojo training computer, but it doesn’t want to have all its eggs in one basket, not least because Nvidia chips are expensive. Tesla also hopes to make something better that increases bandwidth and decreases latencies. That’s why the automaker’s AI division decided to come up with its own custom hardware program that aims to train AI models more efficiently than traditional systems.
At that program’s core are Tesla’s proprietary D1 chips, which the company says are optimized for AI workloads.
Tell me more about these chips
![Ganesh Venkataramanan, former senior director of Autopilot hardware, presenting the D1 training tile at Tesla’s 2021 AI Day.](https://techcrunch.com/wp-content/uploads/2024/08/Tesla-AI-Day-Dojo-Tile.png?w=680)
Tesla is of a similar opinion to Apple in that it believes hardware and software should be designed to work together. That’s why Tesla is working to move away from standard GPU hardware and design its own chips to power Dojo.
Tesla unveiled its D1 chip, a silicon square the size of a palm, on AI Day in 2021. The D1 chip entered production as of at least May this year. The Taiwan Semiconductor Manufacturing Company (TSMC) is manufacturing the chips using 7 nanometer semiconductor nodes. The D1 has 50 billion transistors and a large die size of 645 square millimeters, according to Tesla. This is all to say that the D1 promises to be extremely powerful and efficient and to handle complex tasks quickly.
“We can do compute and data transfers simultaneously, and our custom ISA, which is the instruction set architecture, is fully optimized for machine learning workloads,” said Ganesh Venkataramanan, former senior director of Autopilot hardware, at Tesla’s 2021 AI Day. “This is a pure machine learning.”
The D1 is still not as powerful as Nvidia’s A100 chip, though, which is also manufactured by TSMC using a 7 nanometer process. The A100 contains 54 billion transistors and has a die size of 826 square millimeters, so it performs slightly better than Tesla’s D1.
To get higher bandwidth and more compute power, Tesla’s AI team fused 25 D1 chips together into one tile to function as a unified computer system. Each tile has a compute power of 9 petaflops and 36 terabytes per second of bandwidth, and contains all the hardware necessary for power, cooling and data transfer. You can think of the tile as a self-sufficient computer made up of 25 smaller computers. Six of those tiles make up one rack, and two racks make up a cabinet. Ten cabinets make up an ExaPOD. At AI Day 2022, Tesla said Dojo would scale by deploying multiple ExaPODs. All of this together makes up the supercomputer.
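To make the hierarchy concrete, here is a minimal back-of-the-envelope sketch in Python that multiplies out the figures Tesla has stated (25 chips per tile, six tiles per rack, two racks per cabinet, ten cabinets per ExaPOD, 9 petaflops per tile). The constant and function names are illustrative, not Tesla's, and the result is a peak figure implied by those claims rather than a measured one.

```python
# Figures as stated by Tesla and quoted in this article; names are hypothetical.
CHIPS_PER_TILE = 25
TILES_PER_RACK = 6
RACKS_PER_CABINET = 2
CABINETS_PER_EXAPOD = 10
PETAFLOPS_PER_TILE = 9


def exapod_totals():
    """Return tile count, D1 chip count and peak exaflops for one ExaPOD."""
    tiles = TILES_PER_RACK * RACKS_PER_CABINET * CABINETS_PER_EXAPOD
    chips = tiles * CHIPS_PER_TILE
    exaflops = tiles * PETAFLOPS_PER_TILE / 1000  # 1 exaflops = 1,000 petaflops
    return {"tiles": tiles, "chips": chips, "exaflops": exaflops}


totals = exapod_totals()
print(totals)
```

Under these assumptions, one ExaPOD works out to 120 tiles and 3,000 D1 chips, for a little over 1 exaflops of peak compute, which is presumably where the name comes from.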
Tesla is also working on a next-gen D2 chip that aims to solve information flow bottlenecks. Instead of connecting the individual chips, the D2 would put the entire Dojo tile onto a single wafer of silicon.
Tesla hasn’t confirmed how many D1 chips it has ordered or expects to receive. The company also hasn’t provided a timeline for how long it will take to get Dojo supercomputers running on D1 chips.
In response to a June post on X that said: “Elon is building a giant GPU cooler in Texas,” Musk replied that Tesla was aiming for “half Tesla AI hardware, half Nvidia/other” over the next 18 months or so. The “other” could be AMD chips, per Musk’s comment in January.
What does Dojo mean for Tesla?
![GettyImages 2162480419](https://techcrunch.com/wp-content/uploads/2024/08/GettyImages-2162480419.jpg?w=680)
Taking control of its own chip production means that Tesla might one day be able to quickly add large amounts of compute power to AI training programs at a low cost, particularly as Tesla and TSMC scale up chip production.
It also means that Tesla may not have to rely on Nvidia’s chips in the future, which are increasingly expensive and hard to secure.
During Tesla’s second-quarter earnings call, Musk said that demand for Nvidia hardware is “so high that it’s often difficult to get the GPUs.” He said he was “quite concerned about actually being able to get steady GPUs when we want them, and I think this therefore requires that we put a lot more effort on Dojo in order to ensure that we’ve got the training capability that we need.”
That said, Tesla is still buying Nvidia chips today to train its AI. In June, Musk posted on X:
Of the roughly $10B in AI-related expenditures I said Tesla would make this year, about half is internal, primarily the Tesla-designed AI inference computer and sensors present in all of our cars, plus Dojo. For building the AI training superclusters, Nvidia hardware is about 2/3 of the cost. My current best guess for Nvidia purchases by Tesla are $3B to $4B this year.
“Inference compute” refers to the AI computations performed by Tesla cars in real time and is separate from the training compute that Dojo is responsible for.
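Musk's figures can be sanity-checked with some quick arithmetic. The sketch below assumes, as one reading of his post, that the half of the roughly $10 billion that is not internal goes entirely to training superclusters; that split is our assumption, not something Musk stated, and the variable names are illustrative.

```python
# All amounts in billions of U.S. dollars, taken from Musk's post.
TOTAL_AI_SPEND_B = 10.0
INTERNAL_SHARE = 0.5                 # "about half is internal"
NVIDIA_SHARE_OF_CLUSTERS = 2 / 3     # "Nvidia hardware is about 2/3 of the cost"

# Assumption: everything that is not internal is supercluster spending.
cluster_spend_b = TOTAL_AI_SPEND_B * (1 - INTERNAL_SHARE)
nvidia_spend_b = cluster_spend_b * NVIDIA_SHARE_OF_CLUSTERS

print(f"Supercluster spend: ${cluster_spend_b:.1f}B")
print(f"Implied Nvidia spend: ${nvidia_spend_b:.1f}B")
```

Under that reading, the implied Nvidia bill lands at about $3.3 billion, inside the $3 billion to $4 billion range Musk gave.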
Dojo is a risky bet, one that Musk has hedged several times by saying that Tesla might not succeed.
In the long run, Tesla could theoretically create a new business model based on its AI division. Musk has said that the first version of Dojo will be tailored for Tesla computer vision labeling and training, which is great for FSD and for training Optimus, Tesla’s humanoid robot. But it wouldn’t be useful for much else.
Musk has said that future versions of Dojo will be more tailored to general-purpose AI training. One potential problem with that is that almost all AI software out there was written to work with GPUs. Using Dojo to train general-purpose AI models would require rewriting the software.
That is, unless Tesla rents out its compute, similar to how AWS and Azure rent out cloud computing capabilities. Musk also noted during Q2 earnings that he sees “a path to being competitive with Nvidia with Dojo.”
A September 2023 report from Morgan Stanley predicted that Dojo could add $500 billion to Tesla’s market value by unlocking new revenue streams in the form of robotaxis and software services.
In short, Dojo’s chips are an insurance policy for the automaker, but one that could pay dividends.
How far along is Dojo?
![GettyImages 524212924](https://techcrunch.com/wp-content/uploads/2024/08/GettyImages-524212924.jpg?w=680)
Reuters reported last year that Tesla began production on Dojo in July 2023, but a June 2023 post from Musk suggested that Dojo had been “online and running useful tasks for a few months.”
Around the same time, Tesla said it expected Dojo to be one of the top five most powerful supercomputers by February 2024, a feat that has yet to be publicly disclosed, leaving us doubtful that it has occurred.
The company also said it expects Dojo’s total compute to reach 100 exaflops in October 2024. (One exaflops is equal to 1 quintillion computer operations per second. To reach 100 exaflops, and assuming that one D1 can achieve 362 teraflops, Tesla would need more than 276,000 D1s, or around 320,500 Nvidia A100 GPUs.)
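The chip counts in that parenthetical follow from simple division, sketched below in Python. The 362 teraflops figure for the D1 is the one given above; the 312 teraflops figure for the A100 is an assumption on our part (it matches Nvidia's published peak for dense BF16/FP16 tensor operations and reproduces the article's estimate). Both are peak numbers, and the function name is illustrative.

```python
import math

TARGET_EXAFLOPS = 100
D1_TERAFLOPS = 362      # per-chip figure cited in the article
A100_TERAFLOPS = 312    # assumption: Nvidia's peak dense BF16/FP16 tensor rate


def chips_needed(target_exaflops, chip_teraflops):
    """Minimum whole number of chips whose combined peak meets the target."""
    target_teraflops = target_exaflops * 1_000_000  # 1 exaflops = 1,000,000 teraflops
    return math.ceil(target_teraflops / chip_teraflops)


d1_count = chips_needed(TARGET_EXAFLOPS, D1_TERAFLOPS)
a100_count = chips_needed(TARGET_EXAFLOPS, A100_TERAFLOPS)
print(f"D1 chips needed: {d1_count:,}")
print(f"A100 GPUs needed: {a100_count:,}")
```

Real training clusters never sustain peak throughput, so the actual number of chips required for a given amount of useful work would be higher than either count.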
Tesla also pledged in January 2024 to spend $500 million to build a Dojo supercomputer at its gigafactory in Buffalo, New York.
In May 2024, Musk noted that the rear portion of Tesla’s Austin gigafactory will be reserved for a “super dense, water-cooled supercomputer cluster.” Now we know that it’s actually Cortex, not Dojo, that’s taking up that space in Austin.
Just after Tesla’s second-quarter earnings call, Musk posted on X that the automaker’s AI team is using Tesla’s HW4 AI computer (renamed AI4), which is the hardware that lives on Tesla vehicles, in the training loop with Nvidia GPUs. He noted that the breakdown is roughly 90,000 Nvidia H100s plus 40,000 AI4 computers.
“And Dojo 1 will have roughly 8k H100-equivalent of training online by end of year,” he continued. “Not massive, but not trivial either.”
Tesla hasn’t provided updates as to whether it has gotten those chips online and running Dojo. During the company’s fourth-quarter 2024 earnings call, no one mentioned Dojo. However, Tesla said it completed the deployment of Cortex in Q4, and that it was Cortex that helped enable V13 of supervised FSD.
This story originally published August 3, 2024, and we will update it as new information develops.