This is all nets baby. Nothing but nets.
In August last year, Tesla CEO Elon Musk personally demonstrated the then-unreleased FSD Beta v12 on his own Model S.
What’s special about FSD Beta v12 is that it is the world’s first end-to-end autonomous driving system based entirely on neural networks. In other words, this is truly “driven by AI.”
By the standards of today's livestreams, it was hardly a successful broadcast: the picture quality was worse than that of a smartphone from ten years ago, the image frequently rotated and jittered, and the host's production skills left much to be desired. But the premise, "AI drives Musk to Zuckerberg's house," proved genuinely compelling, drawing nearly 12 million viewers online.
More importantly, during the 45-minute livestream, FSD Beta v12 required intervention only once. The rest of the time, it drove much like an ordinary human.
As soon as the livestream started, Musk encountered an unusual road condition: a temporary diversion due to construction. But FSD did not hesitate at all, passing through at a decent speed, even though, as Musk noted at the time, the system had "never seen such a road surface."
Throughout the livestream, FSD v12 seemed to perform well, but many viewers spotted a problem: the roads of Palo Alto, in the heart of Silicon Valley, were simply too easy.
In Palo Alto, there are no pedestrians jaywalking at every turn, no motorcycles or bicycles darting out of blind spots. Even American netizens accustomed to wide, open roads said it was time to turn up the difficulty; roads like these are simply not a real test for FSD v12.
But now that FSD Beta v12 has been officially pushed to North American users, we can catch a glimpse of its true capabilities from the videos of overseas bloggers.
V12: the dividing line between two eras of smart driving
YouTube user Whole Mars Catalog was one of the first bloggers to receive the test version, and he has been testing the capabilities of Tesla FSD since 2020.
Judging from his video, FSD Beta v12 handles the perennially difficult scenario of driving on a rainy night remarkably well.
▲ Driving on a narrow road on a rainy night; the video has not been sped up
Daytime driving is no problem either. The car turns into the side road and, on reaching its destination, pulls over to park rather than stopping in the middle of the road as earlier versions did.
Compared with FSD Beta v11, v12 navigates around obstacles noticeably faster. Faced with the same car parked in the roadway, v12's maneuver around it is fully comparable to a human driver's, while v11 got "stuck" in the middle of the road, and the driver had to press the accelerator to help the vehicle through.
▲ Model S upgraded to FSD Beta V12
▲ Model Y still equipped with FSD Beta V11
In addition, Whole Mars Catalog believes that FSD Beta v12's unprotected left turns are significantly improved over previous versions, and its traffic-light recognition is stronger.
In the livestream five months earlier, the only time Musk had to take over was due to a traffic-light recognition error: mid-stream, the Model S mistook a green left-turn arrow at an intersection for a green light to go straight, started moving, and had to be stopped by Musk.
▲ The misread traffic light
Musk smiled awkwardly and said he would have FSD watch more "traffic-light videos" to fix the problem. That's right: unlike previous versions, FSD Beta V12 grows not through lines of code, but through video.
Tesla noted in the release notes of FSD Beta V12: “FSD Beta V12 upgrades the city street driving stack into a single end-to-end neural network, trained on millions of video clips, replacing more than 300,000 lines of explicit C++ code.”
A so-called end-to-end solution places the entire "perception, decision, control" process inside a single unified framework and trains it with deep learning, instead of decomposing it in the traditional way into separate modules for perception, localization, path planning, and control, where each upstream module outputs results that guide the module below it.
In a modular pipeline, the highly abstracted output each module passes downstream may contain errors, and a downstream prediction module cannot repair them, or can do so only with heavy post-processing and extra judgment logic, often to poor effect. Each module also requires its own dataset, making labeling costly, and deploying the modules separately demands more computing power.
Chen Li, a researcher on the Shanghai AI Lab Pujia OpenDriveLab team, previously told China Business News that the decision and control stages of the modular solution are still dominated by expert rules, hand-tuned through expert systems, and generalize weakly.
It's like a student who listens carefully in class but never thinks divergently: he knows everything the teacher taught, yet falters the moment he meets something the teacher didn't cover. These are two fundamentally different methods. One hands you the correct answer to follow; the other teaches you how to solve problems, so you can reason your way through new ones.
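The contrast between the two approaches can be sketched in a few lines of toy Python. Everything below is invented for illustration (the function names, network size, and "dashcam" data are not Tesla's): the modular path chains hand-written rules between stages, while the end-to-end path is a single small network that maps pixels directly to a control command by imitating recorded human actions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Modular pipeline (pre-v12 style): explicit rules between stages ---
def perceive(frame):
    # Toy "perception": treat a bright first pixel as an obstacle.
    return {"obstacle": frame[0] > 0.5}

def plan(world):
    # Toy "planning": an explicit rule, the kind once written in C++.
    return "brake" if world["obstacle"] else "cruise"

def control(decision):
    # Toy "control": map the symbolic decision to an accelerator value.
    return 0.0 if decision == "brake" else 0.5

def modular_drive(frame):
    return control(plan(perceive(frame)))

# --- End-to-end (v12 style, radically simplified): one network, pixels
# in, accelerator command out, trained by imitating human driving. ---
class TinyEndToEnd:
    def __init__(self, n_pixels, hidden=16):
        self.w1 = rng.normal(0, 0.1, (n_pixels, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, x):
        h = np.tanh(x @ self.w1 + self.b1)
        return h @ self.w2 + self.b2

    def train_step(self, x, target, lr=0.1):
        # One gradient-descent step on squared error vs. the human action.
        h = np.tanh(x @ self.w1 + self.b1)
        err = (h @ self.w2 + self.b2) - target
        dh = (err @ self.w2.T) * (1 - h**2)
        self.w2 -= lr * h.T @ err / len(x)
        self.b2 -= lr * err.mean(axis=0)
        self.w1 -= lr * x.T @ dh / len(x)
        self.b1 -= lr * dh.mean(axis=0)

# Stand-in for "millions of video clips": random frames where a bright
# first pixel means an obstacle, and the human driver brakes for it.
frames = rng.random((256, 64))
human_accel = np.where(frames[:, 0] > 0.5, 0.0, 0.5).reshape(-1, 1)

net = TinyEndToEnd(n_pixels=64)
for _ in range(1000):
    net.train_step(frames, human_accel)
```

After training, the network brakes for "obstacle" frames without anyone having written the braking rule down, which is the crux of the end-to-end claim, while the modular version only knows what its hand-coded rules cover.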
FSD must be a "good student"
The reason FSD Beta v12 has attracted so much attention, and the reason Musk launched a livestream to promote it, is ultimately that it changes how intelligent driving is achieved.
In the driving videos FSD studies, whenever there is a red light ahead, everyone stops behind the white line.
From that, FSD learned the rule "stop on red, go on green." This is the result of FSD's own learning, not a standard answer handed to it by humans. This is the neural network, or, to use the more popular term, AI.
FSD learns autonomous driving by studying the driving behavior of huge numbers of real drivers, much like a novice becoming a veteran: the more you drive, the more of the world you see, the more experience you accumulate, and the better a driver you become. Unlike a human, though, FSD can devour massive amounts of footage and learn from all of it, far more efficiently than a commuter who only drives to and from work.
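The red-light example can be reduced to a toy sketch of this kind of rule induction. Nowhere below is "go on green" written as code; a one-parameter logistic "policy" recovers the rule purely from observed (light, action) pairs. The data, model, and learning rate here are all invented for illustration, not anything from Tesla's stack.

```python
import numpy as np

rng = np.random.default_rng(42)

RED, GREEN = 0.0, 1.0
# Stand-in for fleet video: 1000 observed intersections, where human
# drivers go only when the light is green.
lights = rng.integers(0, 2, size=1000).astype(float)  # 0 = red, 1 = green
actions = lights.copy()                               # 1 = go, 0 = stop

# Logistic policy p(go) = sigmoid(w * light + b), fit by gradient descent
# on the cross-entropy between predicted and observed actions.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * lights + b)))
    w -= 1.0 * ((p - actions) * lights).mean()
    b -= 1.0 * (p - actions).mean()

def p_go(light):
    """Learned probability of proceeding, given the light color."""
    return 1.0 / (1.0 + np.exp(-(w * light + b)))
```

After fitting, `p_go(GREEN)` is close to 1 and `p_go(RED)` close to 0: the policy has inferred "stop on red, go on green" from the demonstrations alone.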
But neural networks are not perfect.
Think about it: when you were growing up, did you meet bad influences who might have "led you astray"? FSD, too, will see plenty of bad driving habits "demonstrated" by unruly human drivers.
During Musk's livestream, the engineer in the passenger seat mentioned that in the United States only about 0.5% of drivers come to a complete stop at a stop sign to look around; the vast majority roll slowly through. Regulators, however, require a smart driving system to stop completely at the sign. So Tesla needs to specifically "teach" FSD, increasing the weight of the clips that "demonstrate" the correct maneuver so that it learns the good habit.
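That reweighting trick is easy to see in miniature. In the toy sketch below (all numbers except the 0.5% figure from the livestream are invented), plain imitation of the fleet regresses toward the majority's rolling stop, while upweighting the rare correct demonstrations flips what the learner imitates:

```python
import numpy as np

# 1000 toy "stop-sign clips": action 1 = full stop, 0 = rolling through.
# Only 0.5% of human drivers stop fully, per the livestream figure.
actions = np.array([1] * 5 + [0] * 995, dtype=float)

# Plain behavioral cloning averages the fleet and copies the bad habit:
unweighted_policy = actions.mean()  # 0.005: almost never a full stop

# Upweight the correct demonstrations (500x here, an arbitrary choice)
# before averaging, so the minority behavior dominates the target:
weights = np.where(actions == 1, 500.0, 1.0)
weighted_policy = float((weights * actions).sum() / weights.sum())
```

In a real training pipeline the same idea shows up as per-sample weights in the loss rather than a weighted mean, but the effect is the same: the 0.5% of "good students" teach the class.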
In FSD Beta v12, the system accurately identifies the stop sign at each intersection, stops to observe, and proceeds promptly once conditions allow. Earlier versions might hesitate for a long time over pedestrians or cyclists at the roadside.
However, Whole Mars Catalog also said that the current FSD is still imperfect and "is not ready to be launched to everyone." At some forked intersections, for example, the steering wheel wanders left and right, unable to make up its mind; at some fairly empty intersections, the vehicle occasionally stops for a long time, overly cautious.
▲ The vehicle stopped at this intersection for 15 seconds
Musk previously said that FSD v12 would take off its beta hat and drop the "Beta" suffix, but the version currently pushed to users still carries the "Beta" label. All one can say is that FSD still has a lot to learn.
Another issue worth considering is cost.
Musk has mentioned that Tesla invests up to US$2 billion in FSD every year, an undeniably money-burning business. For other car companies still struggling to turn a profit, whether they can afford this is an unavoidable question; training the model alone costs an astronomical sum.
Training FSD Beta V12 on video required as many as 15,000 NVIDIA H100 GPUs, enough to put Tesla in the top 12 of NVIDIA's purchase list for the third quarter of 2023. Although Tesla unveiled its own Dojo supercomputer in 2021 and brought it into volume production in 2023, it still needs NVIDIA; only a small portion of training runs on Dojo.
NVIDIA's "large customer list" also features Chinese brands. To meet Tesla's challenge, the domestic autonomous driving industry chain has likewise been deploying end-to-end solutions, including simulation testing for end-to-end model training.
On the same list, Baidu, a longtime player in smart driving and AI, ranked 8th with a total of 30,000 NVIDIA H100s, followed by Alibaba with 25,000 GPUs. It is worth noting that in August 2022, Xpeng announced it would jointly build an intelligent computing center with Alibaba, with 600 PFLOPS (petaflops) of computing power.
NIO is more focused on vehicle-side computing power: its smart driving system currently uses four NVIDIA Orin chips, which puts its onboard computing power above 1,000 TOPS.
As large models continue to develop, intelligent driving's demand for computing power will only grow further; the claim that "computing power is useless" will prove to be empty talk.
Welcome to follow aifaner's official WeChat public account: aifaner (WeChat ID: ifanr). More exciting content will be brought to you as soon as possible.