After the 8155, where does the intelligent cockpit go? Baidu's Li Tao: the car agent

The screen, perhaps one of the earliest "electronic devices" in cars, was also the first in-vehicle device to undergo intelligent transformation.

Today, when even the joint-venture brands dismissed as makers of "old fuel vehicles" are fitting the Qualcomm 8155 chip in their cars, intelligent voice recognition and continuous dialogue are no longer new technologies. When AI voice assistants are asked to draw a picture or plan a trip at new-car launch events, the "involution" of the intelligent cockpit seems to have hit a bottleneck.

When the AI concept truly reaches the automobile industry, is it really as simple as drawing pictures and reciting an encyclopedia? And when the spec sheet can no longer show the differences between brands' intelligent-cockpit capabilities, where should carmakers go next?

"In the future, we need a new era cockpit that can know the user context, understand what you want at this moment, and automatically generate a global implementation plan, which is also the ultimate direction of the overall evolution of the intelligent cockpit." Li Tao, general manager of Baidu Intelligent Cabin Business Department, said this at the Global Smart Car Industry Conference (2024GIV) recently.

The integration and collision of large AI models with automotive intelligence is a "historical inevitability". A smart cockpit that understands what you need and what you are thinking may still sound like something out of the Arabian Nights, but realizing it ultimately depends on strong capabilities in understanding, memory, logic, and generation, which are precisely the strengths of large AI models.

Meanwhile, Baidu, which has both a large-language-model product in ERNIE Bot and intelligent vehicle products, has stepped squarely onto this wave.

Because realizing the intelligent experience described above requires full-sensory fusion, global planning, and global execution, capabilities that Apollo has.


What does the vehicle agent look like in the AI era?


The concept of "fusion perception" is nothing new; in the field of intelligent driving, at least, it has long been a staple of OEM marketing. Generalized to the whole vehicle, it likewise comprises three dimensions: the person, the car, and the world.

Li Tao said: "AI defines the car, and high-level cognition is ultimately reflected in personalized service for users; knowing the 'person' is the starting point of personalization. If our intelligent terminal devices cannot even recognize people, there is no such thing as personalized service."

In the traditional automobile era, a car's hardware could not support personalized service. But from multi-zone climate control to today's multi-zone voice recognition, meeting the personalized needs of every occupant in the cockpit now has hardware support, which highlights the value of "knowing people".

Li Tao cited a very common scenario: the same airflow and temperature feel very different to men and women. An elderly passenger may find technology products unfamiliar and intimidating, and, given the reserved communication style of the older generation, may find it hard to say directly that he is uncomfortable. Likewise, a baby who cannot yet talk and has fallen asleep in the child seat can easily catch a cold.

"An automated agent that can sense the status of the crew in the cabin and automatically provide personalized and scene-based auxiliary services is coming to the fore." Li Tao preached.

Similarly, for drivers, the vehicle's own state differs across natural environments and locations, such as high-altitude regions, hot climates, rain and snow, and paved versus unpaved roads, each calling for different driving behavior and vehicle settings. In the past, this tested what we call "driving experience"; with large AI models, the system can take the human's place as a vehicle expert, intelligently understanding the scenario and helping the user operate the vehicle.

Of course, at this stage such a scenario may seem somewhat "impossible". But rewind five years, and not many people would have believed that a car could provide automated assisted driving on urban roads.

Still, one important scenario that AI can solve is already landing. Today, triggering most in-vehicle applications requires a voice command or a finger tap, and "say what you see" capabilities are mostly literal one-to-one command mappings. The ability to recognize natural-language intent, however, is coming soon. Li Tao said: "Given the instruction 'Go to Huicai Pavilion after today's meeting', the system can calculate the corresponding arrival time and mobilize all related applications, including parking and reservations, deeply understanding the user's needs and matching them globally."
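As a rough illustration of that kind of global matching (this is a minimal sketch, not Baidu's implementation; the intent schema, travel-time estimate, and app actions below are all hypothetical), a car-side agent might decompose one natural-language instruction into coordinated sub-tasks:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical structured intent, as a large model might parse it from
# "Go to Huicai Pavilion after today's meeting".
@dataclass
class TripIntent:
    destination: str
    depart_after: datetime  # when the user's last meeting ends

def plan_trip(intent: TripIntent) -> list[str]:
    """Decompose one intent into coordinated in-vehicle app actions."""
    drive_minutes = 35  # placeholder; a real system would query a map service
    eta = intent.depart_after + timedelta(minutes=drive_minutes)
    return [
        f"navigation: route to {intent.destination}, ETA {eta:%H:%M}",
        f"parking: reserve a spot near {intent.destination} from {eta:%H:%M}",
        f"reservation: book {intent.destination} for {eta:%H:%M}",
    ]

if __name__ == "__main__":
    meeting_ends = datetime(2024, 7, 1, 17, 30)
    for step in plan_trip(TripIntent("Huicai Pavilion", meeting_ends)):
        print(step)
```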


Baidu, amplifying "fusion capability" in the era of large AI models


Another meaning and value of "fusion perception" is helping people overcome the limits of natural perception. A pair of eyes and two ears can take in only so much information, and most serious road accidents stem from gaps in perception, such as "ghost probe" pedestrians darting out of blind spots or collisions while reversing.

Baidu can fuse the cabin and driving sensors to give users a wider perceptual field and issue timely, necessary reminders in the cockpit. For example, the latest Baidu Maps V20 adds a "large vehicle approaching" reminder, which can prevent the serious accidents caused by a driver's hasty maneuver made with poor visibility. Whereas the map used to trigger purely on geographic position, such as "approaching a ramp", Baidu now fuses in more perception capabilities, making the reminder more accurate and effective.
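A toy sketch of that difference, assuming purely invented sensor fields and thresholds (the article does not describe Baidu's actual trigger logic):

```python
from dataclasses import dataclass

@dataclass
class Context:
    dist_to_ramp_m: float          # from map matching
    rear_truck_dist_m: float       # hypothetical fused radar/vision estimate
    rear_truck_closing_mps: float  # closing speed of the vehicle behind

def legacy_trigger(ctx: Context) -> bool:
    # Old behavior: remind purely on geographic position.
    return ctx.dist_to_ramp_m < 500

def fused_trigger(ctx: Context) -> bool:
    # Fused behavior: remind only when a large vehicle is actually
    # closing in behind, so the alert fires when it matters.
    truck_threat = ctx.rear_truck_dist_m < 40 and ctx.rear_truck_closing_mps > 2
    return legacy_trigger(ctx) and truck_threat

print(fused_trigger(Context(300, 30, 3)))   # True: ramp ahead, truck closing
print(fused_trigger(Context(300, 120, 0)))  # False: ramp ahead, no threat
```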

Another example is the very common rear-end collision, or crashing into a broken-down vehicle or an obstacle ahead. Partly this stems from the driver's inattention; partly, humans are objectively unable to observe scenes far down the road. "We are also thinking about and exploring whether there is a technological means to open a window of survival for car owners in the moments before a natural disaster or an accident," Li Tao said.

In fact, such capabilities are already available through technical means.

Li Tao said that highway information-monitoring and slope-monitoring systems can perceive hazards and issue warnings through vehicle-road-cloud collaboration, while linking the seats, seat belts, and audio-visual systems in the car to deliver a comprehensive alert, warning of danger ahead and suggesting that the driver pull over or detour.
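A minimal sketch of how one cloud-side hazard alert might fan out to several in-cabin channels (the alert schema and actuator actions are assumptions, not a documented Baidu interface):

```python
from dataclasses import dataclass

# Hypothetical hazard alert delivered over a vehicle-road-cloud link;
# the fields are invented for illustration.
@dataclass
class HazardAlert:
    kind: str          # e.g. "landslide", "stalled_vehicle"
    distance_m: float  # distance to the hazard ahead
    advice: str        # "pull over" or "detour"

def dispatch_warning(alert: HazardAlert) -> list[str]:
    """Fan one cloud alert out to several in-cabin channels."""
    actions = [
        f"audio: '{alert.kind} {alert.distance_m:.0f} m ahead, {alert.advice}'",
        "display: show hazard icon and a suggested route",
    ]
    if alert.distance_m < 500:
        # Escalate to haptic channels as the hazard gets close.
        actions += ["seat: vibrate driver cushion",
                    "seatbelt: brief pretension pulse"]
    return actions

for act in dispatch_warning(HazardAlert("landslide", 350, "pull over")):
    print(act)
```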

In fact, compared with the past, "decision-making" has become a crucial link in globally perceived, fused intelligent scenarios.

Baidu, for its part, builds and understands intelligent scenarios on top of the ERNIE large model, combining expert models with a device-cloud split: the on-device side handles performance and privacy compliance, while the cloud provides the inference compute for super-intelligent, complex scenario tasks and then sends results down to the car to drive coordinated execution.
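A minimal sketch of such a device-cloud split (the routing rules and function names are assumptions, not Baidu's actual design): privacy-sensitive or simple requests stay on-device, while heavy reasoning goes to the cloud.

```python
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    contains_private_data: bool  # e.g. cabin camera, personal calendar
    complexity: int              # rough task-difficulty score, 1..10

def run_on_device(req: Request) -> str:
    # Small local model: low latency, private data never leaves the car.
    return f"[device] handled: {req.text}"

def run_in_cloud(req: Request) -> str:
    # Large cloud model: heavy inference for complex scenario tasks.
    return f"[cloud] handled: {req.text}"

def route(req: Request) -> str:
    """Route by privacy and complexity, mirroring the split described above."""
    if req.contains_private_data or req.complexity <= 3:
        return run_on_device(req)
    return run_in_cloud(req)

print(route(Request("adjust rear-seat temperature", True, 2)))
print(route(Request("plan a three-stop weekend trip", False, 8)))
```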

This not only improves the user experience but also greatly reduces OEMs' ongoing investment in scenario customization.

At present, Baidu's intelligent products, such as its dual-modality image-and-voice recognition technology, can already be experienced on models such as the new Lu Zun PHEV, achieving a high recognition rate even with the windows open at 90 km/h. The sixth-generation Apollo Go autonomous vehicle will soon carry a super cockpit agent that, driven by large models, can greet occupants automatically, control the vehicle automatically, and care for different passengers.

Perhaps unlike the "great leap" we imagined, putting large AI models in the car will bring a continuous evolution of the experience. Already, in the "Wen Xiaoyan" app, you can feel the silky experience that a large-model-driven digital human brings to users: from mouth shape, face, and hair to clothing, it achieves a hyper-realistic effect, providing companionship and service throughout the drive.


The big-company perspective


The arrival of large AI models may overturn our previous understanding and imagination of the smart cockpit. As Li Tao said, the smart cockpit will evolve into a car-side agent: "In the past, everyone grafted capabilities onto the cockpit. That mindset may have to change; designing forward from the large model is a fundamental shift."

Perhaps this can also answer the industry's current questions.

In the internal-combustion era, automakers could compete on engines with better performance and lower fuel consumption, or on the string of new technologies that turned the carriage into the car. The development of smart electric vehicles, by contrast, now seems trapped in the innovation dilemma of "refrigerator, color TV, big sofa".

Some say the development of smart electric vehicles will go the same way, and perhaps at this moment even our old habit of comparing spec sheets on paper is failing. But we should also see that once AI and the car are deeply integrated, the imaginative space available to the car may leap to a new level, and the experience gap created by intelligence will widen again.

As many speakers at the Global Smart Car Industry Conference (2024GIV) put it, companies with stronger AI capabilities will hold the tickets to future competition.