Baidu's Wenxin Yiyan makes its debut! It can't match ChatGPT yet, but don't be disappointed

AI is really all the rage these days.

OpenAI drew attention on the strength of ChatGPT, and Google followed with Bard. Just yesterday, the GPT-4 model stole the limelight again. In the AI field, no sooner has one player left the stage than the next steps on. Today, it is Baidu's turn.

Just this afternoon, the highly anticipated Baidu Wenxin Yiyan arrived as scheduled. At the start of the press conference, however, Baidu CEO Robin Li tempered everyone's expectations:

In a sense, Baidu has been preparing for this (the release of Wenxin Yiyan) for many years. We started investing in AI research more than ten years ago and launched the Wenxin large language model in 2019. Today's Wenxin Yiyan is a continuation of those many years of effort.

But it cannot be said that we are completely ready. Wenxin Yiyan is benchmarked against ChatGPT, and even GPT-4, and that bar is very high. No other major global tech company has released such a product yet, and Baidu is the first. In my own testing, I feel there are still many imperfections.

How did the belated Wenxin Yiyan perform? How large is the gap with ChatGPT? Can it meet the market's demand for a large Chinese language model? We took a comprehensive look.

We will also release the website for internal testing

What can a new generation of large language models and generative AI products do?

Five key points:

  • Creative writing;
  • Creation of business copywriting;
  • Mathematical and logical calculations;
  • Chinese understanding;
  • Multimodal generation.

Robin Li demonstrated each of these five scenarios in turn. It is worth noting that the demonstrations were recordings, not live operations.

The first is Wenxin Yiyan's literary creation ability. Robin Li asked it a series of questions about "The Three-Body Problem".

First, judging from the demonstration video, Wenxin Yiyan answers very quickly, much faster than ChatGPT, and the content it produces is quite good. Later, we will put the same questions to ChatGPT and Bing Chat and look at the differences among the three.

▲The animation is not accelerated

On commercial copywriting, Robin Li asked Wenxin Yiyan a question:

If you want to set up a technology service company that uses large models to serve the digital upgrade of small and medium-sized enterprises, what company name can you choose?

Here is its Q&A.

I have to say, it's pretty decent.

It can also generate press releases. AI is used from start to finish, which fits the positioning of this hypothetical company.

In the mathematical and logical reasoning segment, Baidu posed a chickens-and-rabbits-in-the-same-cage problem. Wenxin Yiyan first spotted that the question itself was flawed; once the question was corrected, it answered accurately.
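
For readers unfamiliar with the puzzle: you are told how many heads and how many feet are in the cage and must work out how many chickens (two feet) and rabbits (four feet) there are. The article does not give the numbers used in Baidu's demo, so the figures below (35 heads, 94 feet) are the classic textbook version, and the function name is our own illustration. A minimal Python sketch, which also shows how a flawed question can be detected:

```python
from typing import Optional, Tuple


def solve_chickens_and_rabbits(heads: int, feet: int) -> Optional[Tuple[int, int]]:
    """Return (chickens, rabbits), or None if the puzzle has no valid answer."""
    # chickens + rabbits = heads and 2*chickens + 4*rabbits = feet,
    # so the feet left over after giving every animal two belong to rabbits.
    extra_feet = feet - 2 * heads
    if heads < 0 or extra_feet < 0 or extra_feet % 2 != 0:
        return None  # ill-posed: negative or fractional animals
    rabbits = extra_feet // 2
    chickens = heads - rabbits
    if chickens < 0:
        return None  # ill-posed: more rabbits than heads
    return chickens, rabbits


# Hypothetical inputs, not the ones from the press conference.
print(solve_chickens_and_rabbits(35, 94))  # (23, 12)
print(solve_chickens_and_rabbits(35, 95))  # None: an odd number of feet is impossible
```

The second call is the kind of case a model needs to catch before attempting an answer: the question has no solution as stated.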

Before Wenxin Yiyan was released, some people had guessed that this language model would handle Chinese better than those of OpenAI, Google, and Microsoft, so Baidu made a point of showing off its skills here. Wenxin Yiyan not only accurately explained the meaning of the idiom "Luoyang Zhigui" (paper becomes expensive in Luoyang), but also explained the economic phenomenon behind it, and finally wrote an acrostic poem using the idiom.

Robin Li also mentioned at the event that Chinese is Wenxin Yiyan's strength and, conversely, that analyzing English material is its weakness.

Finally came Wenxin Yiyan's multimodal generation ability. Drawing pictures, writing long texts, and generating short videos from text were all completed one by one. This is an ability that ChatGPT does not have.

▲It can also generate speech in dialects

Robin Li also mentioned that Baijiahao is already using Wenxin Yiyan's multimodal generation capabilities to convert text content into videos.

After the event, the first batch of users can try Wenxin Yiyan using invitation codes for the internal test provided by Baidu. We have submitted an application for the internal test and will bring you a hands-on report as soon as possible.

Before that, we fed the scenarios demonstrated at the event to ChatGPT (version 3.5) and Bing Chat to see what they would output.

How does Wenxin Yiyan compare with ChatGPT and Bing Chat?

During the demonstration, Robin Li repeatedly emphasized that Baidu is in a unique position when it comes to processing the Chinese language.

Compared with ChatGPT and Bing Chat, the biggest difference right now is multimodal generation: posters, speech, and even video content can be generated from a language prompt.

At the press conference, Robin Li demonstrated using Wenxin Yiyan to generate event posters, dialect speech, and event-related videos based on the content of the questions. However, generating video is relatively costly, and it is not yet open to all users at this stage.

The ability to generate pictures and videos really caught our eye. Robin Li also said, "Multimodal generative AI is a clear development trend."

Beyond this feature, we were also curious about how its other capabilities compare with ChatGPT and Bing Chat, so we put the questions demonstrated at the press conference to ChatGPT (version 3.5) and Bing Chat. The conclusion first: Wenxin Yiyan's performance in Chinese is indeed better than that of its two predecessors, ChatGPT and Bing Chat.

First, the questions about "The Three-Body Problem". Both Bing Chat and Wenxin Yiyan correctly answered who the author is and where he is from, while ChatGPT wrongly gave Liu Cixin's hometown as Shandong.

Interestingly, the source of information for Bing Chat is Baidu Baike.

As for the cast of the TV series "Three-Body Problem", which aired in early 2023, ChatGPT, whose knowledge is stuck in 2021, stumbled again, saying that the series had not yet started filming, while Bing Chat found the answer on Douban.

On business copywriting, all three offered suggestions. ChatGPT also thoughtfully attached an English name, which would be convenient for entering the international market.

However, Bing Chat misunderstood the question on the first attempt: instead of giving me specific company names, it offered advice on how to choose one.

As for which of the three names is better, I leave it to everyone to judge.

Neither ChatGPT nor Bing Chat gives us complete peace of mind when doing math problems, but the chickens-and-rabbits problem from Baidu's press conference did not trouble them, and both answered it accurately.

Of the three, I prefer Bing Chat's explanation, which reads like a patient teacher guiding you step by step, whereas Wenxin Yiyan's answer is a bit like the answer key at the back of a textbook.

In Chinese comprehension, Wenxin Yiyan's advantages show.

When I asked "How expensive was paper in Luoyang at that time", ChatGPT mistakenly thought I was asking about prices in the Tang Dynasty and told me that paper in Luoyang was not expensive at all. Bing Chat understood the question correctly but did not give precise figures.

Wenxin Yiyan's figure of two to three thousand wen is at least consistent with the data I found by searching.

I believe you have also noticed that, leaving aside the quality of the writing, neither ChatGPT nor Bing Chat understands what an acrostic poem is. By comparison, Baidu Wenxin Yiyan's performance is indeed outstanding.

Of course, such a comparison is unfair to ChatGPT and Bing Chat. After all, we have not yet tried Wenxin Yiyan ourselves and are only comparing against the presentation at the press conference. Once we get access to the test, we will try Wenxin Yiyan at the first opportunity and see how it performs then.

Robin Li also mentioned at the press conference that although Wenxin Yiyan has an obvious advantage in Chinese, it has not been trained enough on English and on coding scenarios, and its performance there is not yet good enough. I believe Baidu will improve rapidly in the future.

Keep your feet on the ground and look up at the stars

There is no doubt that the release of Wenxin Yiyan is a landmark event for the Chinese Internet.

As Robin Li said at the beginning, Baidu is the first major tech company to produce a product that can be benchmarked against ChatGPT, taking Chinese large-model generative AI products from zero to one.

On the other hand, we also need to take a clear-eyed view of the gap between Wenxin Yiyan and ChatGPT.

What we call ChatGPT today, or the GPT-4 language model behind it, took 5 years and 4 iterations to go from quantitative change to qualitative change. It is almost impossible for Wenxin Yiyan to catch up in such a short period of time.

Judging from today's press conference, Wenxin Yiyan is not the revolutionary product people expected. It is more like a midterm exam for Baidu's AI technology reserves, showing that Baidu also has the ability to pursue research and development of the most advanced artificial intelligence products.

After opening to the public, Wenxin Yiyan can learn and improve through a large number of search requests from users, improving the accuracy and speed with which it handles such questions. With ChatGPT, we have already seen how fast AI language models evolve.

If you are also looking forward to a real "Chinese version of ChatGPT", you might as well give Wenxin Yiyan some time and patience. As the saying goes, after three days apart you should look at someone with new eyes, and that is especially true of AI models.

