China's first hyper-realistic virtual idol has signed with Warner. Will she become a rival to traffic-driven celebrities? | Interview

Virtual, even if I am virtual.
Fictional name, indifferent fancy name.
False feelings, at least not to you.
You gotta show me, you gotta show me.


If no one told you, you might think this was sung by a real person rather than a virtual human.

But you can clearly hear the confidence and boldness in her voice, which give her the air of a new-generation idol.

She is Hajiang, China's first hyper-realistic virtual idol.

Now she has a new identity: Warner's first virtual music artist, also billed as a "metaverse virtual artist."

The lyrics at the top come from her debut solo single, "MISS WHO". The music was created by Warner Music, one of the world's leading record companies, and the voice is built on Microsoft's customizable intelligent speech synthesis technology.

Aifaner interviewed the people in charge at Microsoft and Warner, using Hajiang's music as a starting point to discuss the rise and fall of virtual idols.

Hajiang's story reflects the path by which today's virtual idols become popular, and it also offers a glimpse of our future digital lives.

How to become a virtual idol

Dazzling short blue hair, exquisite and flawless features, a tall and striking figure, trendy avant-garde outfits, and a bold personality.

Hajiang embodies everything we imagine a virtual idol to be.

Since her birth, she has also undergone several "transformations."

In 2019, Budweiser Investment Group created Hajiang, who then became the virtual spokesperson of Harbin Beer and went on to co-brand with Li Ning, PONY, Crow and other brands. Today Hajiang is not only a KOL and a skateboarding girl, but also an e-sports streamer, a public welfare ambassador, and a traffic safety ambassador…

Just as everyone was losing track of her many identities, she signed with Warner.

With the metaverse craze heating up, Warner had been looking across the industry for virtual humans worth developing. Hajiang, who suits a musical career, was selected and formally joined Warner's dance label, Whet Records, as a virtual music artist.

Zoe, the head of Warner's electronic music label Whet, told us that the commercial rights to Hajiang are now held mainly by Warner and her independent operating company, Manfu.

Next, Warner will also focus on strengthening her "hip-hop, national trend" persona.

For a music artist, the core skill is naturally singing.

Microsoft chose Hajiang in part because, as AI technology keeps evolving, machine speech has progressed from the cold, flat utterances of the early days to speaking expressively, learning, and singing.

Applying this technology to Hajiang will not only advance speech synthesis but also open up new possibilities for virtual idols.

For Microsoft, Warner, and Hajiang, this is a complementary, three-way win.

But it is not easy to become a "realistic" virtual music artist.

Liao Qinying, chief product manager of the speech group in Microsoft's Cloud Computing and Artificial Intelligence Division, told us that the AI songs people had heard before were mostly simple concatenations of phrases, with a single genre and tone. Hajiang's voice, by contrast, is modern, varies in timbre, and has a style that matches young people's tastes.

After a period of trial singing and experimentation, Hajiang has now been able to release a real single.

Ding Binggong, product director of the Cloud Computing and Artificial Intelligence Division of Microsoft Asia Pacific R&D Group, went further and explained how they built a good voice.

The first step is to define the virtual idol's persona and extract the elements of her timbre.

The second step is to train on data according to that persona. Microsoft has a powerful neural network speech base model that integrates many elements of the human voice, such as timbre, age, accent, and rhythm. The relevant capabilities are drawn out according to Hajiang's persona to train a dedicated model of Hajiang's voice.

The third step is to tune the model. It is like making a sculpture: first shape a rough form, then carve the details. The team has a complete set of tools and processes for this polishing, and what finally comes out is a polished human voice.

The neural network Chinese voice model used to create Hajiang's timbre supports 15 styles: narration, news, customer service, assistant, lyric, chat, calm, happy, sad, angry, fear, dissatisfaction, severe, coquettish, and gentle.
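To make the idea of selectable styles concrete, here is a minimal sketch of how a speaking style is typically requested in SSML through the `mstts:express-as` element used by Microsoft's neural voices. It is an illustration under stated assumptions, not the pipeline Microsoft or Warner described: the voice name `zh-CN-ExampleNeural` is a hypothetical placeholder, and the style labels are the article's translated names rather than confirmed service identifiers, which should be taken from the official documentation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# The 15 style labels exactly as listed in the article. These are translated
# names, NOT confirmed service identifiers (assumption for illustration).
ARTICLE_STYLES = [
    "narration", "news", "customer service", "assistant", "lyric",
    "chat", "calm", "happy", "sad", "angry", "fear",
    "dissatisfaction", "severe", "coquettish", "gentle",
]


def build_ssml(text, style, voice="zh-CN-ExampleNeural", lang="zh-CN"):
    """Wrap `text` in SSML that requests a single speaking style.

    `voice` defaults to a hypothetical placeholder name.
    """
    if style not in ARTICLE_STYLES:
        raise ValueError(f"unknown style: {style!r}")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<mstts:express-as style={quoteattr(style)}>'
        f'{escape(text)}'
        '</mstts:express-as></voice></speak>'
    )


# Build one request and check that the markup is well-formed XML.
ssml = build_ssml("You gotta show me", "gentle")
ET.fromstring(ssml)  # raises ParseError if the markup is malformed
```

The same text can be rendered in a different mood by changing only the `style` argument, which is the practical meaning of one voice model supporting many styles.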

The whole process involves solving many engineering problems while always ensuring quality and stability, just like an industrial assembly line.

Once the singing is stable, the question becomes how to sing.

This is where Warner's expertise comes in.

Zeng Yu, music director at Warner Music, said that they explored many angles when creating the music and produced many versions, revising them one by one.

They also had to consider how to vary the genre, how to catch the ear, how to work in melodies with an Eastern aesthetic, how to recombine national-style elements with electronic music writing, and how to express Hajiang's attitude…

Unlike recording a live singer, recording a virtual one involves a great deal of detailed work.

It is not a matter of simply recording someone's voice and splicing the takes together. It requires constant trial, adjustment, and overcoming of new difficulties.

The upside is that once the foundation is laid, it can be reused to have Hajiang sing all kinds of songs. With the first single done, having Hajiang sing pop songs in the future will be no problem.

"Looking at the comments on MISS WHO on NetEase Cloud Music, almost no one was discussing whether she sang like a robot. People were judging the quality of the song itself. I was quite surprised and very happy," said Ding Binggong.

After the first single, on November 19th, Hajiang and Russian artist MARUV collaborated on a remix.

Zoe said that during the Spring Festival, Hajiang will also sing some classic Spring Festival songs, performed as electronic music. Later, she will collaborate with well-known artists at home and abroad, including a national-style single with Xu Mengyuan.

Zeng Yu sees even more room for imagination in making music for virtual idols.

The first step is making her sound like a real person. Once audiences are no longer unfamiliar with virtual artists singing, they will stop comparing them with real people, and then it may be possible to attempt things that real people cannot do.

Virtual idol VS real idol

There are more and more virtual idols.

Yuehua Entertainment launched its first virtual idol group, A-Soul, last year. Tencent, NetEase, Kuaishou, Bilibili, and Alibaba have all moved into virtual idols. In recent months, more and more new consumer brands have begun to invite virtual idols to be their spokespersons.

The three-way combination of technology companies, artist management companies, and virtual idol companies looks set to become the norm.

But at the moment, while a few virtual idols are very popular, many others have failed in droves.

The reasons may be inadequate technology, poor content, weak operations, or excessive costs.

Zoe also said frankly: "It is still difficult to make a profit on virtual idols at this stage, but the potential is great. There will be business opportunities constantly, but the premise is to ensure that there is good content."

Virtual idols are still in the early stages of development.

▲ A-Soul

When they sing, dance, host, or model like human beings, the public always expects them to be indistinguishable from real people, or better than real people.

But what holds them back is, first of all, the level of technology.

Take music as an example: the quality required for speaking and the quality required for singing are worlds apart.

Sometimes a few blemishes actually make a voice more comfortable to listen to. What we really want to hear is a more natural sound, or in other words, a more emotional one.

Emotional interpretation has always been a hard problem in the AI world.

"I think so far, we actually don't have a good answer," Ding Binggong said. "But from the perspective of AI learning, we can use new algorithms to learn, distill, and simulate human emotions from a huge database."

He was talking about an algorithm called Neural singing that Microsoft recently researched.

Zeng Yu added that, in his view, Hajiang can now be called the benchmark for sound quality in the virtual idol industry, because her songs convey a sense of soul.

▲ Microsoft Azure artificial intelligence platform and framework diagram

In their eyes, "soul" is something that can be built into virtual idols.

  • Zeng Yu believes that the more AI learns, the more delicate its rendering becomes: it captures more details and performs every word and every high note more precisely, so the so-called expression of soul improves;
  • Ding Binggong believes that the soul is an abstract concept that is difficult to define, and that virtual idols should be perceived through multi-modal presentation, including sight and sound;
  • Zoe added that Hajiang's skills beyond music, such as skateboarding and illustration, will make the "soul" of a virtual artist more interesting.

Put simply, for virtual idols to match real idols, they need at least a high-quality combination of "technology + content".

When virtual idols are mass-produced and homogenized, they all look pretty and all look the same. We need something more individual and more human to attach our love and admiration to, just as we do with real idols.

▲ Virtual idol Yoo Yehee

Zeng Yu lamented that most virtual artists in China today, whether made by platforms or by large companies, focus on marketing, and few are really built as good products.

The industry has only just started. There is no need for everyone to rush in and fight over definitions; the point is to polish the product as well as possible. However powerful concepts like the metaverse may be, what the audience experiences is still entertainment content, and it still has to meet a quality bar. At the very least, every song and every image should be able to move some people. Over time, the power of virtual artists will take hold.

Taking the long view, he believes it will be a long time before the virtual idol industry can rival real-life idols or become more popular than them.

Virtual idols must be like real people: able to adapt to many environments and scenes, with sound and picture in sync and realistic visuals, and able to act, perform, and interact right in front of you. All of this requires a range of technical support.

"At least the entertainment industry and the technology industry will have to communicate and adapt for a long time," said Zeng Yu.

▲ Hyper-realistic virtual human AYAYI

But virtual idols also open up more creative possibilities. The team is now planning to have Hajiang sing things that real people cannot sing, such as music pitched much higher or lower, or performed much faster, than a human could manage, in a completely different style of interpretation, and to try other fresh formats.

The image of a virtual idol also need not be human at all. Whether two-dimensional, three-dimensional, realistic, hyper-realistic, or a strange creature, it can be freely imagined.

Ding Binggong said, "As with a voice, for us there is no 'best', only 'most suitable'."

▲ Disney's LinaBell

What kind of future will virtual idols take us to?

Virtual idols have unpredictable possibilities, and there is no boundary in sight.

Microsoft uses "AI for Good" to prevent technology from going beyond the boundaries. Zeng Yu said, "At least until the virtual idol learns to dress up, I don't think it will get out of control."

In his opinion, what sets a virtual idol like Hajiang apart is that she is ahead of the curve.

For now, they are making Hajiang's voice, singing style, singing attitude, and musical style unique. Once the existing voice is performing especially well, they will look for something even more distinctive on top of that benchmark.

Liao Qinying also told us that the songs Hajiang sings now are all prepared in advance. In the future, audio and video will be synchronized using Microsoft's Viseme technology, and virtual idols may be able to interact with users and create and generate content dynamically to meet various needs.

Leading a trend is itself a form of differentiation.

▲ Fox TV's singing competition "Alter Ego": performers in motion-capture suits sing from backstage while their avatars are projected onto an AR stage

These trends will enter people's daily lives.

Virtual idols themselves are extending from music, games, and film into more industries, and there will be more application scenarios in our lives in the future. The technology behind virtual idols also connects to a smarter future.

As the technical threshold keeps getting lower, the cost of making virtual idol voices will gradually fall and the degree of automation will rise.

However, Ding Binggong said that although in the future everyone may be able to create and become a virtual idol, many technological boundaries still need to be broken.

As for the virtual artist Hajiang, Microsoft will need to go deeper into her music in the future, improving the quality of the interpretation and widening the range of styles. In terms of breadth, it will also explore more "Hajiangs" with Warner, generating more virtual idols with different personas.

These can all push the boundaries of technology.

The cost and barriers of producing virtual idols need to come down and accessibility needs to improve, and a process of technical education and market cultivation is also required. When people realize that this technology can bring them meaning and value, that is when it can benefit everyone.

As for the connection between virtual idols and the metaverse, both Microsoft and Warner said they are still in the learning stage.

Before the next digital world of the future takes shape, many things are difficult to say accurately.

The music industry's transition from physical media to the Internet was not actually that smooth. Zeng Yu said that the metaverse shows people disruptive possibilities in technology, distribution, and decentralization, but what it changes in the music industry will depend on where it finds its point of entry.

Microsoft has already made some moves toward the metaverse. Its newly launched "metaverse workplace" platform, Mesh for Microsoft Teams, lets everyone become a digital person and hold immersive meetings, collaborate, and communicate in Teams.

This road of digital exploration continues.

Technology and content complement and complete each other. So when the wave of the future hits and our eyes are fixed on the word "virtual", we should not forget the meaning of "idol".

It is the combination of the two that gives the virtual idol its name.

