From now on, everyone will have a “Van Gogh” on their phone
How long do you think it takes to make an episode like this?
In the traditional animation industry, animation production is often the most time-consuming stage: animators must produce the animation for every shot according to the storyboard script and art design.
From designing characters and drawing scenes to producing the final animation, a production team often needs several months or even a year.
But recently, this rule of the industry has begun to be rewritten by AI tools.
The animated short above consists of 120 visual-effects shots and runs 7 minutes in total, yet its production team, Corridor, spent only a few hours on the animation.
The secret to this efficiency lies in the tool they used: Stable Diffusion.
Creativity is once again unleashed
As we all know, every moving shot in an animation is made of consecutive frames drawn one by one by artists, and a single episode is built on thousands of drawings.
Drawing the animation frame by frame by hand was practically impossible for Corridor, so they thought of another way: live-action footage is essentially a sequence of photographs. If each photo is converted into an animation style and the frames are stitched back together, isn't the result an animation?
To convert photos into animation-style images in batches, Corridor first thought of the most popular AI drawing tool: Stable Diffusion.
Compared with AI drawing tools such as DALL·E 2 and Midjourney, one advantage of Stable Diffusion is that it is an open-source project. Users can prepare a tailored training set locally, have the AI learn a specific drawing style, and then generate images in that style in batches.
Following the look they had conceived in advance, Corridor had Stable Diffusion learn from a large number of stills from the anime film “Vampire Hunter D: Bloodlust”, as well as photos of the two lead actors from various angles, so that the converted frames would be as accurate as possible and stylistically consistent.
After Stable Diffusion converted the entire video into an animation style, the team removed unstable frames, eliminated flickering, and finally replaced the green screen with backgrounds captured by a virtual camera. An animation that would once have taken a dozen artists several weeks to draw was complete.
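To make the workflow concrete, here is a minimal sketch of the frame-by-frame conversion step, assuming the open-source Hugging Face diffusers library. The model ID, prompt, folders, and strength value are illustrative assumptions; Corridor's actual pipeline (a checkpoint fine-tuned on their reference images, plus heavy post-processing) has not been published.

```python
# Sketch: convert live-action frames to an animation style with img2img.
# Model ID, prompt, and paths are illustrative, not Corridor's setup.
from pathlib import Path

import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a style-fine-tuned checkpoint would go here
    torch_dtype=torch.float16,
).to("cuda")

prompt = "anime style, dramatic lighting"  # hypothetical style prompt
Path("stylized").mkdir(exist_ok=True)

for frame_path in sorted(Path("frames").glob("*.png")):
    frame = Image.open(frame_path).convert("RGB").resize((512, 512))
    # A low strength keeps the original composition, so consecutive
    # frames stay coherent when stitched back into video.
    result = pipe(prompt=prompt, image=frame, strength=0.45,
                  num_inference_steps=20).images[0]
    result.save(Path("stylized") / frame_path.name)
```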
Seeing this, don't you want to put your own imagination to work, shoot an animated short of your own, or turn yourself into all kinds of fantasy heroes?
Stable Diffusion does have the advantage of high controllability, but to exercise that control you first need a powerful server-side or local computing environment to run it.
In other words, however rich your imagination, without strong natural language processing capability and AI computing power behind you, you still cannot create with Stable Diffusion.
So, is there a way for ordinary people to easily draw a unique avatar themselves?
There is, and all it takes is a mobile phone.
At the MWC conference, Qualcomm demonstrated Stable Diffusion running locally on an Android phone for the first time, and showed several AI images generated on the device. The results look quite good, and the whole process takes less than 15 seconds.
Stable Diffusion has more than 1 billion parameters, which even ordinary computers struggle to run. How did Qualcomm “stuff” such a huge model into a mobile phone and make it run smoothly on a phone SoC?
To “put the elephant in the refrigerator”, Qualcomm's engineers first optimized the elephant.
Here we must first mention a major AI improvement in the second-generation Snapdragon 8 mobile platform: natural language processing (NLP).
Natural language processing is one of the newer fields of AI application. To understand and decompose human language as quickly as possible, Qualcomm significantly upgraded the Hexagon processor and added hardware acceleration, so it can run Transformer networks faster and more efficiently while reducing power consumption through micro tile inferencing. This gives the second-generation Snapdragon 8 a distinct advantage in natural language processing use cases.
To get Stable Diffusion running on-device, Qualcomm's engineers started from the open-source FP32 version 1.5 model on Hugging Face and used the Qualcomm AI Model Efficiency Toolkit (AIMET) to apply post-training quantization, compressing the original FP32 model into the more computationally efficient INT8 format without sacrificing model accuracy.
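As a rough illustration of what post-training quantization looks like with AIMET's open-source PyTorch API (the toolkit named above), here is a minimal sketch. The tiny stand-in network, input shape, and output paths are assumptions for illustration; Qualcomm's actual Stable Diffusion quantization pipeline is considerably more involved.

```python
# Sketch: FP32 -> INT8 post-training quantization with AIMET (aimet_torch).
import torch
from aimet_torch.quantsim import QuantizationSimModel

# Tiny stand-in for a real FP32 network such as a Stable Diffusion component.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval()
dummy_input = torch.randn(1, 3, 512, 512)

# Simulate INT8 quantization for both weights and activations.
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

def calibrate(sim_model, _):
    # Run representative inputs so AIMET can compute the per-layer
    # quantization ranges ("encodings").
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# Export the quantized model and its encodings for a target runtime.
sim.export(path=".", filename_prefix="model_int8", dummy_input=dummy_input)
```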
Through its unified AI software solution, the Qualcomm AI Stack, Qualcomm can quantize and simplify AI models without losing accuracy, greatly improving AI inference performance while reducing power consumption. This makes large AI models far better suited to phones and other low-power on-device computing environments, and makes scaling AI models out to devices easier.
Through full-stack optimization of software and hardware, Stable Diffusion can finally run on the second-generation Snapdragon 8 mobile platform with its integrated Hexagon processor, performing 20 inference steps in 15 seconds, roughly 0.75 seconds per denoising step, to generate a 512×512-pixel image. That speed is already comparable to the latency of cloud computing.
In other words, large generative AI models in the cloud have taken their first step toward on-device applications.
Although you can't use Stable Diffusion to shoot a blockbuster like Corridor's, it is more than enough for drawing your own avatar or taking virtual portraits. Whether you want a self-portrait in the style of Monet, Van Gogh, or Akira Toriyama, you can enter a prompt directly on your phone and generate a unique AI work with one tap.
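For reference, that one-tap generation boils down to a single text-to-image call. The sketch below uses the diffusers library on a PC as a stand-in, with the same 20 steps and 512×512 resolution reported for Qualcomm's demo; on the phone, the model runs through Qualcomm's own AI Stack runtime rather than PyTorch.

```python
# Sketch: the text-to-image call behind a "Van Gogh avatar", using
# diffusers as a stand-in for Qualcomm's on-device runtime.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="self-portrait in the style of Van Gogh",
    num_inference_steps=20,  # the 20 steps from Qualcomm's demo
    height=512, width=512,   # the demo's 512x512 output
).images[0]
image.save("avatar.png")
```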
In the future, AI models with tens of billions of parameters may run on-device, and the intelligence of the AI assistant in your phone will take a qualitative leap. The possibilities opened up by on-device deployment of generative AI models are beyond imagination.
A natural technological explosion
When it comes to AI computing, the first thing many people picture is a rack of cloud servers; AI seems far removed from our lives.
But in fact, every time you unlock your phone, wake your voice assistant, or even press the shutter, you come into close contact with AI computing.
Because on-device AI processing, on phones and other devices, has many advantages in reliability, latency, and privacy, more and more large cloud AI models have begun to run on-device.
Today, AI computing has worked its way into every corner of our lives through on-device deployment; you can easily find AI in smartphones, tablets, XR glasses, and even cars. This is the connected intelligent edge that Qualcomm has been building, a vision it has quietly worked toward for more than a decade.
Bringing AI from the cloud to the device solves two user pain points at once: on the one hand, data processed on the device can stay on the device, properly protecting the privacy of personal data; on the other, the device can compute and respond immediately, giving users low-latency, reliable results.
Qualcomm's first-ever deployment of Stable Diffusion on an Android phone not only gives users the ability to create with AI anytime and anywhere, it also fills the future of image editing with possibility.
Stable Diffusion models encode a wealth of linguistic and visual knowledge, and adapting the model can have a tangible impact on image editing, inpainting, style transfer, and super-resolution.
Just imagine: in the future you could take Disney-style or Japanese anime-style photos and videos without an internet connection, with all the image computation done on the phone itself, having fun while keeping your data private and secure.
In Qualcomm's technical planning, this is just the beginning.
Previously, Qualcomm released a solution called the “Qualcomm AI Stack”. Simply put, a model needs to be developed only once and can then be deployed across all kinds of devices.
The research breakthroughs and technical optimizations Qualcomm made for Stable Diffusion will be integrated into the Qualcomm AI Stack. From there, they need only be extended to create models suited to platforms such as XR glasses and cars, an approach Qualcomm calls its “unified technology roadmap”.
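One common way to realize this “develop once, deploy everywhere” idea is to export a trained model to a portable interchange format such as ONNX, which each platform's toolchain can then compile for its own hardware. The sketch below shows that generic export step in PyTorch; it is an assumption for illustration, not Qualcomm's documented AI Stack workflow.

```python
# Sketch: exporting a trained PyTorch model to ONNX, a portable format
# that per-device toolchains can compile for phones, XR glasses, or cars.
# Generic illustration only; not Qualcomm's documented workflow.
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval()  # stand-in for the real network

torch.onnx.export(
    model,
    torch.randn(1, 3, 512, 512),  # example input fixes the graph shapes
    "model.onnx",
    input_names=["image"],
    output_names=["features"],
)
# Each target platform's toolchain takes model.onnx from here.
```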
Through this development route, Qualcomm can bring leading AI technologies from smartphones, such as natural language processing and facial recognition, to XR glasses, PCs, IoT devices, automobiles, and other products, ultimately creating new smart experiences for users.
Such a flexible and efficient development model would be impossible without the Qualcomm AI Engine.
The Qualcomm AI Engine comprises a GPU, a CPU, and, most critically, the Hexagon processor.
The Hexagon processor, in turn, combines scalar, vector, and tensor accelerators that share a unified memory. Qualcomm doubled the computing performance of the tensor accelerator and doubled the capacity of the shared memory, giving the new generation of the Qualcomm AI Engine a 70% improvement in energy efficiency over the previous generation.
The Qualcomm AI Engine also scales flexibly in hardware: a mobile platform is usually configured with one Hexagon processor, while automotive, cloud, and edge computing platforms can use multiple Hexagon processor instances for greater computing power.
With architectural advantages and computing performance that lead its rivals, Qualcomm has made the Qualcomm AI Engine the core of its smartphone, IoT, XR glasses, automotive, and other businesses.
In Qualcomm's vision, AI computing will keep moving toward a fully distributed model, with AI inference migrating from the cloud to devices at scale.
For example, a phone will learn its user's accent to improve voice recognition accuracy, and a car will learn different road conditions to improve its obstacle recognition rate. These are cases of AI becoming ubiquitous on devices.
Last December, Qualcomm teamed up with newly crowned Oscar winner Michelle Yeoh to describe this vision of an intelligently connected future:
Smartphones are getting smarter, powered by the Qualcomm AI Engine. They have mastered professional imaging skills, letting you shoot 8K cinema-grade footage at your fingertips; they also understand natural language, and like an assistant can proactively offer customized services such as real-time translation.
The Qualcomm AI Engine will let cars evolve into reliable drivers. In the future, a car will sense your arrival and adapt to your habits; powerful computing will bring rich features such as driving assistance, situational safety awareness, and streaming entertainment, and the car will even anticipate the road ahead to improve the drive, getting you to your destination comfortably and safely.
Wearable devices such as XR glasses will bring more immersive experiences and more intuitive interaction, letting you roam freely between the virtual and the real worlds.
All in all, we stand amid myriad possibilities, and the AI innovations around us are reshaping the world, quietly changing the way we work, live, and communicate.
At present, the smartphone is the best application platform for AI technology, but popularizing AI on smartphones is only the first step. In the future, AI will become ubiquitous, further unleashing people's productivity and creativity. Qualcomm has long been preparing for that future.