A day in AI painting is like a year in the human world.
DALL-E 2 and Midjourney, which became popular in the first half of the year, were completely overshadowed by Stable Diffusion in the second half.
Several of the recently popular drawing products have "diffusion" in their names, and they benefit from the "diffusion" algorithm in artificial intelligence. This algorithm pushed AI painting past the tipping point for practical use: it is easier to use and produces better results.
Machine painting has a history of half a century, yet within just two years AI painting has suddenly turned into an "Attack on Titan"-style giant on the march. Not only has the quality improved visibly, but the time needed to generate a picture has also dropped from a few hours at the beginning of the year to a little over ten seconds.
Significant advances in AI painting technology have sparked interest in "creative AI", a range of AI tools that mimic human creativity in everything from fine art to poetry. But no one has really panicked.
A while ago, many people speculated that the biologist Yan Ning left the United States and returned to China because the AlphaFold artificial intelligence system could predict protein structures and had robbed her of her job. In fact, software that can write news items has existed for a long time, and no journalist has lost a job because of it. AI cannot even replace the people who write short filler pieces (so-called "tofu blocks"), let alone top scientists.
What is the diffusion algorithm?
Current artificial intelligence models are built on deep learning neural networks. GPT-3, the most famous of these self-learning models, "learned" from about 45 TB of text data and generates text that is almost indistinguishable from human writing.
Stable Diffusion is part of the deep learning family. Specifically, Stable Diffusion learns the connection between images and text through a latent diffusion model. It works by taking image data and adding "noise" to it. Noise here refers to the grainy specks in images captured by digital photography equipment, which are generally caused by electronic interference.
Noise is added to a picture step by step until the whole picture becomes white noise. The model records this process and then reverses it so that the AI can learn from it.
From the AI's perspective, it first sees a picture full of noise, then watches the picture become clearer, until it finally becomes a painting. What the AI learns is the whole denoising process, especially how to deal with Gaussian noise, so that it can eventually generate paintings.
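The noising step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Stable Diffusion's actual code: the "picture" is a flat list of numbers rather than a latent image, the linear noise schedule and its beta_start and beta_end values are illustrative choices, and the function names make_alpha_bars and add_noise are hypothetical. In a common training setup, a network is then asked to predict the added noise from the noisy picture, which is how it learns to denoise.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return the running product of (1 - beta_t) for a linear noise schedule.

    The schedule values are illustrative assumptions, not Stable Diffusion's settings.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Blend clean pixels with Gaussian noise; return the noisy pixels and the noise used."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    keep = math.sqrt(alpha_bar)        # how much of the original picture survives
    mix = math.sqrt(1.0 - alpha_bar)   # how much noise is mixed in
    noisy = [keep * p + mix * n for p, n in zip(pixels, noise)]
    return noisy, noise

rng = random.Random(0)
image = [0.5] * 1000                   # toy stand-in for a real picture
alpha_bars = make_alpha_bars(1000)

# Early step: the picture is almost untouched.
slightly_noisy, _ = add_noise(image, alpha_bars[0], rng)
# Final step: the picture is almost pure noise; `target` is what a denoiser would try to predict.
almost_pure_noise, target = add_noise(image, alpha_bars[-1], rng)
```

With these illustrative settings, the first step barely changes the picture, while after the last step almost nothing of the original remains.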
Gaussian noise is noise whose probability density function follows a Gaussian distribution (that is, a normal distribution). The diffusion algorithm adds Gaussian noise for two reasons. One is to stay valid for "actual" images, because images in real-world use are all noisy. The other is to make learning easier: noise that does not conform to the standard normal distribution is treated as invalid.
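To make "follows a Gaussian distribution" concrete, the sketch below writes out the normal density function and checks that samples drawn with Python's standard random.gauss behave as expected. The helper name gaussian_pdf, the seed, and the sample size are assumptions chosen for illustration only.

```python
import math
import random

def gaussian_pdf(x, mean=0.0, std=1.0):
    """Probability density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Standard normal noise should have mean near 0 and variance near 1.
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)

# Roughly 68% of standard normal samples fall within one standard deviation of the mean.
share_within_one_std = sum(abs(s) <= 1.0 for s in samples) / len(samples)
```

The density peaks at the mean and falls off symmetrically, which is why most noise values are small and large ones are rare.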
Stable Diffusion's underlying dataset is called LAION-Aesthetics. It contains images paired with text captions and is filtered by "aesthetic" quality. Other trained artificial intelligence models were also used to "correct" the dataset by predicting the rating people would give when asked "how much do you like this painting", and some pornographic content was filtered out.
How is it different from its "predecessors"?
Stable Diffusion is similar to DALL-E 2 and Midjourney in that it relies on a text description to generate images.
However, Stable Diffusion is open source and its underlying code is publicly available. Neither OpenAI nor Google has released its own AI models.
Stability AI runs a cluster of more than 4,000 NVIDIA A100 GPUs on Amazon's cloud (AWS). According to reports, Stability AI's operating and cloud costs exceed $50 million.
The company claims it can provide a "breakthrough in speed and quality", and that the model can run on GPUs with less than 10 GB of memory. It will also provide versions that run on AMD and Apple M1/M2 chips.
Currently, Stable Diffusion can convert text into a 512×512 pixel image in a few seconds. The image can be transformed, enlarged, modified and partly replaced. Using the GFP-GAN model, users can also upload blurred facial images to enlarge them or restore their original appearance.
Last month, Stability AI raised $101 million, and the company is currently valued at $1 billion. CEO Emad Mostaque graduated from Oxford University with a master's degree in mathematics and computer science, and previously worked as an analyst at various hedge funds. In addition to Stable Diffusion, the company also has Dance Diffusion, for music editing.
Stability AI's plan for making money is to train "private" models for customers and to provide a general-purpose infrastructure platform. It has a platform, DreamStudio, which is also open to individual users. DreamStudio now has over 1.5 million users, who have created around 200 million images. Counting all channels, Stable Diffusion has more than 10 million users.
The company also made a high-profile hire of Google scientist and futurist Daniel Jeffries.
Is this art?
As more artificial intelligence systems are released, related ethical and legal issues are also multiplying. Stable Diffusion allows the generation of images of real people, which makes the problem more "serious".
Users have already created a lot of sensitive content with Stable Diffusion, and fake celebrity photos are circulating widely. Getty Images has banned uploads of images generated by Stable Diffusion due to intellectual property concerns.
U.S. Representative Anna G. Eshoo recently published a letter urging the U.S. National Security Advisor and the Office of Science and Technology Policy to address these "unsafe models."
In the release announcement, Stability AI announced a "permissive license allowing commercial and non-commercial use", which is in effect an agreement with users. It expects users to regulate their own behavior and do the "right thing", and it does little to punish users who do not follow the rules.
In addition to legal issues, the artistic status of works generated by artificial intelligence is also questioned.
In any case, the U.S. Copyright Office considers these images "not art." In February, the Copyright Office's Review Board rejected a copyright claim for imagery generated by artificial intelligence.
The Review Board emphasized that "human authorship is a prerequisite for copyright protection" and that protection requires a "relationship between human thought and creative expression." A U.S. federal court also held in a recent ruling that artificial intelligence cannot be counted as the "inventor" of a patent.
Artificial intelligence art is very attractive. Although it is not legally recognized, it is recognized by the market. In 2018, Christie's sold an artificial intelligence painting for $432,500. Moreover, the vast majority of consumers cannot tell the difference between AI paintings and the works of human painters.
The most controversial case is the art competition at the Colorado State Fair in September this year, where the artificial intelligence work "Théâtre D'opéra Spatial" won first prize. It was produced with Midjourney, and its creator Jason Allen said "Art is dead, AI wins, humans lose".
In fact, there is no need for sweeping conclusions. When it comes to what artificial intelligence creates, there is no reason to be either overly optimistic or excessively pessimistic.
The artistic creations of artificial intelligence are produced according to human "logic". Naturally, they fall short of the best human artists, but they are more than enough to surpass the mediocre ones.