AI's "iPhone moment" has arrived
At the just-concluded Nvidia GTC 2023 conference, Nvidia CEO Jensen Huang made this point three times.
What does it mean?
The iPhone's multi-touch screen created a brand-new way of interacting with a smartphone, which in turn gave birth to the mobile internet.
The emergence of AI super applications such as ChatGPT and Stable Diffusion marks the maturity of accelerated computing and AI technology. AI is penetrating into all walks of life at an unprecedented speed and promoting a new industrial revolution.
After years of steady progress, powerful computing and advanced models now give AI a suitable platform for real applications, prompting manufacturers to rethink their products, their business models, and how quickly they iterate.
Last night, Nvidia, Microsoft, Google, Adobe and others launched their own AI services at almost the same time. The way they are chasing one another seems to convey a shared anxiety:
"In this era of big AI, if you don't want to be subverted by others, you must first subvert others."
How will AI change our lives? After looking back at this "most cutthroat night in AIGC", you may have a clearer sense of the answer.
Nvidia is bringing AI to every industry
Nvidia brought plenty of eye-catching new tricks to its annual GTC conference.
For example, it released cuLitho, a computational lithography technology that speeds up chip manufacturing and paves the way for 2nm processes. It also partnered with automakers such as Lotus, Mercedes-Benz, and BMW to build digital production lines with Omniverse.
But the most eye-catching announcement of the entire conference has to be Nvidia's new graphics card: the H100 NVL with dual-GPU NVLink.
The H100 NVL is designed for workloads like ChatGPT that demand enormous computing power. It carries a striking 188GB of HBM3 memory (94GB per GPU), the most memory of any graphics card Nvidia has released so far.
A large language model like GPT consumes a lot of memory resources. In theory, a GPT model with hundreds of billions of parameters can quickly fill up an H100 graphics card.
Compared with the HGX A100 for GPT-3 processing, a standard server with four pairs of H100s and dual-GPU NVLink is 10 times faster. That makes the "nuclear bomb" H100 NVL well suited to deploying language models like ChatGPT at scale.
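To make the memory claim concrete, here is a rough back-of-the-envelope sketch. It assumes 16-bit weights (2 bytes per parameter), uses GPT-3's published size of 175 billion parameters as the example, and counts only the weights themselves, ignoring activations and caches. The function names are illustrative, and the 80GB figure for a standard H100 is an assumption not stated in this article.

```python
import math

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

def gpus_needed(num_params: float, gpu_memory_gb: float, bytes_per_param: int = 2) -> int:
    """Minimum number of GPUs whose combined memory can hold the weights."""
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / gpu_memory_gb)

gpt3_params = 175e9  # GPT-3's published parameter count

print(weight_memory_gb(gpt3_params))   # 350.0 GB of weights at 16-bit precision
print(gpus_needed(gpt3_params, 94))    # 4 GPUs at 94GB each (H100 NVL, per GPU)
print(gpus_needed(gpt3_params, 80))    # 5 GPUs at 80GB each (assumed standard H100)
```

Even under these generous assumptions, the weights alone are several times larger than a single card's memory, which is why per-GPU capacity matters so much for serving large models.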
Nvidia's other big move is to bring the same kind of computing power that runs ChatGPT to the cloud and open it to the public.
ChatGPT runs mainly on DGX supercomputers built from A100 or H100 GPUs. Microsoft spent hundreds of millions of dollars on tens of thousands of A100 graphics cards to build out its Azure cloud computing platform.
To lower the cost of deploying large models, Nvidia has launched the DGX Cloud service. Starting at $36,999 per month, it gives you a cloud supercomputer made up of 8 H100 or A100 graphics cards for high-load computing tasks.
In 2016, Jensen Huang personally delivered the first DGX supercomputer to OpenAI. Seven years later, top-tier AI computing power has a chance to reach every company through DGX Cloud and to accomplish tasks that were impossible in the past.
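For a sense of scale, the monthly starting price can be converted into a rough hourly rate per GPU. This is only a sketch: it assumes a 30-day month of round-the-clock use and an even split across the 8 GPUs, neither of which is a published billing rule, and the function name is illustrative.

```python
def cost_per_gpu_hour(monthly_price_usd: float, num_gpus: int = 8,
                      hours_per_month: float = 30 * 24) -> float:
    """Effective price of one GPU for one hour, assuming continuous use."""
    return monthly_price_usd / (num_gpus * hours_per_month)

rate = cost_per_gpu_hour(36_999)
print(f"${rate:.2f} per GPU-hour")  # $6.42 per GPU-hour
```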
It is not difficult to imagine that advanced applications like ChatGPT that can improve human communication and work efficiency will continue to emerge, bringing more convenience and surprises to our lives.
Turning stone into gold: Runway's second generation produces blockbuster-style video from a single sentence
There has long been a popular meme on Bilibili: "Videos can't be Photoshopped, so this must be real." But now videos can not only be doctored, they can be generated from scratch by AI, with no painters, photographers, or post-production. Enter a passage of text into Runway, and it returns a striking short video.
At first, Runway was a post-production assistant. It drew on the magic of artificial intelligence, but what it could do was not complicated: erasing objects, interpolating frames, removing backgrounds, motion tracking, and so on. You could think of it as a beginner-friendly Adobe Premiere plugin.
Once Runway opened a new door with artificial intelligence, it gained the ability to turn stone into gold. The Gen 1 version announced last September could convert text to video. At the time, people had only just seen the magic of text-to-image generation, and Runway could generate moving images directly, which felt like a blow from an entirely different league.
After half a year, Runway Gen 2 is here.
Compared with the Gen 1 model, it achieves higher temporal consistency and fidelity. In plain terms, the transitions between frames are smoother and the picture quality is higher.
With Gen 2, you're one step closer to generating videos of your imagination anytime, anywhere.
Enter a short prompt: mountains filmed by a drone. From this, Runway generates the following footage.
Here's another: Afternoon sunlight streaming through the windows of an apartment in New York City.
For a more advanced use, feed Runway both an image and text, and it generates a short video from them.
▲ The text is: A man is walking on the street, and the neon lights of the surrounding bars illuminate him
▲ Original picture
Or animate a static image.
It can also render an untextured animation directly into finished footage.
Runway's progress is plain to see. It is moving further and more smoothly down the path of generative models. Today's Gen 2 version can be called "watchable": it is not yet polished, but its future looks promising.
Perhaps by Gen 3 we will be able to generate hit Douyin short videos with one click. When that day comes, will it be a nightmare for quality bloggers?
Breaking: Google opens Bard for testing
If Nvidia allows us to see the future of AI development, then Google's Bard is today's AI.
After a week in which everyone was stunned by GPT-4 and Midjourney V5, and while Jensen Huang was looking ahead to the coming AI era, Google announced that access to Bard was officially open: please try Bard and share your feedback.
According to Google's latest demo, Bard is more like a personal assistant focused on work and study than ChatGPT. With its assistance, you can stimulate ideas and satisfy curiosity.
Google says Bard can help users speed up their ideas and feed their curiosity. You can ask it to explain quantum physics in plain language, or to brainstorm tips for reading 20 books in a year.
We also found some details in the content of the demo. Bard seems to generate multiple answers at the same time. You can choose the one that suits you best according to your needs and keep asking questions.
Of course, ChatGPT can also produce multiple answers, but only by regenerating after the first answer is finished. By comparison, Bard is more like a contractor who presents several proposals at once.
It may be that the negative news of ChatGPT and Bing Chat has attracted the attention of Google, which constantly emphasizes that Bard is just an experiment, and the information generated by Bard does not represent Google's point of view.
Google said that although Bard is powered by a large language model and will grow stronger over time, it can also pick up biases or stereotypes and "confidently" state inaccurate or false information. For example, it gave the scientific name of the "ZZ plant" as Zamioculcas zamioculcas instead of the correct Zamioculcas zamiifolia.
Having learned from Bing Chat's erratic behavior, the first beta version of Bard limits the number of exchanges in a single conversation to help keep its content accurate. When you first log into Bard, it tells you that this is an experiment and that it looks forward to your feedback.
As a first version, Bard does not yet support other languages, including Chinese. Google says it will keep adding capabilities such as code writing, image recognition and generation, and multi-language support.
▲ Bard: I don’t know Chinese, but I hope to speak Chinese in the future
Currently, Bard is only open to access in the United Kingdom and the United States, and will gradually expand to more countries and regions in the future.
You can make pictures while chatting, Bing Chat goes a step further
Perhaps nothing shows how popular Bing Chat is better than one figure: after its launch, Bing's daily active users (DAU) exceeded 100 million for the first time. From answering complex questions to casual chat to turning flashes of inspiration into ideas, Bing Chat is reshaping the way we search the web.
From now on, you can ask Bing Chat to draw pictures.
Microsoft has updated the preview versions of the new Bing and Edge browsers with three new features: Bing Image Creator, AI-driven Stories, and Knowledge Cards 2.0. The most important of these is the Bing Image Creator that can draw pictures.
According to Microsoft, the human brain processes visual information about 60,000 times faster than text, and images are among the most searched-for content on Bing. Bing Image Creator, powered by an advanced version of the DALL·E model, lets us describe an image in our own words in Bing Chat and choose an art style. Image Creator then draws on the context of the conversation to "draw your imagination on paper".
The addition of Bing Image Creator makes the Edge browser the first browser to integrate an artificial intelligence image generator.
The AI-powered Stories and Knowledge Cards 2.0 features return AI-generated images, short videos, and infographics alongside your search results, so you can take in facts and key information at a glance.
When you use Bing to search in the future, what you get will not be cold web links, but richer and more interesting pictures, videos and visual stories.
If you have applied through the new Bing, you can experience Bing Image Creator in Bing Chat now, and if you enter from the following URL, you can try it directly.
However, this feature only supports English for the time being, and will continue to be updated in the future.
Adobe Firefly: "The strongest support from an ally"
When technology companies get involved in the generation of pictures, the famous design and creative company Adobe is naturally not far behind. On this crazy night, Adobe also launched its own collection of creative generative AI models: Adobe Firefly.
Adobe demonstrated Firefly's capabilities with a few simple examples. You can use one sentence to turn the scenery in spring into winter.
You can also use a brush on the grass to paint randomly, and then tell Firefly that this is a river, and it will automatically generate a river.
Naturally, Firefly can do more than that. Select a dog's fur, and it can turn the fur into a brush and give the dog a new hairstyle on the spot. Design some word art, and it can generate a styled word or sentence for you. Design a pair of headphones, and Firefly can place them in a scene and turn it into a product display…
Adobe believes that AI is providing a new way to open the world, and designers can use the most convenient way to realize creativity, "helping creative people rather than replacing creative people."
At the same time, Adobe founded the Content Authenticity Initiative (CAI), which sets a global standard for attributing trusted digital content and labels content generated by artificial intelligence, making it the "strongest support" for the Firefly ecosystem.
Attack on AIGC
During this night, we witnessed several major breakthroughs in the AI field from the service layer to the application layer.
This cluster of updates indirectly shows that AI has entered a stage of rapid development. Last month, AI might not have been able to draw fingers properly. Next month, with more computing power and upgraded models, it may already be replacing clothing models.
Countless works of science fiction have predicted that AI will become part of our lives, but no one told us that we are only a step away from that future.
Our lives are being rewritten bit by bit by AI. At the GTC conference, Jensen Huang put forward an interesting view: generative AI is a new kind of computer. We can program it in human language, and anyone can instruct the computer to solve problems.
In the past few months, we have watched AI gradually master skills such as drawing, writing, editing, building spreadsheets, and making slide decks. If it keeps evolving at this pace, is there anything AI will not be able to do?
OpenAI CEO Sam Altman recently boldly predicted a new version of "Moore's Law" on Twitter. He believes that the amount of global artificial intelligence computing will double every 18 months.
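To see what a doubling every 18 months implies, the growth can be written as a simple exponential: after t months, the quantity has multiplied by 2^(t/18). The sketch below is purely illustrative; the function name is hypothetical, and the 18-month period is taken from the prediction as quoted here, not from any measured data.

```python
def growth_factor(months: float, doubling_period_months: float = 18) -> float:
    """How many times larger a quantity becomes if it doubles every fixed period."""
    return 2 ** (months / doubling_period_months)

print(growth_factor(18))             # 2.0  (one doubling)
print(growth_factor(36))             # 4.0  (two doublings in three years)
print(round(growth_factor(120), 1))  # 101.6 (roughly a hundredfold in ten years)
```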
In other words, if you still have doubts about generative AI, then time will give you the most powerful answer.
Follow Aifaner's official WeChat public account, Aifaner (WeChat ID: ifanr), for more content as soon as it is published.