Musk piles on in public: why did Google's AI crash and burn again?

Google — the tech world's Wang Feng, forever denied the headline by bigger news — is finding that when the roof leaks, it rains all night.

Gemini 1.5, the large model it officially announced not long ago, is powerful but went largely unnoticed: its thunder was stolen by OpenAI's video generation model, Sora.

More recently, Gemini stumbled into the sensitive issue of race in American society. With the best of intentions it did a bad thing, angering white users who usually sit at the top of the pecking order.

Diversity pursued earnestly is commendable; diversity overdone invites trouble.

A few days ago, anyone using Gemini to generate images of historical figures seemed to land in a parallel universe with no textbooks, violating the spirit of "dramatization is not fabrication" and muddling the historical record.

The Vikings of the 8th to 11th centuries were no longer the tall, burly blonds of film and television. Their skin had grown darker and their clothing skimpier, though their resolute eyes still carried a warrior's strength.

German couples of the 1820s turned out remarkably diverse: a Native American man with an Indian woman, or a Black man with an Asian woman.

Even the AI's improvised script follows its own logic, with descendants carrying the story forward: more than a century later, Black men and Asian women appear again in the German army of 1943.

Across time, land, and ocean, America's Founding Fathers and England's medieval kings might all be portrayed as Black.

Other professions get the same treatment. The AI ignores the fact that the Catholic Church does not ordain women, so the Pope can be an Indian woman. And although the first female U.S. senator, seated in 1922, was a white woman, the AI's 1800s already welcomed a Native American one.

History, the saying goes, is a little girl anyone can dress up, but this time the AI swapped out the people themselves. White users, long accustomed to a sense of superiority, were furious: for once they tasted what it feels like to be judged by race, skin color, and appearance.

As users dug deeper, it turned out that not just historical figures but modern society looks different through the AI's eyes.

Former Google engineer @debarghya_das found that Gemini's renderings of women from the United States, the United Kingdom, Germany, Sweden, Finland, and Australia all tended toward darker skin.

He lamented: "It is very hard to get Google Gemini to acknowledge that white people exist."

What angered netizens even more was that when asked for a woman from a country like Uganda, Gemini responded promptly and efficiently; when it was a white person's turn, it might refuse, or even lecture the user that such requests reinforce racial stereotypes.

Computer engineer @IMAO_ ran a series of imaginative experiments, not limited to the human species, to find out what exactly "black" and "white" mean in Gemini's eyes.

The results were interesting: the algorithm seems to target only white people.

Generating white bears was no problem, so the word "white" alone does not trip the AI. Generating Africa's Zulu people was no problem either: even with "diverse" in the prompt, everyone still looked the same.

The cracks appeared with fantasy creatures: elves and dwarves came out white, while vampires and fairies were "diverse." Apparently Gemini's grasp of lore is shallow and still needs to keep up with the times.

His game did not last long, however. Google stepped up to respond, admitted that some historical images were indeed problematic, suspended Gemini's ability to generate images of people, and promised adjustments soon.

Google also explained its position: generating diverse people is in principle a good thing, because its AI tools serve users around the world, but this time the effort went a bit off course.

Although Google came forward to take the blame, it never made clear how many historical images "some" actually covered, nor why the "over-diversification" happened in the first place.

Unconvinced netizens were sharp-tongued: "Gemini must have been trained on Disney princesses and Netflix remakes," and "Gemini just wants to show you what you'd look like if you were Black or Asian."

That said, racial discrimination is an easy topic to weaponize, so some suspected that a few of the images were maliciously photoshopped or coaxed out with carefully engineered prompts. The loudest critics on social media do tend to have clear political leanings, which gives the whole affair a whiff of conspiracy theory.

Musk, never one to pass up a pile-on, criticized Google for over-diversifying, saying the problem lies not only with Gemini but with Google Search as well. He also plugged the new version of his AI product Grok, due in two weeks: "a rigorous pursuit of the truth, without regard to criticism, has never been more important."

Musk pulled the same move last time: after calling for a pause on AI systems more powerful than GPT-4, he bought 10,000 GPUs and joined the AI race himself.

What drew more attention than his remarks, though, were the memes netizens made at his expense.

Differences on the internet may be more extreme than in reality

Why did Google go astray on "diversity"?

Margaret Mitchell, Chief Ethics Scientist at Hugging Face, suggested that Google likely made several kinds of interventions in the model.

First, Google may have quietly added "diversity" terms to user prompts behind the scenes, turning "portrait of a chef" into "portrait of an indigenous chef."

Second, Google may rank "diverse" images higher when displaying results. If Gemini generates, say, 10 images per prompt but shows only 4, users are far more likely to see the "diverse" ones first.
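Neither intervention requires retraining the model; both can live in a thin wrapper around it. Below is a minimal Python sketch of what such a wrapper could look like — the qualifier list, the `diversity_score` field, and the function names are all assumptions made for illustration, not anything Google has disclosed.

```python
import random

# Hypothetical qualifiers an intervention layer might inject; not Google's actual list.
QUALIFIERS = ["indigenous", "Black", "South Asian", "Latina"]

def augment_prompt(prompt: str) -> str:
    """Intervention 1: quietly prepend a demographic qualifier to the user's prompt."""
    return f"{random.choice(QUALIFIERS)} {prompt}"

def rerank_by_diversity(candidates: list[dict], top_k: int = 4) -> list[dict]:
    """Intervention 2: over-generate, then surface only the most 'diverse' candidates.

    Each candidate is assumed to carry a `diversity_score` assigned by some
    upstream classifier; the highest-scoring few are shown, the rest discarded.
    """
    ranked = sorted(candidates, key=lambda c: c["diversity_score"], reverse=True)
    return ranked[:top_k]

if __name__ == "__main__":
    print(augment_prompt("portrait of a chef"))
    pool = [{"id": i, "diversity_score": random.random()} for i in range(10)]
    print([c["id"] for c in rerank_by_diversity(pool)])
```

Either mechanism would explain the observed behavior without the model itself "believing" anything about history, which is exactly why it misfires on period-specific requests.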

Heavy-handed intervention may simply show that the model is not as flexible or as smart as we assume.

Hugging Face researcher Sasha Luccioni believes the model has no concept of time, so the "diversity" calibration is applied to every image alike, which makes historical images especially error-prone.

In fact OpenAI, not yet famous at the time, did something similar with its image generator DALL·E 2.

In July 2022, OpenAI wrote on its blog that when a user asks for an image of a person without specifying race or gender, such as a firefighter, DALL·E 2 applies a new technique "at the system level" to generate images that more accurately reflect the diversity of the world's population.

OpenAI also published a before-and-after comparison: for the same prompt, "A photo of a CEO," diversity increased markedly once the new technique was applied.

The original results were mostly white American men; after the change, Asian men and Black women could be CEOs too, their commanding expressions and poses looking as if they had been copied and pasted.

Whatever the fix, it is a patch applied after the fact. The bigger problem is that the data itself remains biased.

Training datasets such as LAION, used by AI companies, are scraped mainly from the internet of the United States, Europe, and similar regions, and pay far less attention to populous countries such as India and China.

Therefore, an "attractive person" is more likely to be a European with blond hair, blue eyes, fair skin and a good figure. "Happy family," Orte said, pointing to a white couple holding their children and smiling on a manicured lawn.

In addition, because uploaders tag images to rank higher in search, many datasets also carry large numbers of "toxic" labels, riddled with pornography and violence.

For all these reasons, even as people's attitudes have moved on, the differences between people in internet imagery can be more extreme than in reality: Africans are primitive, Europeans are worldly, executives are male, prisoners are Black…

Efforts to "detoxify" the data set are of course also ongoing, such as filtering out "bad" content from the data set, but filtering also means mobilizing the whole body to delete pornographic content, which may also lead to more content in some areas or Less, which creates some kind of bias.

In short, perfection is out of reach, and real society is hardly free of bias itself. The best we can do is keep marginalized groups from being erased and disadvantaged groups from being stereotyped.

Escape is shameful but useful

In 2015, a Google machine learning project was involved in a similar controversy.

At the time, a software engineer called out Google Photos for labeling African Americans and other dark-skinned people as gorillas. The scandal became a textbook example of "algorithmic racism," and its effects linger to this day.

Two former Google employees explained that the blunder happened because the training data contained too few photos of Black people, and too few employees had tested the feature internally before it shipped.

Computer vision has come a long way since then, yet the tech giants still fear repeating the mistake: the camera apps of Google, Apple, and other large companies remain unable, or deliberately unwilling, to recognize most primates.

It seems the surest way to keep a mistake from recurring is to lock it in a dark room rather than fix it. And the lesson did repeat: in 2021, Facebook apologized after its AI labeled Black people as "primates."

These are situations all too familiar to people of color and other groups disadvantaged on the internet.

Last October, researchers at the University of Oxford asked Midjourney to generate images of "Black African doctors treating white children," hoping to invert the stock image of the "white savior."

The researchers' request was perfectly clear, yet among the 350-plus images generated, 22 showed white doctors, and the Black doctors were invariably flanked by African wildlife such as giraffes and elephants: "You can't see any sense of African modernity."

On one side sits everyday discrimination; on the other, Google bending the facts to manufacture a false sense of equality. For now there is no simple answer and no model that handles every case. Striking a balance that satisfies everyone is probably harder than walking a tightrope.

Take portraits as an example. When the AI is asked about a specific period of history, it should reflect the reality of that period as closely as possible, even if the result does not look "diverse."

But when the prompt is "an American woman," more "diverse" output is appropriate. The hard part is how the AI can reflect reality, or at least avoid distorting it, within a handful of pictures.

White or Black, people differ in age, build, hair, and much else; each is an individual with unique experiences and perspectives, yet all live in the same society.

When one netizen used Gemini to generate Finnish women, just one of the four images showed a Black woman, prompting the joke: "75% — that's a C."

Others asked Google whether, once the model is improved, it would "generate white people 25% of the time instead of 5%."

Many of these problems cannot be solved by technology alone; sometimes they are about values. That is partly why AI heavyweights such as Yann LeCun back open source: control shifts to users and organizations, who can decide for themselves whether and how to add safeguards.

Amid the Google farce, some stayed calm and suggested practicing prompt writing first. Rather than asking generically for white people or Black people, write something like "Scandinavian woman, portrait shot, studio lighting": the more specific the request, the more precise the result; the vaguer the request, the more generic the output.

Something similar happened last July. An Asian student at MIT used the AI tool Playground AI to make her headshot look more professional; the result turned her white, with lighter skin and bluer eyes. Her post about it on X sparked a heated discussion.

Playground AI's founder responded that the model cannot be steered effectively by an instruction like that, so it falls back to more generic output.

Swapping the prompt "make it a professional LinkedIn photo" for "studio background, sharp lighting" may well produce better results, but the episode shows that many AI tools neither teach users how to write prompts nor escape datasets centered on white people.
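As a rough illustration of the prompt-writing advice above — the attribute names and strings below are made up and not tied to any particular tool — spelling out the attributes you care about leaves far less for the model to guess:

```python
# Vague prompt: leaves demographics, framing, and lighting for the model to guess.
vague_prompt = "make it a professional LinkedIn photo"

# Specific prompt: state the attributes you actually want, so less is left to guess.
attributes = {
    "subject": "Scandinavian woman",
    "framing": "head-and-shoulders portrait",
    "lighting": "studio lighting, sharp",
    "background": "plain studio background",
}
specific_prompt = ", ".join(attributes.values())

print(specific_prompt)
# -> Scandinavian woman, head-and-shoulders portrait, studio lighting, sharp, plain studio background
```

The point is not the exact wording but the habit: a prompt that names the subject, framing, and lighting gives the model concrete constraints instead of forcing it to fall back on whatever its training data treats as the default.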

Every technology can make mistakes and has room to improve, but not every problem has a fix. While AI is not yet smart enough, the first ones who have to get smarter are humans themselves.

Sharp as autumn frost, warding off calamity. Work email: [email protected]
