- Google is addressing issues with its Gemini AI model’s image generation function by pausing character image generation and planning an improved release.
- Complaints about inaccuracies in Gemini’s image generation, including incorrect depictions in historical contexts, have surfaced on social media.
- Despite progress in addressing racial bias, Google’s Gemini AI model faces criticism for overcorrection, highlighted by its reluctance to display certain images and its recent upgrade to Gemini 1.5 amid competition from OpenAI’s Sora.
Google announced on Thursday that it is working to address recent issues with the Gemini AI model’s image generation function, pausing the generation of character images and planning to re-release an improved version shortly. The day before, Google apologised, acknowledging inaccuracies in certain historical image generation descriptions by the Gemini large model and striving for improvement.
There are incorrect depictions in historical contexts
Complaints have been voiced on social media recently about Gemini’s text-to-image function, such as women and people of colour appearing in images themed around ‘American founding father George Washington,’ which is incorrect. Additionally, when users asked Gemini to generate Nazi German soldiers, Gemini produced photos of black, Asian, and white women wearing Nazi military uniforms.
Users also complained that Gemini would not display images of white people. When asked to display a white person’s photo, Gemini said it could not meet this request. However, when prompted to show images of black people, Gemini suggested displaying images ‘celebrating the diversity and achievements of black people.’ When asked to show images celebrating white diversity and achievements, Gemini said it was ‘hesitant’ to fulfill this request.
The analysis suggests that this may be an overcorrection of long-standing racial bias issues in the AI field, reflecting Google’s pursuit of ‘multiculturalism.’ Compared to previous AI models, Gemini has made significant progress in addressing racial bias issues, but the problem is ‘overdone.’
Gemini is Google’s largest and most powerful multimodal AI model. Just last week, the company released the latest generation AI model Gemini 1.5, which represents a significant advancement over Gemini 1.0 released in December last year.
Also read: OpenAI cures GPT-4 ‘laziness’ with new updates
Gemini misses the mark
Google’s senior product director for Gemini, Jack Krawczyk, stated that the company’s image generation capabilities reflect the ‘global user base’ of this tech giant, and it takes representation and bias seriously. ‘Gemini’s image generation does indeed reach a wide audience, which is generally a good thing because people worldwide are using it, but it misses the mark.’
Earlier this month, Google began offering image generation services through Gemini, but the launch of the new tool Sora dealt a blow to Google as it tries to catch up with OpenAI supported by Microsoft. Sora can generate 60-second continuous videos based solely on prompts, stunning the entire tech industry. OpenAI’s Sora not only accurately presents details but also understands the existence of objects in the physical world, whether it’s visuals, depth of field, camera movements, or even human micro-expressions and animal expressions, all of which are already convincingly realistic.