Goodbye Photoshop? Google has just launched a new AI capable of modifying your images with a simple voice command. No more hours spent mastering complex software: now you just have to ask the AI to add a hat to your grandmother or transform your living room into a tropical jungle.
Google has just improved Gemini 2.0 Flash, a model capable of generating and editing images as easily as it creates text. This new feature of Google's AI, which integrates directly into a chatbot interface, promises to democratize image editing and revolutionize our relationship with photo retouching.
Launched last week and now accessible to all via Google AI Studio, Gemini 2.0 Flash stands out for its ability to process both text and images within a single AI model. This multimodal approach marks a break with existing solutions, which generally use separate models for text and image generation.
Gemini 2.0 Flash aims to be even more versatile than before
Gemini 2.0 Flash's photo editing capabilities are vast and varied:
- Adding or removing objects in an image
- Changing backgrounds and lighting
- Changing the viewing angle
- Zooming in or out
- Removing watermarks (although this may affect image quality)
According to Google, this versatility comes from training on a large dataset combining images and text. The model thus acquires a deep understanding of visual and textual concepts, allowing it to generate images directly in response to user queries.
Google's approach stands out from that of other tech giants like OpenAI, which uses separate models for text (ChatGPT) and images (DALL-E). From a technical perspective, as you can imagine, processing text and images simultaneously is extremely computationally intensive. This partly explains why the quality of images generated by Gemini 2.0 Flash does not yet match that of specialized models like DALL-E.
On the ethical side, the bad news is that the ease with which these multimodal models can manipulate images raises legitimate concerns. Creating convincing deepfakes or manipulating photos for malicious purposes could become even easier, posing new challenges for misinformation and privacy.