Google has released a new AI tool called Whisk, which departs from conventional image generation by letting users supply images as prompts instead of lengthy text descriptions.
With Whisk, you can upload images to specify the subject, scene, and style of the AI-generated image, and you can use multiple images for each of these three elements.
For example, I uploaded an image of a pig and an image of a cat, chose an illustration style, and entered no text prompt (you can also add a text prompt alongside the images). The AI automatically generated this image for me. Note that the SCENE slot is meant for a scene image, but a character image, like the one I used, works too. The AI blends the inputs automatically; how closely the result follows them varies, but there can be delightful surprises.
If you don't have suitable images on hand, you can click the dice icon to let Google automatically fill in some images as prompts (these images also seem to be AI-generated).
I clicked it at random, and Google supplied a dog, a small boat, and an embroidery image. Here is what mixing them produced:
The result is quite good: the elements from the three images blended together well to create an interesting embroidery design.
By clicking on the image, I discovered that Whisk also provides a text prompt for each generated image. If you're satisfied with the result, you can save or download the image; if you want to refine it, you can add more text in the text box or click on the image to edit its text prompt directly.
Google emphasized in a blog post that Whisk is designed for "quick visual exploration, rather than pixel-perfect editing." The company also acknowledged that Whisk may "go off track," which is why it lets users edit the underlying prompts.
I spent a few minutes experimenting with the Whisk tool and found it very interesting. Although image generation takes a few seconds, which can be a bit annoying, and the generated images can sometimes be odd, the iterative process is quite enjoyable.
Google stated that Whisk uses the latest version of its Imagen 3 image generation model, which was also officially released today. Google also launched a new video generation model, Veo 2, which is said to understand "the unique language of film" and to produce "fewer" hallucinations, such as extra fingers. Veo 2 will first be available in Google's VideoFX, which users can apply to try through the Google Labs waitlist, with plans to expand to YouTube Shorts and other products next year.
In summary, Whisk opens up new possibilities for image generation, letting users express their creative ideas visually and customize images more conveniently.
Product experience link: https://top.aibase.com/tool/whisk