Google Labs recently launched its latest experimental generative AI tool, Whisk, in the United States. Unlike traditional image generation tools that rely primarily on text prompts, Whisk takes images as input, allowing users to create artwork more intuitively.
Users can upload images to Whisk directly or generate them within the tool, specifying elements such as subject, scene, and style. Whisk lets users mix and match these components and refine the result with additional text prompts as needed.
Notably, behind the scenes, Google's language model (possibly the recently released Gemini 2.0 Flash) automatically generates detailed descriptions of the input images. These descriptions are then fed into Google's latest image generation model, Imagen 3, so the output captures the essential characteristics of the subject rather than reproducing it exactly.
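To make that flow concrete, here is a minimal Python sketch of a caption-then-generate pipeline. It is an illustration only, not Google's implementation: Whisk's internals are not public, and every name here (`WhiskInputs`, `build_prompt`, `whisk_like`, the subject/scene/style slots, and the stand-in `fake_describe` and `fake_generate` functions) is hypothetical. In a real system the two stand-ins would presumably be replaced by calls to a captioning model and an image generation model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interfaces: the real models behind Whisk are not exposed this way.
Captioner = Callable[[bytes], str]   # image bytes -> text description
Generator = Callable[[str], bytes]   # text prompt -> image bytes


@dataclass
class WhiskInputs:
    """Three input images plus optional free text (slot names are assumptions)."""
    subject: bytes
    scene: bytes
    style: bytes
    extra_text: Optional[str] = None


def build_prompt(inputs: WhiskInputs, describe: Captioner) -> str:
    """Caption each input image and merge the captions into one text prompt."""
    parts = [
        f"Subject: {describe(inputs.subject)}",
        f"Scene: {describe(inputs.scene)}",
        f"Style: {describe(inputs.style)}",
    ]
    if inputs.extra_text:
        parts.append(f"Notes: {inputs.extra_text}")
    return "\n".join(parts)


def whisk_like(
    inputs: WhiskInputs,
    describe: Captioner,
    generate: Generator,
    edit_prompt: Optional[Callable[[str], str]] = None,
) -> Tuple[str, bytes]:
    """Run caption -> (optional user edit) -> generate; return prompt and image."""
    prompt = build_prompt(inputs, describe)
    if edit_prompt is not None:
        # The user may adjust the generated description before it is used.
        prompt = edit_prompt(prompt)
    return prompt, generate(prompt)


# Stand-ins so the sketch runs without any model or network access.
def fake_describe(image: bytes) -> str:
    return f"an image of {len(image)} bytes"


def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


if __name__ == "__main__":
    demo = WhiskInputs(subject=b"abc", scene=b"de", style=b"f", extra_text="at dusk")
    prompt, image = whisk_like(demo, fake_describe, fake_generate)
    print(prompt)
```

Because only text descriptions reach the generator, and never the original pixels, a pipeline shaped like this would naturally preserve the gist of each input rather than an exact copy.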
AIbase conducted multiple tests, uploading the three images on the left, which were then blended to generate the results on the right. The outcomes were quite impressive and highly engaging, as shown below:
However, since Whisk only extracts a few key elements from each source image, Google warns that the generated image results may differ from expectations. For example, the generated images may vary in height, weight, hairstyle, or skin color compared to the original images.
Google acknowledged that these details are often crucial to the success of a project, so Whisk allows users to view and edit the text prompts that drive the image generation process.
Early testers, including some artists and creative professionals, reported that Whisk feels more like a new kind of creative tool than a traditional image editor. Google hopes it will help users brainstorm visually rather than perform precise edits, letting them rapidly generate and filter multiple options before saving their favorites.
Initial testing indicates that while using Whisk is very enjoyable, there is a wait of several seconds for each new image to be generated. These delays may be due to high traffic as users flock to experience this new tool.
Currently, Whisk is only available to users in the United States, who can try it for free at labs.google/whisk and share feedback. Users from other countries do not yet have access to this tool.
Whisk is part of Google Labs, which serves as a testing ground for Google's AI projects, including Gemini, Imagen, and the latest video model, Veo 2. While most projects are still in the experimental phase, some successful ones, such as the recently launched AI assistant NotebookLM, will transition into full products.
Try it here: https://labs.google/fx/zh/tools/whisk
Highlights:
🌟 Google launches Whisk, an image-driven generative AI tool.
🎨 Users can upload or generate images for quick visual design rather than precise editing.
🚫 Currently available only to users in the U.S., with no access for other countries yet.