Google has introduced "Whisk," a new generative AI tool aimed at creative professionals and artists. Whisk allows users to generate and remix images by using visual prompts rather than relying solely on detailed text descriptions. Users can input images representing subjects, scenes, or styles, and the tool combines these elements to create unique visuals. This process is powered by Google's Gemini model, which generates captions for the input images, and the Imagen 3 image generation model, which produces the final outputs.
Whisk emphasizes creative exploration over precision editing. It is designed for rapid ideation, enabling users to experiment with multiple variations of their concepts. However, since Whisk extracts only key characteristics from input images, the generated outputs may not always align perfectly with user expectations. To address this, users can view and modify the underlying prompts to refine their results.
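The flow described above (caption each input image, merge the captions into one text prompt, let the user edit that prompt, then generate) can be illustrated with a minimal sketch. This is not Google's implementation or API: `VisualInputs`, `compose_prompt`, `remix`, and the two stub models are hypothetical names invented here, and the stubs only stand in for the captioning and image generation steps.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class VisualInputs:
    """Hypothetical container for the three reference images (paths or IDs)."""
    subject: str
    scene: str
    style: str


def compose_prompt(subject_caption: str, scene_caption: str, style_caption: str) -> str:
    # Assumed template; the real way captions are combined is not public.
    return (
        f"{subject_caption}, set in {scene_caption}, "
        f"rendered in the style of {style_caption}"
    )


def remix(
    inputs: VisualInputs,
    captioner: Callable[[str], str],
    generator: Callable[[str], str],
    edit_prompt: Optional[Callable[[str], str]] = None,
) -> Tuple[str, str]:
    """Caption each image, build one prompt, optionally edit it, then generate."""
    prompt = compose_prompt(
        captioner(inputs.subject),
        captioner(inputs.scene),
        captioner(inputs.style),
    )
    # Only the captions survive, not the pixels, which is why outputs can
    # drift from the inputs and why editing the prompt helps.
    if edit_prompt is not None:
        prompt = edit_prompt(prompt)
    return prompt, generator(prompt)


# Stub models so the sketch runs without any external service.
def fake_captioner(image: str) -> str:
    return f"caption of {image}"


def fake_generator(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    refs = VisualInputs("cat.png", "beach.png", "ink.png")
    prompt, image = remix(refs, fake_captioner, fake_generator)
    print(prompt)
    print(image)
```

Passing an `edit_prompt` function, for example `lambda p: p + ", at sunset"`, mirrors the step where a user views and adjusts the underlying prompt before regenerating.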
Currently available only in the U.S., Whisk is an experiment from Google Labs, which collects user feedback to refine emerging technologies such as generative AI models. Early testers have described Whisk as a novel creative tool rather than a traditional image editor.
The launch reflects Google's continued push to turn its generative AI models into creative tools. By pairing Gemini's image captioning with Imagen 3's image generation, Whisk offers a prompt-by-image approach to visual creation that could appeal to artists, designers, and other creative professionals looking for new ways to develop ideas.