Google researchers have recently introduced a technology called Alchemist. It allows users to precisely edit the material properties of objects in images, such as color, glossiness, and transparency, without professional image-editing software or skills.
The core of Alchemist is a fine-tuned text-to-image (T2I) generation model. The research team achieved fine-grained control over material parameters by creating a synthetic dataset and modifying the architecture of the Stable Diffusion 1.5 model.
Specifically, the researchers first generated a large number of synthetic images using computer graphics and physically based rendering. The scenes contained various 3D models with randomly selected materials, camera angles, and lighting conditions. For each scene, they then varied a single material attribute to create multiple versions of the image at different editing intensities.
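The data-construction step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's actual pipeline: the Material class, the attribute names, the [0, 1] value range, the [-1, 1] strength range, and the make_edit_variants helper are all assumptions, and the rendering step itself is left out.

```python
from dataclasses import dataclass, replace
import random

# Hypothetical material description. The attribute names and the [0, 1]
# value range are assumptions made for this illustration.
@dataclass(frozen=True)
class Material:
    glossiness: float
    metallic: float
    transparency: float

def make_edit_variants(base, attribute, strengths):
    """Return (strength, material) pairs that differ from `base` in one attribute only."""
    variants = []
    for s in strengths:
        value = getattr(base, attribute) + s
        value = min(1.0, max(0.0, value))  # clamp to the assumed valid range
        variants.append((s, replace(base, **{attribute: value})))
    return variants

# Randomly sample a base material, then vary only its "metallic" attribute.
rng = random.Random(0)
base = Material(
    glossiness=rng.random(),
    metallic=rng.random(),
    transparency=rng.random(),
)
variants = make_edit_variants(base, "metallic", [-1.0, -0.5, 0.0, 0.5, 1.0])
```

Each variant would then be rendered with the same geometry, camera, and lighting as the base scene, so that the only difference between images is the edited attribute.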
By fine-tuning on this synthetic data, the model learned to change only the specified material property, given a context image, an instruction, and an editing intensity value, while preserving the object's shape and the image's lighting.
Experimental results show that the technology can effectively change the appearance of objects, for example by enhancing their metallic look or adjusting their transparency. In user studies, the method significantly outperformed baseline methods in terms of photorealism and user preference.
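To make the conditioning inputs concrete, one fine-tuning example could be organized roughly as follows. This is a hedged sketch only: the TrainingExample class, the build_examples helper, the instruction wording, and the file paths are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass

# Hypothetical container for one fine-tuning example: the three conditioning
# inputs named in the article, plus the image the model should produce.
@dataclass
class TrainingExample:
    context_image: str   # path to the unedited render (placeholder path)
    instruction: str     # text instruction naming the attribute to edit
    strength: float      # scalar editing intensity
    target_image: str    # path to the render with the edited attribute

def build_examples(base_path, attribute, renders):
    """Pair the base render with each edited render.

    `renders` maps an editing intensity to the path of the image rendered
    at that intensity (an assumed layout for this illustration).
    """
    instruction = f"Change the {attribute} of the object."  # assumed wording
    return [
        TrainingExample(base_path, instruction, strength, path)
        for strength, path in sorted(renders.items())
    ]

examples = build_examples(
    "scene_0001/base.png",
    "transparency",
    {0.5: "scene_0001/transparency_p050.png",
     -0.5: "scene_0001/transparency_m050.png"},
)
```

Because every example for a scene shares the same context image and differs only in strength and target, the training signal is concentrated on the single edited attribute.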
The application prospects of this technology are broad. It can help interior designers preview the effect of repainting a room or assist architects, artists, and designers in quickly producing design sketches for new products. In addition, since the editing effects are visually consistent, this technology can also be used for downstream 3D tasks, such as NeRF (neural radiance field) reconstruction.
Although the Alchemist technology has made significant progress in material editing, the research team also pointed out some limitations. For example, there is still room for improvement in the model's ability to handle hidden details in images.
However, the researchers are confident in the potential of this technology for controllable material editing. With further research and optimization, Alchemist is expected to bring revolutionary changes to the field of image editing, making complex material editing tasks simpler and more intuitive.
Google's Alchemist technology represents another major advance in AI-based image processing. It not only simplifies complex image editing but also opens up new possibilities for the creative industries, with the potential for a profound impact in areas such as design, art, and virtual reality.
Project page: https://prafullsharma.net/alchemist/