Recently, OpenAI made an exciting announcement: in internal testing of Sora, an image generation feature is under rapid development alongside the video generation feature that has already launched. The new feature will let users switch quickly between video and image generation, giving them more creative flexibility.
According to internal sources, Sora will introduce a hidden toggle in the prompt bar that lets users switch between the two modes by selecting it. When image generation is selected, the system will automatically prompt users to describe an image. The design aims to simplify operation and improve the relevance and quality of the generated content.
Beyond the image generation feature, Sora has also reclassified its video recommendations. The newly introduced "Best" and "Top" categories will help users filter and find content more easily. The "Best" category is similar to the current featured channels, while the "Top" category may rank videos by user likes or by time period. The reclassification has raised expectations for Sora's content recommendation mechanism.
For DALL-E3 users, this news is undoubtedly exciting, as DALL-E3 has come to seem somewhat dated in the time since its release, especially compared with competitors like Midjourney. Although Sora's image generation feature has not been officially launched, the "Images Internal" category in the left navigation bar has already sparked user curiosity. This category is currently used mainly for video recommendations, but it may also surface content related to image generation in the future.
There is speculation that the upcoming image generation model may be called DALL-E4; however, OpenAI has not confirmed this. Industry experts suggest that the image generator in Sora may not directly use DALL-E4 but instead rely on the existing "sora-turbo" model. Additionally, industry insiders have pointed out that ChatGPT has not yet launched a multimodal image generation feature based on GPT-4o, making the launch of this Sora project a notable new development.
It is worth noting that the codename for the text-to-image generator in Sora is "papaya," which adds to the curiosity and anticipation surrounding the project. A year and a half after the release of DALL-E3, it remains to be seen what innovations the next-generation model will bring.