OpenAI experiments with image generation for Sora platform

OpenAI appears to be preparing Sora for image generation alongside its existing video generation capabilities. Reverse engineers have spotted that OpenAI is testing the feature internally. In particular, a new hidden toggle lets users switch between video and image generation right in the prompt bar. When switched to image mode, the prompt bar placeholder asks you to describe an image. Image mode also offers fewer options: there is no storyboard button, for instance, but you can select the format, quality, and number of images to generate.

The feature is not yet functional. There is also an "Images Internal" category in the left-side navigation bar; it currently opens the video feed, but in the future users may find a feed of images there. It’s unclear what kind of image-generation capabilities will be added and which model will power them.

OpenAI mentioned Sora's image generation capabilities during the initial announcement last year, and the feature will very likely be powered by the existing "sora-turbo" model.

In the AI community, there is speculation that DALL-E 4 may arrive at some point, but OpenAI has not officially confirmed this. Additionally, multimodal image generation from GPT-4o has still not appeared in ChatGPT, so we await updates on that front.

OpenAI is also revamping Sora's video feed, which appears to be split into “Best” and “Top.” The “Best” category will likely resemble the current featured feed, while the “Top” category might allow filtering by time period and will probably rank videos by number of likes or other criteria.

It remains to be seen how quickly these features will be released, but it’s exciting that OpenAI is preparing something for image generation, because the existing DALL-E 3 is quite outdated.