OpenAI president shares first image generated by GPT-4o


Join us in returning to NYC on June 5th to collaborate with executive leaders in exploring comprehensive methods for auditing AI models regarding bias, performance, and ethical compliance across diverse organizations. Find out how you can attend here.


OpenAI’s president Greg Brockman has posted from his X account what appears to be the first public image generated using the company’s brand new GPT-4o model.

As you’ll see in the image below, it is quite convincingly photorealistic, showing a person wearing a black T-shirt with an OpenAI logo writing chalk text on a blackboard that reads “Transfer between Modalities. Suppose we directly model P (text, pixels, sound) with one big autoregressive transformer. What are the pros and cons?”

The new GPT-4o model, which debuted on Monday, improves upon the prior GPT-4 family of models (GPT-4, GPT-4 Vision, and GPT-4 Turbo) by being faster, cheaper, and retaining more information from inputs such as audio and vision.

It is able to do so because OpenAI took a different approach from its prior GPT-4 class LLMs. While those chained multiple models together and converted other media such as audio and visuals to text and back, the new GPT-4o was trained on multimedia tokens from the get-go, allowing it to directly analyze and interpret vision and audio without first converting them into text.


Based on the above image, the new approach is a noticeable improvement over OpenAI’s last image generation model, DALL-E 3, which debuted in September 2023. I ran a similar prompt through DALL-E 3 in ChatGPT, and here is the result.

[Image: example output from DALL-E 3 in ChatGPT for a similar prompt]

As you can see, the image Brockman shared, created with GPT-4o, is a significant improvement in quality, photorealism, and accuracy of text generation.

However, GPT-4o’s native image generation capabilities are not yet publicly available. Brockman alluded to this in his X post, saying: “Team is working hard to bring those to the world.”




