Microsoft brings AI image generation to Copilot, adds new model Deucalion


In a startling move, Microsoft today announced a redesigned look for its Copilot AI search and chatbot experience on the web (formerly known as Bing Chat), new built-in AI image creation functionality, and a new AI model, Deucalion, that is powering one version of it.

In addition, the Redmond, Washington-headquartered software and cloud giant unveiled a new video ad that will air during this coming Sunday’s Super Bowl, the NFL championship game between the Kansas City Chiefs and the San Francisco 49ers.

The redesign is arguably the least interesting part of today’s announcements. Microsoft has given its Copilot landing page on the web a cleaner look, with more white space and less text. It has also added more imagery in the form of a visual carousel of “cards” that show different AI-generated images as examples of what the user can make, plus samples of the “prompts,” or instructions, that the user could type in to generate them.

Here are images of the old Bing Chat and the new Microsoft Copilot design, one after another, for you to compare:

Bing Chat initial design circa July 2023. Credit: VentureBeat screenshot/Microsoft
Microsoft Copilot redesign circa February 2024. Credit: VentureBeat screenshot/Microsoft

The new Copilot is available publicly for all users at “copilot.microsoft.com and our Copilot app on iOS and Android app stores,” though the AI image generation features are for now available only “in English in the United States, United Kingdom, Australia, India and New Zealand.”

A savvy pro-AI Super Bowl ad

The Super Bowl is, of course, one of the most widely viewed sporting events each year in the United States and the world, and the price to air an ad nationally during it runs into the millions of dollars even for a short spot of just 30 or so seconds.

That’s not such a big cost for Microsoft, given its position lately as one of the most valuable companies in the world by market capitalization, if not the most valuable (the lead spot fluctuates pretty regularly). But it does indicate how serious the company is about bolstering the Copilot name and its associations with generative AI as a technology more generally. The aim is to convince “Main Street,” to use the colloquialism for average U.S. residents, that they should be using Copilot for their web searching instead of, say, Google.

But the ad goes even further than this: if you watch it (embedded above), you’ll see it quickly but effectively shows people using Copilot to “generate storyboard images” for scenes in a movie script, as well as “code for my 3D open world game.”

The message from Microsoft here is clear: Copilot can do much more than just search. It can create content and even software for you.

Emphasizing content creation in film/TV, video gaming, entertainment — even amid resistance and deepfake scandals

The emphasis on targeting the entertainment industry is notable as well at a time when many actors, writers, performers, musicians, VFX artists and even game makers are openly resisting and calling for more protections against AI taking away their work opportunities.

Microsoft’s ad pretty clearly and cleanly brushes past these objections, in my view, positioning Copilot and AI more generally as a creative tool for up-and-coming strivers.

It’s also notable that Microsoft is bringing new AI image generation capabilities directly to Copilot, which its release says is powered by its Designer AI art generator, similar to how OpenAI’s DALL-E 3 image generation AI model has been baked into ChatGPT.

Designer AI is, of course, also powered by DALL-E 3, thanks to Microsoft’s big investment in and support for OpenAI. As Microsoft’s news release, authored by executive vice president and consumer marketing chief Yusuf Mehdi, states:

With Designer in Copilot, you can go beyond just creating images to now customize your generated images with inline editing right inside Copilot, keeping you in the flow of your chat. Whether you want to highlight an object to make it pop with enhanced color, blur the background of your image to make your subject shine, or even reimagine your image with a different effect like pixel art, Copilot has you covered, all for free.  If you’re a Copilot Pro subscriber, in addition to the above, you can also now easily resize and regenerate images between square and landscape without leaving chat. Lastly, we will soon roll out our new Designer GPT inside Copilot, which offers an immersive, dedicated canvas inside of Copilot where you can visualize your ideas.

Microsoft is plowing ahead with its AI image generation capabilities, trying to make them even more accessible to users across mobile and desktop. It is doing so even amid the scandal that erupted late last month when explicit, nonconsensual AI-generated deepfakes of musician Taylor Swift (who is expected to appear at the Super Bowl in support of her NFL player boyfriend) circulated on social platforms and the web, allegedly created with Microsoft’s Designer AI generator. That scandal followed more local deepfake incidents in at least one U.S. high school.

The company seems outwardly unconcerned about any criticisms of AI being misused, as well as unbothered by the lawsuit and federal investigation it is facing from various parties over its use of AI and alliance with OpenAI.

A new AI model emerges: Deucalion

Buried amid the announcements today — actually, not even mentioned in the release itself — is the fact that Microsoft has added a new AI model under the hood of one version of Copilot: Deucalion.

According to a post on X (formerly Twitter) from Microsoft Corporate Vice President and Head of Engineering for Copilot and Bing, Jordi Ribas, the company has “shipped Deucalion, a fine tuned model that makes Balanced mode…richer and faster.”

“Balanced” mode is the middle of the three response styles Copilot can produce. Users can select “Creative,” “Balanced” or “Precise” mode for their responses from Copilot (and previously, Bing Chat). The choice determines whether the AI assistant provides more or less of its own generated output, and the more creative the mode, the more hallucinations tend to appear.

However, the “Creative” mode can be more effective for those seeking not specific facts but help with, as the name indicates, creative, open-ended projects such as fictional worldbuilding, writing, and designing.

For those doing research for school or work, the “Precise” and “Balanced” modes are probably a better bet. Of course, “Balanced,” as the name indicates, seeks to split the difference and offer users equal parts creativity and factual precision.

Now, the big question is what Deucalion is based on. Bing Chat itself was powered by OpenAI’s GPT-4, the model underlying ChatGPT Plus/Team/Enterprise, so it stands to reason that GPT-4 and GPT-4 Turbo/V continue to power Copilot.

However, is Deucalion based on GPT-4, or another model, say Microsoft’s Phi-2? The fact that Ribas said it was “a fine tuned model” makes me think it is a version of GPT-4 that has been further tweaked by Microsoft engineers for their purposes. OpenAI does support fine-tuning of GPT-4, according to its documentation.

Documentation for Deucalion is pretty scarce right now from what I’ve seen, but Mikhail Parakhin, Microsoft’s CEO of Advertising and Web Services, posted on X last week that the company was testing it and that it was named after the son of Prometheus in Greek mythology.

Parakhin’s X post was actually a response to third-party Windows developer and tech influencer Vitor de Lucca, who noticed that the answers provided in Copilot’s Balanced mode were “better and bigger.”

De Lucca further posted on X yesterday that the translation capabilities of the new Deucalion-powered Balanced mode were also superior to those of the Creative mode.

We’ve reached out to our Microsoft spokesperson contacts and tweeted at Ribas for more information about Deucalion, and will update when we hear back.
