Elon Musk just released an AI that’s smarter than ChatGPT — here’s why that matters


Elon Musk’s artificial intelligence startup xAI has unveiled Grok 3, its latest AI model that the company claims outperforms leading competitors across key technical benchmarks. The announcement marks a significant escalation in the race to develop more powerful AI systems.

The launch comes just days after Musk’s failed $97.4 billion bid to acquire OpenAI, the company he co-founded with Sam Altman in 2015. During a livestreamed demonstration on X, Musk characterized Grok 3 as “an order of magnitude more capable than Grok 2” and emphasized its ability to reason through complex problems.

Early testing appears to support some of xAI’s claims. The model topped the influential Chatbot Arena leaderboard, scoring higher than OpenAI’s GPT-4o, Google’s Gemini and DeepSeek’s V3 model in blind user testing. Published benchmarks show Grok 3 achieving superior scores in mathematics (AIME ’24), scientific reasoning (GPQA) and coding tasks.

Grok 3 leads the Chatbot Arena leaderboard with a score of approximately 1400, significantly outperforming other major AI models in blind user testing. (Source: xAI)

Inside Grok 3’s massive computing infrastructure: 200,000 GPUs and a new data center

“Grok 3 clearly has an around state of the art thinking model,” wrote former OpenAI researcher Andrej Karpathy in an X post after early-access testing. Referring to a Settlers of Catan test prompt that the model handled well out of the box, he added: “Few models get this right reliably. The top OpenAI thinking models get it too, but all of DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude do not.”

The model’s development required massive computational resources. xAI doubled its GPU cluster to 200,000 Nvidia chips for training, housed in a new Memphis data center. This infrastructure investment highlights the increasing computational demands of advanced AI development, as companies race to build more capable systems.

I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.
Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model (“Think” button) and did great out of the box on my Settler’s of Catan… pic.twitter.com/qIrUAN1IfD
— Andrej Karpathy (@karpathy) February 18, 2025

DeepSearch and advanced reasoning: how Grok 3 aims to outsmart ChatGPT and Google Gemini

A key innovation is Grok 3’s “DeepSearch” feature, which combines web searching with reasoning capabilities to analyze information from multiple sources. The system also includes specialized modes for complex problem-solving, including a “Think” function that shows its reasoning process and a “Big Brain” mode that allocates additional computing power to difficult tasks.

“The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other,” posted tech industry veteran Robert Scoble, citing a conversation with Apple Siri cofounder Tom Gruber.

Grok 3 benchmarks.
The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other.
Who said that?
Apple Siri cofounder Tom Gruber. He told me at dinner a decade ago that that is the most important thing to pay attention to. pic.twitter.com/yWCiJsN9pU
— Robert Scoble (@Scobleizer) February 18, 2025

However, some limitations emerged during testing. Karpathy noted that the model sometimes fabricates citations and struggles with certain types of humor and ethical reasoning tasks. These challenges are common across current AI systems and highlight the ongoing difficulties in developing truly human-like artificial intelligence.

Scale.ai CEO Alexandr Wang praised the release, tweeting: “Grok 3 is a new best model in the world from the @xai team!” He noted its superior performance on various benchmarks and expressed enthusiasm for future collaboration.

Grok 3 is a new best model in the world from the @xai team!
Grok 3 ranks #1 on Chatbot Arena w/a big gap, and scores impressively on pretraining and reasoning evals.
congrats to @elonmusk @ibab @jimmybajimmyba @Yuhu_ai_
looking forward to more partnership on grok4 & beyond pic.twitter.com/BrPGz17P51
— Alexandr Wang (@alexandr_wang) February 18, 2025

AI industry competition heats up: what Grok 3’s launch means for OpenAI, DeepSeek and the future of artificial intelligence

The model will be available through X’s Premium+ subscription ($40/month) and a new standalone “SuperGrok” service ($30/month). Enterprise API access is planned for the coming weeks.

This launch intensifies competition in the AI industry, particularly as Chinese startup DeepSeek recently demonstrated performance comparable to leading models with reportedly lower computational requirements. It also raises questions about the sustainability of the computational arms race in AI, as companies invest billions in increasingly powerful hardware infrastructure.

In key performance benchmarks, Grok 3 and its mini variant show superior scores across mathematics, science and coding tests compared to competing models from Google, OpenAI, Anthropic and DeepSeek. The full-size Grok 3 model (dark blue) achieved particularly strong results in scientific reasoning. (Source: xAI)

Musk emphasized that Grok 3 remains in beta, with improvements expected “almost every day.” The company plans to add voice interaction capabilities within weeks and will open-source its previous model, Grok 2, once the new version stabilizes.

Yet perhaps the most telling aspect of Grok 3’s debut isn’t its technical specifications or benchmark scores, but what it represents: the mounting tension between Musk and his former colleagues at OpenAI. Just days after his failed $97.4 billion bid to acquire OpenAI, Musk has unveiled a model that challenges its supremacy — suggesting that in the high-stakes race for AI dominance, even a rejected suitor can become a formidable rival.

About The Author

Heather Quick
