Elon Musk just released an AI that’s smarter than ChatGPT — here’s why that matters


Elon Musk’s artificial intelligence startup xAI has unveiled Grok 3, its latest AI model that the company claims outperforms leading competitors across key technical benchmarks. The announcement marks a significant escalation in the race to develop more powerful AI systems.

The launch comes just days after Musk’s failed $97.4 billion bid to acquire OpenAI, the company he co-founded with Sam Altman in 2015. During a livestreamed demonstration on X, Musk characterized Grok 3 as “an order of magnitude more capable than Grok 2” and emphasized its ability to reason through complex problems.

Early testing appears to support some of xAI’s claims. The model topped the influential Chatbot Arena leaderboard, scoring higher than OpenAI’s GPT-4o, Google’s Gemini and DeepSeek’s V3 model in blind user testing. Benchmarks published by xAI show Grok 3 achieving higher scores than those competitors in mathematics (AIME ’24), scientific reasoning (GPQA) and coding tasks.

Grok 3 leads the Chatbot Arena leaderboard with a score of approximately 1400, significantly outperforming other major AI models in blind user testing. (Source: xAI)

Inside Grok 3’s massive computing infrastructure: 200,000 GPUs and a new data center

“Grok 3 clearly has around state of the art thinking capabilities,” wrote former OpenAI researcher Andrej Karpathy in an X post after early-access testing. “Few models get this right reliably. The top OpenAI thinking models get it too, but all of DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude do not.”

The model’s development required massive computational resources. xAI doubled its GPU cluster to 200,000 Nvidia chips for training, housed in a new Memphis data center. This infrastructure investment highlights the increasing computational demands of advanced AI development, as companies race to build more capable systems.

DeepSearch and advanced reasoning: how Grok 3 aims to outsmart ChatGPT and Google Gemini

A key innovation is Grok 3’s “DeepSearch” feature, which combines web searching with reasoning capabilities to analyze information from multiple sources. The system also includes specialized modes for complex problem-solving, including a “Think” function that shows its reasoning process and a “Big Brain” mode that allocates additional computing power to difficult tasks.

“The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other,” posted tech industry veteran Robert Scoble, citing a conversation with Siri co-founder Tom Gruber.

However, some limitations emerged during testing. Karpathy noted that the model sometimes fabricates citations and struggles with certain types of humor and ethical reasoning tasks. These challenges are common across current AI systems and highlight the ongoing difficulties in developing truly human-like artificial intelligence.

Scale AI CEO Alexandr Wang praised the release, posting on X: “Grok 3 is a new best model in the world from the @xai team!” He noted its superior performance on various benchmarks and expressed enthusiasm for future collaboration.

AI industry competition heats up: what Grok 3’s launch means for OpenAI, DeepSeek and the future of artificial intelligence

The model will be available through X’s Premium+ subscription ($40/month) and a new standalone “SuperGrok” service ($30/month). Enterprise API access is planned for the coming weeks.

This launch intensifies competition in the AI industry, particularly as Chinese startup DeepSeek recently demonstrated comparable performance with reportedly lower computational requirements. The development also raises questions about the sustainability of the computational arms race in AI, as companies invest billions in increasingly powerful hardware infrastructure.

In key performance benchmarks, Grok 3 and its mini variant show superior scores across mathematics, science and coding tests compared to competing models from Google, OpenAI, Anthropic and DeepSeek. The full-size Grok 3 model (dark blue) achieved particularly strong results in scientific reasoning. (Source: xAI)

Musk emphasized that Grok 3 remains in beta, with improvements expected “almost every day.” The company plans to add voice interaction capabilities within weeks and will open-source its previous model, Grok 2, once the new version stabilizes.

Yet perhaps the most telling aspect of Grok 3’s debut isn’t its technical specifications or benchmark scores, but what it represents: the mounting tension between Musk and his former colleagues at OpenAI. Just days after his failed $97.4 billion bid to acquire OpenAI, Musk has unveiled a model that challenges its supremacy, a sign that in the high-stakes race for AI dominance, even a rejected suitor can become a formidable rival.


