Grok 3 and The New AI Landscape

Feb 19, 2025

Grok 3 was released on Feb 17, 2025, and has demonstrated astonishing performance. In the Chatbot Arena, a live competition, it ranked first in every category, with an Elo score that outpaced all competitors.

On the standard math, reasoning, and coding benchmarks, Grok 3 surpassed other reasoning models. xAI didn’t show results for MMLU, so it’s not clear how Grok 3 did there.

Grok 3 wasn’t directly compared to OpenAI’s full o3 model, which was announced on December 20, 2024, and showed impressive performance against all previous models.

However, OpenAI has yet to make o3 publicly available. For now, that means Grok 3 remains the best model available for users.

Grok 3 has fundamentally reshaped the AI landscape. Until now, there were two dominant players in the closed-source model space: OpenAI and Google. Anthropic could be considered a third contender, but its performance lags behind the two major players, who continue to push out new models. That dynamic has now been disrupted. The newcomer, xAI, is taking the lead and asserting its dominance. Elon Musk even stated, "We are still improving, and it [Grok 3] will get better every day."

The success of Grok 3 is more of an engineering feat than an algorithmic breakthrough. Its architecture is very similar to that of OpenAI’s models and Google’s Gemini. Reasoning through test-time compute was pioneered by OpenAI’s o1 model. Now, every major lab has added reasoning (or inference-time compute) to its models: DeepSeek R1, OpenAI’s o3, and Google’s Gemini 2.0 Flash Thinking all follow this trend.

What sets Grok 3 apart is its training resources and data. It was trained on 200,000 GPUs at xAI’s Memphis facility. Installing the first 100,000 GPUs took the team 122 days—a speed that is unheard of. Musk mentioned that they initially tried outsourcing the data center to other companies but were told it would take 18 to 24 months to complete. Instead, he decided to build it in-house, and his team got it done in just four months. They then installed the next 100,000 GPUs in another 92 days (three months). This unprecedented GPU capacity gives Grok 3 the training resources to rapidly learn and improve. The scaling law of LLMs is simple: more compute, more training tokens (and a larger model) lead to better performance. This time, Elon Musk is playing with no holds barred, aiming to build the most powerful AI model in the industry.
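
To make the scaling claim concrete, here is a minimal sketch using one published parametric form, the Chinchilla fit from Hoffmann et al. (2022): loss = E + A / N^alpha + B / D^beta, where N is the parameter count and D is the number of training tokens. The constants are that paper's fitted values for its own models and data. They are an assumption here and say nothing about Grok 3's actual numbers, which xAI has not published. The function names and the example model sizes are hypothetical.

```python
# Chinchilla-style parametric loss, after Hoffmann et al. (2022).
# The constants are the fitted values reported in that paper, used only
# for illustration. They are NOT measurements of Grok 3.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def estimated_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pre-training loss for n_params parameters and n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def training_flops(n_params: float, n_tokens: float) -> float:
    """Common rule of thumb: roughly 6 FLOPs per parameter per training token."""
    return 6.0 * n_params * n_tokens


if __name__ == "__main__":
    # Hypothetical model sizes, each 10x larger than the last in both
    # parameters and tokens (so about 100x more compute per step).
    for n, d in [(7e9, 1.4e11), (7e10, 1.4e12), (7e11, 1.4e13)]:
        print(
            f"{n:.0e} params, {d:.1e} tokens -> "
            f"loss {estimated_loss(n, d):.3f}, FLOPs {training_flops(n, d):.1e}"
        )
```

Under this fit, the predicted loss keeps falling as parameters and tokens grow, but it approaches the floor E, so each additional order of magnitude of compute buys a smaller improvement than the one before.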

In fact, Musk expects their electricity needs to increase fivefold. If this corresponds to the number of GPUs, it means they plan to house 1 million GPUs (a figure Musk hinted at in a previous interview). This scale would dwarf all other companies. Grok 3 will soon surpass all competitor models in intelligence.
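
The arithmetic behind that estimate can be sketched as follows. It assumes power draw scales linearly with GPU count, which is a simplification that ignores changes in per-GPU power, cooling, and networking overhead. The input figures are the ones quoted in this post; the function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in this post.
# Assumption: electricity demand scales linearly with the number of GPUs.
CURRENT_GPUS = 200_000
POWER_GROWTH_FACTOR = 5


def projected_gpus(current_gpus: int, power_factor: int) -> int:
    """GPU count implied by a power increase, under the linear assumption."""
    return current_gpus * power_factor


def gpus_per_day(gpus: int, days: int) -> float:
    """Average installation rate over a build-out phase."""
    return gpus / days


print(projected_gpus(CURRENT_GPUS, POWER_GROWTH_FACTOR))  # 1000000
print(round(gpus_per_day(100_000, 122)))  # first phase: 820 GPUs per day
print(round(gpus_per_day(100_000, 92)))   # second phase: 1087 GPUs per day
```

The second phase works out to roughly a third more GPUs installed per day than the first.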

Grok 3 is available through the X platform and its own website, grok.com. I recently started using the website (which was running Grok 2) and found its responses—both in coding and understanding recent events—to be better than OpenAI’s GPT-4. Now, with Grok 3, we can expect a significant leap in answer quality.

Given how quickly the xAI team is iterating, with daily updates to the Grok 3 models, they are on track to dominate the AI application landscape. Currently, they offer both a free and a paid version, similar to OpenAI’s business model. Users can access the model via a web UI or API calls, also mirroring OpenAI’s approach. Musk is pushing for enterprise applications, which is where the revenue comes from. They will likely ramp up enterprise support soon.

With a lower price and better performance, xAI has the potential to take a significant market share from its competitors. By the end of 2025, I expect xAI’s market share to be on par with OpenAI’s and Google’s. Given xAI’s faster pace of innovation, it will likely dominate the market in 2026 and become the largest AI company (hold me accountable for this prediction).

xAI’s growth aligns with the rise of Tesla robots. Grok will learn from robotic data, and the mass production of Tesla robots—another major business goal for Musk—will provide even more training data and intelligence to Grok. This will give Grok real-world intelligence, making it far superior to any other company’s AI models.

The unveiling of Grok 3 has intensified the AI race. Personally, I think the winner is already clear, and by the end of the year, the dust will settle. For many, this might not be obvious—after all, AI still feels new, with ChatGPT having launched just over two years ago in late 2022. The brand power of OpenAI and Google remains strong. But for AI insiders, LLMs have evolved through so many iterations and public releases that the field already feels mature. Now, xAI has stepped onto the stage as another heavyweight, challenging OpenAI and Google for dominance.

AI Frontiers
