Grok 4 is crushing it — Elon Musk’s AI just topped the leaderboard that matters most

When it comes to chatbots, it's easy to forget about Grok because other big tech companies always seem to be in the news. With Google's Nano Banana starting new trends and OpenAI hyping its latest ChatGPT models, Elon Musk's chatbot simply exists in the background.
I've definitely found myself rolling my eyes at some of Grok's decisions, especially when it comes to image generation. However, it's clear that there are some reasons to sit in awe of what Elon Musk calls “the smartest AI in the world.”
As someone who has spent hours testing it, I can say it's not just hype. From near-instant web searches to jaw-dropping results on complex engineering queries, Grok 4 is delivering in ways its predecessors and rivals haven’t quite managed. Whether you love the direction or cringe at the controversies, Grok 4 may always be the underdog that quietly crushes it.
What makes xAI's Grok different
I now think @xAI has a chance of reaching AGI with @Grok 5. Never thought that before. https://t.co/FaBUYegl3D (September 17, 2025)
Elon Musk posted on X highlighting that Grok 4 is at the top of the ARC-AGI leaderboard. To understand why that's impressive, it's important to become familiar with how models are tracked on it.
Essentially, the ARC-AGI leaderboard is a scoreboard for AI that tracks not only how many problems a model can solve, but also how efficiently it solves them. In other words, it measures both the brain and the resourcefulness of the model. High performance with low cost per task is what matters most.
So, Grok's position at the very top is extremely significant because it means the xAI model is not only keeping up with rivals like Gemini and ChatGPT, but outpacing them on some of the toughest benchmark criteria possible.
Beating every other chatbot suggests that Grok 4 is powerful and efficient, which is exactly the type of breakthrough that supports true progress in the evolution of artificial general intelligence (AGI).
Where Grok still stumbles
Whether used on X or on the standalone platform, real-time search pulls in fresh information from both the web and X, so it can keep up with breaking news at a moment's notice.
However, the accuracy and bias concerns are what critics keep coming back to. Grok has made some claims that turned out false, and there are questions about how its alignment is being guided (e.g. how much Musk’s own views factor in).
The model also struggles with content moderation: xAI scrambled to pull posts and update filters when antisemitic content popped up.
The takeaway
Despite the model beating its rivals, questions remain: Will it stay reliable as usage increases? Will “garbage data” or bias creep back in under pressure? How well will xAI handle moderation long-term? The past controversies suggest it’s an ongoing battle.
There is no doubt that Grok is not perfect. It carries some extremely controversial baggage, but the proof of what it does better in terms of speed, real-time data and flexible thinking makes it a serious contender in the AI race.

Amanda Caswell is an award-winning journalist, bestselling YA author, and one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.
Known for her ability to bring clarity to even the most complex topics, Amanda seamlessly blends innovation and creativity, inspiring readers to embrace the power of AI and emerging technologies. As a certified prompt engineer, she continues to push the boundaries of how humans and AI can work together.
Beyond her journalism career, Amanda is a bestselling author of science fiction books for young readers, where she channels her passion for storytelling into inspiring the next generation. A long-distance runner and mom of three, Amanda’s writing reflects her authenticity, natural curiosity, and heartfelt connection to everyday life — making her not just a journalist, but a trusted guide in the ever-evolving world of technology.