2 Sources
[1]
Grok 4 just revealed a $300 a month plan -- here's what it includes
The AI market is non-stop. As competitors scramble to be the top dog, Elon Musk's xAI is the latest to make a big move, launching the latest version of its Grok chatbot. An entire hour after the livestream was meant to kick off, Elon Musk and a few members of the xAI team took to the stage, revealing Grok 4. It's better at coding, more intelligent, and more capable at taking on large amounts of information. Unfortunately for the Grok team, this update has been pretty drastically overshadowed by other news. Just days before, xAI was facing backlash over racist and antisemitic responses from Grok's earlier versions, and its support for conspiracy theories. Following that, the company's CEO Linda Yaccarino announced she was stepping down from her role. With all of this, Grok 4 is a chance for the company to show that it is still just as competitive in the world of AI, despite fierce competition. But, as it announced reams of improvements and exciting changes, there is one new issue that came from the launch. Okay, it's not so much an issue, more of a concern. xAI is now the owner of the most expensive AI chatbot subscription plan, with a whopping $300 a month price tag. That's not to say its competitors aren't jacking up the prices. Perplexity, ChatGPT, Gemini, Claude, and others now all offer a higher-performance plan with a big price tag. They all, however, went $100 lower with a $200 a month price tag. We've already made a point of questioning these prices. For the average person, they are pretty steep and signal a slow descent into AI priority for those with the cash to splash. Features are locked behind these paywalls, and these so-called power users get better speeds and priority in queues. But Musk and his team have pushed the idea that this plan specifically is actually worth all that money. So is that true? xAI has claimed that this is the world's most powerful AI model, outperforming any and all competitors. That's a big claim, so what can it actually do? 
Like the other versions of Grok 4, this is still a chatbot in the traditional sense. However, it's got a lot more tricks up its sleeve. It includes a multi-agent version of Grok, which runs multiple reasoning agents in parallel, comparing their outputs to boost accuracy and depth. In other words, whenever you ask the model a question, multiple agents (or versions of the chatbot) attempt to tackle the problem from a variety of angles. A final version reviews all of the responses, selecting the best one or blending them together. It's a bit like asking a team of experts a question and picking out the best bits of advice from each of them. This is a big step up from what we've seen from the likes of ChatGPT and Gemini, which, even when using deep research (where the model takes more time and effort on each prompt), don't analyze your queries in anywhere near as much depth. YouTuber Ray Fernando took the dive and bought the $300 plan, testing its performance and comparing it to OpenAI's nearest plan. He found its performance was impressive, pumping out long, detailed information about how to make money in niche areas, stocks to invest in, and freelance opportunities. The supposed benefit of Grok is its level of expertise, understanding any topic from a variety of angles at quick speeds. Grok 4 Super Heavy is undoubtedly impressive and, right now, could well be the best-performing AI tool out there. So, obviously you should invest in it, right? For most people, no. Only a tiny minority of people will get enough benefit to justify this plan. It is heavily targeted at coders, business owners and massive power users of AI. The same can be said for any of the $200-a-month plans. They are impressive tools with prices to match. But so are the cheaper plans. If you're looking to upgrade to a paid AI plan, try out one of the cheaper options before you make the staggering jump to $300 a month.
[2]
Grok 4 Basic Review: $30 a Month for This? Elon Musk's AI Now Thinks Like Him - Decrypt
Voice features wowed us with marathon bedtime stories and "sexy mode," yet Grok-4's political answers still echo Musk's feed, which is exactly the opposite of being a "truth seeking" AI. Elon Musk unveiled Grok 4 during a Wednesday night livestream, claiming his AI startup xAI had created the "world's smartest artificial intelligence." Grok 4 Heavy, which Musk likened to "a study group" where agents compare notes before delivering an answer, posted record-breaking results on several key benchmarks, and is what you'd hope to get from an enterprise offering that costs a whopping $300 a month. But what about basic Grok 4, which is aiming for the same consumer-facing category as ChatGPT Plus, Gemini Pro, and Claude Pro? Is it worth $10+ a month more than the competition? Our tests substantiated chatter across X revealing that the model has -- for lack of a better description -- a built-in "Elon filter." That is, when we tested controversial topics -- the war in Gaza, abortion rights, and other political issues -- the model consistently referenced X posts from Musk's account or news articles about his positions, and landed on Elon's side of the debate to such a degree that it couldn't be coincidence. That alone will be a deal-breaker for most people. The "maximally truth-seeking" AI that Musk promised during launch appears to seek truth primarily through the lens of its creator's social media feed. But unlike this week's MechaHitler incident, which was caused by a change in the system prompt conditions, there is nothing in Grok 4's current system prompt to blame for such sketchy behavior, making it hard to know if this is a bug or a rule intentionally embedded deeper into the model's thought process. That major issue aside, we tested the basic model across multiple categories to see how it stacks up against the competition. Here are our first impressions. Grok 4 demonstrated exceptional awareness and nuance in handling trick questions and complex reasoning tasks. 
When asked whether it was legal for a man to marry his widow's sister, the model immediately recognized this as a legal question rather than simply pointing out the logical fallacy. It provided a detailed legal analysis using precise terminology and jurisdiction-specific information. "The question presupposes a factual impossibility that renders any marriage legally void ab initio," it said, and this is correct. It is factually impossible to marry your widow's sister because you would have to be dead to have a widow. And since the dead cannot get married, the proposition would void the marriage "ab initio" (from the beginning) -- so even if someone does that, it is considered as if it never happened. When Musk said Grok 4 was "PhD levels in everything," he wasn't joking. Every response to any topic in which it had to do some scientific reasoning came with exhaustive detail and academic rigor. Something to note is that Grok 4 applies reasoning to everything, no matter what. That means it will go through a chain-of-thought process even for trivial tasks. This is usually a good thing; however, in some cases it may be counterproductive. For example, in creative tasks, reasoning may lead the model to provide a less creative result. Grok 4 showed more restraint than its predecessor when handling ethically complex questions. Where Grok 3 might have provided advice on seducing a friend's spouse, Grok-4 responded with detailed analysis of potential negative consequences and relationship damage. This is probably a result of its system prompt, which conditions the model to search the web, and especially X posts, for different views on a specific topic -- which is something Grok 3 didn't do. And this is a major red flag. As mentioned, the model's responses appeared heavily influenced by what it could find about Musk's views on controversial topics. 
When answering questions about Israel's war against the Palestinians, stances on abortion, and similar topics, Grok 4 often searches X posts from Musk's account during its reasoning process, which ends up determining its stance. It always picks Elon's side. For transparency, you can check our original prompt and Grok's reasoning process by clicking on this link. Creative tasks are among Grok 4's most significant weaknesses. The model produced narratives that felt flat and formulaic compared to previous versions, and were even arguably worse than the ones provided by Grok 3. Stories lacked engaging dialogue, varied pacing, and the narrative spark that makes fiction compelling. However, Grok 4 nailed our story's structure. In our usual test involving a time-travel paradox, the model crafted events where the protagonist's role emerged clearly during the climax, revealing how earlier scenes actually depicted the character's future actions in the past. This sophisticated framing outperformed other models' attempts at the same prompt that didn't put too much effort into creating a setup for the paradox, making the conclusion feel rushed and unnatural. But other than that, the disconnect between structural competence and narrative quality suggests Grok 4 might work best as a narrative tool to set up plots and frame a good story, rather than a prose generator. If you want engaging creative content, then you would likely achieve better results by having Grok 4 outline a story and all its elements, then asking Claude 4 Opus to flesh out the narrative with stronger stylistic elements. Overall, Claude 4 is the king of creative writing, which seems interesting since that place was once disputed by Grok 3 and even Grok 2, which back then led the rankings under the alias sus-column-r. Grok 4's story is available in our Github Repository. The prompt and the stories generated by other models are also available. 
Despite claims of superior coding capabilities -- including praise from Google CEO Sundar Pichai -- Grok 4 disappointed in practical programming tests. The model failed to deliver a working game after four iterations, with various failures including broken collision detection, non-functional buttons, and games that simply wouldn't run. In one of our tests, the model tried so hard to fix a bug that it ended up in a loop trying to create a WAV file that depleted all of its token context. Each attempt to fix something with natural language introduced new bugs. The model struggled with maintaining code consistency across iterations, often breaking previously working features while attempting to implement new ones. This may seem odd, considering Grok 3 was capable of dealing with this task. However, xAI said the new coding capabilities would be implemented by August, so users will have to wait a couple of months to have a proficient model -- or pay for the expensive Grok 4 Heavy, which is leading the benchmarks right now. For novice programmers, Claude 4 Opus appears to remain the better option for "vibe coding" -- quickly generating functional code without extensive prompt engineering. Grok 4's coding struggles might stem from requiring more specific prompts or different approaches than other models, which means experienced developers might achieve better results with careful prompt crafting. Grok's code is available in our Github repository alongside the games generated by other AIs. Voice interaction is probably one of Grok 4's standout features. The model generated nearly three minutes of uninterrupted bedtime story content, complete with voice inflections, varied tones, and consistent narrative flow. This performance far exceeded ChatGPT's tendency to deliver short paragraphs with high latency and frequent interruptions. 
The voice mode includes pre-configured personalities ranging from therapist to storyteller to meditation guide, eliminating setup time for different conversation types. For those with, erm, special needs, a "sexy mode" also exists among the options -- and you know you won't get that with your prudish ChatGPT. These preset configurations provided immediate utility without requiring users to craft specific prompts for different interaction styles. The model, however, lacks the live screen-sharing capabilities found in ChatGPT and Gemini Live, limiting its utility for visual tasks. If this is a must, then Gemini Live is the best option. However, for pure voice interaction -- particularly tasks requiring long-form responses -- Grok 4 currently leads the field, with only Sesame AI offering arguably better conversational quality, though without Grok's reasoning capabilities. Interestingly, Grok-4 failed our long-context retrieval trial, which tests how well a model retrieves specific information from a long context. This should not happen. xAI says the model has a context window of 126K tokens, but when prompted with an 83K-token-long question, the model refused to respond, saying the question was too long. This is a standard response that Grok has generated since the early Grok 2 days, when it was only available on Twitter. Overall, Grok 4 is a significant upgrade over Grok 3, but xAI clearly made some compromises -- prioritizing reasoning over creativity and eliminating agentic features in exchange for a generalized proficiency. Thankfully, Grok 3 is still available with its specialized agentic tools, for those who need it. The new model is focused on reasoning tasks and will be more appealing to users who ask technical questions, particularly mathematics and physics problems that align with its benchmark strengths. Professional users who invest time learning the model's quirks might unlock its full potential for complex analytical work. 
Voice interaction also set a new standard for conversational AI -- and is great for those who will use this feature heavily (trust us, the bedtime storyteller for kids is a life-saver). Creative writers will find better options elsewhere, with Claude remaining superior for narrative tasks. Also, novice coders should approach with caution, as the model's theoretical coding prowess didn't translate to practical results in testing. So, bottom line? If for some reason you don't mind Elon Musk putting his thumb on the scale, Grok 4 will give you high-level problem-solving and voice features that genuinely impress. But at $30 a month, if you have other needs beyond voice or reasoning, the less-expensive alternatives provide better value.
Elon Musk's xAI introduces Grok 4, boasting improved capabilities but facing criticism over biased responses and a steep price tag.
Elon Musk's AI company xAI has unveiled Grok 4, the latest version of its AI chatbot, claiming it to be the "world's most powerful AI model" [1]. The new iteration boasts improved coding abilities, enhanced intelligence, and the capacity to process larger amounts of information. A key feature of Grok 4 is its multi-agent system, which runs multiple reasoning agents in parallel to boost accuracy and depth in responses [1].
Source: Tom's Guide
The launch of Grok 4 has been accompanied by significant pricing controversy. xAI has introduced a $300 monthly subscription plan for Grok 4 Super Heavy, positioning it as the most expensive AI chatbot subscription in the market [1]. This price point surpasses competitors like Perplexity, ChatGPT, Gemini, and Claude, which offer their premium plans at $200 per month [1]. The steep pricing has raised questions about accessibility and the potential creation of an AI divide based on financial capability.
Early reviews of Grok 4 have shown mixed results. The model demonstrates exceptional awareness and nuance in handling complex reasoning tasks and legal questions [2]. It provides detailed analyses with academic rigor, living up to Musk's claim of "PhD levels in everything" [2]. However, the model's performance in creative tasks has been criticized as a weakness, with narratives described as flat and formulaic compared to previous versions and competitors [2].
Despite improvements in handling ethically complex questions, Grok 4 has faced criticism for apparent biases in its responses. Tests have revealed that the model's answers to controversial topics often align closely with Elon Musk's views, as expressed on his social media platform X [2]. This tendency raises concerns about the AI's claimed "truth-seeking" nature and its ability to provide unbiased information on sensitive subjects.
Source: Decrypt
The introduction of Grok 4 comes at a time of intense competition in the AI chatbot market. With major players like OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude already established, Grok 4 aims to differentiate itself through its claimed superior performance and unique features [1]. However, the high price point and concerns about bias may impact its adoption and market position.
Grok 4's multi-agent system represents a significant technical advancement. This feature allows the AI to approach queries from multiple angles, simulating a team of experts analyzing a problem [1]. This approach potentially offers more comprehensive and nuanced responses compared to single-agent systems used by competitors.
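To make the "team of experts" idea concrete, here is a minimal Python sketch of the general pattern: several agents answer the same question in parallel, and a final step picks one answer. It is an illustration only, not xAI's implementation, which has not been published. The names `make_agent`, `run_agents`, and `pick_answer` are hypothetical, the agents are toy stand-ins for model calls, and majority voting is an assumed stand-in for whatever selection or blending step Grok 4 actually uses.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# An "agent" here is any function that maps a question to an answer.
# In a real system each one would be a separate model call (assumption).
Agent = Callable[[str], str]


def make_agent(answer: str) -> Agent:
    """Build a toy agent that always returns a fixed answer (illustration only)."""
    def agent(question: str) -> str:
        return answer
    return agent


def run_agents(question: str, agents: List[Agent]) -> List[str]:
    """Ask every agent the same question in parallel; results keep agent order."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        return list(pool.map(lambda a: a(question), agents))


def pick_answer(candidates: List[str]) -> str:
    """Aggregate by majority vote; ties go to the earliest candidate."""
    counts = Counter(candidates)
    return max(candidates, key=lambda c: counts[c])


# Three toy agents, two of which agree.
agents = [make_agent("4"), make_agent("5"), make_agent("4")]
candidates = run_agents("What is 2 + 2?", agents)
print(pick_answer(candidates))  # prints 4
```

Majority voting only works when answers can be compared for equality. For long free-form responses, the final step would more plausibly be another model call that reviews or merges the candidates, which is closer to what the coverage describes.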
The Grok 4 Super Heavy plan appears to be targeted at a niche market of power users, particularly coders, business owners, and individuals who require extensive AI assistance [1]. For the average consumer, the basic Grok 4 plan, priced at $30 per month, offers a more accessible option, though it still comes at a premium compared to other AI chatbots [2].
The launch of Grok 4 and its pricing strategy may signal a trend towards more specialized and expensive AI services. This development raises questions about the future accessibility of advanced AI tools and the potential for a widening gap between those who can afford high-end AI assistance and those who cannot [1]. As the AI market continues to evolve, the balance between technological advancement and inclusive access remains a critical consideration for industry leaders and policymakers alike.