Elon Musk's artificial intelligence company xAI released its latest language model, Grok 3, on Monday, marking a significant step forward for the company. According to Musk, the new model was trained with ten times the computational power of its predecessor, using a data center in Memphis equipped with approximately 200,000 GPUs.


The Grok 3 line launches with several variants, including a streamlined version that trades some accuracy for speed, and a new "reasoning" model designed specifically to tackle mathematical and scientific problems. Users can toggle these capabilities through the "Thinking" and "Brain" settings in the Grok interface. xAI said this release is not yet final: the model is still being trained, and the team plans further improvements in the coming weeks.

According to data from the AI benchmarking platform lmarena.ai, Grok 3 scored over 1400 in the chatbot arena, leading every category including programming and surpassing models from OpenAI, Anthropic, and Google. Benchmark results do not always reflect real-world performance, however: Claude 3.5 Sonnet, for example, scores lower than some rivals on coding benchmarks, yet many users still consider it the better choice for programming tasks.

Andrej Karpathy, a founding member of OpenAI, gained early access to Grok 3 and praised the model's logical reasoning. Its "Thinking" feature successfully handled complex tasks, such as estimating the training FLOPs of GPT-2 or creating hexagonal grids for board games, abilities previously limited to OpenAI's high-end o1-pro model. The feature also improved accuracy on basic operations that often trip up language models, such as counting letters in a word and comparing decimals.
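The GPT-2 FLOPs question Karpathy posed is commonly approximated with the 6·N·D rule of thumb (roughly 6 FLOPs per parameter per training token). The sketch below illustrates that calculation; the parameter and token counts are illustrative assumptions, not figures from the article, and GPT-2's effective training token count in particular was never published.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Estimate total training compute with the common 6*N*D rule of thumb:
    ~6 FLOPs per parameter per token (forward + backward pass combined)."""
    return 6 * n_params * n_tokens

# Assumed figures for illustration: GPT-2's largest variant has ~1.5B
# parameters; 100B training tokens is a placeholder assumption.
flops = training_flops(1.5e9, 100e9)
print(f"{flops:.1e}")  # ~9.0e+20 FLOPs under these assumptions
```

Under these assumed inputs the estimate lands on the order of 10^21 FLOPs, which is the kind of back-of-the-envelope reasoning the "Thinking" mode was asked to reproduce.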

Regarding the new search functionality, Karpathy found DeepSearch comparable in quality to Perplexity's research tools, returning relevant answers on topics such as upcoming Apple products and Palantir's stock movements. He also identified some clear shortcomings, however: the model sometimes fabricates URLs, makes unsupported claims, and cites posts from X only when specifically prompted to.

DeepSearch also seems unaware of its own existence, failing to recognize xAI's place among the major AI labs. These limitations keep it short of the quality of OpenAI's "deep research," and it performs poorly on questions involving humor and ethics.