Large language models (LLMs) such as the GPT series have demonstrated remarkable capabilities in language understanding, reasoning, and planning, reaching human-level performance on a range of challenging tasks thanks to training on vast datasets. Most research focuses on enhancing these models by training them on ever-larger datasets, aiming to build more powerful foundation models.

However, while training more powerful foundation models is crucial, the researchers argue that enabling models to continuously evolve during the inference phase, a process known as AI self-evolution, is equally vital for AI development. Unlike training on massive datasets, self-evolution may require only limited data or interactions.


Inspired by the columnar organization of the human cerebral cortex, the researchers hypothesize that AI models can develop emergent cognitive abilities and construct internal representational models through iterative interaction with their environment.

To achieve this, the researchers propose that models must possess long-term memory (LTM) to store and manage data processed from real-world interactions. LTM captures the long-tail, individual-level data that statistical models tend to underrepresent, and it supports self-evolution by preserving diverse experiences across environments and agents.
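To make the idea concrete, here is a minimal sketch of what such an LTM store could look like in Python. Everything in it (the MemoryEntry and LongTermMemory names, the word-overlap retrieval) is invented for illustration; the paper does not specify Omne's storage design, and a real system would use learned embeddings rather than token overlap.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MemoryEntry:
    """One processed interaction record stored in long-term memory."""
    content: str
    tags: set[str]
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class LongTermMemory:
    """Toy LTM store: keeps refined interaction records and retrieves
    the ones most relevant to a new query."""

    def __init__(self) -> None:
        self.entries: list[MemoryEntry] = []

    def write(self, content: str, tags: set[str]) -> None:
        self.entries.append(MemoryEntry(content, tags))

    def retrieve(self, query: str, k: int = 3) -> list[MemoryEntry]:
        # Rank by word overlap between the query and stored content;
        # a real system would use learned embeddings instead.
        words = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(words & set(e.content.lower().split())),
            reverse=True,
        )
        return ranked[:k]


ltm = LongTermMemory()
ltm.write("User prefers concise answers with code samples", {"preference"})
ltm.write("User is learning Rust and asked about ownership", {"topic"})
for entry in ltm.retrieve("answer the Rust question concisely"):
    print(entry.content)
```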

LTM is the key to AI self-evolution. Just as humans learn and improve through personal experience and interaction with their environment, an AI model's self-evolution relies on the LTM data it accumulates during interaction. Unlike human learning, however, LTM-driven model evolution is not limited to real-world interaction. Models can interact with the physical environment as humans do and receive direct feedback, which, once processed, enhances their capabilities; this is a central concern of embodied-AI research.

On the other hand, models can also interact with virtual environments to accumulate LTM data; this is cheaper and more efficient than real-world interaction and can therefore improve capabilities at greater scale.
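As an illustration of that loop, the sketch below feeds actions into a stand-in virtual environment and writes only interactions with sufficiently strong feedback into the LongTermMemory store sketched above. ToyEnvironment, collect_experience, and the reward threshold are all hypothetical; the paper does not describe Omne's environment interface.

```python
import random


class ToyEnvironment:
    """Stand-in for a virtual environment: each action yields an
    observation string plus a scalar feedback signal."""

    def step(self, action: str) -> tuple[str, float]:
        return f"observation after '{action}'", random.random()


def collect_experience(env, ltm, actions, threshold=0.5):
    """Run actions in the environment and write only interactions
    whose feedback clears the threshold into long-term memory."""
    for action in actions:
        observation, reward = env.step(action)
        if reward >= threshold:  # keep only informative interactions
            ltm.write(f"{action} -> {observation}", {"virtual"})


# Reusing the LongTermMemory instance from the sketch above:
collect_experience(ToyEnvironment(), ltm, ["open door", "pick up key"])
```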


Constructing LTM involves refining and structuring raw data. Raw data here means all unprocessed data the model receives through interaction with the external environment or during training. It includes diverse observations and records and may contain valuable patterns alongside large amounts of redundant or irrelevant information.

Although raw data forms the basis of the model's memory and cognition, it must be further processed before it can support personalized or efficient task execution. LTM refines and structures this raw data into a form the model can actually use, improving its ability to provide personalized responses and suggestions.
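A minimal example of such refinement is a deduplication-and-filtering pass over raw interaction records before they enter LTM. The refine function below is a toy stand-in for the paper's (unspecified) processing pipeline; real refinement would involve semantic, not just string-level, filtering.

```python
def refine(raw_records: list[str]) -> list[str]:
    """Deduplicate and filter raw interaction records before they
    enter long-term memory."""
    seen: set[str] = set()
    refined: list[str] = []
    for record in raw_records:
        normalized = " ".join(record.lower().split())
        if len(normalized) < 10:   # drop near-empty noise
            continue
        if normalized in seen:     # drop verbatim duplicates
            continue
        seen.add(normalized)
        refined.append(record.strip())
    return refined


raw = [
    "User asked how to reverse a list in Python",
    "user asked how to  reverse a list in python",  # duplicate after normalization
    "ok",                                           # too short to be useful
]
print(refine(raw))  # only the first record survives
```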

Building LTM faces challenges such as data sparsity and user diversity. In a continuously updating LTM system, data sparsity is a common problem, especially for users with limited or sporadic interaction histories, which makes model training difficult. User diversity adds further complexity: the model must adapt to individual behavior patterns while still generalizing effectively across different user groups.

To explore this, the researchers developed a multi-agent collaboration framework called Omne, which implements LTM-based AI self-evolution. In Omne, each agent has an independent system structure and can autonomously learn and store a complete environment model, building its own understanding of the environment. Through this LTM-grounded collaboration, the system can adapt in real time to changes in individual behavior and optimize task planning and execution, further advancing personalized and efficient AI self-evolution.
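The paper does not publish Omne's interfaces, but the shape of the idea, one independent memory per agent plus a collaboration step over the agents' proposals, can be sketched as follows. Agent, observe, propose, and collaborate are invented names, and the sketch reuses the LongTermMemory class from earlier.

```python
class Agent:
    """Each agent keeps its own long-term memory of the environment
    (a stand-in for Omne's per-agent environment model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ltm = LongTermMemory()  # from the earlier sketch

    def observe(self, event: str) -> None:
        self.ltm.write(event, {self.name})

    def propose(self, task: str) -> str:
        context = [e.content for e in self.ltm.retrieve(task, k=2)]
        return f"{self.name} plans '{task}' using {context}"


def collaborate(agents: list[Agent], task: str) -> list[str]:
    # Each agent plans from its own memory; a coordinator could then
    # merge or rank the proposals.
    return [agent.propose(task) for agent in agents]


agents = [Agent("navigator"), Agent("coder")]
agents[0].observe("building map updated after exploration")
agents[1].observe("the repo uses pytest for its test suite")
for plan in collaborate(agents, "fix the failing test"):
    print(plan)
```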

The Omne framework took first place on the GAIA benchmark, demonstrating the significant potential of using LTM for AI self-evolution and for solving real-world problems. The researchers believe that advancing LTM research is crucial to the continued development and practical application of AI, especially where self-evolution is concerned.

In summary, long-term memory is the key to AI self-evolution: it enables AI models to learn and improve from experience, much as humans do. Building and using LTM requires overcoming challenges such as data sparsity and user diversity. The Omne framework offers a feasible path to LTM-based self-evolution, and its success on the GAIA benchmark points to the field's significant potential.

Paper: https://arxiv.org/pdf/2410.15665