Traditional fine-tuning methods for large language models (LLMs) are typically compute-intensive and remain static once training is complete, no matter how diverse the incoming tasks are. To address these limitations, Sakana AI has introduced a new adaptive framework called Transformer². Transformer² dynamically adjusts the weights of an LLM in real time during inference, letting the model adapt to unseen tasks with the flexibility of an octopus.
The core of Transformer² lies in a two-stage mechanism:
In the first stage, a dispatch system analyzes the user's query to identify the properties of the task at hand.
In the second stage, the system dynamically mixes several "expert" vectors trained with reinforcement learning. Each vector specializes in one type of task, so the mixture yields model behavior tailored to the current task.
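The two-stage flow can be sketched as a two-pass inference loop. This is a minimal illustration only: the keyword-based `classify_task` rule and the `EXPERTS` table below are hypothetical stand-ins, not Sakana AI's actual components.

```python
# Hypothetical expert-vector table: task type -> expert identifier.
EXPERTS = {"math": "z_math", "code": "z_code", "other": "z_general"}

def classify_task(query: str) -> str:
    """First pass: identify the task type (here a trivial keyword rule)."""
    if "integral" in query or "solve" in query:
        return "math"
    if "def " in query or "compile" in query:
        return "code"
    return "other"

def answer(query: str) -> str:
    """Second pass: answer using the expert selected in the first pass."""
    task = classify_task(query)
    expert = EXPERTS[task]
    return f"[using expert {expert}] answer to: {query}"

print(answer("solve this integral"))
```

In the real system the first pass is performed by the LLM itself (or a trained classifier), and the selected expert vectors modify the model's weights rather than a string label.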
This method is more efficient and uses fewer parameters compared to traditional fine-tuning methods like LoRA. Transformer² has demonstrated strong adaptability across different LLM architectures and modalities, including vision-language tasks.
Key Technologies of Transformer²
Singular Value Fine-tuning (SVF): This is a parameter-efficient fine-tuning method that decomposes each weight matrix with singular value decomposition (SVD) and learns only a vector that rescales the singular values. This sharply reduces the number of trainable parameters, lowers the risk of overfitting, and makes the learned vectors inherently composable. Training these vectors with reinforcement learning on narrow datasets yields a set of effective domain-specific "expert" vectors, each directly optimizing task performance on its topic.
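The core idea of SVF can be sketched in a few lines of NumPy. This is an illustrative toy, not Sakana AI's implementation: the weight matrix is random, and the scaling vector `z` is set by hand rather than learned with reinforcement learning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight matrix standing in for one linear layer of an LLM.
W = rng.standard_normal((8, 6))

# Decompose once: W = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# An SVF expert learns only one scale factor per singular value:
# min(m, n) parameters instead of the m * n of full fine-tuning.
z = np.ones_like(s)   # identity scaling leaves the model unchanged
z[:2] *= 1.1          # pretend RL nudged the top two singular values

# Adapted weights: only the singular values are rescaled.
W_adapted = U @ np.diag(s * z) @ Vt

# Sanity check: with z = 1 everywhere, W is recovered exactly.
assert np.allclose(U @ np.diag(s) @ Vt, W)
```

Because each expert is just a vector of per-singular-value scales, experts for different domains can later be combined by mixing their vectors.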
Adaptive Strategies: During the inference phase, Transformer² employs three adaptive strategies (prompt-based adaptation, a classification expert, and few-shot adaptation) to combine the expert vectors trained through SVF. These strategies dynamically adjust the LLM's weights based on the conditions observed at test time, achieving self-adaptation.
Advantages of Transformer²
Dynamic Adaptability: Transformer² can assess and modify its behavior based on changes in the operating environment or internal states without external intervention.
Parameter Efficiency: Compared to methods like LoRA, SVF uses fewer parameters while achieving higher performance.
Modular Capability: The expert vectors provide modular capabilities, while the adaptive strategies can dynamically determine and combine the most suitable vectors for handling input tasks.
Reinforcement Learning Optimization: Through reinforcement learning, task performance can be directly optimized without relying on expensive fine-tuning processes and large datasets.
Cross-Model Compatibility: SVF expert vectors can be transferred between different LLM models, thanks to the inherent ordering that SVD imposes on the singular values.
Experimental Results
Experiments on multiple LLMs and tasks show that SVF consistently outperforms traditional fine-tuning strategies such as LoRA.
The adaptive strategies of Transformer² deliver significant improvements across a range of unseen tasks.
Using a classification expert to identify the task yields higher accuracy than prompt engineering alone.
The contribution of each adaptation coefficient (α_k) varies across model and task combinations.
Future Outlook
While Transformer² has made significant progress, there is still room for improvement. Future research could explore model merging techniques to combine different expert models into a more powerful one, and investigate how to extend the cross-entropy method (CEM) used in few-shot adaptation to more specialized domains.
In summary, Transformer² represents a major leap in the field of adaptive LLMs, paving the way for the development of truly dynamic and self-organizing AI systems.
Paper Address: https://arxiv.org/pdf/2501.06252