Researchers from the ELLIS Institute Tübingen, the University of Maryland, and Lawrence Livermore National Laboratory have developed a novel language model called Huginn. The model uses a recurrent architecture that repeatedly applies the same block of layers, a design intended to strengthen its reasoning capabilities.

Unlike conventional reasoning models, Huginn requires no specialized chain-of-thought training data. Instead, it reasons internally within the neural network's "latent space" and only afterwards converts the result into output tokens.
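
To make the idea concrete, here is a minimal sketch of what such a latent-recurrent forward pass could look like. This is not Huginn's actual implementation; the prelude/core/coda layout, the module names, and all dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LatentRecurrentLM(nn.Module):
    """Illustrative latent-recurrence sketch, not Huginn's real code."""
    def __init__(self, vocab_size=32000, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)   # prelude: tokens -> latent
        self.inject = nn.Linear(2 * d_model, d_model)    # re-injects the input each step
        self.core = nn.TransformerEncoderLayer(          # the block that gets looped
            d_model, nhead=8, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)       # coda: latent -> logits

    def forward(self, tokens, num_iterations):
        x = self.embed(tokens)              # encode the input once
        state = torch.randn_like(x)         # start from a random latent state
        for _ in range(num_iterations):     # "thinking" = extra loop iterations,
            state = self.core(self.inject(torch.cat([state, x], dim=-1)))
        return self.head(state)             # decode only after the loop ends

model = LatentRecurrentLM()
tokens = torch.randint(0, 32000, (1, 16))
logits = model(tokens, num_iterations=8)    # more iterations = more latent "thinking"
```

The key design choice this illustrates is that additional reasoning effort increases the number of loop iterations rather than the number of generated tokens.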

Huginn was trained at massive scale on the Frontier supercomputer using 4096 AMD GPUs. Its training method is unusual in that the amount of computation varies: for each training step, the system randomly draws the number of times the recurrent core module is repeated, so the model learns to operate at many different depths and to adapt its compute to varying task complexity.
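
A hedged sketch of this randomized-depth idea follows. The article only states that the iteration count is random; the specific distribution, its parameters, and the training-loop names below are assumptions for illustration.

```python
import torch

def sample_num_iterations(mean_log=3.5, sigma=0.5, max_iters=64):
    """Draw a random recurrence depth for one training step.

    The log-normal distribution and its parameters are assumptions chosen so
    the average depth lands around exp(3.5) ~ 33; Huginn's actual sampling
    scheme may differ.
    """
    r = int(torch.distributions.LogNormal(mean_log, sigma).sample().item())
    return max(1, min(r, max_iters))   # clip to a sane range

# Hypothetical use inside a training loop:
#   r = sample_num_iterations()
#   logits = model(tokens, num_iterations=r)
#   loss = torch.nn.functional.cross_entropy(
#       logits.view(-1, logits.size(-1)), targets.view(-1))
```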

(Image: "Robot Thinking", AI-generated, licensed through Midjourney)

Tests show Huginn excels at mathematical and programming tasks: on benchmarks such as GSM8k and MATH, it outperforms open-source models trained with significantly more parameters and data. The researchers observed that Huginn adjusts its computational depth to match task complexity and develops reasoning chains within the latent space. Analysis reveals complex computational patterns forming there, such as circular trajectories while solving math problems, suggesting the model has learned to reason autonomously in novel ways.
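
One way such depth adaptation can be realized at inference time is to keep iterating the core until the latent state stops changing. Reusing the sketch above, the convergence test, threshold, and module layout here are assumptions, not details reported for Huginn.

```python
import torch

@torch.no_grad()
def adaptive_forward(model, tokens, max_iters=128, tol=1e-3):
    """Iterate the recurrent core until the latent state converges.

    `model` is assumed to expose the embed/inject/core/head modules from the
    earlier sketch; the relative-change stopping rule and `tol` are illustrative.
    """
    x = model.embed(tokens)
    state = torch.randn_like(x)
    used = 0
    for used in range(1, max_iters + 1):
        new_state = model.core(model.inject(torch.cat([state, x], dim=-1)))
        # Stop once successive latent states are nearly identical: the model
        # has effectively "finished thinking" about this input.
        delta = torch.norm(new_state - state) / (torch.norm(state) + 1e-8)
        state = new_state
        if delta < tol:
            break
    return model.head(state), used   # logits plus the depth actually spent
```

Under this stopping rule, easy inputs would exit the loop after a few iterations while harder ones consume more compute, matching the depth-adaptation behavior the researchers describe.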

While acknowledging that Huginn's absolute performance still needs improvement, the researchers consider it a compelling proof of concept. With more test-time compute and further refinement, large models built on Huginn's architecture could potentially replace conventional reasoning models. The team also highlights that reasoning in latent space may capture types of reasoning that are difficult to put into words, and plans follow-up research on extensions such as reinforcement learning to further boost model performance.