On July 19, 2024, the RWKV Open Source Foundation announced the global open-source release of RWKV-6-World 14B, currently the most powerful dense pure-RNN large language model. In the latest benchmarks the model performs exceptionally well, with English capabilities on par with Llama2 13B and a clear lead in multilingual performance, supporting over 100 languages as well as code.
The benchmark comparison covered four open-source large language models of roughly 14B parameters. English performance was evaluated on 12 independent benchmarks, while multilingual capability was assessed with xLAMBADA, xStoryCloze, xWinograd, and xCopa. RWKV-6-World 14B excelled across these tests, and in the Uncheatable Eval ranking its overall score surpassed both Llama2 13B and Qwen1.5 14B.
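As a rough illustration, results like these can be reproduced with EleutherAI's lm-evaluation-harness (`pip install lm-eval`), which ships the multilingual suites named above. This is a minimal sketch, assuming an HF-format checkpoint; the repository id and exact task names are assumptions and may differ by harness version:

```python
# Minimal sketch: scoring an RWKV checkpoint on the multilingual suites
# with lm-evaluation-harness. Repo id and task names are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend
    model_args="pretrained=RWKV/v6-Finch-14B-HF,trust_remote_code=True",  # assumed repo id
    tasks=["xcopa", "xstorycloze", "xwinograd"],  # task names as registered in the harness
)
for task, metrics in results["results"].items():
    print(task, metrics)
```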
The performance gains of RWKV-6-World 14B come from the architectural improvements made between RWKV-4 and RWKV-6. The model was trained without incorporating any benchmark datasets and without benchmark-specific optimizations, so its real capability is stronger than the leaderboard scores suggest. In Uncheatable Eval, RWKV-6-World 14B is scored on real-time data such as the latest arXiv papers, news articles, AO3 fiction, and GitHub code, which measures its genuine modeling and generalization ability.
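The core idea behind Uncheatable Eval is easy to sketch: score the model's cross-entropy (or perplexity) on text published after its training data was collected, so benchmark leakage and memorization cannot inflate the result. A minimal sketch with transformers, assuming an HF-format checkpoint (the repo id and data file are assumptions):

```python
# Minimal sketch of the Uncheatable Eval idea: measure perplexity on fresh
# text (new arXiv papers, news, AO3 fiction, GitHub code) that the model
# cannot have memorized. Repo id and data file are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/v6-Finch-14B-HF"  # assumed HF-format repo id
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

fresh_text = open("fresh_arxiv_sample.txt").read()  # text newer than the model
ids = tok(fresh_text, return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    loss = model(ids, labels=ids).loss  # mean next-token cross-entropy
print(f"perplexity on fresh data: {torch.exp(loss).item():.2f}")
```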
The RWKV-6-World 14B model can currently be downloaded and deployed locally from Hugging Face, ModelScope, and WiseModel. Because Ai00 only supports models in safetensors (.st) format, pre-converted .st models are also available from the Ai00 HF repository. Depending on the quantization method, local deployment and inference of RWKV-6-World 14B requires roughly 10 GB to 28 GB of GPU memory.
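For local inference, the official `rwkv` pip package loads the released .pth checkpoints directly, and its strategy string selects the quantization, which is what drives the 10 GB to 28 GB spread above. A minimal sketch; the checkpoint path is an assumption:

```python
# Minimal local-inference sketch with the `rwkv` pip package (pip install rwkv).
# The strategy string picks the quantization: 'cuda fp16' keeps full fp16
# weights (highest VRAM), 'cuda fp16i8' stores weights in int8 (roughly half).
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable TorchScript JIT, as in ChatRWKV examples

from rwkv.model import RWKV
from rwkv.utils import PIPELINE

model = RWKV(
    model="/path/to/RWKV-x060-World-14B",  # checkpoint path without .pth (assumed path)
    strategy="cuda fp16i8",  # int8-quantized weights to fit smaller GPUs
)
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # vocabulary used by World models
print(pipeline.generate("The capital of France is", token_count=20))
```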
A preview of RWKV-6-World 14B in action covers a range of applications: natural language processing (sentiment analysis, machine reading comprehension), prose and poetry writing, reading and modifying code, suggesting finance paper topics, extracting key points from news, expanding a single sentence into longer text, and writing a Python Snake game, among others.
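Because the open-source checkpoints are base models (see the note below), demos like these are driven by plain prompting. A short, illustrative sketch reusing the `pipeline` object from the previous snippet; the User:/Assistant: template wording is an assumption, not an official prompt format:

```python
# Illustrative prompt for one of the tasks above (sentiment analysis),
# reusing `pipeline` from the previous snippet. Template wording is assumed.
prompt = (
    "User: Classify the sentiment of this review as positive or negative: "
    '"The battery dies within two hours and the screen flickers."\n\n'
    "Assistant:"
)
print(pipeline.generate(prompt, token_count=30))
```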
Note that all open-source RWKV models are base models: they have a degree of instruction-following and dialogue ability but are not optimized for any specific task. If you want an RWKV model to perform well on a particular task, it is recommended to fine-tune it on datasets for that task.
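As one hedged example of what that preparation can look like: community fine-tuning pipelines for RWKV commonly start from a .jsonl file of `{"text": ...}` samples (which tools such as the json2binidx converter then turn into training binaries). The dialogue template below mirrors the prompting sketch earlier and is an assumption, not an official spec:

```python
# Minimal sketch: writing a task dataset as {"text": ...} .jsonl, the input
# format commonly consumed by RWKV fine-tuning tooling. Template is assumed.
import json

pairs = [
    ("Summarize: RWKV-6-World 14B was released on July 19, 2024.",
     "RWKV open-sourced its 14B RWKV-6 World model on 2024-07-19."),
]
with open("finetune_data.jsonl", "w", encoding="utf-8") as f:
    for user, assistant in pairs:
        sample = f"User: {user}\n\nAssistant: {assistant}"
        f.write(json.dumps({"text": sample}, ensure_ascii=False) + "\n")
```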
Project Links: