Large World Models is a neural network trained with RingAttention technology, focusing on processing long videos and language sequences to comprehend human knowledge and a multimodal world. Through training on massive datasets, it has achieved an unprecedented size of context and has released a series of 7-billion parameter models capable of handling text and video with over 1 million tokens. These models are designed to facilitate long video understanding, long text processing, multimodal learning, and visual-language interaction.