TEN-framework is an AI agent framework for high-performance, real-time multimodal interaction. It supports multiple programming languages and platforms, integrates edge and cloud deployment, and is not tied to a single model. Because it manages agent state in real time, agents can respond and adjust their behavior as conditions change. The framework targets the growing demand for complex AI applications, especially audio-visual ones, and its modular, reusable extensions let developers build and combine functionality efficiently.