Alibaba Cloud's Bailian Model Service Platform recently launched a "Real-time Audio and Video Interaction" feature that lets users build multimodal AI applications without any programming knowledge. With it, users can quickly integrate AI models into Web, iOS, and Android applications and share them with others.

Users can set up an intelligent application in a few steps. First, they create a new intelligent application, then select and configure the large models they need for text, speech, or visual understanding on the Alibaba Cloud Bailian platform. The platform offers over 200 large models, including Alibaba Cloud's Qwen2-VL model, which has powerful visual understanding capabilities. Next, users write prompts, set the audio and video API key, and publish their own AI application. After publishing, users can choose among distribution channels including API, web, WeChat mini-programs, and DingTalk bots, or integrate the intelligent agent into Web, iOS, or Android applications using the audio and video SDK.
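The steps above can be pictured as a short pre-publish checklist. The sketch below is purely illustrative: `AppDraft`, `missing_steps`, the field names, and the channel identifiers are hypothetical and are not part of any Bailian API or SDK; they only model the order of steps the article describes.

```python
from dataclasses import dataclass, field

# Channels named in the article; the identifier strings are made up for this sketch.
KNOWN_CHANNELS = {"api", "web", "wechat_mini_program", "dingtalk_bot", "sdk"}


@dataclass
class AppDraft:
    """Hypothetical stand-in for an application draft; not a Bailian object."""
    model: str = ""          # e.g. a visual understanding model such as Qwen2-VL
    prompt: str = ""         # the prompt the user writes
    rtc_api_key: str = ""    # the audio and video API key
    channels: set[str] = field(default_factory=set)


def missing_steps(draft: AppDraft) -> list[str]:
    """Return the setup steps that still need to be done before publishing."""
    steps = []
    if not draft.model:
        steps.append("select and configure a model")
    if not draft.prompt.strip():
        steps.append("write a prompt")
    if not draft.rtc_api_key:
        steps.append("set the audio and video API key")
    unknown = draft.channels - KNOWN_CHANNELS
    if unknown:
        steps.append(f"remove unknown channels: {sorted(unknown)}")
    return steps


draft = AppDraft(model="qwen2-vl", prompt="You are a patient AI teacher.")
print(missing_steps(draft))  # ['set the audio and video API key']
```

On the platform itself these steps are carried out in the console rather than in code; the sketch only makes the dependency explicit, namely that a model, a prompt, and an API key must all be in place before the application is published to a channel.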


Additionally, the Alibaba Cloud Bailian platform provides further tutorials that show users how to configure knowledge bases to improve recognition accuracy during interactions and how to set up workflows for more stable AI responses. Currently, the price for the General API on Alibaba Cloud Bailian has dropped to as low as 0.3 yuan per million tokens, so users can build multimodal intelligent agents that can hear, see, and speak, such as AI assistants, AI teachers, and virtual companions, at low cost.
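To put the quoted price in perspective, here is a rough cost estimate. It is a minimal sketch that assumes the lowest quoted rate of 0.3 yuan per million tokens applies to every token; actual prices vary by model, and the workload figures are invented for illustration.

```python
# Lowest price quoted in the article; real rates depend on the model chosen.
PRICE_YUAN_PER_MILLION_TOKENS = 0.3


def estimate_cost_yuan(tokens: int,
                       price_per_million: float = PRICE_YUAN_PER_MILLION_TOKENS) -> float:
    """Estimate the cost in yuan of processing `tokens` tokens at a flat rate."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload: 2,000 conversations of about 1,500 tokens each.
total_tokens = 2_000 * 1_500
print(f"{total_tokens:,} tokens cost about {estimate_cost_yuan(total_tokens):.2f} yuan")
# prints "3,000,000 tokens cost about 0.90 yuan"
```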

The launch of this new feature further lowers the barrier to AI application development, allowing individuals and businesses to quickly build and deploy intelligent applications to meet diverse business needs. This update from Alibaba Cloud's Bailian Model Service Platform demonstrates significant progress in promoting the widespread use of AI technology and reducing the complexity of its application.