This product uses the Gemini 2.0 language model and Google Imagen image generation, combined with speech recognition and speech synthesis, to deliver an interactive storytelling experience. Users steer the story by voice, and the system generates the next passage of the story and matching images in real time. Its main strengths are voice-driven interaction and on-the-fly generation of both text and images, which make it suitable for education, entertainment, and creative inspiration. The product is currently open source with no pricing established, and it primarily targets developers and educational institutions.
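
The voice-to-story loop described above might be organized roughly as follows. This is a minimal sketch, not the product's actual code: `StorySession`, `StoryTurn`, `make_demo_session`, and the four callables (`transcribe`, `generate_text`, `generate_image`, `speak`) are hypothetical names, and the demo uses plain stand-in functions instead of real Gemini, Imagen, or speech services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StoryTurn:
    choice: str  # what the listener said
    text: str    # the story passage generated in response


@dataclass
class StorySession:
    # All four callables are hypothetical stand-ins for the real services
    # (speech recognition, language model, image model, speech synthesis).
    transcribe: Callable[[bytes], str]
    generate_text: Callable[[str], str]
    generate_image: Callable[[str], bytes]
    speak: Callable[[str], bytes]
    history: List[StoryTurn] = field(default_factory=list)

    def step(self, audio: bytes) -> Dict[str, object]:
        """Run one turn: voice input -> story text -> image and narration."""
        choice = self.transcribe(audio)
        context = " ".join(turn.text for turn in self.history)
        prompt = (
            f"Story so far: {context}\n"
            f"Listener chose: {choice}\n"
            "Continue the story."
        )
        text = self.generate_text(prompt)
        image = self.generate_image(text)   # illustrate the new passage
        narration = self.speak(text)        # read the new passage aloud
        self.history.append(StoryTurn(choice, text))
        return {"choice": choice, "text": text, "image": image, "narration": narration}


def make_demo_session() -> StorySession:
    """Build a session wired to trivial fakes so the loop runs offline."""
    return StorySession(
        transcribe=lambda audio: audio.decode("utf-8"),
        generate_text=lambda prompt: "Next scene after: " + prompt.splitlines()[1],
        generate_image=lambda text: b"IMG:" + text.encode("utf-8"),
        speak=lambda text: b"WAV:" + text.encode("utf-8"),
    )


if __name__ == "__main__":
    session = make_demo_session()
    result = session.step(b"enter the cave")
    print(result["text"])
```

Passing the services in as callables keeps the turn logic independent of any particular SDK, so a real deployment could swap the fakes for actual model and speech clients without changing `step`.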