AppAgent is a multimodal agent framework, built on a large language model (LLM), for operating mobile applications. It imitates human interactions through a simplified action space (such as tapping and swiping), so it can operate an app without access to the system back end. The agent learns how to use a new application either by exploring it autonomously or by observing human demonstrations, and it builds a knowledge base from what it learns. It then draws on that knowledge base to execute complex tasks across different applications.
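To make the idea of a simplified action space and a knowledge base concrete, here is a minimal sketch in Python. It is not AppAgent's actual code: the names `Tap`, `Swipe`, `KnowledgeBase`, `record`, `lookup`, and `describe`, as well as the app and element identifiers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass(frozen=True)
class Tap:
    """Hypothetical action: tap a UI element identified by a label."""
    element_id: str


@dataclass(frozen=True)
class Swipe:
    """Hypothetical action: swipe on a UI element in a given direction."""
    element_id: str
    direction: str  # assumed values: "up", "down", "left", "right"


# The whole action space is a small, closed set of human-like gestures.
Action = Union[Tap, Swipe]


@dataclass
class KnowledgeBase:
    """Per-app notes on what each UI element does (illustrative only)."""
    notes: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def record(self, app: str, action: Action, observed_effect: str) -> None:
        # Store what was observed after acting on an element, whether the
        # action came from autonomous exploration or a human demonstration.
        self.notes.setdefault(app, {})[action.element_id] = observed_effect

    def lookup(self, app: str, element_id: str) -> str:
        # Return the stored note, or an empty string if nothing is known yet.
        return self.notes.get(app, {}).get(element_id, "")


def describe(action: Action) -> str:
    """Render an action as a short string, e.g. for inclusion in a prompt."""
    if isinstance(action, Tap):
        return f"tap({action.element_id})"
    return f"swipe({action.element_id}, {action.direction})"


if __name__ == "__main__":
    kb = KnowledgeBase()
    kb.record("mail", Tap("compose_button"), "opens a new message draft")
    print(describe(Tap("compose_button")))
    print(kb.lookup("mail", "compose_button"))
```

In this sketch the learning phase would call `record` after each observed action, and the task-execution phase would call `lookup` to retrieve notes about the elements currently on screen. How AppAgent actually stores and retrieves this information is not specified here.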