PhysGen is an image-to-video generation method that converts a single image and input conditions (such as a force and torque applied to an object in the image) into a realistic, physically plausible, and temporally coherent video. It simulates dynamics in image space by coupling model-based physical simulation with a data-driven video generation process. The resulting videos are realistic in both their physics and their appearance, and the input conditions give precise control over the motion. Quantitative comparisons and a comprehensive user study show that PhysGen outperforms existing data-driven image-to-video generation methods.
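To make the "force and torque in, motion out" idea concrete, the following is a minimal sketch of the model-based half of such a pipeline. It is not PhysGen's actual simulator: the `RigidBody2D` class, the `simulate` function, the units, and the semi-implicit Euler integrator are all assumptions chosen for illustration, and gravity, collisions, and friction are deliberately left out. It only shows how a user-specified force and torque could be turned into a per-frame pose trajectory for one object.

```python
from dataclasses import dataclass


@dataclass
class RigidBody2D:
    """Hypothetical 2D rigid body living in image-plane coordinates."""
    mass: float          # assumed units: kg
    inertia: float       # moment of inertia about the centroid
    x: float = 0.0       # centroid position
    y: float = 0.0
    angle: float = 0.0   # orientation in radians
    vx: float = 0.0      # linear velocity
    vy: float = 0.0
    omega: float = 0.0   # angular velocity


def simulate(body, force, torque, push_steps, total_steps, dt=1.0 / 30):
    """Integrate the body with semi-implicit Euler and return its poses.

    The force and torque act only during the first `push_steps` frames;
    afterwards the body coasts (no gravity, friction, or collisions here).
    Returns a list of (x, y, angle) tuples, one per frame.
    """
    fx, fy = force
    trajectory = []
    for step in range(total_steps):
        if step < push_steps:
            # Newton's second law: a = F / m, alpha = tau / I
            body.vx += fx / body.mass * dt
            body.vy += fy / body.mass * dt
            body.omega += torque / body.inertia * dt
        # Update the pose using the already-updated velocities
        body.x += body.vx * dt
        body.y += body.vy * dt
        body.angle += body.omega * dt
        trajectory.append((body.x, body.y, body.angle))
    return trajectory


# Example: push an object to the right while spinning it counterclockwise.
body = RigidBody2D(mass=1.0, inertia=0.5)
poses = simulate(body, force=(2.0, 0.0), torque=1.0,
                 push_steps=5, total_steps=30)
```

In a pipeline of this kind, each pose in `poses` would presumably drive a rigid transform of the object's pixels, with the data-driven video model then responsible for making the resulting frames look realistic.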