Janus is an autoregressive framework that addresses the limitations of previous methods by decoupling visual encoding into separate pathways, one for understanding and one for generation, while still processing both with a single, unified transformer. This decoupling alleviates the conflict that arises when one visual encoder must serve both understanding and generation, and it makes the framework more flexible, since each pathway can adopt the encoding method best suited to its task. Janus outperforms earlier unified models and matches or exceeds the performance of task-specific models. Its simplicity, flexibility, and effectiveness make it a strong candidate for next-generation unified multimodal models.
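The decoupling idea can be illustrated with a minimal, dependency-free sketch. It is not the Janus implementation: the two encoders and the "transformer" are toy stand-ins, and every name here (`understanding_encoder`, `generation_encoder`, `shared_transformer`, `forward`, `EMBED_DIM`) is hypothetical. The only point it shows is the routing structure: each task has its own visual encoder, both encoders emit vectors of the same width, and a single shared sequence model consumes either output.

```python
from typing import Callable, Dict, List

Vector = List[float]
Image = List[List[float]]  # toy grayscale image with pixel values in [0, 1]

EMBED_DIM = 4  # hypothetical width of the shared embedding space


def understanding_encoder(image: Image) -> List[Vector]:
    """Toy stand-in for a semantic encoder: one continuous vector per image row."""
    return [[sum(row) / len(row), max(row), min(row), float(len(row))]
            for row in image]


def generation_encoder(image: Image) -> List[Vector]:
    """Toy stand-in for a discrete tokenizer: quantize each pixel to an ID,
    then embed the ID as a one-hot vector of width EMBED_DIM."""
    codebook_size = 8
    seq = []
    for row in image:
        for px in row:
            token_id = min(int(px * codebook_size), codebook_size - 1)
            seq.append([1.0 if token_id % EMBED_DIM == k else 0.0
                        for k in range(EMBED_DIM)])
    return seq


def shared_transformer(seq: List[Vector]) -> List[Vector]:
    """Toy stand-in for the unified model: a causal running mean, so position i
    only depends on positions <= i (mimicking autoregressive processing)."""
    out, running = [], [0.0] * EMBED_DIM
    for i, vec in enumerate(seq, start=1):
        running = [r + v for r, v in zip(running, vec)]
        out.append([r / i for r in running])
    return out


# Task-specific pathways; swapping one encoder does not affect the other.
ENCODERS: Dict[str, Callable[[Image], List[Vector]]] = {
    "understanding": understanding_encoder,
    "generation": generation_encoder,
}


def forward(image: Image, task: str) -> List[Vector]:
    """Route the image through the task's encoder, then the shared model."""
    return shared_transformer(ENCODERS[task](image))
```

Calling `forward(image, "understanding")` and `forward(image, "generation")` on the same image yields sequences of different lengths but identical vector width, which is what lets one shared model serve both pathways in this sketch.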