Liquid is an autoregressive generative model that facilitates seamless integration of visual understanding and text generation by decomposing images into discrete codes and sharing feature space with text tokens. The main advantage of this model is the elimination of the need for externally pre-trained visual embeddings, reducing resource dependence, while simultaneously discovering a synergistic effect between understanding and generation tasks through the law of scaling.