ByteDance's multimodal large model, PixelLM, performs efficient pixel-level reasoning without relying on SAM. Its main advantage is the ability to handle diverse and complex reasoning segmentation tasks and to produce multiple sets of segmentation results, which lets it address open-domain problems effectively. This marks a step forward for multimodal large models into fine-grained tasks such as image editing, autonomous driving, and robotics.