MaXTron is a unified meta-architecture for video segmentation. It adds two components, an intra-frame tracking module and an inter-frame tracking module, to enrich segment-level segmentation and improve temporal consistency, so that segmentation results stay smooth from one frame to the next. Handling both kinds of tracking within a single architecture streamlines the segmentation process and aims to make it more efficient. MaXTron targets video panoptic segmentation and is intended for researchers and practitioners in computer vision.
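
To make "temporal consistency" concrete, here is a minimal sketch of what an inter-frame linking step could look like. It is not MaXTron's method: the description above does not say how its tracking modules work, so this example swaps in plain greedy IoU matching between consecutive frames. The names `Frame`, `iou`, and `link_frames`, and the `threshold` parameter, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]
Frame = Dict[int, Set[Pixel]]  # segment id -> set of pixels in that segment


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def link_frames(frames: List[Frame], threshold: float = 0.5) -> List[Frame]:
    """Illustrative stand-in for an inter-frame step (assumption, not MaXTron):
    relabel each frame so a segment inherits the id of the best-overlapping
    segment in the previous frame, or gets a fresh id if nothing overlaps enough."""
    if not frames:
        return []
    linked: List[Frame] = [dict(frames[0])]
    next_id = max(frames[0], default=-1) + 1  # first id not used by frame 0
    for frame in frames[1:]:
        prev = linked[-1]
        relabeled: Frame = {}
        used: Set[int] = set()  # previous-frame ids already claimed
        for pixels in frame.values():
            candidates = [
                (iou(pixels, prev_pixels), prev_id)
                for prev_id, prev_pixels in prev.items()
                if prev_id not in used
            ]
            best_score, best_id = max(candidates, default=(0.0, -1))
            if best_score >= threshold:
                used.add(best_id)
                relabeled[best_id] = pixels  # same object, same id
            else:
                relabeled[next_id] = pixels  # treat as a new object
                next_id += 1
        linked.append(relabeled)
    return linked


# Two frames whose per-frame ids disagree (0 vs. 7 for the same region).
frames = [
    {0: {(0, 0), (0, 1)}, 1: {(5, 5)}},
    {7: {(0, 0), (0, 1), (0, 2)}, 3: {(9, 9)}},
]
linked = link_frames(frames)
```

After linking, the region labeled `7` in the second frame carries id `0`, matching the first frame, while the unmatched region receives a new id. Greedy matching is the simplest possible choice here; it only illustrates the goal of keeping segment identities stable over time.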