SpacTor-T5

T5 model pre-trained with a combination of span corruption (SC) and replaced token detection (RTD).

SpacTor is a training procedure consisting of (1) a mixed objective that combines span corruption (SC) with replaced token detection (RTD), and (2) a two-stage curriculum that optimizes the mixed objective for the first τ iterations and then transitions to the standard SC loss. Experiments on a variety of NLP tasks with an encoder-decoder architecture (T5) show that SpacTor-T5 matches the downstream performance of standard SC pre-training while requiring 50% fewer pre-training iterations and 40% fewer total FLOPs. Under the same computational budget, SpacTor significantly improves downstream benchmark performance.
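
The two-stage curriculum described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `spactor_objective`, the `rtd_weight` parameter, and the simple weighted sum used to mix the two losses are all assumptions made for the example. The description only states that SC and RTD are combined for the first τ iterations and that training then falls back to the SC loss alone.

```python
def spactor_objective(step: int, tau: int, sc_loss: float, rtd_loss: float,
                      rtd_weight: float = 1.0) -> float:
    """Return the loss to optimize at a given training step.

    Hypothetical sketch: `rtd_weight` and the additive mixing rule are
    assumptions for illustration, not taken from the SpacTor description.
    """
    if step < tau:
        # Stage 1 (first tau iterations): mixed objective, SC plus weighted RTD.
        return sc_loss + rtd_weight * rtd_loss
    # Stage 2 (remaining iterations): standard span-corruption loss only.
    return sc_loss


if __name__ == "__main__":
    tau = 3  # hypothetical switch point
    for step in range(5):
        loss = spactor_objective(step, tau, sc_loss=2.0, rtd_loss=0.5)
        print(f"step={step} loss={loss}")
```

With zero-indexed steps, the condition `step < tau` covers exactly the first τ iterations, after which the RTD term is dropped and only the SC loss drives the update.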