This repository contains the official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space". You can pretrain the pondering Pythia-70m model on the minipile dataset ...