This repository contains the official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space". You can pretrain the pondering Pythia-70m model on the MiniPile dataset ...