PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

May 1, 2025

Authors: Oskar Van Der Wal, Pietro Lesci, Max Müller-Eberstein, Naomi Saphra, Hailey Schoelkopf, Willem Zuidema, Stella Biderman

Type: Conference paper
Publication: International Conference on Learning Representations (ICLR)
Last updated on May 1, 2025