FOCUS: Effective Embedding Initialization for Specializing Pretrained Multilingual Models on a Single Language Paper • 2305.14481 • Published May 23, 2023
Efficient Parallelization Layouts for Large-Scale Distributed Model Training Paper • 2311.05610 • Published Nov 9, 2023