Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28, 2025
VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections Paper • 2405.17991 • Published May 28, 2024
HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants Paper • 2405.09186 • Published May 15, 2024