- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 77
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 56
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
  Paper • 2505.16990 • Published • 22
- D-AR: Diffusion via Autoregressive Models
  Paper • 2505.23660 • Published • 34
Collections
Collections including paper arxiv:2603.06351
- dLLM: Simple Diffusion Language Modeling
  Paper • 2602.22661 • Published • 153
- OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
  Paper • 2603.15594 • Published • 149
- Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
  Paper • 2603.13398 • Published • 155
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
  Paper • 2603.06569 • Published • 120

- Beyond Language Modeling: An Exploration of Multimodal Pretraining
  Paper • 2603.03276 • Published • 105
- Qwen3-Coder-Next Technical Report
  Paper • 2603.00729 • Published • 65
- Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
  Paper • 2603.03205 • Published • 13
- AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
  Paper • 2602.23166 • Published • 45

- FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use
  Paper • 2603.08262 • Published • 42
- On-Policy Context Distillation for Language Models
  Paper • 2602.12275 • Published • 4
- Online Experiential Learning for Language Models
  Paper • 2603.16856 • Published • 60
- Mixture-of-Depths Attention
  Paper • 2603.15619 • Published • 80

- AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation
  Paper • 2602.17100 • Published • 4
- GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
  Paper • 2603.01059 • Published • 1
- Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models
  Paper • 2603.00618 • Published
- Heterogeneous Agent Collaborative Reinforcement Learning
  Paper • 2603.02604 • Published • 197

- Beyond Imitation: Reinforcement Learning for Active Latent Planning
  Paper • 2601.21598 • Published • 10
- Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
  Paper • 2601.18778 • Published • 43
- Self-Hinting Language Models Enhance Reinforcement Learning
  Paper • 2602.03143 • Published • 31
- GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
  Paper • 2602.12099 • Published • 62

- SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices
  Paper • 2601.08303 • Published • 21
- SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers
  Paper • 2411.10510 • Published • 9
- Dynamic Chunking Diffusion Transformer
  Paper • 2603.06351 • Published • 16
- Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
  Paper • 2603.06577 • Published • 50