kaicheng001

https://kaicheng001.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper 25 days ago

mHC: Manifold-Constrained Hyper-Connections

upvoted an article 3 months ago

Visualize and understand GPU memory in PyTorch

upvoted an article 3 months ago

Chat Templates: An End to the Silent Performance Killer

Organizations

None yet

upvoted a paper 25 days ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 27 days ago • 284

upvoted 2 articles 3 months ago

Visualize and understand GPU memory in PyTorch

Article • Dec 24, 2024 • 261

Chat Templates: An End to the Silent Performance Killer

Article • Oct 3, 2023

upvoted 2 papers 4 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 116

upvoted a paper 5 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 262

upvoted 4 papers 6 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 181

upvoted a paper 7 months ago

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 93

upvoted an article 7 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8, 2025 • 751

upvoted a paper 8 months ago

Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37

upvoted 3 papers 9 months ago

Parallel Scaling Law for Language Models

Paper • 2505.10475 • Published May 15, 2025 • 83

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 189

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8, 2025 • 185

upvoted an article 9 months ago

I trained a Language Model to schedule events with GRPO!

Article • Apr 29, 2025

upvoted a paper 9 months ago

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16, 2025 • 48

upvoted an article 10 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 1.05k

upvoted a paper 10 months ago

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published Apr 11, 2025 • 130
