Starstrek

Stars321123 · Stars321

AI & ML interests

AI

Recent Activity

liked a model about 6 hours ago

VladShash/deepseek-math-7b-lean-prover-dpo-olmo-3

upvoted 2 papers about 6 hours ago

DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data

Paper • 2604.19859 • Published 6 days ago • 47

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 65

upvoted a collection about 6 hours ago

Stuff I'm going to read

47 items • Updated 3 days ago • 3

upvoted a paper about 6 hours ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published 5 days ago • 64

upvoted 2 papers about 7 hours ago

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published 5 days ago • 229

Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation

Paper • 2604.18168 • Published 7 days ago • 96

upvoted a collection about 23 hours ago

MathNet

2 items • Updated 1 day ago • 2

upvoted 3 collections 1 day ago

GRM2

Powerful reasoning-focused models for general reasoning and agentic tasks. • 2 items • Updated 3 days ago • 4

GRM-2.5

Reasoning models for complex reasoning, challenging tasks, and all kinds of chat and everyday use. • 2 items • Updated 3 days ago • 3

GRM-2.6

1 item • Updated 3 days ago • 2

upvoted a paper 1 day ago

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

Paper • 2512.15489 • Published Dec 17, 2025 • 13

upvoted 3 collections 1 day ago

Nemotron Supervised Fine-Tuning

SFT datasets covering math, code, chat, safety, agentic, VLM, multilingual, and specialized domains. • 38 items • Updated 3 days ago • 3

DeepSeek-OCR

2 items • Updated Feb 2 • 31

DeepSeek-V4

4 items • Updated 3 days ago • 525

upvoted an article 2 days ago

DeepSeek-V4: a million-token context that agents can actually use

Article • 3 days ago • 29

upvoted 3 papers 2 days ago

ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation

Paper • 2604.19211 • Published 6 days ago • 10

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published Feb 6 • 210

A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

Paper • 2604.19572 • Published 6 days ago • 18

upvoted an article 2 days ago

How to Use Transformers.js in a Chrome Extension

Article • 4 days ago • 14

upvoted an article 3 days ago

Kimina-Prover-RL

Article • Aug 14, 2025 • 16