amico

AI & ML interests

None yet

Recent Activity

liked a model about 2 hours ago

CohereLabs/North-Mini-Code-1.0

liked a model about 22 hours ago

PixArt-alpha/PixArt-Sigma-XL-2-1024-MS

Organizations

None yet

upvoted an article 8 days ago

Article

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

JetBrains • 11 days ago • 31

upvoted an article 15 days ago

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • 16 days ago • 16

upvoted a collection about 1 month ago

Gemma 4

15 items • Updated 1 day ago • 951

upvoted a paper about 1 month ago

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

Paper • 2604.26779 • Published Apr 29 • 13

upvoted an article about 2 months ago

Article

The PR you would have opened yourself

pcuenq, awni • Apr 16 • 72

upvoted a collection about 2 months ago

VRAG

6 items • Updated Apr 2 • 12

upvoted 2 papers 2 months ago

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 114

Synthetic Sandbox for Training Machine Learning Engineering Agents

Paper • 2604.04872 • Published Apr 6 • 14

upvoted 2 papers 3 months ago

Hyperagents

Paper • 2603.19461 • Published Mar 19 • 51

Mixture-of-Depths Attention

Paper • 2603.15619 • Published Mar 16 • 81

upvoted an article 3 months ago

Article

How NVIDIA Builds Open Data for AI

nvidia • Mar 10 • 16

upvoted a paper 4 months ago

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266

upvoted an article 4 months ago

Article

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

christian-washington, ajasuja, santosh-iima, lewtun, burtenshaw • Feb 12 • 34

upvoted 2 articles 5 months ago

Article

Optimizing GLM4-MoE for Production: 65% Faster TTFT with SGLang

novita • Jan 22 • 10

Article

Differential Transformer V2

microsoft • Jan 20 • 52

upvoted a paper 5 months ago

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 56

upvoted a collection 6 months ago

Mistral Large 3

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 100

upvoted an article 7 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 403

upvoted an article 8 months ago

Article

Building the Open Agent Ecosystem Together: Introducing OpenEnv

spisakjo, darktex, zkwentz, mortimerp9, Sanyam, Hamid-Nazeri, Pankit01, emre0, lewtun, reach-vb • Oct 23, 2025 • 163

upvoted a collection 8 months ago

Spaces for Audio / Voices

539 items • Updated 1 day ago • 33