THE ORB - a galois77 Collection

updated 5 days ago

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist

Paper • 2511.08521 • Published Nov 11, 2025 • 37

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13, 2025 • 49

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published Nov 13, 2025 • 96

VGGT: Visual Geometry Grounded Transformer

Paper • 2503.11651 • Published Mar 14, 2025 • 35

Music Flamingo: Scaling Music Understanding in Audio Language Models

Paper • 2511.10289 • Published Nov 13, 2025 • 10

Canvas-to-Image: Compositional Image Generation with Multimodal Controls

Paper • 2511.21691 • Published Nov 26, 2025 • 35

DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders

Paper • 2512.13690 • Published 22 days ago • 2

SpotEdit: Selective Region Editing in Diffusion Transformers

Paper • 2512.22323 • Published 11 days ago • 37