Emergences Labs

Team

company

https://agentarena.org


AI & ML interests

None defined yet.

Recent Activity


hhua2

authored a paper about 1 month ago

Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination

Paper • 2511.17490 • Published Nov 21, 2025 • 21

hhua2

authored a paper 2 months ago

Latent Chain-of-Thought for Visual Reasoning

Paper • 2510.23925 • Published Oct 27, 2025 • 9

hhua2

authored 6 papers 3 months ago

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Paper • 2510.09781 • Published Oct 10, 2025 • 26

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Paper • 2504.05541 • Published Apr 7, 2025 • 15

The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

Paper • 2504.10686 • Published Apr 14, 2025

MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models

Paper • 2505.19415 • Published May 26, 2025 • 2

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

Paper • 2505.20426 • Published May 26, 2025 • 7

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Paper • 2510.05034 • Published Oct 6, 2025 • 48

ultra7chen

authored 7 papers 3 months ago

Apple Intelligence Foundation Language Models

Paper • 2407.21075 • Published Jul 29, 2024 • 5

UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Paper • 2503.12652 • Published Mar 16, 2025

CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling

Paper • 2502.00965 • Published Feb 3, 2025

GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing

Paper • 2505.11493 • Published May 16, 2025 • 3

AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 36

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Paper • 2509.16197 • Published Sep 19, 2025 • 56

CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

Paper • 2509.19300 • Published Sep 23, 2025 • 6

eric008

authored a paper 7 months ago

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Paper • 2505.20561 • Published May 26, 2025 • 7

ultra7chen

authored 4 papers 10 months ago

MOFI: Learning Image Representations from Noisy Entity Annotated Images

Paper • 2306.07952 • Published Jun 13, 2023 • 2

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11, 2024 • 32

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3, 2024 • 54

STIV: Scalable Text and Image Conditioned Video Generation

Paper • 2412.07730 • Published Dec 10, 2024 • 74


Team members 13
