Collection for "Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders"
Papers
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding
Models (36)
nyu-visionx/siglip2_decoder • Image-to-Image • 177
nyu-visionx/webssl300m_decoder • Image-to-Image • 34
nyu-visionx/Scale-RAE-Qwen1.5B_DiT2.4B-WebSSL • Text-to-Image • 4B • 61
nyu-visionx/Scale-RAE-Qwen7B_DiT9.8B • Text Generation • 17B • 18
nyu-visionx/Scale-RAE-Qwen1.5B_DiT2.4B • Text Generation • 4B • 408
nyu-visionx/Cambrian-S-3B-S3 • 3B • 251
nyu-visionx/Cambrian-S-3B-S2 • 3B • 280
nyu-visionx/Cambrian-S-3B-S1 • 3B • 9
nyu-visionx/Cambrian-S-1.5B-S3 • 2B • 185
nyu-visionx/Cambrian-S-1.5B-S2 • 2B • 287
Datasets (13)
nyu-visionx/scale-rae-data • 864 • 1
nyu-visionx/Cambrian-S-3M • 9.88k • 2
nyu-visionx/VSI-Bench • Viewer • 10.3k • 8.76k • 58
nyu-visionx/VSI-Train-10k • Viewer • 10k • 652 • 3
nyu-visionx/VSI-SUPER-Count • Viewer • 400 • 933 • 4
nyu-visionx/VSI-SUPER-Recall • Viewer • 300 • 448 • 3
nyu-visionx/VSI-590K • Preview • 3.13k • 11
nyu-visionx/CV-Bench • Viewer • 5.28k • 4.62k • 41
nyu-visionx/pyramid_flow_ft_results • Viewer • 8.42k • 59
nyu-visionx/pisa-experiments • 64 • 2