nvidia/audio-flamingo-2
Audio-Text-to-Text
nvidia/AudioSkills
nvidia/LongAudio
arxiv: 2503.03983
arxiv: 2402.01831
arxiv: 2204.14198
License: other
What visual model would you use in tandem? Distallignation?
1
#2 opened about 1 year ago by TimeLordRaps