Tushar Goyal

tushargoyal
  • thetushargoyal

AI & ML interests

Multimodal LLMs, Speech-to-Speech Translation, Latent Diffusion/Consistency Models

Organizations

Hugging Face Discord Community

Collections 2

Video-to-Audio Synthesis
  • MMAudio - generating synchronized audio from video/text

    Runtime error • Featured • 911

    Generate audio from videos or text descriptions

Multimodal LLMs
  • 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

    Paper • 2501.00958 • Published Jan 1, 2025 • 109

models 1

tushargoyal/sarvam-2b-ft

Updated Jan 8, 2025

datasets 0

None public yet