Gradio Templates

community

https://gradio.app

Gradio

gradio-app/gradio

Activity Feed

Recent Activity

freddyaboulton updated a Space about 1 month ago

gradio-templates/leaderboard

akhaliq submitted a paper to Daily Papers 9 days ago

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising

Paper • 2603.16792 • Published 10 days ago • 3

fffiloni posted an update 9 days ago

I brought DALL·E mini back to life 🤖🎨

You can try it here:
fffiloni/dalle-mini-reboot

And I also built a batch version using Hugging Face Jobs (up to 50 images per prompt):
fffiloni/dalle-mini-via-jobs

The goal was to stay close to the original JAX/Flax pipeline, while integrating it with modern tooling (Gradio + Jobs).

It ended up being a fun way to revisit this model — still weird, still fun 😄

3 replies

akhaliq submitted a paper to Daily Papers 11 days ago

Multimodal OCR: Parse Anything from Documents

Paper • 2603.13032 • Published 14 days ago • 37

fffiloni posted an update 14 days ago

A clearer demo for TADA (now multilingual) 🔊🌍

I improved the public demo for TADA — a generative framework for speech modeling via text–acoustic dual alignment.

TADA models speech as a joint sequence of text tokens and acoustic tokens, using a transformer backbone to keep text and audio synchronized during generation.

The original demo already exposed these mechanisms, but the workflow made the pipeline hard to understand.

This updated demo makes the process clearer:

• load the model
• prepare a reference voice (optionally with a transcript, or with Whisper auto-transcription)
• generate speech conditioned on that reference

It also adds multilingual support.

Presets are included for a few languages, but the model supports more:

English, French, Spanish, German, Arabic, Mandarin Chinese, Italian, Japanese, Polish, Portuguese

Feel free to try different voices, accents, or languages and see how the alignment behaves.

👉 fffiloni/tada-dual-alignment-tts-demo

Paper: TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment (2602.23068)