Whisper
Transcribe audio files or YouTube videos into text
Upscale low-resolution images to high resolution
Translate text between 200 languages
Generate depth maps from images
Generate high-quality videos from text prompts and optional images
Powerful Watermark Removal API
Annotate satellite images to detect roofs
Extend images to desired dimensions using prompts
Generate images from text descriptions
An agent that uses over 9000 vision models from the HF Hub
Execute custom code from environment variable
Audio Conditioned LipSync with Latent Diffusion Models
Separate vocals and accompaniment from audio
Versatile Single-Image 3D Face Reconstruction
NSFW Uncensored text/Image to image for AI Limits
Translate text between languages
Expressive Zero-Shot TTS
Video deep fake
Flexible Photo Recrafting While Preserving Your Identity