nari-labs/Dia-1.6B-0626
Text-to-Speech • 2B • Updated • 20.3k • 129
Control 3D models using hand gestures and voice
Generate protein fitness scores and visualizations
Ultra Fast FLUX Kontext Dev for Image Editing
Audio-Driven Multi-Person Conversational Video Generation
Edit images with Kontext and LoRAs
Hand-controlled arpeggiator, drum machine, and visualizer
Chat with AI using text, images, audio, and video inputs
Kontext multi image composition on FLUX[dev]
LightGlue demo
Convert web content to JSON using a custom schema
OmniGen2: Unified Image Understanding and Generation
Generate or edit images using text prompts