---
title: Tini-Lad
emoji: ⚡
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: other
short_description: DLM
---

# 💬 Diffusion Language Model Demo

Note: Paper coming out soon; if anyone is interested in discussing the model, please contact me.

This is an interactive demo of a **diffusion-style language model**, which generates text through iterative refinement. Inspired by diffusion processes in vision models, the system gradually improves a corrupted text sequence until convergence.

This implementation has several benefits:

- **Noiseless convergence**: A unique feature of this implementation is its ability to converge **without intermediate noising**.
- **Scalable test-time compute**: Increasing the number of iterations improves answer quality.
- **Reduced inference time**: Most questions can be answered in fewer iterations than the number of tokens generated!
- **Greatly reduced training time**: Through LoRA-based fine-tuning of an autoregressive model, this model can be trained within several hours on a single GPU.

---

## 🔧 Settings

- **Disable Intermediate Noising**: Speeds up convergence by skipping the noising step between iterations. Works best for short, factual questions.
- **Iterations**: Number of refinement steps. More iterations give the model more time to refine the answer.
- **Pause Between Steps**: Slows down the process so you can visually follow the changes.

---

## 🖍️ Visualization

- **Red tokens**: Masked (noised) tokens that will be regenerated.
- **Green tokens**: Newly generated tokens compared to the previous step.

---

## 🧪 Example Prompt

For noiseless diffusion, try short questions like:

> What's the capital of France?

For more in-depth questions, enable intermediate noising. Increasing the number of iterations generally improves answer quality.

> What do you know about Amsterdam?

See how low you can go with the number of iterations while still receiving adequate answers!
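The refinement process described above can be sketched as a simple loop: fill in masked tokens, then (unless intermediate noising is disabled) re-mask a fraction of the sequence so later steps can revise earlier guesses. The following is a minimal, hypothetical illustration of that loop; the function names, the toy predictor, and the noising schedule are assumptions for clarity, not the demo's actual implementation.

```python
import random

MASK = "<mask>"

def refine(tokens, predict, iterations, intermediate_noising=True,
           noise_frac=0.2, seed=0):
    """Toy sketch of diffusion-style text refinement (illustrative only).

    Each iteration fills every masked position using `predict`; with
    intermediate noising enabled, a random fraction of positions is
    re-masked between steps so they can be regenerated.
    """
    rng = random.Random(seed)
    for step in range(iterations):
        # Fill every masked position with the predictor's guess.
        tokens = [predict(i, t) if t == MASK else t
                  for i, t in enumerate(tokens)]
        if intermediate_noising and step < iterations - 1:
            # Re-mask a fraction of the sequence (skipped on the last step,
            # and skipped entirely in the "noiseless" setting).
            for i in range(len(tokens)):
                if rng.random() < noise_frac:
                    tokens[i] = MASK
    return tokens

# Toy "model": predicts each token from its position index alone.
toy_predict = lambda i, t: str(i)

out = refine([MASK] * 8, toy_predict, iterations=4)
print(out)  # every position is filled after the final step
```

Note that the sequence converges in far fewer iterations than its length, mirroring the reduced-inference-time claim above; with `intermediate_noising=False`, a single pass already fills every token.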
---

More technical details (architecture, training, and evaluation) can be found in the accompanying blog post:

📘 [Read the blog post here](https://example.com/diffusion-language-model-blog)

For a more tweakable version that exposes all inference parameters, check out this version:

🎛️ [Explore the model here](https://huggingface.co/spaces/Ruurd/tini)

Paper coming out soon! If you already want to cite this model, please refer to the blog post.