arxiv:2405.13729

ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models

Published on Apr 29 · Submitted by Rui Xu on May 5
Abstract

Combining stochastic processes with diffusion models addresses combinatorial complexity limitations, accelerating training and enabling asynchronous generation across data modalities.

AI-generated summary

In this paper, we study an under-explored but important factor of diffusion generative models, i.e., the combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, additional attributes are combined to associate with data samples. We show that the space spanned by the combination of dimensions and attributes can be insufficiently covered by existing training schemes of diffusion generative models, potentially limiting test time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test time generation which uses asynchronous time steps for different dimensions and attributes, thus allowing for varying degrees of control over them. Our code is available at: https://github.com/Xrvitd/ComboStoc

Community

Paper submitter

Today we're releasing ComboStoc, a simple new training strategy for diffusion generative models that unlocks faster training and more flexible control at test time.

Diffusion models usually treat each training sample as a point moving along a single path. But for high-dimensional data, and especially for structured generation tasks with additional attributes, that is often not enough. Large parts of the underlying combinatorial space remain poorly sampled during training, which can hurt generation quality at test time.
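To make that coverage gap concrete, here is a tiny numpy illustration (ours, not from the paper): with the usual shared scalar timestep, every dimension of a sample carries the same noise level, so training only ever visits the diagonal of the per-dimension noise-level space; independent per-dimension timesteps are needed to reach the rest of it.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 1000

# Standard diffusion training: one shared timestep per sample, so all
# per-dimension noise levels are equal -- only the diagonal of [0, 1]^d
# is ever covered.
shared = np.repeat(rng.uniform(size=(n, 1)), d, axis=1)

# Independent per-dimension timesteps cover the full unit square.
independent = rng.uniform(size=(n, d))

# Maximum gap between the two dimensions' noise levels.
on_diagonal = lambda t: np.abs(t[:, 0] - t[:, 1]).max()
```

With `shared`, the gap is exactly zero for every sample; with `independent`, the samples spread across the whole square, which is the combinatorial coverage the post is describing.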

ComboStoc addresses this with a simple idea: instead of training only along a narrow set of paths, it constructs stochastic processes that more fully explore the combinatorial structures induced by dimensions and attributes. This leads to much better coverage of the training space, with no need for a complicated redesign of the model itself.
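A minimal sketch of what such a training pair might look like, assuming a rectified-flow-style linear interpolation between data and noise. The function name and the per-element timestep array are our illustration, not the paper's actual implementation; see the official repo for the real training code.

```python
import numpy as np

rng = np.random.default_rng(0)

def combostoc_training_pair(x0, noise=None):
    """Draw an independent timestep for every element of every sample,
    then form the noisy input and a velocity-style regression target.
    Hypothetical sketch, not the official ComboStoc code."""
    if noise is None:
        noise = rng.standard_normal(x0.shape)
    # One timestep per element instead of one per sample: this is the
    # "combinatorial" part that broadens path coverage during training.
    t = rng.uniform(size=x0.shape)
    xt = (1.0 - t) * x0 + t * noise   # linear data-to-noise interpolation
    target = noise - x0               # velocity target for this path
    return xt, t, target
```

Note that when all entries of `t` happen to be equal, this reduces to an ordinary synchronized diffusion step, so the standard process is a special case of the combinatorial one.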

The result is a diffusion framework that trains significantly faster across very different data modalities, including images and structured 3D shapes.

But ComboStoc is not only about speed.
It also enables a new test time generation scheme with asynchronous time steps across different dimensions or attributes. In practice, this means you can preserve some parts more strongly than others, or apply different levels of control across regions, components, or conditions within the same sample.
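One way to realize such asynchronous control, sketched under our own assumptions (the mask-based schedule below is illustrative, not the paper's exact scheme), is to give a chosen subset of dimensions a head start so they are denoised, and thereby effectively fixed, before the rest:

```python
import numpy as np

def per_dimension_times(step, n_steps, lead_mask, lead=0.3):
    """Map a global sampling step to per-dimension times in [0, 1],
    where t = 1 is pure noise and t = 0 is clean data. Dimensions
    selected by lead_mask run `lead` ahead of the global schedule,
    so they settle earlier while the others keep evolving.
    Hypothetical sketch of an asynchronous schedule."""
    base = 1.0 - step / n_steps                 # synchronized global time
    t = np.full(lead_mask.shape, base)
    t[lead_mask] = np.clip(base - lead, 0.0, 1.0)
    return t

# Dimension 0 leads; dimension 1 follows the global schedule.
mask = np.array([True, False])
print(per_dimension_times(5, 10, mask))
```

Each dimension still traverses the same data-to-noise path; only its clock differs, which is what allows preserving some parts of a sample more strongly than others.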

This opens up a new way to think about diffusion generation: not as one synchronized denoising process, but as a more flexible system where different parts of the data can evolve at different rates.

We think this perspective is especially promising for structured generation problems, where dimensions and attributes are deeply entangled and should not always be treated uniformly.

Faster training. Better path coverage. More controllable generation.
ComboStoc is a simple step toward diffusion models that make fuller use of the structure already present in the data.

๐ŸŒ Project Page: https://ruixu.me/html/ComboStoc/index.html

๐Ÿ“„ Paper: https://arxiv.org/abs/2405.13729

๐Ÿ’ป Code: https://github.com/Xrvitd/ComboStoc


