SandwichZ committed · Commit f23910f · verified · 1 Parent(s): 7f8222d

Upload 9 files

.gitattributes CHANGED
@@ -46,3 +46,9 @@ assets/video_dit_arch.jpg filter=lfs diff=lfs merge=lfs -text
46
  assets/video_vae_res.jpg filter=lfs diff=lfs merge=lfs -text
47
  examples/i2v_input.JPG filter=lfs diff=lfs merge=lfs -text
48
  assets/vben_1.3b_vs_sota.png filter=lfs diff=lfs merge=lfs -text
49
+ assets/bg_edit.gif filter=lfs diff=lfs merge=lfs -text
50
+ assets/camera_ctrl.gif filter=lfs diff=lfs merge=lfs -text
51
+ assets/fg_edit.gif filter=lfs diff=lfs merge=lfs -text
52
+ assets/motion_transfer.gif filter=lfs diff=lfs merge=lfs -text
53
+ assets/object.gif filter=lfs diff=lfs merge=lfs -text
54
+ assets/teaser.gif filter=lfs diff=lfs merge=lfs -text
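
The six added entries route each new GIF through Git LFS, so the repository stores a small pointer file instead of the binary. As a minimal sketch (not part of this commit), the helper below lists the patterns in a `.gitattributes` text that carry `filter=lfs`. The function name `lfs_patterns` is hypothetical, and the parser assumes whitespace-separated fields with no quoted patterns, which holds for the entries shown here but not for every `.gitattributes` file.

```python
def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Return the path patterns whose attribute list includes filter=lfs."""
    patterns = []
    for raw_line in gitattributes_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First field is the pattern; the rest are attributes.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


SAMPLE = """\
assets/bg_edit.gif filter=lfs diff=lfs merge=lfs -text
assets/camera_ctrl.gif filter=lfs diff=lfs merge=lfs -text
assets/teaser.gif filter=lfs diff=lfs merge=lfs -text
"""

for pattern in lfs_patterns(SAMPLE):
    print(pattern)
```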
LICENSE ADDED
@@ -0,0 +1,192 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+ "License" shall mean the terms and conditions for use, reproduction,
9
+ and distribution as defined by Sections 1 through 9 of this document.
10
+ "Licensor" shall mean the copyright owner or entity authorized by
11
+ the copyright owner that is granting the License.
12
+ "Legal Entity" shall mean the union of the acting entity and all
13
+ other entities that control, are controlled by, or are under common
14
+ control with that entity. For the purposes of this definition,
15
+ "control" means (i) the power, direct or indirect, to cause the
16
+ direction or management of such entity, whether by contract or
17
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
18
+ outstanding shares, or (iii) beneficial ownership of such entity.
19
+ "You" (or "Your") shall mean an individual or Legal Entity
20
+ exercising permissions granted by this License.
21
+ "Source" form shall mean the preferred form for making modifications,
22
+ including but not limited to software source code, documentation
23
+ source, and configuration files.
24
+ "Object" form shall mean any form resulting from mechanical
25
+ transformation or translation of a Source form, including but
26
+ not limited to compiled object code, generated documentation,
27
+ and conversions to other media types.
28
+ "Work" shall mean the work of authorship, whether in Source or
29
+ Object form, made available under the License, as indicated by a
30
+ copyright notice that is included in or attached to the work
31
+ (an example is provided in the Appendix below).
32
+ "Derivative Works" shall mean any work, whether in Source or Object
33
+ form, that is based on (or derived from) the Work and for which the
34
+ editorial revisions, annotations, elaborations, or other modifications
35
+ represent, as a whole, an original work of authorship. For the purposes
36
+ of this License, Derivative Works shall not include works that remain
37
+ separable from, or merely link (or bind by name) to the interfaces of,
38
+ the Work and Derivative Works thereof.
39
+ "Contribution" shall mean any work of authorship, including
40
+ the original version of the Work and any modifications or additions
41
+ to that Work or Derivative Works thereof, that is intentionally
42
+ submitted to Licensor for inclusion in the Work by the copyright owner
43
+ or by an individual or Legal Entity authorized to submit on behalf of
44
+ the copyright owner. For the purposes of this definition, "submitted"
45
+ means any form of electronic, verbal, or written communication sent
46
+ to the Licensor or its representatives, including but not limited to
47
+ communication on electronic mailing lists, source code control systems,
48
+ and issue tracking systems that are managed by, or on behalf of, the
49
+ Licensor for the purpose of discussing and improving the Work, but
50
+ excluding communication that is conspicuously marked or otherwise
51
+ designated in writing by the copyright owner as "Not a Contribution."
52
+ "Contributor" shall mean Licensor and any individual or Legal Entity
53
+ on behalf of whom a Contribution has been received by Licensor and
54
+ subsequently incorporated within the Work.
55
+
56
+ 2. Grant of Copyright License. Subject to the terms and conditions of
57
+ this License, each Contributor hereby grants to You a perpetual,
58
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
59
+ copyright license to reproduce, prepare Derivative Works of,
60
+ publicly display, publicly perform, sublicense, and distribute the
61
+ Work and such Derivative Works in Source or Object form.
62
+
63
+ 3. Grant of Patent License. Subject to the terms and conditions of
64
+ this License, each Contributor hereby grants to You a perpetual,
65
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
66
+ (except as stated in this section) patent license to make, have made,
67
+ use, offer to sell, sell, import, and otherwise transfer the Work,
68
+ where such license applies only to those patent claims licensable
69
+ by such Contributor that are necessarily infringed by their
70
+ Contribution(s) alone or by combination of their Contribution(s)
71
+ with the Work to which such Contribution(s) was submitted. If You
72
+ institute patent litigation against any entity (including a
73
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
74
+ or a Contribution incorporated within the Work constitutes direct
75
+ or contributory patent infringement, then any patent licenses
76
+ granted to You under this License for that Work shall terminate
77
+ as of the date such litigation is filed.
78
+
79
+ 4. Redistribution. You may reproduce and distribute copies of the
80
+ Work or Derivative Works thereof in any medium, with or without
81
+ modifications, and in Source or Object form, provided that You
82
+ meet the following conditions:
83
+ (a) You must give any other recipients of the Work or
84
+ Derivative Works a copy of this License; and
85
+ (b) You must cause any modified files to carry prominent notices
86
+ stating that You changed the files; and
87
+ (c) You must retain, in the Source form of any Derivative Works
88
+ that You distribute, all copyright, patent, trademark, and
89
+ attribution notices from the Source form of the Work,
90
+ excluding those notices that do not pertain to any part of
91
+ the Derivative Works; and
92
+ (d) If the Work includes a "NOTICE" text file as part of its
93
+ distribution, then any Derivative Works that You distribute must
94
+ include a readable copy of the attribution notices contained
95
+ within such NOTICE file, excluding those notices that do not
96
+ pertain to any part of the Derivative Works, in at least one
97
+ of the following places: within a NOTICE text file distributed
98
+ as part of the Derivative Works; within the Source form or
99
+ documentation, if provided along with the Derivative Works; or,
100
+ within a display generated by the Derivative Works, if and
101
+ wherever such third-party notices normally appear. The contents
102
+ of the NOTICE file are for informational purposes only and
103
+ do not modify the License. You may add Your own attribution
104
+ notices within Derivative Works that You distribute, alongside
105
+ or as an addendum to the NOTICE text from the Work, provided
106
+ that such additional attribution notices cannot be construed
107
+ as modifying the License.
108
+ You may add Your own copyright statement to Your modifications and
109
+ may provide additional or different license terms and conditions
110
+ for use, reproduction, or distribution of Your modifications, or
111
+ for any such Derivative Works as a whole, provided Your use,
112
+ reproduction, and distribution of the Work otherwise complies with
113
+ the conditions stated in this License.
114
+
115
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
116
+ any Contribution intentionally submitted for inclusion in the Work
117
+ by You to the Licensor shall be under the terms and conditions of
118
+ this License, without any additional terms or conditions.
119
+ Notwithstanding the above, nothing herein shall supersede or modify
120
+ the terms of any separate license agreement you may have executed
121
+ with Licensor regarding such Contributions.
122
+
123
+ 6. Trademarks. This License does not grant permission to use the trade
124
+ names, trademarks, service marks, or product names of the Licensor,
125
+ except as required for reasonable and customary use in describing the
126
+ origin of the Work and reproducing the content of the NOTICE file.
127
+
128
+ 7. Disclaimer of Warranty. Unless required by applicable law or
129
+ agreed to in writing, Licensor provides the Work (and each
130
+ Contributor provides its Contributions) on an "AS IS" BASIS,
131
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
132
+ implied, including, without limitation, any warranties or conditions
133
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
134
+ PARTICULAR PURPOSE. You are solely responsible for determining the
135
+ appropriateness of using or redistributing the Work and assume any
136
+ risks associated with Your exercise of permissions under this License.
137
+
138
+ 8. Limitation of Liability. In no event and under no legal theory,
139
+ whether in tort (including negligence), contract, or otherwise,
140
+ unless required by applicable law (such as deliberate and grossly
141
+ negligent acts) or agreed to in writing, shall any Contributor be
142
+ liable to You for damages, including any direct, indirect, special,
143
+ incidental, or consequential damages of any character arising as a
144
+ result of this License or out of the use or inability to use the
145
+ Work (including but not limited to damages for loss of goodwill,
146
+ work stoppage, computer failure or malfunction, or any and all
147
+ other commercial damages or losses), even if such Contributor
148
+ has been advised of the possibility of such damages.
149
+
150
+ 9. Accepting Warranty or Additional Liability. While redistributing
151
+ the Work or Derivative Works thereof, You may choose to offer,
152
+ and charge a fee for, acceptance of support, warranty, indemnity,
153
+ or other liability obligations and/or rights consistent with this
154
+ License. However, in accepting such obligations, You may act only
155
+ on Your own behalf and on Your sole responsibility, not on behalf
156
+ of any other Contributor, and only if You agree to indemnify,
157
+ defend, and hold each Contributor harmless for any liability
158
+ incurred by, or claims asserted against, such Contributor by reason
159
+ of your accepting any such warranty or additional liability.
160
+
161
+ END OF TERMS AND CONDITIONS
162
+
163
+ Copyright 2026 FlexAM Authors
164
+
165
+ Licensed under the Apache License, Version 2.0 (the "License");
166
+ you may not use this file except in compliance with the License.
167
+ You may obtain a copy of the License at
168
+
169
+ http://www.apache.org/licenses/LICENSE-2.0
170
+
171
+ Unless required by applicable law or agreed to in writing, software
172
+ distributed under the License is distributed on an "AS IS" BASIS,
173
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
174
+ See the License for the specific language governing permissions and
175
+ limitations under the License.
176
+
177
+ ========================================================================
178
+ PART 2: DELTA / Snap Inc. License (Applies to DELTA Checkpoints/Modules)
179
+ ========================================================================
180
+
181
+ This project incorporates weights/code from DELTA, which is subject to the following license from Snap Inc.:
182
+
183
+ Copyright (c) 2025 Snap Inc. All rights reserved.
184
+
185
+ This sample code is made available by Snap Inc. for non-commercial, research purposes only.
186
+ Non-commercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Research purposes mean solely for study, instruction, or non-commercial research, testing or validation.
187
+
188
+ No commercial license, whether implied or otherwise, is granted in or to this code, unless you have entered into a separate agreement with Snap Inc. for such rights.
189
+
190
+ This sample code is provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability, title, fitness for a particular purpose, non-infringement, or that the code is free of defects, errors or viruses. In no event will Snap Inc. be liable for any damages or losses of any kind arising from this sample code or your use thereof.
191
+
192
+ Any redistribution of this sample code, including in binary form, must retain or reproduce the above copyright notice, conditions and disclaimer.
README.md ADDED
@@ -0,0 +1,292 @@
1
+ <div align="center">
2
+
3
+ # FlexAM: Flexible Appearance-Motion Decomposition for Versatile Video Generation Control
4
+
5
+ <a href="https://arxiv.org/abs/2602.xxxxx"><img src="https://img.shields.io/badge/arXiv-2602.xxxxx-b31b1b.svg" alt="arXiv"></a>
6
+ <a href="https://huggingface.co/SandwichZ/Wan2.2-Fun-5B-FLEXAM"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-yellow" alt="Hugging Face"></a>
7
+ <a href="assets/flexam_workflow.json"><img src="https://img.shields.io/badge/ComfyUI-Download_Workflow-4fd63d" alt="ComfyUI"></a>
8
+
9
+ <br>
10
+ <br>
11
+
12
+ Mingzhi Sheng<sup>1*</sup>, Zekai Gu<sup>2*</sup>, Peng Li<sup>2</sup>, Cheng Lin<sup>3</sup>, Hao-Xiang Guo<sup>4</sup>, Ying-Cong Chen<sup>1,2†</sup>, Yuan Liu<sup>2†</sup>
13
+
14
+ <br>
15
+
16
+ <sup>1</sup>HKUST(GZ), <sup>2</sup>HKUST, <sup>3</sup>MUST, <sup>4</sup>Tsinghua University
17
+ <br>
18
+ <small><sup>*</sup>Equal Contribution, <sup>†</sup>Corresponding Authors</small>
19
+
20
+ </div>
21
+
22
+ <br>
23
+
24
+ ![teaser](assets/teaser.gif)
25
+
26
+ ## 📰 News
27
+ - **[2026.02.13]** 🚀 We have released the inference code and **ComfyUI** support!
28
+ - **[2026.02.14]** 📄 The paper is available on arXiv.
29
+
30
+
31
+ ## 🛠️ Installation
32
+ > 📢 **System Requirements**: Both the official Python inference code and the ComfyUI workflow were tested on **Ubuntu 20.04** with **Python 3.10**, **PyTorch 2.5.1**, and **CUDA 12.1** on an **NVIDIA A800** GPU.
33
+
34
+ Before running any inference (Python or ComfyUI), please set up the environment and download the checkpoints.
35
+
36
+ ### 1. Create environment
37
+ Clone the repository and create a conda environment:
38
+
39
+ ```
40
+ git clone https://github.com/IGL-HKUST/FlexAM
41
+ conda create -n flexam python=3.10
42
+ conda activate flexam
43
+ ```
44
+
45
+ Install PyTorch. We recommend `PyTorch 2.5.1` with `CUDA 12.1`:
46
+
47
+ ```
48
+ pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121
49
+ ```
50
+ ```
51
+ pip install -r requirements.txt
52
+ ```
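
After installation, it can help to confirm that the installed packages match the versions the README reports as tested. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `check_versions` are hypothetical, and it assumes the packages were installed with pip so that `importlib.metadata` can see them.

```python
import importlib.metadata as md

# Versions taken from the install command in the README.
TESTED = {"torch": "2.5.1", "torchvision": "0.20.1"}


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def check_versions(tested: dict[str, str]) -> dict[str, str]:
    """Compare installed package versions against the tested ones."""
    report = {}
    for name, wanted in tested.items():
        try:
            found = base_version(md.version(name))
        except md.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if found == wanted else f"found {found}, tested with {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(TESTED).items():
        print(f"{name}: {status}")
```

A mismatch is only a hint: other versions may work, but they are not the configuration the authors report testing.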
53
+ ### 2. Download Submodules
54
+ We rely on several external modules (MoGe, Pi3, etc.).
55
+
56
+ ```
57
+ mkdir -p submodules
58
+ git submodule update --init --recursive
59
+ pip install -r requirements.txt
60
+ ```
61
+ <details>
62
+ <summary><em>(Optional) Manual clone if submodule update fails</em></summary>
63
+ ```
64
+ # DELTA
65
+ git clone https://github.com/snap-research/DELTA_densetrack3d.git submodules/DELTA_densetrack3d
66
+ # Pi3
67
+ git clone https://github.com/yyfz/Pi3.git submodules/Pi3
68
+ # MoGe
69
+ git clone https://github.com/microsoft/MoGe.git submodules/MoGe
70
+ # VGGT
71
+ git clone https://github.com/facebookresearch/vggt.git submodules/vggt
72
+ ```
73
+ </details>
74
+
75
+ ### 3. Download checkpoints
76
+ Download the FlexAM checkpoint and place it in the `checkpoints/` directory.
77
+
78
+ - HuggingFace Link: [Wan2.2-Fun-5B-FLEXAM](https://huggingface.co/SandwichZ/Wan2.2-Fun-5B-FLEXAM)
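
The download can also be scripted. The sketch below is an assumption-laden illustration, not the project's own tooling: it uses `snapshot_download` from the `huggingface_hub` package (which must be installed separately), and it assumes the target layout `checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM`, taken from the example checkpoint path in the inference commands of this README. The helper names `checkpoint_dir` and `download_checkpoint` are hypothetical.

```python
from pathlib import Path

REPO_ID = "SandwichZ/Wan2.2-Fun-5B-FLEXAM"


def checkpoint_dir(root: str = "checkpoints") -> Path:
    """Target folder, following the example path used in the README commands (assumed layout)."""
    return Path(root) / "Diffusion_Transformer" / REPO_ID.split("/")[-1]


def download_checkpoint(root: str = "checkpoints") -> Path:
    """Download the full model repository into the assumed checkpoint folder."""
    # Imported lazily so checkpoint_dir() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = checkpoint_dir(root)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


if __name__ == "__main__":
    print(f"Checkpoint would be stored in: {checkpoint_dir()}")
    # download_checkpoint()  # uncomment to start the download
```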
79
+
80
+
81
+
82
+ ## 🚀 Inference
83
+ We provide two ways to use FlexAM: Python Script and ComfyUI.
84
+
85
+ ### Option A: ComfyUI Integration
86
+ We provide a native node for seamless integration into ComfyUI workflows.
87
+ > ⚠️ **Note**: Currently, the ComfyUI node supports **Motion Transfer**, **Foreground Edit**, and **Background Edit**. For *Camera Control* and *Object Manipulation*, please use the Python script.
88
+ #### 1. Install Node
89
+ Since the node is not yet listed in the ComfyUI Manager, please install it manually:
90
+ ```
91
+ cd ComfyUI/custom_nodes/
92
+ git clone https://github.com/IGL-HKUST/FlexAM
93
+ cd FlexAM
94
+ pip install -r requirements.txt
95
+ ```
96
+ #### 2. Run Workflow
97
+ - Step 1: Download the workflow JSON: [workflow.json](assets/flexam_workflow.json)
98
+ - Step 2: Drag and drop it into ComfyUI.
99
+ - Step 3: Ensure checkpoints are in `ComfyUI/models/checkpoints`.
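
Before loading the workflow, it can be useful to see which node types it references, since nodes from other custom-node packs must be installed too. The sketch below relies only on the structure visible in the workflow file added by this commit (a top-level `nodes` array whose entries have a `type` field). The function name `node_types` and the shortened sample are illustrative assumptions.

```python
import json
from collections import Counter


def node_types(workflow_json: str) -> Counter:
    """Count how often each node type appears in a ComfyUI workflow JSON string."""
    workflow = json.loads(workflow_json)
    return Counter(node["type"] for node in workflow.get("nodes", []))


# Shortened sample using node types that appear in assets/flexam_workflow.json.
SAMPLE = (
    '{"nodes": ['
    '{"id": 16, "type": "VHS_VideoInfoLoaded"}, '
    '{"id": 35, "type": "VHS_VideoCombine"}, '
    '{"id": 15, "type": "VideoToTrackingVisualizeAll"}'
    "]}"
)

for name, count in sorted(node_types(SAMPLE).items()):
    print(f"{name}: {count}")
```

To inspect the real file, read it with `open("assets/flexam_workflow.json", encoding="utf-8").read()` and pass the text to `node_types`.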
100
+
101
+ ### Option B: Python Script
102
+
103
+ We provide an inference script for our tasks. Please refer to `run_demo.sh` for how to run the `demo.py` script.
104
+
105
+ Or you can run these tasks one by one as follows.
106
+
107
+ #### 1. Motion Transfer
108
+ ![motion_transfer](assets/motion_transfer.gif)
109
+ ---
110
+
111
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt           prompt text
+ #   --checkpoint_path  FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir       output directory
+ #   --input_path       the reference video path
+ #   --repaint          "True" to repaint the first frame with FLUX, or the path to an already repainted first frame of the source video
+ #   --density          controls the sparsity of tracking points
+ #   --gpu              the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --repaint <True/repaint_path> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --generate_type='full_edit' \
+     --density 10 \
+     --gpu <gpu_id>
+ ```
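
When launching several runs, building the command in Python avoids shell quoting problems with prompts that contain spaces. This is a minimal sketch that only uses the flags listed in the command above. The helper names `build_demo_command` and `run_demo`, and the sample prompt and file paths, are hypothetical.

```python
import shlex
import subprocess


def build_demo_command(prompt: str, checkpoint_path: str, output_dir: str,
                       input_path: str, repaint: str, gpu: int,
                       generate_type: str = "full_edit") -> list[str]:
    """Assemble the demo.py argument list using the flags shown in the README."""
    return [
        "python", "demo.py",
        "--prompt", prompt,
        "--checkpoint_path", checkpoint_path,
        "--output_dir", output_dir,
        "--input_path", input_path,
        "--repaint", repaint,
        "--video_length=97",
        "--sample_size", "512", "896",
        f"--generate_type={generate_type}",
        "--density", "10",
        "--gpu", str(gpu),
    ]


def run_demo(cmd: list[str]) -> None:
    """Run the command and raise if demo.py exits with an error."""
    subprocess.run(cmd, check=True)


# Hypothetical inputs, for illustration only.
cmd = build_demo_command(
    prompt="a corgi running on the beach",
    checkpoint_path="checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM",
    output_dir="outputs",
    input_path="examples/source.mp4",
    repaint="True",
    gpu=0,
)
print(shlex.join(cmd))  # shell-safe rendering of the command
```

Calling `run_demo(cmd)` from the repository root would start the run. The other tasks differ only in `generate_type` and a few extra flags.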
124
+
125
+
126
+ #### 2. Foreground Edit
127
+ ![fg_edit](assets/fg_edit.gif)
128
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt           prompt text
+ #   --checkpoint_path  FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir       output directory
+ #   --input_path       the reference video path
+ #   --repaint          "True" to repaint the first frame with FLUX, or the path to an already repainted first frame of the source video
+ #   --mask_path        white (255) marks the foreground to be edited; black (0) remains unchanged
+ #   --dilation_pixels  dilation pixels for mask processing in foreground_edit mode
+ #   --density          controls the sparsity of tracking points
+ #   --gpu              the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --repaint <True/repaint_path> \
+     --mask_path <mask_path> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --generate_type='foreground_edit' \
+     --dilation_pixels=30 \
+     --density 10 \
+     --gpu <gpu_id>
+ ```
143
+
144
+ #### 3. Background Edit
145
+ ![bg_edit](assets/bg_edit.gif)
146
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt           prompt text
+ #   --checkpoint_path  FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir       output directory
+ #   --input_path       the reference video path
+ #   --repaint          "True" to repaint the first frame with FLUX, or the path to an already repainted first frame of the source video
+ #   --mask_path        white (255) marks the foreground, which stays unchanged; the rest (the background) is the area to be edited
+ #   --density          controls the sparsity of tracking points
+ #   --gpu              the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --repaint <True/repaint_path> \
+     --mask_path <mask_path> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --generate_type='background_edit' \
+     --density 10 \
+     --gpu <gpu_id>
+ ```
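
The same white-on-black mask means opposite things in the two editing modes: in `foreground_edit` the white region is edited, while in `background_edit` it is preserved. The sketch below only illustrates that convention on a tiny mask. The function name `editable_region` is hypothetical, it ignores the dilation that `--dilation_pixels` applies in foreground mode, and it is not how `demo.py` processes masks internally.

```python
def editable_region(mask: list[list[int]], generate_type: str) -> list[list[bool]]:
    """Return True for every pixel that the given mode is allowed to change.

    mask is a 2D grid of 0 (black) and 255 (white) values.
    """
    if generate_type == "foreground_edit":
        # White marks the foreground to be edited; black stays unchanged.
        return [[px == 255 for px in row] for row in mask]
    if generate_type == "background_edit":
        # White marks the foreground to keep; everything else is edited.
        return [[px != 255 for px in row] for row in mask]
    raise ValueError(f"unsupported generate_type: {generate_type!r}")


mask = [
    [0, 255],
    [255, 0],
]
print(editable_region(mask, "foreground_edit"))
print(editable_region(mask, "background_edit"))
```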
160
+
161
+ #### 4. Camera Control
162
+ ![camera_ctrl](assets/camera_ctrl.gif)
163
+
164
+ We provide three camera control methods: (1) use predefined templates; (2) use a pose text file (pose txt); (3) input another video, from which Pi3 automatically estimates the camera pose and applies it to the video to be generated.
165
+
166
+ ##### 1. Use predefined templates
167
+
168
+ We provide several camera motion templates; choose one of them. In practice, we find that describing the camera motion in the prompt gives better results.
169
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt               prompt text
+ #   --checkpoint_path      FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir           output directory
+ #   --input_path           the reference image or video path
+ #   --camera_motion        the camera motion type, see examples below
+ #   --tracking_method      the tracking method (moge, DELTA); for image input, 'moge' is necessary
+ #   --override_extrinsics  how to apply camera motion: "override" replaces the original camera, "append" builds upon it
+ #   --density              controls the sparsity of tracking points
+ #   --gpu                  the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --camera_motion "<camera_motion>" \
+     --tracking_method <tracking_method> \
+     --override_extrinsics <override/append> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --density 5 \
+     --gpu <gpu_id>
+ ```
183
+
184
+ Here are some tips for camera motion:
185
+ - trans: translation motion, the camera will move in the direction of the vector (dx, dy, dz) with range [-1, 1]
186
+   - Positive X: Move left, Negative X: Move right
187
+   - Positive Y: Move down, Negative Y: Move up
188
+   - Positive Z: Zoom in, Negative Z: Zoom out
189
+   - e.g., 'trans -0.1 -0.1 -0.1' moving right, down and zoom in
190
+   - e.g., 'trans -0.1 0.0 0.0 5 45' moving right 0.1 from frame 5 to 45
191
+ - rot: rotation motion, the camera will rotate around the axis (x, y, z) by the angle
192
+   - X-axis rotation: positive X: pitch down, negative X: pitch up
193
+   - Y-axis rotation: positive Y: yaw left, negative Y: yaw right
194
+   - Z-axis rotation: positive Z: roll counter-clockwise, negative Z: roll clockwise
195
+   - e.g., 'rot y 25' rotating 25 degrees around y-axis (yaw left)
196
+   - e.g., 'rot x -30 10 40' rotating -30 degrees around x-axis (pitch up) from frame 10 to 40
197
+ - spiral: spiral motion, the camera will move in a spiral path with the given radius
198
+   - e.g., 'spiral 2' spiral motion with radius 2
199
+   - e.g., 'spiral 2 15 35' spiral motion with radius 2 from frame 15 to 35
200
+
201
+ Multiple transformations can be combined using a semicolon (;) as the separator:
202
+ - e.g., "trans 0 0 -0.5 0 30; rot x -25 0 30; trans -0.1 0 0 30 48"
203
+ This will:
204
+ 1. Zoom in (z-0.5) from frame 0 to 30
205
+ 2. Pitch up (rotate -25 degrees around x-axis) from frame 0 to 30
206
+ 3. Move right (x-0.1) from frame 30 to 48
207
+
208
+ Notes:
209
+ - If start_frame and end_frame are not specified, the motion will be applied to all frames (0-48)
210
+ - Frames after end_frame will maintain the final transformation
211
+ - For combined transformations, they are applied in sequence
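
The camera motion strings described above follow a small grammar: a motion type, its values, and an optional start and end frame, with several motions joined by semicolons. The sketch below parses that grammar as read from the examples in this README. It is an illustrative assumption and not the parser used by `demo.py`: the names `CameraOp` and `parse_camera_motion` are hypothetical, and no direction semantics are applied to the values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CameraOp:
    kind: str                          # "trans", "rot", or "spiral"
    params: tuple                      # (dx, dy, dz) | (axis, angle) | (radius,)
    start_frame: Optional[int] = None  # None means "all frames"
    end_frame: Optional[int] = None


# Number of values each motion type takes before the optional frame range.
N_PARAMS = {"trans": 3, "rot": 2, "spiral": 1}


def parse_camera_motion(spec: str) -> list[CameraOp]:
    """Split a semicolon-separated motion string into CameraOp entries, in order."""
    ops = []
    for chunk in spec.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue
        kind, args = tokens[0], tokens[1:]
        if kind not in N_PARAMS:
            raise ValueError(f"unknown motion type: {kind!r}")
        n = N_PARAMS[kind]
        if len(args) not in (n, n + 2):
            raise ValueError(
                f"{kind!r} expects {n} value(s), optionally followed by start and end frames"
            )
        if kind == "rot":
            params = (args[0], float(args[1]))  # axis name, angle in degrees
        else:
            params = tuple(float(a) for a in args[:n])
        frames = [int(a) for a in args[n:]]     # empty, or [start, end]
        ops.append(CameraOp(kind, params, *frames))
    return ops


for op in parse_camera_motion("trans 0 0 -0.5 0 30; rot x -25 0 30; trans -0.1 0 0 30 48"):
    print(op)
```

Keeping the operations in a list preserves the order in which combined transformations are applied.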
212
+
213
+
214
+ ##### 2. Use a pose text file (pose txt)
215
+
216
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt               prompt text
+ #   --checkpoint_path      FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir           output directory
+ #   --input_path           the reference image or video path
+ #   --camera_motion        when the camera motion type is "path", --pose_file is required
+ #   --pose_file            txt file of camera poses; each line corresponds to one frame
+ #   --tracking_method      the tracking method (moge, DELTA); for image input, 'moge' is necessary
+ #   --override_extrinsics  how to apply camera motion: "override" replaces the original camera, "append" builds upon it
+ #   --density              controls the sparsity of tracking points
+ #   --gpu                  the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --camera_motion "path" \
+     --pose_file <pose_file_txt> \
+     --tracking_method <tracking_method> \
+     --override_extrinsics <override/append> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --density 5 \
+     --gpu <gpu_id>
+ ```
231
+
232
+ ##### 3. Input another video to extract the camera pose
233
+
234
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt               prompt text
+ #   --checkpoint_path      FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --output_dir           output directory
+ #   --input_path           the reference image or video path
+ #   --camera_motion        when the camera motion type is "path", --pose_file is required
+ #   --pose_file            video file from which Pi3 automatically estimates the camera pose
+ #   --tracking_method      the tracking method (moge, DELTA); for image input, 'moge' is necessary
+ #   --override_extrinsics  how to apply camera motion: "override" replaces the original camera, "append" builds upon it
+ #   --density              controls the sparsity of tracking points
+ #   --gpu                  the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --output_dir <output_dir> \
+     --input_path <input_path> \
+     --camera_motion "path" \
+     --pose_file <pose_file_mp4> \
+     --tracking_method <tracking_method> \
+     --override_extrinsics <override/append> \
+     --video_length=97 \
+     --sample_size 512 896 \
+     --density 5 \
+     --gpu <gpu_id>
+ ```
249
+
250
+
251
+ #### 5. Object Manipulation
252
+ ![object](assets/object.gif)
253
+ We provide several object manipulation templates; choose one of them. In practice, we find that describing the object motion in the prompt gives better results.
254
+ ```bash
+ # Replace each <...> placeholder before running:
+ #   --prompt           prompt text
+ #   --checkpoint_path  FlexAM checkpoint path (e.g., checkpoints/Diffusion_Transformer/Wan2.2-Fun-5B-FLEXAM)
+ #   --input_path       the reference image path
+ #   --object_motion    the object motion type (up, down, left, right)
+ #   --object_mask      the object mask path
+ #   --tracking_method  the tracking method (moge, DELTA); for image input, 'moge' is necessary
+ #   --gpu              the GPU id
+ python demo.py \
+     --prompt "<prompt text>" \
+     --checkpoint_path <model_path> \
+     --input_path <input_path> \
+     --object_motion <object_motion> \
+     --object_mask <object_mask_path> \
+     --tracking_method <tracking_method> \
+     --sample_size 512 896 \
+     --video_length=49 \
+     --density 30 \
+     --gpu <gpu_id>
+ ```
267
+ Note that, depending on the tracker you choose, you may need to adjust the translation scale.
268
+
269
+
270
+ ## 🙏 Acknowledgements
271
+
272
+ This project builds upon several excellent open source projects:
273
+
274
+ * [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
275
+
276
+ * [DELTA](https://github.com/snap-research/DELTA_densetrack3d)
277
+
278
+ * [MoGe](https://github.com/microsoft/MoGe)
279
+
280
+ * [vggt](https://github.com/facebookresearch/vggt)
281
+
282
+ * [Pi3](https://github.com/yyfz/Pi3)
283
+
284
+ We thank the authors and contributors of these projects for their valuable contributions to the open source community!
285
+
286
+ ## 🌟 Citation
287
+ If you find FlexAM useful for your research, please cite our paper:
288
+ ```
289
+
290
+ ```
291
+
292
+
assets/bg_edit.gif ADDED

Git LFS Details

  • SHA256: e49f61db3dfbce3195afaf3918478c910d1b6a39c1f464c9202104dc4687cd81
  • Pointer size: 132 Bytes
  • Size of remote file: 7.45 MB
assets/camera_ctrl.gif ADDED

Git LFS Details

  • SHA256: c7dd357520396841a8a18d53fa2a3afdbb6197acc336394b272e7cf8dcf6eee9
  • Pointer size: 133 Bytes
  • Size of remote file: 13.1 MB
assets/fg_edit.gif ADDED

Git LFS Details

  • SHA256: cf8be80255bc6ab7f785f841416770234f8326d9516314dee62b641326253732
  • Pointer size: 132 Bytes
  • Size of remote file: 5.25 MB
assets/flexam_workflow.json ADDED
@@ -0,0 +1 @@
1
+ {"id":"bfd25a8e-6175-448c-96fe-b50bd11c792c","revision":0,"last_node_id":37,"last_link_id":66,"nodes":[{"id":16,"type":"VHS_VideoInfoLoaded","pos":[172.1576385498047,858.3464965820312],"size":[247.837890625,106],"flags":{},"order":10,"mode":0,"inputs":[{"label":"video_info","localized_name":"video_info","name":"video_info","type":"VHS_VIDEOINFO","link":37}],"outputs":[{"label":"fps🟦","localized_name":"fps🟦","name":"fps🟦","type":"FLOAT","links":[66]},{"label":"frame_count🟦","localized_name":"frame_count🟦","name":"frame_count🟦","type":"INT","links":[]},{"label":"duration🟦","localized_name":"duration🟦","name":"duration🟦","type":"FLOAT","links":null},{"label":"width🟦","localized_name":"width🟦","name":"width🟦","type":"INT","links":[]},{"label":"height🟦","localized_name":"height🟦","name":"height🟦","type":"INT","links":[]}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"8550981384301e9bc5bfea83e5c2c75258102593","Node name for S&R":"VHS_VideoInfoLoaded","widget_ue_connectable":{}},"widgets_values":{}},{"id":20,"type":"SAM2ModelLoader (segment anything2)","pos":[116.06722259521484,1170.0262451171875],"size":[351.2642517089844,58],"flags":{},"order":0,"mode":0,"inputs":[{"localized_name":"model_name","name":"model_name","type":"COMBO","widget":{"name":"model_name"},"link":null}],"outputs":[{"localized_name":"SAM2_MODEL","name":"SAM2_MODEL","type":"SAM2_MODEL","links":[31]}],"properties":{"cnr_id":"comfyui-sam2","ver":"1.0.3","Node name for S&R":"SAM2ModelLoader (segment anything2)"},"widgets_values":["sam2_hiera_small.pt"]},{"id":17,"type":"GroundingDinoModelLoader (segment 
anything2)","pos":[83.4976577758789,1269.9647216796875],"size":[407.3121032714844,58],"flags":{},"order":1,"mode":0,"inputs":[{"localized_name":"model_name","name":"model_name","type":"COMBO","widget":{"name":"model_name"},"link":null}],"outputs":[{"localized_name":"GROUNDING_DINO_MODEL","name":"GROUNDING_DINO_MODEL","type":"GROUNDING_DINO_MODEL","links":[32]}],"properties":{"cnr_id":"comfyui-sam2","ver":"1.0.3","Node name for S&R":"GroundingDinoModelLoader (segment anything2)"},"widgets_values":["GroundingDINO_SwinB (938MB)"]},{"id":24,"type":"FunTextBox","pos":[-535.8018188476562,178.36338806152344],"size":[368.5529479980469,159.4075927734375],"flags":{},"order":2,"mode":0,"inputs":[{"localized_name":"prompt","name":"prompt","type":"STRING","widget":{"name":"prompt"},"link":null}],"outputs":[{"localized_name":"prompt","name":"prompt","type":"STRING_PROMPT","slot_index":0,"links":[41]}],"properties":{"aux_id":"Sandwich-2020/das_v2","ver":"567e9a3a2ba875b1da8f817beffdbdf50c1cfb37","Node name for 
S&R":"FunTextBox","cnr_id":"videox-fun"},"widgets_values":["色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"]},{"id":15,"type":"VideoToTrackingVisualizeAll","pos":[525.6013793945312,242.7574920654297],"size":[396.8941345214844,206],"flags":{},"order":13,"mode":0,"inputs":[{"localized_name":"input_video","name":"input_video","type":"IMAGE","link":23},{"localized_name":"pred_tracks","name":"pred_tracks","type":"TRACKING_DATA","link":21},{"localized_name":"pred_visibility","name":"pred_visibility","type":"TRACKING_DATA","link":22},{"localized_name":"mask_video","name":"mask_video","shape":7,"type":"IMAGE","link":38},{"localized_name":"point_size","name":"point_size","type":"INT","widget":{"name":"point_size"},"link":null},{"localized_name":"cos_level","name":"cos_level","type":"INT","widget":{"name":"cos_level"},"link":null},{"localized_name":"generate_type","name":"generate_type","type":"COMBO","widget":{"name":"generate_type"},"link":56}],"outputs":[{"localized_name":"tracking_video","name":"tracking_video","type":"IMAGE","links":[48,58]},{"localized_name":"depth_video","name":"depth_video","type":"IMAGE","links":[25,49]},{"localized_name":"cos_level_0","name":"cos_level_0","type":"IMAGE","links":[26,50]},{"localized_name":"cos_level_1","name":"cos_level_1","type":"IMAGE","links":[27,51]},{"localized_name":"cos_level_2","name":"cos_level_2","type":"IMAGE","links":[28,52]},{"localized_name":"cos_level_3","name":"cos_level_3","type":"IMAGE","links":[29,53]}],"properties":{"aux_id":"Sandwich-2020/FlexAM","ver":"223059aa32cbbfa6fb848d09f050cda6e9368ad3","Node name for 
S&R":"VideoToTrackingVisualizeAll"},"widgets_values":[4,4,"motion_transfer"]},{"id":18,"type":"ImageAndMaskPreview","pos":[145.69639587402344,1370.8953857421875],"size":[270,338.0001220703125],"flags":{},"order":11,"mode":0,"inputs":[{"localized_name":"image","name":"image","shape":7,"type":"IMAGE","link":null},{"localized_name":"mask","name":"mask","shape":7,"type":"MASK","link":30},{"localized_name":"mask_opacity","name":"mask_opacity","type":"FLOAT","widget":{"name":"mask_opacity"},"link":null},{"localized_name":"mask_color","name":"mask_color","type":"STRING","widget":{"name":"mask_color"},"link":null},{"localized_name":"pass_through","name":"pass_through","type":"BOOLEAN","widget":{"name":"pass_through"},"link":null}],"outputs":[{"localized_name":"composite","name":"composite","type":"IMAGE","links":null}],"properties":{"cnr_id":"comfyui-kjnodes","ver":"6c996e1877db08c7de020ee14421dd28d7574ec2","Node name for S&R":"ImageAndMaskPreview","aux_id":"kijai/ComfyUI-KJNodes"},"widgets_values":[1,"255, 255, 
255",false]},{"id":35,"type":"VHS_VideoCombine","pos":[522.1612548828125,831.1239624023438],"size":[336.2972106933594,534.8648071289062],"flags":{},"order":14,"mode":0,"inputs":[{"label":"images","localized_name":"images","name":"images","type":"IMAGE","link":65},{"label":"audio","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"meta_batch","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"label":"vae","localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":66},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p10le","yuv420p"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null}],"outputs":[{"label":"Filenames","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","links":[]}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"8550981384301e9bc5bfea83e5c2c75258102593","Node name for 
S&R":"VHS_VideoCombine","widget_ue_connectable":{}},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"AnimateDiff","format":"video/h265-mp4","pix_fmt":"yuv420p10le","crf":22,"save_metadata":true,"pingpong":false,"save_output":true,"videopreview":{"paused":false,"hidden":false,"params":{"filename":"AnimateDiff_00078.mp4","format":"video/h265-mp4","subfolder":"","type":"output","frame_rate":24,"workflow":"AnimateDiff_00078.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/AnimateDiff_00078.mp4"}}}},{"id":26,"type":"LoadWan2_2FunModel_FlexAM","pos":[624.5392456054688,7.184566497802734],"size":[276.705078125,154],"flags":{},"order":3,"mode":0,"inputs":[{"localized_name":"model","name":"model","type":"COMBO","widget":{"name":"model"},"link":null},{"localized_name":"model_type","name":"model_type","type":"COMBO","widget":{"name":"model_type"},"link":null},{"localized_name":"GPU_memory_mode","name":"GPU_memory_mode","type":"COMBO","widget":{"name":"GPU_memory_mode"},"link":null},{"localized_name":"config","name":"config","type":"COMBO","widget":{"name":"config"},"link":null},{"localized_name":"precision","name":"precision","type":"COMBO","widget":{"name":"precision"},"link":null}],"outputs":[{"localized_name":"funmodels","name":"funmodels","type":"FunModels","links":[39]}],"properties":{"aux_id":"Sandwich-2020/FlexAM","ver":"8010f2ca4872f6c6bb2a7cce01d60e79dcb614e1","Node name for 
S&R":"LoadWan2_2FunModel_FlexAM"},"widgets_values":["Wan2.2-Fun-5B-Control","Control","model_cpu_offload","wan2.2/wan_civitai_5b_FlexAM.yaml","fp16"]},{"id":28,"type":"VHS_VideoCombine","pos":[1343.7791748046875,87.93187713623047],"size":[390.9534912109375,601.810302734375],"flags":{},"order":22,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":44},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for 
S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00262.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00262.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00262.mp4"}}}},{"id":3,"type":"VHS_VideoCombine","pos":[904.107421875,741.1229248046875],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":15,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":58},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null
},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00255.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00255.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00255.mp4"}}}},{"id":10,"type":"VHS_VideoCombine","pos":[1318.115966796875,733.9944458007812],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":16,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":25},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"s
ave_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00254.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00254.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00254.mp4"}}}},{"id":11,"type":"VHS_VideoCombine","pos":[1729.705322265625,732.6654663085938],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":17,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":26},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_n
ame":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for 
S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":100,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00256.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00256.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00256.mp4"}}}},{"id":12,"type":"VHS_VideoCombine","pos":[904.847900390625,1337.927001953125],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":18,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":27},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link
":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00258.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00258.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00258.mp4"}}}},{"id":13,"type":"VHS_VideoCombine","pos":[1317.166015625,1338.3345947265625],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":19,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":28},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name
":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00257.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00257.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00257.mp4"}}}},{"id":14,"type":"VHS_VideoCombine","pos":[1731.470703125,1340.0020751953125],"size":[390.9534912109375,595.3023071289062],"flags":{},"order":20,"mode":0,"inputs":[{"label":"图像","localized_name":"images","name":"images","type":"IMAGE","link":29},{"label":"音频","localized_name":"audio","name":"audio","shape":7,"type":"AUDIO","link":null},{"label":"批次管理","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"frame_rate","name":"frame_rate","type":"FLOAT","widget":{"name":"frame_rate"},"link":null},{"localized_name":"loop_count","name":"loop_count","type":"INT","widget":{"name":"loop_count"},"link":null},{"localized
_name":"filename_prefix","name":"filename_prefix","type":"STRING","widget":{"name":"filename_prefix"},"link":null},{"localized_name":"format","name":"format","type":"COMBO","widget":{"name":"format"},"link":null},{"localized_name":"pingpong","name":"pingpong","type":"BOOLEAN","widget":{"name":"pingpong"},"link":null},{"localized_name":"save_output","name":"save_output","type":"BOOLEAN","widget":{"name":"save_output"},"link":null},{"name":"pix_fmt","type":["yuv420p","yuv420p10le"],"widget":{"name":"pix_fmt"},"link":null},{"name":"crf","type":"INT","widget":{"name":"crf"},"link":null},{"name":"save_metadata","type":"BOOLEAN","widget":{"name":"save_metadata"},"link":null},{"name":"trim_to_audio","type":"BOOLEAN","widget":{"name":"trim_to_audio"},"link":null}],"outputs":[{"label":"文件名","localized_name":"Filenames","name":"Filenames","type":"VHS_FILENAMES","slot_index":0,"links":null}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"3234937ff5f3ca19068aaba5042771514de2429d","Node name for 
S&R":"VHS_VideoCombine"},"widgets_values":{"frame_rate":16,"loop_count":0,"filename_prefix":"flexam/output","format":"video/h264-mp4","pix_fmt":"yuv420p","crf":22,"save_metadata":true,"trim_to_audio":false,"pingpong":false,"save_output":true,"videopreview":{"hidden":false,"paused":false,"params":{"filename":"output_00259.mp4","subfolder":"flexam","type":"output","format":"video/h264-mp4","frame_rate":16,"workflow":"output_00259.png","fullpath":"/hpc2hdd/JH_DATA/share/yingcongchen/PrivateShareGroup/yingcongchen_datasets/das/ComfyUI/output/flexam/output_00259.mp4"}}}},{"id":6,"type":"VideoToTrackingPredict","pos":[-149.93978881835938,376.44244384765625],"size":[280.4302673339844,78],"flags":{},"order":8,"mode":0,"inputs":[{"localized_name":"input_video","name":"input_video","type":"IMAGE","link":7},{"localized_name":"density","name":"density","type":"INT","widget":{"name":"density"},"link":null}],"outputs":[{"localized_name":"pred_tracks","name":"pred_tracks","type":"TRACKING_DATA","links":[21]},{"localized_name":"pred_visibility","name":"pred_visibility","type":"TRACKING_DATA","links":[22]}],"properties":{"aux_id":"Sandwich-2020/FlexAM","ver":"6e51ca03cab9988a8ad23cb3be6d7ea5c38e8e2b","Node name for S&R":"VideoToTrackingPredict"},"widgets_values":[10]},{"id":22,"type":"MaskToImage","pos":[254.90896606445312,613.7998657226562],"size":[217.51222229003906,63.1148681640625],"flags":{"collapsed":false},"order":12,"mode":0,"inputs":[{"label":"mask","localized_name":"遮罩","name":"mask","type":"MASK","link":35}],"outputs":[{"label":"IMAGE","localized_name":"图像","name":"IMAGE","type":"IMAGE","links":[38,45,65]}],"properties":{"cnr_id":"comfy-core","ver":"0.3.60","Node name for 
S&R":"MaskToImage","widget_ue_connectable":{}},"widgets_values":[]},{"id":30,"type":"PrimitiveNode","pos":[-476.5188903808594,1297.36083984375],"size":[210,106],"flags":{},"order":5,"mode":0,"inputs":[],"outputs":[{"name":"COMBO","type":"COMBO","widget":{"name":"generate_type"},"links":[56,57]}],"properties":{"Run widget replace on values":false},"widgets_values":["motion_transfer","fixed",""]},{"id":23,"type":"LoadImage","pos":[-520.7343139648438,943.2153930664062],"size":[315,314.00006103515625],"flags":{},"order":6,"mode":0,"inputs":[{"localized_name":"图像","name":"image","type":"COMBO","widget":{"name":"image"},"link":null},{"localized_name":"选择文件上传","name":"upload","type":"IMAGEUPLOAD","widget":{"name":"upload"},"link":null}],"outputs":[{"localized_name":"图像","name":"IMAGE","type":"IMAGE","links":[42,43]},{"localized_name":"遮罩","name":"MASK","type":"MASK","links":null}],"properties":{"cnr_id":"comfy-core","ver":"0.6.0","Node name for S&R":"LoadImage"},"widgets_values":["34.jpg","image"]},{"id":2,"type":"VHS_LoadVideo","pos":[-538.6278076171875,379.6762390136719],"size":[367.5401611328125,555.6934204101562],"flags":{},"order":7,"mode":0,"inputs":[{"label":"meta_batch","localized_name":"meta_batch","name":"meta_batch","shape":7,"type":"VHS_BatchManager","link":null},{"label":"vae","localized_name":"vae","name":"vae","shape":7,"type":"VAE","link":null},{"localized_name":"video","name":"video","type":"COMBO","widget":{"name":"video"},"link":null},{"localized_name":"force_rate","name":"force_rate","type":"FLOAT","widget":{"name":"force_rate"},"link":null},{"localized_name":"custom_width","name":"custom_width","type":"INT","widget":{"name":"custom_width"},"link":null},{"localized_name":"custom_height","name":"custom_height","type":"INT","widget":{"name":"custom_height"},"link":null},{"localized_name":"frame_load_cap","name":"frame_load_cap","type":"INT","widget":{"name":"frame_load_cap"},"link":null},{"localized_name":"skip_first_frames","name":"skip_first_frames","ty
pe":"INT","widget":{"name":"skip_first_frames"},"link":null},{"localized_name":"select_every_nth","name":"select_every_nth","type":"INT","widget":{"name":"select_every_nth"},"link":null},{"localized_name":"format","name":"format","shape":7,"type":"COMBO","widget":{"name":"format"},"link":null}],"outputs":[{"label":"IMAGE","localized_name":"图像","name":"IMAGE","type":"IMAGE","slot_index":0,"links":[7,23,36,47]},{"label":"frame_count","localized_name":"frame_count","name":"frame_count","type":"INT"},{"label":"audio","localized_name":"audio","name":"audio","type":"AUDIO"},{"label":"video_info","localized_name":"video_info","name":"video_info","type":"VHS_VIDEOINFO","links":[37]}],"properties":{"cnr_id":"comfyui-videohelpersuite","ver":"8550981384301e9bc5bfea83e5c2c75258102593","Node name for S&R":"VHS_LoadVideo","widget_ue_connectable":{}},"widgets_values":{"video":"34.mp4","force_rate":0,"custom_width":0,"custom_height":0,"frame_load_cap":0,"skip_first_frames":0,"select_every_nth":1,"format":"AnimateDiff","choose video to 
upload":"image","videopreview":{"paused":false,"hidden":false,"params":{"force_rate":0,"filename":"34.mp4","select_every_nth":1,"frame_load_cap":0,"format":"video/mp4","skip_first_frames":0,"type":"input"}}}},{"id":27,"type":"Wan2_2FunV2VSampler_FlexAM","pos":[1004.994384765625,-76.85031127929688],"size":[280.724609375,766],"flags":{},"order":21,"mode":0,"inputs":[{"localized_name":"funmodels","name":"funmodels","type":"FunModels","link":39},{"localized_name":"prompt","name":"prompt","type":"STRING_PROMPT","link":40},{"localized_name":"negative_prompt","name":"negative_prompt","type":"STRING_PROMPT","link":41},{"localized_name":"original_video","name":"original_video","shape":7,"type":"IMAGE","link":47},{"localized_name":"depth_video","name":"depth_video","shape":7,"type":"IMAGE","link":49},{"localized_name":"control_video","name":"control_video","shape":7,"type":"IMAGE","link":48},{"localized_name":"cos_video0","name":"cos_video0","shape":7,"type":"IMAGE","link":50},{"localized_name":"cos_video1","name":"cos_video1","shape":7,"type":"IMAGE","link":51},{"localized_name":"cos_video2","name":"cos_video2","shape":7,"type":"IMAGE","link":52},{"localized_name":"cos_video3","name":"cos_video3","shape":7,"type":"IMAGE","link":53},{"localized_name":"mask_video","name":"mask_video","shape":7,"type":"IMAGE","link":45},{"localized_name":"start_image","name":"start_image","shape":7,"type":"IMAGE","link":42},{"localized_name":"end_image","name":"end_image","shape":7,"type":"IMAGE","link":null},{"localized_name":"ref_image","name":"ref_image","shape":7,"type":"IMAGE","link":43},{"localized_name":"camera_conditions","name":"camera_conditions","shape":7,"type":"STRING","link":null},{"localized_name":"riflex_k","name":"riflex_k","shape":7,"type":"RIFLEXT_ARGS","link":null},{"localized_name":"video_length","name":"video_length","type":"INT","widget":{"name":"video_length"},"link":null},{"localized_name":"base_resolution","name":"base_resolution","type":"COMBO","widget":{"name":"base_
resolution"},"link":null},{"localized_name":"seed","name":"seed","type":"INT","widget":{"name":"seed"},"link":null},{"localized_name":"steps","name":"steps","type":"INT","widget":{"name":"steps"},"link":null},{"localized_name":"cfg","name":"cfg","type":"FLOAT","widget":{"name":"cfg"},"link":null},{"localized_name":"denoise_strength","name":"denoise_strength","type":"FLOAT","widget":{"name":"denoise_strength"},"link":null},{"localized_name":"scheduler","name":"scheduler","type":"COMBO","widget":{"name":"scheduler"},"link":null},{"localized_name":"shift","name":"shift","type":"INT","widget":{"name":"shift"},"link":null},{"localized_name":"boundary","name":"boundary","type":"FLOAT","widget":{"name":"boundary"},"link":null},{"localized_name":"teacache_threshold","name":"teacache_threshold","type":"FLOAT","widget":{"name":"teacache_threshold"},"link":null},{"localized_name":"enable_teacache","name":"enable_teacache","type":"COMBO","widget":{"name":"enable_teacache"},"link":null},{"localized_name":"num_skip_start_steps","name":"num_skip_start_steps","type":"INT","widget":{"name":"num_skip_start_steps"},"link":null},{"localized_name":"teacache_offload","name":"teacache_offload","type":"COMBO","widget":{"name":"teacache_offload"},"link":null},{"localized_name":"cfg_skip_ratio","name":"cfg_skip_ratio","type":"FLOAT","widget":{"name":"cfg_skip_ratio"},"link":null},{"localized_name":"generate_type","name":"generate_type","type":"COMBO","widget":{"name":"generate_type"},"link":57},{"localized_name":"dilation_pixels","name":"dilation_pixels","type":"INT","widget":{"name":"dilation_pixels"},"link":null}],"outputs":[{"localized_name":"images","name":"images","type":"IMAGE","links":[44]}],"properties":{"aux_id":"Sandwich-2020/FlexAM","ver":"8010f2ca4872f6c6bb2a7cce01d60e79dcb614e1","Node name for 
S&R":"Wan2_2FunV2VSampler_FlexAM"},"widgets_values":[49,512,1078487255624295,"randomize",50,6,1,"Flow",5,0.9,0.1,true,5,true,0,"motion_transfer",13]},{"id":19,"type":"GroundingDinoSAM2Segment (segment anything2)","pos":[99.71199035644531,1005.0245361328125],"size":[359.7466735839844,122],"flags":{},"order":9,"mode":0,"inputs":[{"localized_name":"sam_model","name":"sam_model","type":"SAM2_MODEL","link":31},{"localized_name":"grounding_dino_model","name":"grounding_dino_model","type":"GROUNDING_DINO_MODEL","link":32},{"localized_name":"image","name":"image","type":"IMAGE","link":36},{"localized_name":"prompt","name":"prompt","type":"STRING","widget":{"name":"prompt"},"link":null},{"localized_name":"threshold","name":"threshold","type":"FLOAT","widget":{"name":"threshold"},"link":null}],"outputs":[{"localized_name":"图像","name":"IMAGE","type":"IMAGE","links":null},{"localized_name":"遮罩","name":"MASK","type":"MASK","links":[30,35]}],"properties":{"cnr_id":"comfyui-sam2","ver":"1.0.3","Node name for S&R":"GroundingDinoSAM2Segment (segment anything2)"},"widgets_values":["building",0.35]},{"id":25,"type":"FunTextBox","pos":[-545.2793579101562,-27.938735961914062],"size":[380.845703125,157.68350219726562],"flags":{},"order":4,"mode":0,"inputs":[{"localized_name":"prompt","name":"prompt","type":"STRING","widget":{"name":"prompt"},"link":null}],"outputs":[{"localized_name":"prompt","name":"prompt","type":"STRING_PROMPT","slot_index":0,"links":[40]}],"title":"Positive Prompt(正向提示词)","properties":{"aux_id":"Sandwich-2020/das_v2","ver":"567e9a3a2ba875b1da8f817beffdbdf50c1cfb37","Node name for S&R":"FunTextBox","cnr_id":"videox-fun"},"widgets_values":["A 
building"]}],"links":[[7,2,0,6,0,"IMAGE"],[21,6,0,15,1,"TRACKING_DATA"],[22,6,1,15,2,"TRACKING_DATA"],[23,2,0,15,0,"IMAGE"],[25,15,1,10,0,"IMAGE"],[26,15,2,11,0,"IMAGE"],[27,15,3,12,0,"IMAGE"],[28,15,4,13,0,"IMAGE"],[29,15,5,14,0,"IMAGE"],[30,19,1,18,1,"MASK"],[31,20,0,19,0,"SAM2_MODEL"],[32,17,0,19,1,"GROUNDING_DINO_MODEL"],[35,19,1,22,0,"MASK"],[36,2,0,19,2,"IMAGE"],[37,2,3,16,0,"VHS_VIDEOINFO"],[38,22,0,15,3,"IMAGE"],[39,26,0,27,0,"FunModels"],[40,25,0,27,1,"STRING_PROMPT"],[41,24,0,27,2,"STRING_PROMPT"],[42,23,0,27,11,"IMAGE"],[43,23,0,27,13,"IMAGE"],[44,27,0,28,0,"IMAGE"],[45,22,0,27,10,"IMAGE"],[47,2,0,27,3,"IMAGE"],[48,15,0,27,5,"IMAGE"],[49,15,1,27,4,"IMAGE"],[50,15,2,27,6,"IMAGE"],[51,15,3,27,7,"IMAGE"],[52,15,4,27,8,"IMAGE"],[53,15,5,27,9,"IMAGE"],[56,30,0,15,6,"COMBO"],[57,30,0,27,30,"COMBO"],[58,15,0,3,0,"IMAGE"],[65,22,0,35,0,"IMAGE"],[66,16,0,35,4,"FLOAT"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.5274369073675707,"offset":[1296.7416827385423,224.42370781480264]}},"version":0.4}
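
Note on reading the `links` array above: each row appears to follow the pattern `[link_id, source_node_id, source_output_slot, target_node_id, target_input_slot, type]`. For example, `[7,2,0,6,0,"IMAGE"]` matches the `IMAGE` output of node 2 (`VHS_LoadVideo`) feeding the `input_video` input of node 6 (`VideoToTrackingPredict`). The sketch below is a minimal illustration of that reading, not part of the commit. The helper names `slot_name` and `describe_links` are hypothetical, and the `sample` dictionary is a hand-trimmed excerpt of two nodes from the workflow.

```python
from typing import Any


def slot_name(node: dict[str, Any], key: str, index: int) -> str:
    """Return the name of an input/output slot, or the raw index if it is missing."""
    slots = node.get(key) or []
    if 0 <= index < len(slots):
        return slots[index].get("name", str(index))
    return str(index)


def describe_links(workflow: dict[str, Any]) -> list[str]:
    """Render each row of `links` as 'source.output -> target.input'.

    Assumes the six-field row layout inferred from the workflow above.
    """
    nodes = {node["id"]: node for node in workflow.get("nodes", [])}
    described = []
    for link_id, src_id, src_slot, dst_id, dst_slot, link_type in workflow.get("links", []):
        src = nodes.get(src_id, {})
        dst = nodes.get(dst_id, {})
        src_label = src.get("type", f"node {src_id}")
        dst_label = dst.get("type", f"node {dst_id}")
        described.append(
            f"link {link_id} [{link_type}]: "
            f"{src_label}.{slot_name(src, 'outputs', src_slot)} -> "
            f"{dst_label}.{slot_name(dst, 'inputs', dst_slot)}"
        )
    return described


# Hand-trimmed excerpt of the workflow: only the fields the helper reads.
sample = {
    "nodes": [
        {"id": 2, "type": "VHS_LoadVideo", "outputs": [{"name": "IMAGE"}]},
        {"id": 6, "type": "VideoToTrackingPredict", "inputs": [{"name": "input_video"}]},
    ],
    "links": [[7, 2, 0, 6, 0, "IMAGE"]],
}

for line in describe_links(sample):
    print(line)
```

To inspect the full graph, the same function could be applied to the dictionary returned by `json.load` on the workflow file.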
assets/motion_transfer.gif ADDED

Git LFS Details

  • SHA256: 558e93bc865381709da03f75b2f204386fd16c9bb555eb273c08baf5d10e619a
  • Pointer size: 132 Bytes
  • Size of remote file: 5.31 MB
assets/object.gif ADDED

Git LFS Details

  • SHA256: 881c049b2a8d00e36e1b581aa9c006c08400cc6d39af265593bdaf126bac2086
  • Pointer size: 131 Bytes
  • Size of remote file: 850 kB
assets/teaser.gif ADDED

Git LFS Details

  • SHA256: a3b26483e27eccfd47a3455da2d805a4737438255cd6a5412673e6e525439e1e
  • Pointer size: 133 Bytes
  • Size of remote file: 11.3 MB