Manga Light Colorizer - ONNX Inference
Standalone inference script for the Manga Light Colorizer model.
Gallery
The following gallery uses the same source images as the manga-colorization-v2 project to facilitate direct comparison between models.
Comparison between input (left) and colorized output (right):
[Image gallery: black-and-white inputs (left) paired with colorized outputs (right)]
Quick Start
# Install dependencies
pip install -r requirements.txt
# Single image
python inference.py --input input/bw1.jpg
# All images in a folder
python inference.py --input input/
# Custom output folder
python inference.py --input input/ --output_dir output/
# Custom inference resolution
python inference.py --input input/ --infer-size 1024
Arguments
| Argument | Required | Default | Description |
|---|---|---|---|
| `--input` | Yes | - | Input grayscale image or folder |
| `--onnx-model` | No | `models/v6_generator.onnx` | Generator ONNX model path |
| `--sam-onnx` | No | `models/v6_sam_encoder.onnx` | SAM 2.1 encoder ONNX path |
| `--output_dir` | No | `./output/` | Output folder for colorized images |
| `--infer-size` | No | `768` | Inference resolution (square) |
| `--ort-device` | No | `cpu` | ONNX Runtime device (`cpu` or `cuda`) |
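The table above maps naturally onto an `argparse` parser. The following is a minimal sketch, not the actual contents of `inference.py`: the flag names and defaults come from the table, while the function names `build_parser` and `providers_for` are hypothetical, and the mapping from `--ort-device` to ONNX Runtime execution providers is an assumption about how the script might select a backend.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (sketch only)."""
    p = argparse.ArgumentParser(description="Manga Light Colorizer ONNX inference")
    p.add_argument("--input", required=True,
                   help="Input grayscale image or folder")
    p.add_argument("--onnx-model", default="models/v6_generator.onnx",
                   help="Generator ONNX model path")
    p.add_argument("--sam-onnx", default="models/v6_sam_encoder.onnx",
                   help="SAM 2.1 encoder ONNX path")
    p.add_argument("--output_dir", default="./output/",
                   help="Output folder for colorized images")
    p.add_argument("--infer-size", type=int, default=768,
                   help="Inference resolution (square)")
    p.add_argument("--ort-device", choices=["cpu", "cuda"], default="cpu",
                   help="ONNX Runtime device")
    return p


def providers_for(device: str) -> list[str]:
    """Assumed mapping from --ort-device to ONNX Runtime provider names."""
    if device == "cuda":
        # Providers are tried in priority order, so CPU is listed as a fallback.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--input", "input/bw1.jpg"])
print(args.infer_size, args.onnx_model, providers_for(args.ort_device))
```

Note that `argparse` converts hyphens in flag names to underscores, so `--infer-size` is read as `args.infer_size` and `--onnx-model` as `args.onnx_model`.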
Model Information
- Architecture: FastViT-SA36 Encoder + DualSemanticSAM Guide + UNet V6 Decoder
- Training Resolution: 512×512 pixels
- Current Inference Resolution: 768×768 pixels (default)
- Output: Resized back to original input resolution
Important: Resolution Notice
The model was trained at 512×512 pixels. Inference currently runs at 768×768 pixels by default.
The more the inference resolution differs from 512×512, the less faithful the colors will be.
For best results, use the training resolution:
# Best color accuracy, but lower resolution (matches training resolution)
python inference.py --input input/ --infer-size 512
# Default (good quality)
python inference.py --input input/
# Higher resolution (may reduce color accuracy)
python inference.py --input input/ --infer-size 1024
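To see how much the colors drift as the resolution moves away from 512×512, one option is to run the script at several sizes and write each run to its own folder. The sketch below only assembles the command lines from the documented flags; the helper name `sweep_commands` and the `output/size_<N>` folder layout are illustrative assumptions, not part of the project.

```python
import shlex
import sys

TRAIN_SIZE = 512  # training resolution stated in this README


def sweep_commands(input_path: str, sizes: list[int]) -> list[list[str]]:
    """Build one inference.py invocation per resolution (hypothetical helper)."""
    cmds = []
    for size in sizes:
        cmds.append([
            sys.executable, "inference.py",
            "--input", input_path,
            "--infer-size", str(size),
            "--output_dir", f"output/size_{size}",  # assumed folder layout
        ])
    return cmds


for cmd in sweep_commands("input/", [512, 768, 1024]):
    size = int(cmd[cmd.index("--infer-size") + 1])
    # Print the scale factor relative to the training resolution next to each command.
    print(f"x{size / TRAIN_SIZE:.2f}  {shlex.join(cmd[1:])}")
```

Each command list can be passed to `subprocess.run(cmd, check=True)` to execute it, after which the per-size folders can be compared side by side.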
Pipeline
Input (grayscale) → Resize to infer-size → SAM 2.1 (zeros) → Generator ONNX → Resize to original
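The flow above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the project's implementation: `resize_nearest` is a stand-in for a proper resize (the real script presumably uses OpenCV), `fake_generator` is a hypothetical placeholder for the generator ONNX session, and the shape of the zero-filled SAM feature tensor is invented purely for the example.

```python
import numpy as np


def resize_nearest(img: np.ndarray, h: int, w: int) -> np.ndarray:
    """Nearest-neighbour resize of an HxW or HxWxC array (stand-in for cv2.resize)."""
    rows = np.arange(h) * img.shape[0] // h
    cols = np.arange(w) * img.shape[1] // w
    return img[rows][:, cols]


def fake_generator(gray: np.ndarray, sam_feats: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for the generator: copies gray into 3 channels."""
    return np.repeat(gray[..., None], 3, axis=-1)


def colorize(gray: np.ndarray, infer_size: int = 768, generator=fake_generator) -> np.ndarray:
    orig_h, orig_w = gray.shape
    # 1. Resize the grayscale page to the square inference resolution.
    x = resize_nearest(gray, infer_size, infer_size).astype(np.float32) / 255.0
    # 2. "SAM 2.1 (zeros)": pass an all-zero guide tensor (shape is an assumption).
    sam_feats = np.zeros((1, 256, infer_size // 16, infer_size // 16), dtype=np.float32)
    # 3. Run the generator; assumed here to return HxWx3 floats in [0, 1].
    rgb = generator(x, sam_feats)
    # 4. Resize the result back to the original input resolution.
    out = resize_nearest(rgb, orig_h, orig_w)
    return (np.clip(out, 0.0, 1.0) * 255.0).round().astype(np.uint8)


page = np.random.randint(0, 256, size=(300, 200), dtype=np.uint8)
print(colorize(page, infer_size=512).shape)  # (300, 200, 3)
```

In a real run, `generator` would wrap a call to an `onnxruntime.InferenceSession`, with input names and tensor layouts taken from the exported model rather than from this sketch.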
Requirements
- Python 3.10+
- onnxruntime
- numpy
- opencv-python
See requirements.txt for the full list.
License
Model Weights
Licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).
You may:
- Share: copy and redistribute the material in any medium or format
- Adapt: remix, transform, and build upon the material
Under the following terms:
- Attribution: You must give appropriate credit
- NonCommercial: You may not use the material for commercial purposes
- ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license
See: https://creativecommons.org/licenses/by-nc-sa/4.0/
Inference Code
Licensed under GNU General Public License v3 (GPL-3.0).
You may use, modify, and distribute this code under the terms of the GPL-3.0 license.