I can't get the model to run.
Hi, I'm new in this world and I'm trying to run the model but I'm failing.
My code looks like:
import torch
from transformers import AutoModelForCausalLM
from deepseek_vl2.models import DeepseekVLV2Processor, DeepseekVLV2ForCausalLM
from deepseek_vl2.utils.io import load_pil_images
# specify the path to the model
model_path = "deepseek-ai/deepseek-vl2-tiny"
When I run it I get an error:
> python my_extraction.py
Traceback (most recent call last):
File "/Users/andresinaka/Desktop/LLMs/my_extraction.py", line 4, in <module>
from deepseek_vl2.models import DeepseekVLV2Processor, DeepseekVLV2ForCausalLM
ModuleNotFoundError: No module named 'deepseek_vl2'
Not sure what's going on, I'm following the readme...
I have torch installed:
pip show torch
Name: torch
Version: 2.5.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3-Clause
Location: /Users/andres.canal/Desktop/LLMs/myenv/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, setuptools, sympy, typing-extensions
Required-by: accelerate, easyocr, llms, torchvision
I have transformers installed:
(myenv) andres.canal:LLMs[main] $ pip show transformers
Name: transformers
Version: 4.49.0.dev0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /Users/andres.canal/Desktop/LLMs/myenv/lib/python3.12/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: llms
I also didn't manage to get it working, but it seems I'm a few steps ahead of you. First of all, you should run
git clone https://github.com/deepseek-ai/DeepSeek-VL2.git
then install the module from inside the cloned directory, following the README:
pip install -e .
After that you will notice it requires torch==2.0.1... and this torch version is available for Python 3.10 (not higher...) :)
Now I'm stuck at an xformers module missing error:
File ~\python_virtual_envs\torch_test\lib\site-packages\deepseek_vl2\models\siglip_vit.py:16
14 from timm.models._manipulate import named_apply, checkpoint_seq, adapt_input_conv
15 from transformers.modeling_utils import is_flash_attn_2_available
---> 16 from xformers.ops import memory_efficient_attention
17 from functools import partial
20 if is_flash_attn_2_available():
ModuleNotFoundError: No module named 'xformers'
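Both errors so far in this thread are plain ModuleNotFoundError cases, so a quick pre-flight check can list everything that is missing in one go instead of one traceback at a time. This is only a minimal sketch: the module list is an assumption taken from the imports mentioned in this thread, and the helper name find_missing is made up for illustration.

```python
import importlib.util

# Module names taken from the imports and errors quoted in this thread
# (assumption: extend the list if your traceback names something else).
REQUIRED_MODULES = ["torch", "transformers", "xformers", "deepseek_vl2"]


def find_missing(modules):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise for half-initialised modules; treat as missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All listed modules can be found.")
```

Run it with the same interpreter (the same virtual environment) that runs your script. If deepseek_vl2 shows up as missing, the repository has probably not been installed into that environment yet.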
Thanks @beednarz-p100 !
I really can't understand why they don't make this more friendly to use with better documentation. I'm so used to Ollama that this feels like a nightmare hahahah...
Hi @beednarz-p100, I was also stuck at the xformers error. I tried a few solutions, such as installing a different xformers version to match my CUDA version. You can get it from here: https://github.com/facebookresearch/xformers#installing-xformers
My CUDA version is 12.2, so I tried to download xformers 12.4 but couldn't get it to work.
Please let me know if you were able to get it to work.
Hello @miral-songhela ,
DeepSeek-VL2 uses torch==2.0.1, so you should be able to make it work by installing xformers==0.0.20 --> https://github.com/facebookresearch/xformers/issues/752#issuecomment-1555756372
I figured out how to run deepseek-vl2-small. Has anyone figured out how to run deepseek-vl2? I know it's going to take multiple GPUs, but I'm having issues embedding inputs.
After a lot of trial and error, I finally succeeded with the following package versions:
Name: torch, Version: 2.0.1
Name: xformers, Version: 0.0.20
Name: flash-attn, Version: 2.5.8
Thanks for the hint about the xformers version. It finally works for me with the following setup:
xformers==0.0.20
torch==2.0.1+cu118
torchaudio==2.0.2+cu118
torchvision==0.15.2+cu118
numpy==1.26.4
I use CUDA Toolkit 11.8 because, according to the torch v2.0.1 page, it is provided only for CUDA 11.7 and 11.8. I had to downgrade numpy from v2 to the latest v1. I do not have flash-attn at all. I'm getting a warning about missing Triton, but it works anyway :)
"A matching Triton is not available, some optimizations will not be enabled."
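To compare your own environment against the versions reported as working above, something like the following could help. It is a rough sketch, not an official check: the pins are simply the ones posted in this thread, check_versions is a made-up helper name, and matching is by version prefix so that local build tags such as +cu118 still count as a match.

```python
from importlib import metadata

# Versions reported as working in this thread (assumption: other
# combinations may work too). "1." means "any NumPy 1.x release".
EXPECTED = {"torch": "2.0.1", "xformers": "0.0.20", "numpy": "1."}


def check_versions(expected, get_version=metadata.version):
    """Map each package to (installed_version_or_None, matches_expected_prefix)."""
    report = {}
    for package, prefix in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Simple prefix match, so "2.0.1+cu118" matches "2.0.1".
        ok = installed is not None and installed.startswith(prefix)
        report[package] = (installed, ok)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "differs from the thread's setup"
        print(f"{package}: installed={installed} ({status})")
```

A package reported as None is not installed in the active environment at all. That is expected for flash-attn in the setup above, which is why it is left out of the list.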
I've cloned the GitHub repo and run pip install -e ., but now I'm getting this error:
RuntimeError: Failed to import transformers.models.cohere.configuration_cohere because of the following error (look up to see its traceback):
No module named 'transformers.models.cohere.configuration_cohere'
Does anyone know how to solve it? I've tried other versions of the transformers library, but I guess deepseek_vl2 requires transformers 4.38.2.