Text Generation
Transformers
Safetensors
multilingual
phi3_v
nlp
code
vision
conversational
custom_code
Instructions to use microsoft/Phi-3-vision-128k-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use microsoft/Phi-3-vision-128k-instruct with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/Phi-3-vision-128k-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-vision-128k-instruct", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use microsoft/Phi-3-vision-128k-instruct with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "microsoft/Phi-3-vision-128k-instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "microsoft/Phi-3-vision-128k-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/microsoft/Phi-3-vision-128k-instruct
- SGLang
How to use microsoft/Phi-3-vision-128k-instruct with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "microsoft/Phi-3-vision-128k-instruct" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "microsoft/Phi-3-vision-128k-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "microsoft/Phi-3-vision-128k-instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "microsoft/Phi-3-vision-128k-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use microsoft/Phi-3-vision-128k-instruct with Docker Model Runner:
docker model run hf.co/microsoft/Phi-3-vision-128k-instruct
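The curl calls to the OpenAI-compatible endpoints above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_chat_payload` and `chat` are hypothetical, and the optional `image_url` content part follows the general OpenAI chat message convention; whether a given server accepts image input for this model is an assumption to verify against that server's documentation.

```python
import json
import urllib.request

MODEL_ID = "microsoft/Phi-3-vision-128k-instruct"


def build_chat_payload(prompt, image_url=None, model=MODEL_ID):
    """Build an OpenAI-style chat completion request body (hypothetical helper)."""
    if image_url is None:
        # Text-only message, same shape as the curl examples.
        content = prompt
    else:
        # Assumption: the server accepts OpenAI-style multimodal content parts.
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the parsed JSON reply."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    payload = build_chat_payload("What is the capital of France?")
    print(json.dumps(payload, indent=2))
    # With a server running (port 8000 for vLLM, 30000 for SGLang as above):
    # reply = chat(payload, base_url="http://localhost:8000")
    # print(reply["choices"][0]["message"]["content"])
```

Keeping payload construction separate from the network call lets the request body be inspected or logged without a running server.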
Will microsoft/Phi-3-vision-128k-instruct support transformers v4.51.3?
1
#71 opened 11 months ago
by
Huaiyu666
I am trying the last few hours to make this working. Is there any working script that somebody has?
1
#70 opened about 1 year ago
by
JLouisBiz
[Bug] Model Breaking Dynamic Cache
#69 opened over 1 year ago
by
kylesayrs
Extend eos_token_id
#68 opened over 1 year ago
by
Wovchena
Is that an error in the backend code?
#67 opened over 1 year ago
by
MHRDYN7
Is it possible to train it on a single 3090 using LoRA on PEFT?
1
#66 opened over 1 year ago
by
boskx
Custom Code
#65 opened over 1 year ago
by
Gliding-Hawk
input_ids and attention mask are not of equal size. can someone help me
1
#64 opened over 1 year ago
by
harsh99
Placeholder storage has not been allocated on MPS device!
#63 opened over 1 year ago
by
babulance10
Only supporting PIL image?
#62 opened over 1 year ago
by
ryantong3
Accept multi image ?
1
#60 opened almost 2 years ago
by
AyoubChLin
Batch inference
2
#59 opened almost 2 years ago
by
epishchik
Fix h_crop and w_crop device
#58 opened almost 2 years ago
by
lutzroeder
Not be able to deploy in dedicated inference endpoint for phi3 vision model
#56 opened almost 2 years ago
by
Gokulram2710
Is JSON mode available in the Phi-3-vision API
3
#55 opened almost 2 years ago
by
gaurav646
LoRA Finetuning - Text vs Vision Effects
3
#54 opened almost 2 years ago
by
brecker
Will Phi-3 Vision receive the same re-training as Phi-3 Mini ?
1
#50 opened almost 2 years ago
by
Glider95
Is there any inference server which can support Phi-3-vision-128K-instruct?
3
#49 opened almost 2 years ago
by
farzanehnakhaee70
Please add support for ollama.
#48 opened almost 2 years ago
by
BrainSlugs83
Heavy Hallucination | non truncating o/p in Phi 3 Vision Finetuning to convert chart to json
1
#47 opened almost 2 years ago
by
ar9av
Vision encoder 1024 output mapping to img_projection 4096 input?
2
#46 opened almost 2 years ago
by
hmhanna
[AUTOMATED] Model Memory Requirements
1
#44 opened almost 2 years ago
by
model-sizer-bot
pip install phi3vision
1
#43 opened almost 2 years ago
by
ninjannnnnnnnnnnnn
phi3 image tokens
1
#42 opened almost 2 years ago
by
sachin
[Issue] Help with the integration of phi-3-vision on llama.cpp
13
#40 opened almost 2 years ago
by
ManniX-ITA
inference and generation runtime - how to reduce latency
1
#38 opened almost 2 years ago
by
wamozart
How to get embedding vectors of images or texts?
1
#37 opened almost 2 years ago
by
iceleaf97tech
fix-tokenizer
#35 opened almost 2 years ago
by
xinsu
No support for pipeline or TextStreamer
3
#34 opened almost 2 years ago
by
mohamedlotfy50
Over alignment issues
#33 opened almost 2 years ago
by
Metricon
Handle Batch Sizes (v0.2)
3
#32 opened almost 2 years ago
by
WilliamSotoM
Shorter context window to reduce inference memory allocation
2
#31 opened almost 2 years ago
by
JochenGrey
Should Phi-3V provide support in llama.cpp?
6
#24 opened almost 2 years ago
by
haohoo
Endless blank space generation
1
#20 opened almost 2 years ago
by
mascIT
The image provided does not contain a table.
#19 opened almost 2 years ago
by
mascIT
Finetuning Scripts
11
#5 opened about 2 years ago
by
abrakaa