Instructions to use Alibaba-NLP/gte-Qwen2-1.5B-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use Alibaba-NLP/gte-Qwen2-1.5B-instruct with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True)

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
- Transformers
How to use Alibaba-NLP/gte-Qwen2-1.5B-instruct with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Alibaba-NLP/gte-Qwen2-1.5B-instruct", trust_remote_code=True)
- Notebooks
- Google Colab
- Kaggle
Infinity support: Short max_length=2048 for more optimized deployment
#20
by michaelfeil - opened
@ Alibaba Team, thanks so much for the support of this model.
How to run with https://github.com/michaelfeil/infinity
Run via Docker:
docker run --gpus all -p 7997:7997 michaelf34/infinity:0.0.68-trt-onnx v2 --model-id Alibaba-NLP/gte-Qwen2-1.5B-instruct --revision "refs/pr/20" --dtype bfloat16 --batch-size 8 --device cuda --engine torch --port 7997 --no-bettertransformer
Run via CLI:
pip install infinity_emb flash-attn
infinity_emb v2 --model-id Alibaba-NLP/gte-Qwen2-1.5B-instruct --revision "refs/pr/20" --dtype bfloat16 --batch-size 8 --device cuda --engine torch --port 7997 --no-bettertransformer
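Once the server is up, it can be queried over HTTP. The following is a minimal client sketch, not part of the original post. It assumes the server is reachable at http://localhost:7997 (matching the --port value above) and that it exposes an OpenAI-style POST /embeddings route returning a "data" list of objects with an "embedding" field. The helper names build_embedding_request and embed are hypothetical; check the server's /docs page if the route or response shape differs in your version.

```python
import json
import urllib.request

# Assumptions: host/port match the --port flag used above, and the server
# exposes an OpenAI-style POST /embeddings route.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Alibaba-NLP/gte-Qwen2-1.5B-instruct"


def build_embedding_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for a batch of texts."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=60):
    """Send the request and return one embedding vector per input text.

    Assumes the response body looks like {"data": [{"embedding": [...]}, ...]}.
    """
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]


# Example usage (requires a running server):
# vectors = embed(["That is a happy person", "Today is a sunny day"])
# print(len(vectors), len(vectors[0]))
```

Splitting request construction from sending keeps the payload easy to inspect without a live server.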
DO NOT MERGE!
michaelfeil changed pull request title from Infinity support: Short max_length for more optimized deployment to Infinity support: Short max_length=2048 for more optimized deployment
michaelfeil changed pull request status to closed
michaelfeil changed pull request status to open
thenlper changed pull request status to merged
Please do not merge this PR, as mentioned above.
I'll open a new PR to undo this -> https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/discussions/22