Instructions for using SalmanFaroz/Llama-2-7b-samsum with libraries and local apps.

Libraries

How to use SalmanFaroz/Llama-2-7b-samsum with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SalmanFaroz/Llama-2-7b-samsum")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SalmanFaroz/Llama-2-7b-samsum")
model = AutoModelForCausalLM.from_pretrained("SalmanFaroz/Llama-2-7b-samsum")
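
For summarization, the pipeline can be called with the instruction-style prompt from the model card's usage example. The sketch below uses a hypothetical helper name, generate_summary, which is not part of Transformers or of this repository. It relies on the text-generation pipeline's return_full_text option so that only the newly generated text comes back. Treat it as one possible way to call the pipe object, not as the model author's own method.

```python
def generate_summary(pipe, prompt, max_new_tokens=100):
    """Run a text-generation pipeline and return only the generated continuation.

    `pipe` is any callable that behaves like a transformers text-generation
    pipeline, i.e. it returns a list of dicts with a "generated_text" key.
    (Hypothetical helper, written for illustration.)
    """
    results = pipe(
        prompt,
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # drop the echoed prompt from the output
    )
    return results[0]["generated_text"].strip()
```

With the pipe object created above, a call would look like generate_summary(pipe, prompt), where prompt is a string ending in the "### Summary:" marker.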

Local Apps

vLLM

How to use SalmanFaroz/Llama-2-7b-samsum with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SalmanFaroz/Llama-2-7b-samsum"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SalmanFaroz/Llama-2-7b-samsum",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
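
The same request can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names build_completion_request and complete are hypothetical, and the default base URL assumes the server started above is listening on localhost:8000. The response parsing assumes the usual OpenAI-style completions shape, with the text under choices[0].text.

```python
import json
import urllib.request

MODEL_ID = "SalmanFaroz/Llama-2-7b-samsum"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl call, without sending it."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Because base_url is a parameter, the same helper should work against any other OpenAI-compatible server by passing a different host and port.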

Use Docker

docker model run hf.co/SalmanFaroz/Llama-2-7b-samsum

SGLang

How to use SalmanFaroz/Llama-2-7b-samsum with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "SalmanFaroz/Llama-2-7b-samsum" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SalmanFaroz/Llama-2-7b-samsum",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "SalmanFaroz/Llama-2-7b-samsum" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SalmanFaroz/Llama-2-7b-samsum",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use SalmanFaroz/Llama-2-7b-samsum with Docker Model Runner:
```
docker model run hf.co/SalmanFaroz/Llama-2-7b-samsum
```

Llama-2-7b Fine-Tuned Summarization Model

Overview

The Llama-2-7b Fine-Tuned Summarization Model is a language model fine-tuned for text summarization using QLoRA. It was fine-tuned on the SAMSum dataset, a corpus of messenger-style conversations paired with human-written summaries.

Model Details

Base Model: meta-llama/Llama-2-7b-chat-hf
Fine-Tuned on: samsum dataset
Language: English

How to Use

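You can use this model for text summarization with the Hugging Face Transformers library. The basic Python example below loads the model in 4-bit precision, which requires the bitsandbytes and accelerate packages in addition to transformers and torch: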

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "SalmanFaroz/Llama-2-7b-samsum"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# Define the input prompt
prompt = """
Summarize the following conversation.

### Input:
Itachi: Kakashi, you must understand the gravity of the situation. The Akatsuki's plans are far more sinister than you can imagine.
Kakashi: Itachi, I need more than vague warnings. Tell me what you know.
Itachi: Very well. The Akatsuki seeks to capture Naruto for the power of the Nine-Tails sealed within him, but there's an even darker secret lurking within their goals.
Kakashi: Darker than that? What are they truly after?
Itachi: They're hunting the Tailed Beasts for a cataclysmic plan to reshape the world, and only we can stop them, together.

### Summary:
"""

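# Tokenize and move the tensors to the device the model was loaded on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate up to 100 new tokens; unpacking the encoding also passes the attention mask
generated = model.generate(**inputs, max_new_tokens=100)

# Decode the full sequence (the prompt followed by the generated summary)
output = tokenizer.decode(generated[0], skip_special_tokens=True)

print("Output:", output)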
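
Because a causal language model returns the prompt followed by the continuation, the decoded output above contains the original conversation as well as the summary. The sketch below shows one way to build the prompt for an arbitrary conversation and to keep only the text after the "### Summary:" marker. The helper names build_prompt and extract_summary are hypothetical, and the template is simply copied from the example prompt above; the source does not state whether it matches the exact format used during fine-tuning.

```python
SUMMARY_MARKER = "### Summary:"


def build_prompt(dialogue):
    """Wrap a conversation in the instruction format used in the example above."""
    return (
        "\nSummarize the following conversation.\n\n"
        f"### Input:\n{dialogue.strip()}\n\n"
        f"{SUMMARY_MARKER}\n"
    )


def extract_summary(decoded_output):
    """Return the text after the first summary marker, or all text if it is absent."""
    _, marker, tail = decoded_output.partition(SUMMARY_MARKER)
    return tail.strip() if marker else decoded_output.strip()
```

In the example above, prompt = build_prompt(dialogue) would replace the hand-written string, and extract_summary(output) would be applied to the decoded text before printing.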

