---
license: apache-2.0
library_name: gguf
tags:
- gguf
- llama.cpp
- quantized
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
---

# deepseek-ai-deepseek-r1-distill-llama-8b-f16

This repository contains GGUF quantized models converted from [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B).

## Model Details

- **Original Model**: [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
- **Quantization**: F16
- **File Size**: ~14.97 GB
- **Format**: GGUF (llama.cpp compatible)
- **Converted by**: Kaleemullah

## Quantization Information

- **F16**: Half precision (16-bit floating point)

## Usage

### With llama.cpp

```bash
# Download the model
huggingface-cli download Kaleemullah/deepseek-ai-deepseek-r1-distill-llama-8b-f16 deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf -p "Your prompt here"
```

### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf",
    n_ctx=2048,  # Context window
    n_gpu_layers=-1  # Use GPU if available
)

# Generate text
output = llm("Your prompt here", max_tokens=100)
print(output['choices'][0]['text'])
```

### With Ollama

```bash
# Create a Modelfile
echo 'FROM ./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf' > Modelfile

# Create the model
ollama create my-model -f Modelfile

# Run the model
ollama run my-model
```

## Model Architecture

This is a quantized version of deepseek-ai/DeepSeek-R1-Distill-Llama-8B, optimized for efficient inference while maintaining model quality.

## License

This model inherits the license from the original model: [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

## Citation

If you use this model, please cite the original model:

```bibtex
@misc{deepseek-ai-DeepSeek-R1-Distill-Llama-8B,
  author = {Original Model Authors},
  title = {deepseek-ai/DeepSeek-R1-Distill-Llama-8B},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B}
}
```

## Converted with

This model was converted using [llama.cpp](https://github.com/ggml-org/llama.cpp)'s `convert_hf_to_gguf.py` script.

---

**Note**: GGUF models are compatible with llama.cpp, Ollama, LM Studio, and other GGUF-compatible inference engines.