---
license: apache-2.0
library_name: gguf
tags:
- gguf
- llama.cpp
- quantized
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
---

# deepseek-ai-deepseek-r1-distill-llama-8b-f16

This repository contains an F16 GGUF conversion of [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B).

## Model Details

- **Original Model**: [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
- **Quantization**: F16
- **File Size**: ~14.97 GB
- **Format**: GGUF (llama.cpp compatible)
- **Converted by**: Kaleemullah

## Quantization Information

- **F16**: Half precision (16-bit floating point)

## Usage

### With llama.cpp

```bash
# Download the model
huggingface-cli download Kaleemullah/deepseek-ai-deepseek-r1-distill-llama-8b-f16 deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf -p "Your prompt here"
```

### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf",
    n_ctx=2048,       # Context window (in tokens)
    n_gpu_layers=-1,  # Offload all layers to the GPU (needs a GPU-enabled build)
)

# Generate text
output = llm("Your prompt here", max_tokens=100)
print(output['choices'][0]['text'])
```

### With Ollama

```bash
# Create a Modelfile
echo 'FROM ./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf' > Modelfile

# Create the model
ollama create my-model -f Modelfile

# Run the model
ollama run my-model
```

## Model Architecture

This is an F16 (half-precision) GGUF conversion of deepseek-ai/DeepSeek-R1-Distill-Llama-8B. The conversion changes only the file format and the numeric precision of the weights; the architecture is that of the original model.
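
As a rough sanity check on the file size listed under Model Details, the size of an F16 file can be estimated as two bytes per weight. The sketch below assumes a parameter count of about 8.03 billion, a figure commonly quoted for Llama 3.1 8B rather than one stated in this repository, and it ignores GGUF metadata and any tensors stored at a different precision.

```python
def estimate_f16_size_gib(n_params: float) -> float:
    """Approximate size in GiB of n_params weights stored as 16-bit floats."""
    bytes_per_weight = 2  # F16 = 16 bits = 2 bytes
    return n_params * bytes_per_weight / 2**30


# Assumption: ~8.03e9 parameters (not taken from this model card).
estimated = estimate_f16_size_gib(8.03e9)
print(f"Estimated F16 size: {estimated:.2f} GiB")
```

Under that assumption the estimate comes out just under 15 GiB, which is consistent with the listed size if the "GB" figure is read as GiB.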

## License

This model inherits the license from the original model: [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

## Citation

If you use this model, please cite the original model:

```bibtex
@misc{deepseek-ai-DeepSeek-R1-Distill-Llama-8B,
  author = {{DeepSeek-AI}},
  title = {deepseek-ai/DeepSeek-R1-Distill-Llama-8B},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B}
}
```

## Converted with

This model was converted using [llama.cpp](https://github.com/ggml-org/llama.cpp)'s `convert_hf_to_gguf.py` script.

---

**Note**: GGUF models are compatible with llama.cpp, Ollama, LM Studio, and other GGUF-compatible inference engines.
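
Before pointing any of these engines at a downloaded file, it can help to confirm that the download is actually a GGUF file and not, for example, a truncated transfer or an HTML error page. The following is a minimal sketch, not part of llama.cpp: `read_gguf_header` is a hypothetical helper name, and the code assumes a little-endian file in GGUF version 2 or later, where the 4-byte `GGUF` magic is followed by a 32-bit version and two 64-bit counts.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Hypothetical helper; assumes little-endian GGUF v2 or later.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # After the magic: uint32 version, uint64 tensor count, uint64 metadata KV count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("./models/deepseek-ai-deepseek-r1-distill-llama-8b-f16.gguf")` should return three integers for a valid file and raise `ValueError` otherwise. This only inspects the first 24 bytes, so it does not detect corruption later in the file.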