Novus-7b-tr_v1 - GGUF
Model creator: mlabonne
Original model: Daredevil-7B
Model Fine-Tuner: Novus Research
Fine-tuned model: Novus-7b-tr_v1
Description
This repo contains GGUF format model files for mlabonne's Daredevil-7B model, fine-tuned to create Novus-7b-tr_v1 by Novus Research.
About GGUF
GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
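As a quick sanity check after downloading, the start of a GGUF file can be inspected directly. The sketch below is not part of llama.cpp; the function name `read_gguf_header` is hypothetical, and it assumes the GGUF version 2 or later header layout (4-byte magic `GGUF`, then a little-endian uint32 version, uint64 tensor count, and uint64 metadata key/value count).

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file header.

    Assumes the version 2+ layout, where both counts are 64-bit integers.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"{path} is not a GGUF file (magic={magic!r})")
        # Little-endian: uint32 version, uint64 tensor count, uint64 KV count.
        header = f.read(20)
        if len(header) < 20:
            raise ValueError(f"{path} is truncated")
        version, tensor_count, kv_count = struct.unpack("<IQQ", header)
    return version, tensor_count, kv_count
```

A file that fails this check is most likely an incomplete download or an HTML page saved under a `.gguf` name.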
Here is an incomplete list of clients and libraries that are known to support GGUF:
- llama.cpp. The source project for GGUF. Offers a CLI and a server option.
- text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
- KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for storytelling.
- GPT4All, a free and open-source local running GUI, supporting Windows, Linux, and macOS with full GPU accel.
- LM Studio, an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
- LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
- Faraday.dev, an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
- llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
- candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
- ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server. Note that, as of the time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.
Novus Research
Novus Research is committed to pushing the boundaries in natural language processing by collaborating with the open-source community through innovative research. This commitment is coupled with our focus on empowering businesses with tailored, on-site AI and large language model solutions.
Compatibility
These quantized GGUF files are compatible with candle from Hugging Face.
Provided Files
| Name | Bit | Quant Method | Size | Use case |
|---|---|---|---|---|
| Novus-7b-tr_v1.Q2_K.gguf | 2 | Q2_K | 2.72G | Smallest size, lowest precision |
| Novus-7b-tr_v1.Q3_K.gguf | 3 | Q3_K | 3.16G | Very low precision |
| Novus-7b-tr_v1.Q3_K_S.gguf | 3 | Q3_K_S | 3.52G | Low precision, level 0 |
| Novus-7b-tr_v1.Q3_K_M.gguf | 3 | Q3_K_M | 3.82G | Slightly better than Q4_0 |
| Novus-7b-tr_v1.Q3_K_L.gguf | 3 | Q3_K_L | 3.47G | Kernel optimized, low precision |
| Novus-7b-tr_v1.Q4_0.gguf | 4 | Q4_0 | 4.11G | Moderate precision, level 0 |
| Novus-7b-tr_v1.Q4_K_M.gguf | 4 | Q4_K_M | 4.37G | Better than Q5_0 |
| Novus-7b-tr_v1.Q5_0.gguf | 5 | Q5_0 | 5.00G | Kernel optimized, moderate precision |
| Novus-7b-tr_v1.Q5_K_S.gguf | 5 | Q5_K_S | 5.00G | Higher precision than Q5_K |
| Novus-7b-tr_v1.Q5_K_M.gguf | 5 | Q5_K_M | 5.13G | Higher precision, level 0 |
| Novus-7b-tr_v1.Q6_K.gguf | 6 | Q6_K | 5.94G | Highest precision, level 1 |
| Novus-7b-tr_v1.Q8_0.gguf | 8 | Q8_0 | 7.77G | Kernel optimized, high precision |
| Novus-7b-tr_v1.F32.gguf | 32 | F32 | 29.00G | Single-precision floating point |
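One way to read the table is to pick the largest file that fits a given size budget. The following is a minimal sketch, with sizes copied from the table above; the helper name `largest_quant_within` is hypothetical, and file size is only a lower bound on the memory needed at run time, since the context cache adds overhead on top.

```python
# File sizes in GB, copied from the Provided Files table.
QUANT_SIZES_GB = {
    "Q2_K": 2.72, "Q3_K": 3.16, "Q3_K_S": 3.52, "Q3_K_M": 3.82, "Q3_K_L": 3.47,
    "Q4_0": 4.11, "Q4_K_M": 4.37, "Q5_0": 5.00, "Q5_K_S": 5.00, "Q5_K_M": 5.13,
    "Q6_K": 5.94, "Q8_0": 7.77, "F32": 29.00,
}


def largest_quant_within(budget_gb):
    """Return the quant method whose file is largest but still <= budget_gb.

    Returns None when no file fits the budget.
    """
    fitting = {quant: size for quant, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

For example, `largest_quant_within(6.0)` selects `Q6_K`, while a budget below 2.72 GB returns `None`.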
How to Download
To download the models, you can use the huggingface-cli command or the equivalent Python code with hf_hub_download.
Using huggingface-cli command:
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF <model_file>
For example, to download the Q2_K model:
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q2_K.gguf
Downloading all models:
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q2_K.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q3_K.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q3_K_S.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q3_K_M.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q3_K_L.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q4_0.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q4_K_M.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q5_0.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q5_K_S.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q5_K_M.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q6_K.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.Q8_0.gguf
huggingface-cli download helizac/Novus-7b-tr_v1-GGUF Novus-7b-tr_v1.F32.gguf
Using Python:
from huggingface_hub import hf_hub_download
hf_hub_download("helizac/Novus-7b-tr_v1-GGUF", "<model_file>")
To download all models, you can run:
# File names as listed in the Provided Files table
model_files = [
    "Novus-7b-tr_v1.Q2_K.gguf",
    "Novus-7b-tr_v1.Q3_K.gguf",
    "Novus-7b-tr_v1.Q3_K_S.gguf",
    "Novus-7b-tr_v1.Q3_K_M.gguf",
    "Novus-7b-tr_v1.Q3_K_L.gguf",
    "Novus-7b-tr_v1.Q4_0.gguf",
    "Novus-7b-tr_v1.Q4_K_M.gguf",
    "Novus-7b-tr_v1.Q5_0.gguf",
    "Novus-7b-tr_v1.Q5_K_S.gguf",
    "Novus-7b-tr_v1.Q5_K_M.gguf",
    "Novus-7b-tr_v1.Q6_K.gguf",
    "Novus-7b-tr_v1.Q8_0.gguf",
    "Novus-7b-tr_v1.F32.gguf",
]
# Download each file into the local Hugging Face cache
for model_file in model_files:
    hf_hub_download("helizac/Novus-7b-tr_v1-GGUF", model_file)
You can also specify a folder to download the file(s) to:
hf_hub_download("helizac/Novus-7b-tr_v1-GGUF", "<model_file>", local_dir="<output_directory>")
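The download calls above can be wrapped in a small helper that takes quant methods instead of full file names. This is only a sketch: `gguf_filename` and `download_quants` are hypothetical names, and the file name pattern is an assumption taken from the Provided Files table (`Novus-7b-tr_v1.<quant>.gguf`).

```python
REPO_ID = "helizac/Novus-7b-tr_v1-GGUF"


def gguf_filename(quant):
    """Build a file name such as 'Novus-7b-tr_v1.Q4_K_M.gguf' from a quant method.

    Assumes the naming pattern shown in the Provided Files table.
    """
    return f"Novus-7b-tr_v1.{quant}.gguf"


def download_quants(quants, local_dir=None):
    """Download the files for the given quant methods and return their local paths."""
    # Imported lazily so gguf_filename() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(REPO_ID, gguf_filename(quant), local_dir=local_dir)
        for quant in quants
    ]
```

For instance, `download_quants(["Q4_K_M", "Q8_0"], local_dir="models")` would fetch two files into a `models` folder, provided those files exist in the repository.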
Usage
!pip install llama-cpp-python
import os
import urllib.request
from llama_cpp import Llama
# Direct download URL on Hugging Face ("resolve" serves the file itself; "blob" serves an HTML page)
model_url = "https://huggingface.co/helizac/Novus-7b-tr_v1-GGUF/resolve/main/Novus-7b-tr_v1.Q8_0.gguf"
model_path = "Novus-7b-tr_v1.gguf"  # Local filename
# Download the model unless it is already present locally (optional)
def download_model(url, filename):
    if not os.path.isfile(filename):
        urllib.request.urlretrieve(url, filename)
        print(f"Downloaded model: {filename}")
download_model(model_url, model_path)
# Load the model
llm = Llama(model_path=model_path)
# Turkish prompt: "What are large language models?"
prompt = "Büyük dil modelleri nelerdir?"
# Adjust these parameters for different outputs
max_tokens = 256
temperature = 0.7
output = llm(prompt, max_tokens=max_tokens, temperature=temperature)
# The generated text is in the first entry of "choices"
output_text = output["choices"][0]["text"].strip()
print(output_text)
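For long generations it can be more convenient to print tokens as they arrive. The sketch below assumes llama-cpp-python's `stream=True` option, which makes the call return an iterator of chunks shaped like the completion dictionary above; the helper name `collect_stream` is hypothetical.

```python
def collect_stream(chunks, on_text=None):
    """Join the text pieces from an iterable of completion chunks.

    Each chunk is expected to carry its text at chunk["choices"][0]["text"].
    If on_text is given, it is called with every piece as it arrives.
    """
    pieces = []
    for chunk in chunks:
        text = chunk["choices"][0]["text"]
        if on_text is not None:
            on_text(text)
        pieces.append(text)
    return "".join(pieces)


# Usage, assuming `llm` and `prompt` are defined as in the Usage section:
# full_text = collect_stream(
#     llm(prompt, max_tokens=256, temperature=0.7, stream=True),
#     on_text=lambda piece: print(piece, end="", flush=True),
# )
```

Keeping the joining logic separate from the model call means it can be exercised with plain dictionaries, without loading a model.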
Acknowledgements
This model is built on top of the efforts from the NovusResearch and mlabonne teams, and we appreciate their contribution to the AI community.
GGUF model card: Furkan Erdi