## About
This repository contains GGUF quantized versions of hitonet/hito-1.7b.
Hito is a 1.7B-parameter model with Nested Cognitive Reasoning (NCR): structured, self-correcting thinking patterns that improve accuracy and transparency.
For the original model (safetensors), training details, benchmarks, and full documentation, see the main repository.
## Available Quantizations

### Recommended

| File | Quant | Size | Quality | Notes |
|---|---|---|---|---|
| hito-1.7b-Q4_K_M.gguf | Q4_K_M | 1.1 GB | Best | Best balance of size and quality |
| hito-1.7b-Q5_K_M.gguf | Q5_K_M | 1.2 GB | Excellent | Slightly better than Q4_K_M |
| hito-1.7b-Q8_0.gguf | Q8_0 | 1.8 GB | Excellent | Highest-quality quantization |
### Good Quality

| File | Quant | Size | Quality | Notes |
|---|---|---|---|---|
| hito-1.7b-Q4_0.gguf | Q4_0 | 1.0 GB | Good | Legacy format, works well |
| hito-1.7b-Q4_K_S.gguf | Q4_K_S | 1.0 GB | Good | Smaller Q4 variant |
| hito-1.7b-Q5_0.gguf | Q5_0 | 1.2 GB | Good | Legacy 5-bit |
| hito-1.7b-Q5_K_S.gguf | Q5_K_S | 1.2 GB | Good | Smaller Q5 variant |
| hito-1.7b-Q6_K.gguf | Q6_K | 1.4 GB | Excellent | Near full quality |
| hito-1.7b-F16.gguf | F16 | 3.3 GB | Reference | Full-precision GGUF |
### Low Quality (Not Recommended)

| File | Quant | Size | Quality | Notes |
|---|---|---|---|---|
| hito-1.7b-Q3_K_L.gguf | Q3_K_L | 957 MB | Fair | May get stuck in thinking |
| hito-1.7b-Q3_K_M.gguf | Q3_K_M | 896 MB | Fair | Occasional issues |
| hito-1.7b-Q3_K_S.gguf | Q3_K_S | 827 MB | Fair | Noticeable quality loss |
### Broken (Do Not Use)

| File | Quant | Size | Quality | Notes |
|---|---|---|---|---|
| hito-1.7b-Q2_K.gguf | Q2_K | 742 MB | Broken | Produces gibberish |
## Quick Start

### Ollama

```bash
# Download the recommended quantization
wget https://huggingface.co/hitonet/hito-1.7b-GGUF/resolve/main/hito-1.7b-Q4_K_M.gguf

# Create a Modelfile
cat > Modelfile << 'EOF'
FROM hito-1.7b-Q4_K_M.gguf
SYSTEM "You are Hito by Hitonet.com."
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
EOF

# Create and run the model
ollama create hito -f Modelfile
ollama run hito
```
### llama.cpp

```bash
./llama-cli -m hito-1.7b-Q4_K_M.gguf \
  -sys "You are Hito by Hitonet.com." \
  -p "What is your name?" \
  -n 256
```
### LM Studio

- Download any GGUF file from this repository
- Open LM Studio → Load Model
- Set the system prompt: `You are Hito by Hitonet.com.`
- Start chatting!
## Compatibility
These GGUF files work with:
- Ollama (recommended)
- llama.cpp
- LM Studio
- Jan
- GPT4All
- llama-cpp-python
- Any llama.cpp-compatible application
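With llama-cpp-python (listed above), loading one of these quants takes only a few lines. A minimal sketch, assuming `pip install llama-cpp-python` and that `hito-1.7b-Q4_K_M.gguf` has already been downloaded to the working directory; the context size and sampling parameters here are illustrative defaults, not recommendations from this repository:

```python
from llama_cpp import Llama

# Load a local GGUF file (any quant from the tables above works the same way).
llm = Llama(
    model_path="hito-1.7b-Q4_K_M.gguf",
    n_ctx=4096,      # context window; adjust for available RAM
    verbose=False,
)

# Chat-style completion using the same system prompt as the Ollama example.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are Hito by Hitonet.com."},
        {"role": "user", "content": "What is your name?"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```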
## What Makes Hito Special

- Structured Thinking: uses `<think>` tags with nested cognitive reasoning
- Self-Correcting: `<doubt>` and `<verify>` tags catch errors mid-reasoning
- Humble by Design: admits uncertainty rather than hallucinating
- Efficient: only 1.7B parameters, runs on CPU
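Because the reasoning is emitted in explicit tags, downstream code can separate the chain of thought from the final answer. A minimal sketch in Python; the `sample` string is invented for illustration, not real model output:

```python
import re

def split_reasoning(text: str):
    """Separate <think>...</think> reasoning blocks from the final answer."""
    thoughts = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thoughts, answer

# Hypothetical model output with nested <doubt>/<verify> self-correction tags.
sample = (
    "<think>2+2 is 4. <doubt>Could it be 5?</doubt> "
    "<verify>2+2=4, confirmed.</verify></think>The answer is 4."
)
thoughts, answer = split_reasoning(sample)
print(answer)         # -> The answer is 4.
print(len(thoughts))  # -> 1
```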
For full documentation, benchmarks, and training details, see the main repository.
## Licensing

| Component | License | Commercial Use |
|---|---|---|
| Model Weights | Apache 2.0 | Free |
| NCR Methodology | CC BY-NC-ND | License Required |
Contact: [email protected]
Made with genuine curiosity by Hitonet