Hito 1.7B - GGUF

Quantized versions for llama.cpp, Ollama, LM Studio, and more


About

This repository contains GGUF quantized versions of hitonet/hito-1.7b.

Hito is a 1.7B-parameter model built around Nested Cognitive Reasoning (NCR): structured, self-correcting thinking patterns that improve accuracy and transparency.

For the original model (safetensors), training details, benchmarks, and full documentation, see the main repository.


Available Quantizations

⭐ Recommended

| File | Quant | Size | Quality | Notes |
|------|-------|------|---------|-------|
| hito-1.7b-Q4_K_M.gguf | Q4_K_M | 1.1 GB | ⭐ Best | Best balance of size and quality |
| hito-1.7b-Q5_K_M.gguf | Q5_K_M | 1.2 GB | ⭐ Excellent | Slightly better than Q4_K_M |
| hito-1.7b-Q8_0.gguf | Q8_0 | 1.8 GB | ⭐ Excellent | Highest-quality quantization |

✅ Good Quality

| File | Quant | Size | Quality | Notes |
|------|-------|------|---------|-------|
| hito-1.7b-Q4_0.gguf | Q4_0 | 1.0 GB | ✅ Good | Legacy format, works well |
| hito-1.7b-Q4_K_S.gguf | Q4_K_S | 1.0 GB | ✅ Good | Smaller Q4 variant |
| hito-1.7b-Q5_0.gguf | Q5_0 | 1.2 GB | ✅ Good | Legacy 5-bit |
| hito-1.7b-Q5_K_S.gguf | Q5_K_S | 1.2 GB | ✅ Good | Smaller Q5 variant |
| hito-1.7b-Q6_K.gguf | Q6_K | 1.4 GB | ✅ Excellent | Near full quality |
| hito-1.7b-F16.gguf | F16 | 3.3 GB | ✅ Reference | Full-precision GGUF |

⚠️ Low Quality (Not Recommended)

| File | Quant | Size | Quality | Notes |
|------|-------|------|---------|-------|
| hito-1.7b-Q3_K_L.gguf | Q3_K_L | 957 MB | ⚠️ Fair | May get stuck in thinking |
| hito-1.7b-Q3_K_M.gguf | Q3_K_M | 896 MB | ⚠️ Fair | Occasional issues |
| hito-1.7b-Q3_K_S.gguf | Q3_K_S | 827 MB | ⚠️ Fair | Noticeable quality loss |

❌ Broken (Do Not Use)

| File | Quant | Size | Quality | Notes |
|------|-------|------|---------|-------|
| hito-1.7b-Q2_K.gguf | Q2_K | 742 MB | ❌ Broken | Produces gibberish |

Quick Start

Ollama

# Download the recommended quantization
wget https://huggingface.co/hitonet/hito-1.7b-GGUF/resolve/main/hito-1.7b-Q4_K_M.gguf

# Create Modelfile
cat > Modelfile << 'EOF'
FROM hito-1.7b-Q4_K_M.gguf
SYSTEM "You are Hito by Hitonet.com."
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
EOF

# Create and run
ollama create hito -f Modelfile
ollama run hito

llama.cpp

./llama-cli -m hito-1.7b-Q4_K_M.gguf \
  -sys "You are Hito by Hitonet.com." \
  -p "What is your name?" \
  -n 256
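llama.cpp also ships `llama-server`, which exposes the model over an OpenAI-compatible HTTP API (start it with `./llama-server -m hito-1.7b-Q4_K_M.gguf --port 8080`). Below is a minimal client sketch assuming a server running locally on port 8080; the function names here are illustrative, not part of any API:

```python
import json
import urllib.request

def build_chat_request(user_prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for llama-server."""
    return {
        "messages": [
            {"role": "system", "content": "You are Hito by Hitonet.com."},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.7,
        "max_tokens": 256,
    }

def ask_hito(user_prompt: str,
             url: str = "http://localhost:8080/v1/chat/completions") -> str:
    """POST the payload to a running llama-server and return the reply text."""
    data = json.dumps(build_chat_request(user_prompt)).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# ask_hito("What is your name?")  # requires llama-server to be running
```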

LM Studio

  1. Download any GGUF file from this repository
  2. Open LM Studio → Load Model
  3. Set system prompt: You are Hito by Hitonet.com.
  4. Start chatting!

Compatibility

These GGUF files work with:

  • Ollama (recommended)
  • llama.cpp
  • LM Studio
  • Jan
  • GPT4All
  • llama-cpp-python
  • Any llama.cpp-compatible application
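For `llama-cpp-python`, here is a minimal sketch, assuming `pip install llama-cpp-python` and the Q4_K_M file in the working directory. The ChatML-style template is inferred from the `<|im_end|>` stop token used in the Modelfile above and may not match Hito's exact chat template:

```python
def chatml_prompt(system: str, user: str) -> str:
    """Format a ChatML-style prompt (inferred from the <|im_end|>
    stop token in the Modelfile; an assumption, not confirmed)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

def generate(user_prompt: str, model_path: str = "hito-1.7b-Q4_K_M.gguf") -> str:
    """Load the GGUF with llama-cpp-python and run one completion."""
    from llama_cpp import Llama  # pip install llama-cpp-python
    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    out = llm(
        chatml_prompt("You are Hito by Hitonet.com.", user_prompt),
        max_tokens=256,
        stop=["<|im_end|>"],
    )
    return out["choices"][0]["text"]

# generate("What is your name?")  # requires the GGUF file locally
```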

What Makes Hito Special

  • Structured Thinking: Uses <think> tags with nested cognitive reasoning
  • Self-Correcting: <doubt> and <verify> tags catch errors mid-reasoning
  • Humble by Design: Admits uncertainty rather than hallucinating
  • Efficient: Only 1.7B parameters, runs on CPU
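A hedged sketch of consuming this structured output downstream: the tag layout is taken from the list above, and the code assumes the reasoning arrives in a single `<think>...</think>` block, which the model's actual output format may not guarantee.

```python
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Split raw model output into (reasoning trace, final answer).

    Assumes one <think>...</think> block, as described above; nested
    tags like <doubt> and <verify> stay inside the reasoning trace.
    """
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if not match:
        return "", raw.strip()
    return match.group(1).strip(), raw[match.end():].strip()

example = "<think>2+2 is 4. <verify>Yes, 4.</verify></think>The answer is 4."
thoughts, answer = split_reasoning(example)
# thoughts -> "2+2 is 4. <verify>Yes, 4.</verify>"
# answer   -> "The answer is 4."
```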

For full documentation, benchmarks, and training details, see the main repository.


Licensing

| Component | License | Commercial Use |
|-----------|---------|----------------|
| Model Weights | Apache 2.0 | ✅ Free |
| NCR Methodology | CC BY-NC-ND | ⚠️ License Required |

Contact: [email protected]


Made with genuine curiosity by Hitonet