Commit 40bb2c2 by Thireus · 1 Parent(s): c505819

Update README.md and tensors.map(.sig) files

Files changed (3)
  1. README.md +15 -1
  2. tensors.map +0 -0
  3. tensors.map.sig +0 -0
README.md CHANGED
@@ -22,7 +22,7 @@ cd ~
  # Make sure to install all ik_llama.cpp compilation dependencies...
  apt install python3-dev python3-pip python3-venv python3-wheel python3-setuptools git acl netcat-openbsd cmake # pipx
 
- # Obtain ik_llama's Thireus version - Windows builds available at https://github.com/Thireus/ik_llama.cpp/releases
+ # Obtain ik_llama's Thireus version - Windows/macOS/Linux builds available at https://github.com/Thireus/ik_llama.cpp/releases
  git clone https://github.com/Thireus/ik_llama.cpp
  cd ik_llama.cpp
  git pull
@@ -131,4 +131,18 @@ cd kitchen
  ../quant_downloader.sh bf16.recipe
  ```
 
+ You can also quantize individual BF16 tensors without the need to download every BF16 .gguf shard:
+
+ BF16 model shards can also be individually quantized using a special version of ik_llama.cpp's `llama-quantize` utility which comes with the `--individual-tensors` option.
+
+ - Source code: https://github.com/Thireus/ik_llama.cpp/tree/th/quantize_individual_tensors
+ - Builds (macOS, Windows and Linux): https://github.com/Thireus/ik_llama.cpp/releases/tag/th-quantize_individual_tensors-b4210-7a44805
+
+ Usage example:
+ ```
+ ./llama-quantize --keep-split --imatrix imatrix_ubergarm.dat --individual-tensors 2,3,1094 Kimi-K2-Thinking-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01097.gguf my_new_shards.gguf iq3_s 12
+ ```
+
+ For more information about how to use it: https://github.com/Thireus/GGUF-Tool-Suite/issues/45
+
  Enjoy optimized quantization! 🎉
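
Note on the added usage example: the `--individual-tensors` value is a comma-separated list, so it may be convenient to assemble the command from a list of numbers. The following is only a dry-run sketch that prints the command rather than running it. The helper names `join_by_comma` and `build_quantize_cmd` are hypothetical, and the reading of the trailing positional arguments as input shard, output name, quantization type and thread count is an assumption based on the usual `llama-quantize` argument order, not something this commit states. Only flags that appear in the README example are used.

```shell
#!/bin/sh
# Dry-run sketch: prints a llama-quantize command line, does not execute it.
# Helper names are illustrative and not part of ik_llama.cpp.

# Join all arguments with commas, e.g. "2 3 1094" becomes "2,3,1094".
# The function body is a subshell so the IFS change stays local.
join_by_comma() (
    IFS=,
    printf '%s\n' "$*"
)

# Arguments (assumed meaning, see note above):
#   $1 imatrix file, $2 first BF16 shard, $3 output name,
#   $4 quantization type, $5 thread count, then the numbers
#   to pass to --individual-tensors.
build_quantize_cmd() {
    imatrix=$1
    first_shard=$2
    output=$3
    qtype=$4
    threads=$5
    shift 5
    tensors=$(join_by_comma "$@")
    printf './llama-quantize --keep-split --imatrix %s --individual-tensors %s %s %s %s %s\n' \
        "$imatrix" "$tensors" "$first_shard" "$output" "$qtype" "$threads"
}

# Reproduces the command from the README usage example.
build_quantize_cmd imatrix_ubergarm.dat \
    Kimi-K2-Thinking-THIREUS-BF16-SPECIAL_TENSOR-00001-of-01097.gguf \
    my_new_shards.gguf iq3_s 12 2 3 1094
```

Run as written, the script prints the same command line as the README example, which can be reviewed before being executed by hand. See https://github.com/Thireus/GGUF-Tool-Suite/issues/45 for what the numbers passed to `--individual-tensors` refer to.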
tensors.map CHANGED
The diff for this file is too large to render. See raw diff
 
tensors.map.sig CHANGED
Binary files a/tensors.map.sig and b/tensors.map.sig differ