linzhao-amd committed on
Commit 469c869 · verified · 1 Parent(s): b35e9d0

Update README.md

Files changed (1): README.md (+2 -1)
README.md CHANGED
@@ -12,7 +12,7 @@ base_model:
  - **Output:** Text
  - **Supported Hardware Microarchitecture:** AMD MI350/MI355
  - **ROCm**: 7.0-Preview
- - **Preferred Operating System(s):** Linux
+ - **Operating System(s):** Linux
  - **Inference Engine:** [SGLang](https://docs.sglang.ai/)
  - **Model Optimizer:** [AMD-Quark](https://quark.docs.amd.com/latest/index.html)
  - **Weight quantization:** OCP MXFP4
@@ -33,6 +33,7 @@ This model was obtained by quantizing [DeepSeek-R1](https://huggingface.co/deeps
  cd Quark/examples/torch/language_modeling/llm_ptq/
  python3 quantize_quark.py --model_dir $MODEL_DIR \
  --quant_scheme w_mxfp4_a_mxfp4 \
+ --group_size 32 \
  --num_calib_data 128 \
  --exclude_layers "*mlp.gate.*" "*lm_head" \
  --multi_gpu \
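For context, the README's quantization command as it reads after this commit is sketched below. This is an assumption-laden reconstruction, not the full command: it presumes the AMD-Quark repository is already cloned and `$MODEL_DIR` points at a local DeepSeek-R1 checkpoint, and the README's command continues past `--multi_gpu` (the trailing backslash), so further flags are elided here. The added `--group_size 32` is consistent with the 32-element block size defined by the OCP Microscaling (MX) FP4 format.

```shell
# Sketch of the command after this commit (assumes a Quark checkout
# and $MODEL_DIR set to a local DeepSeek-R1 checkpoint).
cd Quark/examples/torch/language_modeling/llm_ptq/
python3 quantize_quark.py --model_dir $MODEL_DIR \
    --quant_scheme w_mxfp4_a_mxfp4 \
    --group_size 32 \
    --num_calib_data 128 \
    --exclude_layers "*mlp.gate.*" "*lm_head" \
    --multi_gpu \
```

Excluding `*mlp.gate.*` and `*lm_head` keeps the MoE routing gates and the output head in higher precision, which is a common choice when quantizing MoE models such as DeepSeek-R1.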