linzhao-amd committed on
Commit bd3481d · verified · 1 Parent(s): 469c869

Update README.md

Files changed (1): README.md +2 -3
README.md CHANGED
@@ -19,12 +19,11 @@ base_model:
 - **Activation quantization:** OCP MXFP4
 - **Calibration Dataset:** [Pile](https://huggingface.co/datasets/mit-han-lab/pile-val-backup)
 
-The model is the quantized version of the [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) model, which is an auto-regressive language model that uses an optimized transformer architecture. The MXFP4 model is quantized with [AMD-Quark](https://quark.docs.amd.com/latest/index.html).
-
+This model is a quantized version of [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1), optimized using the [AMD-Quark](https://quark.docs.amd.com/latest/index.html) framework with MXFP4 quantization.
 
 # Model Quantization
 
-This model was obtained by quantizing [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)'s weights and activations to MXFP4, using the AutoSmoothQuant algorithm in [AMD-Quark](https://quark.docs.amd.com/latest/index.html).
+The model was quantized from [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) using [AMD-Quark](https://quark.docs.amd.com/latest/index.html). Weights and activations were quantized to MXFP4. The AutoSmoothQuant algorithm was applied to enhance accuracy during quantization.
 
 **Quantization scripts:**
 ```