Update README.md
README.md CHANGED
@@ -292,10 +292,16 @@ Our INT4 model is only optimized for batch size 1, so expect some slowdown with
 ## Results (A100 machine)
 | Benchmark (Latency) | | |
 |----------------------------------|----------------|----------------------------|
-| | Phi-4 mini-Ins | phi4-mini-INT4
+| | Phi-4 mini-Ins | phi4-mini-INT4 |
 | latency (batch_size=1) | 2.46s | 2.2s (1.12x speedup) |
 | serving (num_prompts=1) | 0.87 req/s | 1.05 req/s (1.20x speedup) |
 
+## Results (H100 machine)
+| Benchmark (Latency) | | |
+|----------------------------------|----------------|----------------------------|
+| | Phi-4 mini-Ins | phi4-mini-INT4 |
+| latency (batch_size=1) | 1.61s | 1.08s (1.49x speedup) |
+
 Note the result of latency (benchmark_latency) is in seconds, and serving (benchmark_serving) is in number of requests per second.
 Int4 weight only is optimized for batch size 1 and short input and output token length, please stay tuned for models optimized for larger batch sizes or longer token length.
 <details>
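The latency and serving rows come from the benchmark_latency and benchmark_serving scripts named in the note (presumably vLLM's benchmark scripts of those names). As a rough cross-check rather than the actual benchmark, a batch_size=1 generation can be timed directly with vLLM's Python API; the model id, prompt, and token count below are placeholders, not values taken from this README:

```python
import time

from vllm import LLM, SamplingParams

# Placeholder model id: swap in the INT4 checkpoint to compare against the baseline.
llm = LLM(model="microsoft/Phi-4-mini-instruct")
params = SamplingParams(max_tokens=128, ignore_eos=True)

# Warm up once so engine startup and compilation do not skew the timing.
llm.generate(["warmup"], params)

# Time a single batch-size-1 generation, analogous to the latency rows above.
start = time.perf_counter()
llm.generate(["Summarize INT4 weight-only quantization in one sentence."], params)
print(f"latency (batch_size=1): {time.perf_counter() - start:.2f}s")
```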